In the last few days I have been working with Markov chains for a multi-touch (data-driven) attribution model. I have found a lot of useful information at the macro level. For example, the ChannelAttribution package gives me the attribution of each channel in the process of achieving a conversion (TV, search or call-center), but this attribution is computed taking into account all customer journeys, as well as the removal effects for each channel. My question is the following: at a micro level of analysis, can I obtain, at the customer level, the channel that contributed most to their purchase decision? That is, which channel had the greatest impact on each customer's purchase? It does not matter whether a conversion was made or not.
For example, I imagine an output like the following:
| Customer ID | Channel Attribution by Customer | Conversion |
|---|---|---|
| 1 | TV | Conversion |
| 2 | TV | Conversion |
| 3 | Search | Non-Conversion |
| 4 | Call-center | Conversion |
| 5 | TV | Non-Conversion |
| 6 | Call-center | Conversion |
I would be grateful for any help. Sorry for my English; I hope this is clear.
Sorry for the late reply; maybe this will be helpful for someone else.
The first thing you'll need to do is get your data into shape, specifically a long shape. I've built a sample below for the first 3 Customer IDs in your output table:
| Customer ID | Channel | Conversion |
|---|---|---|
| 1 | TV | Conversion |
| 1 | TV | Conversion |
| 1 | Call Centre | Conversion |
| 2 | TV | Conversion |
| 2 | TV | Conversion |
| 3 | Search | Conversion |
| 3 | Search | Non-conversion |
| 3 | Call Centre | Non-conversion |

Notice that if you look at the most frequent channel for each Customer ID, it corresponds to the channel attribution column in your output table.
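You can do this by counting the rows for each Customer ID and channel, then keeping the channel with the highest count for each customer.
There is some duplication in the conversion and count fields; ignore it for now.
The result corresponds to your imagined output.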
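As a rough illustration of the counting step, here is a minimal sketch in plain Python. The `rows` list simply mirrors the sample table above, and the function name `most_frequent_channel` is my own, not part of any package. The conversion field is ignored here, and ties are resolved by whichever channel appears first, which is only a placeholder rule.

```python
from collections import Counter, defaultdict

# Long-format sample rows: (customer_id, channel, conversion).
rows = [
    (1, "TV", "Conversion"),
    (1, "TV", "Conversion"),
    (1, "Call Centre", "Conversion"),
    (2, "TV", "Conversion"),
    (2, "TV", "Conversion"),
    (3, "Search", "Conversion"),
    (3, "Search", "Non-conversion"),
    (3, "Call Centre", "Non-conversion"),
]


def most_frequent_channel(rows):
    """Return {customer_id: (channel, count)} for the most frequent channel."""
    counts = defaultdict(Counter)
    for customer_id, channel, _conversion in rows:
        counts[customer_id][channel] += 1
    # On ties, most_common(1) returns the channel that was seen first.
    return {cid: c.most_common(1)[0] for cid, c in counts.items()}


attribution = most_frequent_channel(rows)
for cid, (channel, n) in sorted(attribution.items()):
    print(cid, channel, n)
```

The same logic maps onto a group-by-and-count in pandas or dplyr if your data already lives in a data frame.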
Tie breakers
It will happen that a Customer ID has two or more channels with the same count, for instance two TV and two Search. How do we manage this? If you really must have one row per customer, then depending on what you plan to do, you'll need to either:
- Build some priority ranking logic where rules dictate which channel receives the attribution.
- Build some logic that randomly assigns the channel in the case of a tie.
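Both tie-breaking options could look something like the sketch below. The `PRIORITY` order is purely hypothetical (you would replace it with your own business rules), and the helper names are made up for illustration.

```python
import random
from collections import Counter

# Hypothetical priority order, for illustration only: earlier entries win ties.
PRIORITY = ["Call Centre", "Search", "TV"]


def tied_channels(channels):
    """Return the channels that share the highest count, sorted by name."""
    counts = Counter(channels)
    top = max(counts.values())
    return sorted(ch for ch, n in counts.items() if n == top)


def break_tie_by_priority(channels, priority=PRIORITY):
    """Pick the tied channel that ranks first in the priority list.

    Raises ValueError if a tied channel is missing from the list.
    """
    return min(tied_channels(channels), key=priority.index)


def break_tie_randomly(channels, rng=None):
    """Pick one of the tied channels at random (pass a seeded rng to reproduce)."""
    if rng is None:
        rng = random.Random()
    return rng.choice(tied_channels(channels))


journey = ["TV", "Search", "TV", "Search"]  # two TV and two Search
print(tied_channels(journey))
print(break_tie_by_priority(journey))
print(break_tie_randomly(journey, random.Random(0)))
```

If you go the random route, seeding the generator keeps the attribution stable between runs, which makes the output easier to audit.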
I hope that helps. I had R/Python in the back of my mind while writing this. It could possibly be implemented in Excel, but someone far smarter than I would need to contribute that answer.