Counterfactual Adjusted Shapley Value algorithm for online advertising attribution
Source
“Shapley Meets Uniform: An Axiomatic Framework for Attribution in Online Advertising”, paper available on http://www-2.rotman.utoronto.ca/userfiles/brownbags/operations/files/paper_Raghav_Singal.pdf
Background
- One of the central challenges in online advertising is attribution: assessing the contribution of individual advertiser actions (emails, display ads, etc.) to eventual conversion
- The value of each channel is an important input to media mix optimization, helps build an understanding of the customer journey and justify the marketing spending. However, despites the importance of the attribution problem, there is a lack of systematic approach to tackle this problem
- Attribution is linked to “counterfactual”: i.e what would have happened if no action ads were taken
- CASV algorithm proposes a Markovian model for the user journey through the conversion funnel, in which ad actions may have disparate impacts at different stages
- CASV stands for Counterfactual Adjusted Shapley Value and is based on Shapley value and Markov chain
Concept
First, we use Markov Chain to capture the user journey with a system of state — action values. Then, we create 3 matrices: an initial state vector, an action matrix and a transition matrix. Once done, we calculate a new metric for attribution in the Markovian model of user behavior.
One of the advantage of the CASV algorithm is that it takes into account the counterfactual, i.e what would have happened if no marketing actions were taken (on the contrary of Shapley Value).
Definitions
State: user state at a given time: “At every period, the user is in one of the finitely many states”. Example of user states:
- Smartphone: the user connected to your website using smartphone
- PC: the user connected to your website using PC
- App: the user connected using some phone application
- Quit: the user won’t use your service anymore. This state can be decided if the user was inactive after X days for example
- Conversion: the user converted, i.e bought the product / service you were trying to sell him/her
Action: represent one of the marketing touchpoint interacting with the user: “the advertiser observes the state and takes an action”. Examples include:
- Marketing email campaign: if the user opened a marketing campaign email
- Banners: if the user saw a banner
- Google ads
- SNS
- None: no marketing action were taken at that time
- Quit: company decide they are not interested in this user anymore
Basically, “states” are the different activities from the user point of view, while “actions” are the marketing touchpoints from the business point of view.
Transition probabilities: the probability that a user moves from a State X to a State Y given the advertiser took an action at State X
Capturing the user behavior
We create a chain of user state — action going side by side. The links of the chain are different point in time. The notion of time here can be fixed or flexible. A fixed time would be if one day = one link. A flexible timing would be if we define that if the user didn’t connect with our services for 10 days straight, we say the user state is none.
A representation of a chain could be: [(PC, None), (PC, Google Ads), (SP, email campaign), (Conversion, None)]
To simplify, let’s say each link represents a day and the first link is June 1st:
- On June 1st, the user connected to the business website using his PC. No marketing actions were taken
- On June 2nd, the user connected to the business website using his PC, he also saw one of the business ads on Google thanks to the marketing team action
- On June 3rd: the user connected to the business using a smartphone and opened an email marketing campaign that the marketing team sent him
- On June 4th: the user converted to the service the business was proposing him
The chain stops after a predetermined period (like 30 days) or once the user converted or quit.
Preparing the dataset
Once you decided on what conversion type you want to capture, you will have to define the different states and actions you want to select as well as the time constraint. Default ones to include are user state “quit” and “conversion” as well as actions “quit” and “none”. A priori, the more detailed is it, the better. Then, choose a random number of users and extract the data.
A good way to prepare the data is to have 4 columns: user_id, date/time, state, action. You then sort by user_id + date/time, so that it looks like this:

Creating the matrices
Initial State Vector: state probability as of the first day/time. The vector length is the number of different user state category you defined and the sum is 1. To take back our previous example, it would be as of June 1st, 68% of user state was PC, 12% was SP, etc.
Action Matrix: it stands for the action probability of each state. It is a single matrix where rows length is the number of unique state categories you have and columns is the same thing for actions. To make it easier, row 1 should be “quit” state and row 2 “conversion”. Column 1 is “quit” and column 2 is “none”. The matrix answers the question: for a given user state, what is the probability of having this action? For instance, when the users log on my website using smartphone, 10% of the related action is “email marketing campaign”. Therefore, each rows sum up to 1
Transition matrices: state transition matrices for each action. In order to prepare theses matrices, you will have to add a column “next state” to the dataset. It just corresponds to the user next state for a given row and state.
- For each marketing action, what is the probability that a user moves from one state to another
- Example: if I clicked on a banner while logging from my PC today, what is the probability that tomorrow I will log in using a smartphone
Therefore, the number of matrices you will have will be equal to the number of distinct marketing action categories. Number of rows and columns will be the number of unique state categories you have. Each row should be equal to 1. Rows are the current state and the columns are the next state.
The algorithm

Step 1: Create a matrix filled with 0 of size S (unique state length) by A (unique action length). Note that conversion / quit / none are excluded
Step 2: For each different user state that exist (quit or conversion excluded), do:
Step 3: sample a path
- you should have a predetermined number of sample, for example 10,000
- the path length is predetermined and stopped at conversion or quit, for example 1,000
- example of a sample path of length 3: [(2,1), (3,1), (2,2)]. The first number is the state and the second number the action. So if state 2 means PC and action 1 means no action, this first path’s link is “user connected to PC and there was no marketing action”
- each path is randomly created from the probability of our matrices, because we know: given a user state, what is the probability of the related action. And, given a user state and an action, what is the probability of the next user state. And so on…
- a good way to create sample path is to put all your matrices as cumulative. So for example, an initial vector of [0.1, 0.5, 0.3, 0.1] becomes [0.1, 0.6, 0.9, 1] and you do this for each row of each matrices. Then you use the combination of random() function to get a float number between 0 an 1 and bisect_left() function from bisect package which will tell you the related index of the matrix you check
- first choose a random state from initial state (using the cumulative probabilities). Then, choose a random action based on the state that was selected (and based on the cumulative probabilities). Then do a loop as long as state is not quit or converted (and path length is less than that you want) where you append the path with a new state — action using cumulative transition matrix and the cumulative action matrix
- you would need to start a loop for all the different sample paths you are testing
Step 4: if your sample path converted then…
Step 5: for each state — action from the sample path, do:
Step 6: update the counterfactual shapley value. The right part of the equation is:

So you would update the cell value (s, a) of the matrix from step 1 with the given equation. S+ here refers to all states with conversion and quit included.
Step 7 and after: self explanatory
Output
- You get a matrix S * A.
- A positive value indicates the probability increase in conversion for a given state + action
- A negative value indicates that no action would work better
Thoughts
- Flexible time probably works better so that you have no “none” on user state side
- The more details we have about the various marketing actions, the better it is
- Not sure marketing teams enjoy having their actions compared to counterfactual