Causal Inference and estimating the counterfactual

Marc Deveaux
Nov 29, 2021

My notes from the following paper written by Hal R. Varian https://www.pnas.org/content/pnas/113/27/7310.full.pdf

The critical step in any causal analysis is estimating the counterfactual: a prediction of what would have happened in the absence of the treatment.

The issue of confounding variables

When doing a regression, we need to be aware of omitted features that act as confounding variables. Those variables affect both y and x.

For example, let’s say you want to predict the number of sunburns based on the rate of ice cream consumption. Your regression model wouldn’t give a good estimate of the causal effect, because the confounding variable “hot temperature” (which affects both y and x) is not taken into account.

It is worth noting that this is not simply a missing variable issue, as missing variables in general are taken care of by the regression’s error term. The problem here is that the missing variable “hot temperature” affects both the outcome and the predictor.

A few examples:

  • Build a model predicting per capita sales in a city (y) based on the ad spend for a surf movie in the same city (x). Based on it, predict how sales would respond to a contemplated change in ad spend. In this case, the confounding (and omitted) variable is “interest in surf”: the marketers who spent the ad budget knew about it and chose to advertise more where the interest is high (Honolulu) and less where it is low (Fargo). So the data is “biased”.
  • Those who have good jobs tend to have health care; therefore a regression of health care on income will show a positive effect, but the direction of the causality is unclear.

This problem is frequent in economics and marketing analysis. When doing a regression, we should not assume that the predictors we do not observe (the error term) are orthogonal (that is, uncorrelated) to the predictors we do observe.

Therefore, to come back to the surf movie example, the ideal data is the one where an incompetent adviser would allocate ad expenses randomly across cities. Only when the ad expenses are truly random can we stop worrying about confounding variables, because the predictors will automatically be orthogonal to the error term.
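
To make the surf movie argument concrete, here is a minimal simulation sketch. Everything in it is an assumption made for illustration (the variable names, the coefficients, the normal distributions, the "true" effect of 2.0); none of it comes from the paper. It fits the same one-predictor regression twice: once when ad spend follows local interest in surfing, and once when ad spend is allocated at random.

```python
import numpy as np

def ols_slope(x, y):
    """Slope of a least-squares fit of y on x (with intercept)."""
    slope, _intercept = np.polyfit(x, y, deg=1)
    return slope

rng = np.random.default_rng(0)
n_cities = 5_000
true_effect = 2.0  # assumed causal effect of ad spend on sales

# Hypothetical confounder: local interest in surfing (unobserved by the model).
interest = rng.normal(size=n_cities)
noise = rng.normal(size=n_cities)

# Case 1: marketers spend more where interest is high.
spend_targeted = 1.5 * interest + rng.normal(size=n_cities)
sales_targeted = true_effect * spend_targeted + 3.0 * interest + noise

# Case 2: the "incompetent adviser" allocates spend at random.
spend_random = rng.normal(size=n_cities)
sales_random = true_effect * spend_random + 3.0 * interest + noise

slope_targeted = ols_slope(spend_targeted, sales_targeted)
slope_random = ols_slope(spend_random, sales_random)

print(f"true effect:              {true_effect:.2f}")
print(f"slope with targeted spend: {slope_targeted:.2f}")  # biased upward
print(f"slope with random spend:   {slope_random:.2f}")    # close to the true effect
```

In the targeted case the omitted "interest" variable sits in the error term and is correlated with spend, so the slope overstates the effect. In the random case interest is still omitted, but it is uncorrelated with spend, so the slope is unbiased.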

Estimating causal effects

The best way to estimate causal effects is a controlled experiment: you apply a “treatment” to some set of “subjects” and observe some “outcomes”. The outcomes for the treated subjects can be compared with the outcomes for the untreated subjects (the control group) to determine the causal effect of the treatment on the subjects.

Sometimes we are interested in how the treatment affected those who were actually treated but were not necessarily randomly chosen for the treatment. In this case (named “impact of the treatment on the treated”), if we can’t run a controlled experiment, we have 2 modeling possibilities:

  • “selection on observables”: build a predictive model of who received the treatment
  • “selection on unobservables”: find natural experiments that are “as good as random”, so that we avoid the confounding variables problem explained earlier

Controlled experiments are not always feasible, so other techniques exist: natural experiments, instrumental variables, regression discontinuity and difference in differences.

Basic identity of causal inference

The observed difference in outcomes between the treated and the untreated can be decomposed as:

  • Outcome for treated - Outcome for untreated = [Outcome for treated - Outcome for treated if not treated] + [Outcome for treated if not treated - Outcome for untreated]
  • This is equal to: Impact of treatment on treated + selection bias

Here, “Outcome for treated if not treated” is the counterfactual, and the selection bias term [Outcome for treated if not treated - Outcome for untreated] is equal to 0 in expectation if the assignment was random.
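
The identity can be checked numerically. The sketch below is an illustration under made-up assumptions: it simulates both potential outcomes for every unit (something we never observe in real data), uses an assumed constant treatment effect of 1.0, and invents a selection rule in which units with a high untreated outcome are more likely to be treated.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 10_000

# Simulated potential outcomes (only possible in a simulation).
y0 = rng.normal(loc=10.0, scale=2.0, size=n)  # outcome if not treated
y1 = y0 + 1.0                                 # outcome if treated (assumed effect: 1.0)

# Non-random selection: units with a high untreated outcome get treated more often.
treated = (y0 + rng.normal(size=n)) > 10.0

observed_diff = y1[treated].mean() - y0[~treated].mean()
impact_on_treated = y1[treated].mean() - y0[treated].mean()
selection_bias = y0[treated].mean() - y0[~treated].mean()

print(f"observed difference: {observed_diff:.3f}")
print(f"impact on treated:   {impact_on_treated:.3f}")
print(f"selection bias:      {selection_bias:.3f}")

# With random assignment the selection bias term is close to zero.
random_assignment = rng.random(n) < 0.5
bias_random = y0[random_assignment].mean() - y0[~random_assignment].mean()
print(f"selection bias under random assignment: {bias_random:.3f}")
```

The first three numbers satisfy the identity exactly (up to floating point error). The naive treated-versus-untreated comparison is far from the assumed effect of 1.0 because of the selection bias term.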

How to get the counterfactual?

The counterfactual comes from a predictive model built with data from before the experiment was run. We train the model on past data, and its prediction for the period in which the experiment runs is the counterfactual. One way to do this is the train-test-treat-compare (TTTC) method: train the model on part of the pre-treatment data, test it on held-out data, apply the treatment, and compare the actual outcome with the model’s prediction.

This approach estimates the “impact of the treatment on the treated”, and it is a generalization of the treatment-control approach.

Note that when building the predictive model, we don’t want to use predictors that may themselves be affected by the treatment, and we need to account for confounders. For example, during the “holiday season” we often observe both an increase in ad spend and an increase in sales. The “holiday season” is therefore a confounding variable, and a simple regression of sales on ad spend would give a misleading estimate. Solution: pull the confounder out of the error term and model the “holiday season” as an additional predictor.
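
As a rough sketch of the train-test-treat-compare idea, the snippet below uses synthetic weekly sales. All of it is assumed for illustration: the data, the linear model with a trend and a holiday flag, the split points (weeks 0-59 for training, 60-79 for testing, 80-99 treated) and the "true" lift of 5.0. A real analysis would use whatever predictive model fits the pre-treatment data well.

```python
import numpy as np

rng = np.random.default_rng(2)
weeks = np.arange(100)
holiday = (weeks % 52 >= 47).astype(float)  # hypothetical holiday-season flag

# Synthetic sales: trend + holiday bump + noise, plus a lift once the campaign starts.
true_lift = 5.0  # assumed effect, used only to generate the data
treated_period = weeks >= 80
sales = 50.0 + 0.2 * weeks + 8.0 * holiday + rng.normal(scale=0.5, size=100)
sales = sales + true_lift * treated_period

# Predictors that are not affected by the treatment: intercept, trend, holiday.
X = np.column_stack([np.ones(100), weeks, holiday])
train = weeks < 60
test = (weeks >= 60) & (weeks < 80)
treat = weeks >= 80

# Train: fit the model on pre-treatment data only.
coef, *_ = np.linalg.lstsq(X[train], sales[train], rcond=None)

# Test: check the fit on held-out pre-treatment weeks.
test_mae = np.abs(X[test] @ coef - sales[test]).mean()

# Treat: the campaign runs during weeks 80-99 (already reflected in `sales`).
# Compare: actual sales versus the predicted counterfactual.
counterfactual = X[treat] @ coef
estimated_lift = (sales[treat] - counterfactual).mean()

print(f"held-out mean absolute error: {test_mae:.2f}")
print(f"estimated lift: {estimated_lift:.2f} (data generated with {true_lift:.2f})")
```

The holiday flag is included as a predictor, so the holiday week that falls inside the treated period is not mistaken for a campaign effect.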

Regression discontinuity

We need to be sure we understand the data-generating process when trying to develop a model of who was selected for the treatment. One common selection rule is to use a threshold. In this case, observations close to, but just below, the threshold should be similar to those close to, but just above, the threshold. Therefore, we can compare subjects that are close to the threshold but on its 2 different sides. See this article for examples: https://www.betterevaluation.org/en/evaluation-options/regressiondiscontinuity
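
Here is one possible sketch of that comparison, on simulated data. The threshold of 50, the bandwidth of 10, the linear relationship and the "true" jump of 3.0 are all assumptions chosen for the example. The code fits a separate straight line on each side of the threshold, using only nearby observations, and compares the two fitted values at the threshold itself.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 4_000
threshold, bandwidth = 50.0, 10.0
true_jump = 3.0  # assumed treatment effect at the threshold

# Subjects are treated if and only if their score reaches the threshold.
score = rng.uniform(0, 100, size=n)
treated = score >= threshold
outcome = 10.0 + 0.1 * score + true_jump * treated + rng.normal(size=n)

def fitted_value_at_threshold(mask):
    """Fit a line on the selected points and return its value at the threshold."""
    # Centering the score makes the intercept the fitted value at the threshold.
    _slope, intercept = np.polyfit(score[mask] - threshold, outcome[mask], deg=1)
    return intercept

just_below = (score >= threshold - bandwidth) & (score < threshold)
just_above = (score >= threshold) & (score <= threshold + bandwidth)

estimated_jump = fitted_value_at_threshold(just_above) - fitted_value_at_threshold(just_below)
print(f"estimated jump at the threshold: {estimated_jump:.2f}")
```

The choice of bandwidth is a trade-off: a narrow window keeps the two groups comparable but leaves fewer observations, and a wide one does the opposite.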

Natural experiments

Sometimes you can find “natural experiments” which are as good as random. The paper gives the example of the Super Bowl, where companies pay for ads a long time in advance, without knowing which cities’ teams will be in the final. We also know that when your home team is playing, the audience in your city increases by 10–15%. As a result, 2 essentially randomly chosen cities will experience a 10% increase in ad impressions for the movie titles shown during the Super Bowl. You can then estimate the counterfactual: what the boost would have been in those cities without the additional 10% of ad impressions.

Instrumental variables

“Instrumental variables” are variables that are thought to be independent of potential confounders. “The method of instrumental variables (IV) is used to estimate causal relationships when controlled experiments are not feasible.[…] Intuitively, IVs are used when an explanatory variable of interest is correlated with the error term, in which case ordinary least squares and ANOVA give biased results. A valid instrument (i.e an instrument that moves X around independently of any movement in the error term) induces changes in the explanatory variable but has no independent effect on the dependent variable, allowing a researcher to uncover the causal effect of the explanatory variable on the dependent variable” (https://en.wikipedia.org/wiki/Instrumental_variables_estimation).
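
A small simulated example may help. It is a sketch under assumptions of my own: a single instrument z, a single unobserved confounder u, linear relationships and a "true" effect of 2.0. With one instrument and one explanatory variable, the IV estimate reduces to the ratio cov(z, y) / cov(z, x), which is what the code computes alongside the ordinary least squares slope.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 20_000
true_effect = 2.0  # assumed causal effect of x on y

u = rng.normal(size=n)  # unobserved confounder: affects both x and y
z = rng.normal(size=n)  # instrument: moves x, independent of u

x = 1.0 * z + 1.0 * u + rng.normal(size=n)
y = true_effect * x + 3.0 * u + rng.normal(size=n)

# Ordinary least squares slope of y on x: biased, because x is correlated with u.
ols_estimate = np.cov(x, y)[0, 1] / np.var(x, ddof=1)

# IV estimate with a single instrument: cov(z, y) / cov(z, x).
iv_estimate = np.cov(z, y)[0, 1] / np.cov(z, x)[0, 1]

print(f"true effect:  {true_effect:.2f}")
print(f"OLS estimate: {ols_estimate:.2f}")
print(f"IV estimate:  {iv_estimate:.2f}")
```

The instrument only helps because, by construction here, z affects y through x alone. In real data that condition has to be argued for, as it cannot be fully tested.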

Difference in Differences

In this case, we have two groups, the treated and the untreated, and two time periods, before treatment and after treatment. We also have a number of predictors that may affect the observed values of the outcome for each group. The goal is to estimate a predictive model of what the outcome would be for the treated group if it were not treated. To accomplish this goal, one can use a model of the observed outcomes of the untreated group in the posttreatment period.

Example with campaign sales

  • Sta = sales after the ad campaign for the treated group
  • Stb = sales before the ad campaign for the treated group
  • Sca = sales after the ad campaign for the control group
  • Scb = sales before the ad campaign for the control group

The counterfactual is based on the assumption that the (unobserved) change in purchases by the treated would have been the same as the (observed) change in purchases by the control group.

To get the impact of the ad campaign, we then compare the predicted counterfactual sales, Stb + (Sca - Scb), to the actual sales Sta:

  • effect of treatment on treated = (Sta - Stb) - (Sca - Scb)

In practice, it is pretty similar to the TTTC process.
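
The campaign example can be written out in a few lines. The sales figures below are invented purely for illustration, and `diff_in_diff` is a hypothetical helper name.

```python
def diff_in_diff(s_ta, s_tb, s_ca, s_cb):
    """Difference-in-differences estimate of the effect of treatment on the treated."""
    # Counterfactual: treated group's starting point plus the control group's change.
    counterfactual = s_tb + (s_ca - s_cb)
    return s_ta - counterfactual

# Made-up numbers: the treated group goes from 100 to 130, the control group from 80 to 88.
effect = diff_in_diff(s_ta=130.0, s_tb=100.0, s_ca=88.0, s_cb=80.0)
print(effect)  # 22.0, i.e. (130 - 100) - (88 - 80)
```

Of the 30-unit increase in the treated group, 8 units are attributed to the change that the control group also experienced, which leaves 22 for the campaign.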
