Causal inference under incentives: an annotated reading list

by ML@CMU

30 August 2024

share this:

By Keegan Harris

Causal inference is the process of determining whether and how a cause leads to an effect, typically using statistical methods to distinguish correlation from causation. Learning causal relationships from data is an important task across a wide variety of domains ranging from healthcare and drug development, to online advertising and e-commerce. As a result, there has been much work in the literature on economics, statistics, computer science, and public policy on designing algorithms and methodologies for causal inference.

While most of the focus has been on questions which are statistical in nature, one must also take game-theoretic incentives into consideration when doing causal inference about strategic individuals who have a preference over the treatment they receive. For example, it may be hard to infer causal relationships in randomized control trials when there is non-compliance by participants in the study (i.e. when participants do not adhere to the treatment they are assigned). More generally, causal learning may be difficult whenever individuals are free to self-select their own treatments and there is sufficient heterogeneity between individuals with different preferences. Even when compliance can be enforced, individuals may strategize by modifying the attributes they present to the causal inference process in order to be assigned a more desirable treatment.

This annotated reading list is intended to serve as a brief summary of work on causal inference in the presence of strategic agents. While this list is not comprehensive, we hope that it will be a useful starting point for members of the machine learning community to learn more about this exciting research area at the intersection of causal inference and game theory.

The reading list is organized as follows: (1, 3) study non-compliance in randomized trials, (2-4) focus on instrumental variable methods, (4-6) consider incentive misalignment between the individual running the causal inference procedure and the subjects of the procedure, (7,8) study cross-unit interference, and (9,10) are about synthetic control methods.

[Robins 1998]: This paper provides an overview of methods to correct for non-compliance in randomized trials (i.e., non-adherence by trial participants to the treatment assignment protocol).
[Angrist et al. 1996]: This seminal paper outlines the concept of instrumental variables (IVs) and describes how they can be used to estimate causal effects. An IV is a variable that affects the treatment variable but is unrelated to the outcome variable except through its effect on the treatment. IV methods leverage the fact that variation in IVs is independent of any confounding in order to estimate the causal effect of the treatment.
[Ngo et al. 2021]: Unlike prior work on non-compliance in clinical trials, this work leverages tools from information design to reveal information about the effectiveness of the treatments in such a way that participants become incentivized to comply with the treatment recommendations over time.
[Harris et al. 2022]: This paper studies the problem of making decisions about a population of strategic agents. The authors make the novel observation that the assessment rule deployed by the principal is a valid instrument, which allows them to apply standard methods for instrumental variable regression to learn causal relationships in the presence of strategic behavior.
[Miller et al. 2020]: This paper considers the problem of strategic classification, where a principal (i.e. decision maker) makes decisions about a population of strategic agents. Given knowledge of the principal’s deployed assessment rule, the agents may strategically modify their observable features in order to receive a more desirable assessment (e.g., a better interest rate on a loan). The authors are the first to show that designing good incentives for agent improvement (i.e. encouraging strategizing in a way which actually benefits the agent) is at least as hard as orienting edges in the corresponding causal graph.
[Wang et al. 2023]: Incentive misalignment between patients and providers may occur when average treated outcomes are used as quality metrics. Such misalignment is generally undesirable in healthcare domains, as it may lead to decreased patient welfare. To mitigate this issue, this work proposes an alternative quality metric, the total treatment effect, which accounts for counterfactual untreated outcomes. The authors show that rewarding the total treatment effect maximizes total patient welfare.
[Wager and Xu 2021]: Motivated by applications such as ride-sharing and tuition subsidies, this work studies settings in which interventions on one unit (e.g. a person or product) may have effects on others (i.e., cross-unit interference). The authors focus on the problem of setting supply-side payments in a centralized marketplace. They use a mean-field modeling-based approach to model the cross-unit interference, and design a class of experimentation schemes which allow them to optimize payments without disturbing the market equilibrium.
[Li et al. 2023]: Like [Wager and Xu 2021], this paper studies the effects of cross-unit interference, although the interference considered here comes from congestion in a service system. As a result, the interference considered here is dynamic, in contrast to the static interference considered in the previous entry.
[Abadie and Gardeazabal 2003]: This is the first paper on synthetic control methods (SCMs), a popular technique for estimating counterfactuals from longitudinal data. In the SCM setup, there is a pre-intervention time period during which all units are under control, followed by a post-intervention time period when all units undergo exactly one intervention (either the treatment or control). Given a test unit (who was given the treatment) and a set of donor units (who remained under control), SCMs use the pre-treatment data to learn a relationship (usually linear or convex) between the test and donor units. This relationship is then extrapolated to the post-intervention time period in order to estimate the counterfactual trajectory for the test unit under control.
[Ngo et al. 2023]: A common assumption in the literature on SCMs is that of “overlap”: the outcomes for the test unit can be written as a combination (e.g., linear or convex) of the donor units. This work sheds light on this often overlooked assumption and shows that (i) when units select their own treatments and (ii) there is sufficient heterogeneity between units who prefer different treatments, then overlap does not hold. Like [Ngo et al. 2021], the authors use tools from information design and multi-armed bandits to incentivize units to explore different treatments in a way which ensures that the overlap condition will gradually become satisfied over time.

[Editor’s note: this article is cross-posted in SIGecom Exchanges 22.1.]

References:

This article was initially published on the ML@CMU blog and appears here with the authors’ permission.

ML@CMU

AIhub is supported by:

Forthcoming machine learning and AI seminars: March 2026 edition

Lucy Smith 02 Mar 2026

A list of free-to-attend AI-related seminars that are scheduled to take place between 2 March and 30 April 2026.

monthly digest

AIhub monthly digest: February 2026 – collective decision making, multi-modal learning, and governing the rise of interactive AI

Lucy Smith 27 Feb 2026

Welcome to our monthly digest, where you can catch up with AI research, events and news from the month past.

Causal inference under incentives: an annotated reading list

Related posts :

Forthcoming machine learning and AI seminars: March 2026 edition

AIhub monthly digest: February 2026 – collective decision making, multi-modal learning, and governing the rise of interactive AI

The Good Robot podcast: the role of designers in AI ethics with Tomasz Hollanek

Reinforcement learning applied to autonomous vehicles: an interview with Oliver Chang

The Machine Ethics podcast: moral agents with Jen Semler

Extending the reward structure in reinforcement learning: an interview with Tanmay Ambadkar

The Good Robot podcast: what makes a drone “good”? with Beryl Pong

Relational neurosymbolic Markov models

↑