Causal Inference Method Selector
How to choose a causal inference method: the basics.
Choose a causal inference method based on your study design, identification strategy, and business objective.
This tool uses a decision-tree backbone centered on identification structure, but it returns multiple viable methods with assumptions and follow-up checks rather than forcing a single branch.
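As a rough illustration of that structure, here is a Python sketch of a decision-tree backbone that returns several ranked method cards instead of a single branch. All names, branches, and fields are hypothetical, not the tool's actual logic:

```python
# Hypothetical sketch: a tiny identification-keyed decision tree that returns
# multiple viable method cards (primary plus fallbacks), not one forced branch.
from dataclasses import dataclass, field

@dataclass
class MethodCard:
    name: str
    assumptions: list = field(default_factory=list)
    validate_next: list = field(default_factory=list)

def recommend(randomized: bool, has_pre_period: bool) -> list[MethodCard]:
    """Return a primary method plus fallbacks for one simplified branch."""
    cards = []
    if randomized:
        cards.append(MethodCard(
            "Randomized experiment with covariate adjustment",
            assumptions=["Assignment is randomized and faithfully implemented"],
            validate_next=["Check balance, attrition, and treatment leakage"],
        ))
        if has_pre_period:
            cards.append(MethodCard("CUPED / pre-period variance reduction"))
    else:
        cards.append(MethodCard("Matching and propensity-score weighting"))
        cards.append(MethodCard("Doubly robust estimators (AIPW / DML)"))
    return cards

primary, *fallbacks = recommend(randomized=True, has_pre_period=True)
```

The point of the sketch is the return type: every branch yields a list of cards with assumptions and follow-up checks attached, which is what distinguishes this from a rigid one-path tree.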
Study Setup
Answer the questions that matter for identification. The tool will adapt the later questions to your design.
Start with whether treatment assignment was randomized or not.
Pick the main causal question, not every downstream analysis you may run later.
Design Signals
Examples: pre-period spend, trips, clicks, or repeated baseline outcome measurements.
This question applies only to experimental settings.
Examples: shared driver supply, seller liquidity, auction budgets, inventory competition, or social-network spillovers.
Example: assigned users do not always adopt the feature, or encouragement differs from uptake.
Needed for methods such as difference-in-differences, interrupted time series, and synthetic control.
Examples: one city launch, one state regulation, one platform-wide intervention.
Examples: age cutoff, credit score threshold, policy eligibility boundary.
The instrument must shift treatment strongly and affect the outcome only through treatment.
If treated and untreated units barely overlap in covariate space, many adjustment methods become unstable.
Think high-cardinality features, rich user history, text, or many confounders.
Recommended Methods
The tool shows a primary recommendation, strong fallbacks, and identification warnings.
Suggested workflow
Use the first method as the working analysis plan, then benchmark it against a strong fallback or diagnostic. When randomization is the source of identification, validate the randomization itself before adding estimator complexity.
Randomized experiment with covariate-adjusted analysis
Use intention-to-treat as the baseline estimate, with regression adjustment or stratification for precision and imbalance control.
Why it fits
- Randomization is the primary source of identification.
Critical assumptions
- Treatment assignment is randomized and faithfully implemented
- No material interference or spillovers across units unless explicitly modeled
- Outcome measurement and variance estimation match the assignment unit
Pros
- Strongest identification strategy when assignment really is random.
- Easy to explain to product, ops, and leadership stakeholders.
- Clean fit for launch, pricing, and guardrail decisions.
Cons
- Can be expensive, slow, or operationally disruptive to run well.
- Spillovers, attrition, or leakage can quietly break identification.
- A single average effect can hide meaningful segment heterogeneity.
What to validate next
- Check balance, attrition, and treatment leakage
- Cluster standard errors if assignment was clustered
- Report ITT before any treatment-on-treated analysis
Representative industry use cases
- Spotify Engineering (experimentation platform): platformized product experimentation with managed configuration, metric catalogs, and consistent analysis for many concurrent tests.
- Wayfair Tech Blog (geo experiments): market-level randomized experiments to measure incrementality when user-level assignment is infeasible.
Popular book references
- Trustworthy Online Controlled Experiments, Ch. 2: 'Running and Analyzing Experiments: An End-to-End Example.'
- Causal Inference for Data Science, Ch. 1: 'Introducing causality,' including A/B testing and RCT basics.
- Mostly Harmless Econometrics, Ch. 2: 'The Experimental Ideal.'
Suggested packages
- Statsmodels: use for regression adjustment, robust standard errors, and baseline econometric estimators.
- PyFixest: use for clustered or high-dimensional fixed-effects regressions when experiments are run over panels, markets, or repeated outcomes.
- linearmodels: use for absorbed fixed effects and panel-robust inference when randomized experiments are analyzed at user, geo, or time-cell level.
Also consider: CUPED, effect among compliers (CACE / LATE), heterogeneity models.
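Of these, CUPED is compact enough to sketch directly: residualize the outcome on a pre-period covariate to shrink variance without biasing the treatment contrast. A minimal illustration on simulated data (names and magnitudes are made up), not a production implementation:

```python
# Minimal CUPED sketch: theta = cov(Y, X) / var(X), then subtract the
# centered pre-period term from the outcome before comparing arms.
import numpy as np

rng = np.random.default_rng(1)
n = 5000
treat = rng.integers(0, 2, n).astype(float)
pre = rng.normal(size=n)                        # pre-period outcome (e.g. spend)
y = 0.3 * treat + 0.7 * pre + rng.normal(size=n)

theta = np.cov(y, pre)[0, 1] / np.var(pre, ddof=1)
y_cuped = y - theta * (pre - pre.mean())

raw_effect = y[treat == 1].mean() - y[treat == 0].mean()
cuped_effect = y_cuped[treat == 1].mean() - y_cuped[treat == 0].mean()
# Same expected effect; variance drops by the squared Y-X correlation.
```

Because `pre` is measured before assignment, it is independent of treatment, so subtracting the theta term changes the variance of the estimate but not its expectation.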
Why this is not a rigid one-path decision tree
- Many applied problems support more than one defensible method.
- Identification assumptions matter more than the algorithm name.
- Practitioners often need a primary method plus a robustness check, not a single branch answer.
- The best workflow is usually design first, estimator second, diagnostics third.
This selector therefore uses a decision-tree backbone but returns method cards with fit, assumptions, and what to validate next.
Methods covered
- Randomized experiment analysis with covariate adjustment
- Switchback experiments for interference-heavy marketplaces or networks
- CUPED / pre-period variance reduction
- Effect among compliers (CACE / LATE) via IV for noncompliance
- Heterogeneous treatment effect models such as causal forests, uplift models, and meta-learners
- Mediation analysis
- Matching and propensity-score weighting
- Doubly robust estimators such as AIPW and double machine learning
- Difference-in-differences and event-study style designs
- Interrupted time series and synthetic control
- Regression discontinuity design
- Instrumental variables for observational settings
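Among these, the effect among compliers (CACE / LATE) has a one-line estimator in the binary-instrument case: the Wald ratio, the ITT on the outcome divided by the ITT on uptake. A hedged sketch on simulated noncompliance data (compliance rate and effect size are invented for illustration):

```python
# Wald estimator for the complier effect: ITT on outcome / ITT on uptake,
# for a randomized binary encouragement z and actual adoption d.
import numpy as np

rng = np.random.default_rng(2)
n = 10_000
z = rng.integers(0, 2, n)                  # random encouragement (assignment)
complier = rng.random(n) < 0.6             # ~60% adopt only if encouraged
d = np.where(complier, z, 0)               # never-takers ignore assignment
y = 2.0 * d + rng.normal(size=n)           # true effect 2.0 among adopters

itt_y = y[z == 1].mean() - y[z == 0].mean()   # ITT on the outcome
itt_d = d[z == 1].mean() - d[z == 0].mean()   # first stage: effect on uptake
late = itt_y / itt_d                          # complier (Wald / LATE) estimate
```

The same ratio is what 2SLS recovers with one instrument and one treatment; the sketch makes the dependence on a strong first stage (`itt_d` well away from zero) visible.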