Causal Inference Method Selector

How to choose a causal inference method: the basics.

Choose a causal inference method based on your study design, identification strategy, and business objective.

This tool uses a decision-tree backbone centered on identification structure, but it returns multiple viable methods with assumptions and follow-up checks rather than forcing a single branch.

Overview flowchart for choosing causal inference methods across experiments, observational designs, thresholds, instruments, and rollout settings. The interactive tool below expands each branch into method recommendations, package suggestions, and exportable robustness checklists.

Study Setup

Answer the questions that matter for identification. The tool will adapt the later questions to your design.

Start with whether treatment assignment was randomized or not.

Pick the main causal question, not every downstream analysis you may run later.

Design Signals

Examples: pre-period spend, trips, clicks, or repeated baseline outcome measurements.

Use this only for experimental settings.

Examples: shared driver supply, seller liquidity, auction budgets, inventory competition, or social-network spillovers.

Example: assigned users do not always adopt the feature, or encouragement differs from uptake.

Recommended Methods

The tool shows a primary recommendation, strong fallbacks, and identification warnings.

Experimental: Estimate the main average treatment effect

Suggested workflow

Use the first method as the working analysis plan, then benchmark it against a strong fallback or diagnostic.

Start here: Randomized experiment with covariate-adjusted analysis

Randomization is the primary source of identification.

Next move: Check balance, attrition, and treatment leakage

Validate this first before adding more estimator complexity.
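A concrete starting point for the balance check is the standardized mean difference of each baseline covariate. The sketch below runs on simulated data; the column names (`treated`, `pre_spend`) are hypothetical placeholders for your own assignment and baseline columns.

```python
import numpy as np
import pandas as pd

def standardized_mean_diff(df, covariate, treat_col="treated"):
    """Standardized mean difference for one baseline covariate.

    An absolute value above roughly 0.1 is a common flag for
    imbalance worth investigating before trusting the estimate.
    """
    t = df.loc[df[treat_col] == 1, covariate]
    c = df.loc[df[treat_col] == 0, covariate]
    pooled_sd = np.sqrt((t.var(ddof=1) + c.var(ddof=1)) / 2)
    return (t.mean() - c.mean()) / pooled_sd

# Simulated user-level data with genuinely random assignment,
# so the SMD should be close to zero.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "treated": rng.integers(0, 2, size=10_000),
    "pre_spend": rng.gamma(2.0, 10.0, size=10_000),
})
print(standardized_mean_diff(df, "pre_spend"))
```

Run the same check for every important pre-period covariate, and pair it with attrition rates by arm.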

Key assumption (defend this explicitly)

Treatment assignment is randomized and faithfully implemented

Best fit

Randomized experiment with covariate-adjusted analysis

Experimental

Use intention-to-treat as the baseline estimate, with regression adjustment or stratification for precision and imbalance control.
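A minimal sketch of this covariate-adjusted ITT analysis with Statsmodels, on simulated data; the variable names and effect sizes are illustrative, not from the tool:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulated user-level experiment; column names are hypothetical.
rng = np.random.default_rng(42)
n = 5_000
pre = rng.normal(100, 20, size=n)          # pre-period outcome (covariate)
treated = rng.integers(0, 2, size=n)       # randomized assignment
post = 0.8 * pre + 5.0 * treated + rng.normal(0, 10, size=n)

df = pd.DataFrame({"post": post, "treated": treated, "pre": pre})

# Intention-to-treat estimate with regression adjustment for the
# pre-period covariate, using HC1 robust standard errors.
fit = smf.ols("post ~ treated + pre", data=df).fit(cov_type="HC1")
print(fit.params["treated"], fit.bse["treated"])
```

Adjusting for the pre-period covariate tightens the standard error without changing what is being estimated, since assignment is independent of the covariate.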

Why it fits

  • Randomization is the primary source of identification.

Critical assumptions

  • Treatment assignment is randomized and faithfully implemented
  • No material interference or spillovers across units unless explicitly modeled
  • Outcome measurement and variance estimation match the assignment unit

Pros

  • Strongest identification strategy when assignment really is random.
  • Easy to explain to product, ops, and leadership stakeholders.
  • Clean fit for launch, pricing, and guardrail decisions.

Cons

  • Can be expensive, slow, or operationally disruptive to run well.
  • Spillovers, attrition, or leakage can quietly break identification.
  • A single average effect can hide meaningful segment heterogeneity.

What to validate next

  • Check balance, attrition, and treatment leakage
  • Cluster standard errors if assignment was clustered
  • Report ITT before any treatment-on-treated analysis
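When assignment happens at a coarser unit than the outcome rows (for example, geo-randomized experiments), the clustering point above matters in practice. A sketch on simulated data, with hypothetical names:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulated geo-randomized experiment: assignment varies at the geo
# level, so standard errors must be clustered by geo.
rng = np.random.default_rng(7)
n_geos, users_per_geo = 40, 250
geo = np.repeat(np.arange(n_geos), users_per_geo)
geo_treated = rng.integers(0, 2, size=n_geos)
treated = geo_treated[geo]
geo_shock = rng.normal(0, 3, size=n_geos)[geo]   # within-geo correlation
y = 2.0 * treated + geo_shock + rng.normal(0, 1, size=geo.size)

df = pd.DataFrame({"y": y, "treated": treated, "geo": geo})
fit = smf.ols("y ~ treated", data=df).fit(
    cov_type="cluster", cov_kwds={"groups": df["geo"]}
)
print(fit.params["treated"], fit.bse["treated"])
```

With strong within-geo correlation, the clustered standard error is far larger than the naive one; reporting the naive version would overstate precision.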

Suggested packages

  • Statsmodels: use for regression adjustment, robust standard errors, and baseline econometric estimators.
  • PyFixest: use for clustered or high-dimensional fixed-effects regressions when experiments are run over panels, markets, or repeated outcomes.
  • linearmodels: use for absorbed fixed effects and panel-robust inference when randomized experiments are analyzed at user, geo, or time-cell level.

Also consider: CUPED, effect among compliers (CACE / LATE), heterogeneity models.
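As an illustration of the CUPED idea, here is a minimal simulated sketch: the outcome is adjusted by a pre-period covariate using the regression coefficient theta, which shrinks variance without biasing the treatment contrast as long as the covariate is unaffected by treatment. All variables are simulated.

```python
import numpy as np

# CUPED variance reduction: adjust the outcome with a pre-period
# covariate using theta = cov(y, x) / var(x).
rng = np.random.default_rng(1)
n = 20_000
x = rng.normal(0, 1, size=n)                 # pre-period metric
treated = rng.integers(0, 2, size=n)
y = 0.5 * treated + 0.8 * x + rng.normal(0, 1, size=n)

theta = np.cov(y, x)[0, 1] / np.var(x, ddof=1)
y_cuped = y - theta * (x - x.mean())

# The adjusted outcome has lower variance, so the same experiment
# yields a tighter treatment-effect estimate.
print(np.var(y), np.var(y_cuped))
```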

Why this is not a rigid one-path decision tree

  • Many applied problems support more than one defensible method.
  • Identification assumptions matter more than the algorithm name.
  • Practitioners often need a primary method plus a robustness check, not a single branch answer.
  • The best workflow is usually design first, estimator second, diagnostics third.

This selector therefore uses a decision-tree backbone but returns method cards with fit, assumptions, and what to validate next.

Methods covered

  • Randomized experiment analysis with covariate adjustment
  • Switchback experiments for interference-heavy marketplaces or networks
  • CUPED / pre-period variance reduction
  • Effect among compliers (CACE / LATE) via IV for noncompliance
  • Heterogeneous treatment effect models such as causal forests, uplift models, and meta-learners
  • Mediation analysis
  • Matching and propensity-score weighting
  • Doubly robust estimators such as AIPW and double machine learning
  • Difference-in-differences and event-study style designs
  • Interrupted time series and synthetic control
  • Regression discontinuity design
  • Instrumental variables for observational settings
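To make one observational branch above concrete, here is a hedged sketch of propensity-score weighting on simulated data with a single observed confounder. It uses Statsmodels' `logit` for the propensity model and a Hajek-style weighted difference in means; names and effect sizes are illustrative.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulated observational data: treatment uptake depends on an
# observed confounder x that also drives the outcome.
rng = np.random.default_rng(3)
n = 20_000
x = rng.normal(0, 1, size=n)
p = 1 / (1 + np.exp(-x))                   # true propensity
treated = rng.binomial(1, p)
y = 1.0 * treated + 2.0 * x + rng.normal(0, 1, size=n)

df = pd.DataFrame({"y": y, "treated": treated, "x": x})

# Estimate propensities, then take the inverse-propensity-weighted
# (self-normalizing) difference in means.
ps = smf.logit("treated ~ x", data=df).fit(disp=0).predict(df)
w1 = df["treated"] / ps
w0 = (1 - df["treated"]) / (1 - ps)
ate = np.average(df["y"], weights=w1) - np.average(df["y"], weights=w0)
print(ate)  # the naive difference in means would be biased upward here
```

The weighting only removes bias from confounders that are observed and correctly modeled, which is why the tool pairs these methods with explicit assumption checks.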