<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Causal Inference | Pak Shing Ho</title><link>https://www.pakshingho.com/tag/causal-inference/</link><atom:link href="https://www.pakshingho.com/tag/causal-inference/index.xml" rel="self" type="application/rss+xml"/><description>Causal Inference</description><generator>Wowchemy (https://wowchemy.com)</generator><language>en-us</language><copyright>© 2026 by PAK SHING HO. All rights reserved.</copyright><lastBuildDate>Sun, 03 May 2026 00:00:00 +0000</lastBuildDate><image><url>https://www.pakshingho.com/images/icon_hu0b7a4cb9992c9ac0e91bd28ffd38dd00_9727_512x512_fill_lanczos_center_2.png</url><title>Causal Inference</title><link>https://www.pakshingho.com/tag/causal-inference/</link></image><item><title>Statistical Inference vs. Causal Inference: Why Pearl Says They Are Different Languages</title><link>https://www.pakshingho.com/post/statistical-inference-vs-causal-inference/</link><pubDate>Sun, 03 May 2026 00:00:00 +0000</pubDate><guid>https://www.pakshingho.com/post/statistical-inference-vs-causal-inference/</guid><description>&lt;p>In April 2026, social media reignited a long-standing debate that economists, statisticians, and causal researchers have wrestled with for decades:&lt;/p>
&lt;p>&lt;strong>Are statistical inference and causal inference actually the same thing?&lt;/strong>&lt;/p>
&lt;p>The discussion drew in none other than Turing Award winner Judea Pearl, pioneer of Bayesian networks and modern causal inference, who resurfaced his seminal 2009 review to clarify the distinction.&lt;/p>
&lt;p>His central message remains deeply provocative:&lt;/p>
&lt;p>&lt;strong>Statistical inference and causal inference are fundamentally different mathematical languages.&lt;/strong>&lt;/p>
&lt;hr>
&lt;h2 id="correlation-is-not-causation--but-thats-only-the-beginning">Correlation is not causation — but that’s only the beginning&lt;/h2>
&lt;p>Every statistics textbook reminds us that correlation does not imply causation. But Pearl’s argument goes much further.&lt;/p>
&lt;p>Traditional statistical inference focuses on the joint distribution:&lt;/p>
&lt;p>$$
P(X, Y, Z, \ldots)
$$&lt;/p>
&lt;p>This framework allows us to estimate associations, conditional probabilities, and predictive relationships.&lt;/p>
&lt;p>But if we ask:&lt;/p>
&lt;blockquote>
&lt;p>“What happens to $Y$ if we forcibly change $X$?”&lt;/p>
&lt;/blockquote>
&lt;p>Then standard statistical distributions alone cannot answer that question.&lt;/p>
&lt;p>As Pearl emphasizes:&lt;/p>
&lt;blockquote>
&lt;p>Distribution functions contain no information about how distributions themselves would change under external intervention.&lt;/p>
&lt;/blockquote>
&lt;p>This is not a data shortage problem.&lt;/p>
&lt;p>It is a language problem.&lt;/p>
&lt;p>Causal assumptions must be supplied from outside the data; they are not encoded in the observational distribution itself.&lt;/p>
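&lt;p>This point can be made concrete with a toy calculation. Below is a minimal sketch (all probabilities are assumptions chosen for illustration) of two structurally different binary models that induce the same joint distribution $P(X, Y)$, yet give different answers to $P(Y=1 \mid do(X=1))$:&lt;/p>

```python
# Two structurally different causal models that induce the SAME joint
# distribution P(X, Y) over binary variables, yet disagree about
# P(Y=1 | do(X=1)). All numbers are illustrative assumptions.

# Model A: X -> Y
p_x1 = 0.4
p_y1_given_x = {0: 0.2, 1: 0.7}

# Joint distribution implied by Model A
joint = {(x, y): (p_x1 if x else 1 - p_x1)
                 * (p_y1_given_x[x] if y else 1 - p_y1_given_x[x])
         for x in (0, 1) for y in (0, 1)}

# Model B: Y -> X, parameterized (via Bayes' rule) to match the same joint
p_y1 = sum(joint[(x, 1)] for x in (0, 1))
p_x1_given_y = {y: joint[(1, y)] / (p_y1 if y else 1 - p_y1) for y in (0, 1)}

# Same observational distribution in both models ...
assert abs(sum(joint.values()) - 1.0) < 1e-12

# ... but different interventional predictions:
do_A = p_y1_given_x[1]  # Model A: P(Y=1 | do(X=1)) = P(Y=1 | X=1)
do_B = p_y1             # Model B: X does not cause Y, so do(X) leaves Y alone
print(do_A, do_B)       # 0.7 vs roughly 0.40
```

&lt;p>No amount of observational data can distinguish the two models; the causal arrow is an input to the analysis, not an output of it.&lt;/p>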
&lt;hr>
&lt;h2 id="pearls-ladder-of-causation-three-levels-of-reasoning">Pearl’s Ladder of Causation: Three levels of reasoning&lt;/h2>
&lt;p>Pearl organizes causal reasoning into three hierarchical layers:&lt;/p>
&lt;h3 id="1-association--seeing">1. Association — “Seeing”&lt;/h3>
&lt;p>$$
P(Y\mid X)
$$&lt;/p>
&lt;p>&lt;strong>Example:&lt;/strong>&lt;br>
Among smokers, what is the probability of lung cancer?&lt;/p>
&lt;p>This is the domain of:&lt;/p>
&lt;ul>
&lt;li>Traditional statistics&lt;/li>
&lt;li>Machine learning prediction&lt;/li>
&lt;li>Correlation analysis&lt;/li>
&lt;/ul>
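&lt;p>At this level the computation is pure frequency estimation. A minimal sketch on simulated data (the risk numbers are assumptions; no causal claim is involved):&lt;/p>

```python
import numpy as np

# Level 1 ("seeing"): estimate P(cancer | smoker) from observational data.
# Purely illustrative synthetic population with assumed risk levels.
rng = np.random.default_rng(0)
n = 100_000
smoker = rng.random(n) < 0.3
# cancer is more frequent among smokers in this synthetic population
cancer = rng.random(n) < np.where(smoker, 0.15, 0.05)

p_cancer_given_smoker = cancer[smoker].mean()        # roughly 0.15
p_cancer_given_nonsmoker = cancer[~smoker].mean()    # roughly 0.05
print(p_cancer_given_smoker, p_cancer_given_nonsmoker)
```

&lt;p>Nothing here says whether smoking causes cancer; the estimate summarizes what we see, not what would happen under intervention.&lt;/p>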
&lt;hr>
&lt;h3 id="2-intervention--doing">2. Intervention — “Doing”&lt;/h3>
&lt;p>$$
P(Y\mid do(X))
$$&lt;/p>
&lt;p>&lt;strong>Example:&lt;/strong>&lt;br>
If we force someone to smoke, what happens to their lung cancer risk?&lt;/p>
&lt;p>This is the domain of:&lt;/p>
&lt;ul>
&lt;li>Randomized controlled trials&lt;/li>
&lt;li>Difference-in-Differences&lt;/li>
&lt;li>Instrumental Variables&lt;/li>
&lt;li>Regression Discontinuity&lt;/li>
&lt;/ul>
&lt;p>This is where causal effects live.&lt;/p>
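&lt;p>Randomization is what takes us from seeing to doing: assigning treatment by coin flip severs treatment from its natural causes. A minimal simulation sketch (the latent health-consciousness confounder and all coefficients are assumptions for illustration):&lt;/p>

```python
import numpy as np

# Level 2 ("doing"): an RCT contrast estimates P(Y | do(X)) directly because
# randomization breaks the link between treatment and its causes.
# Synthetic data; the confounder and coefficients are assumed.
rng = np.random.default_rng(1)
n = 200_000
health = rng.random(n)  # latent health-consciousness

# Observational world: healthier people smoke less AND get less cancer
smoke_obs = rng.random(n) < (0.8 - 0.6 * health)
cancer_obs = rng.random(n) < (0.05 + 0.10 * smoke_obs + 0.30 * (1 - health))
naive = cancer_obs[smoke_obs].mean() - cancer_obs[~smoke_obs].mean()

# Interventional world: do(smoke) assigned by coin flip, health untouched
smoke_rct = rng.random(n) < 0.5
cancer_rct = rng.random(n) < (0.05 + 0.10 * smoke_rct + 0.30 * (1 - health))
causal = cancer_rct[smoke_rct].mean() - cancer_rct[~smoke_rct].mean()

print(naive, causal)  # naive is inflated by confounding; causal is near 0.10
```

&lt;p>The observational contrast overstates the effect because smokers are systematically less healthy; the randomized contrast recovers the structural +0.10.&lt;/p>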
&lt;hr>
&lt;h3 id="3-counterfactuals--imagining">3. Counterfactuals — “Imagining”&lt;/h3>
&lt;p>&lt;strong>Example:&lt;/strong>&lt;br>
For a healthy non-smoker, if they had smoked, would they have developed lung cancer?&lt;/p>
&lt;p>This is the domain of:&lt;/p>
&lt;ul>
&lt;li>Individual causal attribution&lt;/li>
&lt;li>Legal responsibility&lt;/li>
&lt;li>Personalized treatment effects&lt;/li>
&lt;/ul>
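&lt;p>Counterfactuals are computed in an SCM by Pearl’s three-step recipe: abduction, action, prediction. A minimal sketch in a toy linear model (the structural equations and coefficients are assumptions for illustration):&lt;/p>

```python
# Level 3 ("imagining"): a counterfactual via abduction, action, prediction
# in a toy linear SCM. The equations and coefficients are assumed:
#
#   X := U_x                 (smoking)
#   Y := 2.0 * X + U_y       (symptom score)

# Observed individual: X = 0 (non-smoker), Y = 1.0
x_obs, y_obs = 0.0, 1.0

# 1. Abduction: infer this individual's noise terms from the evidence
u_x = x_obs
u_y = y_obs - 2.0 * x_obs  # U_y = 1.0

# 2. Action: replace the equation for X with X := 1 (had they smoked)
x_cf = 1.0

# 3. Prediction: propagate through the unchanged mechanism for Y
y_cf = 2.0 * x_cf + u_y
print(y_cf)  # 3.0: the counterfactual outcome for THIS individual
```

&lt;p>The key step is abduction: the individual’s own noise terms are carried over into the hypothetical world, which is exactly what population-level intervention data cannot pin down.&lt;/p>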
&lt;hr>
&lt;p>&lt;strong>Key insight:&lt;/strong>&lt;br>
Higher causal levels cannot be derived from lower levels without additional assumptions.&lt;/p>
&lt;ul>
&lt;li>Observational data alone cannot identify interventions&lt;/li>
&lt;li>Intervention data alone cannot fully identify counterfactuals&lt;/li>
&lt;/ul>
&lt;p>More data does not eliminate this limitation.&lt;/p>
&lt;hr>
&lt;h2 id="why-pearls-do-operator-matters">Why Pearl’s do-operator matters&lt;/h2>
&lt;p>Pearl introduced:&lt;/p>
&lt;p>$$
P(Y\mid do(X=x))
$$&lt;/p>
&lt;p>This differs critically from:&lt;/p>
&lt;p>$$
P(Y\mid X=x)
$$&lt;/p>
&lt;p>The difference is selection bias.&lt;/p>
&lt;p>People who naturally select into treatment differ systematically from those randomly assigned to it.&lt;/p>
&lt;p>In DAG terms, the do-operator works by:&lt;/p>
&lt;p>&lt;strong>Cutting all incoming arrows into $X$&lt;/strong>&lt;/p>
&lt;p>This mathematically simulates intervention.&lt;/p>
&lt;p>Pearl’s do-calculus then provides formal rules for when observational data can identify causal effects.&lt;/p>
&lt;p>This answers one of the most important practical questions:&lt;/p>
&lt;p>&lt;strong>Under what structural assumptions can we recover causality from observational data?&lt;/strong>&lt;/p>
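&lt;p>One concrete answer is the backdoor adjustment formula: if $Z$ blocks every backdoor path into $X$, then $P(y \mid do(x)) = \sum_z P(y \mid x, z)\,P(z)$. A minimal sketch on simulated binary data (the data-generating numbers are assumptions):&lt;/p>

```python
import numpy as np

# Backdoor adjustment sketch: with Z -> X and Z -> Y, Z blocks the backdoor
# path, so P(Y | do(X=x)) = sum_z P(Y | X=x, Z=z) P(Z=z).
# Synthetic binary data; the true effect of X is +0.2 by construction.
rng = np.random.default_rng(2)
n = 500_000
z = rng.random(n) < 0.5
x = rng.random(n) < np.where(z, 0.8, 0.2)      # confounded treatment
y = rng.random(n) < 0.1 + 0.2 * x + 0.3 * z    # assumed outcome mechanism

naive = y[x].mean() - y[~x].mean()

def p_y_do(x_val):
    # adjustment formula, estimated from the same observational data
    return sum(y[(x == x_val) & (z == zv)].mean() * (z == zv).mean()
               for zv in (False, True))

adjusted = p_y_do(True) - p_y_do(False)
print(naive, adjusted)  # naive is biased upward; adjusted is near 0.2
```

&lt;p>Both quantities are computed from the same observational sample; only the adjusted one matches the interventional contrast, and only because the assumed DAG licenses the adjustment.&lt;/p>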
&lt;hr>
&lt;h2 id="pearl-vs-rubin-same-destination-different-frameworks">Pearl vs. Rubin: Same destination, different frameworks&lt;/h2>
&lt;p>Economics largely relies on the Potential Outcomes Framework (Rubin Causal Model):&lt;/p>
&lt;p>$$
Y(1) - Y(0)
$$&lt;/p>
&lt;p>Causal identification depends on assumptions like:&lt;/p>
&lt;ul>
&lt;li>The conditional independence assumption (CIA)&lt;/li>
&lt;li>Parallel trends&lt;/li>
&lt;li>Exclusion restrictions&lt;/li>
&lt;/ul>
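&lt;p>The framework’s core object can be seen in a few lines of simulation: every unit carries both potential outcomes, we observe only one, and randomization makes the difference in means an unbiased estimate of the average treatment effect. (The data-generating process below is an assumption for illustration.)&lt;/p>

```python
import numpy as np

# Potential-outcomes sketch: each unit has both Y(0) and Y(1); only one is
# ever observed (the fundamental problem of causal inference). Under random
# assignment, difference in means estimates the ATE. Synthetic, assumed DGP.
rng = np.random.default_rng(3)
n = 100_000
y0 = rng.normal(0.0, 1.0, n)
y1 = y0 + rng.normal(1.5, 0.5, n)  # heterogeneous unit-level effects

true_ate = (y1 - y0).mean()        # visible only because this is a simulation

d = rng.random(n) < 0.5            # randomized treatment assignment
y_obs = np.where(d, y1, y0)        # each unit reveals exactly one outcome

est_ate = y_obs[d].mean() - y_obs[~d].mean()
print(true_ate, est_ate)           # both near 1.5
```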
&lt;p>Pearl explicitly argued that Rubin’s framework is subsumed by structural causal models (SCMs): every potential outcome can be defined as a derived quantity within an SCM.&lt;/p>
&lt;p>In essence:&lt;/p>
&lt;ul>
&lt;li>Rubin = design-based&lt;/li>
&lt;li>Pearl = model-based&lt;/li>
&lt;/ul>
&lt;hr>
&lt;h3 id="rubin-framework-strengths">Rubin framework strengths&lt;/h3>
&lt;ul>
&lt;li>Strong empirical design focus&lt;/li>
&lt;li>Practical for RCTs, DID, IV, RDD&lt;/li>
&lt;li>Minimal structural assumptions&lt;/li>
&lt;/ul>
&lt;h3 id="pearl-framework-strengths">Pearl framework strengths&lt;/h3>
&lt;ul>
&lt;li>Explicit DAG modeling&lt;/li>
&lt;li>Systematic covariate selection&lt;/li>
&lt;li>Collider bias detection&lt;/li>
&lt;li>Mediation analysis&lt;/li>
&lt;li>Placebo test logic&lt;/li>
&lt;li>Complex identification strategy diagnostics&lt;/li>
&lt;/ul>
&lt;hr>
&lt;h2 id="one-of-economics-biggest-blind-spots-collider-bias">One of economics’ biggest blind spots: Collider bias&lt;/h2>
&lt;p>Pearl’s DAG framework sharply highlights collider bias:&lt;/p>
&lt;p>If:&lt;/p>
&lt;p>$$
X \rightarrow Z \leftarrow Y
$$&lt;/p>
&lt;p>Then conditioning on $Z$ can create false associations.&lt;/p>
&lt;p>&lt;strong>Example:&lt;/strong>&lt;br>
When estimating education’s effect on wages, controlling for employment status may induce bias if employment is influenced by both education and latent ability.&lt;/p>
&lt;p>This is often overlooked in practice.&lt;/p>
&lt;p>Potential outcomes frameworks do not directly tell you which variables should or should not be controlled.&lt;/p>
&lt;p>&lt;strong>DAGs do.&lt;/strong>&lt;/p>
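&lt;p>Collider bias is easy to demonstrate by simulation. In the sketch below (variable names echo the education example, but the data are synthetic assumptions), $X$ and $Y$ are independent by construction, and conditioning on the collider $Z$ manufactures a strong spurious correlation:&lt;/p>

```python
import numpy as np

# Collider sketch: X and Y are independent, both cause Z (X -> Z <- Y).
# Conditioning on Z creates a spurious X-Y association. Synthetic data.
rng = np.random.default_rng(4)
n = 200_000
x = rng.normal(size=n)   # e.g., education (assumed label)
y = rng.normal(size=n)   # e.g., latent ability (assumed label)
z = (x + y > 0)          # e.g., employment: the collider

r_all = np.corrcoef(x, y)[0, 1]        # near 0 in the full population
r_sel = np.corrcoef(x[z], y[z])[0, 1]  # clearly negative after selection
print(r_all, r_sel)
```

&lt;p>Selecting on the collider (here, restricting to the employed) induces a sizable negative correlation between two variables that are causally unrelated.&lt;/p>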
&lt;hr>
&lt;h2 id="practical-lessons-for-applied-researchers">Practical lessons for applied researchers&lt;/h2>
&lt;h3 id="1-draw-dags-before-estimation">1. Draw DAGs before estimation&lt;/h3>
&lt;p>Before writing identification assumptions:&lt;/p>
&lt;ul>
&lt;li>Define treatment&lt;/li>
&lt;li>Define outcome&lt;/li>
&lt;li>Map controls&lt;/li>
&lt;li>Identify backdoor paths&lt;/li>
&lt;li>Detect colliders&lt;/li>
&lt;/ul>
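&lt;p>Parts of this checklist can be mechanized. Below is a minimal sketch that enumerates candidate backdoor paths (undirected paths entering the treatment through an incoming arrow) in an assumed toy DAG; full d-separation checking is beyond the sketch:&lt;/p>

```python
# Minimal backdoor-path enumerator for a toy DAG, assumed for illustration:
# Z -> X, Z -> Y, X -> Y. It lists undirected paths from treatment to
# outcome that begin with an arrow INTO the treatment, i.e. the candidate
# backdoor paths that adjustment must block. (No d-separation check here.)
edges = [("Z", "X"), ("Z", "Y"), ("X", "Y")]

def backdoor_paths(treatment, outcome):
    nbrs = {}
    for a, b in edges:
        nbrs.setdefault(a, set()).add(b)
        nbrs.setdefault(b, set()).add(a)
    paths, stack = [], [[treatment]]
    while stack:
        path = stack.pop()
        node = path[-1]
        if node == outcome and len(path) > 1:
            # keep only paths that enter the treatment via an incoming arrow
            if (path[1], treatment) in edges:
                paths.append(path)
            continue
        for nxt in nbrs.get(node, ()):
            if nxt not in path:
                stack.append(path + [nxt])
    return paths

print(backdoor_paths("X", "Y"))  # [['X', 'Z', 'Y']]: block it by adjusting for Z
```

&lt;p>In the toy DAG the direct edge $X \rightarrow Y$ is correctly excluded, and the single backdoor path through $Z$ is flagged for adjustment.&lt;/p>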
&lt;hr>
&lt;h3 id="2-use-dags-to-validate-placebo-tests">2. Use DAGs to validate placebo tests&lt;/h3>
&lt;p>A placebo test is informative only if your identification assumptions structurally imply a zero effect on the placebo outcome.&lt;/p>
&lt;p>DAGs clarify whether placebo logic is actually valid.&lt;/p>
&lt;hr>
&lt;h3 id="3-scm-improves-mediation-analysis">3. SCM improves mediation analysis&lt;/h3>
&lt;p>For separating:&lt;/p>
&lt;ul>
&lt;li>Direct effects&lt;/li>
&lt;li>Indirect effects&lt;/li>
&lt;/ul>
&lt;p>Pearl’s mediation formula offers more systematic tools than many traditional econometric approaches.&lt;/p>
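&lt;p>In the linear case, Pearl’s natural direct and indirect effects reduce to familiar coefficient products. A minimal sketch on a simulated linear SCM (the structure and coefficients are assumptions for illustration):&lt;/p>

```python
import numpy as np

# Mediation sketch in an assumed linear SCM: X -> M -> Y plus a direct
# X -> Y path. In the linear case the natural direct/indirect effects
# reduce to coefficient products. Synthetic data, assumed coefficients.
rng = np.random.default_rng(5)
n = 200_000
x = rng.normal(size=n)
m = 0.8 * x + rng.normal(size=n)            # mediator equation
y = 0.5 * x + 0.6 * m + rng.normal(size=n)  # outcome equation

# OLS via least squares: regress m on x, then y on (x, m)
a = np.linalg.lstsq(np.c_[x, np.ones(n)], m, rcond=None)[0][0]
b, c = np.linalg.lstsq(np.c_[x, m, np.ones(n)], y, rcond=None)[0][:2]

nde = b      # natural direct effect, near 0.5
nie = a * c  # natural indirect effect, near 0.8 * 0.6 = 0.48
print(nde, nie, nde + nie)  # total effect near 0.98
```

&lt;p>In nonlinear or interactive models these simple products break down, which is precisely where Pearl’s mediation formula earns its keep.&lt;/p>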
&lt;hr>
&lt;h2 id="bottom-line">Bottom line&lt;/h2>
&lt;p>Pearl’s core argument is not that statistics is wrong.&lt;/p>
&lt;p>It is that:&lt;/p>
&lt;p>&lt;strong>Statistics alone is insufficient for causal reasoning.&lt;/strong>&lt;/p>
&lt;p>Prediction is not intervention.&lt;br>
Intervention is not counterfactual reasoning.&lt;/p>
&lt;p>For economists, data scientists, and policy researchers, this distinction matters enormously.&lt;/p>
&lt;p>As causal questions become more central in:&lt;/p>
&lt;ul>
&lt;li>Product experimentation&lt;/li>
&lt;li>Policy design&lt;/li>
&lt;li>Marketing attribution&lt;/li>
&lt;li>Forecasting&lt;/li>
&lt;li>Personalized interventions&lt;/li>
&lt;/ul>
&lt;p>Understanding both design-based and structural approaches becomes increasingly essential.&lt;/p>
&lt;p>The real frontier is not choosing Pearl or Rubin.&lt;/p>
&lt;p>It is knowing when each framework sharpens your identification strategy — and when relying on statistical association alone can fundamentally mislead decision-making.&lt;/p></description></item></channel></rss>