A/B Test Sample Size Calculator

Estimate sample size and runtime for an A/B or multi-arm conversion experiment using a fixed-horizon normal approximation.

Inputs

Percent of users who convert in control.

Relative lift percent, such as 5 for a 5% lift.

Use one-sided only when decreases are not decision-relevant and the direction is fixed in advance.

Percent of total daily eligible users included in the experiment.

Total arms including control.

Percent of experiment traffic allocated to control. Remaining traffic is split evenly across treatment arms.
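The allocation rule above can be sketched in a few lines. This is a minimal illustration, not the calculator's own code; the function name `allocation_shares` and its parameter names are hypothetical.

```python
def allocation_shares(num_arms: int, control_percent: float) -> tuple[float, float]:
    """Return (control_share, each_treatment_share) as fractions of experiment traffic."""
    if num_arms < 2:
        raise ValueError("need at least one treatment arm besides control")
    if not 0 < control_percent < 100:
        raise ValueError("control_percent must be strictly between 0 and 100")
    control_share = control_percent / 100.0
    # Remaining traffic is split evenly across the treatment arms.
    treatment_share = (1.0 - control_share) / (num_arms - 1)
    return control_share, treatment_share


print(allocation_shares(2, 50.0))  # (0.5, 0.5)
```

With more than two arms, each treatment share shrinks: four arms with 40% in control leaves roughly 20% for each of the three treatments.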

Outputs

Absolute MDE: 0.40 pp
Treatment rate at MDE: 8.40%
Sample per variant: 73,853
Control sample: 73,853
Each treatment sample: 73,853
Total sample: 147,706
Runtime: 5.9 days
Control users per day: 12,500
Each treatment users per day: 12,500
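The runtime figure is consistent with dividing the total sample by the number of users entering the experiment each day. The sketch below assumes that relationship; the function name `runtime_days` is hypothetical, and the 25,000 users per day is inferred from the 12,500 control users per day at a 50% control share rather than taken from a stated input.

```python
def runtime_days(total_sample: int, experiment_users_per_day: float) -> float:
    """Days needed to collect total_sample at a constant daily rate (no ramp)."""
    if experiment_users_per_day <= 0:
        raise ValueError("experiment_users_per_day must be positive")
    return total_sample / experiment_users_per_day


# Inferred, not a stated input: 12,500 control users/day at a 50% control share.
experiment_users_per_day = 12_500 / 0.5
days = runtime_days(147_706, experiment_users_per_day)
print(f"{days:.1f} days")  # 5.9 days
```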

This calculator uses the normal approximation for a fixed-horizon two-sided test of two conversion rates.

Control share: 50.0%. Each treatment share: 50.0%.

This is a standard two-arm setup with one control and one treatment. Effective alpha per comparison, before the one- or two-sided tail rule is applied: 0.0500.

What this is good for

  • Rough planning for A/B tests and multi-arm conversion experiments
  • Comparing how baseline rate and MDE change runtime
  • Checking whether your traffic volume makes a test feasible

This first version does not handle sequential testing, CUPED variance reduction, clustered assignment, heterogeneous treatment effects, or ratio metrics.

Formula and assumptions

Let p1 be the baseline conversion rate and p2=p1+Δ be the treatment rate at the minimum detectable effect. Let wc be the control traffic share and wt be the traffic share for each treatment arm.

The calculator uses the normal approximation for a fixed-horizon test of two proportions with unequal allocation:

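$$\mathrm{SE}_0 = \sqrt{\bar{p}\,(1-\bar{p})\left(\frac{1}{n_c}+\frac{1}{n_t}\right)}, \qquad \mathrm{SE}_1 = \sqrt{\frac{p_1(1-p_1)}{n_c}+\frac{p_2(1-p_2)}{n_t}}$$

where p̄ = (p1 + p2)/2. For planning, it solves

$$\Delta = z_{1-\alpha^*}\,\mathrm{SE}_0 + z_{\text{power}}\,\mathrm{SE}_1$$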

for the required treatment-arm sample size nt, using the allocation relationship nc/nt=wc/wt.
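Because both standard errors scale as 1/√nt once nc is tied to nt, the planning equation has a closed-form solution. The following is a sketch of that algebra under the definitions above. The ratio symbol r is introduced here only for illustration, and α* denotes the tail-adjusted significance level used in the critical value.

$$r = \frac{w_c}{w_t}, \qquad n_c = r\,n_t$$

$$\mathrm{SE}_0 = \sqrt{\frac{\bar{p}\,(1-\bar{p})\,(1 + 1/r)}{n_t}}, \qquad \mathrm{SE}_1 = \sqrt{\frac{p_1(1-p_1)/r + p_2(1-p_2)}{n_t}}$$

$$n_t = \frac{\left[\,z_{1-\alpha^*}\sqrt{\bar{p}\,(1-\bar{p})\,(1 + 1/r)} + z_{\text{power}}\sqrt{p_1(1-p_1)/r + p_2(1-p_2)}\,\right]^2}{\Delta^2}$$

With equal allocation, r = 1 and the control and treatment arms need the same sample size.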

  • α* = α/2 for a two-sided test and α* = α for a one-sided test, before any multiple-comparison correction. Here α* is the tail level used in the critical value z at 1 − α*.
  • When there are multiple treatment arms, the calculator applies a Bonferroni adjustment across treatment-versus-control comparisons.
  • Traffic is assumed independent and identically distributed across users.
  • Runtime assumes stable daily traffic and no ramp schedule.
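The formula and assumptions above can be put together in a short sketch. It is an illustration under stated assumptions, not the calculator's actual implementation: the function name `sample_sizes` and its parameters are hypothetical, α = 0.05 and power = 0.80 are assumed defaults, results are rounded up to whole users, and the Bonferroni step divides α by the number of treatment arms before the tail rule is applied.

```python
import math
from statistics import NormalDist


def sample_sizes(baseline, relative_lift, alpha=0.05, power=0.80,
                 two_sided=True, num_arms=2, control_share=0.5):
    """Return (n_control, n_each_treatment) under the normal approximation."""
    p1 = baseline
    delta = p1 * relative_lift           # absolute MDE
    p2 = p1 + delta                      # treatment rate at the MDE
    p_bar = (p1 + p2) / 2

    comparisons = num_arms - 1           # one comparison per treatment arm
    alpha_cmp = alpha / comparisons      # Bonferroni (assumed to precede the tail rule)
    alpha_tail = alpha_cmp / 2 if two_sided else alpha_cmp

    z_alpha = NormalDist().inv_cdf(1 - alpha_tail)
    z_power = NormalDist().inv_cdf(power)

    treatment_share = (1 - control_share) / comparisons
    r = control_share / treatment_share  # allocation ratio n_c / n_t

    # Both standard errors scale as 1/sqrt(n_t), so n_t has a closed form.
    root0 = math.sqrt(p_bar * (1 - p_bar) * (1 + 1 / r))
    root1 = math.sqrt(p1 * (1 - p1) / r + p2 * (1 - p2))
    n_t = ((z_alpha * root0 + z_power * root1) / delta) ** 2

    # Rounding up is an assumption about the calculator's convention.
    return math.ceil(r * n_t), math.ceil(n_t)


print(sample_sizes(0.08, 0.05))  # (73853, 73853)
```

An 8% baseline with a 5% relative lift, two-sided, reproduces the 73,853 per arm shown under Outputs, which suggests those results were computed with α = 0.05 and 80% power.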