Design statistically valid A/B tests
ab-testing · skill · setup · L2 · ★29
dirnbauer/webconsulting-skills
What it does
Design and validate statistically sound A/B tests
Best for
Product and growth teams validating feature changes or messaging hypotheses before rolling out to 100% of users.
Inputs
- · Hypothesis statement with observation, change, expected outcome
- · Baseline conversion rate
- · Current traffic volume
- · Minimum detectable effect (MDE) target
Outputs
- · Sample size calculation per variant
- · Test duration estimate
- · Metrics plan (primary, secondary, guardrail)
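The first two outputs follow from the listed inputs. The sketch below is one common way to compute them, not necessarily the method the skill itself uses. It assumes a two-sided two-proportion z-test under a normal approximation, equal traffic allocation, an MDE expressed as a relative lift, and a duration rounded up to whole weeks. The function and parameter names are illustrative.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_variant(baseline, relative_mde, alpha=0.05, power=0.80):
    """Approximate users needed in each variant (normal approximation)."""
    p1 = baseline                        # control conversion rate
    p2 = baseline * (1 + relative_mde)   # rate if the expected lift is real
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance
    z_beta = NormalDist().inv_cdf(power)           # desired power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)


def duration_days(n_per_variant, variants, daily_visitors):
    """Days needed to fill every variant, rounded up to whole weeks.

    Rounding to full weeks is an assumption made here so that each
    weekday is represented equally often.
    """
    raw_days = ceil(n_per_variant * variants / daily_visitors)
    return ceil(raw_days / 7) * 7


if __name__ == "__main__":
    n = sample_size_per_variant(baseline=0.05, relative_mde=0.20)
    print("users per variant:", n)
    print("days to run:", duration_days(n, variants=2, daily_visitors=2000))
```

Halving the MDE roughly quadruples the required sample, which is why the MDE target is worth agreeing on before checking whether traffic is sufficient.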
Requires
- · A/B testing platform (Optimizely, VWO, etc.)
- · Analytics integration (GA4)
Preconditions
- · Baseline conversion rate known
- · Traffic sufficient for sample size
- · Hypothesis is specific (not just 'let's see what happens')
Failure modes
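- · Underpowered tests (sample size too small) often miss real effects, and the significant results they do produce are more likely to be false positives or to overstate the effect
- · Peeking at results and stopping at the first significant reading inflates the false-positive rate and breaks the test's statistical validity
- · Changing multiple variables in one variant makes it impossible to attribute the effect to any single change
- · Novelty effects (uplift driven by the change being new) fade over time, so early conclusions may not hold in a longer-running test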
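The peeking failure mode can be shown with a small simulation. The sketch below runs A/A tests, in which both arms share the same true conversion rate, so every significant result is a false positive. It compares checking once at the end with checking at every interim look. The number of looks, the traffic figures, and the pooled z-test are assumptions chosen for illustration.

```python
import random
from statistics import NormalDist

Z_CRIT = NormalDist().inv_cdf(0.975)  # two-sided threshold for alpha = 0.05


def z_significant(conv_a, conv_b, n):
    """Pooled two-proportion z-test with n users in each arm."""
    pooled = (conv_a + conv_b) / (2 * n)
    se = (2 * pooled * (1 - pooled) / n) ** 0.5
    if se == 0:
        return False  # no conversions (or all conversions): nothing to test
    return abs(conv_a - conv_b) / n / se > Z_CRIT


def simulate(n_tests=400, users_per_arm=1000, looks=10, rate=0.10, seed=1):
    """Return (false-positive rate with peeking, rate with one final look)."""
    rng = random.Random(seed)
    step = users_per_arm // looks
    peek_hits = final_hits = 0
    for _ in range(n_tests):
        conv_a = conv_b = 0
        flagged_any = False
        for look in range(1, looks + 1):
            # Both arms draw from the same rate: there is no real effect.
            conv_a += sum(rng.random() < rate for _ in range(step))
            conv_b += sum(rng.random() < rate for _ in range(step))
            significant = z_significant(conv_a, conv_b, look * step)
            flagged_any = flagged_any or significant
        peek_hits += flagged_any    # would have stopped at some look
        final_hits += significant   # result at the planned end only
    return peek_hits / n_tests, final_hits / n_tests


if __name__ == "__main__":
    peeking, final_only = simulate()
    print(f"false positives with peeking: {peeking:.1%}")
    print(f"false positives at final look: {final_only:.1%}")
```

The final-look rate should sit near the nominal 5%, while the peeking rate comes out several times higher. Fixing the sample size in advance, or using a sequential testing method designed for interim looks, avoids this inflation.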
Trust signals
- · Hypothesis framework: "Because [observation], we believe [change] will cause [outcome] for [audience]"
- · Weak vs strong hypothesis examples with specificity
- · Four test types (A/B split, A/B/n, MVT, split URL) with trade-offs
- · Sample size quick reference table (by baseline rate and lift target)
- · Duration formula, with references to the Evan Miller and Optimizely calculators
- · Metrics selection guidance (primary tied to hypothesis, secondary for interpretation, guardrail for safety)
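The hypothesis framework listed above maps naturally onto a small record type. The following is a hypothetical sketch, not code from the skill: the class name, field names, and the completeness check are assumptions, and the example values are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hypothesis:
    observation: str
    change: str
    outcome: str
    audience: str

    def statement(self) -> str:
        """Render the hypothesis in the framework's sentence form."""
        return (
            f"Because {self.observation}, we believe {self.change} "
            f"will cause {self.outcome} for {self.audience}."
        )

    def missing_parts(self) -> list[str]:
        """Names of the fields left blank, in declaration order."""
        return [name for name, value in vars(self).items() if not value.strip()]


if __name__ == "__main__":
    h = Hypothesis(
        observation="many visitors leave the pricing page without scrolling",
        change="moving the plan comparison above the fold",
        outcome="a higher trial sign-up rate",
        audience="first-time visitors on desktop",
    )
    print(h.statement())
    print("missing:", h.missing_parts())
```

A check like `missing_parts` only catches empty fields. Whether a hypothesis is specific enough still needs a human judgment, as the weak versus strong examples suggest.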