cyberneticlibrary

Design growth experiments with hypotheses

surge-experiment · skill · setup L139
tonone-ai/tonone

Causal-lift measurements

ab-experimentation: 0pp vs no-skill baseline (with-skill 100% · baseline 100%)

Measured by running the task with and without this artifact, K=5, graded by deterministic checks — no LLM judging.

What it does

Design growth experiments with hypotheses before building

Best for

Growth PMs designing experiments where you need clarity on mechanism, sample size, and decision criteria before engineering starts building.

Inputs
  • Growth lever (funnel stage: acquisition, activation, retention, revenue, or referral)
  • Specific lever hypothesis (if [change], then [metric] increases by [X%] because [mechanism])
  • Target population (new users, existing users, paid users, or all users)
  • Available daily traffic
Outputs
  • Growth hypothesis with mechanism (the causal theory)
  • Experiment design (control, variant, traffic split, target population)
  • Primary, secondary, and guardrail metrics (with the minimum detectable effect, MDE)
  • Sample size and run time calculation
  • Decision playbook (what to do on WIN / LOSS / GUARDRAIL FAIL / EARLY STOP)
  • Implementation checklist
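
The sample size and run time output can be sketched roughly as follows. This is a minimal illustration, assuming a two-sided two-proportion z-test with equal-sized arms, alpha = 0.05, and 80% power. The function names, the defaults, and the traffic figure are assumptions of this sketch, not something the skill is documented to use.

```python
import math
from statistics import NormalDist


def sample_size_per_arm(baseline, target, alpha=0.05, power=0.80):
    """Approximate users needed per arm to detect baseline -> target.

    Assumes a two-sided two-proportion z-test with equal-sized arms
    (an assumption of this sketch, not something the skill prescribes).
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for significance
    z_beta = NormalDist().inv_cdf(power)           # critical value for power
    variance = baseline * (1 - baseline) + target * (1 - target)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / (target - baseline) ** 2)


def run_time_days(n_per_arm, daily_traffic, arms=2):
    """Days needed to fill every arm if all eligible daily traffic is enrolled."""
    return math.ceil(n_per_arm * arms / daily_traffic)


# Example: MDE stated as an absolute change, 5% -> 6% conversion.
n = sample_size_per_arm(0.05, 0.06)
days = run_time_days(n, daily_traffic=1000)  # hypothetical traffic figure
print(f"{n} users per arm, about {days} days")
```

Stating the MDE as two absolute rates (5% and 6%) is what makes this calculation possible; a bare "+20%" leaves the baseline, and therefore the variance, undefined.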
Preconditions
  • Funnel stage clarity (where in the user journey the experiment sits)
  • Available traffic sufficient to detect the MDE within 6 weeks (or accept a larger MDE)
  • Metrics instrumented (or ready to instrument before launch)
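
The traffic precondition can be turned around: given daily traffic and the 6-week ceiling, what is the smallest lift the experiment could detect? The sketch below searches for it by bisection. It assumes a two-sided two-proportion z-test with two equal arms, alpha = 0.05, and 80% power; the function names and the traffic figures are hypothetical.

```python
from statistics import NormalDist


def users_needed_per_arm(p1, p2, alpha=0.05, power=0.80):
    """Approximate per-arm sample size for a two-proportion z-test (assumed test)."""
    z = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    return z ** 2 * (p1 * (1 - p1) + p2 * (1 - p2)) / (p2 - p1) ** 2


def smallest_detectable_lift(baseline, daily_traffic, max_days=42):
    """Smallest absolute lift detectable within max_days, or None if none fits.

    Bisection works because the required sample size shrinks as the lift grows.
    """
    budget_per_arm = daily_traffic * max_days / 2  # two equal arms
    lo, hi = 1e-6, 1 - baseline - 1e-6
    if users_needed_per_arm(baseline, baseline + hi) > budget_per_arm:
        return None  # even the largest possible lift needs more traffic
    for _ in range(60):
        mid = (lo + hi) / 2
        if users_needed_per_arm(baseline, baseline + mid) <= budget_per_arm:
            hi = mid  # mid is detectable; try a smaller lift
        else:
            lo = mid  # mid is too small for the available traffic
    return hi


low_traffic = smallest_detectable_lift(0.05, daily_traffic=200)
high_traffic = smallest_detectable_lift(0.05, daily_traffic=1000)
print(f"200/day: {low_traffic:.4f} absolute lift; 1000/day: {high_traffic:.4f}")
```

If the returned lift is larger than the effect the mechanism could plausibly produce, the choice is between accepting the larger MDE and picking a different experiment.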
Failure modes
  • Hypothesis is vague (missing mechanism: 'why will this work?')
  • MDE not set (experiment could declare victory on noise)
  • Run time exceeds 6 weeks due to low traffic (experiment is too ambitious for the available traffic)
  • Primary and guardrail metrics are unclear (decision criteria ambiguous)
  • No decision playbook (what do you actually do on a WIN?)
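
The failure modes above lend themselves to a pre-launch check. The following is one possible sketch; the `ExperimentDesign` fields and the `find_failure_modes` function are hypothetical names invented for illustration and do not reflect the skill's actual output format.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

MAX_RUN_DAYS = 42  # the 6-week ceiling named in the failure modes
REQUIRED_OUTCOMES = {"WIN", "LOSS", "GUARDRAIL FAIL", "EARLY STOP"}


@dataclass
class ExperimentDesign:
    # All field names are hypothetical; the skill's real output may differ.
    change: str
    metric: str
    mechanism: str                   # the "because ..." part of the hypothesis
    mde_absolute: Optional[float]    # e.g. 0.01 for 5% -> 6%
    primary_metric: str
    guardrail_metrics: List[str] = field(default_factory=list)
    run_time_days: int = 0
    playbook: Dict[str, str] = field(default_factory=dict)


def find_failure_modes(design):
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    if not design.mechanism.strip():
        problems.append("hypothesis has no mechanism")
    if design.mde_absolute is None or design.mde_absolute <= 0:
        problems.append("MDE not set")
    if design.run_time_days > MAX_RUN_DAYS:
        problems.append("run time exceeds 6 weeks")
    if not design.primary_metric.strip() or not design.guardrail_metrics:
        problems.append("primary or guardrail metrics unclear")
    missing = REQUIRED_OUTCOMES - set(design.playbook)
    if missing:
        problems.append("playbook missing: " + ", ".join(sorted(missing)))
    return problems


draft = ExperimentDesign(
    change="shorter signup form",
    metric="signup conversion",
    mechanism="",                    # deliberately left vague
    mde_absolute=0.01,
    primary_metric="signup conversion",
    guardrail_metrics=["7-day retention"],
    run_time_days=60,
    playbook={"WIN": "roll out to 100%"},
)
for problem in find_failure_modes(draft):
    print(problem)
```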
Trust signals
  • Mechanism is mandatory (not just 'will increase X')
  • MDE is a specific absolute change, not just a relative percentage (e.g., 'conversion rate from 5% to 6%', not '+20% improvement')
  • Sample size is calculated explicitly (with the formula shown for context)
  • Decision playbook for each outcome: WIN, LOSS, GUARDRAIL FAIL, EARLY STOP
  • Guardrail metrics protect against shipping a win that breaks something else
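
A decision playbook ultimately maps a readout to exactly one of the four outcomes. The sketch below shows one way to encode that mapping. The `Readout` fields, the rule that a confidence interval above zero counts as a WIN, and the precedence order (guardrail first, then early stop) are all assumptions made for illustration; the source only states that a guardrail failure should block shipping a win.

```python
from dataclasses import dataclass


@dataclass
class Readout:
    # Hypothetical readout fields, not the skill's actual schema.
    lift_ci_lower: float      # lower confidence bound on the primary-metric lift
    guardrail_breached: bool  # any guardrail metric moved past its limit
    stopped_early: bool       # halted before the planned sample size was reached


def decide(readout):
    """Map a readout to one playbook outcome (precedence is an assumption)."""
    if readout.guardrail_breached:
        return "GUARDRAIL FAIL"  # checked first so a primary win cannot mask it
    if readout.stopped_early:
        return "EARLY STOP"      # underpowered result: do not call a win or loss
    if readout.lift_ci_lower > 0:
        return "WIN"
    return "LOSS"


# A primary-metric win that also breaks a guardrail is not shipped.
print(decide(Readout(lift_ci_lower=0.004, guardrail_breached=True, stopped_early=False)))
```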