cyberneticlibrary

Test performance hypotheses with experiments

perf-theory-tester · skill · setup L2 · ★ 1,721

ComposioHQ/awesome-claude-plugins

What it does

Runs controlled performance experiments, one code change at a time, and reverts to the baseline between experiments.

Best for

Proving performance improvements with rigorous statistical evidence.

Inputs

· hypothesis ID
· single code change
· benchmark command

Outputs

· verdict (accept/reject/inconclusive)
· metrics delta
· evidence links
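
The card does not say how the verdict is derived from the metrics delta. The following is a minimal sketch of one possible rule, not the skill's actual logic. It assumes a lower-is-better metric such as latency; the function name `verdict` and the `noise_floor` threshold are hypothetical.

```python
from statistics import median

def verdict(baseline, candidate, noise_floor=0.02):
    """Return (verdict, relative_delta) for lower-is-better samples.

    noise_floor is a hypothetical threshold: relative changes smaller
    than this are treated as indistinguishable from noise.
    """
    base = median(baseline)
    cand = median(candidate)
    delta = (cand - base) / base  # negative means the candidate is faster
    if abs(delta) < noise_floor:
        return "inconclusive", delta
    return ("accept" if delta < 0 else "reject"), delta
```

Comparing medians is only a starting point. Evidence that deserves to be called rigorous would likely also need a significance test or a confidence interval over repeated runs.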

Requires

· benchmarking framework

Preconditions

Clean baseline state; single change per experiment.
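
One way to check the clean-baseline precondition is sketched below. It assumes the project is tracked in git, which the card does not state. The helper names are hypothetical, and `git status --porcelain` prints nothing when the working tree has no changes.

```python
import subprocess

def is_clean(porcelain_output: str) -> bool:
    """A working tree is clean when `git status --porcelain` prints nothing."""
    return porcelain_output.strip() == ""

def working_tree_is_clean(repo_dir: str = ".") -> bool:
    # Assumes git is installed and repo_dir is inside a git repository.
    result = subprocess.run(
        ["git", "status", "--porcelain"],
        cwd=repo_dir, capture_output=True, text=True, check=True,
    )
    return is_clean(result.stdout)
```

Refusing to start an experiment unless `working_tree_is_clean()` returns `True` keeps leftover edits from a previous experiment out of the baseline.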

Failure modes

· parallel benchmarks confound results
· baseline drift between passes
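
Baseline drift can be detected by measuring the baseline twice, once before and once after the candidate pass, and comparing the two. The sketch below is a minimal version of that check. The `tolerance` value and the function name are assumptions, not part of the skill. It also assumes the benchmark passes run sequentially, since parallel runs compete for CPU and cache.

```python
from statistics import median

def baseline_drifted(before, after, tolerance=0.03):
    """True if the baseline median moved by more than `tolerance` (relative)."""
    first = median(before)
    second = median(after)
    return abs(second - first) / first > tolerance
```

If the check returns `True`, the candidate measurements taken between the two baseline passes are suspect and the experiment is best rerun.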

Trust signals

· enforces revert-to-baseline protocol
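
The revert-to-baseline idea can be sketched as a context manager that undoes the change even when the benchmark fails. This illustrates the pattern only and is not the skill's implementation. `single_change`, `apply`, and `revert` are hypothetical names.

```python
from contextlib import contextmanager

@contextmanager
def single_change(apply, revert):
    """Apply one change, then always revert, even if the benchmark raises."""
    apply()
    try:
        yield
    finally:
        revert()

# Usage: the change is active only inside the `with` block.
config = {"cache_enabled": False}
with single_change(
    lambda: config.update(cache_enabled=True),
    lambda: config.update(cache_enabled=False),
):
    pass  # run the benchmark command here
```

Putting the revert in `finally` means a crashed benchmark cannot leave the change in place for the next experiment.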