cyberneticlibrary

Operationalize LLM engineering with model routing

agentic-engineering · skill · setup · L20
Sheshiyer/skill-clusters
What it does

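Executes LLM engineering work eval-first: evals are defined before implementation begins, and tasks are routed to a model tier according to their complexity.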

Best for

Engineering workflows where AI agents perform most of the implementation and evals enforce the quality gates.

Inputs
  • Completion criteria
  • Implementation task
  • Baseline evals
Outputs
  • Agent-executed implementation
  • Eval comparisons
  • Regression report
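
As an illustration of the eval comparison and regression report outputs, here is a minimal sketch in Python. It assumes each eval produces a single numeric score where higher is better; the function name `regression_report`, the `tolerance` parameter, and the eval names are hypothetical and do not come from the skill itself.

```python
def regression_report(
    baseline: dict[str, float],
    current: dict[str, float],
    tolerance: float = 0.0,
) -> dict[str, list[str]]:
    """Compare current eval scores against the baseline, eval by eval.

    Assumption: higher scores are better, and an eval that is present in
    the baseline but absent from the current run is reported as missing.
    """
    report: dict[str, list[str]] = {
        "regressed": [], "improved": [], "unchanged": [], "missing": []
    }
    for name, base_score in baseline.items():
        if name not in current:
            report["missing"].append(name)
            continue
        delta = current[name] - base_score
        if delta < -tolerance:
            report["regressed"].append(name)
        elif delta > tolerance:
            report["improved"].append(name)
        else:
            report["unchanged"].append(name)
    return report


def gate_passes(report: dict[str, list[str]]) -> bool:
    """Quality gate: fail on any regressed or missing eval."""
    return not report["regressed"] and not report["missing"]


# Hypothetical eval names and scores, for illustration only.
baseline = {"unit_tests": 1.0, "latency_budget": 0.8, "style": 0.9}
current = {"unit_tests": 1.0, "latency_budget": 0.7}

report = regression_report(baseline, current)
print(report)
print("gate passes:", gate_passes(report))
```

Treating a missing eval as a gate failure is a design choice in this sketch: an eval that silently stops running gives no regression signal.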
Preconditions

Define the evals before execution starts, and decompose the task into 15-minute units.
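
The two preconditions can be checked mechanically before any agent starts work. The sketch below is one possible shape, not the skill's own implementation: the `WorkUnit` structure, its field names, and the example units are assumptions made for illustration.

```python
from dataclasses import dataclass, field

MAX_UNIT_MINUTES = 15  # unit size named in the preconditions


@dataclass
class WorkUnit:
    """One slice of the implementation task (hypothetical structure)."""
    name: str
    estimated_minutes: int
    eval_names: list[str] = field(default_factory=list)


def check_preconditions(units: list[WorkUnit]) -> list[str]:
    """Return a list of problems; an empty list means execution may start."""
    problems = []
    for unit in units:
        if not unit.eval_names:
            problems.append(f"{unit.name}: no evals defined")
        if unit.estimated_minutes > MAX_UNIT_MINUTES:
            problems.append(
                f"{unit.name}: {unit.estimated_minutes} min exceeds the "
                f"{MAX_UNIT_MINUTES}-minute limit; split it further"
            )
    return problems


# Hypothetical units, for illustration only.
units = [
    WorkUnit("parse-config", 10, ["config_roundtrip"]),
    WorkUnit("migrate-schema", 40, []),
]
for problem in check_preconditions(units):
    print(problem)
```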

Failure modes

Without evals there is no regression signal; coupling between units slows iteration.

Trust signals
  • Eval-first discipline
  • Model-tier routing by complexity
  • Regression baseline checks
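
Model-tier routing by complexity could look roughly like the following Python sketch. The source only states that routing depends on complexity; the tier names, the scoring heuristic, and the thresholds here are all assumptions chosen for illustration.

```python
from enum import Enum


class Tier(Enum):
    """Hypothetical model tiers; the skill does not name its tiers."""
    SMALL = "small"
    MEDIUM = "medium"
    LARGE = "large"


def complexity_score(files_touched: int, needs_design_decision: bool) -> int:
    """Toy heuristic (an assumption): more files and open design
    questions make a unit more complex."""
    score = min(files_touched, 5)
    if needs_design_decision:
        score += 5
    return score


def route(score: int) -> Tier:
    """Map a complexity score to a tier using illustrative thresholds."""
    if score <= 2:
        return Tier.SMALL
    if score <= 5:
        return Tier.MEDIUM
    return Tier.LARGE


print(route(complexity_score(files_touched=1, needs_design_decision=False)))
print(route(complexity_score(files_touched=4, needs_design_decision=False)))
print(route(complexity_score(files_touched=2, needs_design_decision=True)))
```

Keeping the score and the routing in separate functions lets the thresholds be tuned against eval results without touching the heuristic.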