cyberneticlibrary

Reduce LLM API costs

llm-cost-optimizer skill setup L117,464
alirezarezvani/claude-skills
What it does

Route LLM requests by complexity and cache static context
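As a rough illustration of routing by complexity and keying static context for a cache, here is a minimal sketch. It is not the skill's own implementation: the tier names, the keyword list, the 200-word threshold, and every function name are assumptions made for this example.

```python
import hashlib

# Hypothetical tier names; substitute the model identifiers you actually use.
TIERS = ("small", "medium", "large")

# Illustrative keywords that hint at multi-step reasoning (an assumption).
REASONING_HINTS = ("prove", "analyze", "refactor", "step by step", "compare")


def classify_complexity(prompt: str) -> int:
    """Return 0, 1, or 2 from a crude length-and-keyword score."""
    score = 0
    if len(prompt.split()) > 200:  # long prompts count as harder
        score += 1
    if any(hint in prompt.lower() for hint in REASONING_HINTS):
        score += 1
    return score


def route(prompt: str) -> str:
    """Pick the cheapest tier the heuristic considers adequate."""
    return TIERS[classify_complexity(prompt)]


def static_context_key(system_prompt: str) -> str:
    """Stable key for the unchanging part of a request, usable as a cache key."""
    return hashlib.sha256(system_prompt.encode("utf-8")).hexdigest()
```

A substring heuristic like this will misclassify some prompts (the routing failure mode listed further down), so in practice it would be checked against logged requests before being trusted.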

Best for

Large-scale AI systems where token efficiency directly impacts unit economics

Inputs
  • Current spend breakdown
  • Request volume profile
  • Latency constraints
  • Acceptable quality degradation
Outputs
  • Cost audit report
  • Routing strategy
  • Caching targets
  • Estimated savings
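One way the estimated-savings figure could be derived is sketched below. The model is an assumption made for illustration: a fraction of spend is rerouted to a cheaper tier, then a fraction of what remains is billed at a cached-token discount. The function and parameter names are hypothetical, and the numbers in the usage lines are placeholders, not measured results.

```python
def estimated_savings(baseline: float, routed_fraction: float,
                      cheap_price_ratio: float, cached_fraction: float,
                      cache_price_ratio: float) -> float:
    """Estimate savings in the same currency unit as `baseline`.

    routed_fraction:   share of spend moved to a cheaper tier (0..1)
    cheap_price_ratio: cheaper tier's price relative to the original (0..1)
    cached_fraction:   share of the remaining spend that hits the cache (0..1)
    cache_price_ratio: price of cached tokens relative to uncached (0..1)
    """
    after_routing = baseline * ((1 - routed_fraction)
                                + routed_fraction * cheap_price_ratio)
    after_caching = after_routing * ((1 - cached_fraction)
                                     + cached_fraction * cache_price_ratio)
    return baseline - after_caching


# Placeholder inputs: 10,000 per month, 60% routable at one fifth the price,
# 30% of the remainder cacheable at one tenth the price.
savings = estimated_savings(10_000, 0.6, 0.2, 0.3, 0.1)
print(f"Estimated monthly savings: {savings:.2f}")
```

Applying the two reductions in sequence avoids double-counting: the caching discount is taken on the spend that remains after routing, not on the original baseline.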
Requires
  • Anthropic API
  • OpenAI API
  • Google Gemini API
Preconditions
  • Per-request token logging implemented
  • Cost baseline established
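The two preconditions above could be met with something as small as the following sketch. The record fields, the tier names, and the prices are assumptions for illustration only; real prices should come from each provider's current price list.

```python
from dataclasses import dataclass


@dataclass
class RequestLog:
    """One logged request: which model served it and how many tokens it used."""
    model: str
    input_tokens: int
    output_tokens: int


# Hypothetical prices in USD per million tokens: (input, output).
PRICES = {"small": (0.5, 1.5), "large": (5.0, 15.0)}


def request_cost(log: RequestLog) -> float:
    """Cost of a single request in USD."""
    in_price, out_price = PRICES[log.model]
    return (log.input_tokens * in_price
            + log.output_tokens * out_price) / 1_000_000


def cost_baseline(logs: list[RequestLog]) -> dict[str, float]:
    """Total spend per model, the baseline any later saving is measured against."""
    totals: dict[str, float] = {}
    for log in logs:
        totals[log.model] = totals.get(log.model, 0.0) + request_cost(log)
    return totals
```

Keeping input and output tokens separate matters because the two are usually priced differently, and caching only reduces the input side.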
Failure modes
  • Over-compressed prompts cause hallucinations
  • Routing mispredicts task complexity
  • Cache misses increase latency
Trust signals
  • Tested cost reduction patterns
  • Complexity classification framework
  • Model tier recommendations