Reduce LLM latency and cost with caching
prompt-cachingskillsetup L2★0
Sheshiyer/skill-clusters ↗What it does
Cache LLM prompts and responses to reduce cost and latency
Best for
Repeated queries with stable context where latency and cost matter more than freshness.
Inputs
- · Prompt text
- · System instructions
- · Context documents
Outputs
- · Cached tokens metadata
- · Cost savings estimate
Requires
- · Anthropic API
- · Redis
- · OpenAI API
Preconditions
- · Stable prompt prefix or repeated queries
- · API access key
Failure modes
- · Cache invalidation on semantic drift
- · Stale cached responses
Trust signals
- · Anthropic native cache_control API
- · Code examples for CAG pattern
- · Redis integration patterns