Run local inference without cloud costs
local-llm-bridgeskillsetup L3★3
richfrem/agent-plugins-skills ↗What it does
Route bounded tasks to local Gemma LLM
Best for
Sub-second bounded tasks using local Gemma without cloud latency
Inputs
- · Task prompt
- · Persona
Outputs
- · Task output file
Requires
- · llama-server
- · Python
Preconditions
llama-server running on localhost:8089
Failure modes
llama-server not running; command fails
Trust signals
- · Measured 2s latency for typical tasks
- · KV cache orchestration included