Install local LLM inference server

local-llm-setup · skill · setup · L33
richfrem/agent-plugins-skills
What it does

Bootstraps a local Gemma inference stack: downloads the model and starts llama-server.

Best for

Day-1 bootstrap or reconfiguration of local LLM inference

Inputs
  • System type
  • GPU type
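
A minimal sketch of how the two inputs might be detected automatically. The heuristics here are assumptions for illustration, not the skill's actual logic: the function name `detect_backend` is hypothetical, and it guesses the backend from the OS and from whether `nvidia-smi` or `rocminfo` is on PATH.

```python
import platform
import shutil


def detect_backend(system=None, machine=None, which=shutil.which):
    """Guess an acceleration backend label from the OS and tools on PATH.

    Illustrative heuristic only; the skill's real detection may differ.
    """
    system = system or platform.system()
    machine = machine or platform.machine()
    # Assumption: Apple Silicon Macs use Metal.
    if system == "Darwin" and machine == "arm64":
        return "metal"
    # Assumption: nvidia-smi on PATH implies a usable NVIDIA driver.
    if which("nvidia-smi"):
        return "cuda"
    # Assumption: rocminfo on PATH implies a ROCm installation.
    if which("rocminfo"):
        return "rocm"
    return "cpu"


if __name__ == "__main__":
    print(platform.system(), detect_backend())
```

The `system`, `machine`, and `which` parameters exist only so the function can be exercised without the real hardware.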
Outputs
  • Running llama-server
  • Model downloaded
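
One way to confirm the first output is to poll the server's health endpoint. The sketch below assumes llama-server is listening on `http://127.0.0.1:8080` and answers `GET /health` with HTTP 200 once the model is loaded; the address and the helper name `is_server_ready` are assumptions, so adjust them to the actual configuration.

```python
import urllib.request


def is_server_ready(base_url="http://127.0.0.1:8080", urlopen=urllib.request.urlopen):
    """Return True if GET <base_url>/health answers with HTTP 200."""
    try:
        with urlopen(base_url.rstrip("/") + "/health", timeout=2) as response:
            return response.status == 200
    except OSError:
        # Covers refused connections, timeouts, and HTTP error statuses
        # (urllib raises HTTPError, an OSError subclass, for 4xx/5xx).
        return False


if __name__ == "__main__":
    print("ready" if is_server_ready() else "not ready")
```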
Requires
  • llama.cpp
  • launchd/systemd
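
The launchd/systemd requirement suggests the service manager is chosen per platform. A rough sketch of that choice, assuming launchd on macOS and systemd on Linux hosts where `systemctl` is on PATH (the function name `pick_service_manager` is hypothetical):

```python
import platform
import shutil


def pick_service_manager(system=None, which=shutil.which):
    """Return 'launchd', 'systemd', or None if neither seems available."""
    system = system or platform.system()
    if system == "Darwin":
        # launchd ships with macOS, so no PATH probe is needed.
        return "launchd"
    if system == "Linux" and which("systemctl"):
        # Assumption: systemctl on PATH means systemd manages services.
        return "systemd"
    return None


if __name__ == "__main__":
    print(pick_service_manager())
```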
Preconditions

Bash and Python 3 available
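
A preflight check for these preconditions could look like the sketch below. It only tests whether `bash` and `python3` resolve on PATH; the helper name `missing_prerequisites` is an assumption, and the real skill may check more (or differently).

```python
import shutil


def missing_prerequisites(which=shutil.which):
    """Return the required commands that are not found on PATH."""
    required = ("bash", "python3")
    return [name for name in required if which(name) is None]


if __name__ == "__main__":
    missing = missing_prerequisites()
    if missing:
        raise SystemExit("Missing prerequisites: " + ", ".join(missing))
    print("Preconditions met")
```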

Failure modes

No GPU available: the setup falls back to CPU inference.
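
The CPU fallback can be pictured as a change in the launch arguments. This sketch builds a llama-server command line using the `-m`, `--port`, and `-ngl` (GPU layers) flags; the backend labels, the value 99 for "offload everything", the model path, and the helper name `build_server_command` are illustrative assumptions rather than the skill's actual parameters.

```python
def build_server_command(model_path, backend, port=8080):
    """Build a llama-server argument list, offloading to the GPU if present."""
    # Assumption: 0 keeps all layers on the CPU; a large number such as 99
    # asks llama.cpp to offload as many layers as the model has.
    gpu_layers = 0 if backend == "cpu" else 99
    return [
        "llama-server",
        "-m", model_path,
        "--port", str(port),
        "-ngl", str(gpu_layers),
    ]


if __name__ == "__main__":
    # Hypothetical model path, for illustration only.
    print(" ".join(build_server_command("models/gemma.gguf", "cpu")))
```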

Trust signals
  • Cross-platform Metal/CUDA/ROCm support
  • llama-server authoritative params