cyberneticlibrary

Monitor LLM traces and evals in production

phoenix-observability · skill · setup · 29,423
Orchestra-Research/AI-Research-SKILLs
What it does

Traces, evaluates, and monitors LLM applications using Arize Phoenix observability tooling.

Best for

Production LLM systems needing detailed observability without vendor lock-in or cost overhead.

Inputs
  • LLM framework code (OpenAI, LangChain, LlamaIndex, Anthropic)
  • Evaluation datasets
  • Custom evaluator functions
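
To make "custom evaluator functions" concrete, here is a minimal sketch of what one might look like, run against a tiny evaluation dataset. The names `exact_match` and `mean_score`, the `output`/`expected` row keys, and the use of plain strings are assumptions for illustration only; this is plain Python, not the Phoenix evaluation API, and a real setup would hand a function of this kind to the tooling rather than call it directly.

```python
from typing import Callable, Dict, List


def _normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial formatting differences are ignored."""
    return " ".join(text.lower().split())


def exact_match(output: str, expected: str) -> float:
    """Hypothetical evaluator: 1.0 if the normalized strings are equal, else 0.0."""
    return 1.0 if _normalize(output) == _normalize(expected) else 0.0


def mean_score(rows: List[Dict[str, str]], evaluator: Callable[[str, str], float]) -> float:
    """Apply an evaluator to every row and average the scores (0.0 for an empty dataset)."""
    if not rows:
        return 0.0
    scores = [evaluator(row["output"], row["expected"]) for row in rows]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Illustrative rows only; not real evaluation data.
    dataset = [
        {"output": "  Paris ", "expected": "paris"},
        {"output": "Lyon", "expected": "Paris"},
    ]
    print(mean_score(dataset, exact_match))
```

Keeping the evaluator a pure function of the model output and the expected answer makes it easy to unit-test before it is attached to any tracing or experiment run.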
Outputs
  • Trace visualizations in web UI
  • Evaluation scorecards
  • Real-time monitoring dashboards
  • Experiment comparison reports
Requires
  • arize-phoenix (12.0+)
  • OpenTelemetry
  • PostgreSQL or SQLite backend
  • OpenAI/LangChain/LlamaIndex SDKs
Preconditions
  • Python 3.8+
  • GPU optional
  • Self-hosted or cloud deployment
Failure modes
  • Traces are lost if the Phoenix server is not running
  • Incorrect instrumentation causes spans to be skipped
  • Large-scale tracing may slow inference
  • Production use requires scaling the database
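
The first failure mode above can be guarded against with a pre-flight check before the application starts emitting traces. The sketch below uses only the standard library; `collector_reachable` is a hypothetical helper, and the `localhost:6006` default assumes a local Phoenix server on its usual port, which may differ in a given deployment.

```python
import socket


def collector_reachable(host: str = "localhost", port: int = 6006, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to the trace collector can be opened.

    Hypothetical helper: it only checks that something is listening on the port,
    not that the listener is actually a Phoenix server.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unresolved hosts.
        return False


if __name__ == "__main__":
    if collector_reachable():
        print("Collector is reachable; traces can be exported.")
    else:
        print("Collector is not reachable; traces sent now would be lost.")
```

A check like this only catches the server being down at startup; it does not help if the server stops later, so it complements rather than replaces monitoring of the collector itself.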
Trust signals
  • OpenTelemetry standard instrumentation
  • Self-hosted control
  • MIT licensed
  • Used for framework-wide tracing integration