Track ML experiments with Hugging Face Trackio
hugging-face-trackio · skill · setup · L2 · ★0
Sheshiyer/skill-clusters
What it does
Monitor live training jobs through real-time metrics (loss, throughput, hardware utilization)
Best for
Long training jobs where you need real-time visibility into convergence, hardware utilization, and early-stopping signals.
Inputs
- · job ID
- · Trackio API token
Outputs
- · live dashboard
- · training curves
- · hardware utilization
- · alerts on anomalies
Requires
- · Trackio
- · Hugging Face Jobs API
Preconditions
Job submitted to Hugging Face Jobs; Trackio integrated into the training script
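As a minimal sketch of the second precondition, a training loop can report metrics through Trackio's wandb-style `init` / `log` / `finish` calls. The project name, config values, metric names, and the `fake_step` helper are hypothetical placeholders. The loop takes its logging function as a parameter so it can be exercised without Trackio installed.

```python
import math
from typing import Callable, Dict, List


def fake_step(step: int) -> Dict[str, float]:
    """Hypothetical stand-in for one training step; returns metrics to log."""
    return {"train/loss": math.exp(-0.1 * step), "step": float(step)}


def train(num_steps: int,
          log: Callable[[Dict[str, float]], None]) -> List[Dict[str, float]]:
    """Run the loop and pass each step's metrics to the given logger."""
    history = []
    for step in range(num_steps):
        metrics = fake_step(step)
        log(metrics)
        history.append(metrics)
    return history


def run_with_trackio(num_steps: int = 100) -> None:
    """Wire the loop to Trackio (assumes `pip install trackio`)."""
    import trackio  # imported lazily so the sketch loads without Trackio

    # "my-training-run" is a hypothetical project name.
    trackio.init(project="my-training-run", config={"steps": num_steps})
    try:
        train(num_steps, trackio.log)
    finally:
        trackio.finish()  # close the run even if training raises
```

Calling `run_with_trackio()` inside the submitted job is one way to satisfy the precondition; any callable that accepts a metrics dictionary can replace `trackio.log` for local dry runs.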
Failure modes
- · metrics lag ≥5 minutes
- · dashboard times out if the job ID is invalid
- · alerts trigger on harmless fluctuations
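One way to reduce alerts on harmless fluctuations is to smooth the metric before thresholding. The sketch below is an assumption about how such a check could look, not Trackio's own alerting logic. It compares each loss value against an exponential moving average and raises an alert only when the value stays above that average by a relative margin for several consecutive steps. The parameter names `alpha`, `rel_margin`, and `patience` are hypothetical, and the check assumes positive loss values.

```python
from typing import List, Sequence


def find_anomalies(losses: Sequence[float], alpha: float = 0.1,
                   rel_margin: float = 0.5, patience: int = 3) -> List[int]:
    """Return the steps at which loss has exceeded EMA * (1 + rel_margin)
    for `patience` consecutive steps (one alert per sustained excursion)."""
    if not losses:
        return []
    ema = losses[0]  # seed the moving average with the first observation
    streak = 0
    alerts = []
    for step, loss in enumerate(losses):
        if loss > ema * (1.0 + rel_margin):
            streak += 1
            if streak == patience:
                alerts.append(step)
            # The EMA is deliberately not updated with suspect values,
            # so a spike cannot raise its own threshold.
        else:
            streak = 0
            ema = alpha * loss + (1.0 - alpha) * ema
    return alerts
```

With the defaults, a single spike is ignored while a spike that persists for three steps is reported once. A larger `patience` trades alert latency for fewer false positives.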
Trust signals
- · integrated into HF model-trainer skill example scripts