cyberneticlibrary

Evaluate machine learning model performance

evaluate-model · skill · setup · L21
morganmuli/metaskill
What it does

Loads a model checkpoint and evaluates it on the test set.
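
A minimal sketch of that flow in PyTorch, for a classifier. The function name `evaluate`, the assumption that the checkpoint is a plain `state_dict`, and the choice of accuracy as the metric are illustrative and not taken from the skill itself.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader


def evaluate(model: nn.Module, checkpoint_path: str, loader: DataLoader,
             device: str = "cpu") -> dict:
    """Load weights from a checkpoint and collect predictions on a test loader."""
    # Assumption: the checkpoint was written with torch.save(model.state_dict(), path).
    state_dict = torch.load(checkpoint_path, map_location=device)
    model.load_state_dict(state_dict)
    model.to(device)
    model.eval()  # disable dropout and use running batch-norm statistics

    y_true, y_pred = [], []
    with torch.no_grad():  # no gradients are needed for evaluation
        for inputs, labels in loader:
            logits = model(inputs.to(device))
            y_pred.extend(logits.argmax(dim=1).cpu().tolist())
            y_true.extend(labels.tolist())

    correct = sum(int(t == p) for t, p in zip(y_true, y_pred))
    return {
        "accuracy": correct / max(len(y_true), 1),
        "y_true": y_true,
        "y_pred": y_pred,
    }
```

The returned label lists can then be passed to whatever reporting step produces the outputs listed further down.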

Best for

Standardized reporting of model evaluation results after training.

Inputs
  • model checkpoint path
  • test dataset directory
  • config YAML
Outputs
  • metrics.json
  • confusion matrix image
  • metric deltas against a baseline
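
As a rough illustration of how `metrics.json` might be produced with scikit-learn: the helper name `write_metrics`, the JSON keys, and the choice of accuracy and macro F1 are assumptions, since the card does not specify the file's schema. Rendering the confusion matrix as an image is left out here because it would need a plotting library that is not listed under Requires.

```python
import json
from pathlib import Path

from sklearn.metrics import accuracy_score, confusion_matrix, f1_score


def write_metrics(y_true, y_pred, out_path="metrics.json") -> dict:
    """Compute summary metrics and write them to a JSON file."""
    metrics = {
        "accuracy": float(accuracy_score(y_true, y_pred)),
        "macro_f1": float(f1_score(y_true, y_pred, average="macro")),
        # Rows are true classes, columns are predicted classes.
        "confusion_matrix": confusion_matrix(y_true, y_pred).tolist(),
    }
    Path(out_path).write_text(json.dumps(metrics, indent=2))
    return metrics
```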
Requires
  • PyTorch
  • scikit-learn
  • pandas
  • numpy
Preconditions

The checkpoint file exists and the test split is available.
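
These preconditions could be verified before any model code runs. The sketch below uses only the standard library; the function name `check_preconditions` and the rule that an empty test directory counts as unavailable are assumptions.

```python
from pathlib import Path


def check_preconditions(checkpoint: str, test_dir: str, config: str) -> list[str]:
    """Return a list of problems; an empty list means evaluation can start."""
    problems = []
    if not Path(checkpoint).is_file():
        problems.append(f"checkpoint not found: {checkpoint}")

    test_path = Path(test_dir)
    if not test_path.is_dir():
        problems.append(f"test dataset directory not found: {test_dir}")
    elif not any(test_path.iterdir()):
        # Assumption: an empty directory means the test split is missing.
        problems.append(f"test dataset directory is empty: {test_dir}")

    if not Path(config).is_file():
        problems.append(f"config file not found: {config}")
    return problems
```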

Failure modes
  • PyTorch version mismatch
  • missing test files
  • CUDA OOM
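
One common way to soften the CUDA OOM case is to retry with a smaller batch size. The helper below is a hypothetical sketch, not part of the skill: it assumes the evaluation step is exposed as a callable taking a batch size, and that an out-of-memory condition surfaces as a `RuntimeError` whose message contains "out of memory" (PyTorch's CUDA OOM error is a `RuntimeError` subclass with such a message).

```python
from typing import Callable, TypeVar

T = TypeVar("T")


def run_with_oom_backoff(run: Callable[[int], T], batch_size: int,
                         min_batch_size: int = 1) -> T:
    """Call run(batch_size), halving the batch size after each out-of-memory error."""
    while True:
        try:
            return run(batch_size)
        except RuntimeError as err:
            is_oom = "out of memory" in str(err).lower()
            # Re-raise unrelated errors, and give up once the floor is reached.
            if not is_oom or batch_size <= min_batch_size:
                raise
            batch_size = max(batch_size // 2, min_batch_size)
```

Halving keeps the number of retries logarithmic in the starting batch size; because evaluation does not update weights, changing the batch size should not change the metrics for models without batch-dependent layers in eval mode.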
Trust signals
  • baseline comparison
  • metric delta tracking
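
Baseline comparison can be as simple as subtracting stored baseline values from the current run's metrics. The sketch below assumes both are flat dictionaries of numeric values; the function name `metric_deltas` and that layout are illustrative, as the card does not define how baselines are stored.

```python
from numbers import Real


def metric_deltas(current: dict, baseline: dict) -> dict:
    """Return current minus baseline for every numeric metric present in both."""
    deltas = {}
    for name, value in current.items():
        base = baseline.get(name)
        # Skip metrics missing from the baseline and non-numeric entries
        # such as a nested confusion matrix.
        if isinstance(value, Real) and isinstance(base, Real):
            deltas[name] = value - base
    return deltas
```

A positive delta means the current run scored higher than the baseline; whether that is an improvement depends on the metric (higher is better for accuracy, lower for loss).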