Optimize PyTorch model training
ml-training-recipes · skill · setup · L2 · ★9,423
Orchestra-Research/AI-Research-SKILLs
What it does
Execute battle-tested PyTorch training recipes across domains such as LLM, vision, diffusion, and medical imaging
Best for
Starting model training quickly with expert-vetted defaults instead of tuning from scratch
Inputs
- · domain type (LLM, vision, diffusion, etc.)
- · training config (lr, batch_size, epochs, etc.)
- · dataset
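One way to bundle these inputs is a small config object. The sketch below is illustrative only: `TrainingConfig`, its field names, and the `dataset_path` field are hypothetical and are not the skill's actual interface. It only checks that values are positive, since the recipe-specific ranges are not stated here.

```python
from dataclasses import dataclass


# Hypothetical container: field names mirror the inputs listed above,
# not the skill's real interface.
@dataclass(frozen=True)
class TrainingConfig:
    domain: str        # e.g. "llm", "vision", "diffusion"
    lr: float          # learning rate
    batch_size: int
    epochs: int
    dataset_path: str  # assumption: the dataset is referenced by a path

    def __post_init__(self):
        # Basic positivity checks only; real "sane ranges" depend on the recipe.
        if self.lr <= 0:
            raise ValueError("lr must be positive")
        if self.batch_size < 1:
            raise ValueError("batch_size must be >= 1")
        if self.epochs < 1:
            raise ValueError("epochs must be >= 1")


config = TrainingConfig(
    domain="vision", lr=3e-4, batch_size=32, epochs=10, dataset_path="data/train"
)
```

Making the dataclass frozen keeps the config from being mutated mid-run, which makes a logged config a reliable record of what was actually trained.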
Outputs
- · trained model checkpoint
- · training metrics (loss, accuracy, etc.)
- · validation results
Requires
- · torch
- · pytorch-lightning
- · transformers
- · domain-specific libraries
Preconditions
A GPU is available (NVIDIA CUDA or Apple Metal/MPS); datasets are prepared; hyperparameters are within sane ranges; memory is sufficient for the chosen batch_size
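The GPU precondition can be checked before a run starts. This is a minimal sketch, not part of the skill itself: `pick_device` is a hypothetical helper, and falling back to CPU when PyTorch or a GPU is missing is an assumption (a real recipe may prefer to abort instead).

```python
def pick_device() -> str:
    """Return "cuda", "mps", or "cpu" depending on what PyTorch can see."""
    try:
        import torch
    except ImportError:
        # Assumption: fall back rather than fail when torch is not installed.
        return "cpu"
    if torch.cuda.is_available():
        return "cuda"
    # torch.backends.mps is absent in older PyTorch builds, so look it up safely.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"


device = pick_device()
print(f"training device: {device}")
```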
Failure modes
NaN loss if the learning rate is too high; underfitting if there are too few epochs; overfitting if regularization is insufficient; out-of-memory (OOM) errors if batch_size is too large
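Two of these failure modes can be caught early with simple guards. The sketch below is an assumption-laden illustration rather than the recipes' own code: `assert_finite` and `run_with_batch_backoff` are hypothetical helpers, `train_step` stands in for whatever callable runs one training step, and the OOM check assumes the error surfaces as a `RuntimeError` whose message mentions "out of memory", as CUDA OOM errors in PyTorch typically do.

```python
import math


def assert_finite(loss: float, step: int) -> None:
    """Stop early when the loss becomes NaN or infinite."""
    if not math.isfinite(loss):
        raise FloatingPointError(
            f"non-finite loss {loss} at step {step}; try a lower learning rate"
        )


def run_with_batch_backoff(train_step, batch_size: int, min_batch_size: int = 1) -> int:
    """Call train_step(batch_size), halving batch_size after each OOM error.

    Returns the batch size that finally succeeded.
    """
    while True:
        try:
            train_step(batch_size)
            return batch_size
        except RuntimeError as err:
            is_oom = "out of memory" in str(err).lower()
            if not is_oom or batch_size <= min_batch_size:
                raise  # unrelated error, or nothing left to shrink
            batch_size = max(min_batch_size, batch_size // 2)
```

In a real training loop the loss would usually be read with `loss.item()` before the check, and halving the batch size changes the effective optimization settings, so the learning rate or gradient accumulation may need adjusting as well.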
Trust signals
- · Covers 8+ domains (including LLM, vision, diffusion, medical imaging, protein, spatial, and genomics)
- · "Battle-tested" here means the recipes have been used in multiple successful deployments
- · Recipes include a checkpointing strategy