Train models cleanly with PyTorch Lightning
pytorch-lightning · skill · setup · L2 · ★9,423
Orchestra-Research/AI-Research-SKILLs
What it does
Scaffold PyTorch training loops with automated distributed training, callbacks, and logging.
Best for
Scaling training from a laptop to multi-GPU or multi-node clusters without rewriting boilerplate, with built-in DDP, FSDP, and DeepSpeed strategies.
Inputs
- · PyTorch model
- · train/val data loaders
- · loss function, optimizer config
Outputs
- · LightningModule subclass
- · trained checkpoint
- · TensorBoard logs
Requires
- · lightning
- · torch
- · transformers
Preconditions
A PyTorch model is defined and the train/val data loaders are ready.
Failure modes
- · DDP synchronization is missed when training runs outside Trainer
- · lr_scheduler hook overridden with an incorrect signature
- · callbacks not properly integrated (for example, created but never passed to Trainer)
Trust signals
- · shows a 40+ line training loop reduced to 15 lines
- · automatic distributed support (DDP/FSDP/DeepSpeed)
- · callback ecosystem (ModelCheckpoint, EarlyStopping)