Scale neural network training
pytorch-lightning · skill · setup · L3 · ★27,559
K-Dense-AI/scientific-agent-skills
What it does
Trains neural networks with PyTorch Lightning, including distributed (multi-GPU) training and experiment tracking.
Best for
When you want to scale PyTorch training across GPUs without rewriting boilerplate.
Inputs
- PyTorch model
- DataLoader
- loss function
- optimizer
Outputs
- trained model checkpoint
- metrics logs
- inference-ready weights
Requires
- Python
- PyTorch Lightning
- PyTorch
- optional: TensorBoard / Weights & Biases
Preconditions
- data wrapped in a PyTorch DataLoader
- model subclasses LightningModule
Failure modes
- out-of-memory errors during distributed training
- learning rate set too high (divergence) or too low (slow convergence)
Trust signals
- automatic mixed precision (AMP)
- checkpoint/resume
- gradient accumulation
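
As a rough sketch of how the three features above might be combined, the following builds `Trainer` arguments for multi-GPU training. It assumes Lightning 2.x (where mixed precision is requested with `precision="16-mixed"`; 1.x releases used `precision=16`). The helper names `build_trainer_kwargs`, `effective_batch_size`, and `train`, the `checkpoints/` directory, and the default values are hypothetical.

```python
def build_trainer_kwargs(num_gpus: int = 4, accumulate: int = 4) -> dict:
    """Keyword arguments for a Lightning Trainer (assumed Lightning 2.x names)."""
    return {
        "accelerator": "gpu",
        "devices": num_gpus,
        "strategy": "ddp",                      # one process per GPU
        "precision": "16-mixed",                # automatic mixed precision
        "accumulate_grad_batches": accumulate,  # gradient accumulation
        "max_epochs": 10,
    }


def effective_batch_size(per_device_batch: int, num_devices: int,
                         accumulate: int) -> int:
    """Samples contributing to each optimizer step under DDP."""
    return per_device_batch * num_devices * accumulate


def train(model, train_loader, ckpt_path=None):
    """Fit `model`; pass ckpt_path to resume from a saved checkpoint."""
    # Imported lazily so the helpers above work without Lightning installed.
    import lightning.pytorch as pl
    from lightning.pytorch.callbacks import ModelCheckpoint

    # save_last=True keeps a checkpoint of the most recent state for resuming.
    checkpoint_cb = ModelCheckpoint(dirpath="checkpoints/", save_last=True)
    trainer = pl.Trainer(callbacks=[checkpoint_cb], **build_trainer_kwargs())
    trainer.fit(model, train_dataloaders=train_loader, ckpt_path=ckpt_path)
    return trainer
```

One common response to the out-of-memory failure mode is to shrink the per-device batch and raise accumulation by the same factor: `effective_batch_size(32, 4, 4)` and `effective_batch_size(8, 4, 16)` both give 512, so the optimizer sees the same number of samples per step while each GPU holds a quarter of the activations. The learning rate is typically tuned against this effective batch size.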