Implement GPT from scratch
nanogpt · skill · setup · L2 · ★ 9,423
Orchestra-Research/AI-Research-SKILLs
What it does
Train GPT models from scratch in ~300 lines of readable, hackable PyTorch
Best for
Learning transformer internals, quick prototyping, or experimenting with variants without framework overhead.
Inputs
- · text dataset
- · config (batch_size, learning_rate, n_layer, n_head, n_embd)
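As a rough illustration of the config input, the five fields listed above could be grouped in a small dataclass. This is a sketch, not the project's actual config code: the class name `TrainConfig`, the `head_dim` helper, and every default value are assumptions chosen for illustration. The divisibility check reflects the general requirement that multi-head attention splits `n_embd` evenly across `n_head` heads.

```python
from dataclasses import dataclass


@dataclass
class TrainConfig:
    """Hypothetical container for the config fields named in Inputs."""

    # All defaults below are illustrative assumptions; tune them for
    # your dataset and hardware.
    batch_size: int = 12
    learning_rate: float = 6e-4
    n_layer: int = 12
    n_head: int = 12
    n_embd: int = 768

    def __post_init__(self) -> None:
        # Each attention head gets an equal slice of the embedding.
        if self.n_embd % self.n_head != 0:
            raise ValueError("n_embd must be divisible by n_head")

    @property
    def head_dim(self) -> int:
        """Width of a single attention head."""
        return self.n_embd // self.n_head


if __name__ == "__main__":
    cfg = TrainConfig(n_layer=6, n_head=6, n_embd=384)
    print(cfg, cfg.head_dim)
```

Validating the shape constraints at construction time surfaces a bad `n_head`/`n_embd` pair before any model is built.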
Outputs
- · model checkpoint
- · generated text samples
Requires
- · torch
- · numpy
- · transformers
- · datasets
- · tiktoken
- · wandb
Preconditions
- · PyTorch installed
- · CPU or GPU available
Failure modes
- · CUDA OOM if batch_size is too large for the available GPU memory
- · poor generation quality if max_iters is too low for the model to converge
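One common way to handle the OOM failure mode without changing the effective batch size is gradient accumulation: shrink the per-step batch and accumulate gradients over more steps. The helper below is a minimal sketch of that bookkeeping only. The function name `shrink_batch` and the `grad_accum_steps` parameter are hypothetical, and the training loop that would divide the loss by `grad_accum_steps` and step the optimizer once per accumulation cycle is assumed rather than shown.

```python
def shrink_batch(batch_size: int, grad_accum_steps: int = 1) -> tuple[int, int]:
    """Halve the per-step batch and double the accumulation steps.

    The effective batch size (batch_size * grad_accum_steps) is unchanged,
    while the activation memory needed per forward/backward pass drops.
    Hypothetical helper for illustration.
    """
    if batch_size < 2 or batch_size % 2 != 0:
        raise ValueError("batch_size must be an even number >= 2 to halve it")
    return batch_size // 2, grad_accum_steps * 2


if __name__ == "__main__":
    # Example: back off twice after repeated out-of-memory errors.
    bs, accum = 32, 1
    for _ in range(2):
        bs, accum = shrink_batch(bs, accum)
    print(bs, accum)
```

Keeping the product of the two values constant means learning-rate settings tuned for the original batch size should remain roughly applicable, at the cost of more forward/backward passes per optimizer step.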
Trust signals
- · ~300-line codebase (no abstractions)
- · reproduces GPT-2 124M
- · Andrej Karpathy design ethos