Extend model context windows to 32k-128k tokens
long-context · skill · setup · L3 · ★9,423
Orchestra-Research/AI-Research-SKILLs
What it does
Extend LLM context windows beyond pre-trained limits
Best for
Processing long documents (32k-128k+ tokens) by extending pre-trained models with RoPE/YaRN interpolation.
Inputs
- · base model
- · rope_scaling config
- · long documents (32k+ tokens)
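A rough sketch of how the rope_scaling input might be derived from the base model's trained length and the target length. The helper name `build_rope_scaling` is hypothetical, and the key names (`rope_type`, `factor`, `original_max_position_embeddings`) are an assumption modeled on the Hugging Face transformers convention; they vary between library versions and model families, so check the config class of the model you are extending.

```python
def build_rope_scaling(original_max_len: int, target_len: int, method: str = "yarn") -> dict:
    """Hypothetical helper: build a rope_scaling-style dict.

    The key names are assumed, not confirmed by this card.
    """
    if target_len <= original_max_len:
        raise ValueError("target_len must exceed original_max_len")
    # The scaling factor is the ratio of the target to the trained context length.
    factor = target_len / original_max_len
    return {
        "rope_type": method,
        "factor": factor,
        "original_max_position_embeddings": original_max_len,
    }


# Example: a model trained on 4,096 tokens, extended to 32,768 tokens.
cfg = build_rope_scaling(4096, 32768)
print(cfg)
```

Keeping the factor as an explicit ratio makes it easy to see how far past the trained range the model is being pushed.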
Outputs
- · extended-context model
- · RoPE/YaRN embeddings
Requires
- · transformers
- · torch
- · flash-attn (optional)
Preconditions
- · base model compatible
- · position encoding updateable
Failure modes
- · extrapolation artifacts
- · attention OOM on very long sequences
- · rope_scaling config incompatible with the model
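To make the attention OOM failure mode concrete, here is a back-of-the-envelope estimate of the memory needed if the full attention score matrix is materialized for a single layer. The function name and the default values (fp16 scores, batch size 1) are illustrative assumptions; fused kernels such as flash-attn avoid storing the full matrix, so this is an upper bound for naive attention only.

```python
def attention_scores_bytes(seq_len: int, num_heads: int,
                           batch_size: int = 1, bytes_per_elem: int = 2) -> int:
    """Bytes for one layer's full (seq_len x seq_len) attention score matrix.

    Assumes naive attention with fp16 scores (2 bytes per element).
    """
    return batch_size * num_heads * seq_len * seq_len * bytes_per_elem


# Memory grows quadratically with sequence length.
for n in (4096, 32768, 131072):
    gib = attention_scores_bytes(n, num_heads=32) / 2**30
    print(f"{n:>7} tokens: {gib:,.0f} GiB per layer")
```

With 32 heads, going from 4,096 to 32,768 tokens multiplies this figure by 64, which is why memory-efficient attention is usually needed well before 128k tokens.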
Trust signals
- · RoPE: decaying inter-token dependency
- · YaRN: NTK-aware interpolation
- · ALiBi: extrapolates to longer inputs without retraining (for models trained with ALiBi)
- · position interpolation: shown to extend RoPE-based models
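The idea behind linear position interpolation can be sketched in a few lines: positions are divided by the scaling factor before the RoPE angles are computed, so a long sequence is squeezed into the position range the model saw during training. This is a minimal illustration of plain linear interpolation, not YaRN, which scales frequency bands differently; `rope_angles` and its parameters are hypothetical names chosen for this sketch.

```python
def rope_angles(position: int, head_dim: int,
                base: float = 10000.0, scale: float = 1.0) -> list:
    """Rotation angles for one position, one per pair of head dimensions.

    scale > 1 applies linear position interpolation (position / scale).
    """
    pos = position / scale
    # Standard RoPE frequencies: base^(-2i / head_dim) for each dimension pair i.
    return [pos / base ** (2 * i / head_dim) for i in range(head_dim // 2)]


# With scale=8, position 32768 gets the same angles as position 4096 unscaled,
# so it stays inside a 4096-token trained range.
extended = rope_angles(32768, head_dim=64, scale=8.0)
trained = rope_angles(4096, head_dim=64)
print(extended == trained)
```

The trade-off is that neighbouring tokens end up closer together in angle space, which is why a short fine-tuning run is commonly used after changing the scale.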