Build state-space LLM architectures

mamba-architecture · skill · setup · 39,423
Orchestra-Research/AI-Research-SKILLs
What it does

Build and deploy Mamba state-space models, whose compute scales linearly, O(n), with sequence length.

Best for

Long sequences (100K+ tokens), streaming inference, or memory-constrained deployments where the quadratic scaling of Transformer attention becomes prohibitive.
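
To make the linear-scaling claim concrete, here is a minimal sketch of a toy diagonal state-space recurrence in plain Python. It is not Mamba's selective scan (the parameters here are fixed rather than input-dependent, and there is no hardware-aware kernel); the function name `ssm_scan` and the parameters `a`, `b`, `c` are illustrative assumptions. It only shows that each token costs one fixed-size state update, so total work grows as O(n).

```python
def ssm_scan(x, a, b, c):
    """Run a toy diagonal linear SSM over a scalar input sequence.

    h_t = a * h_{t-1} + b * x_t   (elementwise over the state dimension)
    y_t = sum(c * h_t)

    Illustrative only: Mamba makes these parameters depend on the input.
    """
    if not (len(a) == len(b) == len(c)):
        raise ValueError("a, b and c must share the same state dimension")
    h = [0.0] * len(a)  # fixed-size state, independent of sequence length
    ys = []
    for x_t in x:  # one constant-cost update per token, so O(n) overall
        h = [a_i * h_i + b_i * x_t for a_i, h_i, b_i in zip(a, h, b)]
        ys.append(sum(c_i * h_i for c_i, h_i in zip(c, h)))
    return ys


outputs = ssm_scan([1.0, 0.0, 0.0], a=[0.5, 0.0], b=[1.0, 1.0], c=[2.0, 1.0])
print(outputs)  # [3.0, 1.0, 0.5]
```

The state `h` never grows with the number of tokens processed, which is the property that streaming inference relies on.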

Inputs
  • model config (d_model, n_layer, d_state)
  • training data or checkpoint
  • inference prompt
Outputs
  • trained model
  • generated sequence
  • memory/latency benchmarks
Requires
  • mamba-ssm
  • torch
  • causal-conv1d
  • transformers
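
One way to confirm the required packages before starting is to look each one up without importing it. The sketch below uses only the standard library; the helper name `find_missing` is hypothetical, and the pip-name to module-name mapping (for example `mamba-ssm` imports as `mamba_ssm`) is stated here as an assumption about how these packages are laid out.

```python
from importlib.util import find_spec

# pip distribution name -> importable top-level module name (assumed mapping)
REQUIRED = {
    "mamba-ssm": "mamba_ssm",
    "torch": "torch",
    "causal-conv1d": "causal_conv1d",
    "transformers": "transformers",
}


def find_missing(packages=REQUIRED):
    """Return the pip names whose module cannot be located on this interpreter."""
    return [
        pip_name
        for pip_name, module_name in packages.items()
        if find_spec(module_name) is None
    ]


missing = find_missing()
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are importable.")
```

Because `find_spec` only locates a module, the check is fast and does not initialize CUDA. It also flags a missing causal-conv1d, which is the package named under failure modes.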
Preconditions
  • Linux with an NVIDIA GPU
  • CUDA 11.6+
  • PyTorch 1.12+
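
The preconditions above can be checked programmatically. The following is a sketch under stated assumptions: the helper names are hypothetical, version strings are compared only on their major and minor components, and the CUDA version inspected is the one PyTorch was built against (`torch.version.cuda`), which may differ from the system toolkit.

```python
import platform
import re


def parse_major_minor(version):
    """Extract (major, minor) from strings such as '2.1.0+cu118' or '11.8'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def meets_minimum(version, minimum):
    """True if the version is at least the (major, minor) minimum."""
    return parse_major_minor(version) >= minimum


def check_preconditions():
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    if platform.system() != "Linux":
        problems.append("not running on Linux")
    try:
        import torch
    except ImportError:
        problems.append("torch is not installed")
        return problems
    if not meets_minimum(torch.__version__, (1, 12)):
        problems.append(f"PyTorch {torch.__version__} is older than 1.12")
    if torch.version.cuda is None:
        problems.append("this PyTorch build has no CUDA support")
    elif not meets_minimum(torch.version.cuda, (11, 6)):
        problems.append(f"CUDA {torch.version.cuda} is older than 11.6")
    if not torch.cuda.is_available():
        problems.append("no CUDA device is visible to PyTorch")
    return problems


if __name__ == "__main__":
    for problem in check_preconditions():
        print("precondition not met:", problem)
```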
Failure modes
  • installation stalls during the source build
  • causal-conv1d missing: inference runs about 5× slower
Trust signals
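  • up to 5× higher inference throughput than Transformers of similar size, as reported by the Mamba authors
  • no KV cache: a fixed-size recurrent state gives O(1) memory per generated token
  • sequences of up to one million tokens reported by the Mamba authors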
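
The memory claim can be illustrated with back-of-the-envelope arithmetic. This is a simplified sketch, not a benchmark: it counts cached elements per sequence, assumes standard multi-head attention with keys and values of width d_model, and assumes each Mamba block keeps one SSM state and one convolution state with the default sizes from the mamba-ssm README example. Both function names are hypothetical.

```python
def kv_cache_elements(n_layer, d_model, seq_len):
    """Elements a Transformer caches: keys and values for every past token, per layer."""
    return 2 * n_layer * d_model * seq_len


def mamba_state_elements(n_layer, d_model, d_state=16, d_conv=4, expand=2):
    """Elements a Mamba stack keeps: per-layer SSM state plus convolution state.

    The count does not depend on how many tokens have been processed.
    """
    d_inner = expand * d_model
    return n_layer * d_inner * (d_state + d_conv)


for seq_len in (1_000, 100_000, 1_000_000):
    kv = kv_cache_elements(n_layer=24, d_model=768, seq_len=seq_len)
    ssm = mamba_state_elements(n_layer=24, d_model=768)
    print(f"{seq_len:>9} tokens: KV cache {kv:>14,} elements, Mamba state {ssm:,} elements")
```

Under these assumptions the Transformer cache grows in proportion to the sequence length while the Mamba state stays at the same size, which is what makes very long sequences feasible on a fixed memory budget.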