The library
Everything we index — ranked by what works, never by stars.
WORKS 48★64WORKS 48★0WORKS 48★9,726WORKS 48★9,726WORKS 51★64WORKS 48★9,726WORKS 48★9,726WORKS 48★381WORKS 48★381WORKS 48★64WORKS 48★64WORKS 48★9,423WORKS 48★9,423WORKS 48★9,423WORKS 48★9,423WORKS 48★9,423WORKS 48★9,423WORKS 48★9,423WORKS 48★9,423WORKS 48★9,423WORKS 48★9,423WORKS 48★9,423WORKS 48★64WORKS 48★9,423WORKS 48★64WORKS 48★64WORKS 48★64WORKS 48★64WORKS 48★64WORKS 48★64WORKS 48★9,423WORKS 48★64WORKS 48★9,423WORKS 48★64WORKS 48★9,423WORKS 48★9,423WORKS 48★9,423WORKS 48★9,423WORKS 48★381WORKS 48★9,423WORKS 48★9,423WORKS 51★381WORKS 48★9,423WORKS 48★9,423WORKS 48★9,423WORKS 48★9,423WORKS 48★9,423WORKS 48★9,423WORKS 48★9,423WORKS 48★9,423
Master nutrition science fundamentals · skill · Data · L1
nutrition-science-foundations · When you need foundational knowledge of how macronutrients, micronutrients, and digestion work.
Engineer features from survey data · skill · Data · L2
skillx_class_aware_c2__feature_engineering__SKILL · Preparing respondent-keyed data for difference-in-differences causal analysis.
Build production ML pipelines · skill · Engineering, Data · L4
ml-pipeline · Best for designing end-to-end ML infrastructure with experiment tracking and pipelines.
Transform data with pandas · skill · Data · L2
pandas-pro · When manipulating large datasets with grouped aggregations, merges, time series resampling, or memory optimization.
Optimize database queries and schema · skill · Engineering, Data · L2
sql-patterns · When optimizing queries, designing schemas, or preventing SQL injection with parameterized queries.
Optimize PostgreSQL performance · skill · Engineering, Data · L2
postgres-pro · When tuning PostgreSQL for production performance via indexes, connection pooling, or replication strategy.
Write type-safe Python code · skill · Engineering, Data · L2
python-pro · When building production Python with strict typing, 80%+ test coverage, and async I/O optimization.
Define new data column conventions · skill · Data · L1
new-column · Schema evolution, adding new data fields, or incremental database schema changes.
Preview Denmark statistics tables · skill · Data · L1
tables · Organizing structured data where tables provide clarity and queryability.
Design and conduct psychology research · skill · Data · L1
research-methods-psych · Psychology research where internal/external validity and replicability are non-negotiable.
Find root causes with causal inference · skill · Ops, Data · L2
rca-causal-inference · Incidents with rich quantitative data where you need defensible, reproducible, mathematical causal claims (not just narrative RCA), especially when distinguishing multiple confounding factors.
Interpret 70B models without local GPU · skill · Engineering, Data · L3
nnsight-remote-interpretability · Running the same interpretability code on GPT-2 locally and Llama-405B remotely without code changes, enabling scalable mechanistic interpretability research on massive models.
Train sparse autoencoders to find features · skill · Engineering, Data · L3
sparse-autoencoder-training · Guidance for training and analyzing Sparse Autoencoders (SAEs) using SAELens to decompose neural network activations into interpretable f...
Reverse-engineer transformer internals · skill · Engineering, Data · L3
transformer-lens-interpretability · Guidance for mechanistic interpretability research using TransformerLens to inspect and manipulate transformer internals via HookPoints a...
Process ML datasets at scale · skill · Data, Engineering · L3
ray-data · Batch inference and preprocessing on 100GB+ datasets across multi-node clusters.
Fine-tune models with GRPO · skill · Engineering, Data · L3
grpo-rl-training · Teaching specific output formats (XML, JSON) and verifiable tasks without preference pairs.
Train large MoE models efficiently · skill · Engineering, Data · L4
miles-rl-training · Training 1TB+ MoE models with speculative RL for 25%+ rollout speedup.
Scale RLHF training with Ray · skill · Engineering, Data · L4
openrlhf-training · Scaling PPO/GRPO/RLOO/DPO training to 70B+ models with multi-node vLLM.
Align models with SimPO · skill · Engineering, Data · L3
simpo-training · Quick preference optimization without reward model or RL infrastructure.
Train GLM models with SLIME · skill · Engineering, Data · L4
slime-rl-training · Research-grade RL training with flexible reward functions and algorithm variants.
Train agents with TorchForge · skill · Engineering, Data · L4
torchforge-rl-training · PyTorch-native RL training with hardware acceleration and custom loss functions.
Fine-tune LLMs with TRL · skill · Engineering, Data · L3
fine-tuning-with-trl · Multi-phase RLHF pipelines (SFT→Reward→PPO) where you control each alignment stage.
Analyze scientific data and experiments · skill · Data · L2
data-analysis-sci · When you have raw measurements and need to extract honest conclusions with proper error analysis.
Scale RL training with VeRL · skill · Engineering, Data · L4
verl-rl-training · Production math/reasoning tasks (GSM8K, MATH) where you need proven RL algorithms at scale.
Design controlled experiments correctly · skill · Data · L1
experimental-design-sci · When you must isolate causal effects and design has constraints (ethics, cost, timescale).
Run systematic investigations reliably · skill · Data · L1
scientific-method · When teaching or designing any empirical study and need a repeatable framework.
Apply Bayesian methods to statistical inference · skill · Data · L1
bayesian-methods · Incorporating prior domain knowledge into inference and updating beliefs with small sample sizes where frequentist confidence intervals fail.
Summarize and visualize data distributions · skill · Data · L1
descriptive-statistics · Data exploration when hypothesis testing is premature and you need to understand the raw data distribution first.
Master statistical hypothesis testing · skill · Data · L1
inferential-statistics · Determining whether a sample observation generalizes to the population when descriptive statistics alone are insufficient.
Understand probability fundamentals · skill · Data · L1
probability-theory · Building rigorous statistical reasoning from first principles when intuitive probability reasoning fails.
Benchmark code generation models · skill · Engineering, Data · L3
evaluating-code-models · Comparing code model performance across standard benchmarks when a new architecture or training method is evaluated.
Build predictive regression models · skill · Data · L2
regression-modeling · Quantifying how variables influence outcomes when you need interpretable coefficients rather than pure prediction.
Evaluate LLM academic benchmarks · skill · Engineering, Data · L3
evaluating-llms-harness · Standardized model comparison using industry-standard benchmarks when you need reproducible academic metrics.
Run statistical simulations and analysis · skill · Data · L2
statistical-computing · Deriving confidence intervals and p-values when analytical formulas are unavailable or standard assumptions are violated.
Track ML experiments and models · skill · Engineering, Data · L2
mlflow · Managing dozens of experiments with repeatable comparison and governance when you need audit trails.
Track ML experiments locally · skill · Engineering, Data · L2
experiment-tracking-swanlab · Real-time experiment monitoring during training when live curves beat post-hoc analysis.
Visualize ML training metrics · skill · Engineering, Data · L2
tensorboard · When debugging deep learning models with real-time metric visualization and experiment comparison across runs.
Track and optimize ML experiments · skill · Engineering, Data · L2
weights-and-biases · When managing production ML experiments that need team collaboration, automatic metric logging, and hyperparameter optimization.
Query databases efficiently · skill · Engineering, Data · L1
sql-patterns · When using sql-patterns is more effective than generic alternatives.
Build RAG applications · skill · Engineering, Data · L3
llamaindex · When using llamaindex is more effective than generic alternatives.
Store and search embeddings · skill · Engineering, Data · L2
chroma · Open-source RAG prototyping where self-hosted storage is preferred over managed cloud.
Conduct comprehensive AI research · skill · Data, Marketing · L2
gemini-deep-research · Multi-source synthesis tasks where systematic web search + AI analysis beats single-prompt research.
Search billions of vectors fast · skill · Engineering, Data · L2
faiss · High-throughput vector search where metadata filtering is not required and GPU acceleration helps.
Generate text embeddings for semantic tasks · skill · Engineering, Data · L2
sentence-transformers · Text embedding when you need semantic vectors that are domain-tuned or when you want pure open-source.
Enforce structured output with grammars · skill · Engineering, Data · L2
guidance · Structured extraction when you need 100% format compliance and can define the grammar precisely.
Extract validated structured data reliably · skill · Engineering, Data · L2
instructor · Extraction tasks where you want automatic validation and retry without building your own harness.
Generate valid JSON and code structures · skill · Engineering, Data · L2
outlines · Batch generation where you need guaranteed valid JSON/SQL/code and sampling speed is critical.
Analyze images with vision-language models · skill · Product, Data · L2
blip-2-vision-language · Zero-shot image understanding tasks where training data is unavailable and frozen backbones reduce compute.
Match images to text semantically · skill · Product, Data · L2
clip · Quick zero-shot image classification and semantic search when training data is unavailable and the model is open-source.
Compile research into AI artifacts · skill · Product, Data · L3
ara-compiler · Ingest papers, code, and experiments into a falsifiable, agent-traversable knowledge package with cognitive and physical layers and provenance tracking.