cyberneticlibrary

The library

Everything we index — ranked by what works, never by stars.

WORKS 57
Score items with adversarial voting · workflow · Data · Ops · L3
wf-swarm-score · Verifying adversarial score spreads and detecting hallucinating ants via 3-role consensus.
WORKS 55
Synthesize multi-source research with verification · workflow · Ops · Data · L3
research · Exploratory research with parallel hypothesis agents and cross-check synthesis.
WORKS 57
Grade adversarial test corpus with consensus · workflow · Engineering · Data · L3
ix-adversarial-llm-panel · Complex workflows requiring parallel agents, synthesized judgment, or multi-phase triage.
WORKS 54
Measure software performance comprehensively · workflow · Engineering · Data · L3
measure-performance · Perf fixes where file groups are disjoint and can be edited in parallel without race conditions.
WORKS 55
Analyze speech synthesis gaps per language · workflow · Engineering · Data · L3
espeak-audit-all-langs · Finding subtle bugs or policy violations across large codebases that require consistent multi-agent consensus.
WORKS 55
Compare TTS voice quality with human audio · workflow · Engineering · Data · L3
espeak-audit-tts-vs-human · Finding subtle bugs or policy violations across large codebases that require consistent multi-agent consensus.
WORKS 58
Research topics with citation verification · workflow · Data · L3
investigate · Orchestrate a multi-agent investigation across parallel branches, followed by synthesis.
WORKS 55
Detect and classify unknown platforms · workflow · Data · L3
detect-platforms · Workflow automation in a specialized problem domain; check the artifact name and description.
WORKS 60
Extract evidence from research papers · workflow · Data · L2
fieldatlas-deepread · Literature review requiring faithful extraction with verbatim evidence from long academic papers.
WORKS 59
Tag and discover benchmark tasks · workflow · Product · Data · L2
enrich-task-tags · Batch tagging homogeneous content (e.g., all AL benchmark tasks) against controlled vocabulary.
WORKS 55
Extract structured data from Japanese PDF vocabulary · workflow · Data · L2
vocab-extract · Best for automating complex multi-phase vocab-extract processes at scale.
WORKS 55
Extract and deduplicate reusable judgment patterns · workflow · Data · L3
compound-extract · Mining reusable patterns from code changes while deduplicating existing knowledge.
WORKS 59
Run daily multi-tier model performance arena · workflow · Data · L3
model-arena-daily · Benchmarking multi-tier LLM responses against a canonical prompt with regression detection and cost-efficiency scoring.
WORKS 59
Generate state-of-field research report · workflow · Product · Data · L3
fieldatlas-synth · Grounding research idea generation in corpus citations with novelty-checking and feasibility critique.
WORKS 54
Compare AI models daily · workflow · Data · Product · L3
model-arena-daily · Daily intelligence on which Claude tier is best for each task type. Cost-disciplined by design: 3 generators + 1 judge =
WORKS 58
Audit research findings for completeness · workflow · Data · L3
grounding-audit · Parallel read-only audits against design claims with completeness criticism.
WORKS 55
Generate daily research brief · workflow · Data · Marketing · L3
research-pulse-daily · Daily lightweight research pulse: 3 parallel domain scans + synthesis.
WORKS 48
Build company research source catalog · workflow · Data · L3
driver-menu-build · Interactive menu-driven workflows with dynamic branching.
WORKS 56
Search from multiple angles in parallel · workflow · Data · L3
multi-modal-sweep · Orchestrate multi-phase multi-modal-sweep tasks with structured verification and recovery.
WORKS 55
Debug trading indicator efficacy · workflow · Data · L3
indicator-why-believed · Orchestrate multi-phase indicator-why-believed tasks with structured verification and recovery.
WORKS 57
Review papers across model tiers · workflow · Engineering · Data · L3
paper-review-fanout · Parallel review and triage workflows across multiple criteria or models.
WORKS 55
Validate field atlas extractions · workflow · Data · Engineering · L2
fieldatlas-validate · Environment or artifact verification with mechanical pass/fail detection against ground truth.
WORKS 55
Sweep screenspot parameters single-variable · workflow · Engineering · Data · L3
screenspot-param-sweep · Best for adversarial evaluation across agent personas.
WORKS 55
Active-learn classification prompt rules · workflow · Data · Engineering · L3
classify-improve · Improving classifiers via data-driven signal mining from real failures.
WORKS 60
Scout and combine research algorithms · workflow · Data · L3
scout-and-combine · Fan-out parallel research with synthesized results.
WORKS 55
Survey binary behavior across repos · workflow · Data · L3
rebench-binary-survey · Comprehensive multi-phase scan across heterogeneous source corpus with parallel validation.
WORKS 59
Benchmark model conversion performance · workflow · Data · L3
model-convert-benchmark · Multi-phase model-convert-benchmark orchestration with structured phase control and receipts.
WORKS 56
Classify reference taxonomy · workflow · Data · L2
references-audit-classify · Comprehensive multi-phase scan across heterogeneous source corpus with parallel validation.
WORKS 54
Load bulk contract data into BigQuery · workflow · Data · L3
cf-parallel-harvester-rawload · Append-loading UK Contracts Finder historical data (2016-2026) into BigQuery in resumable 2-month shards.
WORKS 57
Extract entities from news articles · workflow · Data · L2
news-graph-extract · Building knowledge graphs from news corpora where entities and relationships must be structured for Neo4j ingestion.
WORKS 48
Catalog object-detection datasets · workflow · Data · L3
find-kitchen33-datasets · When you need systematic discovery and verification of specialized datasets across multiple sources.
WORKS 52
Verify retention tournament metrics · workflow · Data · L3
alberta-retention-tournament-verify · When you need multi-phase orchestration with parallel agents across a complex workflow.
WORKS 60
Build driver catalogs by industry · workflow · Ops · Data · L3
driver-menu-build · Building curated driver catalogs from event sources with blind naming.
WORKS 55
Analyze with adversarial score verification · workflow · Data · L3
wf-analyze · Go/no-go decisions on architecture changes when you want scored evidence from multiple dimensions plus adversarial challenge built in.
WORKS 51
Research topics across multiple sources · workflow · Ops · Data · L3
research · Synthesizing evidence on academic/technical topics when you need a multi-modality (text+code+data) sweep plus a two-stage extraction-enrichment pipeline.
WORKS 48
Autonomous ML research loop · workflow · Data · Engineering · L3
{PROJECT_NAME}-autoresearch · Use when workflow operations need {PROJECT_NAME} autoresearch.
WORKS 48
Extract and merge literature review · workflow · Data · Product · L3
paper-review-literature · Use when a workflow needs to extract and merge a literature review.
WORKS 48
Adversarially review research paper · workflow · Data · Ops · L3
review-paper · Use when a workflow needs an adversarial review of a research paper.
WORKS 48
Fan-out deep research with citations · workflow · Data · Marketing · L3
deep-research · Use when a workflow needs fan-out deep research with citations.
WORKS 48
Research and verify specificity audit methodology · workflow · Data · L3
step3-specificity-methodology · Auditing bispecific therapeutic window via parallel research, synthesis, and adversarial critique.
WORKS 48
Ingest and index large books in parallel · workflow · Data · L3
ingest-mcluhan-understanding-media · Ingesting large multi-chapter books with deduplication and cross-chunk synthesis via chunked planners.
WORKS 48
Cross-check research claims with sources · workflow · Data · L2
deep-research · Running multi-source research with web fetching, fact verification, and cited synthesis.
WORKS 55
Fan-out research with adversarial fact-check · workflow · Ops · Data · L3
research · Fan-out web research, dedup sources, synthesize with citations, and fact-check claims.
WORKS 48
Multi-panel refute findings with consensus · workflow · Data · Ops · L3
verify-findings · Verify code audit findings using 3 independent skeptic lenses.
WORKS 48
Execute deterministic research pipeline · workflow · Ops · Data · L3
research-pipeline · Execute research through fan-out collection, dedup, and analysis phases.
WORKS 51
Extract and verify factual claims · workflow · Ops · Data · L2
claim_verifier · Verify document claims are grounded in cited sources.
WORKS 48
Scan conformal prediction literature · workflow · Data · Engineering · L3
cp-lit-scan · Systematically surveying a research frontier across multiple sub-topics to identify novel research gaps.
WORKS 48
Curate tech community research digest · workflow · Ops · Data · L2
research-digest · Aggregating research signals from 7 heterogeneous sources with real-time relevance ranking and dedup.
WORKS 48
Run ablation study on hardest problems · workflow · Data · L3
mhpp-10-ablation · Ablation studies where solver agents invoke external tools themselves (agentic-tool pattern, not pre-generation).
WORKS 48
Backfill public contracts data to BigQuery · workflow · Data · L4
contracts-finder-backfill · Large-scale procurement time-series ingestion where idempotency and resumability are critical.