cyberneticlibrary

Generate multilingual retrieval benchmarks

hiraia-gen-eval-queries · workflow · setup · L2 · ★0

helloluis/hiraia

What it does

Generate a labeled retrieval benchmark: realistic kid

Best for

Generating labeled benchmark queries for retrieval system evaluation.

Inputs

· agent context (workspace, diffs, or data)

Outputs

· Result conforming to QSCHEMA
· Result conforming to NEG_SCHEMA

Requires

· agent() orchestrator
· pipeline() runner
· JSON schema validation
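
As a rough illustration of the JSON schema validation requirement, the sketch below checks a record for required fields and basic types. The field names in this QSCHEMA (query, language, relevant_doc_id) are assumptions made up for the example; the card does not show the real schema contents, and the workflow may well use a full JSON Schema library instead of a hand-rolled check.

```python
# Hypothetical schema: the real QSCHEMA fields are not shown in this card.
QSCHEMA = {
    "type": "object",
    "required": ["query", "language", "relevant_doc_id"],
    "properties": {
        "query": {"type": "string"},
        "language": {"type": "string"},
        "relevant_doc_id": {"type": "string"},
    },
}

# Maps the JSON type names used above to Python types.
_TYPES = {"string": str, "boolean": bool, "array": list, "object": dict}


def validate(record, schema):
    """Return a list of error strings; an empty list means the record passed."""
    if not isinstance(record, dict):
        return ["record is not an object"]
    errors = []
    # Strict part: every required field must be present.
    for field in schema.get("required", []):
        if field not in record:
            errors.append(f"missing required field: {field}")
    # Type check only the fields that are present and declared.
    for field, spec in schema.get("properties", {}).items():
        if field in record and not isinstance(record[field], _TYPES[spec["type"]]):
            errors.append(f"wrong type for {field}: expected {spec['type']}")
    return errors
```

Returning a list of errors rather than raising on the first one makes it easier to log every mismatch for a generated query in one pass.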

Preconditions

· Workflow runtime initialized
· Input args properly structured
· 2 agents available

Failure modes

· Agent timeout or failure
· Schema validation mismatch
· Input data incomplete or malformed
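
One way a caller might tell these three failure modes apart is sketched below. The run_phase function, its status strings, and the is_valid callback are all hypothetical names introduced for illustration; they are not the workflow's actual agent() or pipeline() API, which this card does not document.

```python
import concurrent.futures


def run_phase(agent_fn, args, is_valid, timeout_s=30.0):
    """Run one agent call and return (status, result).

    status is one of: "ok", "malformed_input", "agent_timeout",
    "agent_failure", "schema_mismatch". All names here are hypothetical.
    """
    # Input data incomplete or malformed: reject before calling the agent.
    if not isinstance(args, dict):
        return "malformed_input", None
    executor = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    try:
        future = executor.submit(agent_fn, args)
        try:
            result = future.result(timeout=timeout_s)
        except concurrent.futures.TimeoutError:
            # The call keeps running in its thread; we only stop waiting on it.
            return "agent_timeout", None
        except Exception:
            # Any exception raised inside the agent call.
            return "agent_failure", None
    finally:
        executor.shutdown(wait=False)
    # Schema validation mismatch: the agent answered, but the shape is wrong.
    if not is_valid(result):
        return "schema_mismatch", result
    return "ok", result
```

A call such as run_phase(my_agent, {"topic": "..."}, check_fn) would then yield a status that can be logged per phase, where my_agent and check_fn stand in for whatever agent callable and schema check the runtime provides.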

Trust signals

· Strict schema validation with required fields
· Explicit phase tracking and logging
· 2 independent agents with schemas