cyberneticlibrary

Evaluate workflow runtime systems

evaluate-external-workflow · workflow · setup · L30
cyl19970726/multi-agent-harness
What it does

Runs an external workflow against a set of benchmark tasks and reports how well it performs.

Best for

Validating an imported workflow against your own task set before you decide to adopt it.

Inputs
  • workflow (importable script)
  • benchmark_tasks (list)
Outputs
  • success rate
  • failure analysis
  • verdict on whether to adopt
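
The three outputs could be derived from per-task results roughly as follows. This is a minimal Python sketch, not the harness's actual code: `TaskResult`, `summarize`, and the 0.8 adoption threshold are hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class TaskResult:
    """Outcome of one benchmark task (hypothetical record shape)."""
    task_id: str
    passed: bool
    error: str = ""  # empty when the task passed


def summarize(results, adopt_threshold=0.8):
    """Return the success rate, failures grouped by error, and a verdict."""
    total = len(results)
    passed = sum(1 for r in results if r.passed)
    # An empty benchmark gives no evidence, so treat it as a 0.0 success rate.
    success_rate = passed / total if total else 0.0

    # Failure analysis: group failing task ids by their error label.
    failures = {}
    for r in results:
        if not r.passed:
            failures.setdefault(r.error or "unknown", []).append(r.task_id)

    # The threshold is an assumed policy; choose one that fits your task set.
    verdict = "adopt" if success_rate >= adopt_threshold else "reject"
    return {"success_rate": success_rate, "failures": failures, "verdict": verdict}
```

Grouping failures by error label keeps the failure analysis readable when many tasks fail for the same reason.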
Requires
  • workflow engine
  • task harness
  • evaluator agent
Preconditions

The workflow must be runnable, and every benchmark task must match the workflow's input contract.
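
The input-contract precondition can be checked before any task is run. The sketch below assumes, purely for illustration, that each benchmark task is a dict and that the contract is a list of required field names. `check_contract` is a hypothetical helper, not part of the skill.

```python
def check_contract(tasks, required_fields):
    """Return (task_index, missing_fields) pairs for tasks that violate the contract.

    Assumes each task is a dict and the contract is a list of required keys.
    """
    violations = []
    for index, task in enumerate(tasks):
        # Any required key absent from the task is a contract violation.
        missing = sorted(set(required_fields) - set(task))
        if missing:
            violations.append((index, missing))
    return violations
```

An empty return value means every task carries the required fields. A real contract may also constrain types or value ranges, which this sketch does not cover.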

Failure modes

The workflow fails to parse; a task causes the workflow to hang; results are non-deterministic across runs.
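
Each of these failure modes can be detected mechanically. The following is a minimal sketch under stated assumptions: the workflow is a standalone Python script that reads one task from stdin and writes its result to stdout. The names `parses`, `run_once`, and `classify`, along with the timeout and repeat count, are hypothetical. Hangs are caught with a subprocess timeout, and non-determinism by repeating the same task and comparing outputs.

```python
import ast
import subprocess
import sys


def parses(path):
    """Return True if the script at `path` is syntactically valid Python."""
    try:
        with open(path, encoding="utf-8") as handle:
            ast.parse(handle.read(), filename=path)
        return True
    except (SyntaxError, ValueError):
        return False


def run_once(path, task_input, timeout_s=5.0):
    """Run the script once; return ('ok', stdout), ('error', stderr) or ('hang', '')."""
    try:
        proc = subprocess.run(
            [sys.executable, path],
            input=task_input,
            capture_output=True,
            text=True,
            timeout=timeout_s,  # the child is killed when the timeout expires
        )
    except subprocess.TimeoutExpired:
        return ("hang", "")
    if proc.returncode != 0:
        return ("error", proc.stderr)
    return ("ok", proc.stdout)


def classify(path, task_input, repeats=3, timeout_s=5.0):
    """Label one task as parse_failure, hang, error, non_deterministic or ok."""
    if not parses(path):
        return "parse_failure"
    outputs = set()
    for _ in range(repeats):
        status, output = run_once(path, task_input, timeout_s)
        if status != "ok":
            return status
        outputs.add(output)
    # Identical input should produce identical output on every repeat.
    return "ok" if len(outputs) == 1 else "non_deterministic"
```

Running the workflow in a child process rather than a thread is a deliberate choice here, since a hung child can be killed while a hung thread cannot. Exact string comparison is a strict test of determinism; workflows with legitimately variable output would need a looser comparison.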