Audit AI agent skills for safety and cost
skill-eval · skill · setup · L3 · ★9
aws-samples/sample-agent-skill-eval
What it does
Evaluates an AI agent skill for production readiness and capability fit.
Best for
When vetting skills for inclusion in agent harnesses or production use.
Inputs
- · skill definition
- · test case coverage
Outputs
- · readiness score
- · failure mode analysis
- · recommendation
Requires
- · evaluation rubric
- · test harness
Preconditions
- · skill has README
- · triggers defined
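Both preconditions can be checked mechanically before any scoring starts. The following is a minimal sketch, not code from the repository: it assumes the skill lives in a directory whose README is named README.md and that triggers arrive as a list of strings. The function name `check_preconditions` is hypothetical.

```python
from pathlib import Path


def check_preconditions(skill_dir, triggers):
    """Return a list of unmet preconditions (empty means ready to evaluate).

    Hypothetical helper: assumes the README is named README.md and that
    triggers are passed in as a list of strings.
    """
    problems = []
    if not (Path(skill_dir) / "README.md").is_file():
        problems.append("skill has no README")
    # Blank strings do not count as defined triggers.
    if not any(t.strip() for t in triggers):
        problems.append("no triggers defined")
    return problems
```

An empty return value means evaluation can proceed; otherwise each entry names the precondition that failed.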
Failure modes
- · score misses critical gap
- · insufficient test coverage
- · false positive readiness
Trust signals
- · rubric weights specified
- · failure mode categories exhaustive
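One plausible way to combine a weighted rubric into a single readiness score is a weighted mean. The sketch below assumes each criterion is scored in the range 0 to 1. The function name `readiness_score` and the criterion names are illustrative and do not come from the repository.

```python
def readiness_score(scores, weights):
    """Weighted mean of per-criterion scores, each assumed to lie in [0, 1].

    Hypothetical helper: `scores` and `weights` are dicts keyed by
    criterion name; every weighted criterion must have a score.
    """
    missing = set(weights) - set(scores)
    if missing:
        raise ValueError(f"unscored criteria: {sorted(missing)}")
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(scores[c] * w for c, w in weights.items()) / total


# Illustrative criteria and weights (assumed, not taken from the skill).
weights = {"safety": 0.5, "cost": 0.25, "coverage": 0.25}
scores = {"safety": 1.0, "cost": 0.5, "coverage": 0.5}
print(readiness_score(scores, weights))  # 0.75
```

A weighted mean can hide a zero on a single criterion behind strong scores elsewhere, which is one way the "score misses critical gap" failure mode arises. A per-criterion minimum is a common mitigation.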