cyberneticlibrary

Verify facts and sources

data-fidelity skill setup L264
Tibsfox/gsd-skill-creator
What it does

Measure and improve data quality and consistency

Best for

Finding data quality issues before model training, so that bad inputs do not produce bad models (garbage in, garbage out).

Inputs
  • Dataset
  • Ground truth labels
  • Schema definition
  • Fidelity metrics
Outputs
  • Quality report
  • Error patterns
  • Cleaning recommendations
  • Confidence scores
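
To make the mapping from inputs to a quality report concrete, here is a minimal sketch in pandas. It is not the skill's actual implementation: the `quality_report` function, the schema format (column name mapped to a kind such as "integer"), and the chosen metrics are all assumptions made for illustration.

```python
import pandas as pd
from pandas.api import types as ptypes

# Hypothetical schema format: column name -> expected kind of data.
SCHEMA = {"age": "integer", "income": "float", "label": "string"}

# Maps each kind to a pandas dtype check.
KIND_CHECKS = {
    "integer": ptypes.is_integer_dtype,
    "float": ptypes.is_float_dtype,
    "string": ptypes.is_string_dtype,
}


def quality_report(df, schema, truth=None, label_col="label"):
    """Return a dict of simple fidelity metrics for df (illustrative only)."""
    report = {}
    # Schema check: columns that are absent or hold an unexpected kind of data.
    report["missing_columns"] = [c for c in schema if c not in df.columns]
    report["dtype_mismatches"] = {
        c: str(df[c].dtype)
        for c, kind in schema.items()
        if c in df.columns and not KIND_CHECKS[kind](df[c])
    }
    # Completeness: fraction of null values per column.
    report["null_fraction"] = df.isna().mean().to_dict()
    # Uniqueness: number of rows that repeat an earlier row exactly.
    report["duplicate_rows"] = int(df.duplicated().sum())
    # Agreement with ground truth labels, compared row by row on the index.
    if truth is not None and label_col in df.columns:
        report["label_agreement"] = float((df[label_col] == truth).mean())
    return report


df = pd.DataFrame({
    "age": [34, 51, 34, 29],
    "income": [52000.0, None, 52000.0, 61000.0],
    "label": ["yes", "no", "yes", "no"],
})
truth = pd.Series(["yes", "no", "yes", "yes"])
report = quality_report(df, SCHEMA, truth)
print(report)
```

On this toy frame the report flags one duplicated row, a 25% null rate in `income`, and 75% agreement between the dataset labels and the ground truth.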
Requires
  • pandas
  • Great Expectations (optional)
Preconditions
  • Dataset accessible
  • Schema defined
  • Ground truth available
Failure modes
  • Biased labels
  • Schema mismatch
  • Incorrect missing-value imputation
  • Outlier corruption
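
One way to guard against the outlier failure mode is to flag suspicious values for review instead of silently dropping or clipping them. The sketch below uses the common 1.5 × IQR rule. Both the rule and the `iqr_outlier_mask` helper are assumptions for illustration, not something the skill is documented to do.

```python
import pandas as pd


def iqr_outlier_mask(values, k=1.5):
    """Return a boolean Series marking values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1 = values.quantile(0.25)
    q3 = values.quantile(0.75)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return (values < lower) | (values > upper)


# Toy column with one implausible reading.
readings = pd.Series([10, 12, 11, 13, 12, 500])
mask = iqr_outlier_mask(readings)
print(readings[mask])
```

Flagging keeps the original values intact, so a reviewer can decide whether a flagged row is a data-entry error or a genuine extreme before any cleaning is applied.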
Trust signals
  • Great Expectations framework
  • Systematic quality gates
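
A quality gate can be as simple as a set of thresholds that a dataset must pass before it moves on to training. The following is a minimal sketch in plain pandas. It deliberately avoids the Great Expectations API, and the gate names and threshold values in `GATES` are hypothetical.

```python
import pandas as pd

# Hypothetical thresholds; tune these per dataset.
GATES = {"max_null_fraction": 0.05, "max_duplicate_fraction": 0.01}


def run_gates(df, gates):
    """Return a list of gate failures; an empty list means the dataset passes."""
    failures = []
    # Worst per-column null rate across the whole frame.
    worst_null = float(df.isna().mean().max())
    if worst_null > gates["max_null_fraction"]:
        failures.append(
            f"null fraction {worst_null:.2f} exceeds {gates['max_null_fraction']:.2f}"
        )
    # Share of rows that repeat an earlier row exactly.
    dup_fraction = float(df.duplicated().mean())
    if dup_fraction > gates["max_duplicate_fraction"]:
        failures.append(
            f"duplicate fraction {dup_fraction:.2f} exceeds {gates['max_duplicate_fraction']:.2f}"
        )
    return failures


clean = pd.DataFrame({"x": [1, 2, 3, 4], "y": [0.1, 0.2, 0.3, 0.4]})
dirty = pd.DataFrame({"x": [1, 1, 3, None], "y": [0.1, 0.1, 0.3, 0.4]})

clean_failures = run_gates(clean, GATES)
dirty_failures = run_gates(dirty, GATES)
print(clean_failures)
print(dirty_failures)
```

Returning the failures as a list, instead of raising on the first one, lets a pipeline log every violated gate in a single pass.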