Audit dataset quality and completeness
data-quality-auditor | plugin | setup L2 | ★17,464
alirezarezvani/claude-skills
What it does
Audit datasets for completeness, consistency, accuracy, and validity with DQS scoring
Best for
Preparing data for analysis or ML when you need a systematic audit of quality issues rather than a spot-check of a few rows.
Inputs
- dataset file (CSV, Parquet, JSON, database query result)
Outputs
- data quality score (DQS) per dimension (completeness, consistency, accuracy, validity)
- missing value analysis: MCAR/MAR/MNAR classification (missing completely at random, missing at random, missing not at random)
- outlier detection: multi-method (IQR, Isolation Forest, Z-score, etc.)
- audit report with recommendations
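As a rough illustration of the completeness dimension, the sketch below computes the fraction of non-missing cells per column using only the standard library. It is not the skill's actual profiler: the function name `completeness_by_column`, the `MISSING_TOKENS` set, the sample data, and the choice to average column scores into one number are all assumptions made for this example.

```python
import csv
import io

# Assumption: tokens treated as "missing". Adjust to match your data.
MISSING_TOKENS = {"", "na", "n/a", "nan", "null", "none"}


def completeness_by_column(rows):
    """Return {column: fraction of non-missing cells} for a list of dict rows."""
    if not rows:
        return {}
    scores = {}
    for col in rows[0].keys():
        present = sum(
            1
            for row in rows
            if (row.get(col) or "").strip().lower() not in MISSING_TOKENS
        )
        scores[col] = present / len(rows)
    return scores


# Hypothetical sample data, inlined so the example is self-contained.
SAMPLE_CSV = """id,age,city
1,34,Berlin
2,,Paris
3,NA,
4,29,Rome
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
scores = completeness_by_column(rows)

# Assumption: the overall score is the unweighted mean of the column scores.
overall = sum(scores.values()) / len(scores)

print(scores)   # {'id': 1.0, 'age': 0.5, 'city': 0.75}
print(overall)  # 0.75
```

An unweighted mean treats every column as equally important. A real audit would likely weight key columns more heavily, which is one reason a single aggregate number should be read alongside the per-column breakdown.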
Requires
- 3 stdlib-only Python tools (data profiler, missing-value analyzer, multi-method outlier detector)
- DQS framework (Gartner reference)
- statistical methods (no external ML deps)
Preconditions
- dataset is structured (tabular, not unstructured text)
- column types identified (numeric, categorical, datetime, etc.)
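If column types are not yet known, a first pass can be approximated with the standard library. The sketch below is only an illustration of the precondition, not part of the skill: `infer_column_type` is a hypothetical helper, it recognises only ISO 8601 dates, and it labels anything that is neither numeric nor a date as categorical.

```python
from datetime import datetime


def infer_column_type(values):
    """Classify a column of strings as numeric, datetime, categorical, or empty."""
    cleaned = [v.strip() for v in values if v and v.strip()]
    if not cleaned:
        return "empty"

    def all_parse(parser):
        # True only if every non-blank value is accepted by the parser.
        for v in cleaned:
            try:
                parser(v)
            except ValueError:
                return False
        return True

    if all_parse(float):
        return "numeric"
    if all_parse(datetime.fromisoformat):  # ISO 8601 strings only
        return "datetime"
    return "categorical"


print(infer_column_type(["1", "2.5", ""]))             # numeric
print(infer_column_type(["2024-01-05", "2023-12-31"]))  # datetime
print(infer_column_type(["red", "blue", "red"]))        # categorical
```

Blank cells are ignored during inference, so a mostly empty column can still be typed from its few populated values. Whether that is desirable depends on the dataset.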
Failure modes
- An aggregate DQS can mask important outliers in small datasets
- MCAR/MAR/MNAR classification is unreliable with fewer than 50 observations
- Multi-method outlier detection can produce conflicting flags (human judgment needed)
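The conflicting-flags failure mode is easy to reproduce on a small sample. The sketch below compares two of the listed methods, IQR fences and Z-scores, using only the `statistics` module. The function names, the conventional thresholds (1.5 × IQR and |z| > 3), and the sample values are assumptions for illustration. Isolation Forest is left out to keep the example stdlib-only.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return {v for v in values if v < low or v > high}


def zscore_outliers(values, threshold=3.0):
    """Flag values whose absolute Z-score (sample std dev) exceeds threshold."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    if sd == 0:
        return set()
    return {v for v in values if abs(v - mean) / sd > threshold}


# Hypothetical small sample with one obviously extreme value.
sample = [10, 12, 11, 13, 12, 11, 10, 95]

iqr_flags = iqr_outliers(sample)
z_flags = zscore_outliers(sample)
conflicts = iqr_flags ^ z_flags  # values flagged by exactly one method

print(iqr_flags)  # {95}
print(z_flags)    # set()
print(conflicts)  # {95}
```

Here the IQR rule flags 95 but the Z-score rule does not, because the extreme value inflates the standard deviation it is measured against. With the sample standard deviation, no point can have |z| greater than (n - 1) / sqrt(n), so a threshold of 3 cannot fire at all when n is 10 or fewer. Disagreements like this are where a reviewer needs to decide which method fits the data.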
Trust signals
- 3 stdlib-only Python tools (no external deps)
- DQS framework (Gartner reference)
- MCAR/MAR/MNAR missing-data classification
- Multi-method outlier detection (IQR, Isolation Forest, Z-score)