Audit and clean CRM data quality

crm-hygiene-scanner · skill · setup · L2 · ★40

What it does

Audits CRM data quality: detects duplicates, flags stale records, and scores completeness.

Best for

CRM admins or marketing ops teams cleaning up legacy data before a migration or merge campaign: measure data health, detect garbage records, and plan cleanup sprints.

Inputs

· CSV export file path (contacts, companies, or deals)
· Data type (contacts/companies/deals)
· CRM system (HubSpot, Salesforce, Pipedrive, other)
· Critical fields to measure (email, phone, company name, deal stage, last activity, owner)

Outputs

· Data profile: total records, columns, data types, fill rate per column, date range, unique vs. total
· Duplicate groups with confidence levels (HIGH: exact email match; MEDIUM: fuzzy name match or cross-field match)
· Stale record flags (last activity threshold)
· Row completeness percentage
· CRM hygiene quality score (0-100)
· Prioritized cleanup plan with merge/delete recommendations
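
As a rough illustration of three of the outputs above (fill rate per column, row completeness, stale flags), here is a minimal sketch using only the Python standard library. The function names, the `last_activity` field name, the 365-day threshold, ISO-formatted dates, and the choice to treat a missing activity date as stale are all assumptions made for this example, not documented behavior of the skill.

```python
from datetime import date


def _filled(row, field):
    """True if the field holds a non-blank value."""
    return bool((row.get(field) or "").strip())


def fill_rates(rows, columns):
    """Share of rows (0.0 to 1.0) with a non-blank value, per column."""
    total = len(rows)
    return {
        col: (sum(_filled(r, col) for r in rows) / total if total else 0.0)
        for col in columns
    }


def row_completeness(row, critical_fields):
    """Percentage of critical fields that are filled in for one row."""
    filled = sum(_filled(row, f) for f in critical_fields)
    return 100.0 * filled / len(critical_fields)


def is_stale(row, today, threshold_days=365, date_field="last_activity"):
    """Flag a record whose last activity is older than the threshold.

    Assumption: dates are ISO strings (YYYY-MM-DD), and a record with no
    activity date at all is treated as stale.
    """
    raw = (row.get(date_field) or "").strip()
    if not raw:
        return True
    return (today - date.fromisoformat(raw)).days > threshold_days


rows = [
    {"email": "a@example.com", "phone": "", "last_activity": "2024-01-10"},
    {"email": "", "phone": "555-0100", "last_activity": "2020-05-01"},
]
rates = fill_rates(rows, ["email", "phone"])
stale = [is_stale(r, today=date(2024, 6, 1)) for r in rows]
```

In practice the threshold and the list of critical fields would come from the inputs listed earlier, since what counts as stale differs by business.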

Requires

Preconditions

Failure modes

· CSV encoding mismatches (UTF-8 vs. Latin-1) → garbled data
· Wrong column mapping (e.g., email stored under 'Email Address', inconsistent name columns) → duplicates missed
· No date columns → cannot detect stale records
· Critical field list not provided → defaults to generic columns, missing business context
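
The first two failure modes can be partly guarded against in code. The sketch below tries UTF-8 first and falls back to Latin-1, then resolves a logical field to whatever header the export actually uses. `read_csv_bytes`, `find_column`, and the alias set are hypothetical names chosen for this example. Note that Latin-1 can decode any byte sequence, so the fallback never fails but may still produce wrong characters if the file is in some third encoding.

```python
import csv
import io

# Illustrative aliases only; a real export may use other header names.
EMAIL_ALIASES = {"email", "email address", "e-mail"}


def read_csv_bytes(data):
    """Decode CSV bytes as UTF-8, falling back to Latin-1 on failure.

    Returns (rows, encoding_used). "utf-8-sig" also strips a leading BOM.
    """
    try:
        text = data.decode("utf-8-sig")
        encoding = "utf-8"
    except UnicodeDecodeError:
        text = data.decode("latin-1")
        encoding = "latin-1"
    return list(csv.DictReader(io.StringIO(text))), encoding


def find_column(header, aliases):
    """Return the first header whose normalized name is a known alias."""
    for name in header:
        if name.strip().lower() in aliases:
            return name
    return None


# A Latin-1 encoded export: the "é" byte is not valid UTF-8 here.
raw = "Email Address,Name\njose@example.com,José\n".encode("latin-1")
rows, encoding = read_csv_bytes(raw)
email_col = find_column(rows[0].keys(), EMAIL_ALIASES)
```

Reporting the encoding that was actually used, and returning `None` for an unmapped column rather than guessing, makes both failure modes visible before duplicate detection runs.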

Trust signals

· Four-step methodology with explicit skip warnings
· Three types of duplicates with confidence levels and normalization rules
· Fuzzy match criteria: company name normalization (strip legal suffixes, punctuation), name variants (Bob/Robert), edit distance ≤2
· Large CSV sampling rule (for files with ≥ 10K rows, analyze the first 10K rows and extrapolate the results)
· Recommendation per duplicate group with 'keep most complete + recent' rule
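
The fuzzy-match criteria and the 'keep most complete + recent' rule listed above could be sketched as follows. This is one possible reading, not the skill's actual implementation: the suffix list and nickname table are small illustrative samples, edit distance is taken to mean Levenshtein distance, and `pick_survivor` assumes ISO-formatted dates in a hypothetical `last_activity` field so that string comparison orders them correctly.

```python
import re

# Illustrative samples only; real lists would be much longer.
LEGAL_SUFFIXES = {"inc", "llc", "ltd", "corp", "co", "gmbh"}
NICKNAMES = {"bob": "robert", "bill": "william"}


def normalize_company(name):
    """Lowercase, strip punctuation, and drop trailing legal suffixes."""
    words = re.sub(r"[^\w\s]", "", name.lower()).split()
    while words and words[-1] in LEGAL_SUFFIXES:
        words.pop()
    return " ".join(words)


def canonical_first_name(name):
    """Map a known nickname to its formal variant (Bob -> robert)."""
    key = name.strip().lower()
    return NICKNAMES.get(key, key)


def edit_distance(a, b):
    """Levenshtein distance computed with a rolling row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]


def is_fuzzy_match(a, b, max_distance=2):
    """True when two normalized strings are within the edit-distance limit."""
    return edit_distance(a, b) <= max_distance


def pick_survivor(group, fields, date_field="last_activity"):
    """Keep the record with the most filled fields; break ties by recency."""
    def rank(record):
        filled = sum(1 for f in fields if (record.get(f) or "").strip())
        return (filled, record.get(date_field) or "")
    return max(group, key=rank)


group = [
    {"email": "bob@acme.com", "phone": "", "last_activity": "2024-03-01"},
    {"email": "bob@acme.com", "phone": "555-0100", "last_activity": "2023-11-15"},
]
survivor = pick_survivor(group, ["email", "phone"])
```

Here the second record survives because completeness is ranked ahead of recency; swapping the order of the tuple in `rank` would give the opposite priority.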