Analyze categorical data with charts
comparison-analysis · skill · setup L3 · ★1,354
OpenSenseNova/SenseNova-Skills
What it does
Analyze categorical data with comparison stats and charts
Best for
Quick categorical comparison analysis when data is spread across Excel sheets and visual charts (bar + pie) help identify distribution patterns.
Inputs
- · Excel file (.xlsx) with data across multiple sheets
- · two categorical dimensions to compare
- · row count per sheet (to assess whether large-file optimization is needed)
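
The row-count input can be sketched as follows. This is a minimal illustration, not the skill's own code: the function name `count_rows`, the sample sheets, and the `LARGE_FILE_ROWS` cutoff are all assumptions, since the source does not say at what size optimization becomes necessary. The dict of DataFrames mirrors what `pd.read_excel(path, sheet_name=None)` returns.

```python
import pandas as pd

# Hypothetical cutoff; the source does not state a threshold.
LARGE_FILE_ROWS = 100_000


def count_rows(sheets: dict[str, pd.DataFrame]) -> tuple[dict[str, int], int, bool]:
    """Return per-sheet row counts, the grand total, and a large-file flag."""
    per_sheet = {name: len(df) for name, df in sheets.items()}
    total = sum(per_sheet.values())
    return per_sheet, total, total > LARGE_FILE_ROWS


# In practice the dict would come from:
#   sheets = pd.read_excel("report.xlsx", sheet_name=None)
# Here two small in-memory sheets stand in for the workbook.
sheets = {
    "Sheet1": pd.DataFrame({"region": ["East", "West", "East"]}),
    "Sheet2": pd.DataFrame({"region": ["North", "West"]}),
}

per_sheet, total, is_large = count_rows(sheets)
print(per_sheet, total, is_large)
```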
Outputs
- · total row count across all sheets
- · cleaned data: merged cells forward-filled (ffill), empty values dropped, placeholder rows excluded
- · categorization statistics: count per category, difference, percent distribution
- · multi-dimensional comparison table
- · bar chart (matplotlib, colorized, labeled)
- · pie chart (matplotlib, percentage labels)
- · Excel export of analysis report
- · download link to report
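
The cleaning and statistics outputs above could look roughly like this in pandas. It is a sketch under stated assumptions: the column names `region` and `product`, the helper names `clean` and `category_stats`, and the choice to report "difference" as the gap to the largest category are all illustrative and not taken from the skill itself.

```python
import pandas as pd


def clean(df, merged_cols, dims, exclude_vals=("代码", "名称")):
    """Forward-fill merged cells, drop empty values, exclude placeholder rows."""
    out = df.copy()
    # Merged cells arrive as one value followed by blanks; ffill repeats the value.
    out[merged_cols] = out[merged_cols].ffill()
    # Drop rows where either comparison dimension is still empty.
    out = out.dropna(subset=dims)
    # Exclude repeated header/placeholder rows such as '代码' or '名称'.
    is_placeholder = out[dims].isin(list(exclude_vals)).any(axis=1)
    return out[~is_placeholder].reset_index(drop=True)


def category_stats(df, dim):
    """Count per category, percent share, and gap to the largest category."""
    stats = df.groupby(dim)[dim].count().rename("count").to_frame()
    stats["percent"] = (stats["count"] / stats["count"].sum() * 100).round(1)
    # Assumption: 'difference' is measured against the largest category.
    stats["diff_from_max"] = stats["count"] - stats["count"].max()
    return stats


# Illustrative data: a placeholder row, merged 'region' cells, one empty 'product'.
raw = pd.DataFrame({
    "region": ["代码", "East", None, None, "West", None],
    "product": ["名称", "A", "B", "A", "B", None],
})

cleaned = clean(raw, merged_cols=["region"], dims=["region", "product"])
stats = category_stats(cleaned, "region")
# Two-dimensional comparison table: rows are regions, columns are products.
table = pd.crosstab(cleaned["region"], cleaned["product"])
print(stats)
print(table)
```

Forward-filling only the merged columns, rather than the whole frame, keeps genuinely missing values in other columns visible so that `dropna` can still remove them.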
Requires
- · pandas (read_excel, ffill, groupby, count)
- · matplotlib (bar chart, pie chart, Chinese font config)
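
A minimal sketch of the Chinese font configuration, assuming the SimHei font is installed on the machine. The `axes.unicode_minus` line is an addition commonly paired with CJK fonts and is not mentioned in the source; DejaVu Sans is listed only as a fallback for Latin text, since it has no Chinese glyphs.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend, so no display is required
import matplotlib.pyplot as plt

# Try SimHei first for Chinese glyphs, then fall back to DejaVu Sans.
plt.rcParams["font.sans-serif"] = ["SimHei", "DejaVu Sans"]
# Assumption: draw the minus sign as ASCII '-' so it does not depend on the CJK font.
plt.rcParams["axes.unicode_minus"] = False
```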
Preconditions
- · Excel file (.xlsx) with multiple sheets
- · Two categorical dimensions identified for comparison
- · File size assessed (total row count determines optimization strategy)
Failure modes
- · Merged cells not handled (data cells treated as empty)
- · Placeholder rows ('代码', '名称'; literally "code", "name") not excluded (inflates counts)
- · Empty values not dropped (distorts statistics)
- · Comparison table is sparse or missing (no aggregation)
- · Charts lack labels or units (hard to interpret)
- · Chinese font not configured in matplotlib (Chinese labels render as empty boxes)
- · Large files processed without streaming (risk of running out of memory)
Trust signals
- · Five-step workflow: count rows → clean (ffill + exclude) → aggregate → visualize → export
- · Chinese font configuration explicit (plt.rcParams for SimHei/DejaVu)
- · DataFrame operations named: ffill (merged cells), dropna, groupby, count
- · Two visualization types: bar chart (with value labels) + pie chart (with percentages)
- · Merged cell handling documented (ffill strategy)
- · Placeholder exclusion pattern ('代码' as exclude_val example)
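
The two chart types named above might be drawn as follows. This is an illustrative sketch: the counts, colors, axis titles, and figure size are made up, and `Axes.bar_label` requires matplotlib 3.4 or later.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend, so no display is required
import matplotlib.pyplot as plt
import pandas as pd

# Illustrative per-category counts (not real data).
counts = pd.Series({"East": 3, "West": 1}, name="count")
labels = list(counts.index)

fig, (ax_bar, ax_pie) = plt.subplots(1, 2, figsize=(10, 4))

# Bar chart: one color per category, with the value printed above each bar.
bars = ax_bar.bar(labels, counts.values, color=["tab:blue", "tab:orange"])
ax_bar.bar_label(bars)
ax_bar.set_xlabel("Region")
ax_bar.set_ylabel("Rows (count)")
ax_bar.set_title("Rows per region")

# Pie chart: percentage labels with one decimal place.
ax_pie.pie(counts.values, labels=labels, autopct="%1.1f%%")
ax_pie.set_title("Share of rows by region")

fig.tight_layout()
# To write the figure to disk: fig.savefig("comparison_charts.png", dpi=150)
```

Labeling both axes and stating the unit on the count axis addresses the "charts lack labels or units" failure mode listed earlier.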