cyberneticlibrary

Build data pipelines and ETL workflows

data-engineer  subagent  setup  L31
morganmuli/metaskill
What it does

Designs and implements data pipelines, ETL workflows, schemas, and data quality checks

Best for

Building reliable, tested data infrastructure when source systems are messy or heterogeneous

Inputs
  • source system description
  • target schema requirements
  • data volume/frequency
Outputs
  • pipeline orchestration code
  • schema definitions
  • data validation tests
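
As a rough illustration of what a "data validation test" output might look like, here is a minimal sketch using only the Python standard library. The schema, the column names (`order_id`, `amount`, `coupon`), and the helper `validate_rows` are all hypothetical and not part of the skill itself; a real pipeline would more likely express these checks as dbt tests or pandas assertions.

```python
from dataclasses import dataclass

# Hypothetical target schema: column name -> (expected type, nullable).
TARGET_SCHEMA = {
    "order_id": (int, False),
    "amount": (float, False),
    "coupon": (str, True),
}


@dataclass
class ValidationResult:
    valid: list      # rows that passed every check
    rejected: list   # (row, reason) pairs for rows that failed


def validate_rows(rows, schema=TARGET_SCHEMA):
    """Split rows into valid and rejected, recording the first failure reason."""
    valid, rejected = [], []
    for row in rows:
        reason = None
        for column, (expected_type, nullable) in schema.items():
            value = row.get(column)
            if value is None:
                if not nullable:
                    reason = f"{column} is required"
                    break
            elif not isinstance(value, expected_type):
                reason = f"{column} expected {expected_type.__name__}"
                break
        if reason is None:
            valid.append(row)
        else:
            rejected.append((row, reason))
    return ValidationResult(valid, rejected)


sample_rows = [
    {"order_id": 1, "amount": 9.5, "coupon": None},
    {"order_id": None, "amount": 1.0},
    {"order_id": 2, "amount": "x"},
]
result = validate_rows(sample_rows)
```

Keeping rejected rows together with a reason, rather than dropping them silently, makes it easier to report source data quality problems back to the owning system.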
Requires
  • pandas
  • SQL
  • dbt/Airflow
  • data warehouse connectors
Preconditions
  • source systems accessible
  • target warehouse identified
Failure modes
  • poor source data quality
  • schema changes mid-pipeline
  • performance bottlenecks at scale
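
One way to guard against the "schema changes mid-pipeline" failure mode is to compare the schema observed at run time with the one the pipeline expects, and to stop on breaking changes. The sketch below is only an assumption about how that could be done; the function names and the column-to-type-name dictionaries are hypothetical.

```python
def detect_schema_drift(expected, observed):
    """Compare two {column: type_name} mappings and report the differences."""
    expected_cols, observed_cols = set(expected), set(observed)
    return {
        "added": sorted(observed_cols - expected_cols),
        "removed": sorted(expected_cols - observed_cols),
        "retyped": sorted(
            col
            for col in expected_cols & observed_cols
            if expected[col] != observed[col]
        ),
    }


def assert_compatible(expected, observed):
    """Raise on breaking drift (removed or retyped columns); tolerate additions."""
    drift = detect_schema_drift(expected, observed)
    if drift["removed"] or drift["retyped"]:
        raise ValueError(f"breaking schema drift: {drift}")
    return drift


expected_schema = {"id": "int", "email": "str"}
additive = assert_compatible(
    expected_schema, {"id": "int", "email": "str", "phone": "str"}
)
```

Treating added columns as non-breaking is a design choice made for this sketch; a stricter pipeline might fail on any difference.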
Trust signals
  • validates data at each stage
  • logs lineage and transformations
  • handles incremental loads
  • versions schema evolution
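
The incremental-load and lineage-logging signals above can be sketched with a simple watermark pattern. This is an illustrative example under stated assumptions, not the skill's actual implementation: the in-memory `target` dictionary, the `id` and `updated_at` fields, and the logger name `pipeline.lineage` are all hypothetical stand-ins for a real warehouse table and lineage store.

```python
import logging
from datetime import datetime

logger = logging.getLogger("pipeline.lineage")


def incremental_load(source_rows, target, watermark):
    """Upsert rows newer than the watermark and return the advanced watermark."""
    new_rows = [row for row in source_rows if row["updated_at"] > watermark]
    for row in new_rows:
        # Upsert by primary key, so re-running the same batch is harmless.
        target[row["id"]] = row
    new_watermark = max(
        (row["updated_at"] for row in new_rows), default=watermark
    )
    # Record what was loaded so the run can be traced later.
    logger.info(
        "loaded %d rows; watermark %s -> %s",
        len(new_rows),
        watermark.isoformat(),
        new_watermark.isoformat(),
    )
    return new_watermark


source = [
    {"id": 1, "updated_at": datetime(2024, 1, 1)},
    {"id": 2, "updated_at": datetime(2024, 1, 3)},
]
warehouse = {}
first = incremental_load(source, warehouse, datetime(2023, 12, 31))
second = incremental_load(source, warehouse, first)
```

The second call loads nothing because no row is newer than the stored watermark, which is the property that makes repeated runs safe.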