Accelerate large data transfers
data-throughput-accelerator · skill · setup L3 · ★0
Sheshiyer/skill-clusters
What it does
Optimize large data movement and warehouse loading
Best for
When pipeline throughput is the bottleneck and data correctness must be auditable via hard counts.
Inputs
- · source extraction rate
- · network transfer rate
- · warehouse load speed
- · transform speed
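As a rough illustration of how the four input rates might be used: if the stages run in series as a streaming pipeline (an assumption, not something the card states), sustained end-to-end throughput is capped by the slowest stage. The function name, stage names, and MB/s figures below are hypothetical.

```python
def find_bottleneck(stage_rates_mb_s):
    """Return (stage, rate) for the slowest stage.

    Assumes stages run in series, so the slowest one caps throughput.
    """
    if not stage_rates_mb_s:
        raise ValueError("no stage rates supplied")
    stage = min(stage_rates_mb_s, key=stage_rates_mb_s.get)
    return stage, stage_rates_mb_s[stage]


# Hypothetical measurements in MB/s, for illustration only.
rates = {"extract": 120.0, "transfer": 45.0, "transform": 200.0, "load": 80.0}
bottleneck = find_bottleneck(rates)
```

With these made-up numbers the network transfer stage would be the one to optimize first; speeding up any other stage would not raise overall throughput.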
Outputs
- · optimized pipeline
- · accounting block with metrics
- · manifest + row counts + timestamps
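One possible shape for the accounting block is sketched below. The card does not define a schema, so the class name, field names, and JSON layout are assumptions chosen only to show manifest rows, raw rows, derived rows, and timestamps recorded together.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass
class AccountingBlock:
    """Hypothetical accounting record; field names are illustrative."""
    manifest_rows: int
    raw_rows: int
    derived_rows: int
    started_at: str
    finished_at: str

    @property
    def raw_matches_manifest(self):
        # Hard-count check: every row promised by the manifest landed in raw.
        return self.manifest_rows == self.raw_rows


def build_accounting(manifest_rows, raw_rows, derived_rows, started_at, finished_at):
    """Build the block, storing timestamps as ISO 8601 strings."""
    return AccountingBlock(
        manifest_rows, raw_rows, derived_rows,
        started_at.isoformat(), finished_at.isoformat(),
    )


# Made-up counts and times, for illustration only.
start = datetime(2024, 1, 1, 0, 0, tzinfo=timezone.utc)
end = datetime(2024, 1, 1, 0, 30, tzinfo=timezone.utc)
block = build_accounting(1000, 1000, 940, start, end)
report = json.dumps(asdict(block), indent=2)
```

Derived rows are allowed to differ from raw rows here, since transforms may filter or aggregate; only the manifest-to-raw comparison is treated as a strict equality.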
Requires
- · Read
- · Write
- · Edit
- · Bash
- · Grep
- · Glob
Preconditions
- · source, target, and manifest contracts defined
- · backlog measured
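A measured backlog can be turned into a rough drain-time estimate. The sketch below assumes constant processing and arrival rates, which real pipelines rarely have; the function and parameter names are hypothetical.

```python
def drain_time_seconds(backlog_rows, throughput_rows_s, arrival_rows_s=0.0):
    """Estimate seconds needed to clear a backlog at constant rates.

    Returns infinity when new rows arrive at least as fast as they are processed.
    """
    net_rate = throughput_rows_s - arrival_rows_s
    if net_rate <= 0:
        return float("inf")
    return backlog_rows / net_rate


# Made-up figures: 1,000,000 rows behind, 5,000 rows/s out, 3,000 rows/s in.
estimate = drain_time_seconds(1_000_000, 5000, 3000)
```

An infinite estimate is a useful signal in its own right: it means the backlog cannot shrink until throughput improves.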
Failure modes
- · raw data deleted to hide lag
- · silent file failures
- · manifest/table count mismatch
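The last two failure modes can be caught by reconciling the manifest against what was actually loaded. This is a minimal sketch that assumes per-file row counts are available on both sides as plain dictionaries; the function name and file names are hypothetical.

```python
def reconcile(manifest, loaded):
    """Compare expected per-file row counts with loaded counts.

    manifest: {file_name: expected_rows}
    loaded:   {file_name: rows_found_in_table}
    """
    # Files listed in the manifest that never loaded (silent file failures).
    missing = sorted(f for f in manifest if f not in loaded)
    # Files that loaded but were never declared in the manifest.
    unexpected = sorted(f for f in loaded if f not in manifest)
    # Files present on both sides whose counts disagree: (expected, actual).
    mismatched = {
        f: (manifest[f], loaded[f])
        for f in manifest
        if f in loaded and manifest[f] != loaded[f]
    }
    return {
        "missing": missing,
        "unexpected": unexpected,
        "mismatched": mismatched,
        "ok": not (missing or unexpected or mismatched),
    }


# Made-up example: one file never loaded, one loaded short.
manifest = {"part-0001.csv": 500, "part-0002.csv": 500, "part-0003.csv": 250}
loaded = {"part-0001.csv": 500, "part-0002.csv": 480}
result = reconcile(manifest, loaded)
```

Reporting missing files separately from count mismatches keeps a file that silently failed from being mistaken for one that loaded with fewer rows.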
Trust signals
- · accounts for manifest rows, raw rows, and derived rows
- · reruns the final accounting
- · separates raw, derived, and serving tables