Backfill public contracts data to BigQuery
contracts-finder-backfill · workflow · setup · L4 · ★0
chrisns/uk-tenders-mcp
What it does
Resilient, sharded ingestion of UK Contracts Finder OCDS data (2016–2026) into BigQuery, with resumable partial loads.
Best for
Large-scale procurement time-series ingestion where idempotency and resumability are critical.
Inputs
- · UK Contracts Finder (CF) API
- · month window (2016-11 to 2026-05)
- · BigQuery project
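A minimal sketch of how the month window could be expanded into one shard per calendar month. The function name `month_shards` and the half-open `(start, end)` date pairs are assumptions for illustration, not part of `uk_tenders_ingest`.

```python
from datetime import date


def month_shards(start: str, end: str) -> list[tuple[date, date]]:
    """Expand an inclusive 'YYYY-MM' window into half-open month ranges.

    Each shard is (first day of month, first day of the following month).
    Hypothetical helper; the real module may shard differently.
    """
    start_year, start_month = map(int, start.split("-"))
    end_year, end_month = map(int, end.split("-"))

    # Work with a single running month index so year rollover is trivial.
    index = start_year * 12 + (start_month - 1)
    last = end_year * 12 + (end_month - 1)

    shards = []
    while index <= last:
        year, month0 = divmod(index, 12)
        next_year, next_month0 = divmod(index + 1, 12)
        shards.append((date(year, month0 + 1, 1), date(next_year, next_month0 + 1, 1)))
        index += 1
    return shards


shards = month_shards("2016-11", "2026-05")
```

Under these assumptions the window above yields 115 shards, each of which can be handed to a separate worker and retried independently.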
Outputs
- · BigQuery releases + processes tables
- · cross-source dedup via process_group
Requires
- · BigQuery
- · UK Contracts Finder API
- · Python ingestion module (uk_tenders_ingest)
Preconditions
- · GCP project with BigQuery initialized
- · CF API accessible
- · Python .venv available
- · PYTHONPATH configured
Failure modes
- · Per-month agent timeout (600s) → status=partial_or_timeout, resumable
- · Dedup call fails → match result is still returned
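The timeout behaviour can be pictured as a per-month time budget checked between weekly loads. This is a sketch only: `ingest_month`, `week_windows`, the `done` set, and the `complete` status are hypothetical names, and the check here is cooperative, whereas a real agent timeout may interrupt a load mid-call. Only the 600 s budget and the `partial_or_timeout` status come from the card.

```python
import time
from datetime import date, timedelta
from typing import Callable, Iterator


def week_windows(start: date, end: date) -> Iterator[tuple[date, date]]:
    """Split the half-open range [start, end) into windows of at most 7 days."""
    current = start
    while current < end:
        upper = min(current + timedelta(days=7), end)
        yield current, upper
        current = upper


def ingest_month(
    start: date,
    end: date,
    load_week: Callable[[date, date], None],  # hypothetical per-week loader
    done: set,                                # week start dates already loaded
    budget_s: float = 600.0,
    clock: Callable[[], float] = time.monotonic,
) -> dict:
    """Load one month week by week, stopping once the time budget is spent."""
    deadline = clock() + budget_s
    for week_start, week_end in week_windows(start, end):
        if week_start in done:
            continue  # resumed run: this week was loaded earlier
        if clock() >= deadline:
            # Weeks already loaded stay recorded, so a later run can resume.
            return {"status": "partial_or_timeout", "done": done}
        load_week(week_start, week_end)
        done.add(week_start)
    return {"status": "complete", "done": done}
```

Calling `ingest_month` again with the same `done` set skips the finished weeks and continues from the first unfinished one. In practice the `done` set would need to be persisted somewhere durable between runs.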
Trust signals
- · Streaming per-week within month (partial success survives timeout)
- · Idempotent UPSERT (no duplicates on retry)
- · Cross-source dedup via process_group matcher
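One common way to get an idempotent UPSERT in BigQuery is a `MERGE` from a staging table keyed on a stable identifier. The sketch below only builds the SQL text. The helper `build_merge_sql`, the table names, and the columns `release_id`, `ocid`, and `published_at` are assumptions for illustration, and the card does not say how `uk_tenders_ingest` performs its upsert.

```python
def build_merge_sql(target: str, staging: str, key: str, columns: list[str]) -> str:
    """Build a BigQuery MERGE statement that upserts staging rows into target.

    Assumes both tables share the same schema (required by INSERT ROW) and
    that the staging table holds at most one row per key.
    """
    non_key = [c for c in columns if c != key]
    if not non_key:
        raise ValueError("need at least one non-key column to update")
    updates = ",\n    ".join(f"{c} = S.{c}" for c in non_key)
    return (
        f"MERGE `{target}` AS T\n"
        f"USING `{staging}` AS S\n"
        f"ON T.{key} = S.{key}\n"
        f"WHEN MATCHED THEN\n  UPDATE SET\n    {updates}\n"
        f"WHEN NOT MATCHED THEN\n  INSERT ROW"
    )


# Hypothetical table and column names.
sql = build_merge_sql(
    "my_project.tenders.releases",
    "my_project.tenders.releases_staging",
    "release_id",
    ["release_id", "ocid", "published_at"],
)
```

Because matched rows are updated in place and only unmatched rows are inserted, re-running the same statement after a retry leaves the target table unchanged.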