cyberneticlibrary

Convert genomic coordinates across assemblies

liftover-coordinates · skill · setup · L235
ammawla/encode-toolkit
What it does

Convert genomic coordinates between assembly versions (e.g., GRCh37/hg19 ↔ GRCh38/hg38, mm9 ↔ mm10).

Best for

Integrating historical GWAS results or legacy ENCODE data, where mixing assemblies raises no error but yields wrong results.

Inputs
  • · Genomic coordinates (BED/VCF format)
  • · From assembly (e.g., hg19)
  • · To assembly (e.g., hg38)
  • · Chain file matching the from/to pair (e.g., hg19ToHg38.over.chain.gz)
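
The from/to assemblies determine which chain file is needed. A minimal Python sketch, assuming the UCSC naming pattern <from>To<To>.over.chain.gz; the function name, the alias table, and the list of recognized assemblies are illustrative assumptions, not part of the toolkit:

```python
# Hypothetical helper: derive the UCSC chain file name for an assembly pair.
# GRC names are mapped to the UCSC database names used in chain file names.
GRC_TO_UCSC = {"GRCh37": "hg19", "GRCh38": "hg38", "GRCm38": "mm10"}

# Assumption: only the assemblies named on this card are treated as recognized.
RECOGNIZED = {"hg19", "hg38", "mm9", "mm10"}


def chain_file_name(from_assembly, to_assembly):
    """Return a name such as 'hg19ToHg38.over.chain.gz' for the given pair."""
    src = GRC_TO_UCSC.get(from_assembly, from_assembly)
    dst = GRC_TO_UCSC.get(to_assembly, to_assembly)
    for name in (src, dst):
        if name not in RECOGNIZED:
            raise ValueError(f"unrecognized assembly: {name}")
    if src == dst:
        raise ValueError("source and target assembly are the same")
    # UCSC capitalizes the first letter of the target database name.
    return f"{src}To{dst[0].upper()}{dst[1:]}.over.chain.gz"
```

Direction matters: the name for GRCh38 to GRCh37 is hg38ToHg19.over.chain.gz, a separate file from the hg19 to hg38 chain.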
Outputs
  • · Converted coordinates
  • · Unmapped region log with provenance
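
A minimal Python sketch of how both outputs might be produced with UCSC liftOver, assuming the tool is on PATH. The argument order (oldFile map.chain newFile unMapped) follows liftOver's usage message. The function names and the provenance fields (from_assembly, to_assembly) are hypothetical. The parser assumes the unmapped file consists of a "#" reason line followed by the rejected record:

```python
import subprocess
from pathlib import Path


def build_liftover_cmd(bed_in, chain, bed_out, unmapped):
    """Return the argv for: liftOver oldFile map.chain newFile unMapped."""
    return ["liftOver", str(bed_in), str(chain), str(bed_out), str(unmapped)]


def parse_unmapped(text, from_assembly, to_assembly):
    """Pair each rejected record with the '#' reason line that precedes it."""
    entries = []
    reason = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("#"):
            reason = line.lstrip("#").strip()
        else:
            entries.append({
                "record": line,
                "reason": reason,
                "from_assembly": from_assembly,  # provenance (hypothetical field)
                "to_assembly": to_assembly,      # provenance (hypothetical field)
            })
    return entries


def lift_bed(bed_in, chain, bed_out, unmapped, from_assembly, to_assembly):
    """Run liftOver, then return the parsed unmapped log."""
    subprocess.run(build_liftover_cmd(bed_in, chain, bed_out, unmapped), check=True)
    return parse_unmapped(Path(unmapped).read_text(), from_assembly, to_assembly)
```

Converted coordinates land in bed_out. The returned list can be written out as the unmapped region log.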
Requires
  • · LLM/Claude API
  • · Bash/CLI
  • · HTTP/REST API
  • · Git/GitHub
  • · ENCODE/UCSC genomics APIs
  • · Playwright/Browser automation
Preconditions
  • · UCSC chain files downloaded
  • · liftOver or CrossMap installed
  • · Coordinates in recognized assembly
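
The first two preconditions can be checked before any conversion is attempted. A minimal Python sketch; the function name is hypothetical, and the executable names are assumptions (UCSC ships liftOver, while CrossMap may install as CrossMap or CrossMap.py depending on version):

```python
import shutil
from pathlib import Path

# Assumed executable names; adjust to match the local installation.
CANDIDATE_TOOLS = ("liftOver", "CrossMap", "CrossMap.py")


def check_preconditions(chain_path, tools=CANDIDATE_TOOLS):
    """Return a list of problems; an empty list means the run can proceed."""
    problems = []
    # At least one conversion tool must be discoverable on PATH.
    if not any(shutil.which(tool) for tool in tools):
        problems.append("no conversion tool found on PATH: " + ", ".join(tools))
    # The chain file must already be downloaded and non-empty.
    chain = Path(chain_path)
    if not chain.is_file():
        problems.append(f"chain file not found: {chain}")
    elif chain.stat().st_size == 0:
        problems.append(f"chain file is empty: {chain}")
    return problems
```

Returning a list instead of raising on the first failure lets a caller report every missing prerequisite at once.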
Failure modes
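  • · Regions that fall in chain gaps are left unmapped without raising an error (lossy)
  • · Reverse liftover (GRCh38→GRCh37) uses a separate chain file with different gaps, so a round trip may not return the original coordinates
  • · Alleles and sequences must be reverse-complemented when a region maps to the minus strand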
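
Two of these failure modes can be guarded against in a few lines. A minimal Python sketch with hypothetical helper names: one reverse-complements an allele when the target strand is minus, the other reports how many records were lost so that silent drops become visible:

```python
# Translation table for DNA bases; N is kept as N, case is preserved.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def allele_on_target(allele, strand):
    """Return the allele as it reads on the target assembly's plus strand."""
    if strand not in ("+", "-"):
        raise ValueError(f"unexpected strand: {strand!r}")
    return reverse_complement(allele) if strand == "-" else allele


def mapping_loss(n_input, n_mapped):
    """Summarize how many input records did not survive the conversion."""
    if n_input <= 0:
        raise ValueError("n_input must be positive")
    lost = n_input - n_mapped
    return {"lost": lost, "fraction_lost": lost / n_input}
```

For example, allele_on_target("AAC", "-") gives "GTT", and mapping_loss(200, 190) reports 10 lost records, a fraction of 0.05.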
Trust signals
  • · Cites Kent et al. 2002 (the original UCSC liftOver reference)
  • · Provenance logging of unmapped regions
  • · Assembly gap tables provided