
Annotate peaks with genomic features

peak-annotation, skill, setup L235
ammawla/encode-toolkit
What it does

Annotate ChIP-seq/ATAC-seq peaks with nearest genes, regulatory elements, and conservation.

Best for

Functional interpretation of ChIP-seq experiments when gene-centric answers are needed quickly.

Inputs
  • Peak BED file
  • Gene GTF/GFF
  • Optional: conservation tracks, TF motif database
Outputs
  • Annotated peaks with gene associations
  • Peak classification (promoter, enhancer, etc.)
  • Summary statistics
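
The classification and summary outputs listed above can be sketched as follows. This is a minimal illustration, not the toolkit's implementation: the 1 kb promoter window, the function names, and the three labels are assumptions chosen for the example. A peak outside promoters and gene bodies is labelled "distal" here, because calling it an enhancer would need additional evidence such as histone marks or ENCODE regulatory element calls.

```python
from collections import Counter

# Hypothetical threshold for this sketch; not taken from the toolkit.
PROMOTER_WINDOW = 1000  # bp on either side of a TSS


def classify_peak(peak_start, peak_end, tss_positions, gene_intervals):
    """Classify one peak against features on the same chromosome.

    peak_start, peak_end: 0-based, half-open BED coordinates.
    tss_positions: list of 0-based TSS coordinates.
    gene_intervals: list of (start, end) half-open gene bodies.
    """
    midpoint = (peak_start + peak_end) // 2
    # Promoter takes priority over gene body when both apply.
    if any(abs(midpoint - tss) <= PROMOTER_WINDOW for tss in tss_positions):
        return "promoter"
    if any(start <= midpoint < end for start, end in gene_intervals):
        return "gene_body"
    return "distal"


def summarize(labels):
    """Return the fraction of peaks in each class."""
    total = len(labels)
    return {label: count / total for label, count in Counter(labels).items()}


# Toy data: one gene with its TSS at position 5000.
tss_positions = [5000]
gene_intervals = [(5000, 20000)]
peaks = [(4800, 5200), (10000, 10200), (50000, 50200), (90000, 90200)]
labels = [classify_peak(s, e, tss_positions, gene_intervals) for s, e in peaks]
summary = summarize(labels)
```

Using the peak midpoint rather than any overlap keeps each peak in exactly one class, which makes the summary fractions add up to one.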
Requires
  • Bash/CLI
  • HTTP/REST API
  • ENCODE/UCSC genomics APIs
  • Database
  • Git/GitHub
Preconditions
  • Peaks in BED format
  • Gene coordinates on the same genome assembly as the peaks
  • bedtools or HOMER available
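
The preconditions above can be partly checked before annotation runs. The sketch below uses hypothetical helper names and assumes tab-separated BED input. The chromosome-name comparison only catches naming mismatches such as "chr1" versus "1". It cannot prove that two files share an assembly, since different builds of the same genome use the same chromosome names.

```python
def parse_bed_line(line):
    """Return (chrom, start, end) from a BED line, or raise ValueError."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 3:
        raise ValueError(f"expected at least 3 tab-separated columns: {line!r}")
    # int() also raises ValueError on non-numeric coordinates.
    chrom, start, end = fields[0], int(fields[1]), int(fields[2])
    if start < 0 or end <= start:
        raise ValueError(f"invalid interval {start}-{end} on {chrom}")
    return chrom, start, end


def gtf_to_bed_coords(gtf_start, gtf_end):
    """Convert GTF coordinates (1-based, inclusive) to BED (0-based, half-open)."""
    return gtf_start - 1, gtf_end


def shared_chromosomes(peak_chroms, gene_chroms):
    """Chromosome names that appear in both the peak and gene files."""
    return set(peak_chroms) & set(gene_chroms)


# An empty intersection signals a naming mismatch between the two files.
common = shared_chromosomes(["chr1", "chr2"], ["1", "2"])
```

Converting GTF coordinates before comparing them with BED intervals avoids off-by-one errors at feature boundaries.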
Failure modes
  • The nearest gene is not necessarily the causal gene (the true target can be 1 Mb away)
  • Genes with multiple promoters are assigned ambiguously
  • Conservation scores are tool/threshold-specific
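
One way to make the multi-promoter ambiguity above visible is to rank every annotated TSS of a gene by its distance to the peak, instead of collapsing the gene to a single coordinate. The sketch below is illustrative only. The function names and the transcript tuple layout are assumptions, not part of the toolkit.

```python
def tss_of(transcript_start, transcript_end, strand):
    """TSS as a 0-based coordinate for a half-open transcript interval."""
    # On the minus strand the TSS is the last base of the interval.
    return transcript_start if strand == "+" else transcript_end - 1


def rank_promoters(peak_midpoint, transcripts):
    """Rank each transcript's TSS by distance to the peak, closest first.

    transcripts: list of (transcript_id, start, end, strand), half-open.
    Returns a list of (distance, transcript_id).
    """
    ranked = [
        (abs(peak_midpoint - tss_of(start, end, strand)), transcript_id)
        for transcript_id, start, end, strand in transcripts
    ]
    return sorted(ranked)


# Toy gene with two alternative promoters on the plus strand.
transcripts = [("tx1", 1000, 9000, "+"), ("tx2", 4000, 9000, "+")]
ranking = rank_promoters(3900, transcripts)
```

Reporting the full ranking lets a reader see when two promoters of the same gene are nearly equidistant from the peak.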
Trust signals
  • Multiple annotation methods (nearest, overlap, window)
  • ENCODE regulatory element standards
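
The three annotation methods named above (nearest, overlap, window) differ in what they report for the same peak. The sketch below shows the logic on a single chromosome in plain Python, with hypothetical function names and toy data. In practice the same operations are available as the bedtools subcommands `closest`, `intersect`, and `window`. How the toolkit itself invokes them is not shown in the source.

```python
def nearest_gene(midpoint, genes):
    """genes: list of (name, tss). Returns (name, signed distance to the TSS).

    The sign is in genomic coordinates and ignores gene strand.
    """
    name, tss = min(genes, key=lambda gene: abs(midpoint - gene[1]))
    return name, midpoint - tss


def overlapping_genes(start, end, gene_bodies):
    """gene_bodies: list of (name, start, end), half-open intervals."""
    return [name for name, g_start, g_end in gene_bodies
            if g_start < end and start < g_end]


def genes_in_window(midpoint, genes, window=1_000_000):
    """All genes whose TSS lies within `window` bp of the peak midpoint."""
    return [name for name, tss in genes if abs(midpoint - tss) <= window]


# Toy annotation: two genes on one chromosome.
genes = [("A", 1000), ("B", 50000)]
gene_bodies = [("A", 1000, 3000), ("B", 50000, 60000)]

by_nearest = nearest_gene(2000, genes)
by_overlap = overlapping_genes(1500, 2500, gene_bodies)
by_window = genes_in_window(2000, genes, window=100_000)
```

Comparing the three results for a peak is a quick consistency check: agreement across methods supports the assignment, while a window that returns several genes flags a peak whose nearest-gene call should be treated with caution.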