cyberneticlibrary

Discover TF binding motifs in ChIP-seq peaks

motif-analysis · skill · setup · L335
ammawla/encode-toolkit
What it does

Performs de novo motif discovery and known-motif enrichment analysis on genomic regions such as ChIP-seq peaks.

Best for

Identifying unknown binding factors, or validating ChIP-seq specificity, when no prior motif is available.

Inputs
  • Peak sequences (FASTA)
  • Optional: JASPAR motif database
  • Optional: motif PWM file
Outputs
  • De novo motif PWMs
  • Motif enrichment p-values
  • Motif logos
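
The PWM and logo outputs are linked: the height of each logo column reflects that column's information content. As a minimal sketch, assuming a PWM stored as a list of per-position probability columns in A, C, G, T order and a uniform background (both are illustrative assumptions, not the toolkit's own format), the per-position information in bits could be computed like this:

```python
import math

# Assumption: uniform background over A, C, G, T. Real analyses often use
# a genome- or peak-derived background instead.
UNIFORM = (0.25, 0.25, 0.25, 0.25)

def column_information(column, background=UNIFORM):
    """Relative entropy (bits) of one PWM column against the background."""
    total = 0.0
    for p, b in zip(column, background):
        if p > 0:  # a zero-probability base contributes nothing
            total += p * math.log2(p / b)
    return total

def pwm_information(pwm, background=UNIFORM):
    """Information content (bits) for every position of a PWM."""
    return [column_information(col, background) for col in pwm]

# Hypothetical 3-position motif: fixed base, no preference, two-base choice.
example_pwm = [
    [1.0, 0.0, 0.0, 0.0],
    [0.25, 0.25, 0.25, 0.25],
    [0.5, 0.5, 0.0, 0.0],
]
info = pwm_information(example_pwm)
```

Positions near 2 bits are strongly constrained, while positions near 0 bits carry little signal.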
Requires
  • Bash/CLI
  • HTTP/REST API
  • ENCODE/UCSC genomics APIs
  • Database
  • Git/GitHub
  • Runtime (Python/Node)
Preconditions
  • Peak sequences in FASTA format
  • MEME/HOMER installed or accessible via API
  • Reference genome assembly (FASTA)
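
These preconditions can be checked before any motif run. The sketch below is one possible pre-flight check, not the toolkit's own code: the helper names are hypothetical, and the executable names (`meme` and `streme` from the MEME Suite, `findMotifsGenome.pl` from HOMER) are assumptions about which tools a local install would expose on `PATH`.

```python
import shutil

def find_tools(names=("meme", "streme", "findMotifsGenome.pl")):
    """Map each executable name to its path on PATH, or None if missing."""
    return {name: shutil.which(name) for name in names}

def count_fasta_records(path):
    """Count FASTA records by counting header lines that start with '>'."""
    count = 0
    with open(path) as handle:
        for line in handle:
            if line.startswith(">"):
                count += 1
    return count
```

A caller might refuse to proceed when every value returned by `find_tools()` is `None`, or when `count_fasta_records()` returns zero for the peak file.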
Failure modes
  • Low-complexity sequences dominate de novo discovery
  • Misspecified background model (distorts p-values)
  • Motif logos are hard to interpret when information content is low
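
One way to guard against the low-complexity failure mode is to flag such sequences before discovery. The sketch below uses Shannon entropy of base composition as a crude screen. The function names and the 1.5-bit threshold are illustrative assumptions, and this measure will not catch repeats with balanced composition (for example, tandem copies of ACGT), for which a dedicated masking tool is more appropriate.

```python
import math
from collections import Counter

def base_entropy(seq):
    """Shannon entropy (bits) of the A/C/G/T composition of a sequence."""
    counts = Counter(b for b in seq.upper() if b in "ACGT")
    n = sum(counts.values())
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def split_by_complexity(records, min_entropy=1.5):
    """Split (name, sequence) pairs into kept and flagged lists.

    min_entropy is a hypothetical cutoff; tune it on your own data.
    """
    kept, flagged = [], []
    for name, seq in records:
        target = kept if base_entropy(seq) >= min_entropy else flagged
        target.append((name, seq))
    return kept, flagged

# Hypothetical peaks: one mixed-composition, one poly-A, one AC repeat.
peaks = [
    ("peak1", "ACGTACGTTGCA"),
    ("peak2", "AAAAAAAAAAAA"),
    ("peak3", "ACACACACACAC"),
]
kept, flagged = split_by_complexity(peaks)
```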
Trust signals
  • Heinz et al. 2010 (HOMER protocol)
  • Multiple motif comparison methods
  • FDR-corrected p-values
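
The listing does not say which FDR correction is applied. As an illustration only, assuming the widely used Benjamini-Hochberg procedure, raw motif enrichment p-values could be converted to q-values as follows (the function name and input values are hypothetical):

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    n = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(n), key=lambda i: pvalues[i])
    qvalues = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonic q-values.
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * n / rank)
        qvalues[idx] = running_min
    return qvalues

# Hypothetical enrichment p-values for four candidate motifs.
raw_p = [0.01, 0.04, 0.03, 0.005]
adjusted = benjamini_hochberg(raw_p)
```

Motifs would then be reported only if their adjusted value falls below a chosen FDR threshold, such as 0.05.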