Discover TF binding motifs in ChIP-seq peaks
motif-analysis · skill · setup L3 · ★35
ammawla/encode-toolkit
What it does
Perform de novo motif discovery and known-motif enrichment analysis in genomic regions
Best for
Discovering unknown binding factors or validating ChIP-seq specificity when a prior motif is not available.
Inputs
- · Peak sequences (FASTA)
- · Optional: JASPAR motif database
- · Optional: motif PWM file
Outputs
- · De novo motif PWMs
- · Motif enrichment p-values
- · Motif logos
Requires
- · Bash/CLI
- · HTTP/REST API
- · ENCODE/UCSC genomics APIs
- · Database
- · Git/GitHub
- · Runtime (Python/Node)
Preconditions
- · Peak sequences in FASTA
- · MEME/HOMER installed or accessible via API
- · Reference genome assembly (FASTA)
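The preconditions above can be checked before a run with a small sketch like the one below. It is illustrative only: `read_fasta` and `find_motif_tools` are hypothetical helper names, and the executable names (`meme`, `streme`, `findMotifsGenome.pl`, `homer2`) are assumed to be how the MEME Suite and HOMER are exposed on your PATH, which may differ per installation.

```python
import shutil


def read_fasta(lines):
    """Parse FASTA text (an iterable of lines) into {header: sequence}.

    Raises ValueError on sequence data that precedes any '>' header
    or on a repeated header, both of which usually indicate a bad export.
    """
    records = {}
    header = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            header = line[1:]
            if header in records:
                raise ValueError(f"duplicate FASTA header: {header}")
            records[header] = []
        elif header is None:
            raise ValueError("sequence data before the first '>' header")
        else:
            records[header].append(line)
    return {name: "".join(parts) for name, parts in records.items()}


def find_motif_tools(candidates=("meme", "streme", "findMotifsGenome.pl", "homer2")):
    """Map each candidate executable name to its resolved path, or None.

    The candidate names are assumptions about a typical install.
    """
    return {name: shutil.which(name) for name in candidates}
```

For a file on disk, `read_fasta(open("peaks.fa"))` would work, where `peaks.fa` is a placeholder path. An all-`None` result from `find_motif_tools()` suggests falling back to API access.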
Failure modes
- · Low-complexity sequences dominate de novo discovery
- · Background model misspecified (affects p-values)
- · Motif logos hard to interpret if low information content
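Two of the failure modes above can be screened for numerically. The sketch below is a rough illustration, not the method used by MEME or HOMER. It uses base-composition entropy as a crude proxy for low-complexity sequence, and per-column information content against an assumed uniform background (0.25 per base) as a proxy for how readable a logo will be. All function names are hypothetical.

```python
import math
from collections import Counter


def sequence_entropy(seq):
    """Shannon entropy (bits) of the A/C/G/T composition of a sequence.

    Ranges from 0.0 (a single base) to 2.0 (all four bases equally common).
    Other characters such as N are ignored.
    """
    counts = Counter(base for base in seq.upper() if base in "ACGT")
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def column_information(probs):
    """Information content (bits) of one PWM column of A/C/G/T probabilities.

    Assumes a uniform background, so the maximum is 2.0 bits.
    """
    return 2.0 + sum(p * math.log2(p) for p in probs if p > 0)


def motif_information(pwm):
    """Total information content (bits) summed over all columns of a PWM."""
    return sum(column_information(column) for column in pwm)
```

Composition entropy misses simple repeats such as `ACACAC`, so it complements a dedicated low-complexity masker rather than replacing one. If the real background is far from uniform (for example, GC-rich promoters), the uniform assumption in `column_information` overstates or understates information content accordingly.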
Trust signals
- · Heinz et al. 2010 (HOMER protocol)
- · Multiple motif comparison methods
- · FDR-corrected p-values
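The card does not say which FDR procedure is applied. As an illustration only, the sketch below assumes the Benjamini-Hochberg step-up procedure, a common choice for motif enrichment p-values. `benjamini_hochberg` is a hypothetical helper name, not a function from the toolkit.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Each adjusted value is p * m / rank, made monotone by taking the
    running minimum from the largest p-value downward and capped at 1.0.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

For example, `benjamini_hochberg([0.01, 0.04, 0.03, 0.005])` gives adjusted values of about 0.02, 0.04, 0.04 and 0.02, so all four motifs would pass a 0.05 threshold.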