cyberneticlibrary

Train genomic region embeddings

geniml · skill · setup · 227,559
K-Dense-AI/scientific-agent-skills
What it does

Trains machine learning models on genomic regions (intervals from BED files) to produce region embeddings.

Best for

Unsupervised learning on genomic interval data, region embeddings, and single-cell ATAC analysis.

Inputs
  • BED file collection
  • Universe peak reference
  • Optional: cell metadata
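To illustrate how the first two inputs relate, the sketch below maps each region of a BED file onto the universe regions it overlaps, giving a list of tokens that a word2vec-style model could train on. It is plain Python under stated assumptions, not geniml's own tokenizer: `parse_bed_lines`, `build_index`, and `tokenize` are hypothetical helper names, intervals are treated as half-open, and universe regions are assumed not to overlap one another.

```python
from bisect import bisect_right
from collections import defaultdict


def parse_bed_lines(lines):
    """Parse BED text lines into (chrom, start, end) tuples (first three columns only)."""
    regions = []
    for line in lines:
        line = line.strip()
        # Skip blank lines and the usual header/comment lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split()[:3]
        regions.append((chrom, int(start), int(end)))
    return regions


def build_index(universe):
    """Group universe regions by chromosome, sorted by start, with a parallel list of starts."""
    by_chrom = defaultdict(list)
    for chrom, start, end in universe:
        by_chrom[chrom].append((start, end))
    index = {}
    for chrom, intervals in by_chrom.items():
        intervals.sort()
        index[chrom] = ([s for s, _ in intervals], intervals)
    return index


def tokenize(regions, index):
    """Return one 'chrom:start-end' token per universe region overlapped by each input region."""
    tokens = []
    for chrom, start, end in regions:
        if chrom not in index:
            continue  # chromosome absent from the universe: no token emitted
        starts, intervals = index[chrom]
        # Begin at the last universe region that starts at or before the query start.
        i = max(bisect_right(starts, start) - 1, 0)
        while i < len(intervals) and intervals[i][0] < end:
            u_start, u_end = intervals[i]
            if u_end > start:
                tokens.append(f"{chrom}:{u_start}-{u_end}")
            i += 1
    return tokens


# Hypothetical toy data, for illustration only.
universe = [("chr1", 100, 200), ("chr1", 300, 400), ("chr2", 50, 150)]
bed_lines = ["chr1\t150\t320", "chr2\t0\t50", "chr2\t140\t160"]
tokens = tokenize(parse_bed_lines(bed_lines), build_index(universe))
print(tokens)
```

Running the tokenizer once per BED file would give one token list per file (or per cell, for single-cell ATAC data), which is the "sentence" unit a Word2vec-style trainer expects.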
Outputs
  • Region embeddings array
  • Model weights
  • Cluster assignments
Requires
  • PyTorch
  • Word2vec/StarSpace
  • scanpy (optional)
Preconditions
  • BED format valid
  • Universe prebuilt
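The first precondition can be checked before any training starts. The sketch below is one minimal way to do it in plain Python; `validate_bed_line` and `validate_bed` are hypothetical helpers, not part of geniml. It checks only the first three columns and assumes `start < end`, which is stricter than the BED format itself.

```python
def validate_bed_line(line, line_no=0):
    """Return an error message for a malformed BED line, or None if it looks valid."""
    fields = line.split()
    if len(fields) < 3:
        return f"line {line_no}: expected at least 3 fields, got {len(fields)}"
    chrom, start, end = fields[:3]
    try:
        start_i, end_i = int(start), int(end)
    except ValueError:
        return f"line {line_no}: start and end must be integers"
    if start_i < 0:
        return f"line {line_no}: start must be non-negative"
    # Assumption: zero-length or inverted regions are rejected for training purposes.
    if start_i >= end_i:
        return f"line {line_no}: start must be less than end"
    return None


def validate_bed(lines):
    """Collect error messages for all malformed data lines in a BED file."""
    errors = []
    for line_no, line in enumerate(lines, start=1):
        stripped = line.strip()
        # Blank lines and header/comment lines are not data lines.
        if not stripped or stripped.startswith(("#", "track", "browser")):
            continue
        error = validate_bed_line(stripped, line_no)
        if error is not None:
            errors.append(error)
    return errors


# Hypothetical toy input: one valid line, three malformed lines, one comment.
sample = ["chr1\t10\t20", "chr1\t30\t25", "chr2\tabc\t5", "chr3\t5", "# comment"]
for message in validate_bed(sample):
    print(message)
```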
Failure modes
  • Insufficient coverage
  • Tokenization mismatch
  • Out of memory (OOM)
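The first two failure modes can often be spotted with a quick diagnostic before training. The sketch below reads "insufficient coverage" as a low share of input regions overlapping the universe, and "tokenization mismatch" as chromosome names that differ between the two (for example `1` versus `chr1`). Both readings are assumptions, and `chrom_name_mismatch` and `coverage_fraction` are hypothetical helpers rather than geniml functions.

```python
def chrom_name_mismatch(regions, universe):
    """Chromosome names used by the regions but absent from the universe."""
    return sorted({c for c, _, _ in regions} - {c for c, _, _ in universe})


def coverage_fraction(regions, universe):
    """Fraction of regions overlapping at least one universe region (half-open intervals)."""
    if not regions:
        return 0.0
    by_chrom = {}
    for chrom, start, end in universe:
        by_chrom.setdefault(chrom, []).append((start, end))
    # Simple quadratic scan per chromosome; fine for a diagnostic on small inputs.
    hits = sum(
        any(u_start < end and start < u_end for u_start, u_end in by_chrom.get(chrom, []))
        for chrom, start, end in regions
    )
    return hits / len(regions)


# Hypothetical toy data, for illustration only.
universe = [("chr1", 100, 200), ("chr1", 300, 400)]
regions = [
    ("chr1", 150, 160),  # inside a universe region
    ("1", 150, 160),     # same coordinates, different chromosome naming
    ("chr1", 200, 300),  # falls in the gap between universe regions
    ("chr1", 390, 500),  # partially overlaps a universe region
]
print(chrom_name_mismatch(regions, universe))
print(coverage_fraction(regions, universe))
```

A non-empty mismatch list or a low fraction suggests rebuilding the universe or harmonising chromosome names first. What counts as "low" depends on the dataset.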
Trust signals
  • BSD-2-Clause license
  • databio/geniml repo