cyberneticlibrary

Parse documents into structured text

liteparse · skill · setup · L227,559
K-Dense-AI/scientific-agent-skills
What it does

Extracts layout-aware text and bounding boxes from PDFs, Office documents, and images, running entirely locally.

Best for

Local parsing with spatial awareness when cloud APIs are forbidden or bounding boxes are required for RAG grounding.

Inputs
  • Document file (PDF, DOCX, XLSX, images) or raw bytes
  • Optional: target pages, OCR settings, DPI
Outputs
  • Layout-preserved text, or JSON with per-item bounding boxes, font metadata, and OCR confidence
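
As a rough illustration of how the JSON output might be consumed for RAG grounding, the sketch below turns parsed items into records that carry their source location. The field names (`items`, `page`, `text`, `bbox`) and the `[x0, y0, x1, y1]` box layout are assumptions made for this example, not the tool's documented schema; check the real output before relying on them.

```python
import json
from dataclasses import dataclass


@dataclass
class GroundedChunk:
    """A text snippet plus the location it came from (hypothetical shape)."""
    page: int
    text: str
    bbox: tuple  # assumed order: (x0, y0, x1, y1)


def to_grounded_chunks(raw_json: str) -> list:
    """Convert parser-style JSON into chunks that keep their source location.

    Assumes a top-level {"items": [...]} list; this layout is illustrative only.
    """
    doc = json.loads(raw_json)
    chunks = []
    for item in doc.get("items", []):
        text = item.get("text", "").strip()
        if not text:
            continue  # skip items with no usable text
        chunks.append(
            GroundedChunk(page=item["page"], text=text, bbox=tuple(item["bbox"]))
        )
    return chunks


# Made-up sample input, standing in for real parser output.
sample = json.dumps({
    "items": [
        {"page": 1, "text": "Quarterly revenue", "bbox": [72.0, 90.5, 310.0, 108.0]},
        {"page": 1, "text": "   ", "bbox": [72.0, 120.0, 80.0, 130.0]},
        {"page": 2, "text": "Risk factors", "bbox": [72.0, 64.0, 200.0, 80.0]},
    ]
})

chunks = to_grounded_chunks(sample)
for chunk in chunks:
    print(f"page {chunk.page} at {chunk.bbox}: {chunk.text}")
```

Keeping the page and box alongside each snippet lets a retrieval layer point an answer back to the exact region of the source document.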
Requires
  • LLM/Claude API
  • Bash/CLI
  • Vault/Obsidian
  • HTTP/REST API
  • Git/GitHub
  • Runtime (Python/Node)
Preconditions
  • Python 3.10+
  • Optional: LibreOffice (for Office formats), ImageMagick; Tesseract is bundled
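
The preconditions above can be checked before a first run with a few standard-library calls. This is a minimal sketch: the executable names (`soffice`, `libreoffice`, `magick`, `convert`) are common defaults and are assumptions here, since installations may use other names or paths. On Windows, `convert` can resolve to an unrelated system tool.

```python
import shutil
import sys


def check_preconditions() -> dict:
    """Report which of the listed preconditions appear to be met locally."""
    return {
        "python>=3.10": sys.version_info >= (3, 10),
        # Executable names are assumed defaults, not confirmed by the tool.
        "libreoffice": any(shutil.which(name) for name in ("soffice", "libreoffice")),
        "imagemagick": any(shutil.which(name) for name in ("magick", "convert")),
    }


status = check_preconditions()
for name, ok in status.items():
    print(f"{name}: {'found' if ok else 'missing'}")
```

Since LibreOffice and ImageMagick are listed as optional, a `missing` result for either only matters when the corresponding file types need to be parsed.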
Failure modes
  • OCR confidence drops on poor-quality scans
  • Complex layouts (multi-column, sidebars) can be tokenized incorrectly
  • Font detection is unavailable for embedded fonts
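
One way to guard against the first failure mode is to flag pages whose mean OCR confidence falls below a threshold, so they can be reviewed or re-run with different OCR or DPI settings. The sketch below assumes each item is a dict with `page` and an optional `confidence` between 0 and 1; both the field names and the 0.80 cutoff are illustrative choices, not values taken from the tool.

```python
from collections import defaultdict
from statistics import mean


def pages_needing_review(items, threshold=0.80):
    """Return sorted page numbers whose mean OCR confidence is below threshold.

    Items without a confidence value (for example, text that was never OCR'd)
    are ignored rather than counted as low confidence.
    """
    by_page = defaultdict(list)
    for item in items:
        conf = item.get("confidence")
        if conf is not None:
            by_page[item["page"]].append(conf)
    return sorted(page for page, confs in by_page.items() if mean(confs) < threshold)


# Made-up sample items, standing in for real parser output.
sample_items = [
    {"page": 1, "text": "Clean heading", "confidence": 0.97},
    {"page": 1, "text": "Clean body", "confidence": 0.93},
    {"page": 2, "text": "Sm4dged t3xt", "confidence": 0.41},
    {"page": 2, "text": "Partly legible", "confidence": 0.72},
    {"page": 3, "text": "No OCR score"},
]

flagged = pages_needing_review(sample_items)
print(flagged)  # prints [2]
```

Averaging per page rather than filtering single items keeps one bad glyph from discarding an otherwise clean page, at the cost of hiding isolated errors.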
Trust signals
  • Rust core for speed
  • Bundled Tesseract (no external OCR calls)
  • Batch processing via CLI