Scrape and extract web data

firecrawl-cli | skill | setup | L20
firecrawl/cli
What it does

Scrapes, crawls, and extracts web data from the command line.

Best for

Extracting structured data from dynamic websites in CI/CD pipelines or local scripts, without the overhead of running a browser.

Inputs
  • URL or website domain
  • Format flags (markdown, html, json)
  • Schema for structured extraction
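
To make the "schema for structured extraction" input concrete, here is a minimal sketch of what such a schema might look like. It assumes the schema is written as JSON Schema; the field names (`title`, `price`, `in_stock`) and the helper `missing_required` are hypothetical and do not come from Firecrawl. How the schema is passed to the CLI is not shown, since the card does not specify it.

```python
import json

# Hypothetical schema for a product page. The field names are illustrative
# only and are not part of Firecrawl.
product_schema = {
    "type": "object",
    "properties": {
        "title": {"type": "string"},
        "price": {"type": "number"},
        "in_stock": {"type": "boolean"},
    },
    "required": ["title", "price"],
}


def missing_required(record, schema):
    """Return the schema's required keys that are absent from record."""
    return [key for key in schema.get("required", []) if key not in record]


# Serialize the schema so it can be written to a file or handed to a tool.
schema_text = json.dumps(product_schema, indent=2)
```

A check such as `missing_required(extracted, product_schema)` can be run on the returned JSON in a pipeline to catch records that came back without the fields the schema marks as required.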
Outputs
  • Markdown content
  • Raw HTML
  • JSON with extracted data
  • Screenshots
Requires
  • Firecrawl API (firecrawl.dev or self-hosted)
Preconditions

A Firecrawl API key must be configured, either through an environment variable or by logging in.
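
A script can verify this precondition before it starts a scrape. The sketch below covers only the environment-variable path and assumes the variable is named `FIRECRAWL_API_KEY`, which this card does not confirm. It cannot detect a key stored by a login step. The function name `api_key_configured` is hypothetical.

```python
import os

# Assumed variable name; the card only says "env var".
API_KEY_VAR = "FIRECRAWL_API_KEY"


def api_key_configured(env=None):
    """Return True if a non-empty API key is present in the environment.

    env defaults to os.environ; passing a plain dict makes the check testable.
    """
    env = os.environ if env is None else env
    return bool(env.get(API_KEY_VAR, "").strip())
```

In CI, calling `api_key_configured()` at the top of a job and failing fast gives a clearer error than an authentication failure partway through a crawl.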

Failure modes

Authentication fails if the API key is missing; large crawls can time out; JavaScript rendering can delay results.
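
One way to guard a script against the timeout and delay cases is to run the command with a hard timeout and retry with exponential backoff. This is a generic sketch, not Firecrawl-specific behavior: `run_with_retry` is a hypothetical helper, the caller supplies the command, and the default timeout and delay values are arbitrary.

```python
import subprocess
import time


def run_with_retry(cmd, timeout_s=60.0, attempts=3, base_delay_s=1.0):
    """Run cmd, retrying on timeout or non-zero exit; return stdout on success.

    Waits base_delay_s, then 2x, 4x, ... between attempts. Raises RuntimeError
    once every attempt has failed.
    """
    last_error = None
    for attempt in range(attempts):
        try:
            result = subprocess.run(
                cmd, capture_output=True, text=True, timeout=timeout_s
            )
        except subprocess.TimeoutExpired:
            last_error = f"timed out after {timeout_s}s"
        else:
            if result.returncode == 0:
                return result.stdout
            last_error = f"exit code {result.returncode}: {result.stderr.strip()}"
        # Back off before the next attempt, but not after the last one.
        if attempt < attempts - 1:
            time.sleep(base_delay_s * 2 ** attempt)
    raise RuntimeError(f"command failed after {attempts} attempts: {last_error}")
```

The helper does not distinguish failure causes, so an authentication failure is retried even though retrying cannot fix it. Checking for the API key first avoids that wasted work.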

Trust signals
  • Official Firecrawl CLI with init workflow
  • MCP server integration for AI agents
  • Multiple output formats, including structured extraction