
Scrape URLs at scale with Bright Data

brightdataskillsetup L20
Sheshiyer/skill-clusters
What it does

Scrapes a URL through four progressive fallback tiers, escalating to the next tier only when the current one fails to return content
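
A minimal sketch of a progressive fallback chain, assuming each tier is a callable that takes a URL and returns markdown, or None when it gets nothing. The names `scrape_with_fallback` and `Tier` are hypothetical and not taken from the skill itself. The card lists WebFetch, Curl, Playwright, and Bright Data MCP as requirements; that the tiers run in that order is an assumption, and the real tiers would replace the stand-in functions shown in the usage lines.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical tier signature: URL in, markdown out (None when nothing was fetched).
Tier = Callable[[str], Optional[str]]


def scrape_with_fallback(url: str, tiers: List[Tuple[str, Tier]]) -> Tuple[str, str]:
    """Try each tier in order and return (tier_name, markdown) from the first success."""
    errors = []
    for name, fetch in tiers:
        try:
            content = fetch(url)
        except Exception as exc:  # a failing tier should not stop the escalation
            errors.append(f"{name}: {exc}")
            continue
        if content and content.strip():
            return name, content
        errors.append(f"{name}: empty content")
    # Every tier was tried and none produced content.
    raise RuntimeError(f"all tiers failed for {url}: " + "; ".join(errors))


# Stand-in tiers for illustration only; they do not call any real tool.
def blocked_tier(url: str) -> Optional[str]:
    raise ConnectionError("blocked by bot detection")


def working_tier(url: str) -> Optional[str]:
    return f"# Page at {url}"


tier_name, markdown = scrape_with_fallback(
    "https://example.com",
    [("tier 1", blocked_tier), ("tier 2", working_tier)],
)
```

Returning the tier name alongside the content lets the caller see how far the chain had to escalate.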

Best for

Sites whose bot detection or CAPTCHA blocks a naive fetch

Inputs
  • URL
  • content type (optional)
Outputs
  • markdown content
Requires
  • WebFetch
  • Curl
  • Playwright
  • Bright Data MCP
Preconditions

The URL is reachable directly or through a proxy; a Bright Data account is optional

Failure modes

The escalation chain is exhausted without returning content; a CAPTCHA blocks tier 4
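
The two failure modes can be told apart with a small check on what each tier returned. This is a sketch under stated assumptions: the marker strings, the function names `looks_like_captcha` and `classify_failure`, and the idea that a blocked tier 4 returns a recognizable challenge page are all hypothetical and not part of the skill.

```python
from typing import Dict, Optional

# Assumed marker phrases; real challenge pages vary and may need other checks.
CAPTCHA_MARKERS = ("captcha", "verify you are human")


def looks_like_captcha(content: str) -> bool:
    """Heuristic: does the returned text mention a known challenge phrase?"""
    lowered = content.lower()
    return any(marker in lowered for marker in CAPTCHA_MARKERS)


def classify_failure(results: Dict[int, Optional[str]]) -> str:
    """results maps a tier number (1 to 4) to the content it returned, or None."""
    last = results.get(4)
    if last is not None and looks_like_captcha(last):
        return "captcha_blocked_tier_4"
    if all(not (content and content.strip()) for content in results.values()):
        return "chain_exhausted"
    return "ok"
```

A caller could use the returned label to decide whether to retry later or report that the page cannot be scraped.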

Trust signals
  • Four-tier documented fallback
  • Markdown preservation