cyberneticlibrary

Scan website for broken links and crawl issues

broken-link-checker · skill · setup · L22,106
nowork-studio/toprank
What it does

Scans a website for broken internal and external links that hurt user experience and waste crawl budget.

Best for

Site owners auditing technical health before a major migration, and teams enforcing link quality in CI/CD.

Inputs
  • target website URL
Outputs
  • JSON report with a broken_links array
  • broken links grouped by status (404 vs. 5xx)
  • source page for each broken link
  • an actionable fix for each link (redirect, update, or remove)
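
To make the grouping concrete, here is a minimal sketch of how a report could be bucketed by status. Only the `broken_links` key comes from this card; the `url`, `status`, and `source` field names, the sample data, and the `group_by_status` helper are assumptions for illustration, not the checker's documented schema.

```python
from collections import defaultdict

# Hypothetical report: only the `broken_links` key is taken from the card.
# The per-link fields (url, status, source) are assumed for illustration.
report = {
    "broken_links": [
        {"url": "https://example.com/old", "status": 404, "source": "https://example.com/"},
        {"url": "https://example.com/api", "status": 503, "source": "https://example.com/docs"},
        {"url": "https://example.com/gone", "status": 404, "source": "https://example.com/blog"},
    ]
}


def group_by_status(report):
    """Bucket broken links into '404' and '5xx'; anything else goes to 'other'."""
    groups = defaultdict(list)
    for link in report["broken_links"]:
        status = link["status"]
        if status == 404:
            key = "404"
        elif 500 <= status <= 599:
            key = "5xx"
        else:
            key = "other"
        groups[key].append(link)
    return dict(groups)


groups = group_by_status(report)
```

Separating 404s from 5xx responses matters because a 404 usually points to a link that needs fixing, while a 5xx may be a server problem on the target side.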
Requires
  • Python script (seo/broken-link-checker/scripts/checker.py)
  • --max-pages parameter (default 50)
Preconditions
  • website is accessible from the scanning environment
  • robots.txt allows scanning
  • --max-pages is tuned to the site's size
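
The robots.txt precondition can be verified before a scan. The sketch below uses Python's standard `urllib.robotparser`; the `is_scan_allowed` helper, the `link-checker` user agent, and the sample rules are assumptions, and the card does not say how (or whether) checker.py performs this check itself.

```python
from urllib.robotparser import RobotFileParser


def is_scan_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Sample rules for illustration only; a real check would download
# the site's own /robots.txt first.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /private/
"""

allowed = is_scan_allowed(SAMPLE_ROBOTS, "link-checker", "https://example.com/blog/")
blocked = is_scan_allowed(SAMPLE_ROBOTS, "link-checker", "https://example.com/private/x")
```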
Failure modes
  • crawl rate limits (the scan returns incomplete results)
  • links on JavaScript-rendered pages are missed
  • false positives (temporary redirects, rate-limit 429s)
  • external link drift (a third-party site went down, which is not the scanned site's fault)
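
One way to reduce the false positives listed above is to triage each response before reporting it. This is a sketch under stated assumptions: the `triage` function, its labels, and the set of redirect codes treated as temporary are illustrative choices, not behavior documented for checker.py.

```python
# Redirect codes treated as temporary in this sketch (an assumption).
TEMPORARY_REDIRECTS = {302, 303, 307}


def triage(status: int) -> str:
    """Map an HTTP status code to a coarse action label."""
    if status == 429:
        return "retry_later"   # rate limited: the link itself may be fine
    if status in TEMPORARY_REDIRECTS:
        return "ignore"        # temporary redirect, not a broken link
    if status == 404 or 500 <= status <= 599:
        return "broken"        # the two groups the report distinguishes
    if status >= 400:
        return "review"        # other client errors need a human look
    return "ok"


labels = [triage(code) for code in (200, 302, 404, 429, 503, 403)]
```

A link labeled `retry_later` would be rechecked after a delay rather than reported, which keeps 429 responses out of the broken-link list.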
Trust signals
  • grouped analysis (internal links prioritized over external)
  • status code classification (404 vs. 5xx)
  • source mapping (which pages link to each broken URL)
  • actionable recommendations (301 redirect vs. content fix)
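
Source mapping and internal-first ordering can be sketched as follows. The `url` and `source` field names, the `is_internal` and `sources_by_broken_url` helpers, and the hostname comparison used to decide what counts as internal are all assumptions for illustration.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def is_internal(url: str, site_url: str) -> bool:
    """Treat a link as internal when its hostname matches the site's."""
    return urlsplit(url).hostname == urlsplit(site_url).hostname


def sources_by_broken_url(broken_links, site_url):
    """Map each broken URL to the pages linking to it, internal URLs first."""
    sources = defaultdict(list)
    for link in broken_links:
        sources[link["url"]].append(link["source"])
    # False sorts before True, so internal URLs come first; ties sort by URL.
    ordered = sorted(sources, key=lambda u: (not is_internal(u, site_url), u))
    return [(u, sources[u]) for u in ordered]


# Hypothetical sample data.
broken_links = [
    {"url": "https://other.test/x", "source": "https://example.com/a"},
    {"url": "https://example.com/old", "source": "https://example.com/a"},
    {"url": "https://example.com/old", "source": "https://example.com/b"},
]

ranked = sources_by_broken_url(broken_links, "https://example.com/")
```

Listing every source page next to a broken URL shows how many pages need editing, which helps when choosing between a single 301 redirect and fixing each page's content.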