Broken Link Checker — Recursive Site Crawler

Recursively crawl your website and find every broken link, 404, redirect, and timeout. Checks internal and external links with configurable depth. 100 links free per run.

Pricing: Pay per usage

Developer: Manchitt Sanan (Maintained by Community)

Find every broken link on your website. Recursively crawl from any start URL and report all 404 errors, bad redirects, timeouts, and server errors — with the exact page and anchor text where each broken link was found.


Why this exists

Broken links hurt your SEO rankings, frustrate visitors, and make your site look unmaintained. Manually checking links on a 500-page site takes hours. This actor crawls your entire site in minutes, checks every internal and external link, and gives you a structured report.

  • Recursive crawling — follows internal links up to a configurable depth, not just one page
  • External link checking — lightweight HEAD requests to verify links to other domains
  • Status categorization — every link classified as broken (404/410/5xx), redirect (301/302), timeout, or OK
  • Severity levels — critical (404, 5xx), warning (redirects, timeouts), info (working links)
  • Context — shows which page the broken link was found on and what the anchor text says
  • 100 links free per run — try it on your site with zero risk

Quick start

```json
{
  "startUrl": "https://example.com",
  "maxDepth": 3,
  "maxPages": 500,
  "checkExternalLinks": true
}
```

Hit Start and get a full report in minutes.
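For reference, here is an input that writes out the remaining options at the defaults listed in the Input table, including an example `ignoredPatterns` list:

```json
{
  "startUrl": "https://example.com",
  "maxDepth": 3,
  "maxPages": 500,
  "checkExternalLinks": true,
  "respectRobotsTxt": true,
  "ignoredPatterns": ["*logout*", "*admin*"],
  "outputFormat": "broken-only",
  "dryRun": false
}
```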


Feature comparison

| Feature | HTTP Status Checker | parseforge | This actor |
| --- | --- | --- | --- |
| Single URL check | Yes | Yes | Yes |
| Recursive site crawl | No | Yes | Yes |
| External link checking | No | Yes | Yes |
| Status categorization | No | Basic | 404/301/302/500/timeout |
| Severity classification | No | No | critical / warning / info |
| Anchor text context | No | No | Yes |
| Source page tracking | No | Yes | Yes |
| Configurable depth | No | Yes | Yes |
| Configurable max pages | No | Yes | Yes |
| Respect robots.txt | No | No | Yes (configurable) |
| URL pattern exclusion | No | No | Yes (glob patterns) |
| Dry run mode | No | No | Yes |
| Free tier | No | No | 100 links free |

Input

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| startUrl | string | (required) | URL to start crawling from |
| maxDepth | integer | 3 | Maximum link depth to follow (1–10) |
| maxPages | integer | 500 | Maximum pages to crawl (1–10,000) |
| checkExternalLinks | boolean | true | Check links pointing to other domains |
| respectRobotsTxt | boolean | true | Skip pages disallowed by robots.txt |
| ignoredPatterns | array | [] | URL patterns to skip (glob-style: `*logout*`, `*admin*`) |
| outputFormat | enum | broken-only | `broken-only` or `all` |
| sitemapUrl | string | (auto-detect) | URL to sitemap.xml. If not set, auto-checks /sitemap.xml and /sitemap_index.xml |
| webhookUrl | string | (optional) | POST full JSON results to this URL when the audit completes |
| googleSheetsId | string | (optional) | Export broken links to this Google Sheet (spreadsheet ID) |
| googleServiceAccountKey | string | (optional) | Google Service Account JSON key for Sheets export |
| dryRun | boolean | false | Preview what would be crawled — no charges |
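The `ignoredPatterns` field takes glob-style wildcards. As a rough illustration of how such patterns can be matched (this is a sketch of the concept, not the actor's published implementation; `globToRegExp` and `isIgnored` are illustrative names):

```javascript
// Convert a glob-style pattern (only "*" wildcards) into a RegExp,
// then test a URL against a list of patterns.
function globToRegExp(pattern) {
  // Escape regex metacharacters, then turn each "*" into ".*"
  const escaped = pattern.replace(/[.+?^${}()|[\]\\]/g, "\\$&");
  return new RegExp("^" + escaped.replace(/\*/g, ".*") + "$");
}

function isIgnored(url, ignoredPatterns) {
  return ignoredPatterns.some((p) => globToRegExp(p).test(url));
}
```

With patterns `["*logout*", "*admin*"]`, a URL like `https://example.com/admin/users` would be skipped while `https://example.com/about` would still be crawled.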

Output

```json
{
  "status": "success",
  "startUrl": "https://example.com",
  "summary": {
    "pagesChecked": 142,
    "linksChecked": 1847,
    "brokenLinks": 12,
    "redirects": 34,
    "errors": 3
  },
  "brokenLinks": [
    {
      "url": "https://example.com/old-page",
      "statusCode": 404,
      "statusCategory": "broken",
      "severity": "critical",
      "foundOn": "https://example.com/blog/post-1",
      "anchorText": "Learn more",
      "lastChecked": "2026-04-13T10:30:00Z",
      "error": null
    }
  ]
}
```
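Because the report is plain JSON, it is easy to post-process. A minimal sketch that pulls only the critical items out of a report shaped like the example above, grouped by the page they were found on (`criticalLinksByPage` is an illustrative helper, not part of the actor):

```javascript
// Extract critical broken links from a report shaped like the example
// above, grouped by the page each link was found on.
function criticalLinksByPage(report) {
  const byPage = {};
  for (const link of report.brokenLinks) {
    if (link.severity !== "critical") continue;
    (byPage[link.foundOn] ??= []).push(link.url);
  }
  return byPage;
}
```

This gives you a per-page fix list, which is usually how broken links get repaired in practice.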

Status categories

| Category | HTTP codes | Severity | Meaning |
| --- | --- | --- | --- |
| broken | 404, 410, 5xx | critical | Link target is dead or server is failing |
| redirect | 301, 302, 303, 307, 308 | warning | Link works but goes through a redirect — consider updating |
| timeout | (none) | warning | Server did not respond within 10 seconds |
| error | (none) | critical | Network error, DNS failure, or connection refused |
| ok | 2xx | info | Link is working (only shown in `all` output mode) |
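The table maps directly to a classification rule. A sketch of that mapping, assuming a status code of `null` means a network-level failure (`classifyLink` is an illustrative name; statuses outside the table, such as 403, are treated here as errors, and the actor's internal handling may differ):

```javascript
// Map an HTTP status code (or null for a network failure) plus a
// timed-out flag to the category/severity scheme in the table above.
function classifyLink(statusCode, timedOut = false) {
  if (timedOut) return { category: "timeout", severity: "warning" };
  if (statusCode == null) return { category: "error", severity: "critical" };
  if (statusCode === 404 || statusCode === 410 || statusCode >= 500)
    return { category: "broken", severity: "critical" };
  if ([301, 302, 303, 307, 308].includes(statusCode))
    return { category: "redirect", severity: "warning" };
  if (statusCode >= 200 && statusCode < 300)
    return { category: "ok", severity: "info" };
  // Statuses not covered by the table (e.g. 403) fall through as errors.
  return { category: "error", severity: "critical" };
}
```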

Pricing

$0.003 per link checked (pay-per-event pricing).

  • Only charged on successful runs — errors and dry runs are never charged.
  • 500 links = $1.50
  • 2,000 links = $6.00
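At $0.003 per link, cost is linear in links checked, so a run is easy to budget in advance (`estimateCost` below is just a convenience one-liner, not an official calculator):

```javascript
// Estimate run cost at $0.003 per link checked (pay-per-event),
// rounded to whole cents.
const COST_PER_LINK = 0.003;
const estimateCost = (linksChecked) =>
  Math.round(linksChecked * COST_PER_LINK * 100) / 100;
```

For example, a 500-link run comes to $1.50 and a 2,000-link run to $6.00, matching the figures above.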

Performance

  • Uses CheerioCrawler (pure HTTP) — no headless browser overhead
  • Default concurrency handled by Crawlee's built-in request queue
  • External links checked with parallel HEAD requests (batches of 20)
  • Typical: 200–500 links/minute depending on target site response time
  • 10-second timeout per request, 1 retry on failure
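The "batches of 20" behavior amounts to chunking the external URL list and firing the HEAD requests in each chunk concurrently. A sketch using `fetch` with a 10-second abort (the actor's real scheduling is internal; `chunk` and `headCheckAll` are illustrative names, and this requires Node 18+ for global `fetch`):

```javascript
// Split an array into fixed-size batches.
function chunk(items, size) {
  const out = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}

// Check URLs with parallel HEAD requests, 20 at a time, 10 s timeout each.
// A null status means the request failed or timed out.
async function headCheckAll(urls, batchSize = 20) {
  const results = [];
  for (const batch of chunk(urls, batchSize)) {
    const settled = await Promise.allSettled(
      batch.map((url) =>
        fetch(url, { method: "HEAD", signal: AbortSignal.timeout(10_000) })
      )
    );
    settled.forEach((r, i) =>
      results.push({
        url: batch[i],
        status: r.status === "fulfilled" ? r.value.status : null,
      })
    );
  }
  return results;
}
```

Batching caps the number of sockets open at once, which keeps memory flat and avoids tripping rate limits on the target servers.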

Limitations

  • JavaScript-rendered links are not detected. This actor uses HTTP requests only (CheerioCrawler), not a headless browser. Links injected by JavaScript after page load will be missed.
  • Some sites aggressively block crawlers. If you see many timeouts, try reducing maxConcurrency or disabling checkExternalLinks.
  • External links are checked with HEAD requests only. Some servers respond differently to HEAD vs GET — a HEAD 404 does not always mean GET would also 404.
  • Maximum 10,000 pages per run to prevent runaway costs.

Related actors

  • Domain Age Checker — Check registration date, expiration, registrar, and age for any domain via RDAP.
  • Tech Stack Detector — Detect frameworks, CMS, analytics, CDN, and 100+ technologies for any list of URLs.
  • Google Sheets Reader & Writer — Read any Google Sheet to JSON or append rows. Service Account auth — no OAuth blocks.

Run on Apify

No setup needed. Click above to run in the cloud. $0.003 per link checked.