Broken Link Checker — Find 404s, Dead Links & Redirect Issues
Pricing
from $1.00 / 1,000 links checked
Crawl a website, scan a URL list, or verify all URLs from a sitemap. Returns broken links with source page, anchor text, status, redirect chain, and failure class — for SEO audits, content QA, and migration validation.
Rating: 0.0 (0)
Developer: Khadin Akbar
Actor stats: 0 bookmarked, 2 total users, 2 monthly active users, last modified 2 days ago
Broken Link Checker — Find 404s, Dead Links, Redirects & Slow Pages
Scan a whole website, a list of URLs, or every URL in a sitemap.xml and find every broken link — 404s, server errors, timeouts, SSL/DNS failures, redirect chains, and slow pages. Each broken link comes with the source page that linked to it, the anchor text, the HTTP status, the redirect chain, and a failure class like broken, redirect_loop, timeout, dns_error, or ssl_error. Built for SEO audits, content QA, site migrations, and AI-agent link health checks.
Try it now — paste a website URL and get a CSV of every broken link in minutes.
What does Broken Link Checker do?
Broken Link Checker is a fast, configurable link auditor that runs on the Apify platform. Three modes cover every link-checking workflow:
- Crawl mode: start at a URL, crawl up to N pages on the same domain, extract every link (`<a>`, `<img>`, `<script>`, `<link>`, `<iframe>` if enabled), and verify each one.
- List mode: paste up to 5,000 URLs and get HTTP status + classification for each. No crawling.
- Sitemap mode: point at a `sitemap.xml` (sitemap-index files supported) and verify every URL inside.
For every URL the actor tries HEAD first (fast, ~1 KB), then falls back to GET if the server returns an ambiguous code (405, 403, 501, etc.), so the final result reflects what a regular GET client such as a browser would see. Redirects are tracked hop-by-hop so you can see the full chain. The output is a structured dataset you can download as CSV / JSON / Excel or query via the Apify API; runs can be scheduled or triggered from Make / Zapier / your own code.
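The HEAD-first strategy described above can be sketched as follows. This is a minimal illustration, not the actor's actual code: `check_url`, the `send` callback, and the exact set of fallback codes are assumptions (the text only names 405, 403, 501 "etc."). A fake transport stands in for the network so the example runs offline.

```python
from typing import Callable

# Status codes that trigger a GET retry. Only 405, 403 and 501 are named in
# the text; treating exactly this set as "ambiguous" is an assumption.
FALLBACK_STATUSES = {403, 405, 501}

# A transport is any callable (method, url) -> HTTP status code.
Transport = Callable[[str, str], int]


def check_url(url: str, send: Transport) -> dict:
    """Try HEAD first; retry with GET if the HEAD status is ambiguous."""
    method = "HEAD"
    status = send(method, url)
    if status in FALLBACK_STATUSES:
        method = "GET"
        status = send(method, url)
    # `method` records which request produced the final status.
    return {"url": url, "status": status, "method": method}


def fake_send(method: str, url: str) -> int:
    """Stand-in server that rejects HEAD but answers GET normally."""
    return 405 if method == "HEAD" else 200


result = check_url("https://example.com/", fake_send)
print(result)
```

In a real checker the `send` callable would wrap an HTTP client with the configured timeout and User-Agent; keeping it injectable makes the fallback rule easy to test in isolation.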
Why use Broken Link Checker?
- Fix SEO-killing 404s — broken internal links bleed PageRank and frustrate users; broken outbound links damage E-E-A-T signals.
- Validate site migrations — confirm every old URL 301s correctly to its new home; flag redirect chains that lose link equity.
- Audit content at scale — agencies running monthly link-rot reports, enterprises auditing 10,000+ blog posts.
- Pre-launch QA — catch dead links in marketing pages, docs, and email templates before they go live.
- AI agents — perfect MCP tool for Claude / GPT agents auditing site quality. Predictable cost (~$0.001 per URL), structured JSON output, no setup required.
How to use Broken Link Checker
- Open the actor on Apify Console and click Try for free.
- Pick an input mode:
  - `crawl` → paste a website URL into Website URL and set Max pages.
  - `list` → paste your URLs into URLs to verify.
  - `sitemap` → paste your sitemap.xml URL into Website URL.
- (Optional) Tune Max links to verify, toggle Check external links, and enable Check images, scripts, and stylesheets for a full content audit.
- Click Save & Start. The actor reports progress in real time.
- When finished, open the Output tab → download as CSV, JSON, or Excel, or pull via the Apify API.
Input
| Field | Type | Default | Description |
|---|---|---|---|
| `mode` | string | `crawl` | One of `crawl`, `list`, `sitemap`. |
| `startUrl` | string | (none) | Required for `crawl` and `sitemap`. The site to crawl or the sitemap.xml URL. |
| `urls` | array | (none) | Required for `list`. Up to 5,000 URLs. |
| `maxPages` | int | `50` | Max internal pages to crawl (crawl mode only). |
| `maxLinksToCheck` | int | `500` | Hard cap on links verified per run. |
| `checkExternalLinks` | bool | `true` | Verify links pointing to other domains. |
| `checkAssets` | bool | `false` | Also check `<img>`, `<script>`, `<link>`, `<iframe>` URLs. |
| `onlyReportBroken` | bool | `true` | Dataset contains only failing links. Set to `false` for a full inventory. |
| `slowThresholdMs` | int | `5000` | Above this, link is classified `slow`. |
| `requestTimeoutMs` | int | `15000` | Per-request timeout in milliseconds. |
| `maxConcurrency` | int | `20` | Parallel verification requests. |
| `maxRedirects` | int | `10` | Max hops before classifying as `redirect_loop`. |
| `userAgent` | string | ApifyBrokenLinkChecker UA | Custom User-Agent for all requests. |
| `proxyConfiguration` | object | Apify datacenter | Proxy settings; switch to residential only if blocked. |
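As a rough illustration of how the fields above fit together, the sketch below assembles an input object and enforces the requirements stated in the table (`startUrl` for `crawl` and `sitemap`, `urls` for `list`, at most 5,000 URLs). The helper `build_input` is hypothetical and not part of the actor; only the field names come from the table.

```python
import json
from typing import List, Optional

MAX_LIST_URLS = 5000  # limit stated in the input table


def build_input(mode: str, start_url: Optional[str] = None,
                urls: Optional[List[str]] = None, **overrides) -> dict:
    """Assemble an input object and enforce the documented requirements."""
    if mode not in ("crawl", "list", "sitemap"):
        raise ValueError(f"unknown mode: {mode!r}")
    run_input = {"mode": mode}
    if mode in ("crawl", "sitemap"):
        if not start_url:
            raise ValueError(f"startUrl is required for {mode} mode")
        run_input["startUrl"] = start_url
    else:
        if not urls:
            raise ValueError("urls is required for list mode")
        if len(urls) > MAX_LIST_URLS:
            raise ValueError(f"list mode accepts at most {MAX_LIST_URLS} URLs")
        run_input["urls"] = list(urls)
    # Any other field from the table (maxPages, checkAssets, ...) passes through.
    run_input.update(overrides)
    return run_input


crawl_input = build_input("crawl", start_url="https://example.com",
                          maxPages=100, checkExternalLinks=False)
print(json.dumps(crawl_input, indent=2))
```

The printed JSON can be pasted into the actor's JSON input editor or sent as the run input through the Apify API.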
Output
Each broken link becomes a structured record in the dataset.
{
  "url": "https://example.com/missing-page",
  "finalUrl": "https://example.com/missing-page",
  "status": 404,
  "statusText": "Not Found",
  "classification": "broken",
  "isBroken": true,
  "sourceUrl": "https://example.com/blog/post-with-broken-link",
  "anchorText": "see our pricing",
  "linkType": "a",
  "allSources": [
    { "sourceUrl": "https://example.com/blog/post-with-broken-link", "anchorText": "see our pricing", "linkType": "a" }
  ],
  "method": "GET",
  "hops": 0,
  "redirectChain": [],
  "durationMs": 234,
  "error": null,
  "checkedAt": "2026-05-03T19:55:00.000Z"
}
The final record is a _summary: true row with totals and a per-class breakdown.
You can download the dataset in JSON, CSV, Excel, HTML, RSS, or XML, or pull it via API.
Data table
| Field | Description |
|---|---|
| `url` | Normalized URL that was checked. |
| `finalUrl` | URL after following redirects. |
| `status` | HTTP status of the final response. `null` if no response. |
| `statusText` | HTTP status reason phrase. |
| `classification` | One of `ok`, `slow`, `redirect_chain`, `redirect_loop`, `broken`, `timeout`, `dns_error`, `ssl_error`, `connection_refused`, `blocked`, `error`. |
| `isBroken` | `true` for any failing class; an easy filter for actionable rows. |
| `sourceUrl` | First page on the crawled site that linked to this URL. |
| `anchorText` | Visible link text (or alt/title for assets), trimmed and capped at 200 chars. |
| `linkType` | How it was referenced: `a`, `img`, `script`, `link`, `iframe`, `sitemap`, `list`. |
| `allSources` | Up to 5 source pages that link to this URL. |
| `method` | HTTP method used for the final response (`HEAD` or `GET`). |
| `hops` | Redirect hops followed. |
| `redirectChain` | Array of `{ from, to, status }` per hop. |
| `durationMs` | Total verification time. |
| `error` | `{ code, message }` when verification failed. |
| `checkedAt` | ISO 8601 timestamp. |
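A common next step is to group the actionable rows by the page that needs editing. The sketch below does this on a few hand-written sample records shaped like the fields above; the sample data and the helper `broken_by_source` are illustrative assumptions, and the summary row is reduced to its `_summary` flag because its other fields are not described here.

```python
from collections import defaultdict
from typing import Dict, List

# Hand-written sample rows using field names from the data table.
records = [
    {"url": "https://example.com/missing-page", "status": 404,
     "classification": "broken", "isBroken": True,
     "sourceUrl": "https://example.com/blog/post-with-broken-link",
     "anchorText": "see our pricing"},
    {"url": "https://example.com/old", "status": None,
     "classification": "timeout", "isBroken": True,
     "sourceUrl": "https://example.com/blog/post-with-broken-link",
     "anchorText": "old page"},
    {"url": "https://example.com/slow", "status": 200,
     "classification": "slow", "isBroken": False,
     "sourceUrl": "https://example.com/about", "anchorText": "slow page"},
    {"_summary": True},  # summary row; its other fields are omitted here
]


def broken_by_source(rows: List[dict]) -> Dict[str, List[dict]]:
    """Group rows flagged isBroken by the page that links to them."""
    grouped: Dict[str, List[dict]] = defaultdict(list)
    for row in rows:
        if row.get("_summary") or not row.get("isBroken"):
            continue  # skip the summary row and non-broken classes
        # List and sitemap runs may have no source page, so fall back to a label.
        grouped[row.get("sourceUrl") or "(no source)"].append(row)
    return dict(grouped)


grouped = broken_by_source(records)
for source, rows in grouped.items():
    print(source)
    for row in rows:
        print(f"  {row['classification']:>8}  {row['url']}  ({row['anchorText']})")
```

With a real run, `records` would instead be loaded from the JSON export of the dataset.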
How much does it cost to check broken links?
Broken Link Checker uses pay-per-event pricing — only pay for what you actually verify.
| Event | Price |
|---|---|
| Actor start | $0.00005 |
| Link checked | $0.001 per URL verified |
Typical costs:
- Small site crawl (50 pages, ~200 links) → ~$0.20
- Bulk URL list (1,000 URLs) → ~$1.00
- Full sitemap (5,000 URLs) → ~$5.00
There are no monthly fees, no setup fees, and runs that fail before discovering links cost only the negligible $0.00005 start fee.
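The typical costs above follow directly from the two event prices. A small estimator, using only the prices from the pricing table (the function name `estimate_cost` is just for illustration):

```python
ACTOR_START_USD = 0.00005  # per run, from the pricing table
LINK_CHECKED_USD = 0.001   # per URL verified, from the pricing table


def estimate_cost(links_checked: int, runs: int = 1) -> float:
    """Estimated charge in USD: one start event per run plus one event per link."""
    return round(runs * ACTOR_START_USD + links_checked * LINK_CHECKED_USD, 5)


for label, links in [("small crawl", 200), ("URL list", 1000), ("full sitemap", 5000)]:
    print(f"{label}: ${estimate_cost(links):.5f}")
```

Because billing is per link verified, `maxLinksToCheck` acts as an upper bound on the cost of a single run.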
Tips & advanced options
- Cut cost in half by setting `checkExternalLinks: false`; most sites have more outbound links than internal pages.
- Skip asset checks (`checkAssets: false`) for hyperlink-only audits; typical sites have 5-10× more assets than `<a>` links.
- Use `mode: 'list'` for outreach link audits: paste your placement URLs directly without crawling.
- Use `mode: 'sitemap'` for full URL coverage on sites with deep navigation that's hard to crawl.
- For rate-limited targets, drop `maxConcurrency` to `5-10` and increase `requestTimeoutMs`.
- For fragile sites that block bots, switch `proxyConfiguration.apifyProxyGroups` to `["RESIDENTIAL"]` and customize `userAgent` to a real browser string.
- Combine with other Apify tools: pair with Bulk Website Contact Extractor to find broken backlinks AND contact owners, or feed sitemaps from Sitemap URL Extractor.
Schedule recurring link audits
On Apify Console, open the Schedules tab on the actor and add a cron expression — e.g. 0 6 * * 1 to run every Monday at 6 AM. Combine with the Apify webhook integration to push results into Slack, Google Sheets, or your own monitoring stack.
FAQ
Does it follow robots.txt?
The actor verifies links by issuing a single HEAD/GET per URL — the same load a normal browser visit would create. For polite crawling, the crawl mode respects robots.txt via Crawlee's defaults; you can override per request if needed.
Can it check authenticated pages?
Not yet. Open an issue if you'd like cookie-based authentication added.
What counts as a "broken" link?
The isBroken flag is true for these classes: broken (4xx/5xx, except 403/429), timeout, dns_error, ssl_error, connection_refused, redirect_loop, and error. The classes slow, redirect_chain, and blocked (403/429) are reported but not flagged broken — they're worth reviewing but may be intentional.
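The status-code part of this rule can be written down in a few lines. The sketch below covers only HTTP statuses; classes such as `timeout` or `dns_error` come from network failures and are not modeled. Treating every status below 400 as `ok` is an assumption, and the function names are hypothetical.

```python
# Classes flagged as broken, as listed in the answer above.
BROKEN_CLASSES = {"broken", "timeout", "dns_error", "ssl_error",
                  "connection_refused", "redirect_loop", "error"}


def classify_status(status: int) -> str:
    """Map a final HTTP status to a class (status-code branch only)."""
    if status in (403, 429):
        return "blocked"  # reported, but not flagged broken
    if 400 <= status <= 599:
        return "broken"
    return "ok"  # assumption: anything below 400 counts as ok here


def is_broken(classification: str) -> bool:
    """True for the classes that set the isBroken flag."""
    return classification in BROKEN_CLASSES


for code in (200, 403, 404, 429, 503):
    cls = classify_status(code)
    print(code, cls, is_broken(cls))
```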
Why HEAD first?
HEAD requests are 5-10× cheaper than GET for the target server and faster for you. Some servers refuse HEAD with 405/403/501; when that happens the actor automatically retries with GET, so you still get an accurate result.
Disclaimer
Broken Link Checker only verifies HTTP responses to URLs. It does not download page content beyond crawl mode (which fetches HTML to extract links). Use this actor responsibly: respect target site Terms of Service, set sensible concurrency for small sites, and do not use the tool to probe systems you do not own or have permission to audit. The actor reports raw HTTP signals — interpretation (whether a 403 is intentional access control or a broken link) is up to you.
Support and feedback
Found a bug or want a feature? Open an issue on the actor's Issues tab on Apify Console.