Broken Link Checker & Site Auditor
Pricing
$1.00 / 1,000 pages crawled
Crawl websites to detect 404 broken links and missing resources. Essential for maintaining technical SEO and user experience.
Developer
Andok
Last modified
18 days ago
Broken Links Checker
Find every broken link on your website before visitors and search engines do. Broken links tank user experience and leak link equity during site migrations — this actor crawls your pages, checks every link (internal and external), and reports exactly which URLs return 404, 5xx, or timeout errors. Configure crawl depth, page limits, and concurrency to match sites of any size.
Features
- Deep crawling — follows internal links up to a configurable depth to discover broken links across your entire site
- Internal and external — optionally checks outbound links to third-party sites, not just internal pages
- Smart link checking — uses HEAD requests first, falls back to GET when servers block HEAD
- Per-page reporting — groups broken links by the page they appear on for easy triage
- Redirect chain tracking — captures the full redirect chain for each broken link
- Same-domain filtering — restrict crawling to the start URL's domain or allow cross-domain discovery
- Configurable limits — set max pages, max depth, and max links per page to control scope and cost
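The HEAD-first strategy in the feature list can be sketched as follows. This is an illustration only, not the actor's source: the `request` hook, the `check_link` name, and the set of status codes that trigger the GET retry are all assumptions made for the example.

```python
from typing import Callable, Optional

# Status codes that often mean "HEAD is not allowed here". This set is an
# assumption for the sketch; the actor's real fallback rule is not documented.
HEAD_REJECTED = {403, 405, 501}


def check_link(url: str, request: Callable[[str, str], Optional[int]]) -> dict:
    """Check one link: try HEAD first, retry with GET if HEAD looks blocked.

    `request(method, url)` is a hypothetical transport hook that returns an
    HTTP status code, or None on a network error or timeout.
    """
    status = request("HEAD", url)
    if status is None or status in HEAD_REJECTED:
        # The server may simply refuse HEAD, so confirm with a full GET.
        status = request("GET", url)
    broken = status is None or status >= 400
    return {"url": url, "status": status, "broken": broken}
```

Injecting the transport keeps the decision logic testable without network access. In real use, `request` could wrap any HTTP client that applies the configured timeout and User-Agent.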
Input
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| startUrls | array | Yes | (none) | Pages to begin crawling from (request list format) |
| maxPages | integer | No | 100 | Maximum number of pages to crawl |
| maxDepth | integer | No | 3 | How many link levels deep to follow from the start URLs |
| sameDomainOnly | boolean | No | true | Only crawl pages on the same domain as the start URL |
| checkExternalLinks | boolean | No | false | Also check outbound links to external domains |
| maxLinksPerPage | integer | No | 200 | Maximum number of links to check on each crawled page |
| timeoutSeconds | integer | No | 20 | HTTP timeout in seconds for each link check |
| concurrency | integer | No | 5 | Number of pages to process in parallel |
| userAgent | string | No | Mozilla/5.0 (compatible; ApifyActor/1.0; +https://apify.com) | Custom User-Agent header for HTTP requests |
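To show how maxPages and maxDepth bound a crawl, here is a minimal breadth-first sketch over an in-memory link graph. It assumes start URLs sit at depth 0 and that pages are visited breadth-first. Neither is confirmed for the actor itself, and `crawl_order` and the `links` map are hypothetical names used only for illustration.

```python
from collections import deque


def crawl_order(start_urls, links, max_pages=100, max_depth=3):
    """Return (url, depth) pairs in the order a breadth-first crawl visits them.

    `links` is a hypothetical in-memory map of page URL -> linked URLs,
    standing in for real HTTP fetching and HTML parsing.
    """
    unique_starts = list(dict.fromkeys(start_urls))  # drop duplicate start URLs
    queue = deque((url, 0) for url in unique_starts)
    seen = set(unique_starts)
    visited = []
    while queue and len(visited) < max_pages:
        url, depth = queue.popleft()
        visited.append((url, depth))
        if depth >= max_depth:
            continue  # links on this page are not followed any deeper
        for target in links.get(url, []):
            if target not in seen:
                seen.add(target)
                queue.append((target, depth + 1))
    return visited
```

Under these assumptions, lowering maxDepth trims the crawl by distance from the start URLs, while maxPages caps the total regardless of depth.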
Input Example
{
  "startUrls": [{ "url": "https://crawlee.dev" }],
  "maxPages": 200,
  "maxDepth": 3,
  "checkExternalLinks": true,
  "sameDomainOnly": true
}
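If you build the input programmatically, a small helper can merge your overrides with the documented defaults and reject misspelled field names. This is a sketch under assumptions: `build_input` is a hypothetical helper, not part of the actor or of any Apify library, and the defaults are copied from the Input table.

```python
import json

# Defaults copied from the Input table (userAgent is left to the actor).
DEFAULTS = {
    "maxPages": 100,
    "maxDepth": 3,
    "sameDomainOnly": True,
    "checkExternalLinks": False,
    "maxLinksPerPage": 200,
    "timeoutSeconds": 20,
    "concurrency": 5,
}


def build_input(start_urls, **overrides):
    """Build an input object, failing early on unknown field names."""
    if not start_urls:
        raise ValueError("startUrls is required")
    unknown = set(overrides) - set(DEFAULTS) - {"userAgent"}
    if unknown:
        raise ValueError(f"unknown input fields: {sorted(unknown)}")
    run_input = {"startUrls": [{"url": url} for url in start_urls]}
    run_input.update(DEFAULTS)
    run_input.update(overrides)
    return run_input


print(json.dumps(build_input(["https://crawlee.dev"], maxPages=200,
                             checkExternalLinks=True), indent=2))
```

The resulting dictionary can be pasted into the actor's input editor as JSON or passed as the run input from an API client.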
Output
Each crawled page produces one dataset item listing all broken links found on that page.
- pageUrl (string): the page that was crawled
- depth (number): how many links deep from the start URL
- pageStatus (number | null): HTTP status of the crawled page itself
- brokenCount (number): total broken links found on this page
- brokenLinks (array): list of broken links with url, status, finalUrl, redirectChain, and error
- checkedAt (string): ISO timestamp
- error (string | null): error if the page itself could not be fetched
Output Example
{
  "pageUrl": "https://crawlee.dev/docs/introduction",
  "depth": 1,
  "pageStatus": 200,
  "brokenCount": 2,
  "brokenLinks": [
    {
      "url": "https://crawlee.dev/docs/old-guide",
      "status": 404,
      "finalUrl": "https://crawlee.dev/docs/old-guide",
      "redirectChain": [],
      "error": null
    },
    {
      "url": "https://external-site.com/removed-page",
      "status": null,
      "finalUrl": null,
      "redirectChain": [],
      "error": "ECONNREFUSED"
    }
  ],
  "checkedAt": "2025-11-20T14:30:00.000Z",
  "error": null
}
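Because each dataset item groups broken links by page, a flat table with one row per broken link is often easier to triage. The sketch below shows one way to flatten items that follow the documented output shape. The function names and the chosen CSV columns are assumptions for illustration.

```python
import csv
import io

COLUMNS = ["pageUrl", "brokenUrl", "status", "error"]


def broken_link_rows(items):
    """Yield one flat row per broken link from the per-page dataset items."""
    for item in items:
        for link in item.get("brokenLinks", []):
            yield {
                "pageUrl": item["pageUrl"],
                "brokenUrl": link["url"],
                "status": link.get("status"),
                "error": link.get("error"),
            }


def to_csv(items):
    """Render the flattened rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS)
    writer.writeheader()
    writer.writerows(broken_link_rows(items))
    return buffer.getvalue()
```

Passing a list of downloaded dataset items to `to_csv` produces text that opens in any spreadsheet tool. Links that failed without an HTTP response have an empty status and the error code in the last column.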
Pricing
| Event | Cost |
|---|---|
| Page Crawled | $0.001 per page |
You are charged per page crawled. Platform usage fees apply separately.
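As a quick budgeting aid, the per-page price can be turned into an upper bound for a run. This sketch covers only the page-crawled event at the listed $0.001 rate and excludes platform usage fees. `estimate_crawl_cost` is a hypothetical helper name.

```python
from decimal import Decimal

# From the pricing table: $0.001 per page crawled.
PRICE_PER_PAGE = Decimal("0.001")


def estimate_crawl_cost(pages: int) -> Decimal:
    """Return the page-crawled charge in USD for a given number of pages."""
    if pages < 0:
        raise ValueError("pages must be non-negative")
    return PRICE_PER_PAGE * pages
```

For example, a run capped with maxPages set to 200 incurs at most $0.20 in page-crawled charges, and 1,000 pages come to $1.00.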
Use Cases
- Pre-migration audits — scan your entire site before a domain or CMS migration to establish a baseline
- Post-migration validation — verify no links broke after launching on a new domain or URL structure
- SEO health checks — find and fix broken internal links that waste crawl budget and leak link equity
- Client reporting — agencies can schedule weekly runs to catch broken links before clients notice
- External link monitoring — detect when third-party resources you link to go offline
Related Actors
| Actor | What it adds |
|---|---|
| Redirect Chain Analyzer | Trace the full redirect chain for URLs to diagnose redirect loops and chains |
| Hreflang Checker | Validate hreflang tags on multilingual pages found during crawling |
| XML Sitemap URL Extractor | Extract all URLs from your sitemap to use as start URLs for a comprehensive crawl |