🔗 Broken Link Checker & Crawler
Crawl any website up to five levels deep to extract 404 errors, dead outbound URLs, and precise anchor text details for technical SEO audits.
Broken Link Checker API | Crawl, 404 & Redirect Audit
Part of the Website Health Suite — Comprehensive website trust, compliance, and technical SEO monitoring.
Crawl websites to systematically extract dead URLs, 404 errors, and broken outbound links using this high-performance web scraper. Maintaining a healthy link profile is a foundational pillar of technical SEO and user experience. Search engines like Google actively penalize sites with excessive broken links, making routine link audits indispensable for SEO agencies, digital marketers, and e-commerce managers. This tool automatically crawls your web pages, navigating internal structures up to five levels deep, to pinpoint exactly where user journeys hit dead ends.
Designed for recurring website audits — schedule weekly or monthly runs to catch broken links before they hurt your rankings. Perfect for post-launch QA during large-scale site migrations, routine content maintenance, or vetting partner websites for link decay.

When you run this scraper, it checks both internal navigation elements and external outbound references. The results provide a comprehensive per-page breakdown of your website's link health. For every broken URL discovered, the tool captures vital data including the exact source page where the broken link lives, the anchor text used, and the specific HTTP error classification. Integrate these results directly into your reporting dashboards or QA workflows to keep your site optimized and error-free.
Store Quickstart
Start with the Quickstart template (single starting URL, depth 2). For full-site audits, use Deep Crawl (depth 5, up to 500 pages). For ongoing monitoring, use Weekly Site Health with webhook alerts.
Key Features
- 🕸️ Configurable crawl depth — Follows internal links up to 5 levels deep, up to 500 pages per run
- 🌐 Internal + external checks — Validate both your own links and outbound references
- 📍 Anchor text reporting — Identify which link text points to the broken URL
- 🏷️ Error classification — TIMEOUT, DNS_FAILED, CONNECTION_REFUSED, SSL_ERROR
- ⚡ Concurrent fetching — 1-10 parallel requests to speed up crawls
- 📊 Per-page breakdown — Each result shows all broken links grouped by source page
Use Cases
| Who | Why |
|---|---|
| SEO agencies | Regular broken-link audits for client websites to protect ranking |
| Content editors | Find dead outbound links in blog posts and documentation |
| E-commerce sites | Monitor product pages for broken navigation and outbound partner links |
| Site migrations | Validate internal linking after URL restructuring |
| Technical SEO | Identify redirect chains and crawl traps that waste crawl budget |
Input
| Field | Type | Default | Description |
|---|---|---|---|
| startUrls | string[] | (required) | URLs to start crawling (max 10) |
| maxDepth | integer | 2 | Crawl depth (1-5) |
| maxPages | integer | 50 | Max pages to crawl (1-500) |
| concurrency | integer | 5 | Parallel requests (1-10) |
| checkExternal | boolean | true | Check external links |
| timeoutMs | integer | 10000 | Request timeout in ms |
Input Example
```json
{
  "startUrls": ["https://example.com"],
  "maxDepth": 2,
  "maxPages": 50,
  "concurrency": 5,
  "checkExternal": true
}
```
Output
| Field | Type | Description |
|---|---|---|
| url | string | Page URL that was crawled |
| brokenLinks | object[] | Array of broken link objects found on the page |
| brokenLinks[].href | string | The broken link URL |
| brokenLinks[].anchor | string | Anchor text of the link |
| brokenLinks[].statusCode | integer | HTTP status code returned (404, 500, 0 for network errors) |
| brokenLinks[].isExternal | boolean | Whether the link points to an external domain |
| brokenLinks[].error | string \| null | Error classification for network failures (TIMEOUT, DNS_FAILED, CONNECTION_REFUSED, SSL_ERROR); null for HTTP errors |
| depth | integer | Crawl depth at which this page was discovered |
| crawledAt | string | ISO 8601 timestamp |
Output Example
```json
{
  "url": "https://example.com/blog",
  "brokenLinks": [
    {
      "href": "https://example.com/deleted-page",
      "anchor": "Old announcement",
      "statusCode": 404,
      "isExternal": false,
      "error": null
    }
  ],
  "depth": 1,
  "crawledAt": "2024-01-15T10:30:00.000Z"
}
```
API Usage
Run this actor programmatically using the Apify API. Replace YOUR_API_TOKEN with your token from Apify Console → Settings → Integrations.
cURL
```bash
curl -X POST "https://api.apify.com/v2/acts/taroyamada~broken-link-checker/run-sync-get-dataset-items?token=YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "startUrls": ["https://example.com"],
    "maxDepth": 2,
    "maxPages": 50,
    "concurrency": 5,
    "checkExternal": true
  }'
```
Python
```python
from apify_client import ApifyClient

client = ApifyClient("YOUR_API_TOKEN")

run = client.actor("taroyamada/broken-link-checker").call(run_input={
    "startUrls": ["https://example.com"],
    "maxDepth": 2,
    "maxPages": 50,
    "concurrency": 5,
    "checkExternal": True,
})

for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)
```
JavaScript / Node.js
```javascript
import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: 'YOUR_API_TOKEN' });

const run = await client.actor('taroyamada/broken-link-checker').call({
    startUrls: ['https://example.com'],
    maxDepth: 2,
    maxPages: 50,
    concurrency: 5,
    checkExternal: true,
});

const { items } = await client.dataset(run.defaultDatasetId).listItems();
console.log(items);
```
Tips & Limitations
- Start with `maxDepth: 2` and `maxPages: 50` for fast audits before scaling up.
- Set `checkExternal: false` to focus only on internal broken links (faster).
- Combine with URL Health Checker for a full SEO link audit.
- Run weekly to catch newly broken outbound links in published content.
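Once a run finishes, the dataset items can be aggregated locally for reporting. A minimal sketch, using sample items shaped like the Output table above (no API calls involved):

```python
from collections import Counter

# Sample dataset items in the shape this actor outputs (see Output above).
items = [
    {"url": "https://example.com/blog",
     "brokenLinks": [
         {"href": "https://example.com/deleted-page", "statusCode": 404, "error": None},
         {"href": "https://partner.example/down", "statusCode": 0, "error": "TIMEOUT"},
     ]},
    {"url": "https://example.com/docs",
     "brokenLinks": [
         {"href": "https://example.com/old", "statusCode": 404, "error": None},
     ]},
]

# Count broken links by HTTP status (network errors report statusCode 0).
by_status = Counter(
    link["statusCode"] for page in items for link in page["brokenLinks"]
)
print(by_status)  # Counter({404: 2, 0: 1})
```

The same loop works over `iterate_items()` from the Python example above if you want live results instead of a sample.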
FAQ
How does crawl depth work?
Depth 1 = only starting URLs. Depth 2 = starting URLs + links found on them. Depth 5 is the maximum and covers most typical sites.
Does it respect robots.txt?
Yes. Pages blocked by robots.txt are skipped during crawl.
Can I exclude certain URL patterns?
Not in the current version. URL pattern exclusion is planned as an input option in a future release.
How long does a 500-page crawl take?
With concurrency=5 and 10s timeout: roughly 5-15 minutes depending on site speed.
Will it crawl pages behind login?
No — public pages only. Pages requiring authentication will be skipped.
How does it handle JavaScript-rendered links?
It uses static HTML parsing. SPA links rendered after page load will not be found.
Complete Your Website Health Audit
Website Health Suite — Build a comprehensive compliance and trust monitoring workflow:
1. Link & URL Health (you are here)
- 🔗 Broken Link Checker — Find broken links across your entire site structure
- 🔗 Bulk URL Health Checker — Validate HTTP status, redirects, SSL, and response times for URL lists
2. SEO & Metadata Quality
- 🏷️ Meta Tag Analyzer — Audit title tags, Open Graph, Twitter Cards, and hreflang
- Schema.org Validator — Validate JSON-LD and Microdata with quality scoring
3. Security & Email Deliverability
- DNS/DMARC Security Checker — Audit SPF, DKIM, DMARC, and MX records
4. Historical Data & Recovery
- 📚 Wayback Machine Checker — Find archived snapshots for content recovery
Recommended workflow: Run Broken Link Checker weekly → Fix dead links → Validate with URL Health Checker → Monitor metadata with Meta Tag Analyzer → Schedule recurring compliance audits.
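To hand results from this actor to the next step of the suite, the broken targets can be deduplicated into a flat URL list. A sketch (the downstream actor's ID and input field names are assumptions, so check its README before wiring this up):

```python
def collect_broken_urls(items):
    """Deduplicated, sorted list of broken link targets across all crawled pages."""
    return sorted({link["href"] for page in items for link in page["brokenLinks"]})

# Sample items in this actor's output shape:
pages = [
    {"url": "https://example.com/a",
     "brokenLinks": [{"href": "https://example.com/x"},
                     {"href": "https://old.example/y"}]},
    {"url": "https://example.com/b",
     "brokenLinks": [{"href": "https://example.com/x"}]},
]
print(collect_broken_urls(pages))
# ['https://example.com/x', 'https://old.example/y']

# After fixing the links, the same list could be revalidated, e.g.:
# client.actor("taroyamada/bulk-url-health-checker").call(run_input={"urls": ...})
# (hypothetical actor ID and field name; see the URL Health Checker's docs)
```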
Other Website Tools:
- Sitemap Analyzer — SEO sitemap audit
- Site Governance Monitor — Robots.txt and schema monitoring
- Domain Trust Monitor — SSL expiry and security headers
Cost
Pay Per Event:
- actor-start: $0.01 (flat fee per run)
- dataset-item: $0.005 per output item
Example: 1,000 items = $0.01 + (1,000 × $0.005) = $5.01
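The same arithmetic as a tiny helper for budgeting runs (rates copied from the event list above):

```python
ACTOR_START_USD = 0.01    # flat fee per run
DATASET_ITEM_USD = 0.005  # per output item

def run_cost_usd(items: int) -> float:
    """Estimated cost of one run that yields `items` dataset items."""
    return round(ACTOR_START_USD + items * DATASET_ITEM_USD, 2)

print(run_cost_usd(1000))  # 5.01
print(run_cost_usd(50))    # 0.26
```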
No subscription required — you only pay for what you use.
⭐ Was this helpful?
If this actor saved you time, please leave a ★ rating on Apify Store. It takes 10 seconds, helps other developers discover it, and keeps updates free.
Bug report or feature request? Open an issue on the Issues tab of this actor.