Check Broken Link — Data, Details & Metadata

Pricing

from $10.00 / 1,000 results

Check websites for broken links at scale with this Apify actor. It extracts link data, details, and metadata with automatic pagination and proxy rotation. It is suited to market research, competitive intelligence, and data-driven decision making.

Developer

Donny Nguyen

Maintained by Community

Actor stats

  • Bookmarked: 0
  • Total users: 2
  • Monthly active users: 1
  • Last modified: 4 days ago

Broken Link Checker

Overview

Broken Link Checker is an Apify actor that crawls a website to find broken links, including 404 errors, server errors, and timeouts. It checks both internal and external links and reports the broken URL, the page where it was found, the HTTP status code, and the anchor text. This makes it useful for SEO health audits, website maintenance, and keeping the user experience free of dead ends. The actor uses Cheerio-based crawling, which parses static HTML without launching a browser, so it can process large sites quickly with low memory usage.

Features

  • Crawl entire websites starting from any URL
  • Detect 404, 500, and other HTTP error status codes
  • Check both internal links (same domain) and external links
  • Report source page where each broken link was found
  • Capture anchor text for context
  • Classify links as internal or external
  • Configurable crawl depth with max pages limit
  • Concurrent link checking for fast processing
  • HEAD requests for external links to minimize bandwidth
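
The internal/external classification and the error-status check listed above can be illustrated with a short Python sketch. This is not the actor's source code (the actor itself is Cheerio-based); the function names `classify_link` and `is_broken` are hypothetical, and treating every status of 400 or above as broken is an assumption.

```python
from typing import Optional, Tuple
from urllib.parse import urljoin, urlparse


def classify_link(page_url: str, href: str) -> Tuple[str, str]:
    """Resolve href against the page URL and label it internal or external."""
    absolute = urljoin(page_url, href)
    # Assumption: "internal" means the resolved link has the same hostname as the page.
    same_host = urlparse(absolute).hostname == urlparse(page_url).hostname
    return absolute, "internal" if same_host else "external"


def is_broken(status_code: Optional[int]) -> bool:
    """Treat a missing response (timeout, connection failure) or any 4xx/5xx as broken."""
    return status_code is None or status_code >= 400


print(classify_link("https://example.com/docs/", "../about"))
print(classify_link("https://example.com/docs/", "https://other.org/x"))
print(is_broken(404), is_broken(200), is_broken(None))
```

Resolving each `href` against the page URL first means relative links such as `../about` are compared by their final hostname rather than by their raw text.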

Use Cases

  • SEO Auditing: Broken links negatively impact search rankings; find and fix them
  • Website Maintenance: Regularly check your site for link rot after content changes
  • Content Migration: Verify all links still work after moving or restructuring content
  • Quality Assurance: Catch broken links before they affect users
  • Competitor Analysis: Check competitor sites for broken links as competitive intelligence

Input Configuration

Parameter   Type      Default                     Description
startUrl    String    "https://docs.apify.com"    The URL to start crawling from
maxPages    Integer   100                         Maximum number of pages to crawl
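
As a rough illustration, the two input parameters can be assembled into a run input like this. The parameter names and defaults come from the Input Configuration section; the helper `build_input` and its validation rules are hypothetical and are not part of the actor.

```python
import json

# Defaults as documented in the Input Configuration section.
DEFAULTS = {"startUrl": "https://docs.apify.com", "maxPages": 100}


def build_input(**overrides) -> dict:
    """Merge overrides into the defaults and reject obviously invalid values."""
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    run_input = {**DEFAULTS, **overrides}
    # Assumed sanity checks; the actor may validate its input differently.
    if not str(run_input["startUrl"]).startswith(("http://", "https://")):
        raise ValueError("startUrl must be an absolute http(s) URL")
    if not isinstance(run_input["maxPages"], int) or run_input["maxPages"] < 1:
        raise ValueError("maxPages must be a positive integer")
    return run_input


print(json.dumps(build_input(startUrl="https://example.com", maxPages=250), indent=2))
```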

Output Format

Each result includes the broken URL, HTTP status code, error description, the source page where the link was found, anchor text, and link type (internal/external). Results are stored in the default Apify dataset and can be exported to CSV, JSON, or Excel.
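
To make the record shape concrete, here is a sketch of one result and a local CSV export. The field names (`url`, `statusCode`, `error`, `sourcePage`, `anchorText`, `linkType`) are guesses based on the description above, not the actor's confirmed dataset keys, and the sample record is made up for illustration.

```python
import csv
import io

# Hypothetical field names; the actor's real dataset keys may differ.
FIELDS = ["url", "statusCode", "error", "sourcePage", "anchorText", "linkType"]

# Made-up sample record for illustration only.
sample_items = [
    {
        "url": "https://example.com/old-page",
        "statusCode": 404,
        "error": "Not Found",
        "sourcePage": "https://example.com/blog/post",
        "anchorText": "Read more",
        "linkType": "internal",
    },
]


def to_csv(items: list) -> str:
    """Serialize result records to CSV text, ignoring any unexpected keys."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(items)
    return buffer.getvalue()


print(to_csv(sample_items))
```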

Integration

Schedule regular audits via Apify schedules. Send broken link reports to email or Slack via Apify integrations. Use the Apify API to trigger checks from CI/CD pipelines after deployments.
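
A CI/CD trigger might look roughly like the sketch below, which prepares a request for the Apify API's "run actor" endpoint using only the Python standard library. The actor ID `username~actor-name` and the token are placeholders (this page does not state the actor's real ID), and `ci_exit_code` is a hypothetical helper for failing a pipeline step. The request is built but deliberately not sent.

```python
import json
import urllib.request

API_BASE = "https://api.apify.com/v2"


def build_run_request(actor_id: str, token: str, run_input: dict) -> urllib.request.Request:
    """Prepare (but do not send) a POST request that starts an actor run."""
    return urllib.request.Request(
        url=f"{API_BASE}/acts/{actor_id}/runs",
        data=json.dumps(run_input).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def ci_exit_code(broken_links: list, max_allowed: int = 0) -> int:
    """Return a non-zero exit code when the audit finds too many broken links."""
    return 1 if len(broken_links) > max_allowed else 0


request = build_run_request(
    actor_id="username~actor-name",  # placeholder, not the real actor ID
    token="APIFY_TOKEN_PLACEHOLDER",  # placeholder; read from a CI secret in practice
    run_input={"startUrl": "https://docs.apify.com", "maxPages": 100},
)
print(request.get_method(), request.full_url)
```

In a pipeline, the request would be sent with `urllib.request.urlopen(request)`, the run's dataset fetched once the run finishes, and the list of items passed to `ci_exit_code` to decide whether the deployment step fails.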

Limitations and Notes

The actor crawls pages within the same hostname by default and checks external links via HEAD requests. Some servers reject or mishandle HEAD requests, so a working link may occasionally be reported as broken (a false positive). Very large sites need a higher maxPages value to be covered fully, and such runs take longer to complete. Content loaded dynamically via JavaScript may not be visible to the Cheerio-based crawler, because Cheerio parses the HTML returned by the server without executing scripts; for JavaScript-heavy sites, consider a Puppeteer-based alternative. Rate limiting may apply to external link checks to avoid overloading third-party servers.
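
One common way to reduce HEAD-related false positives when re-checking reported links yourself is to retry with GET when the HEAD response looks like a rejection. The sketch below shows that idea; it is not described as the actor's behavior, the names `check_external` and `fake_fetch` are hypothetical, and the set of statuses that trigger a retry is an assumption. The fetcher is injected so the logic can be exercised without network access.

```python
from typing import Callable, Optional

# A fetcher takes (method, url) and returns an HTTP status code,
# or None when the request times out or fails to connect.
Fetcher = Callable[[str, str], Optional[int]]

# Assumed list of statuses that often mean "HEAD not accepted" rather than "page is gone".
RETRY_WITH_GET = {403, 405, 501}


def check_external(url: str, fetch: Fetcher) -> Optional[int]:
    """Try a cheap HEAD first; fall back to GET if the server seems to reject HEAD."""
    status = fetch("HEAD", url)
    if status is None or status in RETRY_WITH_GET:
        status = fetch("GET", url)
    return status


def fake_fetch(method: str, url: str) -> Optional[int]:
    """Stand-in server that refuses HEAD but serves GET."""
    return 405 if method == "HEAD" else 200


print(check_external("https://example.com/page", fake_fetch))
```

The trade-off is bandwidth: the GET fallback downloads the response body, so it is only attempted when the HEAD result is ambiguous.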