Broken Link Checker - Ensure Your Website's Integrity

Pricing

$9.99/month + usage

Developed by codemaster devops

Maintained by Community

Maintain your website's health and user experience with our Broken Link Checker. Easily identify and fix broken links to enhance your site's navigation, improve SEO, and keep visitors engaged.

Broken Links Checker

An Apify actor for auditing websites, discovering broken links, and delivering actionable SEO insights. The crawler normalizes URLs, respects configurable throttling, supports Apify Proxy, and can notify your team the moment broken links are detected.

Quick Start

  • Open the actor in the Apify Console and click Run with the default input (it targets https://quicklifesolutions.com and finishes in a few minutes).
  • Review the HTML report in the default Key-Value Store record OUTPUT.html and inspect the dataset items for CSV-friendly exports.
  • Add your email address to receive a summary of broken links once the crawl completes.
  • Tune maxPages, requestDelayMs, and maxConcurrency to match the size and rate-limit tolerance of your target site.

Apify Console Flow

  1. Go to the actor page and choose Try me or Run.
  2. Paste your starting URL into Website URL (keep the protocol).
  3. Optionally adjust concurrency, delay, or maximum pages to control load.
  4. Add any notification recipients and choose whether to crawl subdomains.
  5. Select or customize the proxy settings. If you do not have valid Apify Proxy credentials, the actor gracefully continues without the proxy.
  6. Click Run. Track progress in the console log; intermediate results are checkpointed so long runs can resume after migrations.
  7. Download the HTML/JSON reports or connect the dataset to your preferred export format.

Trigger from API, Schedulers, or Integrations

  • Apify API (REST), with the same call sketched in TypeScript after this list:
    curl "https://api.apify.com/v2/acts/USERNAME~actor-find-broken-links/runs?token=YOUR_TOKEN" \
      -X POST \
      -H 'Content-Type: application/json' \
      -d @input.json
  • Apify Schedules: create a schedule for this actor and reuse the default input shown below.
  • Webhooks & Integrations: use Apify webhooks, Zapier, Make, or n8n to trigger runs and react to dataset updates. The actor fails fast on invalid input, so orchestrations can handle errors cleanly.
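
For programmatic use from Node.js, the official apify-client package wraps the same REST endpoint. A minimal sketch, assuming the placeholder actor ID from the curl example, a token in the APIFY_TOKEN environment variable, and input values taken from the default input below:

    import { ApifyClient } from 'apify-client';

    // Token and actor ID are placeholders mirroring the curl example above.
    const client = new ApifyClient({ token: process.env.APIFY_TOKEN });

    async function runAudit() {
        // call() starts the run and waits for it to finish.
        const run = await client.actor('USERNAME~actor-find-broken-links').call({
            baseUrl: 'https://quicklifesolutions.com',
            maxPages: 10,
            requestDelayMs: 1500,
            saveOnlyBrokenLinks: true,
        });

        // Broken-link records land in the run's default dataset.
        const { items } = await client.dataset(run.defaultDatasetId).listItems();
        console.log(`Found ${items.length} broken links`);
    }

    runAudit().catch((err) => {
        console.error(err);
        process.exit(1);
    });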

Default Input (copy & run)

{
  "baseUrl": "https://quicklifesolutions.com",
  "crawlSubdomains": true,
  "maxConcurrency": 10,
  "maxPages": 10,
  "notificationEmails": [
    "codemasterdevops@gmail.com"
  ],
  "proxyConfiguration": {
    "useApifyProxy": true,
    "apifyProxyGroups": []
  },
  "requestDelayMs": 1500,
  "saveOnlyBrokenLinks": true
}

Configuration Reference

| Parameter | Type | Default | Required | Description |
|---|---|---|---|---|
| baseUrl | string | (none) | Yes | Starting URL for the crawl. Must include http:// or https://. |
| crawlSubdomains | boolean | false (UI prefill: true) | No | Crawl matching subdomains when enabled. Helpful for multi-site audits. |
| maxConcurrency | integer | 10 | No | Maximum parallel pages the crawler opens. Reduce if the target rate limits aggressively. |
| maxPages | integer or null | 10 | No | Total number of pages to visit. Set to 0 to remove the limit. |
| requestDelayMs | integer | 1500 | No | Delay between requests in milliseconds to respect rate limits. |
| notificationEmails | string[] | [] | No | Recipients for the email summary of broken links. An empty array skips email. |
| saveOnlyBrokenLinks | boolean | true | No | When true, dataset items are CSV-friendly records of broken pages only. |
| proxyConfiguration | object | { "useApifyProxy": true, "apifyProxyGroups": [] } | No | Standard Apify proxy object. Invalid credentials are logged and the crawl continues without the proxy. |
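
If you drive the actor from typed code, the table above translates naturally into an input type. A hypothetical TypeScript shape derived only from this reference (the interface name is ours, not part of the actor):

    // Hypothetical shape mirroring the Configuration Reference table.
    interface BrokenLinkCheckerInput {
        baseUrl: string;                  // required; must include http:// or https://
        crawlSubdomains?: boolean;        // default false (UI prefill: true)
        maxConcurrency?: number;          // default 10
        maxPages?: number | null;         // default 10; 0 removes the limit
        requestDelayMs?: number;          // default 1500
        notificationEmails?: string[];    // default []; empty array skips email
        saveOnlyBrokenLinks?: boolean;    // default true
        proxyConfiguration?: {
            useApifyProxy: boolean;
            apifyProxyGroups: string[];
        };
    }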

Output Schema

  • Key-Value Store
    • OUTPUT.json: structured site map with normalized URLs, link details, and anchor validation.
    • OUTPUT.html: ready-to-share report highlighting broken links in red.
  • Dataset (default store)
    • When saveOnlyBrokenLinks = true, each item is:
      {
        "url": "https://example.com/broken-page",
        "isBaseWebsite": true,
        "httpStatus": 404,
        "title": "Broken Page",
        "referrer": "https://example.com/"
      }
    • When saveOnlyBrokenLinks = false, full crawl diagnostics are saved with discovered anchors and outbound links.
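
A sketch of consuming both outputs with apify-client, assuming a finished run object such as the one returned by call() in the trigger example above:

    import { ApifyClient } from 'apify-client';

    const client = new ApifyClient({ token: process.env.APIFY_TOKEN });

    async function fetchReports(run: { defaultKeyValueStoreId: string; defaultDatasetId: string }) {
        // The shareable HTML report lives in the run's default key-value store.
        const report = await client
            .keyValueStore(run.defaultKeyValueStoreId)
            .getRecord('OUTPUT.html');

        // Dataset items follow the saveOnlyBrokenLinks shape shown above.
        const { items } = await client.dataset(run.defaultDatasetId).listItems();
        const notFound = items.filter((item) => item.httpStatus === 404);
        console.log(`${notFound.length} of ${items.length} broken links were 404s`);

        return { html: report?.value, items };
    }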

Dataset Handling

  • Datasets append on every successful crawl. Clear the dataset between major audits if you want a clean slate.
  • Records are deduplicated per URL internally, so resuming after a migration updates in-memory data without duplicating rows.
  • Export datasets to CSV or JSON, or connect them to Google Sheets via Apify integrations (a CSV export is sketched below).
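
As a sketch of that export with apify-client, assuming a known dataset ID; deleting the named dataset before a fresh audit is the programmatic way to start clean:

    import { ApifyClient } from 'apify-client';
    import { writeFile } from 'node:fs/promises';

    const client = new ApifyClient({ token: process.env.APIFY_TOKEN });

    async function exportAndReset(datasetId: string) {
        // downloadItems serializes the dataset in the requested format (CSV here).
        const csv = await client.dataset(datasetId).downloadItems('csv');
        await writeFile('broken-links.csv', csv);

        // Optional: drop the dataset so the next audit starts from an empty store.
        // await client.dataset(datasetId).delete();
    }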

Local Development

  1. Clone this repository and install dependencies with npm install.
  2. Set APIFY_TOKEN and run npm start to execute the actor locally with INPUT.json placed in the root (or specify APIFY_INPUT); a minimal entry-point skeleton is sketched after these steps.
  3. Use APIFY_LOCAL_STORAGE_DIR=./storage to inspect datasets, key-value stores, and request queues as the run progresses.
  4. Adjust throttling and limits in src/config.js while testing against staging hosts before scaling up.
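
For orientation while reading the source, a minimal Apify SDK v3 entry point follows the pattern below. This is a generic skeleton under our own assumptions, not a copy of this actor's src/ code:

    import { Actor } from 'apify';

    await Actor.init();

    // Locally, the input is read from INPUT.json in the storage directory.
    const input = await Actor.getInput<{ baseUrl: string }>();
    console.log('Starting crawl at', input?.baseUrl);

    // ...crawl pages, collect broken links...

    // Results go to the default dataset and key-value store.
    await Actor.pushData({ url: input?.baseUrl, httpStatus: 404 });
    await Actor.setValue('OUTPUT.html', '<html>…</html>', { contentType: 'text/html' });

    await Actor.exit();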

Troubleshooting

  • Input validation failed: ensure baseUrl includes the protocol and that numeric fields are valid integers (maxPages may be 0 to remove the limit). A pre-flight check is sketched after this list.
  • Proxy errors: missing or invalid tokens are logged; runs continue without a proxy. Update proxyConfiguration if you require specific proxy groups.
  • Unexpected duplicates: clearing the target dataset before a new audit guarantees a clean export.
  • Email not delivered: check the Apify platform logs for the apify/send-mail actor. Failures are reported in the log but do not stop the crawl.
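
For the first item, a pre-flight check along these lines catches the most common failure before a run is ever queued; the helper is hypothetical, mirroring the documented rules:

    // Hypothetical pre-flight validation mirroring the documented input rules.
    function assertValidInput(input: { baseUrl: string; maxPages?: number | null }): void {
        const { protocol } = new URL(input.baseUrl); // throws on malformed URLs
        if (protocol !== 'http:' && protocol !== 'https:') {
            throw new Error('baseUrl must include http:// or https://');
        }
        if (input.maxPages != null && (!Number.isInteger(input.maxPages) || input.maxPages < 0)) {
            throw new Error('maxPages must be a non-negative integer (0 removes the limit)');
        }
    }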

Support & Resources