Website Broken Links & Redirects Checker

Pricing: Pay per event

Analyzes websites to detect broken links (4xx/5xx) and redirects (3xx). Checks internal and external links on a single page or across a crawled site. Provides a detailed report per page and a site summary.

Rating: 5.0 (2)

Developer: My Smart Digital

Maintained by Community

Broken Links Checker

Apify Actor to analyze broken links and redirects on a website.

Description

This Actor analyzes a website to detect broken links (404, 500, etc.) and redirects (301, 302, etc.). It can check a single page or crawl multiple pages of a site, checking internal links and, optionally, external links.

Features

  • Detection of broken links (HTTP 4xx and 5xx status codes)
  • Detection of redirects (HTTP 3xx status codes) with destination URL
  • Optional crawling of pages from the same domain
  • Verification of internal links, and optionally of external links
  • Detailed report per page with counters
  • Global site summary with complete statistics
  • Response time measurement for each link (in milliseconds)
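The status-code ranges in the feature list map onto a simple classification. The following is a minimal sketch in Python, not the Actor's own code; the function name `classify_status` and the labels it returns are hypothetical.

```python
def classify_status(http_status: int) -> str:
    """Classify an HTTP status code using the ranges from the feature list."""
    if 300 <= http_status < 400:
        return "redirect"  # 3xx: reported together with a destination URL
    if 400 <= http_status < 600:
        return "broken"    # 4xx and 5xx
    return "ok"            # anything else, e.g. 200


for code in (200, 301, 404, 500):
    print(code, classify_status(code))
```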

Input

{
  "startUrls": ["https://example.com"],
  "crawlPages": false,
  "maxPages": 50,
  "maxConcurrency": 5,
  "sameDomain": true,
  "checkExternal": false,
  "timeout": 10000
}

Parameters

  • startUrls (required): List of starting URLs to analyze
  • crawlPages (optional, default: false): Enable page crawling. If disabled, only the starting URLs are analyzed
  • maxPages (optional, default: 50): Maximum number of pages to crawl (only applies if crawlPages is enabled)
  • maxConcurrency (optional, default: 5): Number of pages to crawl in parallel
  • sameDomain (optional, default: true): Only crawl links from the same domain (only applies if crawlPages is enabled)
  • checkExternal (optional, default: false): Also check external links (links to other domains)
  • timeout (optional, default: 10000): Timeout, in milliseconds, for each link check
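As an illustration of how the defaults combine with overrides, here is a small Python sketch that assembles a run input. The helper `build_input` is hypothetical and not part of the Actor; only the parameter names and default values come from the list above.

```python
import json

# Defaults copied from the parameter list above.
DEFAULTS = {
    "crawlPages": False,
    "maxPages": 50,
    "maxConcurrency": 5,
    "sameDomain": True,
    "checkExternal": False,
    "timeout": 10000,
}


def build_input(start_urls, **overrides):
    """Return a run input dict: documented defaults plus any overrides."""
    if not start_urls:
        raise ValueError("startUrls is required and must not be empty")
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    return {"startUrls": list(start_urls), **DEFAULTS, **overrides}


# Crawl up to 200 pages of one site and also check external links.
run_input = build_input(
    ["https://example.com"], crawlPages=True, maxPages=200, checkExternal=True
)
print(json.dumps(run_input, indent=2))
```

The printed JSON has the same shape as the input example above.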

Output

The Actor generates two types of records, distinguished by the type field:

Page Record

{
  "type": "page",
  "pageUrl": "https://example.com/page",
  "title": "Page Title",
  "httpStatus": 200,
  "linksCount": 25,
  "brokenLinksCount": 2,
  "redirectLinksCount": 3,
  "links": [
    {
      "url": "https://example.com/link",
      "text": "Link Text",
      "isInternal": true,
      "httpStatus": 404,
      "responseTime_ms": 150
    },
    {
      "url": "http://example.com/old-page",
      "text": "Old Page",
      "isInternal": true,
      "httpStatus": 301,
      "responseTime_ms": 120,
      "redirectUrl": "https://example.com/new-page"
    }
  ]
}

Page Record Fields:

  • linksCount: Total number of links checked on the page
  • brokenLinksCount: Number of broken links (HTTP 4xx and 5xx status codes)
  • redirectLinksCount: Number of redirects (HTTP 3xx status codes)
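The three counters can be recomputed from a page record's links array, which is a handy sanity check when post-processing results. This is a sketch under the definitions above; `count_links` is a hypothetical helper and the sample links are made up for illustration.

```python
def count_links(links):
    """Recompute the page-level counters from a list of link records."""
    broken = sum(1 for link in links if 400 <= link["httpStatus"] < 600)
    redirects = sum(1 for link in links if 300 <= link["httpStatus"] < 400)
    return {
        "linksCount": len(links),
        "brokenLinksCount": broken,
        "redirectLinksCount": redirects,
    }


# Illustrative link records (only the field needed for counting matters here).
links = [
    {"url": "https://example.com/link", "httpStatus": 404},
    {"url": "http://example.com/old-page", "httpStatus": 301},
    {"url": "https://example.com/", "httpStatus": 200},
]
counts = count_links(links)
print(counts)
```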

Site Summary

{
  "type": "site-summary",
  "pagesCrawled": 10,
  "linksTotal": 250,
  "brokenLinksTotal": 15,
  "redirectLinksTotal": 8,
  "byStatus": {
    "200": 227,
    "301": 5,
    "302": 3,
    "404": 10,
    "500": 5
  },
  "byType": {
    "internal": 200,
    "external": 50
  },
  "topBrokenLinks": [
    {
      "url": "https://example.com/broken",
      "count": 5,
      "pages": ["https://example.com/page1", "https://example.com/page2"],
      "httpStatus": 404
    }
  ]
}

Site Summary Fields:

  • pagesCrawled: Total number of pages analyzed
  • linksTotal: Total number of links checked
  • brokenLinksTotal: Total number of broken links (4xx, 5xx)
  • redirectLinksTotal: Total number of redirects (3xx)
  • byStatus: Distribution of links by HTTP status code
  • byType: Distribution of links by type (internal/external)
  • topBrokenLinks: Top 20 most frequent broken links with the pages where they appear
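To make the relationship between page records and the summary concrete, the sketch below aggregates a list of page records into a summary of the same shape. It is not the Actor's implementation: `summarize` is a hypothetical helper, and it assumes that count means the number of occurrences across all pages and that pages lists each page once.

```python
from collections import Counter, defaultdict


def summarize(page_records, top_n=20):
    """Aggregate page records into a site-summary-shaped dict (sketch)."""
    by_status = Counter()
    by_type = Counter()
    broken_pages = defaultdict(list)  # broken URL -> one page URL per occurrence
    broken_status = {}                # broken URL -> last status seen
    for page in page_records:
        for link in page["links"]:
            status = link["httpStatus"]
            by_status[str(status)] += 1
            by_type["internal" if link["isInternal"] else "external"] += 1
            if 400 <= status < 600:
                broken_pages[link["url"]].append(page["pageUrl"])
                broken_status[link["url"]] = status
    # Most frequently occurring broken URLs first.
    top = sorted(broken_pages.items(), key=lambda kv: len(kv[1]), reverse=True)
    return {
        "type": "site-summary",
        "pagesCrawled": len(page_records),
        "linksTotal": sum(by_status.values()),
        "brokenLinksTotal": sum(n for s, n in by_status.items() if 400 <= int(s) < 600),
        "redirectLinksTotal": sum(n for s, n in by_status.items() if 300 <= int(s) < 400),
        "byStatus": dict(by_status),
        "byType": dict(by_type),
        "topBrokenLinks": [
            {"url": url, "count": len(pages), "pages": sorted(set(pages)),
             "httpStatus": broken_status[url]}
            for url, pages in top[:top_n]
        ],
    }


# Two made-up page records sharing one broken link.
pages = [
    {"pageUrl": "https://example.com/page1", "links": [
        {"url": "https://example.com/broken", "isInternal": True, "httpStatus": 404},
        {"url": "https://example.com/", "isInternal": True, "httpStatus": 200},
    ]},
    {"pageUrl": "https://example.com/page2", "links": [
        {"url": "https://example.com/broken", "isInternal": True, "httpStatus": 404},
        {"url": "https://other.example/old", "isInternal": False, "httpStatus": 301},
    ]},
]
summary = summarize(pages)
print(summary)
```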

Link Fields:

  • url: Absolute URL of the link
  • text: Link text (content of the <a> tag)
  • isInternal: true if the link is on the same domain, false otherwise
  • httpStatus: HTTP response status code (200, 301, 302, 404, 500, etc.)
  • responseTime_ms: Response time in milliseconds
  • redirectUrl: Destination URL if the link is a redirect (3xx)
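The url and isInternal fields can be illustrated with Python's standard library. This is only a sketch: `describe_link` is a hypothetical helper, and it assumes "same domain" means an exact hostname match with the page, which may differ from how the Actor treats subdomains.

```python
from urllib.parse import urljoin, urlsplit


def describe_link(href, page_url):
    """Resolve an href against its page and flag whether it is internal."""
    url = urljoin(page_url, href)  # relative hrefs become absolute URLs
    # Assumption: internal means the hostnames match exactly.
    same_host = urlsplit(url).hostname == urlsplit(page_url).hostname
    return {"url": url, "isInternal": same_host}


page = "https://example.com/blog/post"
print(describe_link("../about", page))
print(describe_link("https://other.example/x", page))
```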