Bulk URL Status Checker – Broken Link & Redirect Audit
Pricing
from $1.00 / 1,000 results
Bulk URL checker for HTTP status codes (200/301/302/404/410/500) with final URL, redirect chain and response time. Fast parallel link audit for SEO, site migrations, monitoring and QA. Export clean results to CSV/JSON.
Developer
Logiover
Maintained by Community
Bulk URL Status Checker — Broken Link & Redirect Audit

Check the HTTP status of thousands of URLs in a single run. The actor detects broken links (404, 500), traces full redirect chains (301, 302), and measures response times in one fast, scalable pass. It is built for SEO audits, link monitoring, content migrations, and QA pipelines.
What Is This Actor?
Every website accumulates broken links, outdated redirects, and slow pages over time. Checking them manually — or even with browser-based tools — doesn't scale. This actor takes any list of URLs and checks each one in parallel via direct HTTP requests, returning a structured report with status codes, redirect chains, final destination URLs, and response times.
Built for:
- 🔍 SEO auditors — identify 404s and redirect chains hurting crawl budget
- 🛠️ Web developers — validate links after a site migration or CMS change
- 📋 QA engineers — run automated link health checks in CI/CD pipelines
- 📣 Content managers — monitor internal and external link health across articles
- 📊 Data analysts — enrich URL datasets with live status metadata
- 🔗 Backlink managers — verify that inbound links still resolve correctly
Features
- Bulk checking: audit thousands of URLs in a single run
- Full HTTP status detection: 200 OK, 301/302 redirects, 403 Forbidden, 404 Not Found, 410 Gone, 500 Server Error, and more
- Redirect chain tracing: captures every hop in a redirect sequence, not just the final destination
- Final URL reporting: always shows where a URL ultimately resolves to
- Response time measurement: millisecond-precision timing per URL
- Network error handling: DNS failures, timeouts, and connection errors are recorded as status `0` (not silently dropped)
- Automatic retries: transient network errors are retried up to 2 times before being marked as failed
- High concurrency: up to 100 parallel checks, configurable; default 20
- Proxy support: built-in Apify Proxy integration to avoid rate limiting at scale
- No browser overhead: pure HTTP via `got-scraping`; fast and low cost
- Export-ready: JSON, CSV, and Excel output via Apify Dataset
Output Data
Each record in the dataset corresponds to one checked URL.
| Field | Type | Description |
|---|---|---|
| url | string | The original URL as provided in the input |
| statusCode | integer | HTTP response status code. 0 means a network/DNS error |
| statusMessage | string | Human-readable status description (e.g. "Not Found", "OK") |
| isBroken | boolean | true if statusCode is 400 or above, or 0 (network error) |
| isRedirect | boolean | true if the URL was redirected at least once before resolving |
| redirectChain | array | Ordered list of intermediate URLs in the redirect chain |
| finalUrl | string or null | The last URL after all redirects. null on network error |
| responseTime | integer | Total time in milliseconds from request to final response |
| checkedAt | string | ISO 8601 timestamp of when this URL was checked |
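The two boolean flags can be derived from the other fields. The following is a minimal Python sketch of that derivation, not the actor's actual code; the class name `UrlCheckRecord` is hypothetical, and treating a non-empty `redirectChain` as equivalent to `isRedirect` is an assumption based on the sample records.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class UrlCheckRecord:
    """Hypothetical container mirroring the documented output fields."""
    url: str
    statusCode: int
    statusMessage: str
    redirectChain: List[str] = field(default_factory=list)
    finalUrl: Optional[str] = None
    responseTime: int = 0
    checkedAt: str = ""

    @property
    def isBroken(self) -> bool:
        # Documented rule: status 400 or above, or 0 for a network error.
        return self.statusCode == 0 or self.statusCode >= 400

    @property
    def isRedirect(self) -> bool:
        # Assumption: "redirected at least once" means a non-empty chain.
        return len(self.redirectChain) > 0
```

A record with `statusCode` 404 or 0 reports `isBroken` as true, while 200 and followed redirects do not.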
Status Code Reference
| Code | Message | isBroken | Meaning |
|---|---|---|---|
| 200 | OK | ❌ No | Page loaded successfully |
| 201 | Created | ❌ No | Resource created |
| 301 | Moved Permanently | ❌ No | Permanent redirect (followed) |
| 302 | Found | ❌ No | Temporary redirect (followed) |
| 307 | Temporary Redirect | ❌ No | Temporary redirect (followed) |
| 308 | Permanent Redirect | ❌ No | Permanent redirect (followed) |
| 400 | Bad Request | ✅ Yes | Malformed request |
| 401 | Unauthorized | ✅ Yes | Authentication required |
| 403 | Forbidden | ✅ Yes | Access denied |
| 404 | Not Found | ✅ Yes | Page does not exist |
| 410 | Gone | ✅ Yes | Page permanently removed |
| 429 | Too Many Requests | ✅ Yes | Rate limited |
| 500 | Internal Server Error | ✅ Yes | Server-side error |
| 502 | Bad Gateway | ✅ Yes | Upstream server error |
| 503 | Service Unavailable | ✅ Yes | Server temporarily down |
| 504 | Gateway Timeout | ✅ Yes | Server timed out |
| 0 | Network Error | ✅ Yes | DNS failure, connection refused, timeout |
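The messages in this table match the standard HTTP reason phrases, so a post-processing script can look them up rather than hard-coding them. This is a small sketch using Python's standard `http.HTTPStatus`; the function name `describe_status` is hypothetical, and the "Network Error" label for code 0 is taken from the table (the actor itself stores the raw error message in statusMessage for those records).

```python
from http import HTTPStatus


def describe_status(status_code: int) -> str:
    """Return a human-readable label for a status code from the dataset."""
    if status_code == 0:
        # 0 is this actor's marker for DNS failures, refusals, and timeouts.
        return "Network Error"
    try:
        return HTTPStatus(status_code).phrase
    except ValueError:
        # Non-standard codes are not known to the standard library.
        return "Unknown"
```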
Sample Output Records
Healthy URL:
{"url": "https://apify.com/blog","statusCode": 200,"statusMessage": "OK","isBroken": false,"isRedirect": false,"redirectChain": [],"finalUrl": "https://apify.com/blog","responseTime": 312,"checkedAt": "2025-05-15T10:22:05.000Z"}
Redirect chain:
{"url": "http://oldsite.com/page","statusCode": 200,"statusMessage": "OK","isBroken": false,"isRedirect": true,"redirectChain": ["https://oldsite.com/page","https://newsite.com/page"],"finalUrl": "https://newsite.com/page","responseTime": 580,"checkedAt": "2025-05-15T10:22:06.000Z"}
Broken link:
{"url": "https://apify.com/non-existent-page","statusCode": 404,"statusMessage": "Not Found","isBroken": true,"isRedirect": false,"redirectChain": [],"finalUrl": "https://apify.com/non-existent-page","responseTime": 210,"checkedAt": "2025-05-15T10:22:07.000Z"}
Network error:
{"url": "https://this-domain-does-not-exist.io/page","statusCode": 0,"statusMessage": "getaddrinfo ENOTFOUND this-domain-does-not-exist.io","isBroken": true,"isRedirect": false,"redirectChain": [],"finalUrl": null,"responseTime": 5021,"checkedAt": "2025-05-15T10:22:12.000Z"}
Input Configuration
startUrls · array · required
The list of URLs to check. Supports the full Apify requestListSources format.
[{ "url": "https://example.com/page-1" },{ "url": "https://example.com/page-2" },{ "url": "https://oldsite.com/legacy-path" }]
You can paste URLs directly in the Apify Console, import from a text list, or pass them programmatically via the Apify API. There is no hard limit on the number of URLs — the actor processes them all.
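When the URLs live in a plain text file, they need to be wrapped into the `{ "url": ... }` objects shown above. The following Python sketch does that conversion; the helper name `build_start_urls` is hypothetical, and skipping blank lines, `#` comment lines, and duplicates is an assumption about what a typical list needs, not actor behavior.

```python
from typing import Dict, List


def build_start_urls(text: str) -> List[Dict[str, str]]:
    """Turn a newline-separated URL list into the startUrls array."""
    seen = set()
    start_urls = []
    for line in text.splitlines():
        url = line.strip()
        # Skip blanks, comment lines, and URLs already added.
        if not url or url.startswith("#") or url in seen:
            continue
        seen.add(url)
        start_urls.append({"url": url})
    return start_urls
```

The result can be serialized with `json.dumps({"startUrls": build_start_urls(text)})` and pasted into the Console or sent through the API.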
maxConcurrency · integer · default: 20 · min: 1 · max: 100
How many URLs to check simultaneously.
| Value | Use Case |
|---|---|
| 5–10 | Small lists, conservative proxy usage |
| 20 (default) | Balanced speed and reliability for most use cases |
| 50–100 | Maximum speed for large audits; requires sufficient proxy pool |
Higher concurrency means faster runs but also more simultaneous outbound requests. For very high concurrency, using Apify Proxy is strongly recommended to avoid triggering rate limits on target servers.
proxyConfiguration · object · default: Apify Proxy enabled
Proxy configuration for all HTTP requests.
{ "useApifyProxy": true }
Using a proxy is recommended for:
- Large URL lists (thousands of URLs to the same domain)
- Checking URLs that rate-limit by IP (e.g. `429 Too Many Requests`)
- Avoiding your actor's IP being blocked mid-run
For small, diverse URL lists against different domains, a proxy may not be necessary.
Usage Examples
Example 1 — Quick check of a handful of URLs
{"startUrls": [{ "url": "https://example.com" },{ "url": "https://example.com/contact" },{ "url": "https://example.com/old-page" }],"maxConcurrency": 5,"proxyConfiguration": { "useApifyProxy": false }}
Example 2 — Post-migration audit of 10,000 URLs
{"startUrls": [{ "url": "https://oldsite.com/page-1" },{ "url": "https://oldsite.com/page-2" }],"maxConcurrency": 50,"proxyConfiguration": { "useApifyProxy": true }}
Only the first two URLs are shown; import the full list via CSV or the API.
Example 3 — Backlink health check
{"startUrls": [{ "url": "https://partner-site.com/our-mention" },{ "url": "https://news-site.com/article/brand-coverage" }],"maxConcurrency": 20,"proxyConfiguration": { "useApifyProxy": true }}
Example 4 — Sitemap-driven full-site audit
Combine this actor with the Sitemap to URL Crawler actor:
- Run Sitemap to URL Crawler on your domain → get all URLs
- Export that dataset as JSON
- Feed the URL list into this actor as `startUrls`
- Get a complete HTTP status report for every page on your site
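The hand-off between the two actors is a small reshaping step. This Python sketch assumes each exported sitemap item carries its address under a `url` key, which is an assumption about the other actor's output; the function name `make_actor_input` is hypothetical.

```python
from typing import Any, Dict, List


def make_actor_input(items: List[Dict[str, Any]], max_concurrency: int = 20) -> Dict[str, Any]:
    """Build this actor's input from exported sitemap dataset items."""
    # Assumption: every usable item has a non-empty "url" field.
    start_urls = [{"url": item["url"]} for item in items if item.get("url")]
    return {
        "startUrls": start_urls,
        "maxConcurrency": max_concurrency,
        "proxyConfiguration": {"useApifyProxy": True},
    }
```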
How It Works
The actor uses BasicCrawler with got-scraping for pure HTTP requests — no browser, no JavaScript rendering, no unnecessary overhead.
For each URL in the input:
Step 1 — Send HTTP GET request
A full GET request is made (not just HEAD) for maximum compatibility. Some servers return different status codes for HEAD vs GET. The response body is not stored — only headers and metadata are used.
Step 2 — Follow redirects
Redirects are followed automatically (up to 10 hops). Every intermediate URL in the redirect chain is recorded in redirectChain.
Step 3 — Record result
Status code, message, redirect chain, final URL, and response time are written to the dataset immediately.
Step 4 — Handle errors
If the request fails (DNS failure, connection refused, timeout), the error is caught and recorded as statusCode: 0 with the raw error message in statusMessage. The URL is never silently skipped.
Step 5 — Retry on transient failures
Network-level failures (not 4xx/5xx HTTP errors) are automatically retried up to 2 times before recording a final failure.
      Input URL List
             │
             ▼
┌─────────────────────────────────┐
│ GET request via got-scraping    │
│ - followRedirect: true (max 10) │
│ - throwHttpErrors: false        │
│ - timeout: 15s                  │
└────────────┬────────────────────┘
             │
      ┌──────┴──────┐
      │             │
   Success     Network Error
      │             │
 Parse status  statusCode = 0
 + redirect    error.message →
   chain       statusMessage
      │             │
      └──────┬──────┘
             │
      Push to Dataset
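The flow above can be sketched in a few lines. This is an illustrative Python model of the logic, not the actor's Node.js implementation: the `fetch` callable is a hypothetical stand-in for a single non-following HTTP request, and recording each hop's target (including the final URL) in the chain is an assumption taken from the sample records.

```python
from typing import Callable, List, Optional, Tuple
from urllib.parse import urljoin

# Hypothetical fetcher: performs ONE request without following redirects and
# returns (status_code, Location header or None); raises OSError on failure.
Fetch = Callable[[str], Tuple[int, Optional[str]]]

REDIRECT_CODES = {301, 302, 303, 307, 308}


def _failure(url: str, message: str) -> dict:
    # Network-level failures are recorded as statusCode 0, never skipped.
    return {"url": url, "statusCode": 0, "statusMessage": message,
            "isBroken": True, "isRedirect": False,
            "redirectChain": [], "finalUrl": None}


def check_url(url: str, fetch: Fetch, max_hops: int = 10, max_retries: int = 2) -> dict:
    last_error = "unknown error"
    for _attempt in range(max_retries + 1):
        chain: List[str] = []
        current = url
        try:
            while True:
                status, location = fetch(current)
                if status not in REDIRECT_CODES or not location:
                    # Final response: 4xx/5xx are results, not retried.
                    return {"url": url, "statusCode": status,
                            "isBroken": status >= 400,
                            "isRedirect": bool(chain),
                            "redirectChain": chain, "finalUrl": current}
                if len(chain) >= max_hops:
                    return _failure(url, "too many redirects")
                # Location may be relative, so resolve it against the current URL.
                current = urljoin(current, location)
                chain.append(current)
        except OSError as err:
            # Only network-level errors trigger another attempt.
            last_error = str(err)
    return _failure(url, last_error)
```

Passing the fetcher in as a parameter keeps the sketch testable without network access; a real fetcher could wrap any HTTP client configured with a 15-second timeout.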
Performance
| URL Count | Concurrency | Est. Time | Notes |
|---|---|---|---|
| 100 | 20 | < 30 sec | Small audit |
| 1,000 | 20 | ~3–5 min | Standard blog/site audit |
| 10,000 | 50 | ~10–20 min | Post-migration check |
| 100,000 | 100 | ~1–2 hours | Enterprise-scale audit |
Response time per URL depends heavily on target server speed. The actor itself adds minimal overhead — it's a direct HTTP check.
Cost: This actor uses BasicCrawler with pure HTTP requests — no browser, no Playwright. Compute cost is negligible. For 100,000 URLs at concurrency 50, expect < $0.50 in Apify compute units.
Export & Analysis
Download your results from the Apify Dataset in:
- JSON: full structured output with arrays for `redirectChain`
- CSV: flat table; `redirectChain` is serialized as a comma-joined string
- Excel (.xlsx): native spreadsheet for sharing with non-technical stakeholders
- JSONL: one record per line for streaming into data pipelines
Filtering Broken Links in CSV/Excel
Once exported to CSV, use a simple filter:
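- Column `isBroken = TRUE` → all broken URLs (4xx, 5xx, network errors)
- Column `isRedirect = TRUE` → all URLs that redirect
- Column `statusCode = 404` → specifically missing pages
- Column `statusCode = 301` → permanent redirects (high SEO importance). Because redirects are followed, a redirected URL normally reports the status of its final destination, so `isRedirect` is the more reliable filter for finding redirects.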
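The same filters can be applied in a script. This Python sketch reads an exported CSV, keeps the broken rows, and splits the comma-joined `redirectChain` back into a list; the function name `load_broken_rows` is hypothetical, and the case-insensitive match on `true` is an assumption about how the boolean is written in the export.

```python
import csv
import io
from typing import Dict, List


def load_broken_rows(csv_text: str) -> List[Dict[str, object]]:
    """Return rows whose isBroken column is true, with redirectChain as a list."""
    broken = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Assumption: booleans appear as "true"/"TRUE" in the exported file.
        if row["isBroken"].strip().lower() != "true":
            continue
        chain = row.get("redirectChain") or ""
        parsed: Dict[str, object] = dict(row)
        # Undo the comma-joined serialization; empty cells become [].
        parsed["redirectChain"] = [u for u in chain.split(",") if u]
        broken.append(parsed)
    return broken
```

Splitting on commas is ambiguous for URLs that themselves contain a comma, so the JSON export is the safer source when the chain matters.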
Filtering via Apify API
Retrieve the results from the dataset items endpoint and select the records where isBroken is true:
GET /v2/datasets/{datasetId}/items?format=json&clean=true
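Selecting broken URLs can also be done on the client after the items are downloaded. The Python sketch below builds an items URL and filters a decoded JSON response; the helper names are hypothetical, and the `format`, `clean`, and `fields` query parameters should be checked against the current Apify API reference before relying on them.

```python
from typing import Any, Dict, Iterable, List
from urllib.parse import urlencode

API_BASE = "https://api.apify.com"


def dataset_items_url(dataset_id: str, fields: Iterable[str] = ("url", "statusCode", "isBroken")) -> str:
    """Build the dataset items URL, requesting only the listed fields."""
    query = urlencode({"format": "json", "clean": "true", "fields": ",".join(fields)})
    return f"{API_BASE}/v2/datasets/{dataset_id}/items?{query}"


def only_broken(items: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Keep records whose isBroken flag is exactly true."""
    return [item for item in items if item.get("isBroken") is True]
```

The URL can be fetched with any HTTP client, and the decoded JSON array passed to `only_broken`.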
Common Use Cases In Detail
Post-Migration Redirect Audit
After moving a website to a new domain or restructuring URLs, every old URL should redirect (301) to its new equivalent. This actor lets you:
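- Feed your old sitemap URLs as input
- Check that every URL redirects (`isRedirect` is true); redirects are followed, so `statusCode` reports the final response rather than the `301` or `308` hop
- Verify `finalUrl` points to the correct new page
- Flag any `404`s where a redirect is missing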
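These checks can be automated against a redirect map. The Python sketch below compares dataset records with an expected old-to-new mapping; both the function name `audit_migration` and the `expected` mapping are hypothetical inputs you would supply, and the comparison is an exact string match on `finalUrl`.

```python
from typing import Any, Dict, List, Tuple


def audit_migration(records: List[Dict[str, Any]], expected: Dict[str, str]) -> List[Tuple[str, str]]:
    """Return (old_url, problem) pairs for URLs that did not migrate cleanly."""
    problems = []
    for rec in records:
        old = rec["url"]
        target = expected.get(old)
        if target is None:
            continue  # URL is not part of the migration plan
        if rec["isBroken"]:
            problems.append((old, f"broken ({rec['statusCode']})"))
        elif not rec["isRedirect"]:
            problems.append((old, "no redirect"))
        elif rec["finalUrl"] != target:
            problems.append((old, f"wrong target: {rec['finalUrl']}"))
    return problems
```

An empty result means every planned URL redirected to its intended destination.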
SEO Broken Link Detection
Broken internal and external links hurt user experience and can waste crawl budget. Export your full site URL list from a sitemap or crawl, run this actor, filter isBroken = true, and prioritize fixes by page importance.
Redirect Chain Optimization
Long redirect chains (A → B → C → D) waste crawl budget and add latency. Use the redirectChain field to identify multi-hop chains and collapse them to direct redirects. Flag any chain with redirectChain.length > 1.
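Flagging multi-hop chains is a one-line filter over the JSON export. This Python sketch lists each offending URL with its final destination, which is the pair a collapsed direct redirect would use; the function name `multi_hop_chains` is hypothetical.

```python
from typing import Any, Dict, List, Tuple


def multi_hop_chains(records: List[Dict[str, Any]], max_hops: int = 1) -> List[Tuple[str, str, int]]:
    """Return (url, finalUrl, hop_count) for chains longer than max_hops."""
    return [
        (rec["url"], rec["finalUrl"], len(rec["redirectChain"]))
        for rec in records
        if len(rec["redirectChain"]) > max_hops
    ]
```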
Backlink Monitoring
If you've earned backlinks pointing to specific pages, use this actor on a schedule to verify those URLs still resolve to 200 OK. A 404 on a linked page means lost link equity.
API & Webhook URL Validation
Before deploying an integration, run all API endpoint URLs through this actor to confirm they return 200 or 201 rather than unexpected 4xx or 5xx responses.
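In a deployment pipeline, that confirmation can be turned into a pass/fail gate. The Python sketch below returns a process exit code from the dataset records; the function name `deployment_gate` is hypothetical, and the allowed set of 200 and 201 simply mirrors the sentence above.

```python
from typing import Any, Dict, FrozenSet, List, Tuple


def deployment_gate(
    records: List[Dict[str, Any]],
    allowed: FrozenSet[int] = frozenset({200, 201}),
) -> Tuple[int, List[Dict[str, Any]]]:
    """Return (exit_code, failing_records); exit code 0 means all endpoints passed."""
    failures = [rec for rec in records if rec["statusCode"] not in allowed]
    return (1 if failures else 0), failures
```

A CI step could call `sys.exit(code)` with the returned exit code after printing the failing URLs.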
Limitations
- No JavaScript rendering. The actor makes raw HTTP requests. Pages that require JavaScript to load (SPAs, React apps) still return the correct HTTP status code, but the final resolved URL may differ from what a browser would show after JS-based routing.
- Authentication not supported. URLs behind login walls return `401` or `403` as expected, but the actor cannot authenticate to check protected content.
- `HEAD` vs `GET`. The actor uses `GET` (not `HEAD`) for better compatibility. This means it downloads the response body and discards it immediately, so a small amount of extra bandwidth is used per URL.
- Max 10 redirects per URL. Redirect chains longer than 10 hops are aborted. Chains this long almost always indicate a redirect loop and would be flagged as broken in practice.
- Rate limiting. If a target server rate-limits your requests (`429`), the result is recorded as `isBroken: true` with `statusCode: 429`. Use Apify Proxy and/or lower `maxConcurrency` to reduce the rate of requests per server.
- Timeout at 15 seconds. URLs that don't respond within 15 seconds are recorded as network errors (`statusCode: 0`). Increase this threshold via code modification if checking known slow endpoints.
Frequently Asked Questions
Q: What's the difference between statusCode: 0 and statusCode: 404?
statusCode: 404 means the server responded and explicitly said the page doesn't exist. statusCode: 0 means the actor never received a response at all — DNS resolution failed, the server refused the connection, or the request timed out.
Q: Does it follow redirects?
Yes, automatically, up to 10 hops. redirectChain records every intermediate URL, and finalUrl shows the ultimate destination.
Q: Can I check HTTP (non-HTTPS) URLs?
Yes. Both HTTP and HTTPS URLs are supported. URLs with invalid SSL certificates are not skipped: ignoreHTTPSErrors is not enabled by default, so certificate errors are recorded as failures.
Q: How do I check 100,000 URLs efficiently?
Set maxConcurrency to 50–100 and enable Apify Proxy. At concurrency 100 with average server response times of 500 ms, you can expect ~200 URLs/second throughput.
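The throughput figure follows from dividing concurrency by the average response time. The Python sketch below works through that arithmetic; it is an idealized estimate that ignores retries, timeouts, proxy latency, and rate limiting, so real runs will be slower. The function names are hypothetical.

```python
def throughput_per_second(concurrency: int, avg_response_seconds: float) -> float:
    """Idealized URLs per second when every slot is always busy."""
    return concurrency / avg_response_seconds


def estimate_runtime_seconds(url_count: int, concurrency: int, avg_response_seconds: float) -> float:
    """Idealized lower bound on run time for a list of url_count URLs."""
    return url_count / throughput_per_second(concurrency, avg_response_seconds)
```

With concurrency 100 and 0.5 s per response, `throughput_per_second` gives 200 URLs per second, and 100,000 URLs take at least 500 seconds under these assumptions.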
Q: Can I use this in a scheduled run for link monitoring?
Yes — use the Apify Scheduler to trigger runs daily, weekly, or on any interval. Combine with the Apify API or webhook notifications to alert you when new broken links appear.
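Alerting on new breakage means comparing the current run with the previous one. This Python sketch shows one way to compute that difference from two sets of dataset records; the function name `newly_broken` is hypothetical, and how the two runs are loaded and how the alert is sent are left out.

```python
from typing import Any, Dict, List


def newly_broken(previous: List[Dict[str, Any]], current: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Return current records that are broken now but were not broken last run."""
    was_broken = {rec["url"] for rec in previous if rec["isBroken"]}
    return [rec for rec in current if rec["isBroken"] and rec["url"] not in was_broken]
```

URLs that are new in the current run and already broken are included, since they were not broken in the previous run.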
Q: What happens if a URL in my list is malformed?
Malformed URLs that can't be parsed as valid HTTP URLs will result in a network error (statusCode: 0) with a descriptive error message.
Q: Can I pipe the output of the Sitemap Crawler directly into this actor?
Yes. Export the sitemap crawler's dataset as JSON, then use that as input to this actor — or connect them via the Apify API for a fully automated audit pipeline.
Q: Is responseTime the time to first byte or total download time?
It is the total wall-clock time from when the request is sent to when the final response (including all redirect hops) is received. Since the body is discarded immediately, this closely approximates time to first byte for redirect chains.
Technical Details
| Property | Value |
|---|---|
| Runtime | Node.js (ES Modules) |
| Framework | Apify SDK v3 + Crawlee BasicCrawler |
| HTTP client | got-scraping (browser-like headers, proxy support) |
| Request method | GET with responseType: 'text' (body discarded) |
| Redirect handling | Automatic, max 10 hops |
| Request timeout | 15,000 ms |
| Handler timeout | 30,000 ms |
| Max retries | 2 (transient network errors only) |
| Default concurrency | 20 |
| Max concurrency | 100 |
| Error handling | All failures recorded; nothing silently dropped |
Changelog
- 2026-06-01 — Maintenance & reliability pass: pulled the latest source and rebuilt the Actor on the current base image; build verified.
- 2026-05-25 — Maintenance & reliability pass: pulled the latest source and rebuilt the Actor on the current base image; build verified.
v1.0
- Initial release
- Full HTTP GET status checking with `got-scraping`
- Automatic redirect chain tracing (up to 10 hops)
- Network error capture as `statusCode: 0`
- `isBroken` and `isRedirect` boolean flags for easy filtering
- Response time measurement per URL
- Automatic retry on transient network failures (2 retries)
- Configurable concurrency (1–100)
- Apify Proxy integration
- JSON, CSV, and Excel export
Support
If you encounter unexpected results — wrong status codes, proxy issues, or timeouts — please open a support ticket via the Apify Console. Include the affected URLs, your input configuration, and the run ID to help diagnose the issue.
Changelog
- 2026-05-20 — Maintenance pass: reviewed the input schema and default values for a smooth one-click start, and rebuilt the Actor on the latest base image.
Last reviewed: 2026-06-01.