Bulk URL Status Checker

Bulk check URLs for status codes, redirects, broken links, response times, canonical tags, robots meta, headers, and final destinations.

Pricing

Pay per event

Developer

Stas Persiianenko

What does Bulk URL Status Checker do?

Bulk URL Status Checker takes URLs from pasted lists, text blocks, hosted URL lists, or XML sitemaps.

It checks each URL over HTTP and returns a structured dataset row for every URL.

The actor reports status code, status text, final URL, redirect count, redirect chain, broken-link flag, response time, content type, content length, canonical URL, robots meta, and error metadata.

It is designed for operational checks where the status itself is the data.

If a target page returns 403, 404, or 500, times out, or fails in another way, the actor records that result instead of treating the whole run as failed.

Who is it for?

🔎 SEO agencies auditing migrations and technical SEO fixes.
🧭 Website migration teams validating old-to-new URL maps.
🧪 QA teams checking landing pages before releases.
📰 Content operations teams finding removed or redirected articles.
📈 Growth teams checking campaign URLs before launch.
🧰 Developers building status-check APIs into internal dashboards.
🧾 Analysts who need CSV, JSON, Excel, or API exports from URL checks.

Why use this URL status checker?

A simple browser test is not enough when you have hundreds or thousands of URLs.

This actor gives you a repeatable Apify run, dataset exports, API access, webhooks, and scheduling.

You can run it after deployments, before ad campaigns, during SEO migrations, and as part of weekly site health checks.

It is HTTP-only and lightweight, so it is cheaper than browser crawlers for status-code workflows.

Key features

✅ Bulk HTTP status checks.
✅ HEAD with GET fallback for speed and compatibility.
✅ GET-only mode for servers that reject HEAD.
✅ Redirect following with redirect-chain details.
✅ Broken-link classification.
✅ Response-time measurement.
✅ Canonical URL extraction from HTML.
✅ Robots meta extraction from HTML.
✅ Content type and content length.
✅ Optional raw response headers.
✅ Sitemap URL ingestion.
✅ Hosted plain-text or CSV-like URL list ingestion.
✅ Configurable concurrency, timeout, and user-agent.

How much does it cost to check bulk URL status codes?

This actor uses pay-per-event pricing.

There is a small start fee per run and a fee for each URL checked.

The default starting price in the actor package is:

Start event: $0.005 per run.
URL checked event: $0.000069405 at the BRONZE tier, with volume discounts for higher tiers.

These prices were derived from measured cloud-run costs, targeting a 70% net margin.
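
As a rough illustration of how the two events add up, the sketch below estimates the cost of a single run from the prices listed above. It assumes BRONZE-tier pricing with no volume discount, and the function name `estimate_run_cost` is made up for this example.

```python
START_FEE_USD = 0.005          # start event, charged once per run
PER_URL_FEE_USD = 0.000069405  # URL checked event at the BRONZE tier


def estimate_run_cost(url_count: int) -> float:
    """Estimate the cost in USD of one run that checks url_count URLs."""
    if url_count < 0:
        raise ValueError("url_count must be non-negative")
    return START_FEE_USD + url_count * PER_URL_FEE_USD


if __name__ == "__main__":
    for count in (100, 1_000, 10_000):
        print(f"{count:>6} URLs: ${estimate_run_cost(count):.4f}")
```

With these numbers, a 1,000-URL run comes to roughly $0.074 at the BRONZE tier.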

Input sources

You can provide URLs in four ways.

urls: direct list of URLs.
urlsText: pasted text containing URLs separated by newlines, spaces, commas, tabs, or semicolons.
sitemapUrl: XML sitemap URL; the actor extracts <loc> entries.
listUrl: hosted text or CSV-like file containing URLs.

At least one source is required.

Duplicates are removed after normalization.
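
To make the ingestion step concrete, here is a minimal sketch of how a pasted `urlsText` block could be split, normalized, and deduplicated. It is not the actor's actual code: the helper names are hypothetical, and defaulting a missing scheme to `https` and keeping only `http`/`https` URLs are assumptions made for the example.

```python
import re
from typing import List, Optional
from urllib.parse import urlsplit, urlunsplit

# Separators named in the input description: newlines, spaces, tabs, commas, semicolons.
SEPARATORS = re.compile(r"[\s,;]+")


def normalize_url(raw: str) -> Optional[str]:
    """Return the URL without its fragment, or None if it is not an HTTP(S) URL."""
    candidate = raw.strip()
    if not candidate:
        return None
    if "://" not in candidate:
        candidate = "https://" + candidate  # assumption: missing scheme means https
    try:
        parts = urlsplit(candidate)
    except ValueError:
        return None
    if parts.scheme not in ("http", "https") or not parts.netloc:
        return None
    # Rebuild the URL with an empty fragment (hash removal).
    return urlunsplit((parts.scheme, parts.netloc, parts.path, parts.query, ""))


def parse_urls_text(text: str) -> List[str]:
    """Split a pasted block into unique normalized URLs, keeping first-seen order."""
    seen = set()
    urls = []
    for token in SEPARATORS.split(text):
        url = normalize_url(token)
        if url is not None and url not in seen:
            seen.add(url)
            urls.append(url)
    return urls
```

Because deduplication happens after normalization, `https://example.com/a` and `https://example.com/a#top` count as one URL in this sketch.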

Input options

| Field | Type | Description |
| --- | --- | --- |
| `urls` | array | URLs to check directly. |
| `urlsText` | string | Pasted URL block. |
| `sitemapUrl` | string | XML sitemap URL to parse. |
| `listUrl` | string | Hosted text or CSV-like URL list. |
| `maxUrls` | integer | Maximum unique URLs to check. |
| `maxConcurrency` | integer | Parallel URL checks. |
| `timeoutSecs` | integer | Request timeout per URL. |
| `followRedirects` | boolean | Follow redirects and report the final URL. |
| `method` | string | `head-get-fallback` or `get`. |
| `includeHtmlSignals` | boolean | Extract canonical and robots meta. |
| `includeHeaders` | boolean | Include raw response headers. |
| `userAgent` | string | Optional custom User-Agent. |

Example input

{
  "urls": [
    "https://example.com/",
    "https://www.iana.org/domains/example",
    "https://httpstat.us/404"
  ],
  "maxUrls": 100,
  "maxConcurrency": 20,
  "timeoutSecs": 15,
  "followRedirects": true,
  "method": "head-get-fallback",
  "includeHtmlSignals": true,
  "includeHeaders": false
}

Sitemap audit example

{
  "sitemapUrl": "https://www.iana.org/sitemap.xml",
  "maxUrls": 500,
  "maxConcurrency": 10,
  "includeHtmlSignals": true
}

Use this mode to audit indexed URLs, migration sitemaps, or generated sitemap files.
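
For readers who want to see what `<loc>` extraction involves, the following is a small sketch using only the Python standard library. It is an illustration rather than the actor's implementation; the function name is hypothetical, and it handles a plain `urlset` document only (no sitemap index recursion, no gzip).

```python
import xml.etree.ElementTree as ET
from typing import List

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
# Accept both namespaced <loc> elements and bare ones from sitemaps without xmlns.
LOC_TAGS = {f"{{{SITEMAP_NS}}}loc", "loc"}


def extract_sitemap_locs(xml_text: str) -> List[str]:
    """Return the text of every sitemap <loc> element, in document order."""
    root = ET.fromstring(xml_text)
    locs = []
    for element in root.iter():
        if element.tag in LOC_TAGS and element.text:
            locs.append(element.text.strip())
    return locs


if __name__ == "__main__":
    sample = (
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
        "<url><loc>https://example.com/</loc></url>"
        "<url><loc>https://example.com/about</loc></url>"
        "</urlset>"
    )
    print(extract_sitemap_locs(sample))
```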

Output data

Each dataset item represents one URL check.

| Field | Description |
| --- | --- |
| `inputUrl` | Original URL supplied by the user. |
| `normalizedUrl` | URL after scheme normalization and hash removal. |
| `statusCode` | Final HTTP status code, or null on request error. |
| `statusText` | Human-readable status text when known. |
| `finalUrl` | Final URL after redirects. |
| `redirectChain` | Array of redirect hops with URL, status, and location. |
| `redirectCount` | Number of redirect hops. |
| `isBroken` | True for request errors or HTTP status 400+. |
| `isRedirect` | True when at least one redirect was followed. |
| `responseTimeMs` | Request duration in milliseconds. |
| `contentType` | Response Content-Type header. |
| `contentLength` | Response Content-Length header when available. |
| `canonicalUrl` | Canonical URL extracted from HTML, when requested. |
| `robotsMeta` | Robots meta content extracted from HTML, when requested. |
| `errorType` | Request error code or error name. |
| `errorMessage` | Request error message. |
| `checkedAt` | ISO timestamp for the check. |
| `headers` | Optional raw response headers. |

Example output

{
  "inputUrl": "https://example.com/",
  "normalizedUrl": "https://example.com/",
  "statusCode": 200,
  "statusText": "OK",
  "finalUrl": "https://example.com/",
  "redirectChain": [],
  "redirectCount": 0,
  "isBroken": false,
  "isRedirect": false,
  "responseTimeMs": 184,
  "contentType": "text/html",
  "contentLength": 1256,
  "canonicalUrl": null,
  "robotsMeta": null,
  "errorType": null,
  "errorMessage": null,
  "checkedAt": "2026-06-22T00:00:00.000Z"
}
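
Once rows like the one above have been exported, a short script can roll them up by status class. The sketch below assumes the items have already been loaded as a list of dictionaries with the documented `statusCode` field; the function name `summarize_by_status_class` is invented for this example.

```python
from collections import Counter
from typing import Any, Dict, Iterable


def summarize_by_status_class(items: Iterable[Dict[str, Any]]) -> Dict[str, int]:
    """Count rows per status class (2xx, 3xx, ...), with request errors under 'error'."""
    counts: Counter = Counter()
    for item in items:
        code = item.get("statusCode")
        if code is None:
            counts["error"] += 1  # statusCode is null when the request itself failed
        else:
            counts[f"{code // 100}xx"] += 1
    return dict(counts)


if __name__ == "__main__":
    rows = [
        {"inputUrl": "https://example.com/", "statusCode": 200},
        {"inputUrl": "https://httpstat.us/404", "statusCode": 404},
        {"inputUrl": "https://unreachable.invalid/", "statusCode": None},
    ]
    print(summarize_by_status_class(rows))
```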

Redirect chain checks

When followRedirects is enabled, the actor follows redirects up to the HTTP client limit.

The final dataset row still represents the original URL.

The redirectChain field stores each hop with the source URL, status code, and Location header.

Use this for migration maps, HTTP-to-HTTPS checks, trailing-slash cleanup, and canonical destination validation.
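
The hop-by-hop recording described above can be sketched as follows. This is an illustrative model, not the actor's code: the `fetch` callable, the hop key names, and the `max_hops` default of 10 are all assumptions, and `fetch` is injected so the logic can be exercised without network access.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple
from urllib.parse import urljoin

# fetch(url) -> (status_code, location_header_or_None); hypothetical interface.
Fetch = Callable[[str], Tuple[int, Optional[str]]]
REDIRECT_STATUSES = {301, 302, 303, 307, 308}


def trace_redirects(url: str, fetch: Fetch, max_hops: int = 10) -> Dict[str, Any]:
    """Follow redirects from url and record each hop until a non-redirect response."""
    chain: List[Dict[str, Any]] = []
    current = url
    while True:
        status, location = fetch(current)
        is_redirect = status in REDIRECT_STATUSES and bool(location)
        if not is_redirect or len(chain) >= max_hops:
            return {
                "inputUrl": url,
                "finalUrl": current,
                "statusCode": status,
                "redirectChain": chain,
                "redirectCount": len(chain),
                "isRedirect": len(chain) > 0,
            }
        chain.append({"url": current, "status": status, "location": location})
        # Location may be relative, so resolve it against the current URL.
        current = urljoin(current, location)
```

The row keeps `inputUrl` as the original URL, so a migration map can be joined back to its results even after several hops.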

Broken link checks

isBroken is true when the final status code is 400 or higher.

It is also true for invalid URLs, timeouts, DNS errors, TLS errors, and connection failures.

Blocked responses such as 401, 403, or 429 are recorded with their HTTP status code like any other result. Because they are 400 or higher, isBroken is true for them as well.

That makes the actor useful for reporting what happened instead of hiding protected URLs as run failures.
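
The classification rule stated in this section is small enough to express directly. The sketch below restates it in Python using the documented `statusCode` and `errorType` fields; the function itself is a hypothetical helper for post-processing exports, not part of the actor.

```python
from typing import Optional


def is_broken(status_code: Optional[int], error_type: Optional[str] = None) -> bool:
    """Return True for request errors or for HTTP status codes of 400 and above."""
    if error_type is not None or status_code is None:
        # Invalid URLs, timeouts, DNS, TLS, and connection failures have no usable status.
        return True
    return status_code >= 400


if __name__ == "__main__":
    for code in (200, 301, 403, 404, 500):
        print(code, is_broken(code))
    print("timeout", is_broken(None, "ETIMEDOUT"))
```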

Canonical and robots meta checks

When includeHtmlSignals is true, the actor parses HTML pages for:

canonical link: <link rel="canonical" href="...">
robots meta: <meta name="robots" content="...">

This is useful for SEO QA after site migrations and template changes.

The actor only attempts these checks for HTML responses.
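
As an illustration of the two signals above, here is a standard-library sketch that pulls the canonical link and robots meta out of an HTML document. The class and function names are hypothetical, and taking the first match of each tag is an assumption; the actor's own parser may behave differently.

```python
from html.parser import HTMLParser
from typing import Dict, Optional


class HeadSignalParser(HTMLParser):
    """Collect the first canonical link and the first robots meta tag."""

    def __init__(self) -> None:
        super().__init__()
        self.canonical_url: Optional[str] = None
        self.robots_meta: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        attributes = {name: (value or "") for name, value in attrs}
        if tag == "link" and self.canonical_url is None:
            # rel can hold several space-separated tokens, e.g. rel="canonical nofollow".
            if "canonical" in attributes.get("rel", "").lower().split():
                self.canonical_url = attributes.get("href") or None
        elif tag == "meta" and self.robots_meta is None:
            if attributes.get("name", "").lower() == "robots":
                self.robots_meta = attributes.get("content") or None


def extract_html_signals(body: str, content_type: Optional[str]) -> Dict[str, Optional[str]]:
    """Return canonicalUrl and robotsMeta, or nulls for non-HTML responses."""
    signals: Dict[str, Optional[str]] = {"canonicalUrl": None, "robotsMeta": None}
    if "html" not in (content_type or "").lower():
        return signals  # only HTML responses are parsed
    parser = HeadSignalParser()
    parser.feed(body)
    parser.close()
    signals["canonicalUrl"] = parser.canonical_url
    signals["robotsMeta"] = parser.robots_meta
    return signals
```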

Performance tips

Start with a maxConcurrency of 10 to 20 for general websites.

Use lower concurrency for small sites, fragile servers, or URLs behind rate limits.

Use head-get-fallback for most runs because HEAD is fast and GET fallback handles servers that reject HEAD.

Use get when you know target servers return inaccurate HEAD responses.

Keep includeHeaders disabled unless you need raw headers in exports.
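
The difference between the two `method` values can be pictured with the sketch below. It is a simplified model under stated assumptions: the `send` callable is a hypothetical stand-in for an HTTP client, and falling back to GET on a 405 or 501 response or on a connection error is a guess at reasonable behavior, not the actor's documented rule.

```python
from typing import Callable, Optional, Tuple

# send(method, url) -> status code; may raise OSError on connection problems.
Send = Callable[[str, str], int]
HEAD_UNSUPPORTED = {405, 501}  # assumption: statuses that suggest HEAD is rejected


def check_status(url: str, send: Send, method: str = "head-get-fallback") -> Tuple[str, int]:
    """Return (method_used, status_code) for one URL."""
    if method == "get":
        return "GET", send("GET", url)
    status: Optional[int]
    try:
        status = send("HEAD", url)
    except OSError:
        status = None  # treat a failed HEAD as a reason to retry with GET
    if status is None or status in HEAD_UNSUPPORTED:
        return "GET", send("GET", url)
    return "HEAD", status
```

In this model a server that answers HEAD correctly costs one request, while a server that rejects HEAD costs two, which is why `get` can be the better choice when you already know HEAD is unreliable for a target.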

Integrations

You can integrate this actor into many workflows:

Schedule a weekly sitemap status audit.
Trigger a run after a deployment.
Send broken-link results to Slack through Apify webhooks.
Export redirect chains to Google Sheets.
Pull dataset items into a BI tool.
Use API results in internal QA dashboards.
Compare old and new migration URL maps in a data warehouse.
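
For the migration-map workflow in particular, a comparison step might look like the sketch below. It assumes you hold an expected old-to-new mapping and the exported dataset items with the documented `inputUrl`, `finalUrl`, and `isBroken` fields; the function name and the shape of the mismatch records are invented for this example.

```python
from typing import Any, Dict, Iterable, List


def find_migration_mismatches(
    expected: Dict[str, str], items: Iterable[Dict[str, Any]]
) -> List[Dict[str, Any]]:
    """List old URLs whose checked result does not land on the expected new URL."""
    by_input = {item["inputUrl"]: item for item in items}
    mismatches = []
    for old_url, new_url in expected.items():
        item = by_input.get(old_url)
        if item is None:
            reason, actual = "not checked", None
        elif item.get("isBroken"):
            reason, actual = "broken", item.get("finalUrl")
        elif item.get("finalUrl") != new_url:
            reason, actual = "wrong destination", item.get("finalUrl")
        else:
            continue  # redirect lands where the map says it should
        mismatches.append(
            {"oldUrl": old_url, "expected": new_url, "actual": actual, "reason": reason}
        )
    return mismatches
```

The comparison here is an exact string match, so differences such as a trailing slash are reported as a wrong destination.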

API usage with Node.js

import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: process.env.APIFY_TOKEN });
const run = await client.actor('automation-lab/bulk-url-status-checker').call({
  urls: ['https://example.com/', 'https://httpstat.us/404'],
  followRedirects: true,
});

const { items } = await client.dataset(run.defaultDatasetId).listItems();
console.log(items);

API usage with Python

from apify_client import ApifyClient

client = ApifyClient('YOUR_APIFY_TOKEN')
run = client.actor('automation-lab/bulk-url-status-checker').call(run_input={
    'urls': ['https://example.com/', 'https://httpstat.us/404'],
    'followRedirects': True,
})
items = client.dataset(run['defaultDatasetId']).list_items().items
print(items)

API usage with cURL

curl -X POST 'https://api.apify.com/v2/acts/automation-lab~bulk-url-status-checker/runs?token=YOUR_APIFY_TOKEN' \
  -H 'Content-Type: application/json' \
  -d '{"urls":["https://example.com/","https://httpstat.us/404"],"followRedirects":true}'

MCP integration

Use Apify MCP to run this actor from Claude Desktop, Claude Code, or other MCP clients.

MCP URL:

https://mcp.apify.com/?tools=automation-lab/bulk-url-status-checker

Add it in Claude Code:

claude mcp add --transport http apify-bulk-url-status-checker "https://mcp.apify.com/?tools=automation-lab/bulk-url-status-checker"

Claude Desktop, Cursor, and VS Code JSON configuration:

{
  "mcpServers": {
    "apify-bulk-url-status-checker": {
      "url": "https://mcp.apify.com/?tools=automation-lab/bulk-url-status-checker"
    }
  }
}

Example prompts:

"Check these 50 URLs and summarize the broken links."
"Run a sitemap status audit and group results by status code."
"Find redirects in this migration URL list and export final URLs."

Data quality notes

HTTP status checks depend on target server behavior.

Some servers treat HEAD and GET differently.

Some servers block datacenter traffic, unknown user agents, or high concurrency.

For those cases, use GET mode, lower the concurrency, and set a custom user agent that identifies your crawler.

The actor reports the observed result rather than attempting to bypass access controls.

Troubleshooting

Why do I see 403 or 429?

The target server is refusing or rate limiting requests.

Lower the concurrency, use a custom user agent, or check whether you are permitted to run automated checks against that domain.

Why is `contentLength` null?

Many servers use chunked transfer encoding or omit Content-Length.

The actor reports null when the header is missing.

Why is `canonicalUrl` null?

The page may not be HTML, canonical extraction may be disabled, or the page may not contain a canonical tag.

Legality and ethical use

Only check URLs you are allowed to audit.

Respect robots policies, rate limits, and site terms.

Do not use the actor to overload third-party servers.

The actor is intended for diagnostics, QA, SEO operations, and link-health monitoring.

Related scrapers and tools

Use the simpler HTTP Status Checker for small one-off status checks.

Use Bulk URL Status Checker when you need sitemap/list ingestion, canonical hints, robots meta, and richer redirect/broken-link audit fields.

FAQ

Can I check thousands of URLs?

Yes. Increase maxUrls and choose a concurrency that is safe for the target domains.

Does it use a browser?

No. It is an HTTP-only actor for status and header checks.

Does it scrape page content?

No. It only fetches enough page HTML to extract canonical and robots meta when that option is enabled.

Can I schedule it?

Yes. Use Apify schedules to run it daily, weekly, or after deployments.

Can I export to CSV?

Yes. Apify datasets can be exported as JSON, CSV, Excel, XML, RSS, or HTML.

Changelog

0.1.0: Initial build with URL list, text, sitemap, and hosted list ingestion; status checks; redirect chain output; canonical and robots meta extraction.
