🔗 Bulk URL Status Scraper

Extract HTTP status codes, trace 301/302 redirect chains, and find 404 broken links across large URL lists for comprehensive technical SEO audits.

Pricing: Pay per event
Rating: 0.0 (0)
Developer: 太郎 山田 (Taro Yamada), maintained by Community

Actor stats: 0 bookmarked · 3 total users · 2 monthly active users · last modified 9 hours ago

🔗 URL Health Checker

Perform technical SEO audits at scale by extracting precise HTTP status codes, redirect chains, and server response times from large lists of URLs. This lightweight bulk URL scraper skips resource-heavy browser rendering to deliver fast health checks across thousands of web pages. Technical SEO professionals, marketing teams, and content managers use this tool to identify broken links, map complex 301 and 302 redirect paths, and measure server latency with millisecond precision.

By feeding this scraper your entire XML sitemap or a large backlink database, you can instantly flag 404 not found errors, 500 server errors, or redirect loops that harm your search rankings on Google and other search engines. The results give a clear breakdown of every requested URL: the final HTTP status code, response times, and SSL certificate validity.

The actor is built for recurring URL validation workflows: schedule daily or weekly runs to monitor critical landing pages, affiliate links, and API endpoints. Instead of relying on slow desktop crawlers, you can run this tool in the cloud to extract vital site health details. Whether you are migrating domains, cleaning up old content, or auditing third-party websites, this status checker helps keep your web infrastructure compliant, fast, and fully optimized for search performance.

Store Quickstart

Start with the Quickstart template (4 demo URLs including 404/500 cases). Once validated, switch to SEO Link Audit for crawl-style batches, Uptime Monitor for critical endpoint tracking, or Webhook Alert for operational monitoring.

Key Features

  • 📦 Bulk checking — Process up to 1,000 URLs per run with configurable concurrency (1-20)
  • 🔁 Redirect chain tracking — Full 301/302/307/308 redirect path from source to final URL
  • ⏱️ Response time measurement — Millisecond-precision timing per URL
  • 🔒 SSL validation — Detects invalid, expired, or mismatched SSL certificates
  • 🏷️ Error classification — TIMEOUT, DNS_RESOLUTION_FAILED, CONNECTION_REFUSED, SSL_ERROR
  • 🪝 Webhook delivery — Send results to Slack, Discord, or any custom endpoint

Use Cases

  • SEO agencies — Audit client sites for broken links, slow pages, and SSL issues in bulk
  • DevOps teams — Schedule daily URL health checks and alert via webhook on failures
  • Migration projects — Verify every old URL redirects correctly after a domain migration
  • API health monitors — Track endpoint availability and response times across services
  • Content teams — Find outdated outbound links in articles and documentation

Input

  • urls (string[], required) — URLs to check (max 1000)
  • concurrency (integer, default 10) — Parallel requests (1-20)
  • timeoutMs (integer, default 10000) — Request timeout in ms
  • followRedirects (boolean, default true) — Follow HTTP redirects
  • delivery (string, default "dataset") — "dataset" or "webhook"
  • webhookUrl (string) — Webhook endpoint URL
  • dryRun (boolean, default false) — Test without saving

Input Example

{
  "urls": ["https://google.com", "https://github.com", "https://httpstat.us/404"],
  "concurrency": 10,
  "timeoutMs": 10000,
  "followRedirects": true,
  "delivery": "dataset"
}
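The description above mentions feeding the scraper an entire XML sitemap. As a minimal sketch (assuming a standard sitemap.xml and the input schema above; the helper name is hypothetical), here is how you might convert one into this input format using only the Python standard library:

```python
import json
import xml.etree.ElementTree as ET

# Sitemaps use a default XML namespace, so element lookups must be qualified.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"

def sitemap_to_input(sitemap_xml: str, max_urls: int = 1000) -> dict:
    """Parse a sitemap.xml string into the actor's input dict (capped at the 1,000-URL limit)."""
    root = ET.fromstring(sitemap_xml)
    urls = [loc.text.strip() for loc in root.iter(f"{SITEMAP_NS}loc") if loc.text]
    return {
        "urls": urls[:max_urls],
        "concurrency": 10,
        "timeoutMs": 10000,
        "followRedirects": True,
        "delivery": "dataset",
    }

sitemap = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/about</loc></url>
</urlset>"""

print(json.dumps(sitemap_to_input(sitemap), indent=2))
```

The resulting dict can be passed directly as the run input in the API examples below.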

Output

  • url (string) — Original URL that was checked
  • statusCode (integer) — HTTP status code (200, 404, 500, etc.)
  • redirectChain (string[]) — Ordered list of intermediate URLs in the redirect chain
  • finalUrl (string) — URL after following all redirects
  • responseTimeMs (integer) — Total response time in milliseconds
  • contentType (string) — Content-Type header from the response
  • sslValid (boolean) — Whether the SSL certificate is valid
  • error (string or null) — Error classification (e.g. TIMEOUT) when the check failed, otherwise null
  • checkedAt (string) — ISO 8601 timestamp when the check ran

Output Example

{
  "url": "https://example.com",
  "statusCode": 200,
  "redirectChain": [],
  "finalUrl": "https://example.com",
  "responseTimeMs": 142,
  "contentType": "text/html; charset=UTF-8",
  "sslValid": true,
  "error": null,
  "checkedAt": "2026-04-05T12:00:00.000Z"
}
```
<imports></imports>
<test>
</test>
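Once results land in the dataset, broken links can be filtered out client-side. A sketch assuming items shaped like the output example above (the helper name is hypothetical):

```python
def find_broken(items: list[dict]) -> list[dict]:
    """Return items that failed: a network-level error, or an HTTP status of 400+."""
    return [
        item for item in items
        if item.get("error") is not None
        or (item.get("statusCode") or 0) >= 400
    ]

# Sample items mimicking the output schema documented above.
items = [
    {"url": "https://example.com", "statusCode": 200, "error": None},
    {"url": "https://example.com/old", "statusCode": 404, "error": None},
    {"url": "https://down.example", "statusCode": None, "error": "TIMEOUT"},
]
for bad in find_broken(items):
    print(bad["url"], bad["error"] or bad["statusCode"])
```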

API Usage

Run this actor programmatically using the Apify API. Replace YOUR_API_TOKEN with your token from Apify Console → Settings → Integrations.

cURL

curl -X POST "https://api.apify.com/v2/acts/taroyamada~bulk-url-health-checker/run-sync-get-dataset-items?token=YOUR_API_TOKEN" \
-H "Content-Type: application/json" \
-d '{ "urls": ["https://google.com", "https://github.com", "https://httpstat.us/404"], "concurrency": 10, "timeoutMs": 10000, "followRedirects": true, "delivery": "dataset" }'

Python

from apify_client import ApifyClient

client = ApifyClient("YOUR_API_TOKEN")
run = client.actor("taroyamada/bulk-url-health-checker").call(run_input={
    "urls": ["https://google.com", "https://github.com", "https://httpstat.us/404"],
    "concurrency": 10,
    "timeoutMs": 10000,
    "followRedirects": True,  # Python booleans are capitalized, unlike JSON
    "delivery": "dataset",
})
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

JavaScript / Node.js

import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: 'YOUR_API_TOKEN' });
const run = await client.actor('taroyamada/bulk-url-health-checker').call({
    urls: ['https://google.com', 'https://github.com', 'https://httpstat.us/404'],
    concurrency: 10,
    timeoutMs: 10000,
    followRedirects: true,
    delivery: 'dataset',
});
const { items } = await client.dataset(run.defaultDatasetId).listItems();
console.log(items);

Tips & Limitations

  • Use concurrency 5–10 for most workloads. Higher values can trigger rate limiting on shared hosting.
  • Set delivery: "webhook" to forward results to Slack/Discord/PagerDuty in real time.
  • Schedule daily runs to track uptime trends and catch SSL expiry before users do.
  • For large lists (>1,000 URLs), split into multiple runs to stay within memory limits.
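For the last tip, a hypothetical helper that splits an oversized list into batches no larger than the 1,000-URL per-run limit:

```python
def chunk_urls(urls: list[str], size: int = 1000) -> list[list[str]]:
    """Split a URL list into batches, each suitable for one actor run."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]

batches = chunk_urls([f"https://example.com/page/{n}" for n in range(2500)])
print([len(b) for b in batches])  # → [1000, 1000, 500]
```

Each batch can then be submitted as a separate run via the API examples above.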

FAQ

How is this different from a simple ping?

Ping only checks reachability. This actor fetches the URL over HTTP, inspects status codes, follows redirects, validates SSL, and measures response times — full production-level health checks.

Can I check URLs that require authentication?

Not in the current version. This actor uses anonymous HTTP requests. Authentication support is on the roadmap.

Will Cloudflare/WAF block this actor?

It respects robots.txt and sends a standard User-Agent header. For heavily protected sites, use moderate concurrency (1-5) and longer timeouts.

What counts as 'SSL valid'?

We verify certificate chain, expiry, hostname match, and trust. Self-signed or expired certs return sslValid=false with details.

Can I get notified when a URL goes down?

Yes — set delivery: "webhook" and provide your endpoint URL. Each failed check is POSTed immediately.
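The exact webhook payload isn't documented here; assuming each POST body is one output item as in the schema above, a sketch of reformatting a failed check into a Slack incoming-webhook message (Slack's incoming webhooks accept a JSON body with a "text" field):

```python
import json

def slack_alert_body(item: dict) -> str:
    """Build a Slack incoming-webhook payload announcing one failed URL check."""
    reason = item.get("error") or f"HTTP {item.get('statusCode')}"
    message = {"text": f":rotating_light: URL check failed: {item['url']} ({reason})"}
    return json.dumps(message)

print(slack_alert_body({"url": "https://example.com/old", "statusCode": 404, "error": None}))
```

A small relay like this is only needed if you want custom formatting; the actor can also post to Slack directly.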

Does this consume Apify proxy credits?

No. This actor uses the Apify datacenter network without paid residential proxies.

Complete Your Website Health Audit

Website Health Suite — Build a comprehensive compliance and trust monitoring workflow:

1. Link & URL Health (you are here)

2. SEO & Metadata Quality

3. Security & Email Deliverability

4. Historical Data & Recovery

Recommended workflow: Use Broken Link Checker to discover broken URLs → Validate fixes with URL Health Checker (here) → Monitor SSL expiry → Automate with webhooks.


Cost

Pay Per Event:

  • actor-start: $0.01 (flat fee per run)
  • dataset-item: $0.003 per output item

Example: 1,000 items = $0.01 + (1,000 × $0.003) = $3.01
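The pricing arithmetic above can be expressed as a small estimator (rates as listed here; the function name is hypothetical, and you should verify current prices on the actor page):

```python
def estimate_cost(items: int, start_fee: float = 0.01, per_item: float = 0.003) -> float:
    """Estimated run cost in USD: flat actor-start fee plus a per-dataset-item charge."""
    return round(start_fee + items * per_item, 2)

print(estimate_cost(1000))  # → 3.01
```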

No subscription required — you only pay for what you use.

⭐ Was this helpful?

If this actor saved you time, please leave a ★ rating on Apify Store. It takes 10 seconds, helps other developers discover it, and keeps updates free.

Bug report or feature request? Open an issue on the Issues tab of this actor.