Anti-Blocking Diagnostics
Developer: Stas Persiianenko
What does Anti-Blocking Diagnostics do?
Anti-Blocking Diagnostics tests any URL against multiple access methods — direct HTTP, datacenter proxy, residential proxy, and real browser — to diagnose why your scraper is being blocked. It identifies the root cause (WAF, CAPTCHA, IP block, geo-restriction, rate limit, or JS challenge) and provides specific fix recommendations so you can choose the right scraping strategy.
Who is it for?
- 🕷️ Web scraping engineers — diagnosing why scrapers get blocked on target sites
- 🔧 DevOps teams — testing anti-bot defenses and proxy configurations
- 🛡️ QA engineers — validating that scraping infrastructure handles blocks gracefully
- 📊 Data pipeline managers — troubleshooting failed extraction jobs
- 💻 Apify developers — optimizing actor configurations for difficult websites
Why use Anti-Blocking Diagnostics?
- Four test methods — HTTP direct, datacenter proxy, residential proxy, and headless Chromium browser
- Automatic detection — identifies Cloudflare, Akamai, Imperva, reCAPTCHA, hCaptcha, and other protections
- Root cause classification — categorizes blocking as IP-based, WAF, CAPTCHA, JS challenge, geo-block, or rate limit
- Fix recommendations — actionable advice for each blocking type (proxy type, browser mode, request rate)
- Best method finder — tells you the cheapest access method that works for each URL
- Structured output — machine-readable results for integration with monitoring and alerting workflows
What data can you extract?
| Field | Example |
|---|---|
| `url` | https://www.cloudflare.com |
| `verdict` | partially-blocked |
| `blockingType` | js-challenge |
| `bestMethod` | browser-residential |
| `httpDirectStatus` | Blocked: cloudflare-challenge (403) |
| `httpDatacenterStatus` | Blocked: cloudflare-challenge (403) |
| `httpResidentialStatus` | Blocked: cloudflare-challenge (403) |
| `browserStatus` | OK (200) |
| `recommendation` | Use browser-based scraping with residential proxies |
| `tests` | Full test results array with timing and content length |
| `testedAt` | 2026-02-28T12:00:00.000Z |
How much does it cost to run anti-blocking diagnostics?
Anti-Blocking Diagnostics uses pay-per-event pricing.
| Event | What triggers it | FREE tier | GOLD+ tier |
|---|---|---|---|
| `start` | Each actor run | $0.05 | $0.035 |
| `url-tested` | Each URL diagnosed | $0.01 | $0.007 |
Real-world cost examples
| Scenario | URLs | Cost (FREE) |
|---|---|---|
| Test 1 URL | 1 | $0.06 |
| Test 5 URLs | 5 | $0.10 |
| Test 20 URLs | 20 | $0.25 |
Platform compute costs are billed separately. Browser tests use ~1 GB memory.
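The arithmetic behind the cost examples can be sketched as follows. This is a minimal illustration, assuming each run is billed one `start` event plus one `url-tested` event per URL, as the pricing table describes; the function name `estimate_event_cost` is hypothetical and platform compute is not included.

```python
from decimal import Decimal

# Per-event prices in USD, copied from the pricing table above.
PRICES = {
    "FREE": {"start": Decimal("0.05"), "url-tested": Decimal("0.01")},
    "GOLD+": {"start": Decimal("0.035"), "url-tested": Decimal("0.007")},
}


def estimate_event_cost(url_count: int, tier: str = "FREE") -> Decimal:
    """Pay-per-event cost of a single run (hypothetical helper, excludes compute)."""
    if url_count < 1:
        raise ValueError("url_count must be at least 1")
    price = PRICES[tier]
    # One start event per run, plus one url-tested event per URL.
    return price["start"] + price["url-tested"] * url_count


for n in (1, 5, 20):
    print(f"{n} URLs: ${estimate_event_cost(n)} on FREE, "
          f"${estimate_event_cost(n, 'GOLD+')} on GOLD+")
```

`Decimal` is used instead of floats so that sums such as 0.05 + 0.01 stay exact.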
How to diagnose website blocking issues
- Go to Anti-Blocking Diagnostics on Apify Store and click Try for free.
- Enter the URLs your scraper has trouble accessing.
- Choose a proxy country to match your scraping needs.
- Enable browser test (default) to include Playwright browser testing for JS challenge detection.
- Click Start and wait for the run to finish.
- Review results in the Dataset tab — each URL gets tested 3-4 ways, with a diagnosis and recommendation.
Example input
```json
{
    "urls": [
        "https://www.cloudflare.com",
        "https://www.amazon.com",
        "https://example.com"
    ],
    "countryCode": "US",
    "includeBrowserTest": true,
    "timeoutSecs": 30
}
```
Input parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| `urls` | string[] | required | Web page URLs to test for blocking. Each URL is tested with multiple methods. |
| `countryCode` | string | US | Country for proxy geotargeting. Helps detect geo-blocking. |
| `includeBrowserTest` | boolean | true | Include a Playwright browser test. Detects JS challenges and behavioral defenses. |
| `timeoutSecs` | integer | 30 | Timeout in seconds for each individual test request. |
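If you build the input programmatically, a small helper can apply the defaults from the table and catch obvious mistakes before a run is started. The sketch below is an assumption-laden illustration: `build_input` is a hypothetical name, and the checks (absolute http(s) URLs, positive timeout, upper-cased country code) are local sanity checks, not rules confirmed by the actor.

```python
import json
from urllib.parse import urlparse


def build_input(urls, country_code="US", include_browser_test=True, timeout_secs=30):
    """Build an input object with the parameter names and defaults from the table."""
    if not urls:
        raise ValueError("urls is required and must not be empty")
    for url in urls:
        parsed = urlparse(url)
        # Local sanity check (assumption): only absolute http(s) URLs make sense here.
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            raise ValueError(f"not an absolute http(s) URL: {url!r}")
    if timeout_secs <= 0:
        raise ValueError("timeoutSecs must be positive")
    return {
        "urls": list(urls),
        "countryCode": country_code.upper(),
        "includeBrowserTest": bool(include_browser_test),
        "timeoutSecs": int(timeout_secs),
    }


print(json.dumps(build_input(["https://example.com"], timeout_secs=60), indent=2))
```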
Output example
```json
{
    "url": "https://www.cloudflare.com",
    "verdict": "partially-blocked",
    "blockingType": "js-challenge",
    "bestMethod": "browser-residential",
    "httpDirectStatus": "Blocked: cloudflare-challenge (403)",
    "httpDatacenterStatus": "Blocked: cloudflare-challenge (403)",
    "httpResidentialStatus": "Blocked: cloudflare-challenge (403)",
    "browserStatus": "OK (200)",
    "recommendation": "JavaScript challenge detected. Use browser-based scraping (Playwright/Puppeteer) with residential proxies — this combination passed the test.",
    "tests": [
        {
            "method": "http-direct",
            "statusCode": 403,
            "contentLength": 15234,
            "responseTimeMs": 450,
            "challengeDetected": true,
            "challengeType": "cloudflare-challenge",
            "error": null,
            "success": false
        }
    ],
    "testedAt": "2026-02-28T12:00:00.000Z"
}
```
Tips for best results
- Test the exact URLs your scraper targets — not just the homepage. Blocking often varies by page.
- Try different countries — some sites have geo-specific blocking rules.
- Enable browser testing for sites that use Cloudflare, Akamai, or similar JS-challenge WAFs.
- Use results to choose your scraping stack — if `http-residential` works, you don't need a browser (cheaper and faster).
- Re-test periodically — sites update their defenses. What worked last month may not work today.
- Check the `tests` array for detailed timing and content length data to understand response patterns.
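The tip about choosing a scraping stack from the results can be sketched as a walk over the `tests` array. This is an illustration under stated assumptions: the method name `http-datacenter` and the cheapest-to-most-expensive ordering in `COST_ORDER` are guesses based on the field names in this README, the sample item is made up, and `cheapest_working_method` is a hypothetical helper.

```python
# Assumed ordering from cheapest to most expensive access method.
# "http-datacenter" is an assumed method name; the others appear in this README.
COST_ORDER = ("http-direct", "http-datacenter", "http-residential", "browser-residential")


def cheapest_working_method(item, cost_order=COST_ORDER):
    """Return the first method in cost_order whose test succeeded, or None."""
    working = {t["method"] for t in item.get("tests", []) if t.get("success")}
    for method in cost_order:
        if method in working:
            return method
    return None


# Made-up sample shaped like the output example above.
sample = {
    "url": "https://example.com",
    "tests": [
        {"method": "http-direct", "statusCode": 403, "responseTimeMs": 450, "success": False},
        {"method": "http-residential", "statusCode": 200, "responseTimeMs": 1200, "success": True},
        {"method": "browser-residential", "statusCode": 200, "responseTimeMs": 5200, "success": True},
    ],
}

method = cheapest_working_method(sample)
print(method)  # http-residential
print("browser needed:", method is not None and method.startswith("browser"))
```

In practice the `bestMethod` field already reports this answer; walking `tests` yourself is mainly useful when you want to apply your own cost ordering or inspect timings per method.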
Integrations
Connect Anti-Blocking Diagnostics with Apify integrations to build automated monitoring. Run weekly diagnostics on your target URLs and alert via Slack or email when blocking changes.
Using the Apify API
Node.js
```javascript
import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: 'YOUR_API_TOKEN' });

// Start the actor and wait for the run to finish.
const run = await client.actor('automation-lab/anti-blocking-diagnostics').call({
    urls: ['https://www.cloudflare.com', 'https://www.amazon.com'],
    countryCode: 'US',
    includeBrowserTest: true,
});

// Read one result item per tested URL from the run's default dataset.
const { items } = await client.dataset(run.defaultDatasetId).listItems();
for (const item of items) {
    console.log(`${item.url}: ${item.verdict} — ${item.recommendation}`);
}
```
Python
```python
from apify_client import ApifyClient

client = ApifyClient('YOUR_API_TOKEN')

# Start the actor and wait for the run to finish.
run = client.actor('automation-lab/anti-blocking-diagnostics').call(run_input={
    'urls': ['https://www.cloudflare.com', 'https://www.amazon.com'],
    'countryCode': 'US',
    'includeBrowserTest': True,
})

# Read one result item per tested URL from the run's default dataset.
items = client.dataset(run['defaultDatasetId']).list_items().items
for item in items:
    print(f"{item['url']}: {item['verdict']} — {item['recommendation']}")
```
cURL
```bash
curl "https://api.apify.com/v2/acts/automation-lab~anti-blocking-diagnostics/runs" \
  -X POST \
  -H "Authorization: Bearer YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"urls": ["https://www.cloudflare.com"], "countryCode": "US", "includeBrowserTest": true}'
```
Use with AI agents via MCP
Anti-Blocking Diagnostics is available as a tool for AI assistants via the Model Context Protocol (MCP).
Setup for Claude Code
```bash
claude mcp add --transport http apify "https://mcp.apify.com"
```
Setup for Claude Desktop, Cursor, or VS Code
```json
{
    "mcpServers": {
        "apify": {
            "url": "https://mcp.apify.com"
        }
    }
}
```
Example prompts
- "Test if this website blocks scrapers"
- "Run anti-blocking diagnostics on these URLs"
Learn more in the Apify MCP documentation.
Legality
Scraping publicly available data is generally considered legal in the US: in hiQ Labs v. LinkedIn, the US Court of Appeals for the Ninth Circuit held that scraping publicly accessible data likely does not violate the Computer Fraud and Abuse Act. This actor only accesses publicly available information and does not require authentication. Always review and comply with the target website's Terms of Service before scraping. For personal data, ensure compliance with GDPR, CCPA, and other applicable privacy regulations.
FAQ
What test methods does it use?
Four methods: (1) HTTP without proxy, (2) HTTP with datacenter proxy, (3) HTTP with residential proxy, (4) Headless Chromium browser with residential proxy. Each method tests a different dimension of blocking.
What blocking types can it detect?
IP-based blocking, WAF challenges (Cloudflare, Akamai, Imperva), CAPTCHAs (reCAPTCHA, hCaptcha), JavaScript challenges, geo-blocks, rate limits, and authentication requirements.
How long does it take?
About 30–60 seconds per URL with all tests enabled. The browser test adds ~10–15 seconds.
Can I test without the browser?
Yes. Set `includeBrowserTest` to `false` to run only the HTTP tests. This is faster and uses less memory, but the results won't show whether a real browser can get past a JS challenge.
What does "partially-blocked" mean?
It means some test methods were blocked but at least one succeeded. The bestMethod field tells you which one worked.
Does it actually bypass blocking?
No — it diagnoses blocking and tells you what works. It's a planning tool to help you choose the right scraping strategy before building your scraper.
Can I monitor blocking changes over time?
Yes. Schedule the actor to run weekly on your target URLs. Compare the verdict and blockingType fields across runs to detect when sites update their defenses.
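Comparing two runs can be sketched as a per-URL diff of the `verdict` and `blockingType` fields. This is a minimal illustration, not part of the actor: `blocking_changes` is a hypothetical helper, and the sample items (including the `captcha` value) are made up to show the shape of the comparison.

```python
def blocking_changes(previous_items, current_items, fields=("verdict", "blockingType")):
    """Return {url: {field: (old, new)}} for URLs whose tracked fields changed."""
    previous_by_url = {item["url"]: item for item in previous_items}
    changes = {}
    for item in current_items:
        old = previous_by_url.get(item["url"])
        if old is None:
            continue  # URL was not in the previous run, so there is nothing to compare
        diff = {
            field: (old.get(field), item.get(field))
            for field in fields
            if old.get(field) != item.get(field)
        }
        if diff:
            changes[item["url"]] = diff
    return changes


# Made-up dataset items from two scheduled runs; "captcha" is an illustrative value.
last_week = [{"url": "https://example.com", "verdict": "partially-blocked", "blockingType": "js-challenge"}]
this_week = [{"url": "https://example.com", "verdict": "partially-blocked", "blockingType": "captcha"}]

for url, diff in blocking_changes(last_week, this_week).items():
    for field, (old, new) in diff.items():
        print(f"{url}: {field} changed from {old} to {new}")
```

The returned dictionary is empty when nothing changed, which makes it easy to send an alert only when there is something to report.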
What proxies does it use?
Apify's built-in datacenter (SHADER) and residential proxy groups. You choose the country for geotargeting.
The actor times out or fails — what should I do?
Increase the timeoutSecs parameter (default 30s). Some sites with heavy JavaScript or slow responses need 60s or more. Also check that you are not testing an unreachable URL (e.g., internal/private IP addresses).
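To catch internal or private addresses before starting a run, the URL list can be pre-filtered locally. The sketch below uses only the Python standard library and makes a few assumptions: `is_obviously_unreachable` is a hypothetical helper, it only inspects literal IP addresses and `localhost` names, and it does not resolve DNS, so a public-looking hostname that points at a private address would not be caught.

```python
import ipaddress
from urllib.parse import urlparse


def is_obviously_unreachable(url: str) -> bool:
    """True if the URL has no host or targets localhost or a private/loopback/link-local IP."""
    host = urlparse(url).hostname
    if host is None:
        return True  # no host at all, e.g. a scheme-less or relative URL
    if host == "localhost" or host.endswith(".localhost"):
        return True
    try:
        address = ipaddress.ip_address(host)
    except ValueError:
        return False  # a DNS name; this sketch does not resolve it
    return address.is_private or address.is_loopback or address.is_link_local


urls = ["https://example.com", "http://192.168.1.10/admin", "http://localhost:8080"]
testable = [u for u in urls if not is_obviously_unreachable(u)]
print(testable)  # ['https://example.com']
```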
All methods show "Blocked" — does that mean the site is impossible to scrape?
Not necessarily. The diagnostics test standard approaches. Some sites require specialized solutions like session management, CAPTCHA solving, or custom headers. The results help you narrow down what combination to try next.
Other automation and diagnostics tools
- Scraper Regression Watchdog — Monitor your scrapers for regressions and data quality issues.
- Website Change Monitor — Detect changes on any web page and get notified.
- Website Health Report — Get a comprehensive health report for any website.
- HTTP Status Checker — Check HTTP status codes for a list of URLs.


