
Stealth Web Scraper

Pricing: from $3.50 / 1,000 successful pages

Get rendered HTML, plain text, and extracted fields from Cloudflare-protected and JavaScript-heavy pages without building your own browser-and-proxy stack.


Developer: kane liu (Maintained by Community)


Scrape websites that block other scrapers. Get the full page content from Cloudflare-protected, anti-bot, and JavaScript-heavy sites: no browser setup, no proxy config, no code required.

Works on sites like Clutch, G2, Capterra, Trustpilot, Cloudflare-protected storefronts, and many more that return "access denied" to standard scrapers.


Who is this for?

  • ๐Ÿ›๏ธ E-commerce monitors โ€” track competitor prices on Cloudflare-protected shops
  • ๐ŸŽฏ Lead generators โ€” pull company listings from directories like Clutch, Yelp, G2
  • ๐Ÿ“Š SEO & market researchers โ€” collect reviews, ratings, and content from protected sources
  • ๐Ÿค– AI agent builders โ€” feed rendered page content into your automation workflows
  • ๐Ÿ’ผ Competitive analysts โ€” watch competitor landing pages and content changes
  • ๐Ÿ“ฐ Content researchers โ€” gather articles and listings from JavaScript-heavy sites

If your scraper keeps getting "403 Forbidden" or just an empty shell of a page, this Actor is built for you.


What you can do with it

1. Monitor competitor pricing on protected shops

In plain English: give the Actor a list of product page URLs → get back a clean table with product name, price, and stock status, ready to drop into Excel or an alert system.

You give:

| Field | What to enter |
| --- | --- |
| URLs | List of competitor product page links (one per row) |
| Fields | `productName`, `price`, `availability` |

You get back (table you can download as Excel / CSV / JSON):

| Product Name | Price | Availability |
| --- | --- | --- |
| Example Product | $49.00 | In stock |
| Another Product | $29.00 | Out of stock |
| ... | ... | ... |
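In input terms, the monitoring job above is just `urls` plus `extractSelectors` (both described in the Input reference at the bottom of this page). A minimal sketch; the URLs and CSS selectors here are hypothetical placeholders to swap for your real targets:

```python
# Sketch of a price-monitoring run input. The URLs and selectors are
# made-up examples -- inspect the target pages to find the real ones.
import json

run_input = {
    "urls": [
        "https://shop.example.com/product/widget-a",
        "https://shop.example.com/product/widget-b",
    ],
    "extractSelectors": {
        "productName": "h1.product-title",
        "price": ".price",
        "availability": ".stock-status",
    },
    "outputFormat": "text",
}

print(json.dumps(run_input, indent=2))
```

Paste the resulting JSON straight into the Actor's input editor, or pass it as the run input via the API.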

2. Pull company listings from directories (Clutch, G2, Capterra)

In plain English: point the Actor at a directory category page → get a list of provider cards with company names, locations, ratings, and review counts.

You give:

| Field | What to enter |
| --- | --- |
| URLs | Category page URLs (e.g. Clutch digital marketing) |
| Wait for | `[data-testid='provider-card']` (optional) |

You get back (one row per provider):

| Company | Location | Rating | Reviews | Category |
| --- | --- | --- | --- | --- |
| Agency One | New York | 4.9 | 42 | Digital Marketing |
| Agency Two | Los Angeles | 4.7 | 31 | Digital Marketing |
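Directory pages typically render their cards client-side, so this use case pairs `waitForSelector` with `extractSelectors` (both from the Input reference below). A sketch; the URL is illustrative and the field selectors are hypothetical, so confirm them in your browser's inspector first:

```python
# Hypothetical directory-scrape input. waitForSelector uses the
# provider-card selector shown above; the extractSelectors values
# are placeholders to verify against the live page.
run_input = {
    "urls": ["https://clutch.co/agencies/digital-marketing"],
    "waitForSelector": "[data-testid='provider-card']",
    "extractSelectors": {
        "company": ".provider-name",
        "location": ".provider-location",
        "rating": ".provider-rating",
    },
    "pageTimeout": 120,  # protected pages can be slow to clear checks
}
```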

3. Extract reviews from Trustpilot / G2 / Capterra

In plain English: provide the review page URL → get a clean list of reviews you can feed into sentiment analysis or a spreadsheet.

You get back:

| Reviewer | Rating | Date | Review |
| --- | --- | --- | --- |
| John D. | 5 | 2026-03-15 | Great product, fast shipping... |
| Sarah K. | 2 | 2026-02-28 | Had issues with the packaging... |

4. Feed AI agents with rendered page content

In plain English: your AI agent needs the actual visible content of a page (not a 403 page) → pass the URL, get back readable plain text and HTML. Works natively with LangChain, Make, n8n, and Zapier.

Plug the Actor output straight into:

  • Your prompt as context
  • A vector database for RAG
  • A custom summarization pipeline
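Stitching the output into prompt context needs no framework at all. A minimal sketch with made-up sample items whose field names follow the "What you get back" section of this page:

```python
# Turn Actor output items into a single prompt-context string.
# These items are illustrative samples, not real Actor output.
items = [
    {"url": "https://example.com/a", "title": "Page A",
     "text": "Rendered text of page A..."},
    {"url": "https://example.com/b", "title": "Page B",
     "text": "Rendered text of page B..."},
]

def to_context(items, max_chars=4000):
    """Concatenate page texts, capped so the prompt stays within budget."""
    parts = [f"## {item['title']} ({item['url']})\n{item['text']}"
             for item in items]
    return "\n\n".join(parts)[:max_chars]

context = to_context(items)
```

The same string can be chunked and embedded for a vector store instead of being passed directly to the prompt.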

5. Watch for content changes

In plain English: run the same URLs on a schedule (daily / weekly) → Actor returns a timestamp and quality signal for each page, so you can diff changes over time.

Useful for tracking:

  • Competitor landing page updates
  • Pricing page changes
  • Legal / Terms of Service updates
  • Product launch announcements
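One simple way to diff scheduled runs is to hash each page's text and compare against the previous run. A sketch with sample items standing in for Actor output (field names follow "What you get back" below):

```python
# Detect changed pages between two scheduled runs by fingerprinting
# each page's text. The items here are illustrative samples.
import hashlib

def fingerprint(items):
    """Map each URL to a SHA-256 hash of its text content."""
    return {i["url"]: hashlib.sha256(i["text"].encode()).hexdigest()
            for i in items}

previous = fingerprint([{"url": "https://example.com/pricing",
                         "text": "Pro plan: $49/mo"}])
current = fingerprint([{"url": "https://example.com/pricing",
                        "text": "Pro plan: $59/mo"}])

changed = [url for url in current if current[url] != previous.get(url)]
print(changed)  # ['https://example.com/pricing']
```

Store the fingerprint dictionary in a key-value store between runs and alert whenever `changed` is non-empty.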

How to use (no code required)

  1. Click "Try for Free" at the top of this page
  2. Paste your list of URLs (one per line)
  3. (Optional) Add CSS selectors if you want specific fields like price or title
  4. (Optional) Set waitForSelector if the page has content that loads dynamically
  5. Click Start; results appear in the Dataset tab within seconds

Download your results as CSV, Excel, or JSON. That's it.

The free $5 monthly Apify credit gets you around 1,400 successful pages.


What you get back

Every successfully scraped page returns:

  • Page title โ€” the <title> of the page
  • Text content โ€” clean plain text, ready for analysis
  • HTML โ€” full rendered HTML (if you need it)
  • Extracted fields โ€” whatever you asked for with CSS selectors
  • Quality signal โ€” full, partial, minimal, or blocked, so you know what you got
  • Timestamp โ€” when the page was scraped

Two separate datasets:

  • ✅ Successful pages go to the main dataset (and count toward billing)
  • ❌ Failed or blocked pages go to a failures dataset (never billed)

You always know exactly what you paid for.
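Downstream code can branch on the quality signal rather than treating every item alike. A minimal sketch with illustrative sample items:

```python
# Filter Actor items by the quality signal described above.
# The items here are made-up samples, not real output.
items = [
    {"url": "https://example.com/a", "quality": "full"},
    {"url": "https://example.com/b", "quality": "partial"},
    {"url": "https://example.com/c", "quality": "minimal"},
]

full_pages = [i["url"] for i in items if i["quality"] == "full"]
acceptable = [i["url"] for i in items
              if i["quality"] in ("full", "partial")]
```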


Pricing

Pay only for successful pages. Blocked, failed, or incomplete pages are free.

| Volume | Price |
| --- | --- |
| 100 successful pages | ~$0.35 |
| 1,000 successful pages | ~$3.50 |
| 10,000 successful pages | ~$35.00 |

How this compares:

  • Building your own stealth scraper: 20+ hours of dev work, ongoing maintenance
  • Bright Data / Zyte scraping API: $500+/month subscription
  • This Actor: pay only when you scrape, no subscription

Example: Scrape 500 competitor product pages once a week = about $7/month. Scrape 50 Clutch pages once = about $0.18.

The $5 free monthly Apify credit covers around 1,400 successful pages โ€” enough to test whether this fits your workflow before you spend anything.

Apify platform compute/memory is billed separately by Apify, typically pennies per run for small jobs.
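The per-page rate makes cost estimates one line of arithmetic. A sketch using the listed $3.50 per 1,000 successful pages (Apify compute, billed separately, is not included):

```python
# Back-of-the-envelope cost check at the listed rate of
# $3.50 per 1,000 successful pages.
RATE_PER_1000 = 3.50

def monthly_cost(pages_per_run, runs_per_month):
    """Estimated Actor cost in USD for a recurring job."""
    return pages_per_run * runs_per_month * RATE_PER_1000 / 1000

# 500 product pages weekly, matching the example above:
print(round(monthly_cost(500, 4), 2))  # 7.0
```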


Connect to your tools

Use this Actor from your existing stack; no coding needed:

| Platform | How to connect |
| --- | --- |
| Make.com | Search "Apify" → "Run Actor" → Actor ID: `lentic_clockss/stealth-web-scraper` |
| n8n | Add Apify node → "Run Actor" action → same Actor ID |
| Zapier | Apify integration → "Run Actor" trigger |
| LangChain | `ApifyActorsTool("lentic_clockss/stealth-web-scraper")` |
| Python / Node.js | Apify SDK or direct HTTPS call |

API call example

```shell
curl "https://api.apify.com/v2/acts/lentic_clockss~stealth-web-scraper/runs" \
  -X POST \
  -H "Authorization: Bearer YOUR_APIFY_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"urls": ["https://www.clutch.co/it-services"], "outputFormat": "text"}'
```

Results come back in JSON via the Apify Dataset API:

```
GET https://api.apify.com/v2/datasets/{datasetId}/items?format=json
```
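For a quick script, the endpoint above can be called with only the standard library. A sketch; `DATASET_ID` and the token are placeholders, and the item fields follow "What you get back" above:

```python
# Fetch and decode dataset items via the Apify Dataset API endpoint
# shown above, with the token passed as a query parameter.
import json
import urllib.request

API_BASE = "https://api.apify.com/v2"

def items_url(dataset_id, token):
    """Build the dataset-items URL shown above."""
    return f"{API_BASE}/datasets/{dataset_id}/items?format=json&token={token}"

def fetch_items(dataset_id, token):
    """Fetch and JSON-decode the dataset items (requires network access)."""
    with urllib.request.urlopen(items_url(dataset_id, token)) as resp:
        return json.load(resp)

# Usage (placeholder IDs):
# for item in fetch_items("DATASET_ID", "YOUR_APIFY_TOKEN"):
#     print(item["url"], item.get("quality"))
```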

When to use something else

This Actor is great for public content on protected sites. It's NOT the right tool for:

| If you need... | Use this instead |
| --- | --- |
| Login-only pages (your account dashboard) | A custom Actor with session handling |
| Long sessions with complex interactions | Apify's Web Scraper or a custom Actor |
| Guaranteed success on every request | No tool can promise this; websites change |
| Simple non-protected websites | Apify's cheaper Web Scraper handles these fine |

FAQ

Q: What counts as a successful page? A: A page is successful when it returns status 200, isn't blocked, and (if you set waitForSelector) the element appeared. Only successful pages are billed.

Q: What happens when a page is blocked? A: It goes to the separate failures dataset with the error details. You're not charged for blocked pages.

Q: Do I need my own proxies? A: No. The Actor has built-in proxy support. RESIDENTIAL is the default and works best on heavily protected sites. You can also bring your own proxy list if you prefer.

Q: Can I extract specific fields like prices or titles? A: Yes. Pass extractSelectors with CSS selectors. If the selector matches one element you get a string; if it matches several, you get a list.
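Because of that string-or-list behavior, a small normalizer keeps downstream code uniform. A sketch, not part of the Actor itself:

```python
# Normalize extractSelectors results: a single match comes back as a
# string, multiple matches as a list -- wrap everything into a list.
def as_list(value):
    """Return selector results as a list regardless of match count."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]

assert as_list("$49.00") == ["$49.00"]
assert as_list(["a", "b"]) == ["a", "b"]
```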

Q: Will this work on LinkedIn / Instagram / Facebook? A: No โ€” those require login sessions. This Actor is for public pages that just happen to be protected by anti-bot systems.

Q: How is this different from Apify's Web Scraper? A: Apify's Web Scraper handles standard sites. This Actor is specifically built for pages blocked by Cloudflare, Akamai, PerimeterX, and similar anti-bot systems. Use the standard Web Scraper for easier targets to save money.

Q: How do I know if my target site needs this Actor? A: Try Apify's standard Web Scraper first. If you get 403 errors or an empty page, switch to this one.


Input reference

For developers who want full control:

| Parameter | Type | Description |
| --- | --- | --- |
| `urls` | array | List of URLs to scrape (required) |
| `extractSelectors` | object | CSS selectors for specific fields, e.g. `{"title": "h1", "price": ".price"}` |
| `outputFormat` | string | `html`, `text`, or `both` (default: `both`) |
| `waitForSelector` | string | CSS selector that must appear before extraction completes |
| `maxConcurrency` | integer | Parallel pages, 1-5 (default: 1) |
| `pageTimeout` | integer | Page load timeout in seconds, 30-300 (default: 90) |
| `proxyGroup` | string | `auto` (datacenter, cheapest), `RESIDENTIAL` (recommended for protected sites), or `BUYPROXIES94952` |

Full output schema is available in the Dataset tab.
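The ranges and defaults in the table above can be checked client-side before starting a run. A sketch that mirrors the table; it is a convenience helper, not part of the Actor:

```python
# Validate a run input against the parameter table above before
# launching a run; returns a list of human-readable problems.
def validate_input(run_input):
    errors = []
    if not run_input.get("urls"):
        errors.append("urls is required and must be non-empty")
    fmt = run_input.get("outputFormat", "both")
    if fmt not in ("html", "text", "both"):
        errors.append(f"outputFormat must be html, text, or both, got {fmt!r}")
    conc = run_input.get("maxConcurrency", 1)
    if not 1 <= conc <= 5:
        errors.append("maxConcurrency must be between 1 and 5")
    timeout = run_input.get("pageTimeout", 90)
    if not 30 <= timeout <= 300:
        errors.append("pageTimeout must be between 30 and 300 seconds")
    return errors

assert validate_input({"urls": ["https://example.com"]}) == []
assert validate_input({"maxConcurrency": 9}) != []
```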


→ Browse all Actors: apify.com/lentic_clockss