Pricing

from $1.00 / 1,000 results

Web Scraper API – Any URL, Anti-Bot Proxy, JS Render & AI

Free web scraper API for any website. Rotating anti-bot proxies in 220+ countries, JavaScript rendering, browser actions, screenshots, sticky sessions, CSS + AI extraction, plus ready-made Amazon, Google, YouTube & ChatGPT scrapers. Clean HTML, LLM-ready Markdown & structured JSON.

Rating

5.0

(1)

Developer

youssef farhan

Last modified

2 days ago

Web Scraper API – Scrape Any URL With Anti-Bot Proxies, JS Rendering & AI

Web Scraper API is a free, universal web scraping tool that turns any URL into clean HTML, LLM-ready Markdown, and structured JSON. It routes every request through rotating anti-bot proxies in 220+ countries, renders JavaScript in a real headless browser, runs browser actions (click, scroll, type), captures screenshots, and can run an AI model over each page — so you get reliable data from even the most protected sites without managing browsers, proxies, or CAPTCHAs.

Point it at Amazon, eBay, Walmart, Google, Google Maps, LinkedIn, Instagram, TikTok, Zillow, Yelp, Booking.com, Indeed, Crunchbase, Idealista, or your own list of URLs. Add CSS selectors or an LLM prompt and get structured output in one run. Or skip HTML entirely and use one of the built-in ready-made scrapers for Amazon, Google, YouTube, and ChatGPT, which return clean JSON. Built for developers, researchers, e‑commerce sellers, growth teams, and AI agents.

✅ Free to use: no rental and no per-result fee from this Actor. You pay only Apify's standard platform usage while it runs and your own scrape.do usage for the underlying requests, and you can get a free scrape.do token with 1,000 monthly credits (no credit card required).

Try it now: the default input already points at live Amazon and eBay search results. Paste your Scraper API token, hit Start, and you'll get structured product data back in seconds.

Why use this web scraper API

🆓 Free Actor: no monthly rental and no pay-per-result charge — bring your own scrape.do key and start with 1,000 free monthly credits.
🛡️ Beat anti-bot systems: rotating datacenter + residential/mobile proxies across 220+ countries handle Cloudflare, DataDome, PerimeterX and rate limits for you.
🌐 Any site, famous or niche: Amazon, Google, LinkedIn, Instagram, TikTok, Zillow, Booking.com, Yelp, Indeed — or any URL you paste.
🎭 Full headless browser: render JavaScript, wait for selectors, scroll infinite feeds, click "load more", fill forms, and run custom JS via browser actions.
📸 Screenshots: capture viewport or full-page screenshots, saved straight to the key-value store and linked from each dataset record.
🤖 LLM-ready output + built-in AI: every page becomes Markdown with embedded JSON‑LD/script data — perfect for RAG — and an optional AI step can summarize, classify, or extract JSON.
⚡ Auto-scaled & resumable: concurrency is sized to your live plan limit, and runs are migration-safe (they resume without re-scraping or duplicating data).

Supported scraping features (all of scrape.do)

Feature	Input	What it does
JavaScript rendering	`render`	Real headless browser executes JS before returning HTML
Geo-targeting (country)	`geoCode`	Route through a proxy in a specific country
Geo-targeting (continent)	`regionalGeoCode`	Route through a whole continent
Residential / mobile proxy	`superProxy`	Premium pool for the hardest targets
Device profile	`device`	Imitate desktop, mobile, or tablet
Sticky sessions	`sessionId`	Reuse the same exit IP across a multi-page flow
Cookies	`setCookies`	Send `name=value` cookies to the target
Custom / forwarded headers	`customHeaders`, `forwardHeaders`	Control the exact headers the site sees
Disable redirects	`disableRedirection`	Return the first response without following 3xx
Wait strategy	`waitUntil`, `customWait`, `waitSelector`	Wait for load events, a fixed delay, or a specific element
Viewport	`viewportWidth`, `viewportHeight`	Set the render window size
Block resources	`blockResources`	Skip images/CSS/fonts for faster, cheaper rendering
Browser actions	`playWithBrowser`	Click, scroll, type, wait, execute JS in sequence
Screenshots	`screenShot`, `fullScreenShot`	Capture viewport or full-page image
Scraper timeout	`scrapeTimeout`	Max time the API spends fetching/rendering a page
CSS extraction	`extractRules`	Pull fields with `{ "name": "css selector" }`
AI processing	`llmEnabled` + `llmPrompt`	Summarize / classify / extract JSON per page via OpenRouter
Ready-made scrapers	`scraperApi`	Structured JSON for Amazon, Google, YouTube & ChatGPT — no parsing

Geo-targeting covers 220+ countries (pick any in the dropdown) plus whole-continent routing.

Ready-made structured scrapers (JSON, no parsing)

Set scraperApi to one of scrape.do's dedicated endpoints and the Actor returns clean, already-parsed JSON in a data field — no CSS selectors, no HTML. Put your inputs in scraperApiQueries (one per line):

Mode (`scraperApi`)	Input per line	Returns
`amazon-product`	product URL or ASIN	Full product detail — title, price, rating, images, specs
`amazon-search`	search keyword	Ranked search results
`amazon-offers`	product URL or ASIN	Every seller offer, price & shipping
`google-search`	search query	SERP results + AI overviews
`google-maps`	place / business query	Places, ratings & reviews
`google-shopping`	product query	Shopping listings with prices
`google-news`	news query	News articles
`youtube`	search query	Videos, channels, playlists, shorts
`chatgpt`	prompt	ChatGPT's answer as structured JSON

Shared options: readyApiLanguage (language / hl), geoCode (country / gl), and amazonZipcode (Amazon location pricing). Example — scrape three Amazon products as JSON:

{
  "apiKey": "YOUR_API_TOKEN",
  "scraperApi": "amazon-product",
  "scraperApiQueries": [
    "https://www.amazon.com/dp/B0CHWRXH8B",
    "B09G9FPHY6",
    "https://www.amazon.com/dp/B0BDHWDR12"
  ],
  "geoCode": "us",
  "amazonZipcode": "90210"
}

Ready-made scrapers run one request at a time per token and cost more credits per call (Google/YouTube ≈ 10, ChatGPT ≈ 25). You can still turn on the LLM step to post-process the JSON.
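As a rough budgeting aid, the approximate per-call figures quoted above can be turned into a quick estimate. This is only a sketch, not scrape.do's billing logic: `APPROX_CREDITS` and `estimate_credits` are hypothetical names, applying the "Google ≈ 10" figure to every `google-*` mode is an assumption, and the Amazon modes are left out because no figure is given for them.

```python
# Approximate credits per call, taken from the figures quoted above.
# Hypothetical table: real billing is decided by scrape.do, not by this code.
# Assumption: the "Google" figure applies to every google-* mode.
APPROX_CREDITS = {
    "google-search": 10,
    "google-maps": 10,
    "google-shopping": 10,
    "google-news": 10,
    "youtube": 10,
    "chatgpt": 25,
}


def estimate_credits(scraper_api, queries):
    """Return a rough credit estimate for one run of a ready-made scraper."""
    if scraper_api not in APPROX_CREDITS:
        raise ValueError(f"no approximate cost known for mode {scraper_api!r}")
    # Blank entries are ignored; each remaining query counts as one call.
    calls = sum(1 for q in queries if q.strip())
    return calls * APPROX_CREDITS[scraper_api]


if __name__ == "__main__":
    print(estimate_credits("google-search", ["best headphones", "best earbuds"]))
    print(estimate_credits("chatgpt", ["Summarize today's tech news"]))
```

Against the 1,000 free monthly credits, this suggests roughly 100 Google or YouTube calls, or 40 ChatGPT calls, if the approximate figures hold.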

What you get

Every URL produces one dataset record with these fields:

url — the page that was scraped.
statusCode — HTTP status returned by the target.
ok — true when the status was 2xx.
error — error message for failed URLs (null on success).
scrapedAt — ISO 8601 UTC timestamp of the scrape.
html — full raw HTML (toggle off with storeHtml).
text — the page as LLM-ready Markdown: headings, links, lists, and tables preserved, with structured JSON from <script> tags appended. Built to drop straight into an LLM prompt.
textLength — character count of text.
structuredData — JSON pulled from <script> tags (JSON-LD, __NEXT_DATA__, etc.) as {source, data} — where Amazon/eBay/SSR sites keep their cleanest product data. Present only when found.
extractedFields — your CSS-selector fields as a key/value object.
llmOutput — optional AI result (text or parsed JSON), as a top-level field next to html/text. Accompanied by llmModel, llmUsage, and llmError.
screenshotUrl / screenshotKey — link to the captured screenshot in the key-value store (when screenshots are on).
options — echo of render, geoCode, regionalGeoCode, superProxy, device, and screenshot used per request.

Sample output

{
  "url": "https://www.amazon.com/dp/B0CHWRXH8B",
  "statusCode": 200,
  "ok": true,
  "html": "<!DOCTYPE html><html>...</html>",
  "text": "# Wireless Earbuds Bluetooth 5.4 Headphones\n\n40H Playtime · 4.5 out of 5 stars · **$29.99**\n\n## Structured data (from page scripts)\n\n### application/ld+json\n```json\n{ \"@type\": \"Product\", \"name\": \"Wireless Earbuds\", \"offers\": { \"price\": \"29.99\" } }\n```",
  "textLength": 18742,
  "structuredData": [
    { "source": "application/ld+json", "data": { "@type": "Product", "name": "Wireless Earbuds", "offers": { "price": "29.99", "priceCurrency": "USD" } } }
  ],
  "extractedFields": {
    "title": "Wireless Earbuds Bluetooth 5.4 Headphones",
    "price": "$29.99"
  },
  "llmOutput": { "title": "Wireless Earbuds Bluetooth 5.4", "price": 29.99, "rating": 4.5, "inStock": true },
  "llmModel": "google/gemini-3.1-flash-lite",
  "llmUsage": { "prompt_tokens": 4120, "completion_tokens": 48, "total_tokens": 4168 },
  "llmError": null,
  "options": { "render": true, "geoCode": "us", "regionalGeoCode": null, "superProxy": false, "device": "desktop", "screenshot": false },
  "scrapedAt": "2026-06-25T12:00:00.000000+00:00",
  "error": null
}
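A record shaped like the sample above can be reduced to a name and a numeric price with a few lines of Python. This is a minimal sketch: `product_from_record` is a hypothetical helper, it assumes the field names shown in the sample, and it assumes prices either sit in a JSON-LD `Product` entry or appear in `extractedFields` as a string such as `"$29.99"`.

```python
from typing import Optional


def product_from_record(record: dict) -> Optional[dict]:
    """Pull a name and a numeric price out of one dataset record."""
    if not record.get("ok"):
        return None  # failed URL; the reason is in record["error"]
    # structuredData is only present when script data was found on the page.
    for entry in record.get("structuredData", []):
        data = entry.get("data")
        if isinstance(data, dict) and data.get("@type") == "Product":
            offers = data.get("offers") or {}
            if isinstance(offers, dict) and "price" in offers:
                return {"name": data.get("name"), "price": float(offers["price"])}
    # Fall back to the CSS-extracted fields, stripping a leading "$".
    fields = record.get("extractedFields") or {}
    if "price" in fields:
        price = float(str(fields["price"]).lstrip("$").replace(",", ""))
        return {"name": fields.get("title"), "price": price}
    return None


if __name__ == "__main__":
    sample = {
        "ok": True,
        "structuredData": [
            {
                "source": "application/ld+json",
                "data": {
                    "@type": "Product",
                    "name": "Wireless Earbuds",
                    "offers": {"price": "29.99", "priceCurrency": "USD"},
                },
            }
        ],
        "extractedFields": {"title": "Wireless Earbuds Bluetooth 5.4 Headphones", "price": "$29.99"},
    }
    print(product_from_record(sample))
```

Preferring `structuredData` over `extractedFields` is a design choice here, on the grounds that script data tends to be cleaner than selector output; swap the order if your selectors are more reliable for a given site.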

What you can build with it

Real things people use this Actor for — copy any of these as a starting point:

🛒 Amazon, eBay & Walmart price tracking — feed product or search URLs and pull title + price on a schedule to watch competitors, catch price drops, or track your own listings. Add geoCode to compare prices across countries (US vs. UK vs. DE).
📦 E-commerce catalog & inventory monitoring — scrape any store to track stock status, new arrivals, ratings, and reviews into a spreadsheet or database.
🗺️ Local business & reviews data — pull listings, ratings, and reviews from Google Maps, Yelp, or TripAdvisor.
🏠 Real-estate & rentals — collect listings and prices from Zillow, Booking.com, or Idealista, and screenshot each listing.
💼 Lead generation & recruiting — pull names, companies, and roles from LinkedIn-style directories, Crunchbase, or Indeed job pages (use superProxy + sessionId).
🤖 Feed AI agents & RAG pipelines — give Claude, ChatGPT, or Cursor live web data over MCP; the LLM-ready Markdown text drops straight into embeddings and prompts.
🔍 SERP & SEO monitoring — scrape Google or Bing results behind anti-bot protection to track rankings and visibility.
📰 News, reviews & sentiment — collect article text or reviews and let the built-in LLM summarize, classify, or score sentiment in the same run.
📸 Visual monitoring & QA — capture full-page screenshots of competitor pages, ads, or your own site over time.

Quick recipes

Goal	`startUrls` example	Key options
Amazon product price	`https://www.amazon.com/dp/B0CHWRXH8B`	`extractRules: { "title": "#productTitle", "price": ".a-price .a-offscreen" }`
Amazon search results	`https://www.amazon.com/s?k=wireless+earbuds`	`render: true`
eBay listing	`https://www.ebay.com/itm/123456789012`	`extractRules: { "title": "h1.x-item-title__mainTitle", "price": ".x-price-primary" }`
Google search results	`https://www.google.com/search?q=best+headphones`	`render: true`, `geoCode: "us"`
Google Maps place	`https://www.google.com/maps/place/...`	`render: true`, `waitSelector: "h1"`
Infinite-scroll feed	any social/listing feed	`playWithBrowser: [ {"Action":"ScrollY","Value":3000}, {"Action":"Wait","Timeout":2000} ]`
Full-page screenshot	any URL	`fullScreenShot: true`
Hard target (LinkedIn/Zillow)	any protected URL	`superProxy: true`, `sessionId: "s1"`, `render: true`

Don't want to write selectors? Turn on 🤖 LLM processing, set Force JSON output, and prompt: "Extract product title, price as a number, rating, and whether it's in stock."

Browser actions (`playWithBrowser`)

Automate the headless browser before the page is captured. Provide a list of steps that run in order:

[
  { "Action": "ScrollY", "Value": 1000 },
  { "Action": "Wait", "Timeout": 2000 },
  { "Action": "Click", "Selector": "#load-more" },
  { "Action": "WaitSelector", "WaitSelector": ".results", "Timeout": 5000 },
  { "Action": "Fill", "Selector": "#search", "Value": "laptop" },
  { "Action": "Execute", "Execute": "window.scrollTo(0, document.body.scrollHeight)" }
]

Supported actions: Click (Selector), Wait (Timeout ms), WaitSelector (WaitSelector, Timeout), ScrollX / ScrollY (Value), ScrollTo (Selector), Fill (Selector, Value), Execute (Execute JS). Enabling browser actions turns JavaScript rendering on automatically.
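Because a misspelled action or a missing parameter only surfaces once a run has started, it can help to check a `playWithBrowser` list locally first. The sketch below is a hypothetical pre-flight check, not part of the Actor: `REQUIRED_KEYS` and `validate_actions` are illustrative names, and treating every parameter listed above as required is an assumption.

```python
# Parameters per action, copied from the "Supported actions" list above.
# Assumption: every listed parameter is required.
REQUIRED_KEYS = {
    "Click": {"Selector"},
    "Wait": {"Timeout"},
    "WaitSelector": {"WaitSelector", "Timeout"},
    "ScrollX": {"Value"},
    "ScrollY": {"Value"},
    "ScrollTo": {"Selector"},
    "Fill": {"Selector", "Value"},
    "Execute": {"Execute"},
}


def validate_actions(steps):
    """Return a list of problems found; an empty list means the steps look fine."""
    problems = []
    for i, step in enumerate(steps):
        action = step.get("Action")
        if action not in REQUIRED_KEYS:
            problems.append(f"step {i}: unknown action {action!r}")
            continue
        missing = REQUIRED_KEYS[action] - set(step)
        if missing:
            problems.append(f"step {i}: {action} is missing {sorted(missing)}")
    return problems


if __name__ == "__main__":
    steps = [
        {"Action": "ScrollY", "Value": 1000},
        {"Action": "Click"},  # Selector left out on purpose
        {"Action": "Hover", "Selector": "#menu"},  # not in the supported list
    ]
    for problem in validate_actions(steps):
        print(problem)
```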

Pricing

This Actor is free. There is no monthly rental and no per-result charge from the Actor itself; you pay only Apify's standard platform usage (compute) while it runs, plus your own scrape.do credits for the underlying requests. scrape.do's free tier of 1,000 monthly credits can cover those requests at no cost.

How it works

Input: a list of startUrls plus your Scraper API token.
Fetch: each URL is routed through rotating anti-bot proxies, with optional JavaScript rendering, geo-targeting, device profile, sticky sessions, and browser actions.
Process: apply CSS extractRules, capture a screenshot, and/or run an LLM prompt on each page.
Output: structured records land in the Apify dataset (export as JSON, CSV, Excel, or XML); screenshots go to the key-value store.
Automate: run on a schedule and trigger webhooks on finish — built into the Apify platform.

Input example

{
  "apiKey": "YOUR_API_TOKEN",
  "startUrls": [
    { "url": "https://www.amazon.com/s?k=wireless+earbuds" },
    { "url": "https://www.ebay.com/sch/i.html?_nkw=wireless+earbuds" }
  ],
  "render": true,
  "geoCode": "us",
  "device": "desktop",
  "extractRules": { "title": "h1", "price": ".a-price .a-offscreen" }
}

Omit extractRules and llm* fields to just return clean HTML and text. Set concurrencyPercentage to control speed.
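When inputs are generated from a plain list of URLs, a small builder keeps the shape consistent with the example above. This is a sketch under stated assumptions: `build_input` is a hypothetical helper, and it only passes through whatever option names you give it, so using the documented input names is up to the caller.

```python
import json


def build_input(api_key, urls, extract_rules=None, **options):
    """Assemble an Actor input dict in the shape of the example above."""
    actor_input = {
        "apiKey": api_key,
        # startUrls is a list of {"url": ...} objects, not bare strings.
        "startUrls": [{"url": u} for u in urls],
    }
    # Leaving extractRules out means the run returns just HTML and text.
    if extract_rules:
        actor_input["extractRules"] = extract_rules
    # Pass through other options (render, geoCode, device, ...),
    # skipping any that were left as None.
    actor_input.update({k: v for k, v in options.items() if v is not None})
    return actor_input


if __name__ == "__main__":
    payload = build_input(
        "YOUR_API_TOKEN",
        ["https://www.amazon.com/s?k=wireless+earbuds"],
        extract_rules={"title": "h1"},
        render=True,
        geoCode="us",
    )
    print(json.dumps(payload, indent=2))
```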

FAQ

Is this web scraper API really free? Yes. The Actor has no rental and no per-result fee. You pay only Apify's standard compute usage while it runs and your own scrape.do credits for the requests — and scrape.do gives you 1,000 free credits a month.

Which websites can it scrape? Any public URL, including heavily protected sites like Amazon, eBay, Walmart, Google, Google Maps, LinkedIn, Instagram, TikTok, Zillow, Yelp, Booking.com, Indeed, and Crunchbase.

Does it handle anti-bot protection and proxies? Yes. Every request is routed through a rotating proxy network, with optional residential/mobile proxies (superProxy) and per-country or per-continent geo-targeting.

Can it render JavaScript / dynamic sites? Yes — set render: true for SPAs and dynamic pages. You can also wait for elements (waitSelector), add delays (customWait), and run browser actions (scroll, click, type).

Can it take screenshots? Yes. Enable screenShot (viewport) or fullScreenShot (full page); the image is saved to the key-value store and linked from each record via screenshotUrl.

What output formats are supported? Export the dataset as JSON, CSV, Excel, or XML, or read it through the Apify API. Screenshots are stored as PNG/JPEG in the key-value store.

Can I scrape one URL or many? Both. Pass a single URL or thousands in startUrls; they are scraped concurrently.

Does it support scheduling and webhooks? Yes — use Apify schedules to run it automatically and webhooks to push results downstream.

Is the data live or cached? Live. Each run fetches the current page in real time.

How do I extract specific fields? Provide extractRules as a { "fieldName": "css selector" } map, or set an llmPrompt for AI extraction.

Can AI agents call it? Yes — it's available via the Apify REST API and as an MCP server for Claude, ChatGPT, and Cursor (see below).

Use via API or MCP

Run it programmatically via the Apify REST API:

POST https://api.apify.com/v2/acts/fayoussef~universal-scraper-api/runs?token=YOUR_TOKEN
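A minimal sketch of that call with only the Python standard library follows. It assumes the request body is the Actor input as JSON, which is how the Apify run endpoint accepts input; `build_run_request` and `start_run` are hypothetical helper names. Note that two different tokens are involved: the Apify token goes in the URL, while the scrape.do token goes in the input as `apiKey`.

```python
import json
import urllib.parse
import urllib.request

ACTOR_RUNS_URL = "https://api.apify.com/v2/acts/fayoussef~universal-scraper-api/runs"


def build_run_request(apify_token, actor_input):
    """Build the POST request that starts a run; the JSON body is the Actor input."""
    query = urllib.parse.urlencode({"token": apify_token})
    return urllib.request.Request(
        f"{ACTOR_RUNS_URL}?{query}",
        data=json.dumps(actor_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def start_run(apify_token, actor_input):
    """Send the request and return the parsed JSON response."""
    with urllib.request.urlopen(build_run_request(apify_token, actor_input)) as resp:
        return json.load(resp)


if __name__ == "__main__":
    request = build_run_request(
        "YOUR_TOKEN",
        {
            "apiKey": "YOUR_API_TOKEN",
            "startUrls": [{"url": "https://www.amazon.com/s?k=wireless+earbuds"}],
            "render": True,
        },
    )
    # Inspect the request without sending it; call start_run(...) to execute.
    print(request.get_method(), request.full_url)
```

Splitting request construction from sending keeps the example inspectable without network access or real credentials.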

Or connect it as an MCP server so AI agents can call it directly:

https://mcp.apify.com/actors/fayoussef~universal-scraper-api

Need a custom scraper?

Need different fields, a specific site, or a fully managed pipeline? Visit automationbyexperts.
