AI Web Scraper - Web scraper with AI-based summary or answer
Pricing
Pay per usage
Web Page Scraper + AI Summary/Answer: scrapes any URL, extracts content (text, links, images, tables, lists, raw HTML, tech stack), automatically falls back to a headless browser for JS-rendered sites, and optionally generates an AI summary or answer from your prompt. Try it with the frontend at https://aiscraperweb.netlify.app/
Developer: Siddharth Jain
Web Scraper AI (Apify Actor)
Scrape a web page (HTTP first, browser fallback for JS-heavy pages) and optionally generate an AI summary via Pollinations.
What it does
- Validates the input URL (http/https only)
- Tries a fast HTTP scrape first (Axios + Cheerio)
- If the page looks blocked / empty / JS-rendered, falls back to a headless browser scrape (Puppeteer + @sparticuz/chromium)
- Optionally calls Pollinations (https://text.pollinations.ai/) to produce a summary using your prompt
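The first and third steps above can be sketched in plain Node.js. This is a minimal illustration, not the Actor's actual code: the helper names `isHttpUrl` and `needsBrowserFallback`, the status codes treated as "blocked", and the 200-character threshold are all assumptions.

```javascript
// Hypothetical helpers; names and thresholds are illustrative only.

// Accept only absolute http/https URLs.
function isHttpUrl(value) {
  try {
    const { protocol } = new URL(value);
    return protocol === "http:" || protocol === "https:";
  } catch {
    return false; // not parseable as an absolute URL
  }
}

// Guess whether a plain HTTP response should be re-scraped in a browser.
function needsBrowserFallback(status, html) {
  if (status === 403 || status === 429) return true; // assumed "blocked" signals
  // Strip scripts, styles and tags to estimate how much visible text exists.
  const text = html
    .replace(/<script[\s\S]*?<\/script>/gi, "")
    .replace(/<style[\s\S]*?<\/style>/gi, "")
    .replace(/<[^>]+>/g, " ")
    .replace(/\s+/g, " ")
    .trim();
  return text.length < 200; // assumed cutoff for "empty / JS-rendered"
}
```

A page that ships only an empty root element and a script bundle has almost no visible text, so a check like this would route it to the browser scrape.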
Input
This Actor expects a JSON input with:
- url (required): page URL to scrape
- prompt (optional): if provided, the Actor will request an AI summary
Example:
{"url": "https://example.com","prompt": "Summarize the page in 5 bullets"}
Output
The Actor writes results to:
- Default dataset (one item per run)
- Key-value store, under the key OUTPUT
Fields include:
- url, methodUsed, scrapedAt
- title, description, paragraphs, images, links
- tables, lists, uniqueComponents, rawHTML, techStack
- summary / AI answer (only if prompt is provided)
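If you consume the output programmatically, a small check like the one below can confirm that a record carries the fields listed above. It is a sketch: `missingFields` is a hypothetical helper, no field types are assumed, and the AI field is left out because its exact key name is not stated here.

```javascript
// Field names taken from the list above; the optional AI field is omitted
// because its exact key is not documented in this README.
const EXPECTED_FIELDS = [
  "url", "methodUsed", "scrapedAt",
  "title", "description", "paragraphs", "images", "links",
  "tables", "lists", "uniqueComponents", "rawHTML", "techStack",
];

// Return the names of expected fields that are absent from a result item.
function missingFields(item) {
  return EXPECTED_FIELDS.filter((key) => !(key in item));
}
```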
Run locally (Windows)
Install deps:
$ npm install
Quick smoke test:
$ npm run test:local
Run with your own URL:
$ npm run set-input -- --url https://example.com --prompt "Summarize this page"
$ npm start
Where to find output:
- storage/datasets/default/000000001.json
- storage/key_value_stores/default/OUTPUT.json
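To inspect a local run from a script, you could read the key-value store file directly. A minimal sketch, assuming the path above is relative to the project folder; `readLocalOutput` is a hypothetical helper name.

```javascript
const fs = require("node:fs");
const path = require("node:path");

// Read OUTPUT.json from the local key-value store, or return null if no
// run has produced it yet. `projectDir` defaults to the current directory.
function readLocalOutput(projectDir = process.cwd()) {
  const file = path.join(
    projectDir, "storage", "key_value_stores", "default", "OUTPUT.json"
  );
  if (!fs.existsSync(file)) return null;
  return JSON.parse(fs.readFileSync(file, "utf8"));
}
```

Calling `readLocalOutput()` from the project folder after `npm start` would return the parsed result object.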
Deploy / host on Apify
Option A: Apify Console (UI) — easiest
- Zip the project (or connect Git repo)
- In Apify Console → Actors → Create new → Source code
- Upload the code
- Build the Actor image
- Run it with an input JSON (see Input section)
Option B: Apify CLI
If you use the Apify CLI:
- Install/login:
$ npm i -g apify-cli
$ apify login
- From this project folder:
$ apify push
- Then run it from Apify Console or via CLI.
Will I get the “form layout” like other Actors?
Yes.
Apify shows an input form UI automatically when your Actor provides an input schema. This project includes:
- INPUT_SCHEMA.json (defines the fields)
- actor.json references it via the input property
That is what generates the fill-in form you see on other Actors.
If you want more fields (proxy, cookies, max pages, etc.), you extend INPUT_SCHEMA.json and update the code to read them.
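On the code side, reading an extra field might look like the sketch below. `maxPages` is a purely hypothetical field used for illustration (it is not part of this Actor today), and `readInput` is an assumed helper name; the matching property would also need to be declared in INPUT_SCHEMA.json.

```javascript
// Hypothetical: `maxPages` does not exist in this Actor's current input.
// Validate the input object and apply defaults for optional fields.
function readInput(input) {
  const { url, prompt, maxPages = 1 } = input ?? {};
  if (typeof url !== "string" || url.length === 0) {
    throw new Error("url is required");
  }
  if (!Number.isInteger(maxPages) || maxPages < 1) {
    throw new Error("maxPages must be a positive integer");
  }
  // An empty or missing prompt means "no AI summary requested".
  return { url, prompt: prompt || null, maxPages };
}
```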
Notes / limitations
- Some sites block scraping (bot protection, captchas, login walls). In those cases, the Actor may return BLOCKED/LOGIN_REQUIRED.
- AI summary depends on Pollinations availability/rate limits.
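If you call the Actor or the AI endpoint from your own code, one common way to cope with rate limits is to retry with exponential backoff. The sketch below is a generic pattern, not something this README says the Actor does internally; `withRetry`, `backoffDelay`, and the default values are assumptions.

```javascript
// Delay before the next attempt: base, 2x base, 4x base, ...
function backoffDelay(attempt, baseDelayMs = 500) {
  return baseDelayMs * 2 ** attempt;
}

// Run an async task, retrying on failure up to `retries` extra times.
async function withRetry(task, { retries = 3, baseDelayMs = 500 } = {}) {
  let lastError;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await task(attempt);
    } catch (err) {
      lastError = err;
      if (attempt < retries) {
        // Wait before the next attempt, doubling the delay each time.
        await new Promise((resolve) =>
          setTimeout(resolve, backoffDelay(attempt, baseDelayMs))
        );
      }
    }
  }
  throw lastError; // every attempt failed
}
```

Wrap only the call that can be rate limited, so a genuine failure such as an invalid URL is not retried needlessly.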