
Ai Web Scraper - Natural language and Vision scraper
This Actor is paid per event

Powerful AI Web Scraper using Google's Gemini Vision. Specify data extraction in natural language. Supports infinite scroll, above-the-fold analysis, automatic cookie consent, pay-per-event pricing, and screenshot storage for debugging.
Actor Metrics
2 Monthly users
No reviews yet
1 bookmark
33% runs succeeded
Created in Mar 2025
Modified 2 days ago
AI Web Scraper – Natural Language & Vision Scraper (Playwright + Pay-Per-Event)
The AI Web Scraper is a powerful, flexible scraping tool powered by Google's Gemini LLM (Large Language Model). Instead of predefined selectors, specify data extraction needs in plain natural language, and the scraper dynamically locates and extracts the data from webpages using screenshots analyzed by AI.
🔥 What's New
- ✅ Playwright Integration: Faster, modern browser automation replaces Selenium.
- ✅ Pay-Per-Event Charging: Charged each time the scraper analyzes a screenshot with Gemini.
- ✅ Configurable Scrolling: Declare whether a page uses infinite scrolling or a static layout.
- ✅ Above-the-Fold Analysis: Option to analyze only the visible part of the page (no scrolling).
- ✅ Screenshot Saving: Optionally save captured screenshots to storage for debugging or auditing.
How It Works
- Natural Language Instructions
  - Tell the scraper in plain language what data you want, for example: "Extract the product title, price, and description."
- Intelligent Scrolling
  - Infinite Scrolling (`has_infinite_scroll: true`): Continuously scrolls until the page stops loading new content.
  - Static Page (`has_infinite_scroll: false`): Captures distinct screenshots with no overlap or duplication.
  - Above-the-Fold Only (`above_fold_only: true`): Captures only the visible viewport without scrolling.
- Pay-Per-Event Charging
  - Each screenshot analyzed by Gemini counts as one event, so usage and cost scale directly with the number of screenshots.
- Automated Cookie Handling
  - Automatically detects and accepts cookie consent banners, reducing manual intervention.
- JSON Output
  - Data is returned in a structured, easily exportable form:
    {
      "url": "https://example.com",
      "data": {
        "product title": "Sample Product",
        "product price": "$19.99",
        "product description": "Detailed description here..."
      }
    }
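The scrolling modes described above can be pictured with a small sketch. It is not the Actor's actual code: the function name `plan_screenshots` and the idea of stepping down the page by exactly one viewport height are assumptions chosen to illustrate how a static page can be covered without overlap, and how above-the-fold mode reduces to a single capture.

```python
import math


def plan_screenshots(page_height: int, viewport_height: int,
                     above_fold_only: bool = False) -> list[int]:
    """Return vertical scroll offsets (px), one per planned screenshot.

    Hypothetical helper: assumes each screenshot covers exactly one
    viewport, so stepping by the viewport height avoids overlap.
    """
    if page_height <= 0 or viewport_height <= 0:
        raise ValueError("page_height and viewport_height must be positive")
    if above_fold_only:
        # Only the initially visible part of the page is captured.
        return [0]
    count = math.ceil(page_height / viewport_height)
    return [i * viewport_height for i in range(count)]


if __name__ == "__main__":
    print(plan_screenshots(2500, 1000))
    print(plan_screenshots(2500, 1000, above_fold_only=True))
```

In a real browser the final scroll position is clamped to the bottom of the page, so the last capture may need cropping to stay free of duplication. Infinite-scroll pages cannot be planned this way because their height is not known in advance.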
Example Input Configuration
    {
      "instructions": "Product title, product price, product description",
      "start_urls": [
        "https://www.ikea.com/nl/nl/p/onsevig-vloerkleed-laagpolig-veelkleurig-60497078/",
        "https://www.ikea.com/nl/nl/p/vedbak-vloerkleed-laagpolig-lichtgrijs-40528900/"
      ],
      "has_infinite_scroll": false,
      "save_screenshots": false,
      "above_fold_only": false
    }
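If you build this input programmatically, a small helper can catch obvious mistakes before a run is started. The sketch below uses the field names from the example above. The helper name `build_run_input`, its validation rules, and the default values are assumptions, not part of the Actor.

```python
import json


def build_run_input(instructions: str, start_urls: list[str],
                    has_infinite_scroll: bool = False,
                    save_screenshots: bool = False,
                    above_fold_only: bool = False) -> dict:
    """Assemble an input object shaped like the example configuration.

    Hypothetical helper: the checks below are illustrative and are not
    the Actor's own validation.
    """
    if not instructions.strip():
        raise ValueError("instructions must not be empty")
    if not start_urls:
        raise ValueError("at least one start URL is required")
    for url in start_urls:
        if not url.startswith(("http://", "https://")):
            raise ValueError(f"not an absolute http(s) URL: {url}")
    return {
        "instructions": instructions,
        "start_urls": list(start_urls),
        "has_infinite_scroll": has_infinite_scroll,
        "save_screenshots": save_screenshots,
        "above_fold_only": above_fold_only,
    }


if __name__ == "__main__":
    run_input = build_run_input(
        "Product title, product price, product description",
        ["https://example.com/product/1"],
    )
    print(json.dumps(run_input, indent=2))
```

The resulting dictionary can be pasted into the Actor's input editor as JSON, or passed as the run input when starting the Actor through the Apify API.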
Important Notes
- Pay-per-event: Every screenshot analysis counts as one event. Optimize your use to control costs.
- AI Accuracy: Clearly specified instructions improve extraction quality. Ambiguous instructions may yield inconsistent results.
- Screenshot Storage: Enable screenshot saving for debugging purposes; screenshots will be stored in your Apify storage.
- Legal Considerations: Always respect website terms of service and comply with applicable regulations like GDPR.
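Because every analyzed screenshot is one event, a rough event count can be estimated before a run. The sketch below assumes static pages whose heights you already know and one screenshot per viewport height. Both the function `estimate_events` and that one-per-viewport rule are assumptions for illustration. Infinite-scroll pages cannot be estimated this way.

```python
import math


def estimate_events(page_heights_px: list[int], viewport_height_px: int,
                    above_fold_only: bool = False) -> int:
    """Estimate screenshot-analysis events for a batch of static pages.

    Hypothetical estimate: one event per viewport-sized screenshot,
    and exactly one event per page in above-the-fold mode.
    """
    if viewport_height_px <= 0:
        raise ValueError("viewport_height_px must be positive")
    if above_fold_only:
        return len(page_heights_px)
    return sum(max(1, math.ceil(height / viewport_height_px))
               for height in page_heights_px)


if __name__ == "__main__":
    # Two pages: one needs three screenshots, the other needs one.
    print(estimate_events([2500, 900], 1000))
```

Multiplying the result by the per-event price shown on the Actor's pricing tab gives an approximate upper bound for the run.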
How to Use
- Apify Account: Sign up or log in.
- Setup Actor: Open the Actor on the Apify platform.
- Configure Inputs:
  - Specify your URLs and extraction instructions.
  - Indicate scrolling behavior (`has_infinite_scroll`) and optionally limit analysis to the visible area (`above_fold_only`).
  - Enable `save_screenshots` if needed.
- Run the Scraper: Click Start and let the scraper execute.
- Review Results: Access structured JSON data via the Apify dataset. Export to CSV, JSON, XLSX, etc.
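If you prefer to post-process downloaded JSON results yourself, the nested `data` object can be flattened into one row per URL. The sketch below assumes items shaped like the JSON output example (a `url` string plus a `data` object). The function name `items_to_csv` is hypothetical, and the column set is taken from whatever keys appear in the items.

```python
import csv
import io


def items_to_csv(items: list[dict]) -> str:
    """Flatten items of the form {"url": ..., "data": {...}} into CSV text."""
    # Collect column names in first-seen order, starting with the URL.
    fieldnames = ["url"]
    for item in items:
        for key in item.get("data", {}):
            if key not in fieldnames:
                fieldnames.append(key)

    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames,
                            restval="", lineterminator="\n")
    writer.writeheader()
    for item in items:
        row = {"url": item.get("url", "")}
        row.update(item.get("data", {}))
        writer.writerow(row)
    return buffer.getvalue()


if __name__ == "__main__":
    sample = [{
        "url": "https://example.com",
        "data": {"product title": "Sample Product",
                 "product price": "$19.99"},
    }]
    print(items_to_csv(sample))
```

Fields missing from an item are left empty in its row, which keeps the columns aligned when the AI extracts different fields for different pages.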
Use Cases
- E-commerce: Prices, descriptions, and product reviews.
- Market Research: Competitor price tracking.
- Lead Generation: Extract B2B information from directories.
- News & Blogs: Scrape headlines, article summaries, or authors.
Integrations
- Seamlessly integrates with Apify’s cloud services.
- Automate data processing with Apify tasks, actors, and APIs.
Feedback & Issues
We welcome feedback! Report bugs or suggest enhancements through the Issues section on the Actor’s Apify page.
Thanks for choosing AI Web Scraper!