Ai Web Scraper - Natural language and Vision scraper

Developed by

Paco

Maintained by Community

Powerful AI Web Scraper using Google's Gemini Vision. Specify data extraction in natural language. Supports infinite scroll, above-the-fold analysis, automatic cookie consent, pay-per-event pricing, and screenshot storage for debugging.

AI Web Scraper – Natural Language & Vision Scraper (Playwright + Pay-Per-Event)

The AI Web Scraper is an advanced and intuitive web scraping tool powered by Google's Gemini Large Language Model (LLM). Define your scraping needs in natural language, and the AI dynamically identifies and extracts relevant data directly from webpage screenshots.


🔥 What's New (Update: March 21, 2025)

  • Structured Output via Dynamic Schema: Automatically generates structured data outputs tailored precisely to your scraping instructions, improving data consistency and usability.
  • Enhanced Instruction Parsing: Improved AI understanding of natural language instructions, extracting clearer, database-ready item lists.
  • Enhanced Cookie Consent Handling: Smarter AI-driven cookie acceptance improves automation and reduces manual intervention.
  • Streamlined Browser Management: Efficiently manages Playwright browser instances, optimizing performance and resource utilization.

How It Works

  1. Define Instructions Clearly
    Use plain language to specify exactly what data you need:

    "Extract the product title, price, and description."
  2. AI-Driven Data Extraction
    The Gemini LLM analyzes screenshots of the webpage and locates the items you requested.

  3. Flexible Scrolling Options

    • Infinite Scrolling: For pages that continuously load new content.
    • No-Overlap Scrolling: Captures comprehensive screenshots of static pages.
    • Above-the-Fold Only: Capture just the initially visible content without scrolling.
  4. Structured JSON Outputs
    Receive data neatly structured for easy analysis and integration:

    {
      "url": "https://example_A.com",
      "items": [
        {"product_name": "Item A", "price": "$29.99", "description": "A great product.", "url": "https://example_A.com"},
        {"product_name": "Item B", "price": "$49.99", "description": "Another great product.", "url": "https://example_B.com"}
      ]
    }
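
As a rough illustration of consuming the output shape shown above, the following sketch flattens one result record into one row per extracted item. The function name `flatten_items` and the added `source_url` column are assumptions made for this example; they are not part of the actor. The item keys depend on your instructions, so the sketch does not hard-code them.

```python
import json


def flatten_items(result: dict) -> list[dict]:
    """Turn one result record into flat rows, one per extracted item."""
    rows = []
    for item in result.get("items", []):
        # Strip stray whitespace from keys, which can creep into generated JSON.
        row = {key.strip(): value for key, value in item.items()}
        # Fall back to the page URL when an item carries no URL of its own.
        row.setdefault("url", result.get("url"))
        # Keep track of which page the item was extracted from (hypothetical column).
        row["source_url"] = result.get("url")
        rows.append(row)
    return rows


# Sample record mirroring the output format documented above.
sample = json.loads("""
{
  "url": "https://example_A.com",
  "items": [
    {"product_name": "Item A", "price": "$29.99", "description": "A great product.", "url": "https://example_A.com"},
    {"product_name": "Item B", "price": "$49.99", "description": "Another great product.", "url": "https://example_B.com"}
  ]
}
""")

rows = flatten_items(sample)
for row in rows:
    print(row["product_name"], row["price"], row["url"])
```

Rows in this form can be passed to `csv.DictWriter` or loaded into a database table without further reshaping.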

Example Input Configuration

{
  "instructions": "Give me the product name and price for each product that isn't blue",
  "start_urls": [
    "https://www.example_A.com/product1",
    "https://www.example_B.com/product2"
  ],
  "has_infinite_scroll": false,
  "save_screenshots": false,
  "above_fold_only": false
}
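
Before starting a run, it can help to sanity-check the configuration locally. The sketch below validates the five fields from the example above and works out which scrolling mode the flags imply. The helper names `validate_input` and `resolve_scroll_mode`, and the precedence rule (above-the-fold wins over infinite scroll), are assumptions for illustration; the actor's own validation and precedence may differ.

```python
from urllib.parse import urlparse

BOOLEAN_FLAGS = ("has_infinite_scroll", "save_screenshots", "above_fold_only")


def validate_input(config: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means the input looks fine."""
    problems = []
    instructions = config.get("instructions")
    if not isinstance(instructions, str) or not instructions.strip():
        problems.append("instructions must be a non-empty string")
    urls = config.get("start_urls")
    if not isinstance(urls, list) or not urls:
        problems.append("start_urls must be a non-empty list")
    else:
        for url in urls:
            parsed = urlparse(url) if isinstance(url, str) else None
            if parsed is None or parsed.scheme not in ("http", "https") or not parsed.netloc:
                problems.append(f"not an http(s) URL: {url!r}")
    for flag in BOOLEAN_FLAGS:
        if flag in config and not isinstance(config[flag], bool):
            problems.append(f"{flag} must be true or false")
    return problems


def resolve_scroll_mode(config: dict) -> str:
    """Map the boolean flags to a scrolling mode (precedence is an assumption)."""
    if config.get("above_fold_only", False):
        return "above_fold"
    if config.get("has_infinite_scroll", False):
        return "infinite"
    return "no_overlap"


example_config = {
    "instructions": "Give me the product name and price for each product that isn't blue",
    "start_urls": [
        "https://www.example_A.com/product1",
        "https://www.example_B.com/product2",
    ],
    "has_infinite_scroll": False,
    "save_screenshots": False,
    "above_fold_only": False,
}

print(validate_input(example_config))
print(resolve_scroll_mode(example_config))
```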

Important Notes

  • Pay-per-event: Charges apply each time the Gemini LLM analyzes a screenshot.
  • Optimized Instructions: Clearer instructions produce better AI-driven results.
  • Legal Compliance: Always adhere to website terms of service and relevant privacy regulations.

How to Use

  1. Apify Setup: Log in to Apify and select the actor.
  2. Configure Inputs: Specify URLs, instructions, and scrolling behaviors.
  3. Run and Extract: Start the actor, and seamlessly access structured data outputs.
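
The same three steps can also be driven from code. The sketch below assumes the `apify-client` Python package and assumes the actor writes its results to the run's default dataset, which this page does not confirm. The actor ID and token shown in the comments are placeholders, not real values.

```python
def run_scraper(client, actor_id: str, run_input: dict) -> list[dict]:
    """Start the actor, wait for it to finish, and return the dataset items.

    `client` is expected to be an `apify_client.ApifyClient` instance.
    """
    # call() starts the run and blocks until it finishes.
    run = client.actor(actor_id).call(run_input=run_input)
    if run is None:
        raise RuntimeError("the actor run did not return any run details")
    # Assumption: results are stored in the run's default dataset.
    dataset = client.dataset(run["defaultDatasetId"])
    return list(dataset.iterate_items())


# Usage sketch (placeholders; requires `pip install apify-client`):
#
#   from apify_client import ApifyClient
#
#   client = ApifyClient("<YOUR_APIFY_TOKEN>")
#   items = run_scraper(
#       client,
#       "<username>/<actor-name>",
#       {
#           "instructions": "Extract the product title, price, and description.",
#           "start_urls": ["https://www.example_A.com/product1"],
#           "has_infinite_scroll": False,
#           "save_screenshots": False,
#           "above_fold_only": False,
#       },
#   )
```

Taking the client as a parameter keeps the function easy to test with a stub and avoids hard-coding credentials.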

Use Cases

  • E-commerce Analysis: Extract product details, pricing, and reviews.
  • Market Intelligence: Monitor competitor offerings and pricing.
  • Lead Generation: Collect data from directories or listings.
  • Media Monitoring: Capture news headlines, article summaries, or author details.

Integrations

  • Easily integrates into Apify's cloud ecosystem.
  • Automate post-processing via Apify tasks, actors, or APIs.

Feedback & Issues

Your input is valuable! Report any issues or suggest new features via the Issues section on the Apify actor page.

Thanks for choosing AI Web Scraper!

Pricing

Pricing model

Pay per event 

This Actor is paid per event. You are not charged for Apify platform usage; you pay only a fixed price for specific events.

VISION_API_CALL

$0.010

One Vision API call for an image at a resolution of 1920 x 1080
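
To get a feel for what a run might cost, the arithmetic can be sketched as below. It uses the listed price of $0.010 per `VISION_API_CALL` and assumes one call per screenshot, with one screenshot per 1080-pixel viewport height when scrolling without overlap. The actual number of screenshots the actor takes per page is not documented here, so treat the result as a rough estimate only.

```python
import math
from decimal import Decimal

# Listed price per VISION_API_CALL event, in USD.
PRICE_PER_VISION_CALL = Decimal("0.010")


def screenshots_for_page(page_height_px: int,
                         viewport_height_px: int = 1080,
                         above_fold_only: bool = False) -> int:
    """Estimate screenshots for one page (assumes non-overlapping viewport steps)."""
    if above_fold_only:
        return 1
    return max(1, math.ceil(page_height_px / viewport_height_px))


def estimate_cost(screenshot_count: int) -> Decimal:
    """Estimated charge in USD, assuming one Vision API call per screenshot."""
    return PRICE_PER_VISION_CALL * screenshot_count


# Example: a hypothetical page 5400 px tall needs five 1080 px screenshots.
shots = screenshots_for_page(5400)
print(shots, estimate_cost(shots))
```

Under these assumptions, setting `above_fold_only` to true caps the estimate at one call per URL.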