Ai Web Scraper - Extract Data With Ease

eloquent_mountain/ai-web-scraper-extract-data-with-ease

Developed by

Paco

Maintained by Community

Ai Web Scraper makes scraping accessible to everyone, including non-technical users. It uses Google's Gemini LLM to scrape websites from natural language commands. It extracts data dynamically with no selector input, handles dynamic content and cookie consent banners, avoids bot detection, and outputs JSON or other formats.

Ai Web Scraper - Extract Data With Ease - Natural Language & Vision Scraper

The Ai Web Scraper is a powerful, flexible scraping tool powered by Google's Gemini LLM (Large Language Model). Instead of defining selectors, you describe the data you need in plain language, and the scraper locates and extracts it from webpages by having the model analyze screenshots.


🔥 What's New

  • Structured Output via Dynamic Schema: Automatically generates structured data outputs tailored precisely to your scraping instructions, improving data consistency and usability.
  • Enhanced Instruction Parsing: Improved AI understanding of natural language instructions, extracting clearer, database-ready item lists.
  • Playwright Integration: Faster, modern browser automation replaces Selenium.
  • Pay-Per-Event Charging: Charged each time the scraper analyzes a screenshot with Gemini.
  • Configurable Scrolling: Indicate whether pages use infinite scrolling or a static layout.
  • Above-the-Fold Analysis: Option to analyze only the visible part of the page (no scrolling).
  • Screenshot Saving: Optionally save captured screenshots to storage for debugging or auditing.

How It Works

  1. Natural Language Instructions

    • Tell the scraper in plain language what data you want:
    "Extract the product title, price, and description for products that have reviews"
  2. Intelligent Scrolling

    • Infinite Scrolling (has_infinite_scroll: true): Continuously scrolls until the page stops loading new content.
    • Static Page (has_infinite_scroll: false): Captures distinct screenshots ensuring no overlap or duplication.
    • Above-the-Fold Only (above_the_fold: true): Captures only the visible viewport without scrolling.
  3. Pay-Per-Event Charging

    • Each screenshot analyzed by Gemini counts as one event, so usage is easy to track and costs are easy to control.
  4. Automated Cookie Handling

    • Automatically detects and accepts cookie consent banners, reducing manual intervention.
  5. JSON Output

    • Data is returned in a structured form that is easy to export:
{
  "url": "https://example.com",
  "items": {
    "product title": "Sample Product",
    "product price": "$19.99",
    "product description": "Detailed description here..."
  }
}
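A record shaped like the sample above can be read with the Python standard library alone. The following is a minimal sketch, assuming every dataset item carries a `url` string and a flat `items` object as in the sample; `ScrapeResult` and `parse_result` are hypothetical names used for illustration and are not part of the Actor.

```python
import json
from dataclasses import dataclass, field


@dataclass
class ScrapeResult:
    """One scraped page: the source URL plus the extracted fields."""
    url: str
    items: dict = field(default_factory=dict)


def parse_result(raw: str) -> ScrapeResult:
    """Parse one JSON record; raise ValueError if the 'url' key is missing."""
    data = json.loads(raw)
    if "url" not in data:
        raise ValueError("record has no 'url' key")
    # Assumption: 'items' is a flat object keyed by the requested field names.
    return ScrapeResult(url=data["url"], items=data.get("items") or {})


sample = (
    '{"url": "https://example.com", '
    '"items": {"product title": "Sample Product", "product price": "$19.99"}}'
)
result = parse_result(sample)
print(result.items["product price"])
```

Because the field names come from your instructions rather than a fixed schema, looking them up with `result.items.get(...)` is the safer choice when instructions change between runs.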

Example Input Configuration

{
  "instructions": "Get product title, price, and description, for products that have reviews",
  "start_urls": [
    "https://www.ikea.com/nl/nl/p/onsevig-vloerkleed-laagpolig-veelkleurig-60497078/",
    "https://www.ikea.com/nl/nl/p/vedbak-vloerkleed-laagpolig-lichtgrijs-40528900/"
  ],
  "has_infinite_scroll": false,
  "save_screenshots": false,
  "above_fold_only": false
}
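The same configuration can also be submitted from a script. The sketch below is one possible approach, assuming the third-party `apify-client` package is installed and that the input keys match the example above exactly; `build_run_input` and `run_scraper` are hypothetical helper names, and the Actor ID is the one shown on this page.

```python
ACTOR_ID = "eloquent_mountain/ai-web-scraper-extract-data-with-ease"


def build_run_input(instructions, start_urls, has_infinite_scroll=False,
                    save_screenshots=False, above_fold_only=False):
    """Assemble an input dict using the keys from the example configuration."""
    if not instructions or not start_urls:
        raise ValueError("instructions and at least one start URL are required")
    return {
        "instructions": instructions,
        "start_urls": list(start_urls),
        "has_infinite_scroll": has_infinite_scroll,
        "save_screenshots": save_screenshots,
        "above_fold_only": above_fold_only,
    }


def run_scraper(token, run_input):
    """Start the Actor, wait for it to finish, and return its dataset items."""
    # Imported lazily so build_run_input works without the package installed.
    from apify_client import ApifyClient  # pip install apify-client

    client = ApifyClient(token)
    run = client.actor(ACTOR_ID).call(run_input=run_input)
    if run is None:
        return []
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


run_input = build_run_input(
    "Get product title, price, and description, for products that have reviews",
    ["https://www.ikea.com/nl/nl/p/vedbak-vloerkleed-laagpolig-lichtgrijs-40528900/"],
)
print(run_input)
```

Calling `run_scraper("<YOUR_APIFY_TOKEN>", run_input)` starts a billable run, so it is left out of the example on purpose.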

Important Notes

  • Pay-per-event: Every screenshot analysis counts as one event. Optimize your use to control costs.
  • Ai Accuracy: Clearly specified instructions improve extraction quality. Ambiguous instructions may yield inconsistent results.
  • Screenshot Storage: Enable screenshot saving for debugging purposes; screenshots will be stored in your Apify storage.
  • Legal Considerations: Always respect website terms of service and comply with applicable regulations like GDPR.

How to Use

  1. Apify Account: Sign up or log in.
  2. Setup Actor: Open the Actor on the Apify platform.
  3. Configure Inputs:
    • Specify your URLs and extraction instructions.
    • Indicate scrolling behavior (has_infinite_scroll) and optionally limit analysis to the visible area (above_the_fold).
    • Enable save_screenshots if needed.
  4. Run the Scraper: Click Start and let the scraper execute.
  5. Review Results: Access structured JSON data via the Apify dataset. Export to CSV, JSON, XLSX, etc.
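If you prefer to post-process results locally instead of using the platform export, the nested `items` object can be flattened into one row per page. This is a minimal sketch using only the standard library, assuming records have the `url` plus `items` shape shown in the JSON output example; `flatten` and `to_csv` are hypothetical helpers.

```python
import csv
import io


def flatten(record):
    """Merge the 'url' and the extracted fields into one flat row."""
    row = {"url": record.get("url", "")}
    row.update(record.get("items") or {})
    return row


def to_csv(records):
    """Render records as CSV text; columns are the union of all field names."""
    rows = [flatten(r) for r in records]
    # dict.fromkeys keeps first-seen order while dropping duplicate names.
    fieldnames = list(dict.fromkeys(key for row in rows for key in row))
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, restval="",
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


records = [
    {"url": "https://example.com",
     "items": {"product title": "Sample Product", "product price": "$19.99"}},
]
print(to_csv(records))
```

Rows that lack a field present in other rows get an empty cell, which keeps the columns aligned when different pages yield different fields.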

Use Cases

  • E-commerce: Prices, descriptions, and product reviews.
  • Market Research: Competitor price tracking.
  • Lead Generation: Extract B2B information from directories.
  • News & Blogs: Scrape headlines, article summaries, or authors.

Integrations

  • Seamlessly integrates with Apify’s cloud services.
  • Automate data processing with Apify tasks, Actors, and APIs.

Feedback & Issues

We welcome feedback! Report bugs or suggest enhancements through the Issues section on the Actor’s Apify page.

Thanks for choosing Ai Web Scraper!

Pricing

Pricing model

Pay per event 

This Actor is paid per event. You are not charged for Apify platform usage, only a fixed price for each of the events listed below.

  • Vision call to LLM: $0.010. Vision call to the LLM for processing an image of resolution 1920x1080.
  • Startup costs: $0.020. Startup costs for starting the Actor.
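The two event prices above make a rough cost estimate straightforward. The sketch below assumes the startup event is charged once per run and that each screenshot triggers exactly one vision call; both are readings of this page rather than confirmed billing rules, and `estimate_run_cost` is a hypothetical helper.

```python
from decimal import Decimal

VISION_CALL_PRICE = Decimal("0.010")  # USD per screenshot analyzed
STARTUP_PRICE = Decimal("0.020")      # USD, assumed to be charged once per run


def estimate_run_cost(screenshots_per_url):
    """Estimate the cost of one run from the screenshot count of each URL."""
    if any(count < 0 for count in screenshots_per_url):
        raise ValueError("screenshot counts cannot be negative")
    return STARTUP_PRICE + VISION_CALL_PRICE * sum(screenshots_per_url)


# Two URLs needing three screenshots each: 0.020 + 6 * 0.010
print(estimate_run_cost([3, 3]))  # 0.080
```

`Decimal` is used instead of `float` so that small per-event prices add up without rounding noise.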