Universal Web Extractor V8

Pricing

Pay per event

Flexible web extractor using Python + Playwright or HTTP. Supports CSS-based field extraction, HTML snapshots, screenshots, metadata, monitoring mode, and link-following. Ideal for scraping product pages, listings, news articles, tech profiles, or universal structured data from any website.

Developer

Leoncio Jr Coronado

Maintained by Community

Last modified: 12 hours ago

Categories

Automation

Developer tools

E-commerce

📘 Universal Web Extractor V8

Hybrid Playwright + BeautifulSoup Web Scraper

Flexible. Powerful. Universal. Extract structured data from any website — static or dynamic — in seconds.

✨ Overview

Universal Web Extractor V8 is a hybrid web scraping engine designed for maximum reliability and flexibility:

Dynamic websites → Uses Playwright (headless browser)

Static websites → Uses BeautifulSoup (super-fast HTML parsing)

This Actor automatically:

Extracts custom fields using CSS selectors

Follows pagination

Supports lists, product pages, article content, job listings, profiles, price tracking, and more

Stores structured data in the dataset

Captures HTML snapshots and screenshots (optional)

🚀 Use Cases

🛒 E-commerce

Titles, prices, images, descriptions

Pagination across multi-page listings

📰 News & Articles

Headlines, authors, publish dates

Article body extraction

🏢 Business Data

Company names, reviews, contact details

Tech stack profiling (via selectors)

📊 Analytics & Automation

Monitoring pages periodically

Creating datasets for machine learning models

Feeding data into CRMs, APIs, or workflows
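For the monitoring use case, one way to spot changes between two runs is to fingerprint each dataset item and compare. The sketch below is only an illustration: `fingerprint` and `changed_urls` are hypothetical helper names, not part of the Actor, and it assumes items carry the `url` and `timestamp` keys shown in the output format.

```python
import hashlib
import json


def fingerprint(item, ignore=("timestamp",)):
    """Hash the extracted fields, skipping keys that change on every run."""
    stable = {key: value for key, value in item.items() if key not in ignore}
    payload = json.dumps(stable, sort_keys=True, ensure_ascii=False)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def changed_urls(previous, current):
    """Return URLs that are new or whose extracted fields differ from the previous run."""
    old = {item["url"]: fingerprint(item) for item in previous}
    return [item["url"] for item in current if old.get(item["url"]) != fingerprint(item)]
```

Ignoring `timestamp` keeps an unchanged page from being reported as changed just because it was scraped at a different time.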

🧠 Features

| Feature | Playwright Mode | Soup Mode |
| --- | --- | --- |
| Handles JavaScript | ✅ | ❌ |
| Fast & lightweight | ❌ | ✅ |
| CSS field extraction | ✅ | ✅ |
| HTML snapshots | ✔ Optional | ✔ Optional |
| Screenshots | ✔ Optional | ❌ |
| Pagination support | ✅ | ✅ |

🛠 How It Works

You simply provide:

✔ start_urls

✔ fields (e.g., title=h1, price=.product-price)

✔ link_selector (optional pagination)

✔ mode (use_playwright: true|false)

The extractor will:

Fetch each start URL

Extract desired fields

Follow pagination (if enabled)

Save results to the dataset

Save HTML snapshots / screenshots (optional)
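The steps above amount to a bounded breadth-first crawl. The following is a minimal sketch of that loop, not the Actor's actual code: `crawl` and the `fetch` callback are hypothetical, and it assumes `max_depth` counts link hops from a start URL and `max_requests` caps the total number of pages fetched.

```python
from collections import deque


def crawl(start_urls, fetch, max_requests=30, max_depth=3):
    """Visit pages breadth-first, honoring a request budget and a depth limit.

    `fetch(url)` is a caller-supplied function returning (record, links),
    where `record` is a dict of extracted fields and `links` is a list of URLs.
    """
    unique_starts = list(dict.fromkeys(start_urls))  # drop duplicates, keep order
    queue = deque((url, 0) for url in unique_starts)
    seen = set(unique_starts)
    results = []
    while queue and len(results) < max_requests:
        url, depth = queue.popleft()
        record, links = fetch(url)
        results.append({"url": url, **record})
        if depth >= max_depth:
            continue  # depth limit reached: do not follow links from this page
        for link in links:
            if link not in seen:
                seen.add(link)
                queue.append((link, depth + 1))
    return results
```

Passing the fetcher in as a callback keeps the loop identical for both modes; only the function that turns a URL into fields and links would differ.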

🔧 Input Schema

{
  "start_urls": ["https://example.com"],
  "fields": ["title=h1", "price=.price-tag"],
  "link_selector": ".next a",
  "use_playwright": false,
  "block_resources": true,
  "max_requests": 30,
  "max_depth": 3,
  "save_html_snapshot": true,
  "save_screenshot": false
}
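Before starting a run it can help to sanity-check the input locally. The sketch below is a hypothetical pre-flight check, not validation performed by the Actor itself; `validate_input` is an illustrative name, and the rules are inferred from this README (for example, screenshots are described as Playwright only).

```python
def validate_input(raw):
    """Return a list of human-readable problems found in an input dict."""
    errors = []

    start_urls = raw.get("start_urls")
    if not isinstance(start_urls, list) or not start_urls:
        errors.append("start_urls must be a non-empty list")
    elif not all(isinstance(u, str) and u.startswith(("http://", "https://")) for u in start_urls):
        errors.append("every start URL must begin with http:// or https://")

    for spec in raw.get("fields", []):
        # Each field is expected in name=selector form, e.g. "title=h1".
        name, sep, selector = str(spec).partition("=")
        if not sep or not name.strip() or not selector.strip():
            errors.append(f"field {spec!r} is not in name=selector form")

    for key in ("max_requests", "max_depth"):
        value = raw.get(key)
        if value is not None and (isinstance(value, bool) or not isinstance(value, int) or value < 0):
            errors.append(f"{key} must be a non-negative integer")

    if raw.get("save_screenshot") and not raw.get("use_playwright"):
        errors.append("save_screenshot requires use_playwright: true")

    return errors
```

An empty list means the input passed these checks; it does not guarantee that the selectors match anything on the target site.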

📤 Output Format

Each dataset item contains:

{
  "url": "https://example.com/product-1",
  "title": ["Product Name"],
  "price": ["$29.99"],
  "timestamp": "2025-01-01T12:00:00Z"
}
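In the example item, each extracted field is a list, presumably one entry per matching element. Downstream tools often want a single value per field, so a small post-processing step can flatten the lists. This is a sketch under that assumption; `flatten_item` is a hypothetical helper, not something the Actor provides.

```python
def flatten_item(item, sep=None):
    """Collapse list-valued fields to a single value.

    With sep=None the first match is kept (or None if there were no matches);
    otherwise all matches are joined with `sep`.
    """
    flat = {}
    for key, value in item.items():
        if isinstance(value, list):
            if sep is None:
                flat[key] = value[0] if value else None
            else:
                flat[key] = sep.join(str(v) for v in value)
        else:
            flat[key] = value  # url, timestamp, and other scalars pass through
    return flat
```

Keeping the first match suits single-value fields such as a product title, while joining suits fields like article paragraphs.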

🧩 Field Extraction Guide

Provide CSS selectors in this format:

title=h1

price=.price

description=.product-description p

author=.post-author

quote=.text

You can extract any HTML element.
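To make the name=selector format concrete, here is a minimal sketch of how such strings could be split into a field map. `parse_fields` is a hypothetical helper and the Actor's own parsing may differ; the sketch splits on the first `=` only, so selectors that contain `=` (such as attribute selectors) stay intact.

```python
def parse_fields(specs):
    """Turn ["title=h1", ...] into {"title": "h1", ...}."""
    fields = {}
    for spec in specs:
        name, sep, selector = spec.partition("=")  # split on the first "=" only
        name, selector = name.strip(), selector.strip()
        if not sep or not name or not selector:
            raise ValueError(f"expected name=selector, got {spec!r}")
        fields[name] = selector
    return fields
```

With BeautifulSoup installed, each selector in the resulting map could then be passed to `soup.select(selector)` to collect the matching elements.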

🧭 Pagination

Enable automatic pagination by using: "link_selector": ".next a"

Increase depth if you want more pages: "max_depth": 5
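Links matched by `link_selector` are often relative (for example `/page/2/`), so they need to be resolved against the current page URL before they can be queued. The sketch below shows one way to do that with the standard library; `resolve_links` is a hypothetical helper, and the decision to drop fragments and non-HTTP links is an assumption rather than documented Actor behavior.

```python
from urllib.parse import urldefrag, urljoin


def resolve_links(page_url, hrefs):
    """Resolve hrefs against the page URL, dropping fragments, duplicates, and non-HTTP links."""
    resolved = []
    for href in hrefs:
        absolute, _fragment = urldefrag(urljoin(page_url, href))
        if absolute.startswith(("http://", "https://")) and absolute not in resolved:
            resolved.append(absolute)
    return resolved
```

For instance, `resolve_links("http://quotes.toscrape.com/page/1/", ["/page/2/"])` yields the absolute URL of the second page.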

🖼 Snapshots & Screenshots

Enable full page snapshots: "save_html_snapshot": true

Enable screenshots (Playwright only): "save_screenshot": true

Snapshots are stored in the Key-value Store.
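Key-value store keys are restricted in length and character set, so a raw URL is generally not usable as a key. This README does not say how the Actor names its snapshot records; the sketch below shows one plausible scheme, with `snapshot_key` and the `html-` prefix both being assumptions made for illustration.

```python
import hashlib
import re
from urllib.parse import urlsplit


def snapshot_key(url, kind="html"):
    """Build a short, deterministic key from a URL using only letters, digits, dots, and hyphens."""
    host = urlsplit(url).hostname or "unknown"
    safe_host = re.sub(r"[^a-zA-Z0-9.-]", "-", host)
    digest = hashlib.sha256(url.encode("utf-8")).hexdigest()[:16]
    return f"{kind}-{safe_host}-{digest}"
```

Hashing the full URL keeps keys unique per page while the host prefix keeps them readable when browsing the store.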

⚡ Modes Explained

Use Playwright when:

JS-heavy website

Infinite scroll

Protected elements

Dynamic rendering

Use BeautifulSoup when:

Fast crawling needed

Static HTML

API-like speed desired

🔐 Advanced Tips

Block images + fonts (faster) "block_resources": true
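As a rough illustration of what resource blocking means in Playwright mode, the sketch below decides which request types to skip. `should_block` and `BLOCKED_RESOURCE_TYPES` are hypothetical names, and the set of blocked types is taken from the tip above (images and fonts); how the Actor wires this up internally is not documented here.

```python
# Resource types to skip, following the "images + fonts" tip above.
BLOCKED_RESOURCE_TYPES = {"image", "font"}


def should_block(resource_type, block_resources=True):
    """Return True when a request of this type should be aborted."""
    return block_resources and resource_type in BLOCKED_RESOURCE_TYPES


# With Playwright's sync API, the predicate could be used in a route handler:
#
#   def handle(route):
#       if should_block(route.request.resource_type):
#           route.abort()
#       else:
#           route.continue_()
#
#   page.route("**/*", handle)
```

Keeping the decision in a plain function makes it easy to test without launching a browser.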

Limit the run to a single page (perfect for testing): "max_requests": 1, "max_depth": 0

🏁 Example: Quotes to Scrape

{
  "start_urls": ["http://quotes.toscrape.com"],
  "fields": ["title=h1", "quote=.text"],
  "link_selector": ".next a",
  "use_playwright": false,
  "max_requests": 10,
  "max_depth": 3
}

🧨 Notes / Limitations

Some sites may block Playwright (rare)

Large HTML snapshots may slow down KV storage

CAPTCHA-protected sites are not supported

❤️ Created by Leoncio Jr Coronado

Apify Developer • Web Scraping Engineer • Automation Specialist

If you need custom scraping solutions: LinkedIn / Upwork / Fiverr — Available for projects
