Universal Web Extractor V8

Pricing: Pay per event

Flexible web extractor using Python + Playwright or HTTP. Supports CSS-based field extraction, HTML snapshots, screenshots, metadata, monitoring mode, and link-following. Ideal for scraping product pages, listings, news articles, tech profiles, or universal structured data from any website.

Developer: Leoncio Jr Coronado (Maintained by Community)

📘 Universal Web Extractor V8: Hybrid Playwright + BeautifulSoup Web Scraper

Flexible. Powerful. Universal. Extract structured data from any website — static or dynamic — in seconds.

✨ Overview

Universal Web Extractor V8 is a hybrid web scraping engine designed for maximum reliability and flexibility:

Dynamic websites → Uses Playwright (headless browser)

Static websites → Uses BeautifulSoup (super-fast HTML parsing)
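The two modes above boil down to a simple dispatch on one flag. A minimal sketch of that idea, where `fetch_with_playwright` and `fetch_static` are hypothetical stand-ins (not the Actor's actual internals):

```python
def fetch_page(url, use_playwright):
    """Dispatch to the right engine based on the use_playwright flag."""
    if use_playwright:
        return fetch_with_playwright(url)   # headless browser: renders JS first
    return fetch_static(url)                # plain HTTP fetch, parsed with BeautifulSoup


# Hypothetical stubs so the sketch is self-contained; a real Actor would call
# playwright.sync_api and an HTTP client here instead.
def fetch_with_playwright(url):
    return "<html><!-- rendered by headless browser --></html>"


def fetch_static(url):
    return "<html><!-- fetched over plain HTTP --></html>"
```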

This Actor automatically:

Extracts custom fields using CSS selectors

Follows pagination

Supports lists, product pages, article content, job listings, profiles, price tracking, and more

Stores structured data in the dataset

Captures HTML snapshots and screenshots (optional)

🚀 Use Cases

🛒 E-commerce

Titles, prices, images, descriptions

Pagination across multi-page listings

📰 News & Articles

Headlines, authors, publish dates

Article body extraction

🏢 Business Data

Company names, reviews, contact details

Tech stack profiling (via selectors)

📊 Analytics & Automation

Monitoring pages periodically

Creating datasets for machine learning models

Feeding data into CRMs, APIs, or workflows

🧠 Features

| Feature | Playwright Mode | Soup Mode |
|---|---|---|
| Handles JavaScript | ✅ | ❌ |
| Fast & lightweight | ❌ | ✅ |
| CSS field extraction | ✅ | ✅ |
| HTML snapshots | ✔ Optional | ✔ Optional |
| Screenshots | ✔ Optional | ❌ |
| Pagination support | ✅ | ✅ |

🛠 How It Works

You simply provide:

✔ start_urls
✔ fields (e.g., title=h1, price=.product-price)
✔ link_selector (optional pagination)
✔ mode (use_playwright: true|false)

The extractor will:

Fetch each start URL

Extract desired fields

Follow pagination (if enabled)

Save results to the dataset

Save HTML snapshots / screenshots (optional)
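The fetch → extract → save steps above can be sketched in a few lines. This is an illustrative single-pass version (no pagination or snapshots), assuming BeautifulSoup is installed; `fetch` is an injected callable so either engine can plug in:

```python
from bs4 import BeautifulSoup  # third-party: pip install beautifulsoup4


def run_extractor(start_urls, fields, fetch, dataset):
    """Fetch each start URL, extract the requested fields via CSS selectors,
    and append one item per page to `dataset` (a plain list here)."""
    # Split "name=selector" on the first "=" only, so selectors may contain "=".
    specs = dict(spec.split("=", 1) for spec in fields)
    for url in start_urls:
        soup = BeautifulSoup(fetch(url), "html.parser")
        item = {"url": url}
        for name, selector in specs.items():
            item[name] = [el.get_text(strip=True) for el in soup.select(selector)]
        dataset.append(item)
    return dataset
```

In the real Actor, `dataset.append` corresponds to pushing an item to the Apify dataset, and a `timestamp` field is added per item.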

🔧 Input Schema

```json
{
  "start_urls": ["https://example.com"],
  "fields": ["title=h1", "price=.price-tag"],
  "link_selector": ".next a",
  "use_playwright": false,
  "block_resources": true,
  "max_requests": 30,
  "max_depth": 3,
  "save_html_snapshot": true,
  "save_screenshot": false
}
```

📤 Output Format

Each dataset item contains:

```json
{
  "url": "https://example.com/product-1",
  "title": ["Product Name"],
  "price": ["$29.99"],
  "timestamp": "2025-01-01T12:00:00Z"
}
```

🧩 Field Extraction Guide

Provide CSS selectors in this format:

title=h1
price=.price
description=.product-description p
author=.post-author
quote=.text

You can extract any HTML element.
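A minimal sketch of how these `name=selector` specs can be parsed and applied, assuming BeautifulSoup is installed (the function names are illustrative, not the Actor's actual code):

```python
from bs4 import BeautifulSoup  # third-party: pip install beautifulsoup4


def parse_field_specs(fields):
    """Split 'name=selector' strings into a {name: selector} dict.
    Splits on the first '=' only, so the selector itself may contain '='."""
    return dict(spec.split("=", 1) for spec in fields)


def extract_fields(html, fields):
    """Apply each CSS selector and collect matching text, one list per field."""
    soup = BeautifulSoup(html, "html.parser")
    return {
        name: [el.get_text(strip=True) for el in soup.select(selector)]
        for name, selector in parse_field_specs(fields).items()
    }
```

Note that each field yields a list (matching the Output Format above), since a selector like `.text` can match many elements on one page.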

🧭 Pagination

Enable automatic pagination by using: "link_selector": ".next a"

Increase depth if you want more pages: "max_depth": 5
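The pagination behavior can be sketched as a depth-limited loop: follow the first element matching `link_selector` on each page until `max_depth` or `max_requests` is hit. `fetch(url)` is an injected callable returning HTML, so the sketch runs against a fake site (assumes BeautifulSoup is installed):

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup  # third-party: pip install beautifulsoup4


def crawl(start_url, link_selector, fetch, max_depth=3, max_requests=30):
    """Follow pagination links, returning the list of URLs visited in order."""
    visited, url, depth = [], start_url, 0
    while url and len(visited) < max_requests and depth <= max_depth:
        html = fetch(url)
        visited.append(url)
        # First element matching the selector, e.g. ".next a"
        next_el = BeautifulSoup(html, "html.parser").select_one(link_selector)
        # Resolve relative hrefs against the current page; stop if no link found
        url = urljoin(url, next_el["href"]) if next_el and next_el.has_attr("href") else None
        depth += 1
    return visited
```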

🖼 Snapshots & Screenshots

Enable full page snapshots: "save_html_snapshot": true

Enable screenshots (Playwright only): "save_screenshot": true

Snapshots are stored in the Key-value Store.

⚡ Modes Explained

Use Playwright when:

JS-heavy website

Infinite scroll

Protected elements

Dynamic rendering

Use BeautifulSoup when:

Fast crawling needed

Static HTML

API-like speed desired

🔐 Advanced Tips

Block images + fonts (faster):

"block_resources": true

Limit the crawl to a single request:

"max_requests": 1
"max_depth": 0

Perfect for testing.
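In Playwright mode, resource blocking is typically done with a route handler that aborts heavy requests. A sketch under that assumption; the exact blocked set is illustrative, and `page` is duck-typed here (any object with Playwright's `page.route(pattern, handler)` shape) so the logic can be exercised without a browser:

```python
# Assumed blocked set; the Actor's actual list may differ.
BLOCKED_RESOURCE_TYPES = {"image", "font"}


def should_block(resource_type):
    """Decide whether a request should be aborted to speed up crawling."""
    return resource_type in BLOCKED_RESOURCE_TYPES


def install_resource_blocking(page):
    """Register a catch-all route that aborts images/fonts and lets
    everything else continue (mirrors Playwright's Route API)."""
    def handler(route):
        if should_block(route.request.resource_type):
            route.abort()
        else:
            route.continue_()
    page.route("**/*", handler)
```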

🏁 Example: Quotes to Scrape

```json
{
  "start_urls": ["http://quotes.toscrape.com"],
  "fields": ["title=h1", "quote=.text"],
  "link_selector": ".next a",
  "use_playwright": false,
  "max_requests": 10,
  "max_depth": 3
}
```

🧨 Notes / Limitations

Some sites may block Playwright (rare)

Large HTML snapshots may slow down KV storage

CAPTCHA-protected sites are not supported

❤️ Created by Leoncio Jr Coronado

Apify Developer • Web Scraping Engineer • Automation Specialist

Need a custom scraping solution? Available for projects via LinkedIn, Upwork, or Fiverr.