Universal Web Scraper — Custom Data Extraction Starter Template
Pricing
Pay per usage
Customizable web scraping template with proxy support, rate limiting, and error handling. Fork it and make it yours.
Developer: Creator Fusion
Universal web scraper template. Customizable extraction, proxy support, rate limiting, and error handling. Fork and customize for any site.
Don't want to build a scraper from scratch? This is a full-featured template covering what production scraping needs: proxy rotation, smart retry logic, rate limiting, JavaScript rendering, structured output, and error handling. Fork it, customize the CSS selectors, set your target URL, and deploy. It works for e-commerce, real estate, job boards, and directories: any site with structured data.
⚡ What You Get
Universal Web Scraping Template
├── Features Included (Production-Ready)
│   ├── Configuration
│   │   ├── Target URL Input: Customizable ✓
│   │   ├── Proxy Support: Enabled ✓
│   │   ├── Rate Limiting: Configurable ✓
│   │   ├── Timeout Handling: Included ✓
│   │   └── Retry Logic: Intelligent ✓
│   ├── Extraction
│   │   ├── CSS Selectors: Fully customizable
│   │   ├── XPath Support: Yes
│   │   ├── Regex Patterns: Yes
│   │   ├── JavaScript Rendering: Full Chromium
│   │   └── Dynamic Content: Handled
│   ├── Data Processing
│   │   ├── Text Cleaning: Automatic
│   │   ├── Data Normalization: Included
│   │   ├── Type Conversion: Automatic
│   │   ├── Duplicate Removal: Optional
│   │   └── JSON Validation: Built-in
│   └── Error Handling
│       ├── Network Failures: Auto-retry with backoff
│       ├── Missing Elements: Graceful degradation
│       ├── Timeout Recovery: Configurable
│       ├── Proxy Rotation on Block: Automatic
│       └── Error Logging: Detailed
├── Example Use Cases 👈 All possible with minimal customization
│   ├── E-commerce: Product titles, prices, availability
│   ├── Real Estate: Listings, prices, property details
│   ├── Job Boards: Job titles, companies, salaries
│   ├── Directories: Names, contacts, addresses
│   ├── News: Articles, dates, authors
│   ├── Reviews: Ratings, review text, reviewer names
│   └── Custom Sites: Any structured HTML data
├── Customization Process (30 minutes)
│   ├── Step 1: Set target URL
│   ├── Step 2: Inspect element, grab CSS selectors
│   ├── Step 3: Update selector variables in code
│   ├── Step 4: Test extraction locally
│   ├── Step 5: Deploy and run
│   └── Result: Fully working scraper
├── Output Format
│   ├── Format: Clean JSON
│   ├── Schema: Validates automatically
│   ├── Fields: Customizable
│   ├── Pagination: Auto-handled
│   └── Ready for: Databases, APIs, downstream processing
└── Built-in Defaults
    ├── Proxy Type: Residential (included)
    ├── Rate Limit: 1 req/2s (respectful)
    ├── Timeout: 30s per request
    ├── Retries: 3 with exponential backoff
    ├── JavaScript Rendering: Enabled by default
    └── Output: Validated JSON
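The Data Processing branch above (text cleaning, type conversion, optional duplicate removal) could look roughly like the sketch below. This is not the template's actual code: the function names `clean_text`, `parse_price`, and `dedupe_by_url` are hypothetical, and keying duplicates on the `url` field is an assumption made for illustration.

```python
import re
from typing import Any, Dict, List, Optional


def clean_text(raw: str) -> str:
    """Collapse runs of whitespace (including newlines) and trim both ends."""
    return re.sub(r"\s+", " ", raw).strip()


def parse_price(raw: str) -> Optional[float]:
    """Convert a price string such as '$1,299.00' to a float.

    Returns None when the string contains no number, so a missing
    price degrades gracefully instead of raising.
    """
    match = re.search(r"\d[\d,]*(?:\.\d+)?", raw)
    if match is None:
        return None
    return float(match.group().replace(",", ""))


def dedupe_by_url(items: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Keep the first item seen for each URL (hypothetical dedupe key)."""
    seen = set()
    unique = []
    for item in items:
        url = item.get("url")
        if url in seen:
            continue
        seen.add(url)
        unique.append(item)
    return unique
```

Returning `None` for an unparseable price, rather than raising, matches the "graceful degradation" idea: one odd listing should not abort the whole run.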
🎯 Use Cases
- E-commerce Data: Scrape product catalogs, prices, availability. Feed into price comparison apps or inventory tracking.
- Real Estate: Extract listings, prices, property features. Build your own MLS alternative.
- Job Boards: Scrape listings from job boards that don't offer an API. Aggregate and resell.
- Business Directories: Compile lists of companies, contacts, information from scattered sources.
- Market Research: Gather pricing, features, reviews from competitor sites systematically.
- Lead Generation: Scrape B2B directories, compile prospect lists, extract contact info.
- Content Aggregation: Pull articles, news, research from multiple sources into one feed.
📊 Sample Output
{
  "actor_configuration": {
    "target_url": "https://example-ecommerce.com/products",
    "proxy_enabled": true,
    "proxy_type": "residential",
    "rate_limit_requests_per_second": 0.5,
    "timeout_seconds": 30,
    "max_retries": 3,
    "javascript_rendering": true
  },
  "extraction_config": {
    "selectors": {
      "product_container": ".product-item",
      "product_title": "h2.product-name",
      "product_price": ".price-current",
      "product_rating": ".rating-value",
      "product_url": "a.product-link"
    },
    "pagination": {
      "next_page_selector": "a.next-page",
      "max_pages": 10
    }
  },
  "scrape_results": {
    "total_items_scraped": 487,
    "successful_requests": 487,
    "failed_requests": 0,
    "execution_time_seconds": 284,
    "items_per_minute": 103
  },
  "extracted_data": [
    {
      "title": "Wireless Headphones Pro",
      "price": 199.99,
      "currency": "USD",
      "rating": 4.5,
      "reviews_count": 234,
      "in_stock": true,
      "url": "https://example-ecommerce.com/products/headphones-pro"
    },
    {
      "title": "USB-C Charging Cable",
      "price": 24.99,
      "currency": "USD",
      "rating": 4.7,
      "reviews_count": 1245,
      "in_stock": true,
      "url": "https://example-ecommerce.com/products/usb-c-cable"
    }
  ],
  "data_quality": {
    "validation_passed": true,
    "missing_fields_count": 0,
    "type_errors": 0,
    "duplicate_count": 0,
    "data_integrity": "excellent"
  },
  "performance": {
    "average_request_time_ms": 587,
    "average_extraction_time_ms": 245,
    "proxy_rotation_count": 12,
    "blocks_encountered": 0,
    "ban_risk": "very_low"
  },
  "next_steps": [
    "Customize selectors for your target site",
    "Test extraction on live site",
    "Configure proxy and rate limiting",
    "Deploy and schedule regular runs"
  ]
}
Field Descriptions:
- selectors: CSS selectors for each data field you want to extract
- pagination: Configuration for multi-page scraping
- extracted_data: Array of clean, structured objects
- data_quality: Validation results (missing fields, errors)
- performance: Speed metrics and proxy rotation statistics
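To make the `data_quality` block more concrete, here is a minimal sketch of how such a summary might be computed from `extracted_data`. The expected field types are inferred from the two sample items above, and `summarize_quality` is a hypothetical helper, not a function the template is known to ship. The sketch also assumes that `validation_passed` depends only on missing fields and type errors, not on duplicates.

```python
from typing import Any, Dict, List

# Field types inferred from the sample items (an assumption, not a published schema).
EXPECTED_TYPES = {
    "title": str,
    "price": float,
    "currency": str,
    "rating": float,
    "reviews_count": int,
    "in_stock": bool,
    "url": str,
}


def _type_ok(value: Any, expected: type) -> bool:
    """Check a value against its expected type, treating bool strictly."""
    if isinstance(value, bool):
        # bool is a subclass of int in Python, so handle it explicitly.
        return expected is bool
    if expected is float:
        # JSON numbers such as 25 parse as int; accept them for float fields.
        return isinstance(value, (int, float))
    return isinstance(value, expected)


def summarize_quality(items: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Count missing fields, type errors, and duplicate URLs across items."""
    missing = type_errors = duplicates = 0
    seen_urls = set()
    for item in items:
        for field, expected in EXPECTED_TYPES.items():
            value = item.get(field)
            if value is None:
                missing += 1
            elif not _type_ok(value, expected):
                type_errors += 1
        url = item.get("url")
        if isinstance(url, str):
            if url in seen_urls:
                duplicates += 1
            seen_urls.add(url)
    return {
        "validation_passed": missing == 0 and type_errors == 0,
        "missing_fields_count": missing,
        "type_errors": type_errors,
        "duplicate_count": duplicates,
    }
```

Run against the two sample items, this reports zero missing fields, zero type errors, and zero duplicates, in line with the sample `data_quality` values.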
🔗 Integrations & Automation
Email Results: Daily email with scraped data, summaries, and status.
Webhook to API: Push results directly to your backend database.
Schedule Recurring: Run daily, weekly, or monthly. Always-fresh data.
Slack Updates: Get notifications when scrapes complete or hit errors.
REST API: Trigger custom scraping jobs on-demand.
MCP Compatible: AI agents can run custom scraping tasks.
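For the REST API option, a run can be started through Apify's public API by sending a POST request to the actor's `runs` endpoint. The sketch below only builds the request so it can be inspected before anything is sent. The actor ID `my-user~my-scraper`, the token value, and the input field names (borrowed from the sample `actor_configuration`) are placeholders and assumptions, since this listing does not document the actor's real input schema.

```python
import json
import urllib.request
from typing import Any, Dict
from urllib.parse import quote

API_BASE = "https://api.apify.com/v2"


def build_run_request(
    actor_id: str, token: str, run_input: Dict[str, Any]
) -> urllib.request.Request:
    """Build (but do not send) a POST request that starts an actor run.

    actor_id uses the 'username~actor-name' form; the JSON body is
    passed to the actor as its input.
    """
    url = f"{API_BASE}/acts/{quote(actor_id, safe='~')}/runs"
    return urllib.request.Request(
        url,
        data=json.dumps(run_input).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


# Hypothetical input; field names are assumed from the sample output.
request = build_run_request(
    "my-user~my-scraper",
    "YOUR_APIFY_TOKEN",
    {"target_url": "https://example-ecommerce.com/products", "max_retries": 3},
)
# To actually start the run: urllib.request.urlopen(request, timeout=30)
```

Keeping request construction separate from sending makes it easy to log or test the payload without spending a paid run.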
🔌 Works Great With
- Super Stealth Scraper: If your target site blocks bots, use the stealth version instead.
- Website Tech Stack Detector: Analyze the target site's structure before building your scraper.
- Product Review Aggregator: The template works well for review scraping too.
- Invoice Receipt Extractor: The template handles document extraction as well.
💰 Cost & Performance
Typical run: Scrape 500 items from one site in 5 minutes for ~$1.80 (includes proxies).
That's $0.0036 per item. The whole run costs less than 10 minutes of manual copy-pasting at $25/hour (about $4.17).
Compare to manual: One person manually collecting 500 items takes 3+ hours. At $25/hour, that's $75+. We do it for $1.80. The data also stays fresh if you schedule daily runs.
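The arithmetic behind these figures is easy to rerun with your own numbers. The sketch below uses only the values quoted above ($1.80 per run, 500 items, $25/hour, 3 hours of manual work); the function names are illustrative.

```python
def cost_per_item(run_cost_usd: float, items: int) -> float:
    """Cost of one scraped item for a run of the given total cost."""
    return run_cost_usd / items


def manual_cost(hourly_rate_usd: float, hours: float) -> float:
    """Labor cost of collecting the same data by hand."""
    return hourly_rate_usd * hours


per_item = cost_per_item(1.80, 500)  # about 0.0036 USD per item
by_hand = manual_cost(25.0, 3.0)     # 75.0 USD for three hours
ratio = by_hand / 1.80               # manual cost relative to one run
print(f"per item: ${per_item:.4f}, manual: ${by_hand:.2f}, ratio: {ratio:.1f}x")
```

Swap in your own item count and hourly rate to see where the break-even point sits for smaller jobs.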
🛡️ Built Right
- Proxy rotation reduces the risk of IP bans
- Smart retries with exponential backoff
- JavaScript rendering for dynamic content
- Rate limiting keeps the load on the target server low
- Error handling tolerates missing fields instead of failing the run
- Data validation ensures clean output
- Duplicate detection optional, configurable
- Timeout protection prevents hanging requests
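Two of the points above, retries with exponential backoff and rate limiting, can be sketched in a few lines of standard-library Python. This is an illustration under stated assumptions, not the template's implementation: `retry_with_backoff` and `RateLimiter` are hypothetical names, the defaults mirror the listed ones (3 retries, 1 request per 2 seconds), and the 1-second base delay is an assumption.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    fn: Callable[[], T],
    max_retries: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying on any exception with delays of 1s, 2s, 4s, ...

    max_retries counts retries after the first attempt, so fn is called
    at most max_retries + 1 times. The last failure is re-raised.
    """
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except Exception:
            if attempt == max_retries:
                raise
            sleep(base_delay * 2 ** attempt)
    raise AssertionError("unreachable")


class RateLimiter:
    """Enforce a minimum interval between consecutive requests."""

    def __init__(
        self,
        min_interval: float = 2.0,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last: Optional[float] = None

    def wait(self) -> None:
        """Block until at least min_interval has passed since the last call."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

In use, you would call `limiter.wait()` before each request and wrap the request itself in `retry_with_backoff`. The injectable `sleep` and `clock` parameters exist so the timing logic can be tested without real delays.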
Getting Started
- Fork this actor to your Apify account
- Update CSS selectors to match your target site
- Test locally (we provide test scripts)
- Configure proxy (residential proxies included)
- Set rate limit (default: 1 req/2s, respectful)
- Deploy and run your first scrape
- Schedule recurring if you need fresh data daily
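The selector step above is where most of the customization happens. As a rough, dependency-free illustration of mapping fields to page elements, the sketch below matches elements by class name using Python's built-in `html.parser`. It is a simplified stand-in for real CSS selectors, not the template's extraction code: `ClassTextExtractor` is a hypothetical name, and the class names come from the sample `selectors` block.

```python
from html.parser import HTMLParser
from typing import Dict, List, Optional, Tuple

# Tags that never have a closing tag and so must not affect nesting depth.
VOID_TAGS = {"br", "hr", "img", "input", "meta", "link"}


class ClassTextExtractor(HTMLParser):
    """Collect the text of elements whose class attribute contains a target class."""

    def __init__(self, field_classes: Dict[str, str]) -> None:
        super().__init__()
        self.field_classes = field_classes  # field name -> class name
        self.records: List[Tuple[str, str]] = []
        self._field: Optional[str] = None
        self._depth = 0
        self._buf: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self._field is not None:
            self._depth += 1  # nested element inside a matched one
            return
        classes = (dict(attrs).get("class") or "").split()
        for field, cls in self.field_classes.items():
            if cls in classes:
                self._field, self._depth, self._buf = field, 1, []
                break

    def handle_endtag(self, tag):
        if self._field is None or tag in VOID_TAGS:
            return
        self._depth -= 1
        if self._depth == 0:
            # Matched element closed: store its whitespace-normalized text.
            text = " ".join("".join(self._buf).split())
            self.records.append((self._field, text))
            self._field = None

    def handle_data(self, data):
        if self._field is not None:
            self._buf.append(data)


html = (
    '<div class="product-item">'
    '<h2 class="product-name">Wireless <b>Headphones</b> Pro</h2>'
    '<span class="price-current">$199.99</span>'
    "</div>"
)
parser = ClassTextExtractor({"title": "product-name", "price": "price-current"})
parser.feed(html)
print(parser.records)
```

When adapting this idea to a real site, test the selectors against a saved copy of the page first, so selector mistakes show up before any proxy or compute cost is incurred.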
Fresh data. Zero guesswork. Be the first to know.
📧 Email alerts · 🔗 Webhook triggers · 🤖 MCP compatible · 📡 API access
Built by Creator Fusion — OSINT tools that actually work.