
Web Scraper
Pricing
$35.00 / 1,000 results

Simple web scraper. Extracts titles, paragraphs, links, images, tables, and more from websites. Supports custom CSS selectors and batch collection. For larger jobs, try Apify's Website Content Crawler.
Last modified
3 days ago
A simple and powerful web scraping tool with comprehensive data extraction capabilities.
⚠️ Usage Notice
This tool is for educational and research purposes only. Users must comply with website terms of service, robots.txt specifications, and relevant laws.
🚀 Quick Start
The simplest way to get started is to provide a list of URLs:
{"startUrls": [{"url": "https://example.com"}]}
📊 Output Fields
All data is output with English field names:
- Basic Info: url, title, scrapedAt, processingTimeMs
- Page Content: headings, paragraphs, links, images
- Structured Data: tables, forms, videos, buttons, navigation
- SEO Data: metadata, openGraph, structuredData, pageLanguage
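For downstream processing it can help to know which of these fields a given record actually carries. The sketch below copies the four groups listed above into a Python dictionary and reports the absent ones; `missing_fields` is a hypothetical helper, and whether the tool emits every field for every page is not stated here, so treat absent fields as possible rather than as errors.

```python
# Field groups copied from the list above.
FIELD_GROUPS = {
    "Basic Info": ["url", "title", "scrapedAt", "processingTimeMs"],
    "Page Content": ["headings", "paragraphs", "links", "images"],
    "Structured Data": ["tables", "forms", "videos", "buttons", "navigation"],
    "SEO Data": ["metadata", "openGraph", "structuredData", "pageLanguage"],
}


def missing_fields(record):
    """Return {group: [field, ...]} for documented fields absent from record."""
    missing = {}
    for group, fields in FIELD_GROUPS.items():
        absent = [name for name in fields if name not in record]
        if absent:
            missing[group] = absent
    return missing
```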
⚙️ Main Settings
| Setting | Description | Default |
|---|---|---|
| startUrls | List of URLs to scrape | Required |
| maxRequestsPerCrawl | Maximum number of pages to scrape | 100 |
| maxConcurrency | Maximum number of pages scraped in parallel | 2 |
| scrollToBottom | Auto-scroll to the bottom of the page to load more content | false |
| blockResources | Resource types to block for faster loading | [] |
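To preview the effective configuration of a run, the defaults in the table can be merged with a partial input. This is a sketch only: `with_defaults` is a hypothetical helper, and the tool applies its own defaults internally, so this mirrors the table rather than the tool's implementation.

```python
import copy

# Defaults as listed in the settings table.
DEFAULTS = {
    "maxRequestsPerCrawl": 100,
    "maxConcurrency": 2,
    "scrollToBottom": False,
    "blockResources": [],
}


def with_defaults(user_input):
    """Return user_input with the documented defaults filled in."""
    if not user_input.get("startUrls"):
        raise ValueError("startUrls is required")
    # Deep copy so the shared list in DEFAULTS is never mutated by callers.
    merged = copy.deepcopy(DEFAULTS)
    merged.update(user_input)
    return merged
```

For example, `with_defaults({"startUrls": [{"url": "https://example.com"}], "maxConcurrency": 1})` keeps the explicit `maxConcurrency` and fills in the other four settings.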
🎯 Custom Extraction Rules
Use CSS selectors to extract specific content:
{"extractionRules": {"article_title": "h1.article-title","author": ".author-name","publish_date": "time.publish-date","content": "article.content"}}
💡 Usage Examples
Scrape Multiple Pages
{"startUrls": [{"url": "https://example.com"},{"url": "https://example.org"}],"maxRequestsPerCrawl": 10}
Dynamic Content Websites
{"scrollToBottom": true,"waitForSelector": ".content-loaded","pageLoadTimeoutSecs": 20}
Speed Optimization
{"blockResources": ["image", "stylesheet", "font"],"smartMode": true,"maxConcurrency": 1}
📝 Output Example
{"url": "https://example.com/article","title": "Article Title","headings": {"h1": ["Main Title"],"h2": ["Subtitle One", "Subtitle Two"]},"paragraphs": ["First paragraph content...", "Second paragraph content..."],"links": [{"text": "Link text","href": "https://example.com/link"}],"images": [{"src": "https://example.com/image.jpg","alt": "Image description","width": 800,"height": 600}],"processingTimeMs": 2345,"scrapedAt": "2025-08-11T10:30:00.000Z"}
❓ FAQ
Q: What if scraping fails?
- Reduce maxConcurrency to 1
- Increase pageLoadTimeoutSecs
- Disable smartMode
Q: How to scrape dynamically loaded content?
- Set scrollToBottom: true
- Use waitForSelector to wait for elements
Q: How to speed up scraping?
- Use blockResources to block unnecessary resources
- Enable smartMode for automatic optimization
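The three steps suggested for failed runs can be applied to an existing input in one go. This Python sketch is hypothetical: `conservative_retry_input` is not part of the tool, doubling the timeout is an arbitrary choice, and the fallback of 20 seconds is taken from the dynamic-content example rather than from a documented default.

```python
def conservative_retry_input(run_input, fallback_timeout_secs=20):
    """Return a copy of run_input adjusted per the failure checklist."""
    retry = dict(run_input)
    # Step 1: reduce maxConcurrency to 1.
    retry["maxConcurrency"] = 1
    # Step 2: increase pageLoadTimeoutSecs (doubling is an assumption).
    current = run_input.get("pageLoadTimeoutSecs", fallback_timeout_secs)
    retry["pageLoadTimeoutSecs"] = current * 2
    # Step 3: disable smartMode.
    retry["smartMode"] = False
    return retry
```

The original input is left untouched, so the first attempt and the retry can be compared afterwards.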
📜 License
MIT License
Version: 0.1.0
Updated: August 11, 2025
Developer: FuturizeRush