3.3
Anti-Detection Enhancement Release
Major improvements to browser fingerprinting and anti-detection capabilities for more reliable scraping.
Added
PlaywrightCrawler with advanced fingerprinting support
Sec-Fetch headers (Sec-Fetch-Site, Sec-Fetch-Mode, Sec-Fetch-User, Sec-Fetch-Dest) for realistic browser requests
Warmup flow to establish natural browsing patterns before scraping
Enhanced browser context configuration for better stealth
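As an illustration of the Sec-Fetch headers listed above, the following minimal sketch builds the four headers for a top-level page navigation. The values are the ones browsers send for a user-initiated document navigation under the Fetch Metadata specification; the function name `sec_fetch_headers` and its `initiator` parameter are hypothetical, and the changelog does not state which values the actor actually sends.

```python
def sec_fetch_headers(initiator: str = "typed") -> dict[str, str]:
    """Return Sec-Fetch-* headers for a user-initiated top-level navigation.

    initiator (hypothetical parameter):
      "typed"       - URL entered directly, no referring site
      "same-origin" - link clicked on a page of the same origin
      "cross-site"  - link followed from a different site
    """
    site_values = {
        "typed": "none",
        "same-origin": "same-origin",
        "cross-site": "cross-site",
    }
    if initiator not in site_values:
        raise ValueError(f"unknown initiator: {initiator!r}")
    return {
        "Sec-Fetch-Site": site_values[initiator],
        "Sec-Fetch-Mode": "navigate",   # full page navigation
        "Sec-Fetch-User": "?1",         # navigation triggered by user activation
        "Sec-Fetch-Dest": "document",   # the response is a top-level document
    }


headers = sec_fetch_headers("same-origin")
```

A first request in a session would typically use the "typed" variant, while follow-up page loads reached by clicking through the site would use "same-origin".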
Changed
Upgraded crawler to use fingerprinting-enabled PlaywrightCrawler
Improved request headers to match real browser behavior
Enhanced anti-bot detection bypass mechanisms
3.2
Maintenance Release
Minor version bump for deployment updates.
Changed
3.1
Bug Fix and Documentation Release
This release addresses critical bugs, improves reliability, and provides a complete documentation rewrite.
Fixed
Fixed Docker container startup command syntax
Added proper error handling for actor initialization
Fixed input validation so that maxResults is optional and defaults to 100
Fixed race condition in concurrent request counting
Fixed pagination logic to track enqueued vs scraped items separately
Fixed link enqueueing to respect configured limits
Fixed hostname matching for international Yelp domains
Fixed result count reporting accuracy
Fixed configuration file paths
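The pagination and link-enqueueing fixes above amount to keeping two separate counters: how many detail pages have been enqueued and how many have actually been scraped. A minimal sketch of that idea follows; the class `ResultBudget` and its method names are hypothetical and not taken from the actor's code.

```python
from dataclasses import dataclass


@dataclass
class ResultBudget:
    """Tracks enqueued and scraped items separately against one limit."""

    max_results: int
    enqueued: int = 0
    scraped: int = 0

    def reserve(self, found: int) -> int:
        """Reserve room for up to `found` links; return how many may be enqueued."""
        allowed = max(0, min(found, self.max_results - self.enqueued))
        self.enqueued += allowed
        return allowed

    def record_scraped(self) -> None:
        """Call once per detail page that was scraped successfully."""
        self.scraped += 1

    @property
    def should_paginate(self) -> bool:
        """Keep requesting result pages only while the enqueue limit is not reached."""
        return self.enqueued < self.max_results


budget = ResultBudget(max_results=25)
first = budget.reserve(10)   # 10 links on page 1, all fit
second = budget.reserve(10)  # 10 links on page 2, all fit
third = budget.reserve(10)   # only 5 slots remain
```

Deciding whether to paginate from `enqueued` rather than `scraped` stops the crawler from requesting further result pages while already-queued detail pages are still waiting to be processed.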
Changed
Updated Playwright dependency version for better compatibility
Documentation
Completely rewrote README with comprehensive documentation
Added usage examples, supported domains list, and technical details
3.0
Major Release - Production-Ready Multi-Country Scraper
Complete rewrite focused on reliability, international support, and robust error handling.
Added
Support for 32 international Yelp domains
Language-specific selectors for 13 non-English countries
Localized search terms for all supported countries
Increased concurrent requests for better performance
Support for up to 200 results per search
Smart pagination with automatic stopping at result limit
Dynamic request calculation based on max results
Triple retry logic with exponential backoff
Fallback selectors for robust data extraction
Saving of partial data when a page only partly fails
Graceful degradation for individual page errors
Extended timeouts for slow pages
Comprehensive data extraction with 17 fields
Apify residential proxy integration
Realistic browser headers for anti-detection
Automation detection bypass
Browser fingerprinting protection
Automated test suites for configuration validation
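The retry behavior listed above can be sketched as follows, reading "triple retry" as three attempts in total, which is an assumption. The helper name `retry_with_backoff`, the one-second base delay, and the doubling factor are also illustrative choices rather than the actor's documented settings.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    task: Callable[[], T],
    attempts: int = 3,           # assumed meaning of "triple retry"
    base_delay: float = 1.0,     # assumed base delay in seconds
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `task`, retrying on any exception with exponentially growing delays."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return task()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...
    raise AssertionError("unreachable")
```

Passing `sleep` as a parameter keeps the helper easy to test without real waiting, for example by collecting the requested delays in a list.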
Changed
Increased default max results from 5 to 50
Added input validation with min/max limits
Improved crawler configuration for better performance
Updated selector strategy with fallback support
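A minimal sketch of the input validation described above, assuming the limit is read from a `maxResults` input field. The default of 50 and the upper bound of 200 come from this release's notes; the lower bound of 1, the helper name `resolve_max_results`, and the choice to reject rather than clamp out-of-range values are assumptions.

```python
from typing import Any, Mapping


def resolve_max_results(
    actor_input: Mapping[str, Any],
    default: int = 50,    # default stated for this release
    minimum: int = 1,     # assumed lower bound
    maximum: int = 200,   # "up to 200 results per search"
) -> int:
    """Return a validated maxResults value, falling back to the default."""
    value = actor_input.get("maxResults")
    if value is None:
        return default
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(value, bool) or not isinstance(value, int):
        raise ValueError("maxResults must be an integer")
    if not minimum <= value <= maximum:
        raise ValueError(f"maxResults must be between {minimum} and {maximum}")
    return value


limit = resolve_max_results({"maxResults": 120})
```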
Fixed
Resolved 403 errors with proper proxy configuration
Fixed pagination infinite loops
Addressed memory leaks by limiting the number of photos and reviews collected
Improved selector failure handling
Fixed country URL matching for all TLDs
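One way to match country URLs across TLDs is to compare the parsed hostname against an explicit set of registered domains, accepting subdomains such as `www`. The sketch below uses only a small illustrative subset of Yelp domains, not the actor's full list of 32, and the function name `is_yelp_url` is hypothetical.

```python
from urllib.parse import urlsplit

# Illustrative subset only; the actor's full domain list is not reproduced here.
YELP_DOMAINS = frozenset({
    "yelp.com",
    "yelp.ca",
    "yelp.co.uk",
    "yelp.de",
    "yelp.fr",
    "yelp.com.au",
})


def is_yelp_url(url: str) -> bool:
    """Return True if the URL's host is a known Yelp domain or a subdomain of one."""
    host = (urlsplit(url).hostname or "").rstrip(".")
    # Require an exact match or a dot boundary so "notyelp.com" is rejected.
    return any(host == d or host.endswith("." + d) for d in YELP_DOMAINS)
```

Matching on the parsed hostname with a dot boundary, rather than searching the whole URL for "yelp", avoids false positives such as `yelp.com.example.org` and handles multi-part TLDs like `co.uk` and `com.au` uniformly.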
2.2.0
Previous Release
Added
Basic support for 32 countries
Simple selector configuration
Changed
Single concurrent request processing
Basic error handling