[10.10.X]
Critical Fixes
Dockerfile Build Fix - Created check-playwright-version.mjs script that was missing, causing build failures
Race Condition Fix - Implemented Mutex class for thread-safe state management in concurrent scraping
Error Handling - Added comprehensive try/catch/finally blocks around main logic with proper cleanup
Dataset Push Safety - Wrapped all Dataset.pushData() calls in error handling
Phone Button Click Safety - Added element existence checks before clicking phone reveal buttons
State Propagation - Fixed state passing in enqueueLinks for proper tracking across pages
Security Fix - Removed exposed API key from README.md
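The race condition fix above can be illustrated with a minimal promise-based mutex. This is only a sketch of the general technique, not the Actor's actual Mutex class: the names `runExclusive`, `state`, and `tryReserveSlot` are hypothetical. In Node.js the risk is not true multi-threading but interleaving: two request handlers can both pass a "limit reached?" check before either one updates the counter, because an `await` sits between the check and the update.

```javascript
// Minimal promise-based mutex (illustrative only; not the Actor's real class).
class Mutex {
    constructor() {
        // Tail of the queue: settles when the last queued task has finished.
        this._tail = Promise.resolve();
    }

    // Run `task` only after every previously queued task has settled.
    runExclusive(task) {
        const result = this._tail.then(() => task());
        // Swallow rejections on the internal chain so one failed task
        // does not block the tasks queued after it.
        this._tail = result.catch(() => {});
        return result;
    }
}

// Hypothetical shared state: how many items have been pushed so far.
const state = { pushed: 0 };
const mutex = new Mutex();

// Reserve one output slot under the lock; resolves to false once the limit is hit.
async function tryReserveSlot(maxItems) {
    return mutex.runExclusive(async () => {
        if (state.pushed >= maxItems) return false;
        // Simulated async gap between the check and the update.
        await new Promise((resolve) => setTimeout(resolve, 1));
        state.pushed += 1;
        return true;
    });
}
```

Without the lock, ten concurrent callers could all pass the check before any of them increments the counter. With it, each check-and-update pair runs to completion before the next one starts.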
⚠️ Warning Fixes
Default Value Fix - Changed maxItems default from 10 to 5 to match input schema
Input Validation - Added comprehensive validation for country, maxItems, and searchQuery
Proxy Fallback - Added fallback mechanism for proxy configuration failures
Empty Catch Blocks - Replaced all empty catch blocks with proper debug logging
Timeout Constants - Defined named constants for all timeout values
Fragile Selectors - Added fallback selectors using data attributes for UK handler
waitForTimeout Anti-Pattern - Replaced with waitForSelector where possible
Image Deduplication - Changed from O(n) array includes to O(1) Set lookup
Required Field Validation - Added validation before pushing to dataset
ESLint Compatibility - Downgraded ESLint from v9 to v8.57.0 for @apify/eslint-config compatibility
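As a rough illustration of the image deduplication change listed above, the sketch below replaces repeated `Array.prototype.includes` checks (a linear scan per check) with a `Set` membership test (constant time on average). The function name `dedupeImageUrls` is hypothetical and the Actor's real code may be structured differently.

```javascript
// Deduplicate image URLs while preserving first-seen order.
// Hypothetical helper; shown only to illustrate the Set-based lookup.
function dedupeImageUrls(urls) {
    const seen = new Set();
    const unique = [];
    for (const url of urls) {
        // Skip empty or non-string entries instead of storing them.
        if (typeof url !== 'string' || url.length === 0) continue;
        // Set membership is O(1) on average, versus O(n) for Array.includes.
        if (seen.has(url)) continue;
        seen.add(url);
        unique.push(url);
    }
    return unique;
}
```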
New Features
Configurable Concurrency - Added maxConcurrency input parameter (1-10, default 5)
Configurable Proxy Group - Added proxyGroup input parameter (default "RESIDENTIAL")
Node.js Version Requirement - Added engines field requiring Node.js >= 22.0.0
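A minimal sketch of how the new parameters might be validated alongside the existing ones, assuming the defaults stated in this changelog (maxItems 5, maxConcurrency 5 within 1-10, proxyGroup "RESIDENTIAL"). The country codes and the `validateInput` name are assumptions made for illustration; the actual input schema may use different values.

```javascript
// Defaults taken from the changelog entries above.
const DEFAULTS = { maxItems: 5, maxConcurrency: 5, proxyGroup: 'RESIDENTIAL' };

// Assumed country codes for UK, Ireland, South Africa, and Australia;
// the real input schema may spell these differently.
const SUPPORTED_COUNTRIES = ['uk', 'ie', 'za', 'au'];

// Merge defaults into the input and throw one error listing every problem found.
function validateInput(input = {}) {
    const merged = { ...DEFAULTS, ...input };
    const errors = [];

    if (!SUPPORTED_COUNTRIES.includes(merged.country)) {
        errors.push(`country must be one of ${SUPPORTED_COUNTRIES.join(', ')}`);
    }
    if (typeof merged.searchQuery !== 'string' || merged.searchQuery.trim() === '') {
        errors.push('searchQuery must be a non-empty string');
    }
    if (!Number.isInteger(merged.maxItems) || merged.maxItems < 1) {
        errors.push('maxItems must be a positive integer');
    }
    if (!Number.isInteger(merged.maxConcurrency)
        || merged.maxConcurrency < 1 || merged.maxConcurrency > 10) {
        errors.push('maxConcurrency must be an integer between 1 and 10');
    }
    if (typeof merged.proxyGroup !== 'string' || merged.proxyGroup.trim() === '') {
        errors.push('proxyGroup must be a non-empty string');
    }

    if (errors.length > 0) throw new Error(`Invalid input: ${errors.join('; ')}`);
    return merged;
}
```

Collecting all errors before throwing lets a user fix every input problem in one pass rather than one failed run at a time.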
Documentation Updates
Version Sync - Synchronized versions across package.json (10.10.0), actor.json (10.10), and README.md
New Parameters Documented - Added documentation for maxConcurrency and proxyGroup
API Key Redacted - Replaced exposed API key with placeholder
Code Quality Improvements
Consistent Logging - Replaced console.log with Crawlee's log throughout
Constants Organization - Moved all magic numbers to named constants at file top
Improved Email Regex - Enhanced email extraction pattern
Dockerfile Optimization - Removed unnecessary Dockerfile copy, fixed CMD form
[10.9.0] - 2025-10-05
Added
Multi-country support for 4 Gumtree markets: UK, Ireland, South Africa, and Australia
Country-specific route handlers with optimized selectors for each region
Intelligent breadcrumb filtering to exclude footer/navigation links
Multiple fallback selector strategies for robust data extraction
Comprehensive dataset schema with 14 structured fields
Proxy support via Apify residential proxies for all countries
Email extraction from listing descriptions using regex patterns
Phone number reveal functionality with click interaction
Image gallery extraction with multiple selector strategies
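The email extraction feature listed above can be sketched roughly as follows. The pattern is a deliberately simplified assumption, not the Actor's actual regex, and `extractEmails` is a hypothetical name. It will miss some valid addresses and is only meant to show the general approach of matching, normalizing, and deduplicating.

```javascript
// Simplified email pattern (an assumption, not the Actor's real regex):
// local part, "@", one or more dot-separated labels, and an alphabetic TLD.
const EMAIL_PATTERN = /[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)*\.[A-Za-z]{2,}/g;

// Return the unique, lowercased email addresses found in a listing description.
function extractEmails(description) {
    if (typeof description !== 'string') return [];
    // String.prototype.match returns null when nothing matches.
    const matches = description.match(EMAIL_PATTERN) ?? [];
    return [...new Set(matches.map((email) => email.toLowerCase()))];
}
```

Lowercasing before deduplication treats `Sales@Example.co.uk` and `sales@example.co.uk` as the same contact, which is usually what a contact scraper wants.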
Fixed
Ireland URL pattern correctly filters out navigation pages
Breadcrumb extraction no longer includes footer links
UK selectors updated to work with current CSS class structure
Phone number extraction adapted for country-specific implementations
Changed
Updated project name to gumtree-company-contact-scraper
Bumped version to 10.9.0
Enhanced error handling and logging for each country
Improved data extraction reliability with fallback mechanisms
Optimized maxRequestsPerCrawl calculation (maxItems + 20 buffer)
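The request cap described above is simple arithmetic. A sketch, assuming the buffer is a fixed constant of 20 as stated (the names `REQUEST_BUFFER` and `computeMaxRequestsPerCrawl` are hypothetical):

```javascript
// Extra requests allowed beyond maxItems, e.g. for search and pagination pages.
const REQUEST_BUFFER = 20;

// Upper bound on requests for one run: one per wanted item plus the buffer.
function computeMaxRequestsPerCrawl(maxItems) {
    if (!Number.isInteger(maxItems) || maxItems < 1) {
        throw new RangeError('maxItems must be a positive integer');
    }
    return maxItems + REQUEST_BUFFER;
}
```

With the default maxItems of 5, this caps a run at 25 requests.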
Dataset Fields
URL, Ad ID, Country, Title, Price
Category (breadcrumb path, filtered)
Location, Date Posted, Seller Name
Attributes (category-specific)
Image URLs (array)
Description, Phone Number, Email
Known Limitations
Local testing without a proxy: works only for the UK
Cloudflare Protection: Requires Playwright browser automation
Phone numbers: May require login on some listings
Rate limiting: Use proxy rotation on Apify platform
Dependencies
apify: ^3.4.2
crawlee: ^3.13.8
playwright: 1.54.1