
Gumtree Business Contact Scraper
Pricing
$8.00 / 1,000 leads

Under maintenance
Scrape business contact details from Gumtree classified ads across all categories. Extract phone numbers, email addresses, company websites, and physical addresses from UK, Australia, and international listings. Perfect for B2B lead generation, sales prospecting, and outreach campaigns.
Last modified
4 days ago
Gumtree Multi-Country Company Contact Scraper v10.9
📋 TL;DR
Powerful Apify Actor that scrapes company contact information and listings from Gumtree across 4 countries: UK, Ireland, South Africa, and Australia. Extracts phone numbers, emails, prices, descriptions, images, and more with intelligent country-specific selectors and proxy support.
Quick Start:
    npm install
    npm start
🌍 Supported Countries
- 🇬🇧 United Kingdom - gumtree.com
- 🇮🇪 Ireland - gumtree.ie
- 🇿🇦 South Africa - gumtree.co.za
- 🇦🇺 Australia - gumtree.com.au
✨ Features
- ✅ Multi-Country Support - Automatically adapts to each country's website structure
- ✅ Contact Extraction - Reveals and extracts phone numbers and email addresses
- ✅ Smart Selectors - Uses JSON-LD schema, data-q attributes, and CSS fallbacks
- ✅ Proxy Support - Built-in Apify residential proxy configuration
- ✅ Comprehensive Data - Title, price, location, category, images, attributes, and more
- ✅ Robust Error Handling - Multiple fallback strategies for each field
- ✅ Structured Output - Consistent dataset schema across all countries
📊 Extracted Data
Each listing includes:
Field | Description |
---|---|
01_url | Full URL of the listing page |
02_ad_id | Unique advertisement ID |
03_country | Country code (UK, IE, ZA, AU) |
04_title | Listing title/headline |
05_price | Price or salary information |
06_category | Category breadcrumb path |
07_location | Geographic location (city, region) |
08_date_posted | Date when listing was posted |
09_seller_name | Seller or company name |
10_attributes | Additional attributes (year, make, model, etc.) |
11_image_urls | Array of image URLs |
12_description | Full text description |
13_phone_number | Contact phone number (if available) |
14_email | Contact email address (if found) |
🚀 Usage
Running Locally
1. Install dependencies:
       $ npm install
2. Create the input file `.actor/INPUT.json`:
       {
         "country": "uk",
         "searchQuery": "laptop",
         "maxItems": 10
       }
3. Run the scraper:
       $ npm start
Running on Apify Platform
1. Upload to Apify Console or use the Apify CLI:
       $ apify push
2. Configure input:
   - Country: Select UK, Australia, South Africa, or Ireland
   - Search Query: Enter your search term (e.g., "car", "apartment", "jobs")
   - Max Items: Number of listings to scrape (1-200)
3. Run the Actor and download results as JSON, CSV, Excel, or HTML
🔧 Configuration
Input Parameters
    {
      "country": "uk",          // Options: "uk", "ie", "za", "au"
      "searchQuery": "laptop",  // Your search term
      "maxItems": 10            // Maximum listings to scrape (1-200)
    }
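The constraints listed above (four country codes, a search term, and 1-200 items) can be checked before a run starts. The following is a minimal sketch, not the Actor's own code: `validateInput` and `SUPPORTED_COUNTRIES` are hypothetical names, and rejecting an empty `searchQuery` is an assumption, since the README does not say whether it is required.

```javascript
// Hypothetical helper: illustrates the documented input constraints only.
const SUPPORTED_COUNTRIES = ['uk', 'ie', 'za', 'au'];

function validateInput(input) {
  const { country, searchQuery, maxItems } = input ?? {};

  if (!SUPPORTED_COUNTRIES.includes(country)) {
    throw new Error(`country must be one of: ${SUPPORTED_COUNTRIES.join(', ')}`);
  }
  // Assumption: an empty search term is treated as invalid.
  if (typeof searchQuery !== 'string' || searchQuery.trim() === '') {
    throw new Error('searchQuery must be a non-empty string');
  }
  if (!Number.isInteger(maxItems) || maxItems < 1 || maxItems > 200) {
    throw new Error('maxItems must be an integer between 1 and 200');
  }
  return { country, searchQuery: searchQuery.trim(), maxItems };
}

// Usage: the same values as the example input above.
const input = validateInput({ country: 'uk', searchQuery: 'laptop', maxItems: 10 });
console.log(input);
```

Failing fast on bad input avoids spending proxy traffic on a run that cannot produce results.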
Proxy Configuration
The actor uses Apify residential proxies by default. To configure your own:
    const proxyConfiguration = await Actor.createProxyConfiguration({
        groups: ['RESIDENTIAL'],
        countryCode: 'GB', // or 'IE', 'ZA', 'AU'
    });
Or use your own Apify API token for residential proxies (never commit or publish a real token):
    <YOUR_APIFY_API_TOKEN>
🏗️ Architecture
Country-Specific Handlers
The scraper uses intelligent routing to handle different website structures:
- UK (`detail_uk`): Uses CSS class selectors (e.g., `css-1utqs9u-header-block`)
- Ireland (`detail_ie`): Primary JSON-LD schema.org extraction with CSS fallbacks
- South Africa & Australia (`detail`): Uses `data-q` attribute selectors
Extraction Strategy
- JSON-LD First (Ireland) - Most reliable structured data
- data-q Attributes (ZA, AU) - Semantic attribute selectors
- CSS Classes (UK) - Specific class-based selectors
- Fallback Selectors - Multiple alternatives for each field
- Text Pattern Matching - Email regex extraction from descriptions
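The ordering above (structured data first, then fallbacks, then text pattern matching) can be sketched with plain functions. This is an illustration under stated assumptions, not the Actor's implementation: `parseJsonLd`, `firstNonEmpty`, and `extractEmails` are hypothetical names, the HTML is scanned with regular expressions only to keep the example dependency-free (the Actor itself runs in Playwright and presumably queries the DOM), and the email pattern is a deliberately simple one.

```javascript
// Collect every JSON-LD object embedded in a page's HTML.
function parseJsonLd(html) {
  const blocks = [];
  const scriptRe = /<script[^>]*type=["']application\/ld\+json["'][^>]*>([\s\S]*?)<\/script>/gi;
  for (const match of html.matchAll(scriptRe)) {
    try {
      const parsed = JSON.parse(match[1]);
      blocks.push(...(Array.isArray(parsed) ? parsed : [parsed]));
    } catch {
      // Malformed JSON-LD is skipped so that a fallback strategy can take over.
    }
  }
  return blocks;
}

// Run strategies in order and return the first non-empty string, or null.
function firstNonEmpty(strategies) {
  for (const strategy of strategies) {
    const value = strategy();
    if (typeof value === 'string' && value.trim() !== '') return value.trim();
  }
  return null;
}

// Simple email pattern (an assumption; the Actor's actual regex is not documented).
const EMAIL_RE = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;

function extractEmails(text) {
  const found = (text.match(EMAIL_RE) ?? []).map((email) => email.toLowerCase());
  return [...new Set(found)]; // de-duplicate, keep first-seen order
}

// Usage: JSON-LD wins when present; the <h1> is only a fallback.
const sampleHtml = '<script type="application/ld+json">{"@type":"Product","name":"Laptop"}</script>'
  + '<h1>Fallback title</h1>';
const sampleTitle = firstNonEmpty([
  () => parseJsonLd(sampleHtml).find((block) => block.name)?.name,
  () => /<h1[^>]*>([^<]*)<\/h1>/.exec(sampleHtml)?.[1],
]);
const sampleEmails = extractEmails('Contact Sales@Example.com or sales@example.com.');
console.log(sampleTitle, sampleEmails);
```

Expressing each strategy as a function keeps the priority order explicit and makes it cheap to add another fallback per field.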
📁 Project Structure
    gumtree-company-contact-scraper/
    ├── .actor/
    │   ├── actor.json            # Actor configuration (v10.9)
    │   ├── input_schema.json     # Input validation schema
    │   └── dataset_schema.json   # Output dataset schema
    ├── src/
    │   ├── main.js               # Entry point with country configs
    │   └── routes.js             # Request handlers (UK, IE, ZA/AU)
    ├── package.json              # Dependencies and scripts
    ├── Dockerfile                # Container configuration
    └── README.md                 # This file
🔍 How It Works
- Start Handler - Processes search results page
- Enqueue Links - Finds all listing detail page URLs
- Country-Specific Handler - Routes to appropriate extraction logic
- Data Extraction:
  - Text fields (title, price, location, etc.)
  - Interactive elements (phone number reveal)
  - Image galleries
  - Structured attributes
  - Contact information
- Data Validation - Ensures consistent output format
- Dataset Push - Saves to Apify dataset
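The data validation step can be pictured as filling every documented output field before the record is pushed, so that all countries produce the same shape. The sketch below is an assumption about how that might look: `normalizeRecord` is a hypothetical name, the field names come from the Extracted Data table, and using `null` for missing values (and an empty array for `11_image_urls`) is an illustrative choice rather than documented behavior.

```javascript
// Field names taken from the Extracted Data table.
const OUTPUT_FIELDS = [
  '01_url', '02_ad_id', '03_country', '04_title', '05_price', '06_category',
  '07_location', '08_date_posted', '09_seller_name', '10_attributes',
  '11_image_urls', '12_description', '13_phone_number', '14_email',
];

// Assumption: only the image list is array-valued; everything else falls back to null.
const ARRAY_FIELDS = new Set(['11_image_urls']);

function normalizeRecord(partial) {
  const record = {};
  for (const field of OUTPUT_FIELDS) {
    const value = partial?.[field];
    const isEmpty = value === undefined || value === null
      || (typeof value === 'string' && value.trim() === '');
    if (ARRAY_FIELDS.has(field)) {
      record[field] = Array.isArray(value) ? value : [];
    } else {
      record[field] = isEmpty ? null : value;
    }
  }
  return record; // keys outside OUTPUT_FIELDS are dropped
}

// Usage: a sparse record gains every documented key.
const normalized = normalizeRecord({ '04_title': 'Laptop', '14_email': '', extra: 1 });
console.log(Object.keys(normalized).length);
```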
🛠️ Development
Running Tests
Test different countries and search queries:
    # Test UK
    echo '{"country":"uk","searchQuery":"laptop","maxItems":5}' > .actor/INPUT.json
    npm start

    # Test Ireland
    echo '{"country":"ie","searchQuery":"jobs","maxItems":5}' > .actor/INPUT.json
    npm start

    # Test South Africa
    echo '{"country":"za","searchQuery":"furniture","maxItems":5}' > .actor/INPUT.json
    npm start

    # Test Australia
    echo '{"country":"au","searchQuery":"bicycle","maxItems":5}' > .actor/INPUT.json
    npm start
Debugging
Enable debug logs in main.js:
    const crawler = new PlaywrightCrawler({
        proxyConfiguration,
        requestHandler: router,
        maxRequestsPerCrawl: maxItems + 20,
        launchContext: {
            launchOptions: {
                args: ['--disable-gpu'],
                headless: false, // Set to false to see browser
            },
        },
        // Add this for debug logs:
        log: log.child({ prefix: 'PlaywrightCrawler' }),
    });
🌐 Country-Specific Notes
🇬🇧 United Kingdom
- URL Pattern: `/p/**/*`
- Selectors: CSS class-based (auto-generated classes)
- Phone Reveal: Anchor tag with `/reveal/number/` endpoint
🇮🇪 Ireland
- URL Pattern: `/**/*.html`
- Selectors: JSON-LD schema.org (most reliable)
- Special Features: Rich JobPosting and BreadcrumbList schemas
🇿🇦 South Africa
- URL Pattern: `/a-/**/*`
- Selectors: data-q attributes
- Phone Reveal: Button with `data-q="reveal-phone-number"`
🇦🇺 Australia
- URL Pattern: `/s-ad/**/*`
- Selectors: data-q attributes (similar to ZA)
- Phone Reveal: Button with `data-q="reveal-phone-number"`
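The per-country details in this README (domain, handler label, detail-page URL pattern, and proxy country code) fit naturally into one lookup table. The sketch below only gathers those documented values; `COUNTRY_CONFIG`, `getCountryConfig`, and the property names are hypothetical, and the real `main.js` may organize its country configs differently.

```javascript
// Values copied from this README; the object layout itself is an assumption.
const COUNTRY_CONFIG = {
  uk: { domain: 'gumtree.com', proxyCountryCode: 'GB', detailLabel: 'detail_uk', detailGlob: '/p/**/*' },
  ie: { domain: 'gumtree.ie', proxyCountryCode: 'IE', detailLabel: 'detail_ie', detailGlob: '/**/*.html' },
  za: { domain: 'gumtree.co.za', proxyCountryCode: 'ZA', detailLabel: 'detail', detailGlob: '/a-/**/*' },
  au: { domain: 'gumtree.com.au', proxyCountryCode: 'AU', detailLabel: 'detail', detailGlob: '/s-ad/**/*' },
};

function getCountryConfig(country) {
  const key = String(country).toLowerCase();
  // Own-property check so inherited keys such as "constructor" are rejected.
  if (!Object.prototype.hasOwnProperty.call(COUNTRY_CONFIG, key)) {
    throw new Error(`Unsupported country: ${country}`);
  }
  return { country: key, ...COUNTRY_CONFIG[key] };
}

// Usage: South Africa and Australia share the generic "detail" handler.
const zaConfig = getCountryConfig('ZA');
console.log(zaConfig.detailLabel, zaConfig.detailGlob);
```

Keeping these values in one table means a new country or a changed URL pattern touches a single entry instead of several handlers.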
⚠️ Anti-Scraping Considerations
Gumtree implements several anti-bot measures:
- Cloudflare Protection - Requires browser automation (✅ handled by Playwright)
- reCAPTCHA - May trigger on suspicious patterns (✅ mitigated by proxies)
- Rate Limiting - IP-based throttling (✅ use proxy rotation)
- JavaScript Rendering - Heavy client-side rendering (✅ Playwright handles)
- Phone Number Protection - Requires click interaction (✅ implemented)
Best Practices:
- Use residential proxies (already configured)
- Respect rate limits (adjust `maxRequestsPerCrawl`)
- Add random delays if needed
- Monitor for CAPTCHA challenges
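For the random delays mentioned above, one possible approach is a small jittered pause between requests. This is a sketch rather than part of the Actor: `randomDelayMs`, `sleep`, and `politePause` are hypothetical helpers, the 1-3 second default range is an arbitrary assumption, and where the pause is awaited (for example at the start of a request handler) is left to the reader.

```javascript
// Pick a whole number of milliseconds in [minMs, maxMs], inclusive.
// The rng parameter exists so the function can be tested deterministically.
function randomDelayMs(minMs, maxMs, rng = Math.random) {
  if (!Number.isInteger(minMs) || !Number.isInteger(maxMs) || minMs < 0 || maxMs < minMs) {
    throw new RangeError('expected integers with 0 <= minMs <= maxMs');
  }
  return Math.floor(minMs + rng() * (maxMs - minMs + 1));
}

function sleep(ms) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

// Wait for a random interval and report how long the pause was.
async function politePause(minMs = 1000, maxMs = 3000) {
  const ms = randomDelayMs(minMs, maxMs);
  await sleep(ms);
  return ms;
}
```

Inside an async handler, `await politePause();` before navigating or clicking would spread requests out; a wider range trades crawl speed for a less regular request pattern.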
📦 Dependencies
- apify ^3.4.2 - Apify SDK for actors
- crawlee ^3.13.8 - Web scraping and crawling library
- playwright 1.54.1 - Browser automation
🔗 Resources
- Apify Platform Documentation
- Crawlee Documentation
- Playwright Documentation
- Gumtree UK
- Gumtree Ireland
- Gumtree South Africa
- Gumtree Australia
📝 Version History
v10.9 (Current)
- ✅ Multi-country support (UK, IE, ZA, AU)
- ✅ Country-specific route handlers
- ✅ JSON-LD extraction for Ireland
- ✅ Enhanced fallback selectors
- ✅ Improved contact extraction
- ✅ Dataset schema validation
- ✅ Proxy configuration
📄 License
ISC
👤 Author
It's not you it's me
🤝 Contributing
Contributions, issues, and feature requests are welcome!
⭐ Show Your Support
Give a ⭐️ if this project helped you!
Built with ❤️ using Apify + Crawlee + Playwright