Glassdoor Jobs Scraper
Pricing
from $3.99 / 1,000 results
Extract job listings and company insights from Glassdoor, the leading employer review and job search platform.
Developer
Jobs Scraper
Maintained by Community
Overview
Extract employment data enriched with company insights from Glassdoor's combined job search and employer review platform. This actor gathers job listings alongside employer ratings, salary data, and workplace culture indicators.
Features
- Job listings paired with employer review scores
- Glassdoor salary estimates and reported compensation
- Company culture and work-life balance indicators
- Interview process insights from employee reviews
- Proxy rotation with automatic fallback (residential → datacenter)
- CAPTCHA detection and session rotation
- Automatic retry on failures with exponential backoff
- Deduplication of results by application URL
- Dataset validation with auto-fix capability
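The deduplication feature above can be illustrated with a minimal sketch. It assumes each listing carries the `applyUrl` field shown in the output format; the helper name `dedupeByApplyUrl` is hypothetical, and keeping records that have no URL is an assumption rather than documented behaviour of this actor.

```javascript
// Hypothetical helper: keep the first listing seen for each applyUrl.
function dedupeByApplyUrl(listings) {
  const seen = new Set();
  const unique = [];
  for (const listing of listings) {
    // Trim whitespace so trivially different copies of a URL match (an assumption).
    const key = String(listing.applyUrl || '').trim();
    if (key === '') {
      // Assumption: records without an apply URL cannot be compared, so keep them.
      unique.push(listing);
      continue;
    }
    if (!seen.has(key)) {
      seen.add(key);
      unique.push(listing);
    }
  }
  return unique;
}
```

A `Set` keeps the check at constant time per listing, so the pass stays linear even at the 1000-item cap.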
Supported Inputs
| Field | Type | Default | Description |
|---|---|---|---|
| keyword | string | "software engineer" | Search terms for job discovery |
| location | string | "" | Geographic filter for results |
| country | string | "US" | Country code for proxy routing |
| maxItems | integer | 50 | Upper limit on extracted listings |
| proxyEnabled | boolean | true | Toggle proxy rotation on/off |
| sortBy | string | "relevance" | Result ordering (relevance/date/salary) |
| jobType | string | "" | Employment type filter |
| experienceLevel | string | "" | Seniority level filter |
| datePosted | string | "" | Recency filter (24h/3d/7d/14d/30d) |
| remoteOnly | boolean | false | Restrict to remote positions only |
| includeCompanyDetails | boolean | true | Fetch extra company information |
| includeSalary | boolean | true | Include compensation data |
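As a rough illustration of how these inputs might be normalised before a run, the sketch below merges user input over the defaults from the table, rejects unknown `sortBy` and `datePosted` values, and clamps `maxItems` to the 1000-item cap listed under Limitations. The function name `normalizeInput` and the choice to throw on invalid values are assumptions, not the actor's confirmed behaviour.

```javascript
// Defaults copied from the Supported Inputs table.
const DEFAULT_INPUT = {
  keyword: 'software engineer',
  location: '',
  country: 'US',
  maxItems: 50,
  proxyEnabled: true,
  sortBy: 'relevance',
  jobType: '',
  experienceLevel: '',
  datePosted: '',
  remoteOnly: false,
  includeCompanyDetails: true,
  includeSalary: true,
};

const SORT_OPTIONS = ['relevance', 'date', 'salary'];
const DATE_POSTED_OPTIONS = ['', '24h', '3d', '7d', '14d', '30d'];
const MAX_ITEMS_CAP = 1000; // cap stated under Limitations

// Hypothetical helper: apply defaults, validate enums, clamp maxItems.
function normalizeInput(raw = {}) {
  const input = { ...DEFAULT_INPUT, ...raw };
  if (!SORT_OPTIONS.includes(input.sortBy)) {
    throw new Error(`Unsupported sortBy: ${input.sortBy}`);
  }
  if (!DATE_POSTED_OPTIONS.includes(input.datePosted)) {
    throw new Error(`Unsupported datePosted: ${input.datePosted}`);
  }
  // Fall back to the default when maxItems is not a positive integer (an assumption).
  const requested = Number.isInteger(input.maxItems) && input.maxItems > 0
    ? input.maxItems
    : DEFAULT_INPUT.maxItems;
  input.maxItems = Math.min(requested, MAX_ITEMS_CAP);
  return input;
}
```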
Output Format
Each scraped listing produces a JSON object with these fields:
```json
{
  "jobTitle": "Senior Software Engineer",
  "companyName": "Example Corp",
  "location": "US",
  "salary": "$120,000 - $160,000",
  "jobType": "Full-time",
  "experienceLevel": "Senior",
  "postedDate": "2 days ago",
  "applyUrl": "https://www.glassdoor.com/job/12345",
  "companyUrl": "https://www.glassdoor.com/company/example",
  "description": "We are looking for a skilled engineer...",
  "requirements": ["JavaScript", "Node.js", "React"],
  "benefits": ["Health Insurance", "Remote Work"],
  "sourcePortal": "Glassdoor",
  "country": "US",
  "scrapedAt": "2025-01-15T10:30:00.000Z"
}
```
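To show what validation with auto-fix could look like for records of this shape, here is a minimal sketch. It is not the contents of `dataset-validator.js`, which this README does not show; the helper name `validateListing`, the coercion rules, and the choice of `jobTitle` and `applyUrl` as required fields are all assumptions.

```javascript
// Field names taken from the output object above.
const STRING_FIELDS = [
  'jobTitle', 'companyName', 'location', 'salary', 'jobType',
  'experienceLevel', 'postedDate', 'applyUrl', 'companyUrl',
  'description', 'sourcePortal', 'country', 'scrapedAt',
];
const ARRAY_FIELDS = ['requirements', 'benefits'];

// Hypothetical validator: coerce field types and report what was changed.
function validateListing(listing) {
  const fixed = { ...listing };
  const fixes = [];
  for (const field of STRING_FIELDS) {
    if (typeof fixed[field] !== 'string') {
      // Missing values become empty strings; other types are stringified.
      fixed[field] = fixed[field] == null ? '' : String(fixed[field]);
      fixes.push(field);
    }
  }
  for (const field of ARRAY_FIELDS) {
    if (!Array.isArray(fixed[field])) {
      // A single value is wrapped in an array; missing values become [].
      fixed[field] = fixed[field] == null ? [] : [String(fixed[field])];
      fixes.push(field);
    }
  }
  // Assumption: a record without a title or apply URL is not usable.
  const valid = fixed.jobTitle !== '' && fixed.applyUrl !== '';
  return { valid, fixed, fixes };
}
```

Returning the list of fixed fields alongside the record makes it possible to log how often each field needed repair.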
Proxy Handling
Proxy tiers are tried in the order below; the actor falls back to the next tier when the current one fails.
- Apify Residential Proxy (country-targeted) — First choice for Glassdoor
- Apify Residential Proxy (any region) — Fallback if country proxy unavailable
- Apify Datacenter Proxy — Secondary fallback for cost efficiency
- Direct Connection — Last resort when all proxies fail
Proxies auto-rotate on each request. Blocked sessions are discarded and replaced automatically.
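The fallback order above can be sketched as an ordered list that is walked until one tier works. Everything here is illustrative: the tier objects, `buildProxyTiers`, `selectProxyTier`, and the `isUsable` probe are hypothetical names, and the sketch does not call the Apify proxy API.

```javascript
// Hypothetical description of the four tiers, in the documented order.
function buildProxyTiers(country) {
  return [
    { name: 'residential-country', kind: 'residential', countryCode: country },
    { name: 'residential-any', kind: 'residential' },
    { name: 'datacenter', kind: 'datacenter' },
    { name: 'direct', kind: 'direct' },
  ];
}

// Walk the tiers in order and return the first one the probe accepts.
// `isUsable` is a caller-supplied async check (an assumption of this sketch).
async function selectProxyTier(country, isUsable) {
  for (const tier of buildProxyTiers(country)) {
    if (await isUsable(tier)) {
      return tier;
    }
  }
  throw new Error('No connection tier is usable');
}
```

Keeping the order in plain data means a tier can be added or removed without touching the selection loop.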
Retry Logic
Failed requests are retried automatically, each time with a new proxy identity.
- Maximum 5 retries per request
- Fresh browser session on each retry
- Automatic proxy rotation between attempts
- Blocked status codes (401, 403, 429) trigger session refresh
- Configurable request timeout (120 seconds)
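The retry rules above, combined with the exponential backoff listed under Features, might look roughly like the following sketch. The 1-second base delay, the helper names, and the `onBlocked` hook (where a session or proxy refresh would happen) are assumptions; only the retry limit and the blocked status codes come from this README.

```javascript
const BLOCKED_STATUS_CODES = new Set([401, 403, 429]);
const MAX_RETRIES = 5;

// Delay doubles with each attempt: base, 2x base, 4x base, ...
function backoffDelayMs(attempt, baseMs = 1000) {
  return baseMs * 2 ** attempt;
}

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Hypothetical wrapper: `doRequest` performs one attempt and resolves to an
// object with a numeric `status`; `onBlocked` refreshes the session or proxy.
async function fetchWithRetry(doRequest, { baseMs = 1000, onBlocked = async () => {} } = {}) {
  let lastError;
  // One initial attempt plus up to MAX_RETRIES retries.
  for (let attempt = 0; attempt <= MAX_RETRIES; attempt++) {
    try {
      const response = await doRequest(attempt);
      if (!BLOCKED_STATUS_CODES.has(response.status)) {
        return response;
      }
      lastError = new Error(`Blocked with status ${response.status}`);
      await onBlocked(response.status);
    } catch (err) {
      lastError = err;
    }
    if (attempt < MAX_RETRIES) {
      await sleep(backoffDelayMs(attempt, baseMs));
    }
  }
  throw lastError;
}
```

With the assumed 1-second base, the waits between attempts are 1, 2, 4, 8, and 16 seconds.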
Anti-block Handling
Advanced evasion methods reduce the likelihood of being flagged as automated.
- `navigator.webdriver` property masked
- Human-like delays between page interactions (2–5 seconds)
- Browser language and plugin fingerprints normalised
- Session pool with automatic rotation on blocks
- CAPTCHA detection with graceful retry
- Rate limit detection (HTTP 429) with backoff
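The human-like delay in the list above could be implemented along these lines. This is a sketch under the assumption that the pause is drawn uniformly from the 2 to 5 second range; the helper names are hypothetical, and the `random` parameter exists only to make the function easy to test.

```javascript
// Pick a whole number of milliseconds in [minMs, maxMs], inclusive.
function randomDelayMs(minMs = 2000, maxMs = 5000, random = Math.random) {
  return Math.floor(minMs + random() * (maxMs - minMs + 1));
}

// Wait for a randomised interval and report how long the pause was.
async function humanPause(minMs, maxMs) {
  const ms = randomDelayMs(minMs, maxMs);
  await new Promise((resolve) => setTimeout(resolve, ms));
  return ms;
}
```

Calling `await humanPause()` between page interactions gives a different wait on each step, which avoids the fixed request rhythm that a constant delay would produce.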
Sample Input
```json
{
  "keyword": "data analyst",
  "location": "US",
  "maxItems": 25,
  "proxyEnabled": true,
  "sortBy": "date",
  "remoteOnly": false
}
```
Sample Output
```json
{
  "jobTitle": "Data Analyst",
  "companyName": "TechCorp International",
  "location": "US",
  "salary": "Competitive",
  "jobType": "Full-time",
  "experienceLevel": "Mid-level",
  "postedDate": "1 day ago",
  "applyUrl": "https://www.glassdoor.com/job/example-123",
  "companyUrl": "",
  "description": "Seeking a detail-oriented data analyst to join our growing team...",
  "requirements": ["SQL", "Python", "Tableau"],
  "benefits": ["Health Insurance", "Flexible Hours"],
  "sourcePortal": "Glassdoor",
  "country": "US",
  "scrapedAt": "2025-01-15T14:22:00.000Z"
}
```
Usage
Local Development
```bash
# Install dependencies
npm install

# Set Apify token (required for proxy)
export APIFY_TOKEN=your_token_here

# Run the actor
npm start

# Validate scraped data
node dataset-validator.js
```
Apify Platform
```bash
# Login to Apify
apify login

# Push actor to platform
apify push

# Run from Apify Console or API
```
Deployment
- Ensure all dependencies are installed: `npm install`
- Authenticate with Apify: `apify login`
- Deploy the actor: `apify push`
- Configure input in the Apify Console
- Schedule runs or trigger via API / webhooks
Limitations
- Results depend on the portal's current HTML structure; layout changes may require selector updates
- Some job details (salary, benefits) may not be available for all listings
- Rate limiting by the portal may reduce throughput during high-volume scrapes
- CAPTCHA challenges may interrupt scraping on heavily protected pages
- Glassdoor may modify their anti-bot measures, requiring periodic updates
- Maximum items per run is capped at 1000 to prevent excessive resource usage
- Proxy costs apply when using Apify residential or datacenter proxies