# Glassdoor Scraper — Company Reviews, Salaries & Jobs (`bovi/glassdoor-scraper`) Actor

Scrape Glassdoor job listings with salary ranges (p10/p90) plus recent company reviews with sub-ratings (culture, WLB, leadership), pros/cons and CEO approval. Resilient HTML parsing, flat JSON, parse\_confidence per record. Search by company name, URL or employer ID.

- **URL**: https://apify.com/bovi/glassdoor-scraper.md
- **Developed by:** [Vitalii Bondarev](https://apify.com/bovi) (community)
- **Categories:** Jobs, Business
- **Stats:** 3 total users, 1 monthly users, 100.0% runs succeeded, 0 bookmarks
- **User rating**: No ratings yet

## Pricing

$0.90 / 1,000 glassdoor items

This Actor is paid per event. You are not charged for Apify platform usage; you pay only a fixed price for specific events.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with capital "A".

## How to integrate an Actor?

The best way to integrate an Actor depends on your stack. The options below cover the common cases.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see the Apify documentation, available as a [Markdown index](https://docs.apify.com/llms.txt) and as [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## Glassdoor Scraper — Reviews, Jobs & Salaries

Scrape Glassdoor company reviews, job listings with salary ranges, and company ratings. The official Glassdoor API was deprecated in 2024, so this Actor relies on resilient HTML parsing of the React Server Components (RSC) data embedded in Glassdoor pages.

### What you get

**Reviews mode:**
- Review ID, headline/summary, pros, cons, advice to management
- Overall rating + sub-ratings (culture, career, compensation, work-life balance, leadership)
- Reviewer's job title, location, employment status (current/former)
- CEO approval, business outlook, recommend to a friend flag
- Review date (ISO-8601)
- `parse_confidence` score and `warnings` per record

**Jobs mode:**
- Job title, location, listing ID
- Salary ranges (p10 / p90 — employer-provided or estimated)
- Salary currency, pay period
- Easy Apply flag, approximate posting date
- Canonical Glassdoor job URL

### Proxy requirements

| Mode | Without proxy | With residential proxy + login |
|---|---|---|
| Reviews (page 1) | ✓ 3 reviews/company | ✓ Full pagination |
| Reviews (page 2+) | ✗ Auth required | ✓ Full |
| Jobs (all pages) | ✓ ~40 jobs/page | ✓ Full |
| Salaries page | ✗ Cloudflare 403 block | Not yet implemented |

The Actor **degrades gracefully**: if a page requires authentication, it logs a warning and moves on instead of crashing.

### Input

| Field | Type | Default | Description |
|---|---|---|---|
| `companies` | string[] | — | Company names or Glassdoor URLs |
| `mode` | select | `reviews` | `reviews` / `jobs` / `both` |
| `maxItems` | integer | 50 | Max items per company per mode (`0` = unlimited; allowed range 0–10,000) |
| `proxyConfiguration` | proxy | — | Residential proxy for full pagination |
| `sortReviews` | select | `DATE` | Review sort: `DATE` / `RELEVANCE` / `HELPFULNESS` |
| `language` | string | `eng` | ISO 639-3 language code for filtering reviews; leave blank for all languages |
| `filterCurrentEmployeesOnly` | boolean | `false` | Current employees only |

### Output schema

Flat dataset — one row per review or job. Key fields:

```json
{
  "company": "Google",
  "employer_id": 9079,
  "entity_type": "review",
  "review_id": "104181215",
  "rating": 5.0,
  "review_title": "Best job I've ever had",
  "pros": "- people\n- culture/benefits/pay",
  "cons": "Changing culture, aggressive targets",
  "job_title": "Account Executive",
  "location": "San Francisco, CA",
  "date": "2026-05-29T19:24:22.267",
  "recommend": "POSITIVE",
  "parse_confidence": 1.0,
  "warnings": []
}
````
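
One way to use `parse_confidence` and `warnings` downstream is to separate records you trust from records that deserve a second look. The sketch below is illustrative only: `split_by_confidence` is a hypothetical helper, and the `0.8` threshold is an arbitrary assumption, not a value recommended by the Actor.

```python
def split_by_confidence(records, threshold=0.8):
    """Split records into (trusted, needs_review) using parse_confidence and warnings."""
    trusted, needs_review = [], []
    for record in records:
        # Treat a missing or null score as the lowest confidence.
        confidence = record.get("parse_confidence") or 0.0
        has_warnings = bool(record.get("warnings"))
        if confidence >= threshold and not has_warnings:
            trusted.append(record)
        else:
            needs_review.append(record)
    return trusted, needs_review


if __name__ == "__main__":
    sample = [
        {"review_id": "104181215", "parse_confidence": 1.0, "warnings": []},
        {"review_id": "x", "parse_confidence": 0.4, "warnings": ["missing rating"]},
    ]
    trusted, needs_review = split_by_confidence(sample)
    print(len(trusted), len(needs_review))
```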

### Pricing (PPE)

`$0.0009` per item — **$0.90 per 1,000 items** (reviews + jobs billed at one flat rate). Competitive vs. the fragmented incumbents (mostly 2–3★, built before the API deprecation).
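
For budgeting, the flat rate makes an upper bound easy to compute. The sketch below assumes every company returns the full `maxItems` in every mode, which real runs may not reach; `estimate_cost` is a hypothetical helper, and residential proxy traffic is billed separately and is not included.

```python
from decimal import Decimal

PRICE_PER_ITEM = Decimal("0.0009")  # USD per item, from the pricing line above


def estimate_cost(companies, max_items, modes=1):
    """Upper-bound cost in USD; use modes=2 when running with mode "both"."""
    return PRICE_PER_ITEM * companies * max_items * modes


if __name__ == "__main__":
    # 20 companies x 50 reviews each = 1,000 items at most
    print(estimate_cost(20, 50))
```

With 20 companies and `maxItems` of 50 in a single mode, the bound comes to $0.90, matching the per-1,000 price.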

### Why this actor

The official Glassdoor API was deprecated in early 2024, causing a **demand spike** for scraping alternatives. Existing Actors on the Apify Store are fragmented (8+ options, none comprehensive) and rated 2–3★. This Actor delivers all three data types (reviews + jobs + salary) in a single unified flat schema with `parse_confidence` tracking.

### Technical notes

- Extracts data from `self.__next_f.push()` Next.js RSC streaming payloads
- `gdToken` CSRF token auto-extracted from homepage per session
- Exponential backoff on HTTP 429; graceful degradation on Cloudflare/auth blocks
- `parse_confidence` field (0.0–1.0) signals parse quality per record
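
To make the first note concrete, here is a minimal sketch of pulling `self.__next_f.push()` payloads out of a page's HTML. It is not the Actor's actual parser: the regular expression, the `extract_rsc_chunks` name, and the assumption that each call receives a two-element JSON array of the form `[n, "..."]` are all illustrative.

```python
import json
import re

# Captures the array literal passed to self.__next_f.push(...) in a script tag.
PUSH_RE = re.compile(r"self\.__next_f\.push\((\[.*?\])\)\s*</script>", re.DOTALL)


def extract_rsc_chunks(html):
    """Return the string payloads of all self.__next_f.push([n, "..."]) calls."""
    chunks = []
    for match in PUSH_RE.finditer(html):
        try:
            payload = json.loads(match.group(1))
        except json.JSONDecodeError:
            continue  # skip anything that is not valid JSON
        # Keep only [number, string] pairs; other shapes carry no text payload.
        if len(payload) == 2 and isinstance(payload[1], str):
            chunks.append(payload[1])
    return chunks


if __name__ == "__main__":
    page = '<script>self.__next_f.push([1,"a:{\\"rating\\":5}"])</script>'
    print(extract_rsc_chunks(page))
```

The returned strings are still in the RSC wire format, so a real parser would need a further step to locate and decode the review or job objects inside them.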

### Integrations

Built for HR-analytics and talent-intelligence teams tracking company reviews, salary ranges, and job postings — the JSON/dataset output drops into the tools you already run, no glue code:

- **n8n / Make / Zapier** — trigger a run or pipe every new dataset item into 500+ apps (Google Sheets, Airtable, Slack, HubSpot, your database) with no code: [n8n](https://docs.apify.com/platform/integrations/n8n), [Make](https://docs.apify.com/platform/integrations/make), [Zapier](https://docs.apify.com/platform/integrations/zapier).
- **Webhooks** — fire your own endpoint the moment a run finishes, to push results straight into your pipeline ([docs](https://docs.apify.com/platform/integrations/webhooks)).
- **MCP server** — expose this actor as a tool to Claude, Cursor, or any [MCP client](https://mcp.apify.com) so an AI agent can pull this data mid-conversation ([guide](https://blog.apify.com/how-to-use-mcp/)).
- **API & SDKs** — fetch the dataset as JSON, CSV, or Excel through the Apify REST API or the Python / JS SDKs.

See all [Apify integrations](https://apify.com/integrations).

### Legal & compliance

This actor collects only **publicly visible** data from Glassdoor — the same information any visitor sees when browsing the site without logging in (page-1 review excerpts and publicly listed job postings). It does not access private, gated, or authenticated content, and it does not store or transmit personal employee data beyond what Glassdoor already exposes publicly.

Use this actor in accordance with Glassdoor's [Terms of Use](https://www.glassdoor.com/about/terms.htm) and applicable data-protection regulations (GDPR, CCPA, etc.). The operator is solely responsible for ensuring their use case complies with relevant laws and platform policies. Data collected via this actor should not be used to identify, contact, or target individual employees without their consent.

### Frequently asked questions

**Do I need a Glassdoor API key or login?**

No API key — the official Glassdoor API was deprecated in early 2024. Page-1 reviews and job listings work with no proxy; full review pagination uses a residential proxy (configured through the standard proxy input, billed to your own Apify account).

**What data can I extract?**

Company reviews (overall and sub-ratings, pros, cons, advice, CEO approval), job listings with salary ranges, and company ratings — by company name or Glassdoor URL.

**What does the output look like?**

A flat dataset, one row per review or job, with a `parse_confidence` score on every record. Export as JSON, CSV or Excel, or pull it via the Apify API.

**Can I run it on a schedule?**

Yes. Use Apify Scheduler to refresh reviews or job listings daily and wire the dataset to a webhook, Google Sheet or your database.

**Is scraping Glassdoor legal?**

This Actor collects only publicly visible data, the same data a logged-out visitor sees. Use it in line with Glassdoor's Terms of Use and applicable data-protection law (GDPR, CCPA); do not use it to identify or contact individual employees.

### More scrapers from our toolkit

Building a data pipeline? These actors pair well with this one — each runs on your own Apify account with the same pay-per-result pricing, no subscription:

- [Greenhouse, Lever & Ashby Job Scraper](https://apify.com/bovi/greenhouse-lever-ashby-job-scraper)
- [Seek Jobs Scraper](https://apify.com/bovi/seek-jobs-scraper)
- [Naukri Jobs Scraper](https://apify.com/bovi/naukri-jobs-scraper)
- [Indeed Employer Intelligence](https://apify.com/bovi/indeed-employer-intelligence)
- [Hiring Signal Monitor](https://apify.com/bovi/hiring-signal-monitor)
- [Ashby Job Scraper](https://apify.com/bovi/ashby-job-scraper)

Chain any of them together from the **Integrations** tab (the *Run succeeded* trigger) to build a multi-step workflow — one actor's output feeds the next.

# Actor input Schema

## `companies` (type: `array`):

List of company names or Glassdoor URLs. Examples: "Google", "amazon", "https://www.glassdoor.com/Reviews/Apple-Reviews-E1138.htm". The scraper resolves names via Glassdoor search.

## `mode` (type: `string`):

What data to collect: reviews (company reviews with ratings and pros/cons), jobs (open positions with salary ranges), or both.

## `maxItems` (type: `integer`):

Maximum number of reviews or jobs to collect per company. 0 = unlimited (proxy required for full pagination beyond page 1).

## `proxyConfiguration` (type: `object`):

Proxy settings. Glassdoor blocks datacenter IPs for pages beyond page 1 and for the Salaries sub-section. Residential proxy required for full scraping. Leave empty to collect page-1 SSR data only (free).

## `sortReviews` (type: `string`):

How to sort reviews fetched from Glassdoor.

## `filterCurrentEmployeesOnly` (type: `boolean`):

When true, only reviews from current (not former) employees are returned.

## `language` (type: `string`):

ISO 639-3 language code for filtering reviews. Use 'eng' for English-only. Leave blank for all languages.

## Actor input object example

```json
{
  "companies": [
    "Google"
  ],
  "mode": "reviews",
  "maxItems": 50,
  "sortReviews": "DATE",
  "filterCurrentEmployeesOnly": false,
  "language": "eng"
}
```
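
Before starting a run, it can help to check an input object against the constraints documented for these fields. The sketch below is a hypothetical client-side check, not part of the Actor. The enum values and the 0–10,000 bound for `maxItems` come from the OpenAPI specification in this document; treating `companies` as required is an assumption.

```python
ALLOWED_MODES = {"reviews", "jobs", "both"}
ALLOWED_SORTS = {"DATE", "RELEVANCE", "HELPFULNESS"}


def validate_input(run_input):
    """Return a list of problems; an empty list means the input looks valid."""
    problems = []

    companies = run_input.get("companies")
    if not isinstance(companies, list) or not companies:
        problems.append("companies must be a non-empty list")  # assumed required
    elif not all(isinstance(c, str) and c.strip() for c in companies):
        problems.append("every company must be a non-empty string")

    if run_input.get("mode", "reviews") not in ALLOWED_MODES:
        problems.append("mode must be one of reviews, jobs, both")

    max_items = run_input.get("maxItems", 50)
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(max_items, bool) or not isinstance(max_items, int):
        problems.append("maxItems must be an integer")
    elif not 0 <= max_items <= 10000:
        problems.append("maxItems must be between 0 and 10000")

    if run_input.get("sortReviews", "DATE") not in ALLOWED_SORTS:
        problems.append("sortReviews must be DATE, RELEVANCE or HELPFULNESS")

    return problems


if __name__ == "__main__":
    print(validate_input({"companies": ["Google"], "mode": "reviews", "maxItems": 50}))
```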

# Actor output Schema

## `results` (type: `string`):

Dataset containing Glassdoor Scraper records (company, entity\_type, rating, review\_title, job\_title, salary\_base, salary\_total, location, date, url, parse\_confidence).
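
If you fetch the records as JSON and want a spreadsheet-friendly file, a small standard-library sketch like the one below can project them onto the columns listed above. `to_csv` is a hypothetical helper. Fields missing from a record are written as empty cells, and fields outside the column list (for example `warnings`) are dropped.

```python
import csv
import io

COLUMNS = [
    "company", "entity_type", "rating", "review_title", "job_title",
    "salary_base", "salary_total", "location", "date", "url", "parse_confidence",
]


def to_csv(records):
    """Serialize dataset records to CSV text using the documented columns."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=COLUMNS, extrasaction="ignore", restval=""
    )
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


if __name__ == "__main__":
    rows = [{"company": "Google", "entity_type": "review", "rating": 5.0, "warnings": []}]
    print(to_csv(rows))
```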

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "companies": [
        "Google"
    ]
};

// Run the Actor and wait for it to finish
const run = await client.actor("bovi/glassdoor-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = { "companies": ["Google"] }

# Run the Actor and wait for it to finish
run = client.actor("bovi/glassdoor-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "companies": [
    "Google"
  ]
}' |
apify call bovi/glassdoor-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=bovi/glassdoor-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Glassdoor Scraper — Company Reviews, Salaries & Jobs",
        "description": "Scrape Glassdoor job listings with salary ranges (p10/p90) plus recent company reviews with sub-ratings (culture, WLB, leadership), pros/cons and CEO approval. Resilient HTML parsing, flat JSON, parse_confidence per record. Search by company name, URL or employer ID.",
        "version": "0.1",
        "x-build-id": "l0Z7bmldyMfUAHCyD"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/bovi~glassdoor-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-bovi-glassdoor-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/bovi~glassdoor-scraper/runs": {
            "post": {
                "operationId": "runs-sync-bovi-glassdoor-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/bovi~glassdoor-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-bovi-glassdoor-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "properties": {
                    "companies": {
                        "title": "Companies to scrape",
                        "type": "array",
                        "description": "List of company names or Glassdoor URLs. Examples: \"Google\", \"amazon\", \"https://www.glassdoor.com/Reviews/Apple-Reviews-E1138.htm\". The scraper resolves names via Glassdoor search.",
                        "items": {
                            "type": "string"
                        }
                    },
                    "mode": {
                        "title": "Scrape mode",
                        "enum": [
                            "reviews",
                            "jobs",
                            "both"
                        ],
                        "type": "string",
                        "description": "What data to collect: reviews (company reviews with ratings and pros/cons), jobs (open positions with salary ranges), or both.",
                        "default": "reviews"
                    },
                    "maxItems": {
                        "title": "Max items per company per mode",
                        "minimum": 0,
                        "maximum": 10000,
                        "type": "integer",
                        "description": "Maximum number of reviews or jobs to collect per company. 0 = unlimited (proxy required for full pagination beyond page 1).",
                        "default": 50
                    },
                    "proxyConfiguration": {
                        "title": "Proxy configuration",
                        "type": "object",
                        "description": "Proxy settings. Glassdoor blocks datacenter IPs for pages beyond page 1 and for the Salaries sub-section. Residential proxy required for full scraping. Leave empty to collect page-1 SSR data only (free)."
                    },
                    "sortReviews": {
                        "title": "Review sort order",
                        "enum": [
                            "DATE",
                            "RELEVANCE",
                            "HELPFULNESS"
                        ],
                        "type": "string",
                        "description": "How to sort reviews fetched from Glassdoor.",
                        "default": "DATE"
                    },
                    "filterCurrentEmployeesOnly": {
                        "title": "Current employees only",
                        "type": "boolean",
                        "description": "When true, only reviews from current (not former) employees are returned.",
                        "default": false
                    },
                    "language": {
                        "title": "Review language",
                        "type": "string",
                        "description": "ISO 639-3 language code for filtering reviews. Use 'eng' for English-only. Leave blank for all languages.",
                        "default": "eng"
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
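
The `run-sync-get-dataset-items` path in the specification above can also be called without a client library. The following standard-library sketch builds the request from the server URL, path, and `token` query parameter given in the spec. `build_request` and `run_sync` are hypothetical helpers, and decoding the response as JSON is an assumption, because the spec only documents a bare `200 OK`.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.apify.com/v2"
ACTOR_PATH = "/acts/bovi~glassdoor-scraper/run-sync-get-dataset-items"


def build_request(token, run_input):
    """Build the POST request described by the OpenAPI spec (nothing is sent yet)."""
    query = urllib.parse.urlencode({"token": token})
    return urllib.request.Request(
        url=f"{API_BASE}{ACTOR_PATH}?{query}",
        data=json.dumps(run_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_sync(token, run_input, timeout=300):
    """Send the request and decode the body, assuming a JSON response."""
    request = build_request(token, run_input)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    req = build_request("<YOUR_API_TOKEN>", {"companies": ["Google"]})
    print(req.get_method(), req.full_url)
```

Because the synchronous endpoint holds the connection open until the run finishes, long scrapes may be better served by the `/runs` path plus a later dataset fetch.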
