# Greenhouse Job Scraper — Stripe, Airbnb & 10K+ Companies (`bovi/greenhouse-job-scraper`) Actor

Scrape job postings from any Greenhouse-powered company board via the public boards-api. Get title, location, department, seniority, remote-type, descriptions and parse\_confidence. Multi-company batch, keyword filters, zero auth, zero proxy.

- **URL**: https://apify.com/bovi/greenhouse-job-scraper.md
- **Developed by:** [Vitalii Bondarev](https://apify.com/bovi) (community)
- **Categories:** Jobs, Lead generation, AI
- **Stats:** 2 total users, 1 monthly users, 100.0% runs succeeded, 0 bookmarks
- **User rating**: No ratings yet

## Pricing

from $1.45 / 1,000 job results

This Actor is paid per event. You are not charged for Apify platform usage; you pay only a fixed price for specific events.
Because this Actor supports Apify Store discounts, the price decreases on higher subscription plans.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# MacOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## Greenhouse Job Scraper — Careers API

For growth teams and data engineers who need Greenhouse job data at scale — scraping Stripe, Airbnb, Anthropic, and thousands more in one run.

**Pay per result — $1.50 / 1,000 jobs. First jobs free on the Apify free plan.**

Scrape job postings from **any Greenhouse-powered company board** via the
official public API. Give it a list of company slugs — get back clean, enriched
job data in seconds. **No API key. No login. No proxy. No browser. Runs in Apify cloud.**

Get started instantly: use the pre-filled example (stripe, airbnb, notion) or
swap in any Greenhouse company slug. **Try it free** — Apify's $5/month free
tier covers thousands of jobs.

### What this Greenhouse scraper does

- Fetches every open job from any company that uses **Greenhouse ATS**, using
  the official `boards-api.greenhouse.io` endpoint.
- Returns a **clean flat schema** — identical fields every run, no parsing
  surprises.
- Enriches each job with **seniority** (intern / entry / mid / senior / lead /
  staff / principal / manager / director / vp / executive) inferred from the
  job title.
- Detects **remote type** (remote / hybrid) from location text.
- Returns full **job descriptions** as both plain text and HTML (togglable).
- Includes a **parse_confidence score** (0–1.0) and `warnings` list in every
  record — you see exactly how clean the data is.
- Supports **multi-company batch runs** (scrape 100 companies in one run).
- Filters by **title keyword, location keyword, or remote-only**.
- **maxJobsPerCompany** cap (default 50) prevents surprise bills on first runs.

### How to find a Greenhouse slug

Look at the company's careers page URL:
- `boards.greenhouse.io/stripe` → slug is `stripe`
- `boards.greenhouse.io/airbnb` → slug is `airbnb`
- `boards.greenhouse.io/anthropic` → slug is `anthropic`

Thousands of tech companies use Greenhouse: Stripe, Airbnb, Notion, Anthropic,
Figma, Coinbase, Datadog, Discord, Doctolib, Mistral, and more.

### What data you get

One flat row per job. The example record below has 18 fields:

```json
{
  "title": "Senior Software Engineer, Payments",
  "company": "stripe",
  "location": "San Francisco, CA",
  "remote_type": null,
  "seniority": "senior",
  "salary": null,
  "department": "Engineering",
  "employment_type": null,
  "posted_at": "2026-05-20T16:58:18-04:00",
  "url": "https://stripe.com/jobs/search?gh_jid=7546284",
  "apply_url": "https://stripe.com/jobs/search?gh_jid=7546284",
  "job_id": "7546284",
  "global_id": "greenhouse:stripe:7546284",
  "description_text": "Stripe is looking for a Senior Software Engineer...",
  "description_html": "<p>Stripe is looking for...</p>",
  "parse_confidence": 1.0,
  "warnings": [],
  "scraped_at": "2026-05-31T12:00:00+00:00"
}
````

**Field notes:**

- `salary` and `employment_type` are `null` — the Greenhouse public boards API
  does not expose these fields. For salary data, the company must publish it
  voluntarily in the description.
- `parse_confidence` is `1.0` for a fully populated job, with a small deduction
  for each missing field. Check `warnings` for the reason codes.
- `global_id` format: `greenhouse:{slug}:{job_id}` — stable identifier for
  deduplication across runs.
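
The `seniority` value is described as inferred from the job title. The Actor's actual rules are not documented here, so the following is only a minimal sketch of how such title-based inference could work. The keyword patterns, their priority order, and the `"mid"` fallback are all assumptions made for illustration.

```python
import re

# Hypothetical keyword rules, checked in order. The first match wins.
# These are illustrative only and are not the Actor's real rules.
SENIORITY_RULES = [
    ("vp", r"\b(vp|vice president)\b"),
    ("executive", r"\b(chief|c[eft]o|president)\b"),
    ("director", r"\bdirector\b"),
    ("manager", r"\bmanager\b"),
    ("principal", r"\bprincipal\b"),
    ("staff", r"\bstaff\b"),
    ("lead", r"\blead\b"),
    ("senior", r"\b(senior|sr)\b"),
    ("intern", r"\b(intern|internship)\b"),
    ("entry", r"\b(junior|jr|entry[- ]level|graduate)\b"),
]


def infer_seniority(title: str) -> str:
    """Return a seniority label for a job title, falling back to 'mid' (assumed default)."""
    lowered = title.lower()
    for level, pattern in SENIORITY_RULES:
        if re.search(pattern, lowered):
            return level
    return "mid"
```

Word boundaries keep a title such as "Internal Tools Engineer" from being labelled `intern`.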

### Input schema

| Field | Type | Default | Description |
|---|---|---|---|
| `companies` | string array | \[stripe, airbnb, notion] | Greenhouse company slugs to scrape |
| `titleKeyword` | string | — | Case-insensitive title filter |
| `locationKeyword` | string | — | Case-insensitive location filter |
| `remoteOnly` | boolean | false | Return only remote jobs |
| `maxJobsPerCompany` | integer | 50 | Cap per company (0 = unlimited) |
| `includeDescriptions` | boolean | true | Include description\_text + description\_html |

### Pricing

Pay-per-result: **$1.50 / 1,000 jobs** ($0.0015 each). You pay only for jobs returned.

**Worked example:** 3 companies × 50 jobs = 150 results = $0.225, or about **$0.23**. 10 companies × 200 jobs (with `maxJobsPerCompany` raised to 200) = 2,000 results = **$3.00**.
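
The arithmetic in the worked example can be reproduced with a small helper before launching a large run. This is a sketch that assumes the listed $1.50 per 1,000 results and ignores any plan discount; the function name is hypothetical.

```python
PRICE_PER_1000_JOBS_USD = 1.50  # list price quoted above; plan discounts are ignored


def estimate_cost_usd(companies: int, jobs_per_company: int,
                      price_per_1000: float = PRICE_PER_1000_JOBS_USD) -> float:
    """Upper-bound cost estimate: every company returns jobs_per_company results."""
    results = companies * jobs_per_company
    return round(results * price_per_1000 / 1000, 4)


small_run = estimate_cost_usd(3, 50)     # 150 results
large_run = estimate_cost_usd(10, 200)   # 2,000 results
```

Because `maxJobsPerCompany` caps the results per slug, the estimate is a ceiling: boards with fewer open jobs cost less.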

### FAQ

**Do I need an API key or proxy?**
No. The Greenhouse `boards-api.greenhouse.io` endpoint is publicly accessible — no auth, no proxy, no browser.

**What output formats are available?**
JSON, JSONL, CSV, and Excel via the Apify dataset export, plus the Apify REST API.

**Can I run it on a schedule?**
Yes — use Apify Scheduler. Combine with your own dedup on `global_id` (`greenhouse:<slug>:<job_id>`) to track only new postings.

**What if a company slug is wrong or returns 0 jobs?**
The Actor logs a warning, records the slug in `failedCompanies`, and continues scraping the other targets. It never crashes on a single bad slug.
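
For scheduled runs, the dedup on `global_id` mentioned above could look like the sketch below. It assumes you persist the set of already-seen IDs yourself (for example in a file or a database) between runs; the function name is hypothetical.

```python
def select_new_jobs(items, seen_ids):
    """Return (new_items, updated_seen_ids) without mutating the inputs.

    items: dataset records from the current run.
    seen_ids: global_id values collected from earlier runs.
    """
    updated = set(seen_ids)
    new_items = []
    for item in items:
        global_id = item.get("global_id")
        if global_id and global_id not in updated:
            updated.add(global_id)
            new_items.append(item)
    return new_items, updated
```

Store the returned set after each run and pass it back in on the next one, so only postings that appeared since the last run reach your downstream pipeline.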

### Reliability

The Greenhouse boards API is a stable, officially public endpoint with no
authentication, no rate limiting, and no HTML parsing. This Actor uses no
browser and no proxy, only direct HTTP requests to an official JSON API. Parse
confidence is consistently 1.0 on well-populated boards.

Typical run: 500 jobs for one company in under 5 seconds.
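
To illustrate what "direct HTTP to an official JSON API" means, here is a rough sketch of the kind of request involved. The `/v1/boards/{slug}/jobs` path and the `content=true` parameter come from Greenhouse's public Job Board API rather than from this README, and the partial field mapping in `flatten_job` is an assumption, not the Actor's actual code.

```python
import json
import urllib.parse
import urllib.request

API_ROOT = "https://boards-api.greenhouse.io/v1/boards"


def jobs_url(slug: str, include_content: bool = True) -> str:
    """Build the job-list URL for one board (path assumed from Greenhouse's public API)."""
    url = f"{API_ROOT}/{urllib.parse.quote(slug, safe='')}/jobs"
    return url + "?content=true" if include_content else url


def fetch_jobs(slug: str, timeout: float = 30.0) -> list:
    """Fetch the raw job list for one board. No auth header is sent."""
    with urllib.request.urlopen(jobs_url(slug), timeout=timeout) as response:
        return json.load(response).get("jobs", [])


def flatten_job(raw: dict, slug: str) -> dict:
    """Map a raw job to a few of the flat fields shown in this README (partial, assumed)."""
    job_id = str(raw.get("id", ""))
    return {
        "title": raw.get("title"),
        "company": slug,
        "location": (raw.get("location") or {}).get("name"),
        "url": raw.get("absolute_url"),
        "job_id": job_id,
        "global_id": f"greenhouse:{slug}:{job_id}",
    }
```

The Actor adds enrichment (seniority, remote type, confidence scoring) on top of this kind of raw response, which the sketch leaves out.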

### Use cases

- **Job aggregation** — build a searchable careers board for tech companies.
- **Hiring intelligence** — track headcount growth, new departments, role types.
- **AI agent input** — feed job data to an LLM agent for matching, summarizing
  or alerting.
- **Lead generation** — identify companies actively hiring in specific functions.
- **Competitive analysis** — monitor competitor hiring signals in real time.
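
As a small example of the hiring-intelligence use case, open roles can be counted per company and department from the dataset items. This is a sketch over the documented `company` and `department` fields; the `"unknown"` bucket for missing departments is an assumption.

```python
from collections import Counter


def open_roles_by_department(items):
    """Count records per (company, department) pair."""
    counts = Counter()
    for item in items:
        department = item.get("department") or "unknown"
        counts[(item.get("company"), department)] += 1
    return counts
```

Comparing these counts between two scheduled runs gives a simple signal of which departments are growing.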

### Alternatives and comparison

Several Greenhouse scrapers exist on Apify Store. This one differentiates by:

| Feature | This scraper | greenhouse-job-scraper (epctex) | i-scraper |
|---|---|---|---|
| `parse_confidence` score | Yes | No | No |
| Seniority enrichment | Yes (11 levels) | No | Partial |
| Remote type detection | Yes | Rarely | Rarely |
| Multi-company batch | Yes | Limited | Yes |
| Descriptions (text + HTML) | Yes | Some | Some |
| `global_id` for dedup | Yes | No | No |
| Price | $1.50/1k | varies | $1.90/1k |

`parse_confidence` (0.0–1.0) and `warnings` are in every record — deductions for missing job\_id (0.15), title (0.15), url (0.10), posted\_at (0.05), description (0.05). Your pipeline sees data quality signals before it sees broken output.
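
The deduction scheme can be sketched as follows. The penalty values are the ones listed above; subtracting them from 1.0, mapping "description" to `description_text`, and the `missing_<field>` warning codes are assumptions for illustration, since the Actor's real reason codes are not listed here.

```python
# Penalties as listed above. The "description" penalty is applied to
# description_text here, which is an assumption.
DEDUCTIONS = {
    "job_id": 0.15,
    "title": 0.15,
    "url": 0.10,
    "posted_at": 0.05,
    "description_text": 0.05,
}


def score_record(record: dict):
    """Return (parse_confidence, warnings) for one flat job record."""
    score = 1.0
    warnings = []
    for field, penalty in DEDUCTIONS.items():
        if not record.get(field):
            score -= penalty
            warnings.append(f"missing_{field}")  # hypothetical reason code
    return round(score, 2), warnings
```

On the consumer side, a pipeline might keep only records whose `parse_confidence` is at or above a chosen threshold and route the rest to manual review.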

### Use with AI agents (MCP)

This Actor is MCP-compatible. Use it as a data source in n8n, Make, or any LLM agent pipeline.

```
https://mcp.apify.com/?tools=bovi/greenhouse-job-scraper
```

The `global_id` field (`greenhouse:<slug>:<job_id>`) is a stable cross-run key. Feed this actor's output directly to vector databases, job-matching LLMs, or n8n/Make workflows. Use with the flagship [Greenhouse, Lever & Ashby Jobs API](https://apify.com/bovi/greenhouse-lever-ashby-job-scraper) to cover all ATS platforms in one pipeline.

### Integrations

Built for hiring-intel and labor-market analysts tracking open roles across Greenhouse-powered companies at scale — the JSON/dataset output drops into the tools you already run, no glue code:

- **n8n / Make / Zapier** — trigger a run or pipe every new dataset item into 500+ apps (Google Sheets, Airtable, Slack, HubSpot, your database) with no code: [n8n](https://docs.apify.com/platform/integrations/n8n), [Make](https://docs.apify.com/platform/integrations/make), [Zapier](https://docs.apify.com/platform/integrations/zapier).
- **Webhooks** — fire your own endpoint the moment a run finishes, to push results straight into your pipeline ([docs](https://docs.apify.com/platform/integrations/webhooks)).
- **MCP server** — expose this actor as a tool to Claude, Cursor, or any [MCP client](https://mcp.apify.com) so an AI agent can pull this data mid-conversation ([guide](https://blog.apify.com/how-to-use-mcp/)).
- **API & SDKs** — fetch the dataset as JSON, CSV, or Excel through the Apify REST API or the Python / JS SDKs.

See all [Apify integrations](https://apify.com/integrations).

### Not affiliated with Greenhouse

This Actor uses the public Greenhouse Boards API, which is freely accessible
to anyone. It is not affiliated with, endorsed by, or partnered with Greenhouse.

# Actor input Schema

## `companies` (type: `array`):

Greenhouse company slugs to scrape. The slug is what appears after boards.greenhouse.io/ in the careers URL. Examples: "stripe", "airbnb", "anthropic". Each item can be a plain string ("stripe") or an object ({"slug": "stripe"}).
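
If you start from careers URLs rather than slugs, a client-side helper along these lines can normalize them before building the input. It is a sketch: the helper name and the allowed slug characters are assumptions, and it runs on your side rather than inside the Actor.

```python
import re

MARKER = "boards.greenhouse.io/"


def company_slug(item) -> str:
    """Accept a plain slug, a {"slug": ...} object, or a boards.greenhouse.io URL."""
    if isinstance(item, dict):
        item = item.get("slug", "")
    text = str(item).strip()
    if MARKER in text:
        text = text.split(MARKER, 1)[1]
    # Drop any trailing path, query string, or fragment.
    for separator in ("/", "?", "#"):
        text = text.split(separator, 1)[0]
    # Assumed slug alphabet: letters, digits, underscore, hyphen.
    if not re.fullmatch(r"[A-Za-z0-9_-]+", text):
        raise ValueError(f"cannot derive a Greenhouse slug from {item!r}")
    return text
```
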

## `titleKeyword` (type: `string`):

Keep only jobs whose title contains this text (case-insensitive). Example: "engineer". Leave blank to return all jobs.

## `locationKeyword` (type: `string`):

Keep only jobs whose location contains this text (case-insensitive). Example: "San Francisco". Leave blank for all locations.

## `remoteOnly` (type: `boolean`):

When enabled, only jobs detected as fully remote are returned (based on location text containing "remote").
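
The three filters described above can be approximated on the client side as shown below, for instance to re-filter an existing dataset without a new run. This is a sketch built only from the descriptions above (case-insensitive substring matches, remote detection from location text); the Actor's exact matching rules, including how it tells hybrid from remote, are assumptions here.

```python
def detect_remote_type(location):
    """Return 'hybrid', 'remote', or None based on the location text (assumed rules)."""
    text = (location or "").lower()
    if "hybrid" in text:
        return "hybrid"
    if "remote" in text:
        return "remote"
    return None


def keep_job(job, title_keyword="", location_keyword="", remote_only=False):
    """Apply the title, location, and remote-only filters to one flat record."""
    title = (job.get("title") or "").lower()
    location = (job.get("location") or "").lower()
    if title_keyword.lower() not in title:
        return False
    if location_keyword.lower() not in location:
        return False
    if remote_only and detect_remote_type(job.get("location")) != "remote":
        return False
    return True
```

An empty keyword matches every job, which mirrors the "leave blank" behaviour described for `titleKeyword` and `locationKeyword`.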

## `maxJobsPerCompany` (type: `integer`):

Cap on jobs pushed per company slug after filtering. Default 50 keeps trial runs cheap. Set to 0 for unlimited.

## `includeDescriptions` (type: `boolean`):

Return full job description as plain text and HTML. Greenhouse includes descriptions in the list response so this adds no extra API calls — it only controls whether the fields are in your output.

## Actor input object example

```json
{
  "companies": [
    "stripe",
    "airbnb",
    "notion"
  ],
  "remoteOnly": false,
  "maxJobsPerCompany": 50,
  "includeDescriptions": true
}
```
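
When building this input programmatically, a thin helper can apply the documented defaults and catch obvious mistakes before a paid run starts. This is a sketch: the helper and its validation rules are hypothetical, and only the field names and defaults come from the schema above.

```python
def build_run_input(companies, title_keyword=None, location_keyword=None,
                    remote_only=False, max_jobs_per_company=50,
                    include_descriptions=True):
    """Build an input object using the field names and defaults from the schema above."""
    if not companies:
        raise ValueError("companies must contain at least one slug")
    if max_jobs_per_company < 0:
        raise ValueError("maxJobsPerCompany must be 0 (unlimited) or a positive integer")
    run_input = {
        "companies": list(companies),
        "remoteOnly": remote_only,
        "maxJobsPerCompany": max_jobs_per_company,
        "includeDescriptions": include_descriptions,
    }
    # Optional filters are left out entirely when blank.
    if title_keyword:
        run_input["titleKeyword"] = title_keyword
    if location_keyword:
        run_input["locationKeyword"] = location_keyword
    return run_input
```

The returned dictionary can be passed as `run_input` to the Python client call shown in the API section.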

# Actor output Schema

## `results` (type: `string`):

Dataset containing Greenhouse Job Scraper records (title, company, location, remote\_type, seniority, department, posted\_at, url, parse\_confidence, apply\_url, global\_id).

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "companies": [
        "stripe",
        "airbnb",
        "notion"
    ]
};

// Run the Actor and wait for it to finish
const run = await client.actor("bovi/greenhouse-job-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "companies": [
        "stripe",
        "airbnb",
        "notion",
    ],
}

# Run the Actor and wait for it to finish
run = client.actor("bovi/greenhouse-job-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "companies": [
    "stripe",
    "airbnb",
    "notion"
  ]
}' |
apify call bovi/greenhouse-job-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=bovi/greenhouse-job-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Greenhouse Job Scraper — Stripe, Airbnb & 10K+ Companies",
        "description": "Scrape job postings from any Greenhouse-powered company board via the public boards-api. Get title, location, department, seniority, remote-type, descriptions and parse_confidence. Multi-company batch, keyword filters, zero auth, zero proxy.",
        "version": "0.1",
        "x-build-id": "fqHEiD8PhB9Vo11Kp"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/bovi~greenhouse-job-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-bovi-greenhouse-job-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/bovi~greenhouse-job-scraper/runs": {
            "post": {
                "operationId": "runs-sync-bovi-greenhouse-job-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/bovi~greenhouse-job-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-bovi-greenhouse-job-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "properties": {
                    "companies": {
                        "title": "Company slugs",
                        "type": "array",
                        "description": "Greenhouse company slugs to scrape. The slug is what appears after boards.greenhouse.io/ in the careers URL. Examples: \"stripe\", \"airbnb\", \"anthropic\". Each item can be a plain string (\"stripe\") or an object ({\"slug\": \"stripe\"}).",
                        "items": {
                            "type": "string"
                        }
                    },
                    "titleKeyword": {
                        "title": "Title keyword filter",
                        "type": "string",
                        "description": "Keep only jobs whose title contains this text (case-insensitive). Example: \"engineer\". Leave blank to return all jobs."
                    },
                    "locationKeyword": {
                        "title": "Location keyword filter",
                        "type": "string",
                        "description": "Keep only jobs whose location contains this text (case-insensitive). Example: \"San Francisco\". Leave blank for all locations."
                    },
                    "remoteOnly": {
                        "title": "Remote only",
                        "type": "boolean",
                        "description": "When enabled, only jobs detected as fully remote are returned (based on location text containing \"remote\").",
                        "default": false
                    },
                    "maxJobsPerCompany": {
                        "title": "Max jobs per company",
                        "minimum": 0,
                        "type": "integer",
                        "description": "Cap on jobs pushed per company slug after filtering. Default 50 keeps trial runs cheap. Set to 0 for unlimited.",
                        "default": 50
                    },
                    "includeDescriptions": {
                        "title": "Include job descriptions",
                        "type": "boolean",
                        "description": "Return full job description as plain text and HTML. Greenhouse includes descriptions in the list response so this adds no extra API calls — it only controls whether the fields are in your output.",
                        "default": true
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
