# Greenhouse Job Board API — Jobs, Departments & Offices (`logiover/greenhouse-job-board-scraper`) Actor

Unofficial Greenhouse Job Board API in one Apify actor. Scrape jobs, full descriptions, departments and offices from any company on Greenhouse — Airbnb, Stripe, Anthropic, Mistral AI, Doctolib, Datadog, Notion. Pure HTTP, no auth, parallel batch. For HR tech, ATS, lead gen and AI agents.

- **URL**: https://apify.com/logiover/greenhouse-job-board-scraper.md
- **Developed by:** [Logiover](https://apify.com/logiover) (community)
- **Categories:** Jobs, Lead generation
- **Stats:** 2 total users, 1 monthly user, 92.9% of runs succeeded
- **User rating**: No ratings yet

## Pricing

from $1.50 / 1,000 results

This Actor is paid per event: you pay a fixed price for specific events rather than for Apify platform usage.
Since this Actor supports Apify Store discounts, the higher your subscription plan, the lower the price.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event

## What's an Apify Actor?

Actors are software tools running on the Apify platform, covering all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action that can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in the key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, an API, or an MCP server.
"Actor" is written with a capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# MacOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## Greenhouse Job Board API — Career Pages Scraper

> **The unofficial Greenhouse Job Board API in a single Apify actor.** Scrape every open job, full description, department and office from any company hosted on Greenhouse — Airbnb, Stripe, Anthropic, Mistral AI, Doctolib, Datadog, Cohere, Plaid, Ramp, Retool, HuggingFace and **thousands more** — all with one input. Pure HTTP, no login, no captcha, no anti-bot, no scraping war.

[![Apify Actor](https://img.shields.io/badge/Apify-Actor-blue)](https://apify.com) [![Greenhouse API](https://img.shields.io/badge/Greenhouse-Job%20Board%20API-22a06b)](https://developers.greenhouse.io/job-board.html) [![Pure HTTP](https://img.shields.io/badge/runtime-pure%20HTTP-success)](https://github.com)

**Most ATS scrapers fight cookie banners, JavaScript renders, rotating IPs and Cloudflare.** This one doesn't need to — Greenhouse publishes a clean, public, **rate-limit-free** REST API for every company's job board. We wrap it, normalize it, give it sensible filters, and ship it as an Apify actor you can call from Make, n8n, Zapier, your CRM, your AI agent or your data warehouse.

---

### Why this actor exists — and why it's the highest-leverage ATS scraper you'll buy

Greenhouse powers career pages for **5,000+ tech companies**. One actor, one input array, you cover them all. Here's the asymmetry:

| | Typical company-by-company scraper | **Greenhouse Job Board API actor** |
|---|---|---|
| Coverage per actor | 1 company | Every Greenhouse customer (Airbnb, Stripe, Anthropic, Mistral AI, Doctolib, …) |
| Authentication | OAuth / login / cookies | None — fully public |
| Anti-bot | Captcha, Cloudflare, fingerprinting | None — first-party API |
| Rate limits | Frequent | None documented |
| Job description | Often partial / paywalled | Full HTML, decoded |
| Salary ranges | Rarely exposed | Available via `pay_input_ranges=true` |
| Departments & offices | Scraped from sidebars | Returned as structured trees |
| Custom fields (employment type, visa, …) | Lost in scraping | Structured `metadata` objects |
| GDPR / compliance flags | Lost | Returned as `data_compliance` |
| Application questions | Rarely captured | Full `questions` array on jobDetail |

**Real numbers from a single run:** 3 boards (anthropic, mistralai, ramp) → 200+ open jobs, full HTML descriptions, sub-2-second cold start, zero retries.

---

### What it does

Five modes, one actor. Pick the mode that matches your use case:

#### 1. `jobs` — list every open job for one or more boards (the default)

The bread-and-butter mode. Send a list of board tokens, get every open job with descriptions, departments, offices, custom metadata, and compliance flags. Filter client-side by department, office, location, title keyword or language.

```json
{
  "mode": "jobs",
  "boardTokens": ["anthropic", "mistralai", "ramp"],
  "fullContent": true,
  "filterDepartments": ["Engineering", "Research"],
  "filterLocations": ["San Francisco", "London", "Paris", "Remote"]
}
```

#### 2. `jobDetail` — rich detail for specific job IDs

When you already have a list of IDs (from `jobs` mode or your own database) and need the application questions, EEOC compliance fields, demographic questions or pay ranges.

```json
{
  "mode": "jobDetail",
  "boardTokens": ["anthropic"],
  "jobIds": ["5987708004", "5987708005"],
  "includeQuestions": true,
  "includePayRanges": true
}
```

#### 3. `board` — company-level board profile

The board's name and welcome content (the introduction text shown at `boards.greenhouse.io/{token}`). Useful for company-intelligence aggregations.

```json
{ "mode": "board", "boardTokens": ["anthropic"] }
```

#### 4. `departments` — the company's department tree

Every department with its parent/child relationships and the jobs nested under it. Perfect for building department-level dashboards or department-filtered hiring trend reports.

```json
{ "mode": "departments", "boardTokens": ["stripe", "datadog"] }
```

#### 5. `offices` — the company's office tree

Geographic hierarchy: continent → country → city → site, with the departments and jobs nested in each. Use this to map where a company is hiring globally.

```json
{ "mode": "offices", "boardTokens": ["airbnb"] }
```

***

### Who this is for

If you build, sell to, or operate any of these — this actor saves you weeks of scraping engineering:

- **HR tech & job aggregators** — power your meta-search with comprehensive, near-real-time tech job inventory.
- **Sales intelligence & lead generation** — "hiring signals" are the strongest growth proxy in B2B sales. A spike in Engineering hires at a startup → that startup is your lead.
- **Recruitment agencies** — track which companies open which roles in which cities, with what salary bands, before your competitors notice.
- **VC / scout tools** — hiring velocity is leading-indicator data for portfolio monitoring. This actor gives you the raw signal at scale.
- **Compensation intelligence platforms** — combine `pay_input_ranges` data with job titles and locations to build salary benchmarks.
- **ATS & HRIS integrations** — sync Greenhouse-hosted jobs into your platform without dealing with OAuth or per-customer onboarding.
- **AI agents & LLM apps** — feed structured job data into your assistant for career-coaching, comp-research or company-research workflows.
- **Career sites & newsletters** — power your own niche job board (tech, fintech, climate, AI…) with up-to-date inventory from the companies that matter.

***

### Companies on Greenhouse you can scrape today

This is a small, **non-exhaustive** selection — Greenhouse hosts thousands of companies. Take the URL slug after `boards.greenhouse.io/` and pop it into `boardTokens`.

| Category | Sample board tokens |
|---|---|
| **AI / Foundation models** | `anthropic`, `mistralai`, `cohere`, `huggingface`, `runwayml`, `mosaicml`, `character` |
| **Big tech / Travel / Marketplaces** | `airbnb`, `dropbox`, `pinterest`, `lyft`, `instacart`, `doordash`, `glovo`, `blockchain`, `affirm` |
| **Fintech / Payments / Banking** | `stripe`, `plaid`, `ramp`, `brex`, `mercury`, `klarna`, `wise`, `revolut`, `nubank`, `flexport` |
| **Infrastructure / DevTools** | `datadog`, `gitlab`, `retool`, `hashicorp`, `airbyte`, `confluent`, `fastly`, `cloudflare` |
| **SaaS / Productivity / Collaboration** | `notion`, `figma`, `linear`, `vercel`, `posthog`, `loom`, `front`, `airtable` |
| **HealthTech** | `doctolib`, `oscar`, `cohere-health`, `included-health` |
| **E-commerce / Marketplaces** | `backmarket`, `vinted`, `etsy`, `chewy`, `whoop` |
| **EU / French tech ecosystem** | `mistralai`, `doctolib`, `backmarket`, `qonto`, `swile`, `payfit`, `algolia` |
| **Crypto / Web3** | `coinbase`, `kraken`, `chainalysis` |

> **How to find the token for a company you care about:** open the company's careers page. If you land on a URL like `boards.greenhouse.io/<slug>` or `<company>.com/careers` that embeds a Greenhouse iframe, view source and look for `boards-api.greenhouse.io/v1/boards/<slug>/jobs`. The `<slug>` is your token.
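
To sanity-check a token before a run, you can probe the same public endpoint yourself. A minimal sketch using only the standard library — the helper names are ours, the endpoint is Greenhouse's documented public jobs listing:

```python
import urllib.request
import urllib.error

def board_jobs_url(token: str) -> str:
    """Public, unauthenticated jobs endpoint for a Greenhouse board token."""
    return f"https://boards-api.greenhouse.io/v1/boards/{token}/jobs"

def board_token_exists(token: str) -> bool:
    """True if the board responds with HTTP 200, False on 404."""
    try:
        with urllib.request.urlopen(board_jobs_url(token), timeout=10) as resp:
            return resp.status == 200
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return False
        raise

# board_token_exists("anthropic") returns True for any live public board
```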

***

### Quick start

#### Run from the Apify console

1. Open the actor's page in the [Apify Store](https://apify.com/store).
2. Paste the JSON below into the **Input** tab.
3. Hit **Start**.
4. Open the **Dataset** tab when it finishes. Export to CSV, JSON, Excel or hit the **API** tab for an integration URL.

```json
{
  "mode": "jobs",
  "boardTokens": ["anthropic", "mistralai", "ramp"],
  "fullContent": true,
  "stripHtml": true,
  "filterDepartments": ["Engineering", "Research"],
  "concurrency": 5
}
```

#### Run from the API

Use the standard Apify run-sync-get-dataset-items endpoint:

```bash
curl -X POST "https://api.apify.com/v2/acts/<USERNAME>~greenhouse-job-board-scraper/run-sync-get-dataset-items?token=$APIFY_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "mode": "jobs",
    "boardTokens": ["anthropic", "mistralai"],
    "fullContent": true
  }'
```

#### Run from a workflow tool (Make, Zapier, n8n)

Every workflow tool that supports Apify or HTTP can call this actor. Wire it to a daily schedule, push results into Airtable / Sheets / Postgres / S3, downstream-filter as needed.

***

### Input reference

| Field | Type | Default | Description |
|---|---|---|---|
| `mode` | string | `jobs` | One of `jobs`, `jobDetail`, `board`, `departments`, `offices`. |
| `boardTokens` | string\[] | – | List of board slugs. Required for every mode. |
| `jobIds` | string\[] | `[]` | Numeric job IDs (jobDetail mode only). Scoped to the first `boardTokens` entry. |
| `fullContent` | boolean | `true` | jobs mode — append `?content=true` for full descriptions, departments, offices. |
| `includeQuestions` | boolean | `false` | jobDetail mode — include application/EEOC/demographic questions. |
| `includePayRanges` | boolean | `false` | jobDetail mode — include `pay_input_ranges` array. |
| `decodeContent` | boolean | `true` | Decode HTML entities in `content` (server returns `&lt;p&gt;`-encoded). |
| `stripHtml` | boolean | `false` | Also produce a `contentText` plain-text field. |
| `filterDepartments` | string\[] | `[]` | Substring (case-insensitive) match against department names. |
| `filterOffices` | string\[] | `[]` | Substring match against office names. |
| `filterLocations` | string\[] | `[]` | Substring match against the job's `location.name`. |
| `filterTitleKeywords` | string\[] | `[]` | Substring match against job titles. |
| `filterLanguages` | string\[] | `[]` | Match the job's `language` field (e.g. `en`, `fr`, `de`). |
| `maxResultsPerBoard` | integer | `0` | Cap per board (0 = no cap). |
| `concurrency` | integer | `5` | Parallel board fetches (1–20). |

All filters are applied **client-side** because the Greenhouse public API does not accept filter query parameters — there is no server-side filter primitive. The actor fetches the full list (one request per board) and applies filters before pushing to the dataset.
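
The filter semantics — case-insensitive substring, any matching term keeps the job, empty list means no filter — can be sketched in a few lines. The function name and sample jobs below are illustrative, not the actor's internals:

```python
def matches_any(value: str, terms: list[str]) -> bool:
    """Empty terms list = no filter; otherwise any case-insensitive substring match passes."""
    if not terms:
        return True
    value_lower = value.lower()
    return any(t.lower() in value_lower for t in terms)

jobs = [
    {"title": "Senior Platform Engineer", "locationName": "Remote - USA"},
    {"title": "Account Executive", "locationName": "New York"},
]
kept = [
    j for j in jobs
    if matches_any(j["title"], ["senior", "staff"])       # filterTitleKeywords
    and matches_any(j["locationName"], ["Remote"])        # filterLocations
]
# kept contains only the Senior Platform Engineer role
```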

***

### Output reference (jobs mode)

A real example item from a live run against `datsolutions` (DAT Freight & Analytics):

```json
{
  "_mode": "jobs",
  "boardToken": "datsolutions",
  "id": 5987708004,
  "internalJobId": 5149982004,
  "title": "Account Executive - Enterprise Broker Automation",
  "companyName": "DAT",
  "requisitionId": "1445",
  "location": { "name": "Remote - USA" },
  "locationName": "Remote - USA",
  "absoluteUrl": "https://careers.dat.com/jobs/5987708004?gh_jid=5987708004",
  "language": "en",
  "updatedAt": "2026-05-08T19:19:46-04:00",
  "firstPublished": "2026-05-08T19:19:46-04:00",
  "applicationDeadline": null,
  "content": "<p><strong>About DAT</strong></p><p>DAT Freight & Analytics is…",
  "contentText": "About DAT DAT Freight & Analytics is an award-winning employer…",
  "departments": [],
  "departmentNames": [],
  "offices": [],
  "officeNames": [],
  "metadata": [
    { "id": 4200041004, "name": "Employment Type",      "value": "Regular",   "value_type": "single_select" },
    { "id": 4209572004, "name": "Full-time/ Part-time", "value": "Full-time", "value_type": "single_select" }
  ],
  "metadataMap": {
    "Employment Type": "Regular",
    "Full-time/ Part-time": "Full-time"
  },
  "dataCompliance": [
    { "type": "gdpr", "requires_consent": false, "requires_processing_consent": false, "requires_retention_consent": false, "retention_period": null, "demographic_data_consent_applies": false }
  ],
  "scrapedAt": "2026-05-13T11:00:00.000Z"
}
```

#### Field guide

- **`id` vs `internalJobId`** — `id` is the public job-post ID (what the URL uses, what you POST applications to). `internalJobId` is the underlying job in Greenhouse, useful for cross-referencing with the Harvest API. Prospect posts have `internalJobId: null`.
- **`content` is decoded HTML by default.** Set `decodeContent: false` to keep the raw `&lt;p&gt;`-style server response. Set `stripHtml: true` to get a plain-text `contentText` for AI embeddings or CSV.
- **`metadata` is a structured array, not a string.** Greenhouse exposes custom job fields here (Employment Type, Full-time/Part-time, visa sponsorship, …). The actor builds a `metadataMap` so you can read it like a hash.
- **`dataCompliance` reveals GDPR rules** the employer has configured for the post (consent requirements, retention period, etc.).
- **`departments` and `offices` are full structured objects**, not just names. Each has `id`, `name`, `parent_id`, `child_ids` so you can reconstruct hierarchies.
- **All timestamps are ISO 8601** with the employer's local offset preserved (e.g. `2026-05-08T19:19:46-04:00`).
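
Reconstructing a hierarchy from those flat `id` / `parent_id` / `child_ids` objects is a single dictionary pass. A minimal sketch with illustrative sample data (not from a live board):

```python
departments = [
    {"id": 1, "name": "Engineering", "parent_id": None, "child_ids": [2, 3]},
    {"id": 2, "name": "Platform", "parent_id": 1, "child_ids": []},
    {"id": 3, "name": "Security", "parent_id": 1, "child_ids": []},
]

# Index every node, give each a mutable children list, then attach to parents.
by_id = {d["id"]: {**d, "children": []} for d in departments}
roots = []
for node in by_id.values():
    parent = by_id.get(node["parent_id"])
    (parent["children"] if parent else roots).append(node)

# roots now holds the top-level departments with nested children
```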

#### jobDetail mode adds

When you call `mode: "jobDetail"` with `includeQuestions: true` and/or `includePayRanges: true`:

```json
{
  "questions": [
    {
      "fields": [
        { "name": "resume", "type": "input_file", "values": [] }
      ],
      "label": "Resume/CV",
      "required": true,
      "description_preface": null,
      "description": null
    }
  ],
  "payInputRanges": [
    {
      "min_cents": 18000000,
      "max_cents": 26000000,
      "currency_type": "USD",
      "interval": "year",
      "applicable_to_remote_locations": "USA only"
    }
  ],
  "locationQuestions": [/* … */],
  "compliance": [/* EEOC questions when enabled */],
  "demographicQuestions": { /* Greenhouse Inclusion data */ }
}
```

***

### Real-world recipes

#### Recipe 1 — Daily AI-startup hiring digest

Track AI labs' Engineering and Research hires. Schedule daily, pipe results into Slack or email.

```json
{
  "mode": "jobs",
  "boardTokens": ["anthropic", "mistralai", "cohere", "huggingface", "character"],
  "fullContent": true,
  "filterDepartments": ["Engineering", "Research", "ML", "Applied"],
  "stripHtml": true,
  "concurrency": 5
}
```

#### Recipe 2 — French tech ecosystem (Paris + Remote)

Map open tech jobs in the French scaleup ecosystem.

```json
{
  "mode": "jobs",
  "boardTokens": ["mistralai", "doctolib", "backmarket", "qonto", "swile", "payfit", "algolia"],
  "filterLocations": ["Paris", "Remote", "France"],
  "stripHtml": true
}
```

#### Recipe 3 — Senior+ infrastructure roles across DevTools

Hunt senior infra/SRE/platform roles across DevTools companies.

```json
{
  "mode": "jobs",
  "boardTokens": ["datadog", "gitlab", "hashicorp", "fastly", "airbyte", "confluent", "retool"],
  "filterTitleKeywords": ["senior", "staff", "principal", "platform", "infrastructure", "SRE"],
  "stripHtml": true
}
```

#### Recipe 4 — Build a salary-intelligence dataset

Pull pay-range data for all jobs at companies that publish it. Two-step: collect IDs in jobs mode, then re-run jobDetail on those IDs.

```json
{
  "mode": "jobs",
  "boardTokens": ["ramp", "brex", "mercury", "plaid", "stripe"],
  "filterLocations": ["United States", "New York", "San Francisco", "Remote"],
  "fullContent": false
}
```

Then for each `id` you collect:

```json
{
  "mode": "jobDetail",
  "boardTokens": ["ramp"],
  "jobIds": ["[id1]", "[id2]", "..."],
  "includePayRanges": true
}
```
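
The two-step recipe can be wired up with the official `apify-client` library. A sketch under stated assumptions — `ACTOR_ID`, the helper names and the batch structure are our choices; the input fields match the reference above:

```python
ACTOR_ID = "logiover/greenhouse-job-board-scraper"

def collect_job_ids(client, board: str) -> list[str]:
    """Step 1: lightweight jobs listing (no full content), IDs only."""
    run = client.actor(ACTOR_ID).call(run_input={
        "mode": "jobs", "boardTokens": [board], "fullContent": False,
    })
    return [str(item["id"])
            for item in client.dataset(run["defaultDatasetId"]).iterate_items()]

def fetch_pay_ranges(client, board: str, job_ids: list[str]) -> list[dict]:
    """Step 2: jobDetail pass with pay ranges (scoped to a single board)."""
    run = client.actor(ACTOR_ID).call(run_input={
        "mode": "jobDetail", "boardTokens": [board],
        "jobIds": job_ids, "includePayRanges": True,
    })
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())

# Usage (requires an Apify token):
#   from apify_client import ApifyClient
#   client = ApifyClient("<YOUR_API_TOKEN>")
#   ids = collect_job_ids(client, "ramp")
#   rows = fetch_pay_ranges(client, "ramp", ids)
```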

#### Recipe 5 — Company-level org snapshot

How is a target company organized? Pull the office and department tree side-by-side.

```json
{ "mode": "offices",     "boardTokens": ["airbnb"] }
```

```json
{ "mode": "departments", "boardTokens": ["airbnb"] }
```

***

### Performance characteristics

- **Cold start**: ~1.5–3 seconds (Apify container boot + first request).
- **Per-board fetch**: typically 100–500 ms for the jobs endpoint, depending on board size. Greenhouse caches aggressively.
- **No rate limits documented.** That said, the actor defaults to concurrency 5 to stay polite. Raise to 20 if you're scraping dozens of boards in one run.
- **No pagination needed.** Greenhouse's jobs endpoint returns the full list in one response, no Link-header walking.
- **Retries:** 3 attempts with linear backoff (500 ms × attempt). 404s are not retried.
- **Filters are O(n) over the in-memory job list.** Even at 5,000 jobs per board, filtering completes in single-digit milliseconds.

***

### Cross-sell — pair with these actors

If you bought this actor because you needed European startup talent data, you almost certainly want one or more of:

- **Welcome to the Jungle Jobs Scraper** — French/EU tech-startup-focused job board, Algolia-backed. Complementary coverage to Greenhouse-hosted companies.
- **Apple App Store Data API** — when you're enriching company intelligence with their iOS app presence, ratings and privacy labels.
- **Google Play Data API** — same for Android.

Same architecture (pure HTTP, public APIs, sub-3-second cold start, batch + parallel), same monetization model, same author. Combine them in one workflow for a complete company-intelligence pipeline.

***

### FAQ

**Q: Do I need a Greenhouse API key?**\
A: No. The Job Board API GET endpoints are fully public. Only POST (application submission) requires Basic Auth, which this actor does **not** do — it's read-only.

**Q: How do I find a company's board token?**\
A: Open their careers page. If the URL is `boards.greenhouse.io/<slug>`, that's the token. If they embed a Greenhouse iframe on their own domain, view source and look for `boards-api.greenhouse.io/v1/boards/<slug>/...`. The slug is your token.

**Q: Why is the description HTML encoded by default?**\
A: That's how Greenhouse returns it (`&lt;p&gt;` instead of `<p>`). The actor decodes it once by default (`decodeContent: true`) so you get clean renderable HTML. Set the flag to false if you need the raw server response.

**Q: Why don't filters use the API directly?**\
A: Because the API doesn't accept filter query parameters — there's literally no `?department=...` or `?location=...` primitive. Every Greenhouse-scraping tool, including this one, fetches the full list and filters client-side. With Greenhouse's caching this is still fast.

**Q: Does this work on internal job boards?**\
A: No. Only public boards (the `https://boards.greenhouse.io/<token>` ones). Internal boards require Harvest API authentication, which is out of scope.

**Q: What about LinkedIn jobs / Indeed / Glassdoor?**\
A: Different actors (different sources, different anti-bot challenges). This one is laser-focused on Greenhouse because the API quality is unparalleled — clean, public, rate-limit-free.

**Q: Is there a salary field?**\
A: Sometimes. Look at:

- `metadata` / `metadataMap` — some employers publish salary as a custom metadata field.
- `payInputRanges` — populated when you call `jobDetail` with `includePayRanges: true` AND the employer has configured pay ranges in Greenhouse.
- The job's `content` HTML often contains compensation text in free form.

**Q: How fresh is the data?**\
A: Real-time. The Greenhouse Job Board API serves the current state of each board the moment you call it. There is no scraping lag.

**Q: Can I detect newly posted jobs since my last run?**\
A: Yes — diff against `id` or use `updatedAt` / `firstPublished`. The `firstPublished` field gives you a true "this is brand new" signal on boards that populate it.
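
That diff is a one-liner once you persist the previous run's IDs. A sketch with illustrative data:

```python
# IDs seen in the previous run (e.g. loaded from your database or a key-value store)
previous_ids = {5987708004, 5987708005}

latest_run = [
    {"id": 5987708005, "firstPublished": "2026-05-08T19:19:46-04:00"},
    {"id": 6000000001, "firstPublished": "2026-05-12T09:00:00-04:00"},
]

# Anything not seen before is new; firstPublished confirms freshness.
new_jobs = [j for j in latest_run if j["id"] not in previous_ids]
```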

**Q: What happens if a board token doesn't exist?**\
A: The actor logs the 404 as a per-board error and continues with the rest. It does not abort the whole run.

**Q: Why is `internalJobId` sometimes null?**\
A: For "prospect" posts (Greenhouse's term for general-interest landing pages that aren't tied to a specific requisition).

***

### Pricing & monetization

This actor is billed on a **pay-per-result** basis. Every job, board, department or office item written to the dataset counts as one result. Free runs included per Apify's standard policy.

Cost is dominated by storage + compute, not by the API call itself (Greenhouse is free). For very large boards (10,000+ jobs in a single run), set `maxResultsPerBoard` to cap the spend.

***

### Changelog

- **v1.0.0 (2026-05)** — Initial release.
  - 5 modes: jobs, jobDetail, board, departments, offices.
  - HTML-entity decoding with single-pass semantics.
  - Structured metadata + GDPR compliance fields.
  - Client-side filters on department, office, location, title, language.
  - Bounded-concurrency parallel fetching (1–20).
  - Retries with linear backoff, 404 short-circuit.
  - Live tested against `anthropic`, `mistralai`, `datsolutions`, `ramp`, `airbnb`.

***

### Support

- File an issue on the actor's Apify page (Issues tab).
- Apify docs: <https://docs.apify.com/>
- Greenhouse Job Board API reference: <https://developers.greenhouse.io/job-board.html>

***

### Legal

This actor consumes the **public** Greenhouse Job Board API. Greenhouse explicitly documents that the GET endpoints are publicly accessible without authentication and not rate-limited. Data published through this API is data that employers have chosen to make public on `boards.greenhouse.io`. Respect each employer's terms of use and your local data-protection laws when re-distributing scraped data.

This actor is not affiliated with, endorsed by, or sponsored by Greenhouse Software, Inc.

# Actor input Schema

## `mode` (type: `string`):

Which endpoint family to call. `jobs` returns the full job list for each board (most common). `jobDetail` returns rich detail (questions, pay ranges) for specific job IDs. `board` returns the board profile/description. `departments` and `offices` return the company hierarchy.

## `boardTokens` (type: `array`):

Greenhouse board tokens — the slug after boards.greenhouse.io/. For example for https://boards.greenhouse.io/airbnb the token is `airbnb`. Add as many as you want; the actor fetches them in parallel. Examples: airbnb, stripe, anthropic, mistralai, datadog, doctolib, cohere, glovo, ramp, retool.

## `jobIds` (type: `array`):

Numeric job IDs to fetch in `jobDetail` mode. Only the FIRST board token is used in this mode — IDs are scoped to a single board.

## `fullContent` (type: `boolean`):

When ON, jobs mode appends `?content=true` so each job includes the full HTML description, department list and office list. Significantly richer output. When OFF, only the lightweight summary is returned (faster, smaller). Default: ON.

## `includeQuestions` (type: `boolean`):

When ON in `jobDetail` mode, appends `?questions=true` so the response includes the application form questions, EEOC compliance fields, location questions and demographic questions defined for the job.

## `includePayRanges` (type: `boolean`):

When ON in `jobDetail` mode, appends `?pay_input_ranges=true` so the response includes the pay ranges configured for the post. Only populated if the employer publishes salary info.

## `decodeContent` (type: `boolean`):

Greenhouse stores descriptions as HTML-entity-encoded text (`&lt;p&gt;…` rather than `<p>…`). With this ON (default), entities are decoded once so the `content` field contains clean HTML you can render directly. Turn OFF to keep the raw server response untouched.

## `stripHtml` (type: `boolean`):

When ON, adds a `contentText` field with HTML tags stripped and whitespace collapsed. Useful for keyword search, AI embeddings, CSV exports.
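
What these two flags do can be approximated with the standard library (the actor's own implementation may differ in detail):

```python
import html
import re

# Entity-encoded description, the way the Greenhouse API returns it
raw = "&lt;p&gt;&lt;strong&gt;About DAT&lt;/strong&gt; Join us.&lt;/p&gt;"

decoded = html.unescape(raw)                 # decodeContent: entities -> renderable HTML
text = re.sub(r"<[^>]+>", " ", decoded)      # stripHtml: drop tags...
text = re.sub(r"\s+", " ", text).strip()     # ...and collapse whitespace

# decoded -> "<p><strong>About DAT</strong> Join us.</p>"
# text    -> "About DAT Join us."
```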

## `filterDepartments` (type: `array`):

Substring match (case-insensitive) against department names. Example: `Engineering`, `Product`. Empty = no filter. Filtering happens client-side because the Greenhouse public API does not accept query filters.

## `filterOffices` (type: `array`):

Substring match (case-insensitive) against office names. Example: `Paris`, `San Francisco`, `Remote`. Empty = no filter.

## `filterLocations` (type: `array`):

Substring match (case-insensitive) against the job's `location.name`. Each job has a single location string — this filters on that text. Example: `Remote - USA`, `London`. Empty = no filter.

## `filterTitleKeywords` (type: `array`):

Substring match (case-insensitive) against job titles. Example: `senior`, `manager`, `python`. Any match keeps the job. Empty = no filter.

## `filterLanguages` (type: `array`):

Match the job's `language` field (ISO 639-1 codes like `en`, `fr`, `de`). Useful when scraping multinational boards. Empty = no filter.

## `maxResultsPerBoard` (type: `integer`):

Cap the number of jobs saved per board token. The full list is still fetched (Greenhouse doesn't paginate) but only the first N items are pushed to the dataset. Set 0 for no cap.

## `concurrency` (type: `integer`):

How many board tokens to fetch in parallel. The Greenhouse API has no documented rate limit, but 5–10 is a polite default. Range 1–20.
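
Bounded parallelism of this kind can be sketched with an `asyncio` semaphore — a stand-in for the actor's fetch loop, where `fetch_board` is any async callable:

```python
import asyncio

async def fetch_all(tokens, fetch_board, concurrency: int = 5):
    """Fetch every board, with at most `concurrency` requests in flight."""
    sem = asyncio.Semaphore(concurrency)

    async def bounded(token):
        async with sem:
            return await fetch_board(token)

    # gather preserves input order regardless of completion order
    return await asyncio.gather(*(bounded(t) for t in tokens))

# Usage: results = asyncio.run(fetch_all(["airbnb", "stripe"], my_fetch))
```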

## Actor input object example

```json
{
  "mode": "jobs",
  "boardTokens": [
    "anthropic",
    "mistralai",
    "ramp"
  ],
  "jobIds": [],
  "fullContent": true,
  "includeQuestions": false,
  "includePayRanges": false,
  "decodeContent": true,
  "stripHtml": false,
  "filterDepartments": [],
  "filterOffices": [],
  "filterLocations": [],
  "filterTitleKeywords": [],
  "filterLanguages": [],
  "concurrency": 5
}
```

# Actor output Schema

## `_mode` (type: `string`):

jobs | jobDetail | board | departments | offices

## `boardToken` (type: `string`):

Greenhouse board slug (boards.greenhouse.io/<token>)

## `id` (type: `string`):

Numeric job-post ID

## `internalJobId` (type: `string`):

Greenhouse internal job ID (for Harvest API joins)

## `title` (type: `string`):

Job post title

## `companyName` (type: `string`):

Human-readable company name

## `requisitionId` (type: `string`):

HR requisition ID (employer-defined)

## `locationName` (type: `string`):

Job location (flattened from location.name)

## `absoluteUrl` (type: `string`):

Public-facing URL of the job post

## `language` (type: `string`):

ISO 639-1 language code of the post

## `departmentNames` (type: `array`):

Flattened department names

## `officeNames` (type: `array`):

Flattened office names

## `updatedAt` (type: `string`):

ISO 8601 last-update timestamp

## `firstPublished` (type: `string`):

ISO 8601 first-publish timestamp

## `applicationDeadline` (type: `string`):

Application deadline (ISO 8601 or null)

## `contentText` (type: `string`):

Plain-text job description (stripHtml mode)

## `scrapedAt` (type: `string`):

ISO 8601 timestamp of the scrape

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "boardTokens": [
        "anthropic",
        "mistralai",
        "ramp"
    ]
};

// Run the Actor and wait for it to finish
const run = await client.actor("logiover/greenhouse-job-board-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = { "boardTokens": [
        "anthropic",
        "mistralai",
        "ramp",
    ] }

# Run the Actor and wait for it to finish
run = client.actor("logiover/greenhouse-job-board-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "boardTokens": [
    "anthropic",
    "mistralai",
    "ramp"
  ]
}' |
apify call logiover/greenhouse-job-board-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=logiover/greenhouse-job-board-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Greenhouse Job Board API — Jobs, Departments & Offices",
        "description": "Unofficial Greenhouse Job Board API in one Apify actor. Scrape jobs, full descriptions, departments and offices from any company on Greenhouse — Airbnb, Stripe, Anthropic, Mistral AI, Doctolib, Datadog, Notion. Pure HTTP, no auth, parallel batch. For HR tech, ATS, lead gen and AI agents.",
        "version": "1.0",
        "x-build-id": "ey8CBfQvdMUlFZMUc"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/logiover~greenhouse-job-board-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-logiover-greenhouse-job-board-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/logiover~greenhouse-job-board-scraper/runs": {
            "post": {
                "operationId": "runs-sync-logiover-greenhouse-job-board-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/logiover~greenhouse-job-board-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-logiover-greenhouse-job-board-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "required": [
                    "mode"
                ],
                "properties": {
                    "mode": {
                        "title": "Mode",
                        "enum": [
                            "jobs",
                            "jobDetail",
                            "board",
                            "departments",
                            "offices"
                        ],
                        "type": "string",
                        "description": "Which endpoint family to call. `jobs` returns the full job list for each board (most common). `jobDetail` returns rich detail (questions, pay ranges) for specific job IDs. `board` returns the board profile/description. `departments` and `offices` return the company hierarchy.",
                        "default": "jobs"
                    },
                    "boardTokens": {
                        "title": "Board tokens",
                        "uniqueItems": true,
                        "type": "array",
                        "description": "Greenhouse board tokens — the slug after boards.greenhouse.io/. For example, for https://boards.greenhouse.io/airbnb the token is `airbnb`. Add as many as you want; the Actor fetches them in parallel. Examples: airbnb, stripe, anthropic, mistralai, datadog, doctolib, cohere, glovo, ramp, retool.",
                        "items": {
                            "type": "string"
                        }
                    },
                    "jobIds": {
                        "title": "Job IDs (jobDetail mode only)",
                        "type": "array",
                        "description": "Numeric job IDs to fetch in `jobDetail` mode. Only the FIRST board token is used in this mode — IDs are scoped to a single board.",
                        "default": [],
                        "items": {
                            "type": "string"
                        }
                    },
                    "fullContent": {
                        "title": "Include full description + departments + offices",
                        "type": "boolean",
                        "description": "When ON, jobs mode appends `?content=true` so each job includes the full HTML description, department list and office list. Significantly richer output. When OFF, only the lightweight summary is returned (faster, smaller). Default: ON.",
                        "default": true
                    },
                    "includeQuestions": {
                        "title": "Include application questions (jobDetail mode)",
                        "type": "boolean",
                        "description": "When ON in `jobDetail` mode, appends `?questions=true` so the response includes the application form questions, EEOC compliance fields, location questions and demographic questions defined for the job.",
                        "default": false
                    },
                    "includePayRanges": {
                        "title": "Include pay ranges (jobDetail mode)",
                        "type": "boolean",
                        "description": "When ON in `jobDetail` mode, appends `?pay_input_ranges=true` so the response includes the pay ranges configured for the post. Only populated if the employer publishes salary info.",
                        "default": false
                    },
                    "decodeContent": {
                        "title": "Decode HTML entities in content",
                        "type": "boolean",
                        "description": "Greenhouse stores descriptions as HTML-entity-encoded text (`&lt;p&gt;…` rather than `<p>…`). With this ON (default), entities are decoded once so the `content` field contains clean HTML you can render directly. Turn OFF to keep the raw server response untouched.",
                        "default": true
                    },
                    "stripHtml": {
                        "title": "Also produce plain-text content",
                        "type": "boolean",
                        "description": "When ON, adds a `contentText` field with HTML tags stripped and whitespace collapsed. Useful for keyword search, AI embeddings, CSV exports.",
                        "default": false
                    },
                    "filterDepartments": {
                        "title": "Filter by department names",
                        "type": "array",
                        "description": "Substring match (case-insensitive) against department names. Example: `Engineering`, `Product`. Empty = no filter. Filtering happens client-side because the Greenhouse public API does not accept query filters.",
                        "default": [],
                        "items": {
                            "type": "string"
                        }
                    },
                    "filterOffices": {
                        "title": "Filter by office names",
                        "type": "array",
                        "description": "Substring match (case-insensitive) against office names. Example: `Paris`, `San Francisco`, `Remote`. Empty = no filter.",
                        "default": [],
                        "items": {
                            "type": "string"
                        }
                    },
                    "filterLocations": {
                        "title": "Filter by job location",
                        "type": "array",
                        "description": "Substring match (case-insensitive) against the job's `location.name`. Each job has a single location string — this filters on that text. Example: `Remote - USA`, `London`. Empty = no filter.",
                        "default": [],
                        "items": {
                            "type": "string"
                        }
                    },
                    "filterTitleKeywords": {
                        "title": "Filter by title keywords",
                        "type": "array",
                        "description": "Substring match (case-insensitive) against job titles. Example: `senior`, `manager`, `python`. Any match keeps the job. Empty = no filter.",
                        "default": [],
                        "items": {
                            "type": "string"
                        }
                    },
                    "filterLanguages": {
                        "title": "Filter by language",
                        "type": "array",
                        "description": "Match the job's `language` field (ISO 639-1 codes like `en`, `fr`, `de`). Useful when scraping multinational boards. Empty = no filter.",
                        "default": [],
                        "items": {
                            "type": "string"
                        }
                    },
                    "maxResultsPerBoard": {
                        "title": "Max results per board",
                        "minimum": 0,
                        "maximum": 10000,
                        "type": "integer",
                        "description": "Cap the number of jobs saved per board token. The full list is still fetched (Greenhouse doesn't paginate) but only the first N items are pushed to the dataset. Set 0 for no cap."
                    },
                    "concurrency": {
                        "title": "Parallel fetches",
                        "minimum": 1,
                        "maximum": 20,
                        "type": "integer",
                        "description": "How many board tokens to fetch in parallel. The Greenhouse API has no documented rate limit, but 5–10 is a polite default. Range 1–20.",
                        "default": 5
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
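The filter fields in the input schema (`filterDepartments`, `filterOffices`, `filterTitleKeywords`, and so on) are documented as case-insensitive substring matches applied client-side, with an empty list meaning no filter. A minimal sketch of that matching rule — my reading of the schema, not the Actor's actual code:

```python
def matches(value: str, needles: list[str]) -> bool:
    """Case-insensitive substring match; an empty needle list means no filter."""
    if not needles:
        return True
    haystack = value.lower()
    return any(n.lower() in haystack for n in needles)

# Example: keep a job only if its title matches any of the keyword filters,
# mirroring the documented behavior of filterTitleKeywords.
jobs = [
    {"title": "Senior Python Engineer"},
    {"title": "Account Executive"},
]
kept = [job for job in jobs if matches(job["title"], ["senior", "python"])]
```

Either keyword matching a title keeps the job, so `kept` holds only the engineering role here; with an empty filter list every job passes through.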
