Greenhouse Job Board API — Career Pages Scraper

The unofficial Greenhouse Job Board API in a single Apify actor. Scrape every open job, full description, department and office from any company hosted on Greenhouse — Airbnb, Stripe, Anthropic, Mistral AI, Doctolib, Datadog, Cohere, Plaid, Ramp, Retool, HuggingFace and thousands more — all with one input. Pure HTTP, no login, no captcha, no anti-bot, no scraping war.

Apify Actor · Greenhouse API · Pure HTTP

Most ATS scrapers fight cookie banners, JavaScript renders, rotating IPs and Cloudflare. This one doesn't need to — Greenhouse publishes a clean, public, rate-limit-free REST API for every company's job board. We wrap it, normalize it, give it sensible filters, and ship it as an Apify actor you can call from Make, n8n, Zapier, your CRM, your AI agent or your data warehouse.
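
For orientation, here is roughly the raw call the actor wraps: a minimal Python sketch using the requests library (the anthropic token is only an example).

import requests

# Public Job Board API: no auth, no cookies, no rate limits documented.
resp = requests.get(
    "https://boards-api.greenhouse.io/v1/boards/anthropic/jobs",
    params={"content": "true"},  # include full descriptions, departments, offices
    timeout=30,
)
resp.raise_for_status()
jobs = resp.json()["jobs"]
print(len(jobs), "open jobs;", jobs[0]["title"], "->", jobs[0]["absolute_url"])

The actor adds batching across boards, normalization, filtering and dataset output on top of this call.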


Why this actor exists — and why it's the highest-leverage ATS scraper you'll buy

Greenhouse powers career pages for 5,000+ tech companies. One actor, one input array, you cover them all. Here's the asymmetry:

| | Typical company-by-company scraper | Greenhouse Job Board API actor |
| --- | --- | --- |
| Coverage per actor | 1 company | Every Greenhouse customer (Airbnb, Stripe, Anthropic, Mistral AI, Doctolib, …) |
| Authentication | OAuth / login / cookies | None — fully public |
| Anti-bot | Captcha, Cloudflare, fingerprinting | None — first-party API |
| Rate limits | Frequent | None documented |
| Job description | Often partial / paywalled | Full HTML, decoded |
| Salary ranges | Rarely exposed | Available via pay_input_ranges=true |
| Departments & offices | Scraped from sidebars | Returned as structured trees |
| Custom fields (employment type, visa, …) | Lost in scraping | Structured metadata objects |
| GDPR / compliance flags | Lost | Returned as data_compliance |
| Application questions | Rarely captured | Full questions array on jobDetail |

Real numbers from a single run: 3 boards (anthropic, mistralai, ramp) → 200+ open jobs, full HTML descriptions, sub-2-second cold start, zero retries.


What it does

Five modes, one actor. Pick the mode that matches your use case:

1. jobs — list every open job for one or more boards (the default)

The bread-and-butter mode. Send a list of board tokens, get every open job with descriptions, departments, offices, custom metadata, and compliance flags. Filter client-side by department, office, location, title keyword or language.

{
  "mode": "jobs",
  "boardTokens": ["anthropic", "mistralai", "ramp"],
  "fullContent": true,
  "filterDepartments": ["Engineering", "Research"],
  "filterLocations": ["San Francisco", "London", "Paris", "Remote"]
}

2. jobDetail — rich detail for specific job IDs

When you already have a list of IDs (from jobs mode or your own database) and need the application questions, EEOC compliance fields, demographic questions or pay ranges.

{
  "mode": "jobDetail",
  "boardTokens": ["anthropic"],
  "jobIds": ["5987708004", "5987708005"],
  "includeQuestions": true,
  "includePayRanges": true
}

3. board — company-level board profile

The board's name and welcome content (the introduction text shown at boards.greenhouse.io/{token}). Useful for company-intelligence aggregations.

{ "mode": "board", "boardTokens": ["anthropic"] }

4. departments — the company's department tree

Every department with its parent/child relationships and the jobs nested under it. Perfect for building department-level dashboards or department-filtered hiring trend reports.

{ "mode": "departments", "boardTokens": ["stripe", "datadog"] }

5. offices — the company's office tree

Geographic hierarchy: continent → country → city → site, with the departments and jobs nested in each. Use this to map where a company is hiring globally.

{ "mode": "offices", "boardTokens": ["airbnb"] }

Who this is for

If you build, sell to, or operate any of these — this actor saves you weeks of scraping engineering:

  • HR tech & job aggregators — power your meta-search with comprehensive, near-real-time tech job inventory.
  • Sales intelligence & lead generation — "hiring signals" are the strongest growth proxy in B2B sales. A spike in Engineering hires at a startup → that startup is your lead.
  • Recruitment agencies — track which companies open which roles in which cities, with what salary bands, before your competitors notice.
  • VC / scout tools — hiring velocity is leading-indicator data for portfolio monitoring. This actor gives you the raw signal at scale.
  • Compensation intelligence platforms — combine pay_input_ranges data with job titles and locations to build salary benchmarks.
  • ATS & HRIS integrations — sync Greenhouse-hosted jobs into your platform without dealing with OAuth or per-customer onboarding.
  • AI agents & LLM apps — feed structured job data into your assistant for career-coaching, comp-research or company-research workflows.
  • Career sites & newsletters — power your own niche job board (tech, fintech, climate, AI…) with up-to-date inventory from the companies that matter.

Companies on Greenhouse you can scrape today

This is a small, non-exhaustive selection — Greenhouse hosts thousands of companies. Take the URL slug after boards.greenhouse.io/ and pop it into boardTokens.

| Category | Sample board tokens |
| --- | --- |
| AI / Foundation models | anthropic, mistralai, cohere, huggingface, runwayml, mosaicml, character |
| Big tech / Travel / Marketplaces | airbnb, dropbox, pinterest, lyft, instacart, doordash, glovo, blockchain, affirm |
| Fintech / Payments / Banking | stripe, plaid, ramp, brex, mercury, klarna, wise, revolut, nubank, flexport |
| Infrastructure / DevTools | datadog, gitlab, retool, hashicorp, airbyte, confluent, fastly, cloudflare |
| SaaS / Productivity / Collaboration | notion, figma, linear, vercel, posthog, loom, front, airtable |
| HealthTech | doctolib, oscar, cohere-health, included-health |
| E-commerce / Marketplaces | backmarket, vinted, etsy, chewy, whoop |
| EU / French tech ecosystem | mistralai, doctolib, backmarket, qonto, swile, payfit, algolia |
| Crypto / Web3 | coinbase, kraken, chainalysis |

How to find the token for a company you care about: open the company's careers page. If you land on a URL like boards.greenhouse.io/<slug>, the slug is your token. If the careers page lives on the company's own domain (e.g. <company>.com/careers) and embeds a Greenhouse iframe, view source and look for boards-api.greenhouse.io/v1/boards/<slug>/jobs; that <slug> is your token.
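
To sanity-check a candidate slug before a run, the public board endpoint returns 200 for a valid token and 404 otherwise; a quick sketch using requests:

import requests

def is_greenhouse_board(slug: str) -> bool:
    # 200 -> the board exists and is public; 404 -> not a Greenhouse board token.
    r = requests.get(f"https://boards-api.greenhouse.io/v1/boards/{slug}", timeout=15)
    return r.status_code == 200

print(is_greenhouse_board("anthropic"))      # True
print(is_greenhouse_board("not-a-company"))  # almost certainly False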


Quick start

Run from the Apify console

  1. Open the actor's page in the Apify Store.
  2. Paste the JSON below into the Input tab.
  3. Hit Start.
  4. Open the Dataset tab when it finishes. Export to CSV, JSON, Excel or hit the API tab for an integration URL.
{
  "mode": "jobs",
  "boardTokens": ["anthropic", "mistralai", "ramp"],
  "fullContent": true,
  "stripHtml": true,
  "filterDepartments": ["Engineering", "Research"],
  "concurrency": 5
}

Run from the API

Use the standard Apify run-sync-get-dataset-items endpoint:

curl -X POST "https://api.apify.com/v2/acts/<USERNAME>~greenhouse-job-board-scraper/run-sync-get-dataset-items?token=$APIFY_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "mode": "jobs",
    "boardTokens": ["anthropic", "mistralai"],
    "fullContent": true
  }'
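
The same call from Python, assuming the official apify-client package (the actor ID below is the same placeholder as in the curl example):

from apify_client import ApifyClient

client = ApifyClient("<APIFY_TOKEN>")
run = client.actor("<USERNAME>~greenhouse-job-board-scraper").call(
    run_input={
        "mode": "jobs",
        "boardTokens": ["anthropic", "mistralai"],
        "fullContent": True,
    }
)
# The run object points at the default dataset, which holds one item per job.
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item["boardToken"], "|", item["title"], "|", item["locationName"])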

Run from a workflow tool (Make, Zapier, n8n)

Every workflow tool that supports Apify or plain HTTP can call this actor. Wire it to a daily schedule, push results into Airtable / Sheets / Postgres / S3, and filter downstream as needed.


Input reference

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| mode | string | jobs | One of jobs, jobDetail, board, departments, offices. |
| boardTokens | string[] | — | List of board slugs. Required for every mode. |
| jobIds | string[] | [] | Numeric job IDs (jobDetail mode only). Scoped to the first boardTokens entry. |
| fullContent | boolean | true | jobs mode — append ?content=true for full descriptions, departments, offices. |
| includeQuestions | boolean | false | jobDetail mode — include application/EEOC/demographic questions. |
| includePayRanges | boolean | false | jobDetail mode — include pay_input_ranges array. |
| decodeContent | boolean | true | Decode HTML entities in content (server returns &lt;p&gt;-encoded). |
| stripHtml | boolean | false | Also produce a contentText plain-text field. |
| filterDepartments | string[] | [] | Substring (case-insensitive) match against department names. |
| filterOffices | string[] | [] | Substring match against office names. |
| filterLocations | string[] | [] | Substring match against the job's location.name. |
| filterTitleKeywords | string[] | [] | Substring match against job titles. |
| filterLanguages | string[] | [] | Match the job's language field (e.g. en, fr, de). |
| maxResultsPerBoard | integer | 0 | Cap per board (0 = no cap). |
| concurrency | integer | 5 | Parallel board fetches (1–20). |

All filters are applied client-side because the Greenhouse public API does not accept query parameters — there is no server-side filter primitive. The actor fetches the full list (one request per board) and applies filters before pushing to the dataset.
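
In practice the filters behave roughly like the sketch below, i.e. case-insensitive substring matching over the in-memory list (illustrative only, not the actor's internal code):

def matches(values, needles):
    # No filter set -> pass everything; otherwise any needle must appear in any value.
    if not needles:
        return True
    values = [v.lower() for v in values]
    return any(n.lower() in v for n in needles for v in values)

def apply_filters(jobs, departments=None, locations=None, title_keywords=None):
    kept = []
    for job in jobs:
        if not matches(job.get("departmentNames", []), departments):
            continue
        if not matches([job.get("locationName", "")], locations):
            continue
        if not matches([job.get("title", "")], title_keywords):
            continue
        kept.append(job)
    return kept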


Output reference (jobs mode)

A real example item from a live run against datsolutions (DAT Freight & Analytics):

{
  "_mode": "jobs",
  "boardToken": "datsolutions",
  "id": 5987708004,
  "internalJobId": 5149982004,
  "title": "Account Executive - Enterprise Broker Automation",
  "companyName": "DAT",
  "requisitionId": "1445",
  "location": { "name": "Remote - USA" },
  "locationName": "Remote - USA",
  "absoluteUrl": "https://careers.dat.com/jobs/5987708004?gh_jid=5987708004",
  "language": "en",
  "updatedAt": "2026-05-08T19:19:46-04:00",
  "firstPublished": "2026-05-08T19:19:46-04:00",
  "applicationDeadline": null,
  "content": "<p><strong>About DAT</strong></p><p>DAT Freight & Analytics is…",
  "contentText": "About DAT DAT Freight & Analytics is an award-winning employer…",
  "departments": [],
  "departmentNames": [],
  "offices": [],
  "officeNames": [],
  "metadata": [
    { "id": 4200041004, "name": "Employment Type", "value": "Regular", "value_type": "single_select" },
    { "id": 4209572004, "name": "Full-time/ Part-time", "value": "Full-time", "value_type": "single_select" }
  ],
  "metadataMap": {
    "Employment Type": "Regular",
    "Full-time/ Part-time": "Full-time"
  },
  "dataCompliance": [
    { "type": "gdpr", "requires_consent": false, "requires_processing_consent": false, "requires_retention_consent": false, "retention_period": null, "demographic_data_consent_applies": false }
  ],
  "scrapedAt": "2026-05-13T11:00:00.000Z"
}

Field guide

  • id vs internalJobId — id is the public job-post ID (what the URL uses, what you POST applications to). internalJobId is the underlying job in Greenhouse, useful for cross-referencing with the Harvest API. Prospect posts have internalJobId: null.
  • content is decoded HTML by default. Set decodeContent: false to keep the raw &lt;p&gt;-style server response. Set stripHtml: true to get a plain-text contentText for AI embeddings or CSV.
  • metadata is a structured array, not a string. Greenhouse exposes custom job fields here (Employment Type, Full-time/Part-time, visa sponsorship, …). The actor builds a metadataMap so you can read it like a hash (see the flattening sketch after this list).
  • dataCompliance reveals GDPR rules the employer has configured for the post (consent requirements, retention period, etc.).
  • departments and offices are full structured objects, not just names. Each has id, name, parent_id, child_ids so you can reconstruct hierarchies.
  • All timestamps are ISO 8601 with the employer's local offset preserved (e.g. 2026-05-08T19:19:46-04:00).
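
A small sketch of flattening a jobs-mode item for a CSV export or an embedding pipeline, using the fields described above (contentText is only present when stripHtml: true):

def flatten(item: dict) -> dict:
    # One flat row per job; adjust the columns to your warehouse schema.
    return {
        "job_id": item["id"],
        "board": item["boardToken"],
        "title": item["title"],
        "location": item.get("locationName"),
        "departments": "; ".join(item.get("departmentNames", [])),
        "employment_type": item.get("metadataMap", {}).get("Employment Type"),
        "first_published": item.get("firstPublished"),
        "url": item["absoluteUrl"],
        "text": item.get("contentText", ""),
    }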

jobDetail mode adds

When you call mode: "jobDetail" with includeQuestions: true and/or includePayRanges: true:

{
  "questions": [
    {
      "fields": [
        { "name": "resume", "type": "input_file", "values": [] }
      ],
      "label": "Resume/CV",
      "required": true,
      "description_preface": null,
      "description": null
    }
  ],
  "payInputRanges": [
    {
      "min_cents": 18000000,
      "max_cents": 26000000,
      "currency_type": "USD",
      "interval": "year",
      "applicable_to_remote_locations": "USA only"
    }
  ],
  "locationQuestions": [/* … */],
  "compliance": [/* EEOC questions when enabled */],
  "demographicQuestions": { /* Greenhouse Inclusion data */ }
}
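
Pay ranges are expressed in integer cents, so 18000000 means $180,000. A quick sketch of normalizing a range into a readable figure:

def normalize_pay_range(r: dict) -> str:
    # min_cents / max_cents are integer cents; divide by 100 for currency units.
    lo = r["min_cents"] / 100
    hi = r["max_cents"] / 100
    return f'{r["currency_type"]} {lo:,.0f} - {hi:,.0f} per {r["interval"]}'

# normalize_pay_range({"min_cents": 18000000, "max_cents": 26000000,
#                      "currency_type": "USD", "interval": "year"})
# -> 'USD 180,000 - 260,000 per year'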

Real-world recipes

Recipe 1 — Daily AI-startup hiring digest

Track AI labs' Engineering and Research hires. Schedule daily, pipe results into Slack or email.

{
  "mode": "jobs",
  "boardTokens": ["anthropic", "mistralai", "cohere", "huggingface", "character"],
  "fullContent": true,
  "filterDepartments": ["Engineering", "Research", "ML", "Applied"],
  "stripHtml": true,
  "concurrency": 5
}

Recipe 2 — French tech ecosystem (Paris + Remote)

Map open tech jobs in the French scaleup ecosystem.

{
  "mode": "jobs",
  "boardTokens": ["mistralai", "doctolib", "backmarket", "qonto", "swile", "payfit", "algolia"],
  "filterLocations": ["Paris", "Remote", "France"],
  "stripHtml": true
}

Recipe 3 — Senior+ infrastructure roles across DevTools

Hunt senior infra/SRE/platform roles across DevTools companies.

{
  "mode": "jobs",
  "boardTokens": ["datadog", "gitlab", "hashicorp", "fastly", "airbyte", "confluent", "retool"],
  "filterTitleKeywords": ["senior", "staff", "principal", "platform", "infrastructure", "SRE"],
  "stripHtml": true
}

Recipe 4 — Build a salary-intelligence dataset

Pull pay-range data for all jobs at companies that publish it. Two-step: collect IDs in jobs mode, then re-run jobDetail on those IDs.

{
  "mode": "jobs",
  "boardTokens": ["ramp", "brex", "mercury", "plaid", "stripe"],
  "filterLocations": ["United States", "New York", "San Francisco", "Remote"],
  "fullContent": false
}

Then for each id you collect:

{
  "mode": "jobDetail",
  "boardTokens": ["ramp"],
  "jobIds": ["[id1]", "[id2]", "..."],
  "includePayRanges": true
}
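
A sketch of chaining the two steps with the Python apify-client (the actor ID is the same placeholder used in the Quick start section):

from collections import defaultdict
from apify_client import ApifyClient

client = ApifyClient("<APIFY_TOKEN>")
ACTOR = "<USERNAME>~greenhouse-job-board-scraper"

# Step 1: list jobs and group their IDs per board.
run = client.actor(ACTOR).call(run_input={
    "mode": "jobs",
    "boardTokens": ["ramp", "brex", "mercury", "plaid", "stripe"],
    "fullContent": False,
})
ids_by_board = defaultdict(list)
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    ids_by_board[item["boardToken"]].append(str(item["id"]))

# Step 2: fetch pay ranges for each board's IDs in jobDetail mode.
for board, job_ids in ids_by_board.items():
    detail_run = client.actor(ACTOR).call(run_input={
        "mode": "jobDetail",
        "boardTokens": [board],
        "jobIds": job_ids,
        "includePayRanges": True,
    })
    # Items in detail_run's dataset carry payInputRanges where the employer publishes them.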

Recipe 5 — Company-level org snapshot

How is a target company organized? Pull the office and department tree side-by-side.

{ "mode": "offices", "boardTokens": ["airbnb"] }
{ "mode": "departments", "boardTokens": ["airbnb"] }

Performance characteristics

  • Cold start: ~1.5–3 seconds (Apify container boot + first request).
  • Per-board fetch: typically 100–500 ms for the jobs endpoint, depending on board size. Greenhouse caches aggressively.
  • No rate limits documented. That said, the actor defaults to concurrency 5 to stay polite. Raise to 20 if you're scraping dozens of boards in one run.
  • No pagination needed. Greenhouse's jobs endpoint returns the full list in one response, no Link-header walking.
  • Retries: 3 attempts with linear backoff (500 ms × attempt). 404s are not retried.
  • Filters are O(n) over the in-memory job list. Even at 5,000 jobs per board, filtering completes in single-digit milliseconds.
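
For illustration only, the retry policy described above looks roughly like this sketch (not the actor's actual source):

import time
import requests

def fetch_with_retries(url: str, attempts: int = 3):
    # 3 attempts, linear backoff (0.5 s x attempt number), no retry on 404.
    for attempt in range(1, attempts + 1):
        resp = requests.get(url, timeout=30)
        if resp.status_code == 404:
            resp.raise_for_status()   # unknown board token: fail fast, do not retry
        if resp.ok:
            return resp.json()
        if attempt < attempts:
            time.sleep(0.5 * attempt)
    resp.raise_for_status()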

Cross-sell — pair with these actors

If you bought this actor because you needed European startup talent data, you almost certainly want one or more of:

  • Welcome to the Jungle Jobs Scraper — French/EU tech-startup-focused job board, Algolia-backed. Complementary coverage to Greenhouse-hosted companies.
  • Apple App Store Data API — when you're enriching company intelligence with their iOS app presence, ratings and privacy labels.
  • Google Play Data API — same for Android.

Same architecture (pure HTTP, public APIs, sub-3-second cold start, batch + parallel), same monetization model, same author. Combine them in one workflow for a complete company-intelligence pipeline.


FAQ

Q: Do I need a Greenhouse API key?
A: No. The Job Board API GET endpoints are fully public. Only POST (application submission) requires Basic Auth, which this actor does not do — it's read-only.

Q: How do I find a company's board token?
A: Open their careers page. If the URL is boards.greenhouse.io/<slug>, that's the token. If they embed a Greenhouse iframe on their own domain, view source and look for boards-api.greenhouse.io/v1/boards/<slug>/.... The slug is your token.

Q: Why is the description HTML encoded by default?
A: That's how Greenhouse returns it (&lt;p&gt; instead of <p>). The actor decodes it once by default (decodeContent: true) so you get clean renderable HTML. Set the flag to false if you need the raw server response.

Q: Why don't filters use the API directly?
A: Because the API doesn't accept filter query parameters — there's literally no ?department=... or ?location=... primitive. Every Greenhouse-scraping tool, including this one, fetches the full list and filters client-side. With Greenhouse's caching this is still fast.

Q: Does this work on internal job boards?
A: No. Only public boards (the https://boards.greenhouse.io/<token> ones). Internal boards require Harvest API authentication, which is out of scope.

Q: What about LinkedIn jobs / Indeed / Glassdoor?
A: Different actors (different sources, different anti-bot challenges). This one is laser-focused on Greenhouse because the API quality is unparalleled — clean, public, rate-limit-free.

Q: Is there a salary field?
A: Sometimes. Look at:

  • metadata / metadataMap — some employers publish salary as a custom metadata field.
  • payInputRanges — populated when you call jobDetail with includePayRanges: true AND the employer has configured pay ranges in Greenhouse.
  • The job's content HTML often contains compensation text in free form.

Q: How fresh is the data?
A: Real-time. The Greenhouse Job Board API serves the current state of each board the moment you call it. There is no scraping lag.

Q: Can I detect newly posted jobs since my last run?
A: Yes — diff against id or use updatedAt / firstPublished. The firstPublished field gives you a true "this is brand new" signal on boards that populate it.
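
A minimal sketch of that diff, keeping previously seen IDs in a local JSON file (the file name and the items variable are placeholders):

import json
from pathlib import Path

STATE = Path("seen_job_ids.json")   # hypothetical local state file
seen = set(json.loads(STATE.read_text())) if STATE.exists() else set()

items = []  # replace with the current run's dataset items (e.g. via apify-client, as above)
new_jobs = [job for job in items if job["id"] not in seen]
for job in new_jobs:
    print("NEW:", job["title"], job.get("firstPublished"), job["absoluteUrl"])

STATE.write_text(json.dumps(sorted(seen | {job["id"] for job in items})))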

Q: What happens if a board token doesn't exist?
A: The actor logs the 404 as a per-board error and continues with the rest. It does not abort the whole run.

Q: Why is internalJobId sometimes null?
A: For "prospect" posts (Greenhouse's term for general-interest landing pages that aren't tied to a specific requisition).


Pricing & monetization

This actor is billed on a pay-per-result basis, from $1.50 per 1,000 results. Every job, board, department or office item written to the dataset counts as one result. Free runs are included per Apify's standard policy.

Cost is dominated by storage + compute, not by the API call itself (Greenhouse is free). For very large boards (10,000+ jobs in a single run), set maxResultsPerBoard to cap the spend.


Changelog

  • v1.0.0 (2026-05) — Initial release.
    • 5 modes: jobs, jobDetail, board, departments, offices.
    • HTML-entity decoding with single-pass semantics.
    • Structured metadata + GDPR compliance fields.
    • Client-side filters on department, office, location, title, language.
    • Bounded-concurrency parallel fetching (1–20).
    • Retries with linear backoff, 404 short-circuit.
    • Live tested against anthropic, mistralai, datsolutions, ramp, airbnb.

Support


This actor consumes the public Greenhouse Job Board API. Greenhouse explicitly documents that the GET endpoints are publicly accessible without authentication and not rate-limited. Data published through this API is data that employers have chosen to make public on boards.greenhouse.io. Respect each employer's terms of use and your local data-protection laws when re-distributing scraped data.

This actor is not affiliated with, endorsed by, or sponsored by Greenhouse Software, Inc.