Greenhouse Job Board API — Career Pages Scraper
The unofficial Greenhouse Job Board API in a single Apify actor. Scrape every open job, full description, department and office from any company hosted on Greenhouse — Airbnb, Stripe, Anthropic, Mistral AI, Doctolib, Datadog, Cohere, Plaid, Ramp, Retool, HuggingFace and thousands more — all with one input. Pure HTTP, no login, no captcha, no anti-bot, no scraping war.
Most ATS scrapers fight cookie banners, JavaScript renders, rotating IPs and Cloudflare. This one doesn't need to — Greenhouse publishes a clean, public, rate-limit-free REST API for every company's job board. We wrap it, normalize it, give it sensible filters, and ship it as an Apify actor you can call from Make, n8n, Zapier, your CRM, your AI agent or your data warehouse.
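For a sense of what that first-party API looks like, here is a minimal sketch (Python, `requests`) hitting the raw endpoint the actor wraps; the `anthropic` token and the fields printed are illustrative, and the actor adds normalization, filtering, batching and dataset output on top:

```python
import requests

# Public, unauthenticated Greenhouse Job Board API endpoint; content=true adds
# full descriptions. "anthropic" is just an example board token.
url = "https://boards-api.greenhouse.io/v1/boards/anthropic/jobs"
resp = requests.get(url, params={"content": "true"}, timeout=30)
resp.raise_for_status()

for job in resp.json()["jobs"][:5]:
    print(job["id"], job["title"], job["absolute_url"])
```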
Why this actor exists — and why it's the highest-leverage ATS scraper you'll buy
Greenhouse powers career pages for 5,000+ tech companies. One actor, one input array, you cover them all. Here's the asymmetry:
| | Typical company-by-company scraper | Greenhouse Job Board API actor |
|---|---|---|
| Coverage per actor | 1 company | Every Greenhouse customer (Airbnb, Stripe, Anthropic, Mistral AI, Doctolib, …) |
| Authentication | OAuth / login / cookies | None — fully public |
| Anti-bot | Captcha, Cloudflare, fingerprinting | None — first-party API |
| Rate limits | Frequent | None documented |
| Job description | Often partial / paywalled | Full HTML, decoded |
| Salary ranges | Rarely exposed | Returned as pay_input_ranges (set includePayRanges: true) |
| Departments & offices | Scraped from sidebars | Returned as structured trees |
| Custom fields (employment type, visa, …) | Lost in scraping | Structured metadata objects |
| GDPR / compliance flags | Lost | Returned as data_compliance |
| Application questions | Rarely captured | Full questions array on jobDetail |
Real numbers from a single run: 3 boards (anthropic, mistralai, ramp) → 200+ open jobs, full HTML descriptions, sub-2-second cold start, zero retries.
What it does
Five modes, one actor. Pick the mode that matches your use case:
1. jobs — list every open job for one or more boards (the default)
The bread-and-butter mode. Send a list of board tokens, get every open job with descriptions, departments, offices, custom metadata, and compliance flags. Filter client-side by department, office, location, title keyword or language.
{"mode": "jobs","boardTokens": ["anthropic", "mistralai", "ramp"],"fullContent": true,"filterDepartments": ["Engineering", "Research"],"filterLocations": ["San Francisco", "London", "Paris", "Remote"]}
2. jobDetail — rich detail for specific job IDs
When you already have a list of IDs (from jobs mode or your own database) and need the application questions, EEOC compliance fields, demographic questions or pay ranges.
{"mode": "jobDetail","boardTokens": ["anthropic"],"jobIds": ["5987708004", "5987708005"],"includeQuestions": true,"includePayRanges": true}
3. board — company-level board profile
The board's name and welcome content (the introduction text shown at boards.greenhouse.io/{token}). Useful for company-intelligence aggregations.
{ "mode": "board", "boardTokens": ["anthropic"] }
4. departments — the company's department tree
Every department with its parent/child relationships and the jobs nested under it. Perfect for building department-level dashboards or department-filtered hiring trend reports.
{ "mode": "departments", "boardTokens": ["stripe", "datadog"] }
5. offices — the company's office tree
Geographic hierarchy: continent → country → city → site, with the departments and jobs nested in each. Use this to map where a company is hiring globally.
{ "mode": "offices", "boardTokens": ["airbnb"] }
Who this is for
If you build, sell to, or operate any of these — this actor saves you weeks of scraping engineering:
- HR tech & job aggregators — power your meta-search with comprehensive, near-real-time tech job inventory.
- Sales intelligence & lead generation — "hiring signals" are the strongest growth proxy in B2B sales. A spike in Engineering hires at a startup → that startup is your lead.
- Recruitment agencies — track which companies open which roles in which cities, with what salary bands, before your competitors notice.
- VC / scout tools — hiring velocity is leading-indicator data for portfolio monitoring. This actor gives you the raw signal at scale.
- Compensation intelligence platforms — combine `pay_input_ranges` data with job titles and locations to build salary benchmarks.
- ATS & HRIS integrations — sync Greenhouse-hosted jobs into your platform without dealing with OAuth or per-customer onboarding.
- AI agents & LLM apps — feed structured job data into your assistant for career-coaching, comp-research or company-research workflows.
- Career sites & newsletters — power your own niche job board (tech, fintech, climate, AI…) with up-to-date inventory from the companies that matter.
Companies on Greenhouse you can scrape today
This is a small, non-exhaustive selection — Greenhouse hosts thousands of companies. Take the URL slug after boards.greenhouse.io/ and pop it into boardTokens.
| Category | Sample board tokens |
|---|---|
| AI / Foundation models | anthropic, mistralai, cohere, huggingface, runwayml, mosaicml, character |
| Big tech / Travel / Marketplaces | airbnb, dropbox, pinterest, lyft, instacart, doordash, glovo, blockchain, affirm |
| Fintech / Payments / Banking | stripe, plaid, ramp, brex, mercury, klarna, wise, revolut, nubank, flexport |
| Infrastructure / DevTools | datadog, gitlab, retool, hashicorp, airbyte, confluent, fastly, cloudflare |
| SaaS / Productivity / Collaboration | notion, figma, linear, vercel, posthog, loom, front, airtable |
| HealthTech | doctolib, oscar, cohere-health, included-health |
| E-commerce / Marketplaces | backmarket, vinted, etsy, chewy, whoop |
| EU / French tech ecosystem | mistralai, doctolib, backmarket, qonto, swile, payfit, algolia |
| Crypto / Web3 | coinbase, kraken, chainalysis |
How to find the token for a company you care about: open the company's careers page. If you land on a URL like `boards.greenhouse.io/<slug>`, or a `<company>.com/careers` page that embeds a Greenhouse iframe, view source and look for `boards-api.greenhouse.io/v1/boards/<slug>/jobs`. The `<slug>` is your token.
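If you want to verify a guessed token before a full run, a quick probe of the public endpoint works; a minimal sketch, with illustrative slugs:

```python
import requests

def greenhouse_token_exists(slug: str) -> bool:
    """True if boards-api.greenhouse.io serves a public job board for this token."""
    url = f"https://boards-api.greenhouse.io/v1/boards/{slug}/jobs"
    return requests.get(url, timeout=15).status_code == 200  # unknown boards return 404

for candidate in ["anthropic", "definitely-not-a-board"]:
    print(candidate, "->", greenhouse_token_exists(candidate))
```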
Quick start
Run from the Apify console
- Open the actor's page in the Apify Store.
- Paste the JSON below into the Input tab.
- Hit Start.
- Open the Dataset tab when it finishes. Export to CSV, JSON, or Excel, or grab an integration URL from the API tab.
{"mode": "jobs","boardTokens": ["anthropic", "mistralai", "ramp"],"fullContent": true,"stripHtml": true,"filterDepartments": ["Engineering", "Research"],"concurrency": 5}
Run from the API
Use the standard Apify run-sync-get-dataset-items endpoint:
```bash
curl -X POST "https://api.apify.com/v2/acts/<USERNAME>~greenhouse-job-board-scraper/run-sync-get-dataset-items?token=$APIFY_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "mode": "jobs",
    "boardTokens": ["anthropic", "mistralai"],
    "fullContent": true
  }'
```
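The same run is a few lines with the Apify Python client; a sketch assuming the placeholder actor ID from the curl example above (swap in the real actor ID and your own token):

```python
from apify_client import ApifyClient

client = ApifyClient("<APIFY_TOKEN>")

# Start the actor, wait for it to finish, then stream the dataset items.
run = client.actor("<USERNAME>~greenhouse-job-board-scraper").call(run_input={
    "mode": "jobs",
    "boardTokens": ["anthropic", "mistralai"],
    "fullContent": True,
})

for job in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(job.get("boardToken"), job.get("title"), job.get("locationName"))
```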
Run from a workflow tool (Make, Zapier, n8n)
Every workflow tool that supports Apify or plain HTTP can call this actor. Wire it to a daily schedule, push results into Airtable / Sheets / Postgres / S3, and filter downstream as needed.
Input reference
| Field | Type | Default | Description |
|---|---|---|---|
| `mode` | string | `jobs` | One of `jobs`, `jobDetail`, `board`, `departments`, `offices`. |
| `boardTokens` | string[] | – | List of board slugs. Required for every mode. |
| `jobIds` | string[] | [] | Numeric job IDs (`jobDetail` mode only). Scoped to the first `boardTokens` entry. |
| `fullContent` | boolean | true | `jobs` mode — append `?content=true` for full descriptions, departments, offices. |
| `includeQuestions` | boolean | false | `jobDetail` mode — include application/EEOC/demographic questions. |
| `includePayRanges` | boolean | false | `jobDetail` mode — include the `pay_input_ranges` array. |
| `decodeContent` | boolean | true | Decode HTML entities in `content` (the server returns entity-encoded HTML, e.g. `&lt;p&gt;` instead of `<p>`). |
| `stripHtml` | boolean | false | Also produce a `contentText` plain-text field. |
| `filterDepartments` | string[] | [] | Substring (case-insensitive) match against department names. |
| `filterOffices` | string[] | [] | Substring match against office names. |
| `filterLocations` | string[] | [] | Substring match against the job's `location.name`. |
| `filterTitleKeywords` | string[] | [] | Substring match against job titles. |
| `filterLanguages` | string[] | [] | Match the job's `language` field (e.g. `en`, `fr`, `de`). |
| `maxResultsPerBoard` | integer | 0 | Cap per board (0 = no cap). |
| `concurrency` | integer | 5 | Parallel board fetches (1–20). |
All filters are applied client-side because the Greenhouse public API does not accept query parameters — there is no server-side filter primitive. The actor fetches the full list (one request per board) and applies filters before pushing to the dataset.
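For clarity, the matching semantics described above boil down to a case-insensitive substring test; an illustrative sketch, not the actor's source code:

```python
def passes(value: str, filters: list[str]) -> bool:
    """A job field passes if no filter is set, or if any filter value is a
    case-insensitive substring of that field."""
    return not filters or any(f.lower() in value.lower() for f in filters)

print(passes("Senior Platform Engineer", ["senior", "staff"]))  # True
print(passes("Remote - France", ["Paris", "Remote"]))           # True
print(passes("Customer Success", ["Engineering"]))              # False
```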
Output reference (jobs mode)
A real example item from a live run against datsolutions (DAT Freight & Analytics):
{"_mode": "jobs","boardToken": "datsolutions","id": 5987708004,"internalJobId": 5149982004,"title": "Account Executive - Enterprise Broker Automation","companyName": "DAT","requisitionId": "1445","location": { "name": "Remote - USA" },"locationName": "Remote - USA","absoluteUrl": "https://careers.dat.com/jobs/5987708004?gh_jid=5987708004","language": "en","updatedAt": "2026-05-08T19:19:46-04:00","firstPublished": "2026-05-08T19:19:46-04:00","applicationDeadline": null,"content": "<p><strong>About DAT</strong></p><p>DAT Freight & Analytics is…","contentText": "About DAT DAT Freight & Analytics is an award-winning employer…","departments": [],"departmentNames": [],"offices": [],"officeNames": [],"metadata": [{ "id": 4200041004, "name": "Employment Type", "value": "Regular", "value_type": "single_select" },{ "id": 4209572004, "name": "Full-time/ Part-time", "value": "Full-time", "value_type": "single_select" }],"metadataMap": {"Employment Type": "Regular","Full-time/ Part-time": "Full-time"},"dataCompliance": [{ "type": "gdpr", "requires_consent": false, "requires_processing_consent": false, "requires_retention_consent": false, "retention_period": null, "demographic_data_consent_applies": false }],"scrapedAt": "2026-05-13T11:00:00.000Z"}
Field guide
- `id` vs `internalJobId` — `id` is the public job-post ID (what the URL uses, what you POST applications to). `internalJobId` is the underlying job in Greenhouse, useful for cross-referencing with the Harvest API. Prospect posts have `internalJobId: null`.
- `content` is decoded HTML by default. Set `decodeContent: false` to keep the raw entity-encoded server response. Set `stripHtml: true` to get a plain-text `contentText` for AI embeddings or CSV.
- `metadata` is a structured array, not a string. Greenhouse exposes custom job fields here (Employment Type, Full-time/Part-time, visa sponsorship, …). The actor builds a `metadataMap` so you can read it like a hash.
- `dataCompliance` reveals GDPR rules the employer has configured for the post (consent requirements, retention period, etc.).
- `departments` and `offices` are full structured objects, not just names. Each has `id`, `name`, `parent_id`, `child_ids` so you can reconstruct hierarchies.
- All timestamps are ISO 8601 with the employer's local offset preserved (e.g. `2026-05-08T19:19:46-04:00`).
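As a post-processing example, here is a hedged sketch that flattens an exported dataset (assumed saved locally as `dataset.json`) into a CSV using the normalized fields and `metadataMap` described above:

```python
import csv
import json

# dataset.json is an assumed local JSON export of the actor's dataset.
with open("dataset.json", encoding="utf-8") as f:
    items = json.load(f)

with open("jobs.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerow(["board", "id", "title", "location", "employment_type", "first_published"])
    for job in items:
        writer.writerow([
            job.get("boardToken"),
            job.get("id"),
            job.get("title"),
            job.get("locationName", ""),
            job.get("metadataMap", {}).get("Employment Type", ""),  # custom field, may be absent
            job.get("firstPublished", ""),
        ])
```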
jobDetail mode adds
When you call mode: "jobDetail" with includeQuestions: true and/or includePayRanges: true:
{"questions": [{"fields": [{ "name": "resume", "type": "input_file", "values": [] }],"label": "Resume/CV","required": true,"description_preface": null,"description": null}],"payInputRanges": [{"min_cents": 18000000,"max_cents": 26000000,"currency_type": "USD","interval": "year","applicable_to_remote_locations": "USA only"}],"locationQuestions": [/* … */],"compliance": [/* EEOC questions when enabled */],"demographicQuestions": { /* Greenhouse Inclusion data */ }}
Real-world recipes
Recipe 1 — Daily AI-startup hiring digest
Track AI labs' Engineering and Research hires. Schedule daily, pipe results into Slack or email.
{"mode": "jobs","boardTokens": ["anthropic", "mistralai", "cohere", "huggingface", "character"],"fullContent": true,"filterDepartments": ["Engineering", "Research", "ML", "Applied"],"stripHtml": true,"concurrency": 5}
Recipe 2 — French tech ecosystem (Paris + Remote)
Map open tech jobs in the French scaleup ecosystem.
{"mode": "jobs","boardTokens": ["mistralai", "doctolib", "backmarket", "qonto", "swile", "payfit", "algolia"],"filterLocations": ["Paris", "Remote", "France"],"stripHtml": true}
Recipe 3 — Senior+ infrastructure roles across DevTools
Hunt senior infra/SRE/platform roles across DevTools companies.
{"mode": "jobs","boardTokens": ["datadog", "gitlab", "hashicorp", "fastly", "airbyte", "confluent", "retool"],"filterTitleKeywords": ["senior", "staff", "principal", "platform", "infrastructure", "SRE"],"stripHtml": true}
Recipe 4 — Build a salary-intelligence dataset
Pull pay-range data for all jobs at companies that publish it. Two-step: collect IDs in jobs mode, then re-run jobDetail on those IDs.
{"mode": "jobs","boardTokens": ["ramp", "brex", "mercury", "plaid", "stripe"],"filterLocations": ["United States", "New York", "San Francisco", "Remote"],"fullContent": false}
Then for each id you collect:
{"mode": "jobDetail","boardTokens": ["ramp"],"jobIds": ["[id1]", "[id2]", "..."],"includePayRanges": true}
Recipe 5 — Company-level org snapshot
How is a target company organized? Pull the office and department tree side-by-side.
{ "mode": "offices", "boardTokens": ["airbnb"] }
{ "mode": "departments", "boardTokens": ["airbnb"] }
Performance characteristics
- Cold start: ~1.5–3 seconds (Apify container boot + first request).
- Per-board fetch: typically 100–500 ms for the jobs endpoint, depending on board size. Greenhouse caches aggressively.
- No rate limits documented. That said, the actor defaults to concurrency 5 to stay polite. Raise to 20 if you're scraping dozens of boards in one run.
- No pagination needed. Greenhouse's jobs endpoint returns the full list in one response, no Link-header walking.
- Retries: 3 attempts with linear backoff (500 ms × attempt); 404s are not retried (see the sketch after this list).
- Filters are O(n) over the in-memory job list. Even at 5,000 jobs per board, filtering completes in single-digit milliseconds.
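The retry policy amounts to a few lines; an illustrative sketch of that schedule, not the actor's source:

```python
import time
import requests

def fetch_with_retries(url: str, attempts: int = 3) -> requests.Response:
    """Linear backoff: wait 500 ms x attempt number between tries; never retry a 404."""
    for attempt in range(1, attempts + 1):
        resp = requests.get(url, timeout=30)
        if resp.ok or resp.status_code == 404:
            return resp
        if attempt < attempts:
            time.sleep(0.5 * attempt)
    return resp
```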
Cross-sell — pair with these actors
If you bought this actor because you needed European startup talent data, you almost certainly want one or more of:
- Welcome to the Jungle Jobs Scraper — French/EU tech-startup-focused job board, Algolia-backed. Complementary coverage to Greenhouse-hosted companies.
- Apple App Store Data API — when you're enriching company intelligence with their iOS app presence, ratings and privacy labels.
- Google Play Data API — same for Android.
Same architecture (pure HTTP, public APIs, sub-3-second cold start, batch + parallel), same monetization model, same author. Combine them in one workflow for a complete company-intelligence pipeline.
FAQ
Q: Do I need a Greenhouse API key?
A: No. The Job Board API GET endpoints are fully public. Only POST (application submission) requires Basic Auth, which this actor does not do — it's read-only.
Q: How do I find a company's board token?
A: Open their careers page. If the URL is boards.greenhouse.io/<slug>, that's the token. If they embed a Greenhouse iframe on their own domain, view source and look for boards-api.greenhouse.io/v1/boards/<slug>/.... The slug is your token.
Q: Why is the description HTML encoded by default?
A: That's how Greenhouse serves it: the API returns entity-encoded HTML (`&lt;p&gt;` instead of `<p>`). The actor decodes it once by default (`decodeContent: true`) so you get clean, renderable HTML. Set the flag to `false` if you need the raw server response.
Q: Why don't filters use the API directly?
A: Because the API doesn't accept filter query parameters — there's literally no ?department=... or ?location=... primitive. Every Greenhouse-scraping tool, including this one, fetches the full list and filters client-side. With Greenhouse's caching this is still fast.
Q: Does this work on internal job boards?
A: No. Only public boards (the https://boards.greenhouse.io/<token> ones). Internal boards require Harvest API authentication, which is out of scope.
Q: What about LinkedIn jobs / Indeed / Glassdoor?
A: Different actors (different sources, different anti-bot challenges). This one is laser-focused on Greenhouse because the API quality is unparalleled — clean, public, rate-limit-free.
Q: Is there a salary field?
A: Sometimes. Look at:
- `metadata` / `metadataMap` — some employers publish salary as a custom metadata field.
- `payInputRanges` — populated when you call `jobDetail` with `includePayRanges: true` AND the employer has configured pay ranges in Greenhouse.
- The job's `content` HTML often contains compensation text in free form.
Q: How fresh is the data?
A: Real-time. The Greenhouse Job Board API serves the current state of each board the moment you call it. There is no scraping lag.
Q: Can I detect newly posted jobs since my last run?
A: Yes — diff against id or use updatedAt / firstPublished. The firstPublished field gives you a true "this is brand new" signal on boards that populate it.
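A hedged sketch of that diff, assuming you export each run's dataset to a local JSON file (`previous.json` and `latest.json` are illustrative names):

```python
import json

# IDs already seen in the previous run.
with open("previous.json", encoding="utf-8") as f:
    seen_ids = {job["id"] for job in json.load(f)}

with open("latest.json", encoding="utf-8") as f:
    latest = json.load(f)

# Anything with an unseen id is a newly posted job.
new_jobs = [job for job in latest if job["id"] not in seen_ids]
for job in new_jobs:
    print(job.get("firstPublished"), job.get("title"), job.get("absoluteUrl"))
```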
Q: What happens if a board token doesn't exist?
A: The actor logs the 404 as a per-board error and continues with the rest. It does not abort the whole run.
Q: Why is internalJobId sometimes null?
A: For "prospect" posts (Greenhouse's term for general-interest landing pages that aren't tied to a specific requisition).
Pricing & monetization
This actor is billed on a pay-per-result basis. Every job, board, department or office item written to the dataset counts as one result. Free runs included per Apify's standard policy.
Cost is dominated by storage + compute, not by the API call itself (Greenhouse is free). For very large boards (10,000+ jobs in a single run), set maxResultsPerBoard to cap the spend.
Changelog
- v1.0.0 (2026-05) — Initial release.
- 5 modes: jobs, jobDetail, board, departments, offices.
- HTML-entity decoding with single-pass semantics.
- Structured metadata + GDPR compliance fields.
- Client-side filters on department, office, location, title, language.
- Bounded-concurrency parallel fetching (1–20).
- Retries with linear backoff, 404 short-circuit.
- Live tested against `anthropic`, `mistralai`, `datsolutions`, `ramp`, `airbnb`.
Support
- File an issue on the actor's Apify page (Issues tab).
- Apify docs: https://docs.apify.com/
- Greenhouse Job Board API reference: https://developers.greenhouse.io/job-board.html
Legal
This actor consumes the public Greenhouse Job Board API. Greenhouse explicitly documents that the GET endpoints are publicly accessible without authentication and not rate-limited. Data published through this API is data that employers have chosen to make public on boards.greenhouse.io. Respect each employer's terms of use and your local data-protection laws when re-distributing scraped data.
This actor is not affiliated with, endorsed by, or sponsored by Greenhouse Software, Inc.