Greenhouse Jobs Scraper — Company Job Boards

Extract live job postings from any company's Greenhouse board (boards.greenhouse.io) via the official public API. Pass one or more company slugs and get clean JSON: title, department, location, posting date and apply URL. No login, no proxies, no HTML parsing — pure API reliability.

Pricing

Pay per event

Developer

Nomad.Dev

Maintained by Community

Scrape any company's Greenhouse job board through the official public JSON API — pass company slugs, get clean structured postings. No login, no proxies, no breakage.

Input

| Field | Type | Default | Description |
|---|---|---|---|
| companies | array (required) | | Company slugs as used on boards.greenhouse.io/<slug> (full board URLs also accepted). |
| keyword | string | | Case-insensitive substring match on the job title. |
| titleExclude | array | | Drop postings whose title contains any of these substrings (case-insensitive). |
| locationFilter | string | | Case-insensitive substring match on the location. |
| postedSince | integer | | Keep only postings first published within this many days. Postings with no posting date are dropped when this is set. |
| remoteOnly | boolean | false | Keep only postings flagged remote by the source. Always returns nothing on this Actor; see FAQ. |
| includeDescription | boolean | true | Include a plain-text description snippet per posting. |
| maxItemsPerCompany | integer | 100 | Cap postings returned per company (0 = no cap). Each result is a billed event. |
| maxItems | integer | 200 | Hard cap on total postings returned (0 = no cap). Each result is a billed event. |
| onlyNewSinceLastRun | boolean | false | Delta/monitoring mode: only output postings not seen on a previous run made with this flag on (see "Delta mode / monitoring"). |
| aiEnrichment | boolean | false | Adds aiKeySkills/aiExperienceLevel/aiWorkArrangement/aiVisaSponsorship per posting via the Anthropic or Mistral API, BYOK (see "AI enrichment"). |
| aiProvider | string | anthropic | Which AI provider runs enrichment: anthropic (default, uses anthropicApiKey) or mistral (uses mistralApiKey). |
| anthropicApiKey | string (secret) | | Your Anthropic API key. Only used when aiEnrichment is on and aiProvider is anthropic; billed separately by Anthropic, not by this Actor. |
| aiModel | string | claude-haiku-4-5-20251001 | Claude model for AI enrichment (when aiProvider is anthropic): claude-haiku-4-5-20251001 (fast/cheap) or claude-sonnet-4-5 (higher quality). |
| mistralApiKey | string (secret) | | Your Mistral API key. Only used when aiEnrichment is on and aiProvider is mistral; billed separately by Mistral, not by this Actor. |
| mistralModel | string | mistral-small-latest | Mistral model for AI enrichment (when aiProvider is mistral): mistral-small-latest (default, fast/cheap; matches larger Mistral models on this task), mistral-medium-latest, or mistral-large-latest. |
| concurrency | integer | 8 | Companies fetched in parallel (advanced). |
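
The filter fields narrow the result set one after another. The sketch below mirrors the semantics described in the table on the client side, for readers who want to re-filter an exported dataset. The function `matches_filters` and its parameter names are hypothetical and are not the Actor's internal code; treating the `postedSince` cutoff day as inclusive is also an assumption.

```python
from datetime import date, timedelta


def matches_filters(posting, keyword=None, title_exclude=(), location_filter=None,
                    posted_since=None, today=None):
    """Return True if a posting passes the filters (hypothetical re-implementation)."""
    title = (posting.get("title") or "").lower()
    location = (posting.get("location") or "").lower()

    # keyword: case-insensitive substring match on the title
    if keyword and keyword.lower() not in title:
        return False
    # titleExclude: drop the posting if any excluded substring appears in the title
    if any(term.lower() in title for term in title_exclude):
        return False
    # locationFilter: case-insensitive substring match on the location text
    if location_filter and location_filter.lower() not in location:
        return False
    # postedSince: postings without a date are dropped when this is set
    if posted_since is not None:
        posted_at = posting.get("postedAt")
        if not posted_at:
            return False
        # Assumption: a posting dated exactly on the cutoff day still counts.
        cutoff = (today or date.today()) - timedelta(days=posted_since)
        if date.fromisoformat(posted_at) < cutoff:
            return False
    return True
```

Passing `today` explicitly keeps the date check reproducible when re-running the same filter later.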

What Greenhouse jobs data does this scraper extract?

One flat JSON record per live posting:

| Field | Meaning |
|---|---|
| ats | Which ATS served the posting (always "greenhouse" here) |
| company | Real company display name, resolved from Greenhouse's own data (falls back to the input slug if unresolvable) |
| id | Greenhouse's internal job ID |
| title | Job title as posted |
| department | Department or team where provided |
| location | Location text (may include remote hints) |
| url | Direct link to the posting |
| postedAt | First-published date (YYYY-MM-DD) where provided |
| snippet | Plain-text description excerpt (optional) |
| globalId | Stable composite id <ats>:<company-slug>:<id>, unique across the whole ATS-actor family; handy for merging with the Lever/Ashby/Workable Actors or the Company Careers Bundle |
| warnings | Array of data-quality notes for this record (e.g. ["postedAt missing"]); empty array when there's nothing to flag |
| isNew | Only present when onlyNewSinceLastRun is on; always true (already-seen postings are dropped, never emitted with isNew: false) |
| aiKeySkills | Only present when aiEnrichment is on; array of skills/technologies explicitly named in the posting text, never invented |
| aiExperienceLevel | Only present when aiEnrichment is on; one of entry/mid/senior/lead/unknown |
| aiWorkArrangement | Only present when aiEnrichment is on; one of onsite/hybrid/remote/unknown |
| aiVisaSponsorship | Only present when aiEnrichment is on; true/false only if the posting explicitly states a policy, otherwise null |

remote and employmentType are always null on this Actor. Both keys are still present on every record — kept for schema consistency with the Lever / Ashby / Workable / Company Careers Bundle Actors — but Greenhouse's public API exposes neither a remote/workplace-type signal nor an employment-type field on any endpoint, so this Actor never guesses. Use locationFilter (e.g. "remote") as the closest available proxy for remote roles.
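
Because globalId follows the <ats>:<company-slug>:<id> pattern from the field table, it can serve as a merge key when combining datasets from several Actors. The helpers below are a hypothetical sketch, not code shipped with the Actor. They assume neither the ATS name nor the company slug contains a colon.

```python
def make_global_id(ats, company_slug, job_id):
    """Compose the <ats>:<company-slug>:<id> key described in the field table."""
    return f"{ats}:{company_slug}:{job_id}"


def split_global_id(global_id):
    """Split a globalId into (ats, company_slug, job_id).

    Assumption: ats and company_slug contain no colon, so only the
    first two colons are treated as separators.
    """
    ats, company_slug, job_id = global_id.split(":", 2)
    return ats, company_slug, job_id


def merge_by_global_id(*datasets):
    """Merge several lists of records, keeping the first record per globalId."""
    merged = {}
    for records in datasets:
        for record in records:
            merged.setdefault(record["globalId"], record)
    return list(merged.values())
```

Keeping the first record per key is one possible policy; a pipeline that prefers the most recent fetch would overwrite instead of using `setdefault`.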

How to scrape Greenhouse jobs with this Actor

  1. Enter one or more company slugs (stripe, gitlab, duolingo). Open the company's careers page and look for boards.greenhouse.io/
  2. Optionally set keyword / titleExclude / locationFilter / postedSince / caps.
  3. Run and export JSON, CSV or Excel — or call it over the API:
     Python:

         from apify_client import ApifyClient

         # Authenticate with your Apify API token.
         client = ApifyClient("<YOUR_APIFY_TOKEN>")

         # Start the Actor and wait for the run to finish.
         run = client.actor("nomad-jobs/greenhouse-jobs-scraper").call(run_input={
             "companies": ["stripe", "gitlab", "duolingo"],
             "keyword": "engineer",
         })

         # Read the postings from the run's default dataset.
         for item in client.dataset(run["defaultDatasetId"]).iterate_items():
             print(item["company"], "|", item["title"], item["url"])

     curl:

         curl -X POST \
           "https://api.apify.com/v2/acts/nomad-jobs~greenhouse-jobs-scraper/run-sync-get-dataset-items?token=<YOUR_APIFY_TOKEN>" \
           -H "Content-Type: application/json" \
           -d '{"companies": ["stripe", "gitlab", "duolingo"]}'

Output example

    {
      "ats": "greenhouse",
      "company": "Stripe",
      "id": "7954688",
      "title": "Senior Software Engineer",
      "department": "Engineering",
      "location": "San Francisco, CA",
      "url": "https://stripe.com/jobs/search?gh_jid=7954688",
      "postedAt": "2026-06-25",
      "employmentType": null,
      "remote": null,
      "snippet": "We are hiring a Senior Software Engineer...",
      "globalId": "greenhouse:stripe:7954688",
      "warnings": []
    }
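
For downstream code it can help to load each record into a typed structure and parse postedAt into a real date. The following is a minimal sketch built on the example record above. The `Posting` class and `posting_from_record` function are hypothetical names, and the choice of which fields to treat as optional is an assumption based on the "where provided" wording in the field table.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class Posting:
    """Typed view of one output record (hypothetical helper, not part of the Actor)."""
    global_id: str
    company: str
    title: str
    url: str
    department: Optional[str] = None
    location: Optional[str] = None
    posted_at: Optional[date] = None
    warnings: List[str] = field(default_factory=list)


def posting_from_record(record: dict) -> Posting:
    """Build a Posting from one dataset item, tolerating a missing postedAt."""
    raw_date = record.get("postedAt")  # "YYYY-MM-DD", or missing/None
    return Posting(
        global_id=record["globalId"],
        company=record["company"],
        title=record["title"],
        url=record["url"],
        department=record.get("department"),
        location=record.get("location"),
        posted_at=date.fromisoformat(raw_date) if raw_date else None,
        warnings=list(record.get("warnings") or []),
    )
```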

Delta mode / monitoring

Set onlyNewSinceLastRun: true to turn this Actor into a "what's new" monitor. Postings already seen on a previous run made with this flag on are dropped before push — you are not billed for them, so pairing this with an Apify schedule (cron) means every run only returns, and only charges for, postings that showed up since the last flagged run.

How it works: seen postings are tracked in a dedicated key-value store, keyed by each posting's globalId, capped at roughly 50,000 entries (oldest evicted first). The first run made with the flag on has nothing to compare against yet, so it emits everything, all marked isNew: true. Every emitted record gets isNew: true stamped on it; there's no isNew: false in the output, since already-seen postings just aren't included.

Runs made with the flag off never read or write this cache — turning it on and off between runs is safe and has no side effects on normal runs.
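
The delta logic described above can be illustrated with a small in-memory stand-in for the key-value store. This is a sketch under stated assumptions, not the Actor's implementation: the class name `SeenCache` is hypothetical, the store is a plain `OrderedDict` rather than an Apify key-value store, and eviction is modeled as strict insertion order.

```python
from collections import OrderedDict


class SeenCache:
    """In-memory model of the seen-postings cache (hypothetical sketch)."""

    def __init__(self, max_entries=50_000):
        self.max_entries = max_entries
        self._seen = OrderedDict()  # globalId -> True, oldest first

    def filter_new(self, postings):
        """Return only unseen postings, each stamped with isNew: True."""
        fresh = []
        for posting in postings:
            global_id = posting["globalId"]
            if global_id in self._seen:
                continue  # already seen: dropped, so never billed
            self._seen[global_id] = True
            fresh.append({**posting, "isNew": True})
            # Enforce the cap by evicting the oldest entries first.
            while len(self._seen) > self.max_entries:
                self._seen.popitem(last=False)
        return fresh
```

One consequence of a capped cache is visible in the model: once an entry has been evicted, the same posting would be reported as new again on a later run.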

AI enrichment

Turn on aiEnrichment and supply your own anthropicApiKey (or mistralApiKey, with aiProvider: "mistral") to add four AI-extracted fields to every posting:

FieldMeaning
aiKeySkillsSpecific skills/technologies/tools explicitly named in the title or description — the model is instructed to never invent one.
aiExperienceLevelOne of entry / mid / senior / lead / unknown.
aiWorkArrangementOne of onsite / hybrid / remote / unknown.
aiVisaSponsorshiptrue / false only when the posting explicitly states a sponsorship policy, otherwise null.

The extraction prompt is explicit about never guessing: when the text doesn't clearly support a value you get "unknown" / null / an empty array, not a fabricated answer. Pick the model with aiModel (Anthropic: claude-haiku-4-5-20251001 default, fast/cheap, or claude-sonnet-4-5 higher quality) or mistralModel (Mistral: mistral-small-latest default — matches larger Mistral models on this task, mistral-medium-latest, mistral-large-latest).

Postings are batched (~12 per call) through whichever provider's API you picked (aiProvider). Your Anthropic or Mistral API key is billed separately by that provider, not by this Actor. Rough cost with Haiku or Mistral Small: enriching 100 postings runs well under $0.05 in provider token spend (short prompts, small JSON replies); Sonnet/Mistral Large cost roughly 4-5x that for the same batch.
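
To get a feel for how many provider calls a run makes, the batching can be sketched as simple chunking. The helper `batched` below is hypothetical, and a fixed batch size of 12 is an assumption taken from the "~12 per call" figure; the Actor may size batches differently.

```python
def batched(items, size=12):
    """Yield consecutive chunks of at most `size` items (hypothetical helper)."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


postings = [f"posting-{n}" for n in range(100)]
batches = list(batched(postings))
# With these assumptions, 100 postings need 9 calls: 8 full batches and one of 4.
print(len(batches), len(batches[-1]))
```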

Enrichment needs a posting's description text even when you have includeDescription off — this Actor fetches it internally for enrichment either way, then still honors your includeDescription choice for what actually ends up in the output snippet field.

If aiEnrichment is on but no matching key is available (anthropicApiKey/ANTHROPIC_API_KEY for aiProvider: "anthropic", or mistralApiKey/MISTRAL_API_KEY for aiProvider: "mistral"), enrichment is skipped: you get one extra dataset row explaining why (warnings: ["aiEnrichment skipped: no anthropicApiKey or mistralApiKey provided"]), a run status message, and every other posting is still returned normally, just without the ai* fields.
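
Since the skip notice arrives as an extra dataset row, a consumer may want to separate it from real postings before processing. A minimal sketch, assuming the notice row is the only row whose warnings start with "aiEnrichment skipped"; the function name `split_skip_notice` is hypothetical, and the other fields of the notice row are not documented here.

```python
SKIP_PREFIX = "aiEnrichment skipped"


def split_skip_notice(rows):
    """Separate ordinary postings from enrichment-skipped notice rows."""
    postings, notices = [], []
    for row in rows:
        warnings = row.get("warnings") or []
        # Assumption: only the notice row carries a warning with this prefix.
        if any(w.startswith(SKIP_PREFIX) for w in warnings):
            notices.append(row)
        else:
            postings.append(row)
    return postings, notices
```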

This is the same class of field fantastic-jobs' career-site-api prices a whole tier on (ai_key_skills, work arrangement, visa signals) — comparable output here, opt-in and BYOK instead of bundled into every row's price.

Integrations

Export results as JSON, CSV or Excel/XLSX, or pipe them straight into Make, Zapier or n8n. Call this Actor synchronously with run-sync-get-dataset-items, or plug it into any AI agent through the Apify MCP server.

Pricing

Pay per event: $0.05 per Actor start and $0.004 per posting returned. 100 postings ≈ $0.45. No subscription — pay only for what you fetch.

If you turn on aiEnrichment, your Anthropic or Mistral key is billed separately by that provider for the enrichment calls themselves — see "AI enrichment" above for a rough per-100-postings cost estimate. Delta mode (onlyNewSinceLastRun) only reduces cost: already-seen postings are dropped before the billed push step.
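
The pay-per-event arithmetic can be checked with a few lines of code. This sketch uses the two prices stated above and covers only the Actor's own charges; provider token spend for AI enrichment is not included. The function name `estimate_actor_cost` is hypothetical.

```python
from decimal import Decimal

START_FEE = Decimal("0.05")     # per Actor start, in USD
PER_POSTING = Decimal("0.004")  # per posting returned, in USD


def estimate_actor_cost(postings_returned, runs=1):
    """Estimate Actor charges in USD for a number of runs and returned postings."""
    return START_FEE * runs + PER_POSTING * postings_returned


# One run returning 100 postings: 0.05 + 100 * 0.004 = 0.45
print(estimate_actor_cost(100))
```

`Decimal` is used instead of floats so that small per-posting prices add up without rounding noise.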

Use cases

  • Track hiring at specific companies (competitors, targets, portfolio)
  • Build company-careers pages and job boards without HTML scraping
  • Recruiting intelligence: who opens which roles, where, how fast
  • Feed AI matching agents with reliable ATS-direct data

FAQ

Is it legal to scrape Greenhouse jobs? The data comes from the ATS providers' official, public, unauthenticated JSON APIs — the same data any visitor sees on the company's careers page. Review the providers' terms for your use case.

Do I need an API key or login? No. These are public job-board APIs — no authentication of any kind.

What if a company isn't found? It is logged and skipped — the run continues with the other companies. Full board URLs are also accepted and reduced to slugs automatically.
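
How a board URL reduces to a slug can be sketched as taking the first path segment. This is an illustration only: the function `to_slug` is hypothetical, and the Actor's actual normalization rules (for example, which hosts or URL shapes it accepts) are not documented here.

```python
from urllib.parse import urlparse


def to_slug(company):
    """Reduce a slug or a board URL to the company slug (hypothetical sketch)."""
    value = company.strip()
    if "/" not in value:
        return value  # already a bare slug
    # Add a scheme if missing so urlparse separates host and path correctly.
    parsed = urlparse(value if "://" in value else "https://" + value)
    segments = [segment for segment in parsed.path.split("/") if segment]
    if not segments:
        raise ValueError(f"no company slug found in {company!r}")
    # Assumption: the slug is the first path segment after the host.
    return segments[0]
```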

Why is remote always null, and why does remoteOnly return nothing? Greenhouse's public API doesn't expose a remote/workplace-type field on any endpoint — not on the job list, not on the board root. Rather than guess from free-text location strings, this Actor reports null faithfully. Use locationFilter: "remote" to approximate it instead.

How fresh is the data? Every run hits the ATS APIs live. No caching layer in between.

Something broken or missing? Open an issue on the Actor's Issues tab — it is monitored and fixes ship fast.