Greenhouse Scraper - Career Site Jobs

Pricing

from $2.00 / 1,000 results

Scrape any Greenhouse-powered career site for structured job data, including application questions and department and office metadata, with multi-board batch scraping. Incremental mode detects new and changed listings. Compact output suits AI agents and MCP workflows.

Developer

Black Falcon Data

Maintained by Community

What does Greenhouse Scraper do?

Greenhouse Scraper extracts structured job data from greenhouse.io — including salary data, contact details, company metadata, and full descriptions. It supports keyword search, location filters, and controllable result limits, so you can run the same query consistently over time. The actor also offers detail enrichment (full descriptions and company metadata) where the source provides them.

Key features

  • Incremental mode — recurring runs emit and charge only for listings that are new or whose tracked content changed. First run builds the baseline state; subsequent runs emit only new or changed records.
  • Detail enrichment — full descriptions and company metadata where the source provides them.
  • Compact mode — AI-agent and MCP-friendly payloads with core fields only.

What data can you extract from greenhouse.io?

Each result includes core listing fields (jobId, greenhouseId, internalJobId, title, location, department, offices, employmentType, and more), detail fields when enrichment is enabled (description), apply information (applyUrl), and company metadata (company). In standard mode, all fields are always present; unavailable data points are returned as null, never omitted. In compact mode, only core fields are returned.

Enable detail enrichment in the input to get richer fields such as full descriptions and company metadata where the source provides them.

Input

The main inputs are one or more Greenhouse board tokens or URLs, an optional keyword, location, or department filter, and a result limit. Additional filters and options are available in the input schema.

Key parameters:

  • boardTokens — Greenhouse board tokens or URLs. Example: 'airbnb' or 'https://boards.greenhouse.io/airbnb'.
  • query — Filter jobs by keyword (matched against title and description).
  • location — Filter by location (substring match, e.g. 'London' or 'Remote').
  • department — Filter by department name (substring match, e.g. 'Engineering').
  • maxResults — Maximum total results across all boards (0 = unlimited). (default: 0)
  • includeDetails — Fetch pay transparency and application questions per job (slower — one extra request per job). (default: false)
  • descriptionMaxLength — Truncate HTML description to N characters. 0 = no truncation. (default: 0)
  • compact — Return only core fields (for AI-agent/MCP workflows). (default: false)
  • incrementalMode — Only return new or changed jobs since last run. (default: false)
  • stateKey — Custom key for incremental state (default: auto-generated from board tokens).
  • skipReposts — Skip jobs detected as reposts of previously seen jobs (cross-run detection via content hash). (default: false)
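
Since boardTokens accepts either a bare token or a board URL, the two forms have to be reduced to the same value at some point. The following is a minimal sketch of one way to do that, not the actor's actual code: the helper name board_token and the rule "first path segment is the token" are assumptions made for illustration.

```python
from urllib.parse import urlparse


def board_token(value: str) -> str:
    """Return a board token from a bare token or a board URL (hypothetical helper)."""
    value = value.strip()
    if "://" not in value:
        # Already a bare token such as "airbnb".
        return value.strip("/")
    # Assumption: the first non-empty path segment of the URL is the token.
    segments = [s for s in urlparse(value).path.split("/") if s]
    if not segments:
        raise ValueError(f"no board token found in {value!r}")
    return segments[0]


print(board_token("airbnb"))
print(board_token("https://boards.greenhouse.io/airbnb"))
```

Both calls print the same token, so either form could be passed in boardTokens under this assumption.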

Input examples

Basic search — Keyword-driven search with a result cap.

→ Full payload per result — all standard fields populated where the source provides them.

{
"query": "developer",
"maxResults": 50
}
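
The filter semantics described for query, location, and department can be sketched as a plain predicate. This is an illustration only: the function name matches is hypothetical, and case-insensitive matching is an assumption, since the parameter descriptions state only "substring match".

```python
def matches(job: dict, query: str = "", location: str = "", department: str = "") -> bool:
    """Return True if a job record passes every filter that is set."""

    def contains(haystack, needle):
        # Assumption: matching ignores case; missing fields never match.
        return needle.lower() in (haystack or "").lower()

    # query is matched against title and description.
    if query and not (contains(job.get("title"), query)
                      or contains(job.get("description"), query)):
        return False
    # location and department are substring matches.
    if location and not contains(job.get("location"), location):
        return False
    if department and not contains(job.get("department"), department):
        return False
    return True


job = {"title": "Account Executive (12 Month FTC)", "location": "Paris, France",
       "department": "Sales", "description": "<p>Airbnb was born in 2007</p>"}
print(matches(job, query="executive", location="Paris"))
```

An empty filter is treated as "not set", so an input with no filters keeps every job.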

Incremental tracking — Only emit jobs that changed since the previous run with this stateKey.

→ First run builds the baseline state. Subsequent runs emit only records that are new or whose tracked content changed. Set emitUnchanged: true to include unchanged records as well.

{
"query": "developer",
"maxResults": 200,
"incrementalMode": true,
"stateKey": "developer-tracker"
}

Compact output for AI agents — Return only core fields for AI-agent and MCP workflows.

→ Small payload with the most important fields — ideal for piping into LLMs without token overhead.

{
"query": "developer",
"maxResults": 50,
"compact": true
}
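
To show what a compact payload looks like relative to a standard one, here is a small sketch that projects a record onto core fields and truncates a description the way descriptionMaxLength is described. The CORE_FIELDS tuple is an assumption taken from the core fields named on this page (the real list includes "and more"), and both function names are hypothetical.

```python
# Assumption: core fields as named in this README; the actor's real list may be longer.
CORE_FIELDS = ("jobId", "greenhouseId", "internalJobId", "title",
               "location", "department", "offices", "employmentType")


def to_compact(record: dict) -> dict:
    """Keep only core fields; fields missing from the record become None."""
    return {key: record.get(key) for key in CORE_FIELDS}


def truncate_description(text, max_length: int = 0):
    """Cut text to max_length characters; 0 means no truncation."""
    if text is None or max_length <= 0:
        return text
    return text[:max_length]


record = {"jobId": "gh-7649441", "title": "Account Executive (12 Month FTC)",
          "description": "<div>long HTML</div>", "salaryMin": 61000}
print(to_compact(record))
print(truncate_description(record["description"], 5))
```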

Output

Each run produces a dataset of structured job records. Results can be downloaded as JSON, CSV, or Excel from the Dataset tab in Apify Console.

Example job record

{
"jobId": "gh-7649441",
"greenhouseId": 7649441,
"internalJobId": 3369660,
"title": "Account Executive (12 Month FTC)",
"company": "Airbnb",
"location": "Paris, France",
"department": "Sales",
"offices": [
"Paris, France"
],
"description": "<div class=\"content-intro\"><p><span style=\"font-family: helvetica, arial, sans-serif; font-size: 12pt;\">Airbnb was born in 2007 when two hosts welcomed three gues...",
"employmentType": null,
"workplaceType": "Hybrid",
"url": "https://careers.airbnb.com/positions/7649441?gh_jid=7649441",
"applyUrl": "https://careers.airbnb.com/positions/7649441?gh_jid=7649441",
"requisitionId": "ONE",
"language": "en",
"salaryMin": 61000,
"salaryMax": 72000,
"salaryCurrency": "EUR",
"salaryPeriod": "yearly",
"questions": [
{
"label": "First Name",
"type": "input_text",
"required": true
},
{
"label": "Last Name",
"type": "input_text",
"required": true
},
{
"label": "Email",
"type": "input_text",
"required": true
},
{
"label": "Phone",
"type": "input_text",
"required": true
},
{
"label": "Resume/CV",
"type": "input_file",
"required": true
},
{
"label": "Cover Letter",
"type": "input_file",
"required": false
},
{
"label": "LinkedIn Profile",
"type": "input_text",
"required": false
},
{
"label": "Why have you chosen to apply to Airbnb?",
"type": "textarea",
"required": true
},
{
"label": "Which city are you based? Are you within easy community distance to Paris?",
"type": "input_text",
"required": true
},
{
"label": "Please indicate your language proficiencies and the level of fluency for each.",
"type": "input_text",
"required": true
},
{
"label": "Before submitting your application please review the points below:",
"type": "multi_value_multi_select",
"required": true
},
{
"label": "Gender",
"type": "multi_value_single_select",
"required": true
},
{
"label": "Are you legally authorized to work in the country where the job is located?\n",
"type": "multi_value_single_select",
"required": true
},
{
"label": "Will you now or in the future require company sponsorship to retain or extend your work authorization in the country where the job is located?",
"type": "multi_value_single_select",
"required": true
},
{
"label": "Are you currently subject to any non-compete or non-solicitation agreement that would impact your ability to work at Airbnb or prevent you from accepting a job offer from Airbnb? ",
"type": "multi_value_single_select",
"required": true
},
{
"label": "Are you currently or have you ever worked for Airbnb in any capacity? This could include, but is not limited to, a full-time employee, intern, apprentice, or contingent worker.",
"type": "multi_value_single_select",
"required": true
}
],
"metadata": [
{
"name": "Is this job part of ACC?",
"value": false
},
{
"name": "Workplace Type",
"value": "Hybrid"
}
],
"postedDate": "2026-02-24T09:04:33-05:00",
"updatedAt": "2026-02-24T09:25:19-05:00",
"scrapedAt": "2026-03-29T00:52:15.327Z",
"portalUrl": "https://boards.greenhouse.io/airbnb",
"source": "greenhouse.io"
}
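
The description field is returned as HTML, so downstream consumers often need plain text. The sketch below uses only the Python standard library; the function name description_text is hypothetical, and joining text nodes with a single space is a simplification that can split words around inline tags.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collect the text nodes of an HTML fragment, ignoring all tags."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def description_text(html_description) -> str:
    """Return the description as whitespace-normalized plain text."""
    if not html_description:
        return ""
    parser = _TextExtractor()
    parser.feed(html_description)
    parser.close()  # flush any buffered trailing text
    return " ".join(" ".join(parser.parts).split())


html = '<div class="content-intro"><p><span>Airbnb was born in 2007</span></p></div>'
print(description_text(html))
```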

Incremental fields

When incrementalMode: true, each record also carries:

  • changeType — one of NEW, UPDATED, UNCHANGED, REAPPEARED, EXPIRED.
  • firstSeenAt, lastSeenAt — ISO-8601 timestamps tracking the listing across runs.
  • isRepost, repostOfId, repostDetectedAt — populated when a new listing matches the tracked content of a previously expired one. Set skipReposts: true to drop detected reposts from the output.
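
The changeType values can be understood through a small model of cross-run state. This is a sketch under stated assumptions, not the actor's implementation: the tracked fields, the state layout, and the function names are all hypothetical, and repost detection is left out.

```python
import hashlib
import json

# Assumption: which fields count as "tracked content" is not documented here.
TRACKED_FIELDS = ("title", "location", "department", "description")


def content_hash(job: dict) -> str:
    """Stable hash over the tracked fields of one job record."""
    payload = json.dumps({k: job.get(k) for k in TRACKED_FIELDS}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def classify(jobs: list, state: dict) -> dict:
    """Return {jobId: changeType} and update state in place.

    state maps jobId -> {"hash": str, "expired": bool}.
    """
    changes, seen = {}, set()
    for job in jobs:
        job_id = job["jobId"]
        seen.add(job_id)
        digest = content_hash(job)
        previous = state.get(job_id)
        if previous is None:
            changes[job_id] = "NEW"
        elif previous["expired"]:
            changes[job_id] = "REAPPEARED"
        elif previous["hash"] != digest:
            changes[job_id] = "UPDATED"
        else:
            changes[job_id] = "UNCHANGED"
        state[job_id] = {"hash": digest, "expired": False}
    # Listings known from earlier runs but absent now are marked expired once.
    for job_id, entry in state.items():
        if job_id not in seen and not entry["expired"]:
            entry["expired"] = True
            changes[job_id] = "EXPIRED"
    return changes


state = {}
print(classify([{"jobId": "gh-1", "title": "A"}, {"jobId": "gh-2", "title": "B"}], state))
print(classify([{"jobId": "gh-1", "title": "A (updated)"}], state))
```

In this model the first call is the baseline run, where every listing is NEW, and the second call reports one UPDATED and one EXPIRED listing.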

How to scrape greenhouse.io

  1. Go to Greenhouse Scraper in Apify Console.
  2. Enter one or more Greenhouse board tokens or URLs, plus an optional keyword, location, or department filter.
  3. Set maxResults to control how many results you need.
  4. Enable includeDetails if you need pay transparency data and application questions for each job.
  5. Click Start and wait for the run to finish.
  6. Export the dataset as JSON, CSV, or Excel.

Use cases

  • Extract job data from greenhouse.io for market research and competitive analysis.
  • Track salary trends across regions and categories over time.
  • Monitor new and changed listings on scheduled runs without processing the full dataset every time.
  • Build outreach lists using contact details and apply URLs from listings.
  • Research company hiring patterns, employer profiles, and industry distribution.
  • Feed structured data into AI agents, MCP tools, and automated pipelines using compact mode.
  • Export clean, structured data to dashboards, spreadsheets, or data warehouses.

How much does it cost to scrape greenhouse.io?

Greenhouse Scraper uses pay-per-event pricing. You pay a small fee when the run starts and then for each result that is actually produced.

  • Run start: $0.005 per run
  • Per result: $0.002 per job record

Example costs:

  • 10 results: $0.03
  • 100 results: $0.21
  • 500 results: $1.01
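
The example costs follow directly from the two event prices above. As a quick check, the sketch below computes the unrounded cost of a run; the function name run_cost is hypothetical, and Decimal is used only to avoid floating-point rounding noise.

```python
from decimal import Decimal

RUN_START_FEE = Decimal("0.005")   # charged once per run
PER_RESULT_FEE = Decimal("0.002")  # charged per job record


def run_cost(results: int) -> Decimal:
    """Unrounded cost in USD of one run that produces `results` records."""
    return RUN_START_FEE + PER_RESULT_FEE * results


for n in (10, 100, 500):
    print(n, run_cost(n))
```

The unrounded values are $0.025, $0.205, and $1.005 for 10, 100, and 500 results.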

Example: recurring monitoring savings

These examples compare full re-scrapes with incremental runs at different churn rates. Churn is the share of listings that are new or whose tracked content changed since the previous run. Actual churn depends on your query breadth, source activity, and polling frequency — the scenarios below are examples, not predictions.

Example setup: 250 results per run, daily polling (30 runs/month). Event-pricing examples scale linearly with result count.

Churn rate                         Full re-scrape run cost   Incremental run cost   Savings vs full re-scrape   Monthly cost after baseline
5% (stable niche query)            $0.51                     $0.03                  $0.47 (94%)                 $0.90
15% (moderate broad query)         $0.51                     $0.08                  $0.42 (84%)                 $2.40
30% (high-volume aggregator)       $0.51                     $0.15                  $0.35 (69%)                 $4.65

Full re-scrape monthly cost at daily polling: $15.15. First month with incremental costs $1.38 / $2.82 / $5.00 for the 5% / 15% / 30% scenarios because the first run builds baseline state at full cost before incremental savings apply.
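
The monthly figures can be reproduced from the event prices and the example setup (250 results per run, 30 runs per month). The sketch below is a worked calculation, not part of the actor; the function names are hypothetical, and fractional result counts such as 12.5 are kept as-is because the scenarios are averages.

```python
from decimal import Decimal

RUN_START_FEE = Decimal("0.005")
PER_RESULT_FEE = Decimal("0.002")


def run_cost(results) -> Decimal:
    """Unrounded cost in USD of one run producing `results` records."""
    return RUN_START_FEE + PER_RESULT_FEE * Decimal(str(results))


def steady_month_cost(results_per_run, churn, runs=30) -> Decimal:
    """Monthly cost once the baseline exists: every run is incremental."""
    emitted = Decimal(str(results_per_run)) * Decimal(str(churn))
    return run_cost(emitted) * runs


def first_month_cost(results_per_run, churn, runs=30) -> Decimal:
    """One full-cost baseline run followed by runs - 1 incremental runs."""
    emitted = Decimal(str(results_per_run)) * Decimal(str(churn))
    return run_cost(results_per_run) + run_cost(emitted) * (runs - 1)


print(run_cost(250) * 30)  # full re-scrape every day
for churn in ("0.05", "0.15", "0.30"):
    print(churn, steady_month_cost(250, churn), first_month_cost(250, churn))
```

The unrounded first-month values are $1.375, $2.825, and $5.00, which correspond to the rounded figures quoted above.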

FAQ

How many results can I get from greenhouse.io?

The number of results depends on the search query and available listings on greenhouse.io. Use the maxResults parameter to control how many results are returned per run.

Does Greenhouse Scraper support recurring monitoring?

Yes. Enable incremental mode to only receive new or changed listings on subsequent runs. This is ideal for scheduled monitoring where you want to track changes over time without re-processing the full dataset.

Can I integrate Greenhouse Scraper with other apps?

Yes. Greenhouse Scraper works with Apify's integrations to connect with tools like Zapier, Make, Google Sheets, Slack, and more. You can also use webhooks to trigger actions when a run completes.

Can I use Greenhouse Scraper with the Apify API?

Yes. You can start runs, manage inputs, and retrieve results programmatically through the Apify API. Client libraries are available for JavaScript, Python, and other languages.
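
As a rough illustration, a run could be started from Python with the apify-client package. Treat this as a sketch under assumptions: the actor ID and token are placeholders you must supply, boardTokens is assumed to be a list, and the helper names build_run_input and fetch_jobs are hypothetical.

```python
def build_run_input(board_tokens, query="", max_results=0, compact=False) -> dict:
    """Assemble an input object from the parameters documented above."""
    run_input = {
        "boardTokens": list(board_tokens),  # assumption: passed as a list
        "maxResults": max_results,
        "compact": compact,
    }
    if query:
        run_input["query"] = query
    return run_input


def fetch_jobs(token: str, actor_id: str, run_input: dict) -> list:
    """Start a run, wait for it to finish, and return the dataset items."""
    # Third-party dependency: pip install apify-client
    from apify_client import ApifyClient

    client = ApifyClient(token)
    run = client.actor(actor_id).call(run_input=run_input)
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


run_input = build_run_input(["airbnb"], query="developer", max_results=50)
print(run_input)
# jobs = fetch_jobs("<YOUR_APIFY_TOKEN>", "<ACTOR_ID>", run_input)  # placeholders
```

The network call is left commented out so the snippet runs without credentials; copy the actor ID from the actor's page in Apify Console.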

Can I use Greenhouse Scraper through an MCP Server?

Yes. Apify provides an MCP Server that lets AI assistants and agents call this actor directly. Use compact mode and descriptionMaxLength to keep payloads manageable for LLM context windows.

Is it legal to scrape greenhouse.io?

This actor extracts publicly available data from greenhouse.io. Web scraping of public information is generally considered legal, but you should always review the target site's terms of service and ensure your use case complies with applicable laws and regulations, including GDPR where relevant.

Your feedback

If you have questions, need a feature, or found a bug, please open an issue on the actor's page in Apify Console. Your feedback helps us improve.

Getting started with Apify

New to Apify? Create a free account with $5 credit — no credit card required.

  1. Sign up — $5 platform credit included
  2. Open this actor and configure your input
  3. Click Start — export results as JSON, CSV, or Excel

Need more later? See Apify pricing.