Greenhouse Hiring Intelligence Scraper

Pricing

from $1.80 / 1,000 job-results

Scrape public Greenhouse job boards via the Greenhouse Job Board API and turn them into clean, flat, CSV-ready job + hiring-intelligence rows - no login, cookies, or browser required.

Rating

0.0

(0)

Developer

Delowar Munna

Maintained by Community

Actor stats

Bookmarked: 0
Total users: 2
Monthly active users: 1
Last modified: 11 days ago

Greenhouse Hiring Intelligence Scraper

Scrape public Greenhouse job boards straight from the Greenhouse Job Board API and turn them into clean, flat, CSV-ready rows — plus lightweight hiring-intelligence fields (remote/hybrid flags, seniority, department group, role family, salary text, hiring-signal score + reason tags). Built for lead-gen and sales teams, recruiters, staffing agencies, market researchers, and data analysts.

No login, no cookies, no Harvest API keys, no browser. The actor uses Greenhouse's public Job Board API over HTTP, so it stays fast and cost-predictable. You pay one flat event per unique job row that passes your filters.

✨ Why this scraper

  • Greenhouse-specific, hiring-intelligence focused — not a generic job scraper. Every row carries derived signals that make the data useful for sales triggers, recruiting, and market research.
  • Public API, one request per board — Greenhouse returns every published job (with descriptions, departments, offices, and company name) in a single JSON response. No pagination, no per-job page visits.
  • 34 flat fields — job identity, company, role metadata, location/remote, compensation, description, dates, and hiring signals. No nested objects — drops straight into Sheets/Excel/CRMs.
  • Pay-Per-Event — one flat job-result event per saved unique job. Duplicates and filtered rows are never charged.
  • No login / cookies / sessions / paid APIs — just board tokens or URLs.
  • Transparent hiring-signal score — rule-based (no AI), explained below.

🚀 Quick start — sample inputs

Example 1 — a couple of boards by token

{
  "boards": ["airbnb", "stripe"],
  "maxResults": 500,
  "includeDescription": true,
  "descriptionFormat": "text",
  "deduplicate": true,
  "proxyConfiguration": { "useApifyProxy": false }
}

Example 2 — filtered, remote senior roles across boards

{
  "boards": ["airbnb", "https://boards.greenhouse.io/stripe"],
  "maxResults": 1000,
  "includeDescription": true,
  "descriptionFormat": "text",
  "keywordFilter": ["data", "software", "sales"],
  "excludeKeywords": ["internship"],
  "departmentFilter": [],
  "locationFilter": ["remote", "United States"],
  "remoteOnly": false,
  "seniorityFilter": ["senior", "lead", "manager"],
  "updatedAfter": "2026-01-01",
  "requireUpdatedDate": false,
  "deduplicate": true,
  "proxyConfiguration": { "useApifyProxy": true }
}

🧾 Inputs

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| boards | array | ["airbnb"] | Board tokens (airbnb), board URLs (https://boards.greenhouse.io/airbnb), or Job Board API URLs. At least one is required. |
| maxResults | integer | 1000 | Maximum number of saved unique jobs across the run (1–50000). |
| includeDescription | boolean | true | Request full descriptions (content=true). Improves derived fields and the signal score. |
| descriptionFormat | string | text | One of text, html, or both. |
| keywordFilter | array | [] | Keep jobs matching any term (searched in title, department, location, company, and description). |
| excludeKeywords | array | [] | Remove jobs matching any term. Exclusion wins over inclusion. |
| departmentFilter | array | [] | Keep jobs whose department or department group matches. |
| locationFilter | array | [] | Keep jobs whose location, city, country, or workplace type matches. |
| remoteOnly | boolean | false | Keep only remote or hybrid jobs. |
| seniorityFilter | array | [] | Keep only matching seniority classes (intern through executive, plus unknown). |
| updatedAfter | string | "" | Keep jobs updated or published on or after this ISO date (YYYY-MM-DD). |
| requireUpdatedDate | boolean | false | When updatedAfter is set, also drop jobs that have no source timestamp. |
| deduplicate | boolean | true | Remove duplicate jobs across boards and inputs. |
| proxyConfiguration | object | { "useApifyProxy": true } | Accepts no proxy, Apify Datacenter proxy, or custom proxy URLs. Apify Residential is rejected at startup. |

No input accepts cookies, login credentials, authorization headers, user API keys, session tokens, or Greenhouse Harvest API credentials.
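
The filter rules in the inputs table can be sketched roughly as follows. This is an illustration, not the actor's code: the function names are hypothetical, and case-insensitive substring matching and the fallback from updated_at to published_at are assumptions.

```python
from datetime import date

# Fields searched by keywordFilter, per the inputs table (output field names assumed).
SEARCH_FIELDS = ("title", "department", "location", "company_name", "description_text")


def matches_keywords(row: dict, keyword_filter: list, exclude_keywords: list) -> bool:
    """Keep a row if it matches any include term and no exclude term."""
    haystack = " ".join(str(row.get(f) or "") for f in SEARCH_FIELDS).lower()
    # Exclusion wins over inclusion, so it is checked first.
    if any(term.lower() in haystack for term in exclude_keywords):
        return False
    # An empty include list keeps everything.
    if not keyword_filter:
        return True
    return any(term.lower() in haystack for term in keyword_filter)


def passes_updated_after(row: dict, updated_after: str, require_updated_date: bool) -> bool:
    """Apply updatedAfter, dropping undated rows only when requireUpdatedDate is set."""
    if not updated_after:
        return True
    stamp = row.get("updated_at") or row.get("published_at")
    if not stamp:
        return not require_updated_date
    # Compare on the date part of the ISO timestamp (YYYY-MM-DD).
    return date.fromisoformat(stamp[:10]) >= date.fromisoformat(updated_after)
```

With this sketch, a row titled "Senior Data Engineer" passes keywordFilter ["data"] but is dropped as soon as excludeKeywords contains "senior".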


📤 Output

One flat, CSV-friendly row per job. A row is valid if it has at least one of: job_id, job_url, or the pair title + company_board_token.
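
One possible reading of the validity rule, as a small check. The function name is hypothetical and empty strings are treated as missing, which is an assumption.

```python
def is_valid_row(row: dict) -> bool:
    """Valid if job_id or job_url is set, or both title and company_board_token are."""
    if row.get("job_id") or row.get("job_url"):
        return True
    return bool(row.get("title") and row.get("company_board_token"))
```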

Job & hiring-intelligence — table view

Sample output row

{
  "job_id": "7881559",
  "job_url": "https://careers.airbnb.com/positions/7881559?gh_jid=7881559",
  "apply_url": "https://careers.airbnb.com/positions/7881559?gh_jid=7881559",
  "company_board_token": "airbnb",
  "company_name": "Airbnb",
  "title": "Associate Principal, SF&A - Host Products",
  "department": "Financial Planning and Analysis",
  "department_group": "Finance",
  "role_family": "finance",
  "location": "United States",
  "locations_raw": "United States",
  "country": "United States",
  "city": null,
  "workplace_type": "remote",
  "is_remote": true,
  "seniority": "lead",
  "employment_type": "unknown",
  "salary_text": "$157,000—$172,000",
  "salary_min": 157000,
  "salary_max": 172000,
  "salary_currency": "USD",
  "description_text": "Airbnb was born in 2007 when two hosts welcomed three guests to their San Francisco home... The Associate Principal, Strategic Finance & Analytics, Host Product role will be responsible for partnering closely with Product, Supply, and Analytics leaders across Airbnb's host ecosystem... Pay Range $157,000—$172,000 USD",
  "description_html": null,
  "published_at": "2026-05-05T14:12:13.000Z",
  "updated_at": "2026-05-05T14:12:13.000Z",
  "requisition_id": "ONE",
  "questions_count": 0,
  "has_application_questions": false,
  "hiring_signal_score": 100,
  "hiring_signal_label": "high",
  "reason_tags": "recent_job; remote_or_hybrid; senior_role; salary_visible; high_description_quality",
  "source_type": "greenhouse_job_board_api",
  "source_input": "airbnb",
  "scraped_at": "2026-06-02T07:39:29.325Z"
}

Output fields (34)

job_id, job_url, apply_url, company_board_token, company_name, title, department, department_group, role_family, location, locations_raw, country, city, workplace_type, is_remote, seniority, employment_type, salary_text, salary_min, salary_max, salary_currency, description_text, description_html, published_at, updated_at, requisition_id, questions_count, has_application_questions, hiring_signal_score, hiring_signal_label, reason_tags, source_type, source_input, scraped_at.
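
Because every row is flat, exported items can be written to CSV with the standard library alone. The sketch below assumes you already hold the rows as Python dicts (for example from a downloaded JSON export); rows_to_csv is a hypothetical helper, and the column order simply follows the field list above.

```python
import csv
import io

# The 34 output fields, in the order listed above.
OUTPUT_FIELDS = [
    "job_id", "job_url", "apply_url", "company_board_token", "company_name",
    "title", "department", "department_group", "role_family", "location",
    "locations_raw", "country", "city", "workplace_type", "is_remote",
    "seniority", "employment_type", "salary_text", "salary_min", "salary_max",
    "salary_currency", "description_text", "description_html", "published_at",
    "updated_at", "requisition_id", "questions_count",
    "has_application_questions", "hiring_signal_score", "hiring_signal_label",
    "reason_tags", "source_type", "source_input", "scraped_at",
]


def rows_to_csv(rows: list) -> str:
    """Serialize flat job rows to CSV text; missing or null fields become empty cells."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=OUTPUT_FIELDS, extrasaction="ignore")
    writer.writeheader()
    for row in rows:
        writer.writerow(row)
    return buffer.getvalue()
```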

Application questions: the Greenhouse Job Board jobs endpoint does not return application questions, so questions_count is 0 and has_application_questions is false in this version (the actor never makes extra per-job requests).

A run summary is stored in the default key-value store under key RUN_SUMMARY (inputs, normalized boards, raw vs saved counts, duplicates removed, filtered out, charged events, failed boards, runtime).


🧮 Hiring-signal score (transparent, no AI)

Each row gets a hiring_signal_score from 0–100, built only from visible/derived fields:

| Points | Condition |
| --- | --- |
| +20 | Job title present |
| +15 | Department present |
| +15 | Location present |
| +10 | Description ≥ 500 characters |
| +10 | Remote or hybrid |
| +10 | Senior+ seniority (senior/lead/manager/director/executive) |
| +10 | Visible salary/compensation text |
| +10 | Updated/published within the last 45 days |

Labels: 0–39 = low, 40–69 = medium, 70–100 = high.

Reason tags (semicolon-separated) include any of: recent_job, remote_or_hybrid, senior_role, salary_visible, engineering_hiring, sales_hiring, multi_location, high_description_quality, application_questions_present.
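
The scoring table translates into a handful of additive checks. The sketch below is a reconstruction from the table, not the actor's source. The function names, the use of workplace_type to detect remote/hybrid, and the fallback from updated_at to published_at are assumptions.

```python
from datetime import datetime, timedelta, timezone

SENIOR_PLUS = {"senior", "lead", "manager", "director", "executive"}


def hiring_signal_score(row: dict, now: datetime) -> int:
    """Sum the points from the scoring table for one flat job row."""
    score = 0
    if row.get("title"):
        score += 20
    if row.get("department"):
        score += 15
    if row.get("location"):
        score += 15
    if len(row.get("description_text") or "") >= 500:
        score += 10
    if row.get("workplace_type") in ("remote", "hybrid"):
        score += 10
    if row.get("seniority") in SENIOR_PLUS:
        score += 10
    if row.get("salary_text"):
        score += 10
    stamp = row.get("updated_at") or row.get("published_at")
    if stamp:
        # Replace a trailing "Z" so older Python versions can parse the timestamp.
        updated = datetime.fromisoformat(stamp.replace("Z", "+00:00"))
        if now - updated <= timedelta(days=45):
            score += 10
    return score


def hiring_signal_label(score: int) -> str:
    """Map a 0-100 score to the low/medium/high label."""
    if score >= 70:
        return "high"
    if score >= 40:
        return "medium"
    return "low"
```

Applied to the sample output row (which meets all eight conditions as of its scraped_at time), the points sum to 20 + 15 + 15 + 10 + 10 + 10 + 10 + 10 = 100, which matches the hiring_signal_score of 100 and the label high shown there.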


💸 Pricing — Pay-Per-Event

| Event | When it fires |
| --- | --- |
| job-result | Once per valid unique job row that passed all filters and was successfully saved to the dataset. |

Duplicates, filtered-out rows, and failed board fetches are never charged. The actor also respects the per-run spending limit you set on Apify — it stops saving once that limit is reached.
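
For budgeting, the cost of a run is simply the number of saved rows times the per-event price. The sketch below assumes the listed "from" price of $1.80 per 1,000 job-results; your actual rate may differ by plan, and both function names are hypothetical.

```python
from decimal import Decimal

# Listed "from" price; treat it as an assumption, not a guaranteed rate.
PRICE_PER_1000_USD = Decimal("1.80")


def estimated_cost_usd(saved_rows: int) -> Decimal:
    """Estimated charge for a run that saves the given number of unique rows."""
    return (PRICE_PER_1000_USD * saved_rows / 1000).quantize(Decimal("0.0001"))


def max_rows_for_budget(budget_usd: Decimal) -> int:
    """Largest number of rows that fits within a per-run spending limit."""
    return int(budget_usd * 1000 / PRICE_PER_1000_USD)
```

At that rate, a run that saves 500 rows costs about $0.90, and a $5.00 spending limit covers 2,777 rows.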

🚦 Proxy policy

The Greenhouse Job Board API is public and typically works with no proxy. Apify Datacenter proxy and no proxy both work reliably at this actor's conservative concurrency.

Apify Residential proxy is not supported. The actor fails at startup if apifyProxyGroups includes RESIDENTIAL. Reason: in pay-per-event actors, residential bandwidth (billed per GB) is charged to the developer, not the run user, so a single bandwidth-heavy run could cost more than the per-result event revenue it generates.

If you genuinely need residential routing, supply your own residential provider via the proxy editor's Custom proxy URLs field — that traffic goes through your provider, not Apify, and is unaffected:

http://user:pass@proxy.iproyal.com:12321
http://user:pass@proxy.brightdata.com:22225
http://user:pass@proxy.oxylabs.io:7777
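
The startup check described above could look something like this. It is a sketch only: validate_proxy_configuration is a hypothetical name, the case-insensitive comparison is an assumption, and the proxyUrls key for custom proxy URLs is assumed to follow Apify's usual proxy input shape.

```python
def validate_proxy_configuration(proxy_configuration: dict) -> None:
    """Raise ValueError if the Apify RESIDENTIAL proxy group is selected."""
    groups = proxy_configuration.get("apifyProxyGroups") or []
    if any(str(group).upper() == "RESIDENTIAL" for group in groups):
        raise ValueError(
            "Apify Residential proxy is not supported; use no proxy, "
            "Apify Datacenter, or custom proxy URLs instead."
        )
```

No proxy, Apify Datacenter, and custom proxy URLs all pass this check; only a configuration whose apifyProxyGroups contains RESIDENTIAL is rejected.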

⚙️ How it works

  1. Each board input is normalized to a Greenhouse board token, then to https://boards-api.greenhouse.io/v1/boards/{token}/jobs?content=true.
  2. One HTTP request per board returns all published jobs as JSON (no pagination, no browser).
  3. Each job is mapped to a flat row; descriptions are decoded from HTML to text; department group, role family, workplace type, seniority, employment type, salary, country/city are derived with deterministic rules.
  4. Filters (keyword/exclude/department/location/remote/seniority/updated-after) and deduplication are applied.
  5. Surviving unique rows are scored, saved, and charged. Failed boards are recorded; the run still returns partial results.
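
Steps 1, 2, and the deduplication part of step 4 can be sketched with the standard library. The URL template comes from step 1. The helper names, the URL-parsing rules, the "jobs" key in the response, and the (board token, job ID) deduplication key are assumptions for illustration.

```python
import json
from urllib.parse import urlparse
from urllib.request import urlopen

API_TEMPLATE = "https://boards-api.greenhouse.io/v1/boards/{token}/jobs?content=true"


def normalize_board_token(board_input: str) -> str:
    """Reduce a token, board URL, or Job Board API URL to a bare board token."""
    value = board_input.strip()
    if "://" not in value:
        return value
    parts = [p for p in urlparse(value).path.split("/") if p]
    # API URLs look like /v1/boards/{token}/jobs; board URLs look like /{token}.
    if "boards" in parts and parts.index("boards") + 1 < len(parts):
        return parts[parts.index("boards") + 1]
    if parts:
        return parts[0]
    raise ValueError(f"Cannot derive a board token from {board_input!r}")


def build_jobs_url(token: str) -> str:
    return API_TEMPLATE.format(token=token)


def fetch_board_jobs(token: str, timeout: float = 30.0) -> list:
    """One HTTP request per board; assumes the response carries a 'jobs' array."""
    with urlopen(build_jobs_url(token), timeout=timeout) as response:
        payload = json.load(response)
    return payload.get("jobs", [])


def deduplicate(rows: list) -> list:
    """Keep the first row seen for each (board token, job ID) pair."""
    seen = set()
    unique = []
    for row in rows:
        key = (row.get("company_board_token"), row.get("job_id"))
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique
```

Under these assumptions, "airbnb", "https://boards.greenhouse.io/airbnb", and the full API URL all normalize to the same token, so listing a board twice in different forms still yields one request target.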

📌 Notes & limits

  • Public, published jobs only. No login, private, or closed listings; no candidate/application data.
  • Salary fields are best-effort extraction from visible description text.
  • This actor is built to extend cleanly into Lever, Ashby, and Workday-public variants.
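
To illustrate why salary extraction is best-effort, here is a minimal regex-based sketch that pulls a dollar range out of description text. It is not the actor's implementation: parse_salary is a hypothetical helper, it only handles "$min-$max" style ranges (with a hyphen, en dash, em dash, or "to"), and mapping "$" to USD is an assumption.

```python
import re

# Matches ranges such as "$157,000-$172,000" with various dash characters or "to".
SALARY_RANGE = re.compile(
    r"\$\s?([\d,]+)\s*(?:[-\u2013\u2014]|to)\s*\$\s?([\d,]+)"
)


def parse_salary(description_text: str) -> dict:
    """Return salary fields from the first dollar range found, or nulls if none."""
    match = SALARY_RANGE.search(description_text or "")
    if not match:
        return {
            "salary_text": None,
            "salary_min": None,
            "salary_max": None,
            "salary_currency": None,
        }
    low = int(match.group(1).replace(",", ""))
    high = int(match.group(2).replace(",", ""))
    return {
        "salary_text": match.group(0),
        "salary_min": low,
        "salary_max": high,
        # Assumption: a bare "$" is read as USD.
        "salary_currency": "USD",
    }
```

Text without a recognizable range, or with pay expressed in another shape (hourly rates, "up to" figures, other currencies), falls through to nulls in this sketch, which is the kind of gap "best-effort" implies.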