Greenhouse Jobs Scraper & Intent Signals API
Pricing
from $4.90 / 1,000 results
Scrape jobs from any Greenhouse career page instantly. Extract clean, English-only job data with AI intent tagging (e.g., 'Data & AI') and days-active filters. Perfect for B2B sales leads, lead generation, and feeding LLMs.
Developer: Aether (maintained by community)
Last modified: 3 days ago
Greenhouse Hiring Intent Signals — The SDR's Secret Weapon
This is NOT a generic job scraper. It's a high-speed, API-native extractor purpose-built to feed clean, actionable hiring signals directly into LLM prompt chains and CRM outreach workflows.
A 200-person Series B company posting 6 new "Sales Engineer" roles this week isn't just hiring — they just got budget approval for a tool your software replaces. That's the signal. This Actor surfaces it in under 60 seconds, with zero HTML bloat and zero wasted tokens.
Why Choose This Actor?
⏱ Time-Based Filtering (max_days_old)
Generic scrapers dump every job posted since 2019 into your pipeline — noise that burns API credits and wastes SDR time. This Actor's hard freshness gate drops any job last updated more than N days ago (default: 7). You only see signals from companies actively spending budget right now.
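The freshness gate can be sketched in a few lines of TypeScript. The helper name and field names below are assumptions drawn from the sample output; the Actor's internal implementation may differ:

```typescript
// Drop any job whose updated_at timestamp is older than maxDaysOld days.
// Hypothetical helper mirroring the Actor's freshness gate.
function isFresh(updatedAt: string, maxDaysOld: number, now: Date = new Date()): boolean {
  const ageMs = now.getTime() - new Date(updatedAt).getTime();
  const daysActive = Math.floor(ageMs / (24 * 60 * 60 * 1000));
  return daysActive <= maxDaysOld;
}
```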
🏷 Auto Intent Categorization
Every job is classified into a high-level intent_category via zero-cost regex rules — no LLM call required:
| Category | Example Signals | Your Play |
|---|---|---|
| Data & AI | Data Scientist, ML Engineer, AI Product Manager | Pitch your data infra / analytics tool |
| Revenue | Account Executive, SDR, VP of Sales | Pitch your CRM / sales engagement platform |
| Engineering | Backend Engineer, DevOps, Security Engineer | Pitch your developer tool / cloud service |
| Product | Product Manager, Program Manager | Pitch your project management / roadmapping tool |
| Marketing | Growth Marketer, Demand Gen, Content Strategist | Pitch your marketing automation / SEO tool |
| General | Everything else | Generic nurture |
Route each category to a different email sequence automatically — no extra AI processing needed.
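Zero-cost regex classification of this kind boils down to an ordered rule list checked against the job title. The patterns below are illustrative only (the Actor's actual rules are not published); note that rule order decides ties, e.g. "ML Engineer" hits Data & AI before Engineering:

```typescript
// Illustrative intent-tagging rules; first matching pattern wins.
const INTENT_RULES: [string, RegExp][] = [
  ["Data & AI", /\b(data scientist|ml engineer|machine learning|ai product)\b/i],
  ["Revenue", /\b(account executive|sdr|vp of sales)\b/i],
  ["Engineering", /\b(engineer|devops|security)\b/i],
  ["Product", /\b(product manager|program manager)\b/i],
  ["Marketing", /\b(marketer|demand gen|content strategist|growth)\b/i],
];

function classifyIntent(jobTitle: string): string {
  for (const [category, pattern] of INTENT_RULES) {
    if (pattern.test(jobTitle)) return category;
  }
  return "General"; // fallback bucket for everything else
}
```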
🌐 English-Only Guarantee
Greenhouse boards for global companies often contain duplicate job postings in French, German, Japanese, and other languages for their EMEA and APAC offices. These localized listings add zero value to your English-language outbound campaigns and burn expensive LLM tokens. This Actor auto-detects and filters out non-English job descriptions using a fast Unicode character-range heuristic, keeping your output lean and your token costs flat.
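A character-range heuristic of this kind can be sketched as follows. The specific Unicode blocks and the 10% threshold are assumptions for illustration, not the Actor's published values:

```typescript
// Rough English filter: flag text as non-English when the share of
// characters from common non-Latin scripts exceeds a threshold.
function looksEnglish(text: string, maxNonLatinRatio = 0.1): boolean {
  if (text.length === 0) return true;
  // Cyrillic, Arabic, Japanese kana, CJK ideographs, Hangul
  const nonLatin = text.match(/[\u0400-\u04FF\u0600-\u06FF\u3040-\u30FF\u3400-\u9FFF\uAC00-\uD7AF]/g);
  const ratio = (nonLatin?.length ?? 0) / text.length;
  return ratio <= maxNonLatinRatio;
}
```

Note that a character-range check is fast and dependency-free but cannot separate Latin-script languages (e.g. French vs. English) on its own, which is why it is a heuristic rather than a guarantee of perfect recall.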
🧹 LLM-Ready Clean Text
Forget raw HTML with nested `<div>` tags, inline styles, and 5,000-character walls of text. This Actor strips all HTML via Cheerio and produces a clean `job_summary_clean` field — the first 400 characters of plain text, word-boundary truncated. Drop it directly into your GPT prompt:
"Write a cold email to the Head of Engineering at {company}, referencing this job opening: {job_summary_clean}"
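The clean-text step (strip tags, collapse whitespace, truncate at a word boundary) can be sketched like this. The Actor uses Cheerio for HTML parsing; a naive regex strip stands in here to keep the example self-contained:

```typescript
// Strip tags, collapse whitespace, then cut to maxLen at a word boundary.
function toCleanSummary(html: string, maxLen = 400): string {
  const text = html.replace(/<[^>]+>/g, " ").replace(/\s+/g, " ").trim();
  if (text.length <= maxLen) return text;
  const cut = text.slice(0, maxLen + 1);
  const lastSpace = cut.lastIndexOf(" ");
  // Fall back to a hard cut if there is no space to break on.
  return (lastSpace > 0 ? cut.slice(0, lastSpace) : cut.slice(0, maxLen)) + "…";
}
```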
Input Configuration
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| `company_boards` | string[] | Yes | — | Greenhouse board tokens (the subdomain slug). e.g. `"datadog"` → https://boards.greenhouse.io/datadog |
| `max_days_old` | integer | No | 7 | Skip jobs last updated more than this many days ago. Set to 1 for today-only signals, 14 for a broader sweep. Max: 90. |
| `target_departments` | string[] | No | `[]` (all) | Case-insensitive department filter. Pass `["Engineering"]` to only see engineering roles. Leave empty to scrape everything. |
Example input:
```json
{
  "company_boards": ["datadog", "stripe", "figma", "vercel"],
  "max_days_old": 3,
  "target_departments": ["Engineering", "Data Science"]
}
```
Sample Output
Here's exactly what you get — one flat, LLM-ready record per fresh signal:
```json
{
  "board_name": "datadog",
  "job_title": "Senior Software Engineer - Data Platform",
  "department": "Engineering",
  "location": "New York, NY, United States",
  "updated_at": "2026-05-11T09:15:00-04:00",
  "days_active": 2,
  "intent_category": "Engineering",
  "job_summary_clean": "As a Senior Software Engineer on the Data Platform team, you will design and build the next generation of Datadog's petabyte-scale analytics infrastructure. You'll work closely with product teams to deliver real-time observability features used by thousands of enterprise customers. Strong experience with distributed systems, Kafka, and…",
  "apply_url": "https://boards.greenhouse.io/datadog/jobs/9876543"
}
```
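Because each record is flat, downstream prioritization needs no parsing logic. A minimal sketch, using field names taken from the sample record above, of filtering to the categories your sequences cover and sorting freshest-first:

```typescript
// Minimal shape of a signal record (subset of the Actor's output fields).
interface Signal {
  board_name: string;
  job_title: string;
  intent_category: string;
  days_active: number;
}

// Keep only covered categories, then sort freshest-first.
function prioritise(signals: Signal[], categories: Set<string>): Signal[] {
  return signals
    .filter((s) => categories.has(s.intent_category))
    .sort((a, b) => a.days_active - b.days_active);
}
```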
Every field earns its place in your workflow:
| Field | What it unlocks |
|---|---|
| `days_active` | Sort ascending. Hit 0–2 day signals first — those companies just opened budget. |
| `intent_category` | Route "Data & AI" signals to your data tool pitch. Route "Revenue" signals to your CRM pitch. Zero-runtime classification. |
| `job_summary_clean` | Drop into GPT-4o / Claude prompt. "Write a cold email referencing this opening…" No pre-processing needed. |
| `apply_url` | Cross-reference with your CRM. Existing customer? Deprioritize. Net-new logo? Gold. |
| `department` | Verify alignment with your ICP. You sell to Engineering leads? Filter `target_departments: ["Engineering"]`. |
| `location` | Geo-qualify. Only selling in North America? Filter location downstream in Clay or Airtable. |
How SDR Teams Deploy This
```
┌──────────────────┐     ┌──────────────────┐     ┌──────────────────────┐
│ Apify Scheduler  │────▶│   This Actor     │────▶│    Zapier / Make     │
│ (Daily 6 AM run) │     │ (Fresh signals)  │     │  (Webhook trigger)   │
└──────────────────┘     └──────────────────┘     └──────────┬───────────┘
                                                             │
                                                  ┌──────────▼───────────┐
                                                  │   Clay / Airtable    │
                                                  │  (Enrich + ICP tag)  │
                                                  └──────────┬───────────┘
                                                             │
                                                  ┌──────────▼───────────┐
                                                  │ Smartlead / Outreach │
                                                  │ "{first_name}, saw   │
                                                  │  {company} is hiring │
                                                  │  a {job_title}..."   │
                                                  └──────────────────────┘
```
- Schedule daily with `max_days_old: 1` to catch every new posting.
- Webhook to your enrichment layer: tag each company (ICP fit? Existing customer? Competitor?).
- Auto-generate cold emails: pipe `job_title` + `job_summary_clean` + `intent_category` into your LLM prompt.
- Prioritize by `days_active` ascending: freshest signals get first contact.
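The cold-email step above can be as simple as a template function that interpolates the three fields into a prompt string. The wording and function name here are only an example, not part of the Actor:

```typescript
// Hypothetical prompt builder for the LLM step of the pipeline.
function buildColdEmailPrompt(
  job: { job_title: string; job_summary_clean: string; intent_category: string },
  company: string
): string {
  return [
    `Write a 3-sentence cold email to a leader at ${company}.`,
    `They are hiring a ${job.job_title} (${job.intent_category} signal).`,
    `Reference this job opening: ${job.job_summary_clean}`,
  ].join("\n");
}
```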
Local Development
```bash
git clone <repo-url> && cd greenhouse-intent-signals-scraper
npm install
npm start      # Run with tsx (no build required)
npm run build  # Compile TypeScript to dist/
```
When running locally without Apify input, the Actor auto-falls back to test boards (datadog, mailchimp) with max_days_old: 14 so you can verify end-to-end immediately.
Cost Efficiency
This Actor calls Greenhouse's public JSON API — one lightweight HTTP GET per company board. No headless browser. No Playwright. No expensive proxy rotation. Scraping 50 boards costs a fraction of what a single paginated website scrape would consume in Apify compute units.
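For reference, Greenhouse publishes this data through its public Job Board API, so one board really is one GET. A sketch of that call (endpoint shape per Greenhouse's public docs; the error handling here is an assumption, not the Actor's code):

```typescript
// One lightweight GET per board against Greenhouse's public Job Board API.
const JOBS_API = "https://boards-api.greenhouse.io/v1/boards";

function boardJobsUrl(boardToken: string, withContent = true): string {
  return `${JOBS_API}/${encodeURIComponent(boardToken)}/jobs${withContent ? "?content=true" : ""}`;
}

async function fetchBoardJobs(boardToken: string): Promise<unknown[]> {
  const res = await fetch(boardJobsUrl(boardToken)); // no auth token required
  if (!res.ok) throw new Error(`Greenhouse API ${res.status} for ${boardToken}`);
  const body = (await res.json()) as { jobs: unknown[] };
  return body.jobs;
}
```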
Built for SDRs. Optimized for LLMs. No bloat. Just signal.