ATS Tech Stack Scraper
Pricing
Pay per event
Extract company tech stacks from Greenhouse, Lever and Ashby job posts. Detects 100+ frameworks, databases and cloud platforms from live job descriptions. B2B sales intel without the $39-$399/mo SaaS bill.
Developer
DevilScrapes
Maintained by Community
💰 $5.05 / 1 000 rows · pay only for results · no credit card to try
We do the dirty work so your dataset stays clean. 😈
B2B tech-stack intel from public job posts — no $39-$399/mo SaaS bill. Point this Actor at any company's Greenhouse, Lever or Ashby board and get one structured row per active job, including a deduplicated list of canonical tech names (Postgres, Django, AWS, Kubernetes...) detected directly from the job description.
Sales reps qualifying accounts, recruiters sourcing by skill, and competitive-intelligence analysts mapping vendor footprint all need this signal. Commercial tools (TheirStack, Wappalyzer, BuiltWith) charge subscription fees for the same data. This Actor hits the public ATS APIs directly — no scraping, no browser, no auth — and emits a flat, Pydantic-validated dataset ready for any CRM enrichment pipeline.
🎯 What this scrapes
Three ATS platforms, one schema:
- Greenhouse: `boards-api.greenhouse.io/v1/boards/{token}/jobs?content=true` (public, no auth)
- Lever: `api.lever.co/v0/postings/{token}` (public, no auth)
- Ashby: `api.ashbyhq.com/posting-api/job-board/{token}` (public, no auth; token is case-sensitive: `Ramp`, not `ramp`)
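As a rough illustration of how the three endpoints above map onto a single lookup, here is a minimal sketch. The `https://` scheme, the `ATS_ENDPOINTS` table and the `build_jobs_url` helper are assumptions made for this example and are not taken from the Actor's source.

```python
# Endpoint templates copied from the list above; the https:// scheme is assumed.
ATS_ENDPOINTS = {
    "greenhouse": "https://boards-api.greenhouse.io/v1/boards/{token}/jobs?content=true",
    "lever": "https://api.lever.co/v0/postings/{token}",
    "ashby": "https://api.ashbyhq.com/posting-api/job-board/{token}",
}


def build_jobs_url(ats_type: str, company_token: str) -> str:
    """Return the public job-listing URL for one board (hypothetical helper)."""
    try:
        template = ATS_ENDPOINTS[ats_type]
    except KeyError:
        raise ValueError(f"unsupported ATS type: {ats_type!r}") from None
    # The token is inserted verbatim because Ashby tokens are case-sensitive.
    return template.format(token=company_token)


print(build_jobs_url("ashby", "Ramp"))
```

Keeping the token untouched (no lowercasing) matters for Ashby, where `Ramp` and `ramp` are different boards.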
For each active job posting the Actor pulls the title, location, department, full HTML/plain-text description, and the public URL — then runs a case-insensitive word-boundary regex against a curated vocabulary of ~110 canonical tech names. The result is a single column you can pivot or filter without per-company string normalization.
| Field | Type | Description |
|---|---|---|
| `ats` | string | `greenhouse`, `lever`, or `ashby` |
| `company_token` | string | Board slug passed to the ATS |
| `job_id` | string | ATS-canonical job id |
| `title` | string | Job posting title |
| `location` | string \| null | Location string (Remote, NYC, Berlin...) |
| `department` | string \| null | Department / team |
| `url` | string | Public job-post URL |
| `description_text` | string | Plain-text description (HTML stripped) |
| `detected_techs` | string[] | Sorted, deduplicated canonical tech names |
| `posted_at` | string \| null | ISO 8601 UTC publication timestamp |
| `scraped_at` | string | ISO 8601 UTC row-creation timestamp |
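The `detected_techs` column comes from a case-insensitive word-boundary regex over a curated vocabulary. The sketch below shows one way such a matcher could look. The seven-entry `VOCAB` mapping and the `detect_techs` function are illustrative assumptions, since the Actor's real vocabulary of ~110 names and its exact pattern are not shown in this README.

```python
import re

# Illustrative alias -> canonical-name map; the real vocabulary is an unknown here.
VOCAB = {
    "postgres": "Postgres",
    "postgresql": "Postgres",
    "django": "Django",
    "aws": "AWS",
    "kubernetes": "Kubernetes",
    "k8s": "Kubernetes",
    "go": "Go",
}

# (?<!\w) and (?!\w) act as word boundaries that also behave for names
# ending in punctuation; re.IGNORECASE makes matching case-insensitive.
_PATTERN = re.compile(
    r"(?<!\w)(" + "|".join(re.escape(alias) for alias in VOCAB) + r")(?!\w)",
    re.IGNORECASE,
)


def detect_techs(text: str) -> list[str]:
    """Return sorted, deduplicated canonical names found in the text."""
    return sorted({VOCAB[m.group(1).lower()] for m in _PATTERN.finditer(text)})


print(detect_techs("We run Django on AWS with PostgreSQL and k8s."))
```

Short names such as `Go` stay ambiguous under any regex approach ("go to market" would match), which is one reason a curated vocabulary has to be chosen carefully.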
🔥 Features
- Three ATSs, one schema — Greenhouse, Lever, Ashby normalized to identical row shape.
- Curated tech vocabulary — ~110 canonical names spanning languages (Python, Go, Rust), frameworks (Django, React, FastAPI), databases (Postgres, MongoDB, Redis, Snowflake), cloud (AWS, GCP, Azure, Vercel), infra (Kubernetes, Terraform, Docker), CI/CD, observability, and ML/data tools.
- Per-company isolation — one bad token (404 from Lever, typo on Ashby) does not abort the run; the others still produce data.
- No browser, no auth: pure HTTP with `curl-cffi` browser-TLS impersonation; low compute footprint and fast runs.
- Pydantic v2 validation: every input and every output row is model-validated.
- Filter knobs: `maxJobsPerCompany` cap and `minTechsDetected` floor.
- Exponential backoff on 429 / 503 with `Retry-After` honoured.
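To make the backoff behaviour in the last bullet concrete, here is a minimal sketch of a delay calculation that doubles per attempt and defers to a numeric `Retry-After` header when one is present. The function name, the 1-second base and the 60-second cap are assumptions; the Actor's real values are not documented here, and jitter is omitted for brevity.

```python
from typing import Optional

# Statuses the README says are retried.
RETRYABLE_STATUSES = {429, 503}


def retry_delay(attempt: int, retry_after: Optional[str] = None,
                base: float = 1.0, cap: float = 60.0) -> float:
    """Seconds to wait before retry number `attempt` (0-based)."""
    if retry_after is not None:
        try:
            # Honour the server's hint, clamped to the range [0, cap].
            return min(max(float(retry_after), 0.0), cap)
        except ValueError:
            pass  # The HTTP-date form of Retry-After is not handled in this sketch.
    # Exponential backoff: base, 2*base, 4*base, ... up to the cap.
    return min(base * (2 ** attempt), cap)


print([retry_delay(n) for n in range(4)])
```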
💡 Use cases
- B2B sales qualification — enrich every Salesforce/HubSpot account with the company's live tech stack: "they hire Django + Postgres + AWS" tells you whether to pitch your Postgres-tuning SaaS.
- Recruiter sourcing: pull every "Senior Backend Engineer" role from your target accounts and filter by `detected_techs` to find teams that match your candidate's stack.
- Competitive intelligence: track which competitors are hiring for `Kubernetes` or `Snowflake` and infer their roadmap.
- CRM enrichment: replace TheirStack / BuiltWith / Wappalyzer subscriptions for the subset of buyers who publish their stack in JDs anyway.
- Investment research — map private-company tech footprint over time without paying for a Crunchbase Pro seat.
- Open-source stack maps — feed a public site that ranks the most-hired-for technologies in YC-backed startups.
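Most of the use cases above reduce to grouping rows by company and merging their `detected_techs`. The following sketch shows that roll-up on a few made-up rows. The sample data and the `stack_by_company` helper are purely illustrative; only the field names come from the output schema.

```python
from collections import defaultdict


def stack_by_company(rows):
    """Merge per-job tech lists into one sorted stack per company token."""
    stacks = defaultdict(set)
    for row in rows:
        stacks[row["company_token"]].update(row["detected_techs"])
    return {company: sorted(techs) for company, techs in stacks.items()}


# Made-up rows shaped like the Actor's output (only two fields shown).
sample_rows = [
    {"company_token": "acme", "detected_techs": ["AWS", "Django"]},
    {"company_token": "acme", "detected_techs": ["Django", "Postgres"]},
    {"company_token": "globex", "detected_techs": ["Kubernetes"]},
]
print(stack_by_company(sample_rows))
```

The resulting dictionary can be written to a CRM custom field or compared across runs to see how a company's hiring stack changes.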
⚙️ How to use it
- Open the Actor input form.
- Add one or more `{companyToken, atsType}` pairs under Companies to scrape (the form prefills with `airtable` on Greenhouse and `Ramp` on Ashby).
- (Optional) Set Max jobs per company; leave empty for no cap.
- (Optional) Set Minimum detected techs: `2` or `3` drops generic/non-engineering postings.
- Leave Use Apify Proxy off unless you observe rate-limiting (public ATS APIs do not require it).
- Click Start. Rows stream into the default dataset.
Quick examples
Two engineering-heavy companies, default settings:

    {
      "companies": [
        { "companyToken": "airtable", "atsType": "greenhouse" },
        { "companyToken": "Ramp", "atsType": "ashby" }
      ]
    }

Single Lever company, drop sales/marketing roles:

    {
      "companies": [
        { "companyToken": "palantir", "atsType": "lever" }
      ],
      "minTechsDetected": 3
    }

Three companies across all three ATSs, capped at 50 jobs each:

    {
      "companies": [
        { "companyToken": "airtable", "atsType": "greenhouse" },
        { "companyToken": "palantir", "atsType": "lever" },
        { "companyToken": "Ramp", "atsType": "ashby" }
      ],
      "maxJobsPerCompany": 50,
      "minTechsDetected": 2
    }

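For readers wondering how the two filter knobs interact, the sketch below applies a `minTechsDetected` floor and then a `maxJobsPerCompany` cap to one company's rows. The order (floor first, cap second) is an assumption, as is the `apply_filters` helper; the README does not say which the Actor applies first.

```python
from typing import Optional


def apply_filters(rows, max_jobs_per_company: Optional[int] = None,
                  min_techs_detected: int = 0):
    """Filter one company's rows (assumed order: floor first, then cap)."""
    kept = [r for r in rows if len(r["detected_techs"]) >= min_techs_detected]
    # None mirrors leaving "Max jobs per company" empty: no cap.
    return kept if max_jobs_per_company is None else kept[:max_jobs_per_company]


# Made-up rows for illustration.
rows = [
    {"title": "Account Executive", "detected_techs": ["AWS"]},
    {"title": "Backend Engineer", "detected_techs": ["AWS", "Go"]},
    {"title": "Data Engineer", "detected_techs": ["AWS", "Go", "Rust"]},
]
print([r["title"] for r in apply_filters(rows, max_jobs_per_company=1, min_techs_detected=2)])
```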
Pricing (Pay-Per-Event)
| Event | Price | When |
|---|---|---|
| `actor-start` | $0.05 | Once per run |
| `result-row` | $0.005 | Per job row emitted |
A typical 50-job run costs $0.05 + 50 × $0.005 = $0.30. Compare to TheirStack at $39-$399/month for the same signal.
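The same arithmetic can be scripted to budget larger runs. This is a small sketch using the two event prices from the table above; the `estimate_cost` helper is hypothetical and uses `Decimal` to avoid floating-point rounding on currency.

```python
from decimal import Decimal

# Event prices taken from the pricing table.
ACTOR_START = Decimal("0.05")   # charged once per run
RESULT_ROW = Decimal("0.005")   # charged per job row emitted


def estimate_cost(rows: int, runs: int = 1) -> Decimal:
    """Estimated charge in USD for a given number of rows and runs."""
    return ACTOR_START * runs + RESULT_ROW * rows


print(estimate_cost(50))
print(estimate_cost(1000))
```

A single run emitting 1 000 rows works out to $5.05, which matches the headline price.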
How to find a company's ATS token
- Greenhouse: the path segment in `boards.greenhouse.io/{token}` or `{token}.applytojob.com`. Examples: `airtable`, `figma`, `stripe`.
- Lever: the path segment in `jobs.lever.co/{token}`. Examples: `palantir`, `netflix`, `eventbrite`.
- Ashby: the path segment in `jobs.ashbyhq.com/{Token}`. Case-sensitive: `Ramp`, `PostHog`, `Linear` work; the lowercase variants return zero jobs.
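If you are starting from a list of careers-page URLs, the path-segment rule above can be automated. The sketch below covers only the three path-based hosts listed. The `BOARD_HOSTS` table and the `token_from_board_url` helper are assumptions for illustration, and the returned keys mirror the Actor's input fields.

```python
from urllib.parse import urlparse

# Hosts from the list above whose first path segment is the token.
BOARD_HOSTS = {
    "boards.greenhouse.io": "greenhouse",
    "jobs.lever.co": "lever",
    "jobs.ashbyhq.com": "ashby",
}


def token_from_board_url(url: str) -> dict:
    """Turn a public board URL into a {companyToken, atsType} pair."""
    parsed = urlparse(url)
    ats_type = BOARD_HOSTS.get(parsed.hostname)  # hostname is already lowercased
    segments = [s for s in parsed.path.split("/") if s]
    if ats_type is None or not segments:
        raise ValueError(f"not a recognised board URL: {url!r}")
    # The path keeps its original case, which Ashby requires.
    return {"companyToken": segments[0], "atsType": ats_type}


print(token_from_board_url("https://jobs.ashbyhq.com/Ramp"))
```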
Output and limits
- Rows are pushed to the default Apify dataset as they are produced — long runs surface rows immediately.
- Apify FREE-tier default storage retains datasets for 7 days; use `Actor.open_dataset(name="...")` if you need the data to outlive that.
- The Actor exits successfully when at least one company yields one or more jobs. If every company returns zero (e.g. all tokens were misspelled), the run fails loudly with a non-zero exit code.
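The exit rule, combined with per-company isolation, can be pictured as a loop that swallows per-company failures and only fails when nothing at all was produced. This is a sketch under that reading; the `exit_code_for_run` function and the `fetch_jobs` callable are hypothetical stand-ins, not the Actor's actual code.

```python
import sys


def exit_code_for_run(companies, fetch_jobs) -> int:
    """Return 0 if any company produced at least one job, otherwise 1."""
    total_jobs = 0
    for company in companies:
        try:
            jobs = fetch_jobs(company)
        except Exception as exc:
            # One bad token must not abort the run: log it and continue.
            print(f"skipping {company['companyToken']}: {exc!r}", file=sys.stderr)
            continue
        total_jobs += len(jobs)
    return 0 if total_jobs > 0 else 1
```

Calling `sys.exit(exit_code_for_run(...))` at the end of a script would reproduce the "fail loud on an empty run" behaviour described above.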
Notes and caveats
- Tech detection is regex-based, not LLM-based — it is deterministic, free, and high-precision but will miss tools whose names are not in the curated vocabulary. Open a feedback request to add more.
- Greenhouse `content` is double-HTML-encoded (e.g. `&lt;div&gt;` for `<div>`); the parser unescapes twice before stripping tags.
- Lever's `descriptionPlain` often omits the "Requirements" bullet list; the parser concatenates every `lists[].content` chunk so skills hidden there are still detected.
- Ashby exposes both `descriptionHtml` and `descriptionPlain`; the parser prefers `descriptionPlain` when present.
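The Greenhouse and Lever caveats above can be sketched with the standard library alone. The field names (`content`, `descriptionPlain`, `lists[].content`) come from the notes; the helper names, the regex-based tag stripping and the sample payloads are assumptions, and a production parser would likely use a real HTML parser instead.

```python
import html
import re

_TAG = re.compile(r"<[^>]+>")


def strip_html(markup: str, unescape_passes: int = 1) -> str:
    """Unescape entities the given number of times, drop tags, collapse whitespace."""
    for _ in range(unescape_passes):
        markup = html.unescape(markup)
    return " ".join(_TAG.sub(" ", markup).split())


def greenhouse_text(content: str) -> str:
    # Two unescape passes, following the double-encoding note above.
    return strip_html(content, unescape_passes=2)


def lever_text(posting: dict) -> str:
    # Append every lists[].content chunk so requirement bullets are included.
    chunks = [posting.get("descriptionPlain") or ""]
    chunks += [strip_html(item.get("content") or "") for item in posting.get("lists") or []]
    return " ".join(chunk for chunk in chunks if chunk)


print(greenhouse_text("&lt;div&gt;Postgres &amp;amp; Redis&lt;/div&gt;"))
```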
Related Actors
- LLM Pricing Monitor — track live API pricing across OpenAI, Anthropic, Google, Mistral, Groq, Together AI, and DeepSeek in one normalized schema.
- GitHub Org Scraper — pair tech-stack signals from JDs with actual code-repo activity per org.
- Y Combinator Companies Scraper — combine YC funding signal with ATS-derived tech stack for a complete startup profile.
Support
- Author: DevilScrapes
- Issues / feature requests: open a feedback ticket on the Actor's Apify Store page.