ATS Tech Stack Detector
Pricing
Pay per event
Extract company tech stacks from Greenhouse, Lever, and Ashby job posts via their public APIs — detects 100+ frameworks, databases, and cloud platforms from live job descriptions — export to JSON or CSV. B2B sales intel without the SaaS bill.
Developer
DevilScrapes
Maintained by Community
4 days ago
Last modified
ATS Tech Stack Detector — Greenhouse, Lever & Ashby
$5.05 / 1 000 rows · pay only for results · no credit card to try
We do the dirty work so your dataset stays clean. 😈
B2B tech-stack intel from public job posts — no SaaS subscription. Point this Actor at any company's Greenhouse, Lever, or Ashby board and get one structured row per active job, including a deduplicated list of canonical tech names (Postgres, Django, AWS, Kubernetes…) detected directly from the job description.
Sales reps qualifying accounts, recruiters sourcing by skill, and competitive-intelligence analysts mapping vendor footprint all need this signal. Commercial tools (TheirStack, Wappalyzer, BuiltWith) charge subscription fees for the same data — and they sniff front-end JS, not the back-end stack. This ATS tech stack detector reads the job descriptions where engineering teams declare their actual infrastructure.
🎯 What this scrapes
Three ATS platforms, one unified schema:
- Greenhouse: `boards-api.greenhouse.io/v1/boards/{token}/jobs?content=true`
- Lever: `api.lever.co/v0/postings/{token}`
- Ashby: `api.ashbyhq.com/posting-api/job-board/{token}` (token is case-sensitive: `Ramp`, not `ramp`)
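As a rough illustration of how the three endpoints can sit behind one lookup, the sketch below builds the request URL for a `{companyToken, atsType}` pair. The `https://` scheme and the helper name `build_endpoint` are assumptions made for this example; the Actor's own code is not shown here and may differ.

```python
# URL templates copied from the list above; the https:// scheme is an assumption.
ENDPOINTS = {
    "greenhouse": "https://boards-api.greenhouse.io/v1/boards/{token}/jobs?content=true",
    "lever": "https://api.lever.co/v0/postings/{token}",
    "ashby": "https://api.ashbyhq.com/posting-api/job-board/{token}",
}


def build_endpoint(company_token: str, ats_type: str) -> str:
    """Return the public job-board URL for one company (hypothetical helper)."""
    try:
        template = ENDPOINTS[ats_type]
    except KeyError:
        raise ValueError(f"unsupported atsType: {ats_type!r}") from None
    # The token is inserted verbatim because Ashby slugs are case-sensitive.
    return template.format(token=company_token)


print(build_endpoint("Ramp", "ashby"))
```

Keeping the token untouched (no lowercasing) matters for Ashby, where `Ramp` and `ramp` are different boards.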
For each active job posting the Actor pulls the title, location, department, full description, and public URL — then runs a case-insensitive word-boundary regex against a curated vocabulary of ~110 canonical tech names. The result is a single detected_techs column you can pivot or filter without any per-company string normalisation.
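A minimal sketch of this kind of detection, assuming a small alias-to-canonical mapping. The `ALIASES` table, the pattern, and the `detect_techs` name are illustrative only; the Actor's real vocabulary (~110 names) and regex are not published here.

```python
import re

# Illustrative alias -> canonical-name table (an assumption, not the real vocabulary).
ALIASES = {
    "postgres": "Postgres",
    "postgresql": "Postgres",
    "django": "Django",
    "aws": "AWS",
    "python": "Python",
    "kubernetes": "Kubernetes",
    "k8s": "Kubernetes",
}

# Longest aliases first so "postgresql" is tried before "postgres".
_alternatives = "|".join(re.escape(a) for a in sorted(ALIASES, key=len, reverse=True))
# Lookarounds act as word boundaries: no letter or digit directly before or after.
_PATTERN = re.compile(
    r"(?<![A-Za-z0-9])(" + _alternatives + r")(?![A-Za-z0-9])",
    re.IGNORECASE,
)


def detect_techs(text: str) -> list[str]:
    """Return sorted, deduplicated canonical names found in the text."""
    return sorted({ALIASES[m.group(1).lower()] for m in _PATTERN.finditer(text)})


print(detect_techs("We use Python, Django, PostgreSQL, and AWS..."))
```

Because matching is anchored on boundaries, a word such as "laws" does not trigger `AWS`, and mapping aliases to one canonical name is what keeps the output column free of per-company spelling variants.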
🔥 Features
- Three ATSs, one schema: Greenhouse, Lever, and Ashby normalised to an identical row shape.
- Curated tech vocabulary: ~110 canonical names spanning languages (Python, Go, Rust), frameworks (Django, React, FastAPI), databases (Postgres, MongoDB, Redis, Snowflake), cloud (AWS, GCP, Azure, Vercel), infra (Kubernetes, Terraform, Docker), CI/CD, observability, and ML/data tools.
- Per-company fault isolation: one bad token (404 from Lever, typo on Ashby) does not abort the run; the other companies still produce data.
- Pydantic v2 validation: every input and every output row is model-validated before landing in the dataset.
- Filter knobs: the `maxJobsPerCompany` cap and `minTechsDetected` floor let you drop generic non-engineering postings.
- Exponential backoff: 408 / 429 / 503 retries with `Retry-After` honoured; up to 5 attempts per request.
- Deterministic detection: regex + vocabulary, not LLM. High precision, zero hallucinated tools.
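The retry behaviour described in the list could look roughly like the sketch below. The status codes and the 5-attempt limit come from the feature list; the base delay, the cap, and the `retry_delay` helper are assumptions for illustration. Only the delta-seconds form of `Retry-After` is handled, and jitter is left out to keep the output deterministic.

```python
from typing import Optional

RETRYABLE = {408, 429, 503}  # status codes named in the feature list
MAX_ATTEMPTS = 5             # total tries per request


def retry_delay(
    attempt: int,
    status: int,
    retry_after: Optional[str] = None,
    base: float = 1.0,   # assumed starting delay in seconds
    cap: float = 30.0,   # assumed upper bound in seconds
) -> Optional[float]:
    """Seconds to wait before the next try, or None to stop retrying.

    `attempt` is the 1-based number of tries already made.
    """
    if status not in RETRYABLE or attempt >= MAX_ATTEMPTS:
        return None
    if retry_after is not None and retry_after.strip().isdigit():
        return float(retry_after.strip())  # server-provided delay wins
    return min(cap, base * 2 ** (attempt - 1))  # 1s, 2s, 4s, 8s


for attempt in range(1, 6):
    print(attempt, retry_delay(attempt, 429))
```

A production version would usually add random jitter to the computed delay so that parallel requests do not retry in lockstep.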
💡 Use cases
- B2B sales qualification: enrich every Salesforce or HubSpot account with the company's live back-end stack. "They hire Django + Postgres + AWS" tells you whether to pitch your Postgres-tuning SaaS.
- Recruiter sourcing: pull every senior backend role from your target accounts and filter by `detected_techs` to find teams that match your candidate's stack.
- Competitive intelligence: track which competitors are hiring for `Kubernetes` or `Snowflake` and infer their roadmap before it's announced.
- CRM enrichment: replace TheirStack / BuiltWith / Wappalyzer subscriptions for the segment of buyers who publish their stack in job descriptions anyway.
- Investment research: map private-company tech footprint over time without a Crunchbase Pro seat.
- Open-source stack maps: feed a public page that ranks the most-hired-for technologies among YC-backed startups.
⚙️ How to use it
- Open the Actor input form on the Apify Store page.
- Add one or more `{companyToken, atsType}` pairs under Companies to scrape (the form prefills with `airtable` on Greenhouse and `Ramp` on Ashby).
- (Optional) Set Max jobs per company; leave empty for no cap.
- (Optional) Set Minimum detected techs: `2` or `3` drops generic non-engineering postings.
- Configure Apify Proxy: leave the default residential proxy group on; we rotate sessions automatically if the ATS endpoint rate-limits.
- Click Start. Rows stream into the default dataset as they are produced.
Quick examples
Two engineering-heavy companies, default settings:
```json
{
  "companies": [
    { "companyToken": "airtable", "atsType": "greenhouse" },
    { "companyToken": "Ramp", "atsType": "ashby" }
  ]
}
```
Single Lever company, drop sales/marketing roles:
```json
{
  "companies": [
    { "companyToken": "palantir", "atsType": "lever" }
  ],
  "minTechsDetected": 3
}
```
Three companies across all three ATSs, capped at 50 jobs each:
```json
{
  "companies": [
    { "companyToken": "airtable", "atsType": "greenhouse" },
    { "companyToken": "palantir", "atsType": "lever" },
    { "companyToken": "Ramp", "atsType": "ashby" }
  ],
  "maxJobsPerCompany": 50,
  "minTechsDetected": 2
}
```
📥 Input
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| `companies` | array | Yes | n/a | List of `{companyToken, atsType}` objects. `atsType` must be `greenhouse`, `lever`, or `ashby`. |
| `maxJobsPerCompany` | integer | No | unlimited | Hard cap on jobs fetched per company. |
| `minTechsDetected` | integer | No | 0 | Skip rows where `detected_techs` has fewer items than this value. |
| `proxyConfiguration` | object | No | Apify Proxy | Proxy settings. Residential group recommended for rate-limited targets. |
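To make the table concrete, here is a dependency-free sketch of the checks it implies. The Actor itself validates with Pydantic v2; this version uses standard-library dataclasses so it runs anywhere. The names `ActorInput` and `parse_input`, and the rule that `companies` must be non-empty, are assumptions.

```python
from dataclasses import dataclass
from typing import Optional

ATS_TYPES = {"greenhouse", "lever", "ashby"}


@dataclass(frozen=True)
class ActorInput:
    companies: tuple                      # ((companyToken, atsType), ...)
    max_jobs_per_company: Optional[int] = None  # None means unlimited
    min_techs_detected: int = 0


def parse_input(raw: dict) -> ActorInput:
    """Validate a raw input dict against the parameter table (hypothetical helper)."""
    entries = raw.get("companies") or []
    if not entries:
        raise ValueError("companies is required and must not be empty")
    pairs = []
    for entry in entries:
        token, ats = entry.get("companyToken"), entry.get("atsType")
        if not token or ats not in ATS_TYPES:
            raise ValueError(f"invalid company entry: {entry!r}")
        pairs.append((token, ats))
    cap = raw.get("maxJobsPerCompany")
    if cap is not None and cap < 1:
        raise ValueError("maxJobsPerCompany must be a positive integer")
    floor = raw.get("minTechsDetected", 0)
    if floor < 0:
        raise ValueError("minTechsDetected must be zero or greater")
    return ActorInput(tuple(pairs), cap, floor)


print(parse_input({"companies": [{"companyToken": "Ramp", "atsType": "ashby"}]}))
```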
How to find a company's ATS token:
- Greenhouse: the path segment in `boards.greenhouse.io/{token}` or `{token}.applytojob.com`. Examples: `airtable`, `figma`, `stripe`.
- Lever: the path segment in `jobs.lever.co/{token}`. Examples: `palantir`, `netflix`, `eventbrite`.
- Ashby: the path segment in `jobs.ashbyhq.com/{Token}`. Case-sensitive: `Ramp`, `PostHog`, `Linear` work; lowercase variants return zero jobs.
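If you already have careers-page URLs, a small helper can derive the input pair for you. This is a sketch under assumptions: `token_from_url` is a hypothetical name, only the three path-based hosts listed above are recognised, and the `{token}.applytojob.com` subdomain form is not handled.

```python
from urllib.parse import unquote, urlparse

# Host -> atsType, taken from the URL patterns listed above.
HOSTS = {
    "boards.greenhouse.io": "greenhouse",
    "jobs.lever.co": "lever",
    "jobs.ashbyhq.com": "ashby",
}


def token_from_url(url: str) -> tuple[str, str]:
    """Return (companyToken, atsType) for a public job-board URL."""
    parsed = urlparse(url if "://" in url else "https://" + url)
    ats = HOSTS.get((parsed.hostname or "").lower())
    if ats is None:
        raise ValueError(f"unrecognised job-board host in {url!r}")
    segments = [s for s in parsed.path.split("/") if s]
    if not segments:
        raise ValueError(f"no token segment in {url!r}")
    # Case is preserved on purpose: Ashby tokens are case-sensitive.
    return unquote(segments[0]), ats


print(token_from_url("https://jobs.ashbyhq.com/Ramp"))
```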
📤 Output
One row per active job posting. Rows are pushed to the default Apify dataset as they are produced — long runs surface results immediately.
| Field | Type | Description |
|---|---|---|
| `ats` | string | `greenhouse`, `lever`, or `ashby` |
| `company_token` | string | Board slug passed to the ATS |
| `job_id` | string | ATS-canonical job identifier |
| `title` | string | Job posting title |
| `location` | string \| null | Location string (Remote, NYC, Berlin…) |
| `department` | string \| null | Department or team |
| `url` | string | Public job-post URL |
| `description_text` | string | Plain-text description (HTML stripped) |
| `detected_techs` | string[] | Sorted, deduplicated canonical tech names |
| `posted_at` | string \| null | ISO 8601 UTC publication timestamp |
| `scraped_at` | string | ISO 8601 UTC row-creation timestamp |
Sample output row:
```json
{
  "ats": "greenhouse",
  "company_token": "airtable",
  "job_id": "4812345",
  "title": "Senior Backend Engineer",
  "location": "Remote",
  "department": "Engineering",
  "url": "https://boards.greenhouse.io/airtable/jobs/4812345",
  "description_text": "We use Python, Django, PostgreSQL, and AWS...",
  "detected_techs": ["AWS", "Django", "Postgres", "Python"],
  "posted_at": "2026-05-01T09:00:00Z",
  "scraped_at": "2026-06-01T12:00:00Z"
}
```
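Once rows are exported, the `detected_techs` column can be pivoted into a per-company footprint. The sketch below assumes rows shaped like the sample above have already been loaded into a list of dicts; `tech_counts_by_company` is a hypothetical helper and the example rows are made up.

```python
from collections import Counter, defaultdict


def tech_counts_by_company(rows):
    """Count how many job rows mention each tech, grouped by company_token."""
    counts = defaultdict(Counter)
    for row in rows:
        counts[row["company_token"]].update(row.get("detected_techs", []))
    return counts


# Made-up rows in the output schema, for illustration only.
rows = [
    {"company_token": "airtable", "detected_techs": ["AWS", "Django", "Python"]},
    {"company_token": "airtable", "detected_techs": ["Postgres", "Python"]},
    {"company_token": "Ramp", "detected_techs": ["Kubernetes"]},
]

counts = tech_counts_by_company(rows)
print(counts["airtable"].most_common(1))
```

Counting job rows rather than raw mentions gives a rough "how many open roles touch this tech" figure, which is usually the more useful qualification signal.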
💰 Pricing
| Event | Price | When it fires |
|---|---|---|
| `actor-start` | $0.05 | Once per run (covers warm-up) |
| `result-row` | $0.005 | Per job row emitted |
A typical 50-job run costs $0.05 + 50 × $0.005 = $0.30 on pay-per-result pricing — no monthly subscription, and the back-end stack is read straight from each job description.
You pay only for rows that land. If every token is misspelled and no jobs are returned, you pay only the $0.05 start fee.
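The arithmetic generalises to any run size. The prices below are the two from the table; `estimate_cost` is a hypothetical helper, and `Decimal` is used so the cents add up exactly.

```python
from decimal import Decimal

ACTOR_START = Decimal("0.05")   # charged once per run
RESULT_ROW = Decimal("0.005")   # charged per job row emitted


def estimate_cost(rows: int, runs: int = 1) -> Decimal:
    """Estimated charge in USD for `rows` emitted rows spread over `runs` runs."""
    if rows < 0 or runs < 1:
        raise ValueError("rows must be >= 0 and runs must be >= 1")
    return ACTOR_START * runs + RESULT_ROW * rows


print(estimate_cost(50))     # the 50-job example above
print(estimate_cost(1000))   # one run that emits 1,000 rows
```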
🚧 Limitations
- Tech detection is regex-based, not LLM-based: it is deterministic and high-precision, but will miss tools not in the curated ~110-name vocabulary. Submit a feedback request to add more terms.
- ~85% recall: job descriptions that omit or abbreviate tool names will not surface every dependency in a company's stack. This is an inherent limit of job-description parsing, not an Actor bug.
- Ashby tokens are case-sensitive: `Ramp` works; `ramp` returns zero jobs. Double-check the exact token from the company's Ashby board URL.
- Greenhouse double-encoding: Greenhouse wraps its `content` field in double HTML-encoding (`&lt;div&gt;` for `<div>`). The parser unescapes twice before stripping tags; unusual encoding in edge-case boards may still slip through.
- Lever `descriptionPlain` gaps: Lever sometimes omits the "Requirements" bullet list from `descriptionPlain`. The parser concatenates every `lists[].content` chunk to recover skills listed there; however, lists in non-standard formats may be missed.
- Default dataset retention: Apify FREE-tier default storage retains datasets for 7 days. Use `Actor.open_dataset(name="…")` or export immediately if you need to outlive that window.
- No personal-data extraction: this Actor reads public job posts and company-aggregated tech signals only. It does not extract candidate data or applicant details from any ATS.
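The two parsing workarounds in this list can be sketched with the standard library alone. This is not the Actor's parser: `strip_html` and `lever_full_text` are hypothetical names, the tag regex is deliberately naive, and the sample payloads are made up.

```python
import html
import re

_TAG = re.compile(r"<[^>]+>")
_WHITESPACE = re.compile(r"\s+")


def strip_html(content: str) -> str:
    """Unescape twice, drop tags, collapse whitespace."""
    # Second pass is a no-op when the input was only encoded once.
    decoded = html.unescape(html.unescape(content))
    return _WHITESPACE.sub(" ", _TAG.sub(" ", decoded)).strip()


def lever_full_text(posting: dict) -> str:
    """Join descriptionPlain with every lists[].content chunk."""
    parts = [posting.get("descriptionPlain", "")]
    for item in posting.get("lists", []):
        parts.append(strip_html(item.get("content", "")))
    return "\n".join(p for p in parts if p)


print(strip_html("&amp;lt;p&amp;gt;Python &amp;amp; Go&amp;lt;/p&amp;gt;"))
```

One trade-off of unescaping twice: a description that intentionally shows an escaped tag as literal text would have that text treated as markup and removed.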
❓ FAQ
Q: What makes this different from BuiltWith or Wappalyzer?
A: BuiltWith and Wappalyzer sniff front-end signals: JavaScript libraries, tracking pixels, and Cloudflare headers. They are excellent for marketing-stack detection (HubSpot, Marketo, Segment) but blind to back-end infrastructure (Postgres, Kafka, Kubernetes, Snowflake). This ATS tech stack detector reads job descriptions, which is where engineering teams declare their actual data platform and server-side stack. They complement each other; they don't overlap.
Q: Is scraping these ATS job boards allowed?
A: Greenhouse, Lever, and Ashby all publish their job-board APIs as official public endpoints, the same ones embedded in company career pages. We read public job post text and return company-level tech signals. No personal data, no applicant details, no authenticated endpoints.
Q: How do I find the token for a company on Greenhouse / Lever / Ashby?
A: Visit the company's public jobs page. The token is the path segment: jobs.lever.co/{token}, boards.greenhouse.io/{token}, or jobs.ashbyhq.com/{Token}. Ashby tokens are case-sensitive — copy them exactly as they appear in the URL.
Q: Can I scrape hundreds of companies in one run?
A: Yes. Add as many {companyToken, atsType} pairs as you need. One bad token (404 or empty board) does not abort the run — the others still produce data. Use maxJobsPerCompany to cap the volume and keep costs predictable.
Q: What happens when a company rate-limits my requests?
A: We handle it. The Actor retries with exponential backoff, rotates proxy sessions on 429s and 503s, and honours Retry-After headers. If a company hard-blocks the Actor mid-run, the run's status message (set via set_status_message) reports the partial count; we never silently return an empty dataset.
Q: Why does the Actor fail loud instead of returning an empty dataset?
A: A silent empty result is a lie. If every token is wrong, the run exits non-zero with a clear error message, so your pipeline knows to investigate rather than assuming there were just no open roles.
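The fault-isolation and fail-loud behaviour described in the answers above can be pictured as a loop like the following. It is a sketch, not the Actor's source: `scrape_all` is a hypothetical name, and `fetch_jobs` stands in for whatever callable fetches one company's rows.

```python
def scrape_all(companies, fetch_jobs):
    """Fetch every company, isolating failures; raise only if all of them fail."""
    rows, failures = [], []
    for company in companies:
        try:
            rows.extend(fetch_jobs(company))
        except Exception as exc:  # one bad token must not abort the run
            failures.append((company["companyToken"], str(exc)))
    if companies and len(failures) == len(companies):
        # Fail loud: an empty dataset would look like "no open roles".
        raise RuntimeError(f"all {len(failures)} companies failed: {failures}")
    return rows, failures


def fake_fetch(company):
    """Stand-in fetcher for the demo: the token 'bad' simulates a 404."""
    if company["companyToken"] == "bad":
        raise LookupError("404")
    return [{"company_token": company["companyToken"], "title": "Engineer"}]


rows, failures = scrape_all(
    [{"companyToken": "bad", "atsType": "lever"},
     {"companyToken": "Ramp", "atsType": "ashby"}],
    fake_fetch,
)
print(len(rows), failures)
```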
🗣 Your feedback
If a tech name you care about is missing from the vocabulary, open a feedback ticket on the Actor's Apify Store page — we add new terms in batches.
Found a bug or have a feature request? Use the Store feedback form or contact DevilScrapes at https://apify.com/DevilScrapes.
Related Actors:
- LLM Pricing Monitor — track live API pricing across OpenAI, Anthropic, Google, Mistral, Groq, Together AI, and DeepSeek in one normalised schema.
- GitHub Org Scraper — pair tech-stack signals from job descriptions with actual code-repo activity per org.
- Y Combinator Companies Scraper — combine YC funding signal with ATS-derived tech stack for a complete startup profile.