ATS Tech Stack Detector

Pricing

Pay per event

Extract company tech stacks from Greenhouse, Lever, and Ashby job posts via their public APIs — detects 100+ frameworks, databases, and cloud platforms from live job descriptions — export to JSON or CSV. B2B sales intel without the SaaS bill.

Developer

DevilScrapes

Maintained by Community

4 days ago

Last modified

ATS Tech Stack Detector — Greenhouse, Lever & Ashby

$5.05 / 1 000 rows  ·  pay only for results  ·  no credit card to try

We do the dirty work so your dataset stays clean. 😈

B2B tech-stack intel from public job posts — no SaaS subscription. Point this Actor at any company's Greenhouse, Lever, or Ashby board and get one structured row per active job, including a deduplicated list of canonical tech names (Postgres, Django, AWS, Kubernetes…) detected directly from the job description.

Sales reps qualifying accounts, recruiters sourcing by skill, and competitive-intelligence analysts mapping vendor footprint all need this signal. Commercial tools (TheirStack, Wappalyzer, BuiltWith) charge subscription fees for tech-stack data, and the site-sniffing ones see front-end JS, not the back-end stack. This ATS tech stack detector reads the job descriptions where engineering teams declare their actual infrastructure.

🎯 What this scrapes

Three ATS platforms, one unified schema:

  1. Greenhouse: boards-api.greenhouse.io/v1/boards/{token}/jobs?content=true
  2. Lever: api.lever.co/v0/postings/{token}
  3. Ashby: api.ashbyhq.com/posting-api/job-board/{token} (the token is case-sensitive: Ramp, not ramp)
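
As a rough illustration of how the three endpoints above map onto one lookup, here is a minimal Python sketch. The URL templates are copied from the list; the `https://` scheme, the `ENDPOINTS` table, and the `build_endpoint` helper are assumptions made for the example, not the Actor's actual code.

```python
from urllib.parse import quote

# URL templates taken from the list above; the https:// scheme is assumed.
ENDPOINTS = {
    "greenhouse": "https://boards-api.greenhouse.io/v1/boards/{token}/jobs?content=true",
    "lever": "https://api.lever.co/v0/postings/{token}",
    "ashby": "https://api.ashbyhq.com/posting-api/job-board/{token}",
}


def build_endpoint(ats_type: str, company_token: str) -> str:
    """Return the public job-board API URL for one company (hypothetical helper)."""
    try:
        template = ENDPOINTS[ats_type]
    except KeyError:
        raise ValueError(f"unsupported atsType: {ats_type!r}") from None
    # The token's case is left untouched: Ashby treats "Ramp" and "ramp" differently.
    return template.format(token=quote(company_token, safe=""))


print(build_endpoint("ashby", "Ramp"))
```

Keeping the templates in one table means a fourth ATS would only need a new entry plus its own response parser.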

For each active job posting the Actor pulls the title, location, department, full description, and public URL — then runs a case-insensitive word-boundary regex against a curated vocabulary of ~110 canonical tech names. The result is a single detected_techs column you can pivot or filter without any per-company string normalisation.
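
The detection step described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the Actor's implementation: the eight-entry `VOCAB` table stands in for the real ~110-name vocabulary, the alias mapping (for example `postgresql` to `Postgres`) is guessed from the sample output, and lookarounds are used in place of `\b` so that names containing punctuation could be added later.

```python
import re

# Tiny illustrative vocabulary: lowercase alias -> canonical name (assumed entries).
VOCAB = {
    "python": "Python",
    "django": "Django",
    "postgres": "Postgres",
    "postgresql": "Postgres",
    "aws": "AWS",
    "kubernetes": "Kubernetes",
    "k8s": "Kubernetes",
    "redis": "Redis",
}

# Longest aliases first so "postgresql" is tried before "postgres".
_alternation = "|".join(re.escape(a) for a in sorted(VOCAB, key=len, reverse=True))

# The lookarounds act as word boundaries: a match may not touch a letter,
# digit, or underscore on either side.
TECH_RE = re.compile(
    rf"(?<![A-Za-z0-9_])(?:{_alternation})(?![A-Za-z0-9_])",
    re.IGNORECASE,
)


def detect_techs(text: str) -> list[str]:
    """Return sorted, deduplicated canonical tech names found in text."""
    return sorted({VOCAB[m.group(0).lower()] for m in TECH_RE.finditer(text)})


print(detect_techs("We use Python, Django, PostgreSQL, and AWS..."))
```

Because the boundaries reject partial words, "pythonic" and "draws" produce no hits. Short names that are also ordinary English words (Go, for instance) would need extra context rules that this sketch does not attempt.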

🔥 Features

  • Three ATSs, one schema — Greenhouse, Lever, and Ashby normalised to an identical row shape.
  • Curated tech vocabulary — ~110 canonical names spanning languages (Python, Go, Rust), frameworks (Django, React, FastAPI), databases (Postgres, MongoDB, Redis, Snowflake), cloud (AWS, GCP, Azure, Vercel), infra (Kubernetes, Terraform, Docker), CI/CD, observability, and ML/data tools.
  • Per-company fault isolation — one bad token (404 from Lever, typo on Ashby) does not abort the run; the other companies still produce data.
  • Pydantic v2 validation — every input and every output row is model-validated before landing in the dataset.
  • Filter knobs: the maxJobsPerCompany cap and minTechsDetected floor let you drop generic non-engineering postings.
  • Exponential backoff — 408 / 429 / 503 retries with Retry-After honoured; up to 5 attempts per request.
  • Deterministic detection — regex + vocabulary, not LLM. High precision, zero hallucinated tools.
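
The retry behaviour in the exponential-backoff bullet above could look roughly like the following sketch. The status codes and the five-attempt limit come from the bullet; the 1-second base, the 30-second cap, the `(status, retry_after, body)` response tuple, and the injected `send` / `sleep` callables are assumptions chosen to keep the example self-contained. Jitter, the HTTP-date form of Retry-After, and proxy-session rotation are left out.

```python
import time
from typing import Callable, Optional, Tuple

RETRYABLE = {408, 429, 503}  # statuses named in the feature list
MAX_ATTEMPTS = 5

# A response is modelled as (status_code, retry_after_seconds_or_None, body).
Response = Tuple[int, Optional[float], str]


def backoff_delay(attempt: int, retry_after: Optional[float],
                  base: float = 1.0, cap: float = 30.0) -> float:
    """Seconds to wait after a failed attempt (attempt is 1-based)."""
    if retry_after is not None:
        return retry_after  # honour the server's Retry-After hint
    return min(cap, base * 2 ** (attempt - 1))  # 1, 2, 4, 8, ... capped


def fetch_with_retry(send: Callable[[], Response],
                     sleep: Callable[[float], None] = time.sleep) -> Response:
    """Call send() until it returns a non-retryable status or attempts run out."""
    attempt = 1
    while True:
        response = send()
        status, retry_after, _body = response
        if status not in RETRYABLE or attempt == MAX_ATTEMPTS:
            return response
        sleep(backoff_delay(attempt, retry_after))
        attempt += 1
```

Passing `sleep` as a parameter lets the waiting logic be exercised without real delays.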

💡 Use cases

  • B2B sales qualification — enrich every Salesforce or HubSpot account with the company's live back-end stack: "they hire Django + Postgres + AWS" tells you whether to pitch your Postgres-tuning SaaS.
  • Recruiter sourcing — pull every senior backend role from your target accounts and filter by detected_techs to find teams that match your candidate's stack.
  • Competitive intelligence — track which competitors are hiring for Kubernetes or Snowflake and infer their roadmap before it's announced.
  • CRM enrichment — replace TheirStack / BuiltWith / Wappalyzer subscriptions for the segment of buyers who publish their stack in job descriptions anyway.
  • Investment research — map private-company tech footprint over time without a Crunchbase Pro seat.
  • Open-source stack maps — feed a public page that ranks the most-hired-for technologies among YC-backed startups.

⚙️ How to use it

  1. Open the Actor input form on the Apify Store page.
  2. Add one or more {companyToken, atsType} pairs under Companies to scrape (the form prefills with airtable on Greenhouse and Ramp on Ashby).
  3. (Optional) Set Max jobs per company — leave empty for no cap.
  4. (Optional) Set Minimum detected techs: a value of 2 or 3 drops generic non-engineering postings.
  5. Configure Apify Proxy — leave the default residential proxy group on; we rotate sessions automatically if the ATS endpoint rate-limits.
  6. Click Start. Rows stream into the default dataset as they are produced.

Quick examples

Two engineering-heavy companies, default settings:

{
  "companies": [
    { "companyToken": "airtable", "atsType": "greenhouse" },
    { "companyToken": "Ramp", "atsType": "ashby" }
  ]
}

Single Lever company, drop sales/marketing roles:

{
  "companies": [
    { "companyToken": "palantir", "atsType": "lever" }
  ],
  "minTechsDetected": 3
}

Three companies across all three ATSs, capped at 50 jobs each:

{
  "companies": [
    { "companyToken": "airtable", "atsType": "greenhouse" },
    { "companyToken": "palantir", "atsType": "lever" },
    { "companyToken": "Ramp", "atsType": "ashby" }
  ],
  "maxJobsPerCompany": 50,
  "minTechsDetected": 2
}

📥 Input

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| companies | array | Yes | | List of {companyToken, atsType} objects. atsType must be greenhouse, lever, or ashby. |
| maxJobsPerCompany | integer | No | unlimited | Hard cap on jobs fetched per company. |
| minTechsDetected | integer | No | 0 | Skip rows where detected_techs has fewer items than this value. |
| proxyConfiguration | object | No | Apify Proxy | Proxy settings. Residential group recommended for rate-limited targets. |
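
To make the input contract above concrete, here is a sketch of the same checks in plain Python. The Actor itself is described as using Pydantic v2; this example swaps in standard-library dataclasses so it runs without dependencies. The `parse_input` helper, the class names, and the rules that `companies` must be non-empty and `maxJobsPerCompany` at least 1 are assumptions.

```python
from dataclasses import dataclass
from typing import Any, Optional

ATS_TYPES = {"greenhouse", "lever", "ashby"}


@dataclass(frozen=True)
class Company:
    company_token: str
    ats_type: str


@dataclass(frozen=True)
class ActorInput:
    companies: list[Company]
    max_jobs_per_company: Optional[int] = None  # None means unlimited
    min_techs_detected: int = 0


def parse_input(raw: dict[str, Any]) -> ActorInput:
    """Validate a raw input dict shaped like the table above (hypothetical helper)."""
    entries = raw.get("companies")
    if not isinstance(entries, list) or not entries:
        raise ValueError("companies must be a non-empty array")
    companies = []
    for entry in entries:
        if not isinstance(entry, dict):
            raise ValueError("each company must be an object")
        token, ats = entry.get("companyToken"), entry.get("atsType")
        if not isinstance(token, str) or not token.strip():
            raise ValueError("companyToken must be a non-empty string")
        if ats not in ATS_TYPES:
            raise ValueError(f"atsType must be one of {sorted(ATS_TYPES)}, got {ats!r}")
        companies.append(Company(token, ats))  # token case is preserved
    max_jobs = raw.get("maxJobsPerCompany")
    if max_jobs is not None and (not isinstance(max_jobs, int) or max_jobs < 1):
        raise ValueError("maxJobsPerCompany must be a positive integer")
    min_techs = raw.get("minTechsDetected", 0)
    if not isinstance(min_techs, int) or min_techs < 0:
        raise ValueError("minTechsDetected must be a non-negative integer")
    return ActorInput(companies, max_jobs, min_techs)
```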

How to find a company's ATS token:

  • Greenhouse: the path segment in boards.greenhouse.io/{token}. Examples: airtable, figma, stripe.
  • Lever: the path segment in jobs.lever.co/{token}. Examples: palantir, netflix, eventbrite.
  • Ashby: the path segment in jobs.ashbyhq.com/{Token}. Case-sensitive — Ramp, PostHog, Linear work; lowercase variants return zero jobs.
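
If you already have a list of board URLs, the token rules above can be applied mechanically. The sketch below is a hypothetical convenience helper (not part of the Actor) that turns a board URL into a {companyToken, atsType} pair; it covers only the three hostnames listed above and takes the first path segment as the token.

```python
from urllib.parse import unquote, urlparse

# Board hostnames from the list above mapped to the Actor's atsType values.
HOSTS = {
    "boards.greenhouse.io": "greenhouse",
    "jobs.lever.co": "lever",
    "jobs.ashbyhq.com": "ashby",
}


def company_from_board_url(url: str) -> dict[str, str]:
    """Derive an input entry from a public job-board URL (hypothetical helper)."""
    parsed = urlparse(url if "://" in url else f"https://{url}")
    ats_type = HOSTS.get(parsed.hostname or "")
    if ats_type is None:
        raise ValueError(f"unrecognised job-board host in {url!r}")
    segments = [s for s in parsed.path.split("/") if s]
    if not segments:
        raise ValueError(f"no token found in {url!r}")
    # The path keeps its original case, which matters for Ashby tokens.
    return {"companyToken": unquote(segments[0]), "atsType": ats_type}


print(company_from_board_url("https://jobs.ashbyhq.com/Ramp"))
```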

📤 Output

One row per active job posting. Rows are pushed to the default Apify dataset as they are produced — long runs surface results immediately.

| Field | Type | Description |
| --- | --- | --- |
| ats | string | greenhouse, lever, or ashby |
| company_token | string | Board slug passed to the ATS |
| job_id | string | ATS-canonical job identifier |
| title | string | Job posting title |
| location | string or null | Location string (Remote, NYC, Berlin…) |
| department | string or null | Department or team |
| url | string | Public job-post URL |
| description_text | string | Plain-text description (HTML stripped) |
| detected_techs | string[] | Sorted, deduplicated canonical tech names |
| posted_at | string or null | ISO 8601 UTC publication timestamp |
| scraped_at | string | ISO 8601 UTC row-creation timestamp |

Sample output row:

{
  "ats": "greenhouse",
  "company_token": "airtable",
  "job_id": "4812345",
  "title": "Senior Backend Engineer",
  "location": "Remote",
  "department": "Engineering",
  "url": "https://boards.greenhouse.io/airtable/jobs/4812345",
  "description_text": "We use Python, Django, PostgreSQL, and AWS...",
  "detected_techs": ["AWS", "Django", "Postgres", "Python"],
  "posted_at": "2026-05-01T09:00:00Z",
  "scraped_at": "2026-06-01T12:00:00Z"
}
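
Once rows like the sample above are exported, pivoting on detected_techs takes only a few lines. The sketch below assumes the rows have been loaded as a list of dicts with the field names from the output table; `tech_counts_by_company` is a hypothetical helper for illustration.

```python
from collections import Counter, defaultdict
from typing import Any, Iterable


def tech_counts_by_company(rows: Iterable[dict[str, Any]],
                           min_techs: int = 0) -> dict[str, dict[str, int]]:
    """Count how many postings per company mention each detected tech."""
    counts: dict[str, Counter] = defaultdict(Counter)
    for row in rows:
        techs = row.get("detected_techs", [])
        if len(techs) < min_techs:
            continue  # same idea as the minTechsDetected input, applied after export
        counts[row["company_token"]].update(techs)
    return {company: dict(counter) for company, counter in counts.items()}


rows = [
    {"company_token": "airtable", "detected_techs": ["AWS", "Django", "Postgres", "Python"]},
    {"company_token": "airtable", "detected_techs": ["AWS", "Python"]},
    {"company_token": "Ramp", "detected_techs": []},
]
print(tech_counts_by_company(rows, min_techs=1))
```

Because detected_techs is already deduplicated per row, each count is the number of postings that mention the tech, not the number of mentions.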

💰 Pricing

| Event | Price | When it fires |
| --- | --- | --- |
| actor-start | $0.05 | Once per run (covers warm-up) |
| result-row | $0.005 | Per job row emitted |

A typical 50-job run costs $0.05 + 50 × $0.005 = $0.30 on pay-per-result pricing — no monthly subscription, and the back-end stack is read straight from each job description.

You pay only for rows that land. If every token is misspelled and no jobs are returned, you pay only the $0.05 start fee.
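
The arithmetic above generalises to any run size. The sketch below uses the two event prices from the pricing table; the `estimate_cost` helper is hypothetical, and `Decimal` is used only to avoid floating-point rounding in the cents.

```python
from decimal import Decimal

ACTOR_START = Decimal("0.05")   # charged once per run
RESULT_ROW = Decimal("0.005")   # charged per job row emitted


def estimate_cost(rows: int) -> Decimal:
    """Estimated USD cost of a single run that emits `rows` job rows."""
    if rows < 0:
        raise ValueError("rows must be non-negative")
    return ACTOR_START + RESULT_ROW * rows


for n in (0, 50, 1000):
    print(n, estimate_cost(n))
```

For 50 rows this gives $0.30, and for 1,000 rows $5.05, matching the figures quoted in this README.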

🚧 Limitations

  • Tech detection is regex-based, not LLM-based — it is deterministic and high-precision, but will miss tools not in the curated ~110-name vocabulary. Submit a feedback request to add more terms.
  • ~85% recall — job descriptions that omit or abbreviate tool names will not surface every dependency in a company's stack. This is an inherent limit of job-description parsing, not an Actor bug.
  • Ashby tokens are case-sensitive: Ramp works; ramp returns zero jobs. Double-check the exact token from the company's Ashby board URL.
  • Greenhouse double-encoding — Greenhouse wraps its content field in double HTML-encoding (&lt;div&gt; for <div>). The parser unescapes twice before stripping tags; unusual encoding in edge-case boards may still slip through.
  • Lever descriptionPlain gaps — Lever sometimes omits the "Requirements" bullet list from descriptionPlain. The parser concatenates every lists[].content chunk to recover skills listed there; however, lists in non-standard formats may be missed.
  • Default dataset retention — Apify FREE-tier default storage retains datasets for 7 days. Use Actor.open_dataset(name="…") or export immediately if you need to outlive that window.
  • No personal-data extraction — this Actor reads public job posts and company-aggregated tech signals only. It does not extract candidate data or applicant details from any ATS.
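
The two parsing caveats above (Greenhouse double-encoding and Lever's lists) can be illustrated with a short sketch. It is a simplified approximation, not the Actor's parser: tags are stripped with a regex rather than an HTML parser, and the Lever field names `descriptionPlain`, `lists`, `text`, and `content` are taken from the bullet above and assumed to have this shape.

```python
import html
import re

_TAG_RE = re.compile(r"<[^>]+>")
_WS_RE = re.compile(r"\s+")


def _strip_tags(fragment: str) -> str:
    """Replace tags with spaces and collapse runs of whitespace."""
    return _WS_RE.sub(" ", _TAG_RE.sub(" ", fragment)).strip()


def greenhouse_text(content: str) -> str:
    """Unescape twice, then strip tags, as the Greenhouse bullet describes."""
    return _strip_tags(html.unescape(html.unescape(content)))


def lever_text(posting: dict) -> str:
    """Join descriptionPlain with every lists[] heading and content chunk."""
    parts = [posting.get("descriptionPlain", "")]
    for block in posting.get("lists", []):
        parts.append(block.get("text", ""))
        parts.append(html.unescape(block.get("content", "")))
    return " ".join(p for p in (_strip_tags(part) for part in parts) if p)


print(greenhouse_text("&lt;div&gt;We use Python &amp;amp; Django&lt;/div&gt;"))
```

One trade-off of the second unescape pass: text that legitimately contains escaped angle brackets can be mistaken for a tag and dropped, which is one way edge-case boards slip through.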

❓ FAQ

Q: What makes this different from BuiltWith or Wappalyzer?
A: BuiltWith and Wappalyzer sniff front-end signals: JavaScript libraries, tracking pixels, and Cloudflare headers. They are excellent for marketing-stack detection (HubSpot, Marketo, Segment) but blind to back-end infrastructure (Postgres, Kafka, Kubernetes, Snowflake). This ATS tech stack detector reads job descriptions, which is where engineering teams declare their actual data platform and server-side stack. The two approaches complement each other; they don't overlap.

Q: Is scraping these ATS job boards allowed?
A: Greenhouse, Lever, and Ashby all publish their job-board APIs as official public endpoints, the same ones embedded in company career pages. We read public job post text and return company-level tech signals. No personal data, no applicant details, no authenticated endpoints.

Q: How do I find the token for a company on Greenhouse / Lever / Ashby?
A: Visit the company's public jobs page. The token is the path segment: jobs.lever.co/{token}, boards.greenhouse.io/{token}, or jobs.ashbyhq.com/{Token}. Ashby tokens are case-sensitive, so copy them exactly as they appear in the URL.

Q: Can I scrape hundreds of companies in one run?
A: Yes. Add as many {companyToken, atsType} pairs as you need. One bad token (404 or empty board) does not abort the run; the others still produce data. Use maxJobsPerCompany to cap the volume and keep costs predictable.

Q: What happens when a company rate-limits my requests?
A: We handle it. The Actor retries with exponential backoff, rotates proxy sessions on 429s and 503s, and honours Retry-After headers. If a company hits a hard block mid-run, the run's status message (set via set_status_message) reports the partial count. We never silently return an empty dataset.

Q: Why does the Actor fail loud instead of returning an empty dataset?
A: A silent empty result is a lie. If every token is wrong, the run exits non-zero with a clear error message, so your pipeline knows to investigate rather than assuming there were no open roles.
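
The two behaviours described in the FAQ, per-company fault isolation and failing loud when nothing succeeds, can be combined in one loop. The sketch below is an assumption about how such a loop might look: `scrape_all` and the injected `fetch_jobs` callable are hypothetical names, and a raised `RuntimeError` stands in for the non-zero exit.

```python
from typing import Any, Callable


def scrape_all(
    companies: list[dict[str, str]],
    fetch_jobs: Callable[[dict[str, str]], list[dict[str, Any]]],
) -> tuple[list[dict[str, Any]], list[tuple[str, str]]]:
    """Fetch every company; isolate failures; raise if all of them failed."""
    rows: list[dict[str, Any]] = []
    errors: list[tuple[str, str]] = []
    for company in companies:
        label = f"{company['atsType']}:{company['companyToken']}"
        try:
            rows.extend(fetch_jobs(company))
        except Exception as exc:  # one bad token must not abort the run
            errors.append((label, str(exc)))
    if companies and len(errors) == len(companies):
        # Fail loud: an empty dataset here would hide the real problem.
        raise RuntimeError(f"all {len(errors)} companies failed: {errors}")
    return rows, errors
```

Returning the error list alongside the rows lets the caller surface a partial count rather than hiding which companies were skipped.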

🗣 Your feedback

If a tech name you care about is missing from the vocabulary, open a feedback ticket on the Actor's Apify Store page — we add new terms in batches.

Found a bug or have a feature request? Use the Store feedback form or contact DevilScrapes at https://apify.com/DevilScrapes.


Related Actors:

  • LLM Pricing Monitor — track live API pricing across OpenAI, Anthropic, Google, Mistral, Groq, Together AI, and DeepSeek in one normalised schema.
  • GitHub Org Scraper — pair tech-stack signals from job descriptions with actual code-repo activity per org.
  • Y Combinator Companies Scraper — combine YC funding signal with ATS-derived tech stack for a complete startup profile.