ATS Tech Stack Scraper

Extract company tech stacks from Greenhouse, Lever and Ashby job posts. Detects 100+ frameworks, databases and cloud platforms from live job descriptions. B2B sales intel without the $39-$399/mo SaaS bill.

Developer

DevilScrapes

Maintained by Community

💰 $5.05 / 1 000 rows  ·  pay only for results  ·  no credit card to try

We do the dirty work so your dataset stays clean. 😈

B2B tech-stack intel from public job posts — no $39-$399/mo SaaS bill. Point this Actor at any company's Greenhouse, Lever or Ashby board and get one structured row per active job, including a deduplicated list of canonical tech names (Postgres, Django, AWS, Kubernetes...) detected directly from the job description.

Sales reps qualifying accounts, recruiters sourcing by skill, and competitive-intelligence analysts mapping vendor footprint all need this signal. Commercial tools (TheirStack, Wappalyzer, BuiltWith) charge subscription fees for comparable data. This Actor calls the public ATS APIs directly (no HTML scraping, no browser, no auth) and emits a flat, Pydantic-validated dataset ready for any CRM enrichment pipeline.

🎯 What this scrapes

Three ATS platforms, one schema:

  1. Greenhouse: boards-api.greenhouse.io/v1/boards/{token}/jobs?content=true (public, no auth)
  2. Lever: api.lever.co/v0/postings/{token} (public, no auth)
  3. Ashby: api.ashbyhq.com/posting-api/job-board/{token} (public, no auth; the token is case-sensitive: Ramp, not ramp)
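
As a rough illustration of the three endpoints listed above, the sketch below maps an (atsType, companyToken) pair to its request URL. The `https://` scheme and the helper name `build_endpoint` are assumptions made for this example; they are not taken from the Actor's source.

```python
from urllib.parse import quote

# URL templates copied from the list above; the https:// scheme is an assumption.
ENDPOINTS = {
    "greenhouse": "https://boards-api.greenhouse.io/v1/boards/{token}/jobs?content=true",
    "lever": "https://api.lever.co/v0/postings/{token}",
    "ashby": "https://api.ashbyhq.com/posting-api/job-board/{token}",
}


def build_endpoint(ats_type: str, company_token: str) -> str:
    """Return the public job-board API URL for one (atsType, companyToken) pair."""
    template = ENDPOINTS.get(ats_type)
    if template is None:
        raise ValueError(f"unsupported ATS: {ats_type!r}")
    # The token's case is preserved: Ashby treats "Ramp" and "ramp" differently.
    return template.format(token=quote(company_token, safe=""))


print(build_endpoint("ashby", "Ramp"))
```

Keeping the templates in one dictionary means a fourth ATS would only need a new entry, not a new code path.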

For each active job posting the Actor pulls the title, location, department, full HTML/plain-text description, and the public URL — then runs a case-insensitive word-boundary regex against a curated vocabulary of ~110 canonical tech names. The result is a single column you can pivot or filter without per-company string normalization.
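
The detection step described above can be sketched roughly as follows. The vocabulary here is a tiny made-up sample (the Actor's real list of about 110 names is not reproduced in this README), and the alias table and function names are hypothetical.

```python
import re

# Illustrative vocabulary: canonical name -> aliases. These entries are
# assumptions for the example, not the Actor's actual vocabulary.
VOCAB = {
    "Postgres": ["postgres", "postgresql"],
    "Django": ["django"],
    "AWS": ["aws", "amazon web services"],
    "Kubernetes": ["kubernetes", "k8s"],
    "Go": ["golang"],  # bare "go" left out here to avoid matching the English verb
}


def _compile(aliases):
    # (?<!\w) and (?!\w) behave like word boundaries but also work for
    # names that start or end with punctuation, where \b would not.
    body = "|".join(re.escape(a) for a in sorted(aliases, key=len, reverse=True))
    return re.compile(rf"(?<!\w)(?:{body})(?!\w)", re.IGNORECASE)


PATTERNS = {name: _compile(aliases) for name, aliases in VOCAB.items()}


def detect_techs(description_text: str) -> list[str]:
    """Return sorted, deduplicated canonical names found in the text."""
    return sorted(name for name, pat in PATTERNS.items() if pat.search(description_text))


print(detect_techs("We run Django on PostgreSQL in AWS; k8s experience is a plus."))
```

Because each canonical name is reported at most once per description, the output is already deduplicated before sorting.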

Field             Type            Description
ats               string          greenhouse, lever, or ashby
company_token     string          Board slug passed to the ATS
job_id            string          ATS-canonical job id
title             string          Job posting title
location          string | null   Location string (Remote, NYC, Berlin...)
department        string | null   Department / team
url               string          Public job-post URL
description_text  string          Plain-text description (HTML stripped)
detected_techs    string[]        Sorted, deduplicated canonical tech names
posted_at         string | null   ISO 8601 UTC publication timestamp
scraped_at        string          ISO 8601 UTC row-creation timestamp
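
For readers who want to type the rows on the consuming side, the schema above could be mirrored like this. The Actor itself is described as using Pydantic v2; this sketch substitutes a standard-library dataclass so it runs without extra dependencies, and the class name `JobRow` plus the sample values are hypothetical.

```python
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class JobRow:
    """One dataset row, with fields in the same order as the table above."""
    ats: str
    company_token: str
    job_id: str
    title: str
    location: Optional[str]
    department: Optional[str]
    url: str
    description_text: str
    detected_techs: list[str]
    posted_at: Optional[str]
    scraped_at: str

    def __post_init__(self):
        if self.ats not in {"greenhouse", "lever", "ashby"}:
            raise ValueError(f"unknown ats: {self.ats!r}")
        if self.detected_techs != sorted(set(self.detected_techs)):
            raise ValueError("detected_techs must be sorted and deduplicated")


row = JobRow(
    ats="ashby", company_token="Ramp", job_id="example-id",  # sample values only
    title="Senior Backend Engineer", location=None, department=None,
    url="https://example.com/job/example-id", description_text="...",
    detected_techs=["AWS", "Postgres"], posted_at=None,
    scraped_at=datetime.now(timezone.utc).isoformat(),
)
print(asdict(row)["detected_techs"])
```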

🔥 Features

  • Three ATSs, one schema — Greenhouse, Lever, Ashby normalized to identical row shape.
  • Curated tech vocabulary — ~110 canonical names spanning languages (Python, Go, Rust), frameworks (Django, React, FastAPI), databases (Postgres, MongoDB, Redis, Snowflake), cloud (AWS, GCP, Azure, Vercel), infra (Kubernetes, Terraform, Docker), CI/CD, observability, and ML/data tools.
  • Per-company isolation — one bad token (404 from Lever, typo on Ashby) does not abort the run; the others still produce data.
  • No browser, no auth — pure HTTP with curl-cffi browser-TLS impersonation; low compute footprint and fast runs.
  • Pydantic v2 validation — every input and every output row is model-validated.
  • Filter knobs: maxJobsPerCompany cap and minTechsDetected floor.
  • Exponential backoff on 429 / 503 with Retry-After honoured.
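
The retry behaviour in the last bullet could look something like the sketch below. The base delay, cap, jitter, and function name are assumptions chosen for illustration; the README only states that 429 and 503 responses trigger exponential backoff and that Retry-After is honoured.

```python
import random
from typing import Optional

RETRYABLE_STATUSES = {429, 503}  # statuses named in the feature list above


def backoff_delay(attempt: int, retry_after: Optional[str] = None,
                  base: float = 1.0, cap: float = 60.0) -> float:
    """Seconds to wait before retry number `attempt` (0-based)."""
    if retry_after is not None:
        try:
            # A Retry-After value given in seconds overrides the computed delay.
            return max(0.0, float(retry_after))
        except ValueError:
            pass  # the HTTP-date form of Retry-After is not handled in this sketch
    delay = min(base * (2 ** attempt), cap)
    # Up to 10% random jitter so parallel workers do not retry in lockstep.
    return delay + random.uniform(0, delay * 0.1)


for attempt in range(4):
    print(attempt, round(backoff_delay(attempt), 2))
```

A caller would check `response.status_code in RETRYABLE_STATUSES`, sleep for the returned delay, and retry up to some fixed number of attempts.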

💡 Use cases

  • B2B sales qualification — enrich every Salesforce/HubSpot account with the company's live tech stack: "they hire Django + Postgres + AWS" tells you whether to pitch your Postgres-tuning SaaS.
  • Recruiter sourcing — pull every "Senior Backend Engineer" role from your target accounts and filter by detected_techs to find teams that match your candidate's stack.
  • Competitive intelligence — track which competitors are hiring for Kubernetes or Snowflake and infer their roadmap.
  • CRM enrichment — replace TheirStack / BuiltWith / Wappalyzer subscriptions for the subset of buyers who publish their stack in JDs anyway.
  • Investment research — map private-company tech footprint over time without paying for a Crunchbase Pro seat.
  • Open-source stack maps — feed a public site that ranks the most-hired-for technologies in YC-backed startups.
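
Several of the use cases above reduce to a few lines of post-processing over the dataset rows. The sketch below assumes rows are plain dictionaries with the `company_token` and `detected_techs` fields from the schema; the helper names and the sample rows are made up for the example.

```python
from collections import Counter


def rank_techs(rows):
    """Count how many job rows mention each tech, most frequent first."""
    counts = Counter()
    for row in rows:
        counts.update(set(row["detected_techs"]))
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


def company_stacks(rows):
    """Union of detected techs per company, e.g. for CRM account enrichment."""
    stacks = {}
    for row in rows:
        stacks.setdefault(row["company_token"], set()).update(row["detected_techs"])
    return {token: sorted(techs) for token, techs in stacks.items()}


def jobs_using(rows, tech):
    """Rows whose description mentions a given canonical tech name."""
    return [row for row in rows if tech in row["detected_techs"]]


# Made-up sample rows for illustration only.
rows = [
    {"company_token": "airtable", "detected_techs": ["AWS", "Postgres"]},
    {"company_token": "airtable", "detected_techs": ["Postgres", "React"]},
    {"company_token": "Ramp", "detected_techs": ["AWS", "Postgres", "Python"]},
]
print(rank_techs(rows))
print(company_stacks(rows))
```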

⚙️ How to use it

  1. Open the Actor input form.
  2. Add one or more {companyToken, atsType} pairs under Companies to scrape (the form prefills with airtable on Greenhouse and Ramp on Ashby).
  3. (Optional) Set Max jobs per company — leave empty for no cap.
  4. (Optional) Set Minimum detected techs: a value of 2 or 3 drops generic/non-engineering postings.
  5. Leave Use Apify Proxy off unless you observe rate-limiting (public ATS APIs do not require it).
  6. Click Start. Rows stream into the default dataset.

Quick examples

Two engineering-heavy companies, default settings:

{
  "companies": [
    { "companyToken": "airtable", "atsType": "greenhouse" },
    { "companyToken": "Ramp", "atsType": "ashby" }
  ]
}

Single Lever company, drop sales/marketing roles:

{
  "companies": [
    { "companyToken": "palantir", "atsType": "lever" }
  ],
  "minTechsDetected": 3
}

Three companies across all three ATSs, capped at 50 jobs each:

{
  "companies": [
    { "companyToken": "airtable", "atsType": "greenhouse" },
    { "companyToken": "palantir", "atsType": "lever" },
    { "companyToken": "Ramp", "atsType": "ashby" }
  ],
  "maxJobsPerCompany": 50,
  "minTechsDetected": 2
}
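
To make the effect of the two filter inputs concrete, here is one plausible reading of how they combine. The README does not say whether the cap counts fetched or emitted rows; this sketch assumes the floor is applied first and the cap counts emitted rows, and `apply_filters` is a hypothetical helper.

```python
from typing import Iterable, Iterator, Optional


def apply_filters(rows: Iterable[dict],
                  max_jobs_per_company: Optional[int] = None,
                  min_techs_detected: int = 0) -> Iterator[dict]:
    """Yield one company's rows that pass the floor, stopping at the cap."""
    emitted = 0
    for row in rows:
        if max_jobs_per_company is not None and emitted >= max_jobs_per_company:
            break  # cap reached (assumed to count emitted rows)
        if len(row["detected_techs"]) < min_techs_detected:
            continue  # too few techs detected, likely a non-engineering post
        emitted += 1
        yield row


# Made-up rows for one company.
sample = [
    {"title": "Backend Engineer", "detected_techs": ["AWS", "Django", "Postgres"]},
    {"title": "Account Executive", "detected_techs": []},
    {"title": "Data Engineer", "detected_techs": ["Snowflake", "Python"]},
    {"title": "SRE", "detected_techs": ["AWS", "Docker", "Kubernetes", "Terraform"]},
]
kept = list(apply_filters(sample, max_jobs_per_company=2, min_techs_detected=2))
print([row["title"] for row in kept])
```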

Pricing (Pay-Per-Event)

Event        Price    When
actor-start  $0.05    Once per run
result-row   $0.005   Per job row emitted

A typical 50-job run costs $0.05 + 50 × $0.005 = $0.30. Compare to TheirStack at $39-$399/month for the same signal.
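
The arithmetic above generalizes to any run size. A small estimator, using the two event prices from the table (the function name is hypothetical, and `Decimal` is used only to avoid floating-point rounding in currency math):

```python
from decimal import Decimal

ACTOR_START = Decimal("0.05")   # charged once per run
RESULT_ROW = Decimal("0.005")   # charged per job row emitted


def run_cost(rows: int, runs: int = 1) -> Decimal:
    """Estimated charge in USD for `rows` emitted rows spread over `runs` runs."""
    if rows < 0 or runs < 1:
        raise ValueError("rows must be >= 0 and runs must be >= 1")
    return ACTOR_START * runs + RESULT_ROW * rows


print(run_cost(50))     # the 50-job example above
print(run_cost(1000))   # the per-1 000-rows headline figure
```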

How to find a company's ATS token

  • Greenhouse: the path segment in boards.greenhouse.io/{token}. Examples: airtable, figma, stripe.
  • Lever: the path segment in jobs.lever.co/{token}. Examples: palantir, netflix, eventbrite.
  • Ashby: the path segment in jobs.ashbyhq.com/{Token}. Case-sensitive: Ramp, PostHog, Linear work; the lowercase variants return zero jobs.
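
If you already have a careers-page link, the token rules above can be applied mechanically. The sketch below handles only the path-segment forms on the three hosts listed; `parse_board_url` is a hypothetical helper, not part of the Actor.

```python
from urllib.parse import urlparse

# Hosts taken from the list above, mapped to the Actor's atsType values.
BOARD_HOSTS = {
    "boards.greenhouse.io": "greenhouse",
    "jobs.lever.co": "lever",
    "jobs.ashbyhq.com": "ashby",
}


def parse_board_url(url: str) -> dict:
    """Turn a job-board URL into a {companyToken, atsType} input entry."""
    parsed = urlparse(url if "://" in url else f"https://{url}")
    ats_type = BOARD_HOSTS.get((parsed.hostname or "").lower())
    segments = [s for s in parsed.path.split("/") if s]
    if ats_type is None or not segments:
        raise ValueError(f"not a recognised job-board URL: {url!r}")
    # The first path segment is the token; its case is kept as-is for Ashby.
    return {"companyToken": segments[0], "atsType": ats_type}


print(parse_board_url("https://jobs.ashbyhq.com/Ramp"))
```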

Output and limits

  • Rows are pushed to the default Apify dataset as they are produced — long runs surface rows immediately.
  • Apify FREE-tier default storage retains datasets for 7 days; use Actor.open_dataset(name="...") if you need to outlive that.
  • The Actor exits successfully when at least one company yields one or more jobs. If every company returns zero jobs (e.g. all tokens were misspelled), the run fails loudly with a non-zero exit code.
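
The success rule in the last bullet, combined with per-company isolation, can be pictured with a sketch like this one. `fetch_jobs` stands in for whatever function performs the HTTP call; it and the other names here are hypothetical.

```python
from typing import Callable, Iterable


def scrape_all(companies: Iterable[dict],
               fetch_jobs: Callable[[dict], list]) -> tuple:
    """Fetch every company independently; one failure does not stop the rest."""
    rows, errors = [], {}
    for company in companies:
        try:
            rows.extend(fetch_jobs(company))
        except Exception as exc:  # e.g. a 404 for a misspelled token
            errors[company["companyToken"]] = str(exc)
    return rows, errors


def exit_code(rows: list) -> int:
    """0 if at least one job row was produced overall, 1 otherwise."""
    return 0 if rows else 1


def fake_fetch(company: dict) -> list:
    # Stand-in for the real HTTP call, used only to exercise the control flow.
    if company["companyToken"] == "typo":
        raise RuntimeError("404 Not Found")
    return [{"company_token": company["companyToken"], "title": "Engineer"}]


rows, errors = scrape_all(
    [{"companyToken": "typo", "atsType": "lever"},
     {"companyToken": "Ramp", "atsType": "ashby"}],
    fake_fetch,
)
print(len(rows), errors, exit_code(rows))
```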

Notes and caveats

  • Tech detection is regex-based, not LLM-based — it is deterministic, free, and high-precision but will miss tools whose names are not in the curated vocabulary. Open a feedback request to add more.
  • Greenhouse content is double-HTML-encoded (e.g. &lt;div&gt; for <div>); the parser unescapes twice before stripping tags.
  • Lever's descriptionPlain often omits the "Requirements" bullet list; the parser concatenates every lists[].content chunk so skills hidden there are still detected.
  • Ashby exposes both descriptionHtml and descriptionPlain — the parser prefers descriptionPlain when present.
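
The three per-ATS parsing notes above might translate into code along these lines. This is a minimal sketch using only the standard library: the field names `descriptionPlain`, `descriptionHtml`, and `lists[].content` come from the notes, the Greenhouse `content` field name is an assumption, and the regex-based tag stripper is a simplification of whatever the Actor actually uses.

```python
import html
import re

_TAG = re.compile(r"<[^>]+>")
_WS = re.compile(r"\s+")


def strip_html(markup: str) -> str:
    """Crude tag removal plus whitespace collapsing (sketch, not a full parser)."""
    return _WS.sub(" ", _TAG.sub(" ", markup)).strip()


def greenhouse_text(job: dict) -> str:
    # Unescape twice to undo the double encoding, then strip the tags.
    return strip_html(html.unescape(html.unescape(job.get("content", ""))))


def lever_text(posting: dict) -> str:
    # Append every lists[].content chunk so skills in bullet lists are kept.
    parts = [posting.get("descriptionPlain", "")]
    parts += [strip_html(item.get("content", "")) for item in posting.get("lists") or []]
    return "\n".join(p for p in parts if p)


def ashby_text(job: dict) -> str:
    # Prefer the plain-text field; fall back to stripped HTML when it is empty.
    return job.get("descriptionPlain") or strip_html(job.get("descriptionHtml", ""))


print(greenhouse_text({"content": "&amp;lt;p&amp;gt;Django and Postgres&amp;lt;/p&amp;gt;"}))
```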

Related Actors

  • LLM Pricing Monitor: track live API pricing across OpenAI, Anthropic, Google, Mistral, Groq, Together AI, and DeepSeek in one normalized schema.
  • GitHub Org Scraper: pair tech-stack signals from JDs with actual code-repo activity per org.
  • Y Combinator Companies Scraper: combine YC funding signal with ATS-derived tech stack for a complete startup profile.

Support

  • Author: DevilScrapes
  • Issues / feature requests: open a feedback ticket on the Actor's Apify Store page.