Pricing

from $1.80 / 1,000 job-results

Tech Stack From Job Posts Scraper

Extract public job posts from Greenhouse, Lever, Ashby, and public career pages and detect the technologies, tools, and cloud platforms companies are hiring for - no login or cookies.

Developer

Delowar Munna

Last modified: 2 months ago

✨ Why this scraper

Hiring-intent, not just job rows — generic job scrapers return listings; this turns each description into structured technology signals for B2B sales, GTM research, and recruiting.
JSON-first, no browser — reads Greenhouse / Lever / Ashby public APIs directly. One request per board returns every posting with its description.
Transparent detection — a local, in-code keyword dictionary (no AI, no paid API). You can see and extend exactly what is matched.
Flat 29-field output — no nested objects; drops straight into Sheets, Excel, or a CRM.
Pay-Per-Event — one flat job-result event per saved unique job. Duplicates and filtered rows are never charged.

🚀 Quick start — sample inputs

Example 1 — ATS boards + technology filter

{
    "startUrls": ["https://jobs.lever.co/example-company", "https://boards.greenhouse.io/examplecompany"],
    "maxResults": 200,
    "technologyKeywords": ["Snowflake", "dbt", "Kubernetes"],
    "requireTechnologyMatch": true,
    "technologyCategories": ["cloud", "data", "devops"],
    "keywordFilter": "data engineer",
    "locationFilter": "Australia",
    "remoteFilter": "any",
    "minTechSignalScore": 30,
    "includeDescriptionText": true,
    "dedupe": true,
    "proxyConfiguration": { "useApifyProxy": true }
}

Example 2 — single Ashby board, all jobs, custom residential proxy via your own provider

{
    "startUrls": ["https://jobs.ashbyhq.com/example"],
    "maxResults": 500,
    "includeDescriptionText": true,
    "dedupe": true,
    "proxyConfiguration": {
        "useApifyProxy": false,
        "proxyUrls": ["http://user:pass@proxy.iproyal.com:12321"]
    }
}

Supported ATS URLs use each platform's public JSON API. Other career pages are read via schema.org/JobPosting structured data when present, or by following an embedded Greenhouse/Lever/Ashby board. Pages with neither are reported as unsupported_inputs.
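
As a rough illustration of how a board URL can be mapped to a public JSON endpoint, here is a minimal sketch. The helper name `ats_api_url` is hypothetical, and the full URL templates are assumptions based on each platform's public job-board API documentation; the actor's exact requests are not shown in this README.

```python
from typing import Optional, Tuple
from urllib.parse import urlparse


def ats_api_url(board_url: str) -> Optional[Tuple[str, str]]:
    """Map a public board URL to (platform, JSON API URL), or None if unrecognised.

    Hypothetical helper: the URL templates below are assumed from the
    platforms' public documentation, not taken from the actor's source.
    """
    parsed = urlparse(board_url)
    host = parsed.netloc.lower()
    segments = [part for part in parsed.path.split("/") if part]
    if not segments:
        return None
    slug = segments[0]  # first path segment is the company's board token

    if host == "boards.greenhouse.io":
        return "greenhouse", f"https://boards-api.greenhouse.io/v1/boards/{slug}/jobs?content=true"
    if host == "jobs.lever.co":
        return "lever", f"https://api.lever.co/v0/postings/{slug}?mode=json"
    if host == "jobs.ashbyhq.com":
        return "ashby", f"https://api.ashbyhq.com/posting-api/job-board/{slug}"
    return None  # not a recognised ATS host; would need JSON-LD or an embedded board
```

A URL that returns None here corresponds to the "other career pages" path described above, not necessarily to an unsupported input.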

Apify Residential proxy is blocked; if you need residential routing, supply your own provider via proxyConfiguration.proxyUrls. See 🚦 Proxy policy below.

📦 Output

The dataset has one view: Jobs & detected tech stack — a 29-column flat table.


Output fields (29)

source_input, source_platform, company_name, company_domain, job_id, job_title, department, location, remote_type, employment_type, posted_at, job_url, apply_url, description_text, detected_technologies, technology_categories, languages, frameworks, cloud_platforms, databases, devops_tools, data_ai_tools, business_tools, tech_signal_score, tech_signal_label, reason_tags, matched_user_keywords, raw_description_length, scraped_at.

Sample record — Jobs & detected tech stack

(Real run output; description_text is truncated here for readability.)

{
    "source_input": "https://jobs.ashbyhq.com/ramp",
    "source_platform": "ashby",
    "company_name": "Ramp",
    "company_domain": "ramp.com",
    "job_id": "41696f51-7b29-4e12-b528-46c2f6c4f5f7",
    "job_title": "Senior Data Scientist, Growth",
    "department": "Data",
    "location": "New York, NY (HQ)",
    "remote_type": "remote",
    "employment_type": "Full-time",
    "posted_at": "2026-02-02T14:46:45.488Z",
    "job_url": "https://jobs.ashbyhq.com/ramp/41696f51-7b29-4e12-b528-46c2f6c4f5f7",
    "apply_url": "https://jobs.ashbyhq.com/ramp/41696f51-7b29-4e12-b528-46c2f6c4f5f7/application",
    "description_text": "About Ramp Ramp is building the smart infrastructure for finance teams, embedded in the transaction flow of every dollar a business spends. We automate how over...",
    "detected_technologies": "Airflow, BigQuery, Dagster, dbt, Fivetran, Git, Looker, NumPy, Pandas, Prefect, Python, Redshift, scikit-learn, Snowflake, SQL",
    "technology_categories": "ai_ml, analytics, data, database, language, other",
    "languages": "Python, SQL",
    "frameworks": "",
    "cloud_platforms": "",
    "databases": "BigQuery, Redshift, Snowflake",
    "devops_tools": "",
    "data_ai_tools": "Airflow, Dagster, dbt, Fivetran, NumPy, Pandas, Prefect, scikit-learn",
    "business_tools": "Looker",
    "tech_signal_score": 90,
    "tech_signal_label": "high",
    "reason_tags": "multiple_technologies, data_stack, ai_ml_stack, crm_or_marketing_stack, user_keyword_match, engineering_role",
    "matched_user_keywords": "dbt, Snowflake",
    "raw_description_length": 6118,
    "scraped_at": "2026-06-05T00:50:33.198Z"
}

🎯 Tech signal score

Transparent rule-based score (0–100) computed from the detected technologies and the role title — no AI, no external enrichment.

Signal	Points
Each unique detected technology	+10 (capped at 50)
At least two technology categories detected	+10
Cloud platform or DevOps tool detected	+10
Database / data / AI-ML tool detected	+10
Engineering / data / IT / security / product role title	+10
A user-supplied `technologyKeywords` term matched	+10

Score is capped at 100. Labels: high (60–100) · medium (30–59) · low (1–29) · none (0).
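
The rules in the table can be reproduced offline from the output fields. The sketch below is a re-implementation of the published rules, not the actor's code. The function names are hypothetical, and the role-title check is passed in as a boolean because the title-matching rules are not published here.

```python
def split_csv(value: str) -> list:
    """Split a comma-separated output field into trimmed, non-empty items."""
    return [item.strip() for item in value.split(",") if item.strip()]


def tech_signal_score(record: dict, is_target_role: bool) -> int:
    """Recompute the 0-100 score from one flat output record."""
    # +10 per unique detected technology, capped at 50
    score = min(10 * len(set(split_csv(record["detected_technologies"]))), 50)
    # +10 when at least two technology categories are present
    if len(split_csv(record["technology_categories"])) >= 2:
        score += 10
    # +10 for any cloud platform or DevOps tool
    if record["cloud_platforms"] or record["devops_tools"]:
        score += 10
    # +10 for any database / data / AI-ML tool
    if record["databases"] or record["data_ai_tools"]:
        score += 10
    # +10 for an engineering / data / IT / security / product role title
    if is_target_role:
        score += 10
    # +10 when a user-supplied technologyKeywords term matched
    if record["matched_user_keywords"]:
        score += 10
    return min(score, 100)


def tech_signal_label(score: int) -> str:
    """Map a score to the published label bands."""
    if score >= 60:
        return "high"
    if score >= 30:
        return "medium"
    if score >= 1:
        return "low"
    return "none"
```

For the sample record above (15 technologies, six categories, no cloud or DevOps hit, a database and data hit, an engineering_role tag, and a keyword match) this gives 50 + 10 + 0 + 10 + 10 + 10 = 90, labelled high, which matches the published output.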

reason_tags explains the score — e.g. multiple_technologies, cloud_mentioned, devops_stack, data_stack, ai_ml_stack, crm_or_marketing_stack, user_keyword_match, engineering_role.

⚙️ Filters

Filter	Effect
`keywordFilter`	Case-insensitive substring on title + department + description.
`requireTechnologyMatch`	Keep only rows with at least one detected technology.
`technologyCategories`	Keep rows with at least one detected technology in the selected categories.
`locationFilter`	Case-insensitive substring on location; jobs with no location are excluded when set.
`remoteFilter`	`any` / `remote` / `hybrid` / `onsite` against the derived `remote_type`.
`minTechSignalScore`	Keep rows with `tech_signal_score` ≥ threshold.
`dedupe`	Drop duplicates by platform job ID, canonical job URL, and title/company/location.

Filters are applied before any dataset push or event charge.
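
To make the filter semantics concrete, here is a minimal sketch of a row predicate and a dedupe pass. Both functions are hypothetical re-implementations of the table above. Matching `technologyCategories` against the `technology_categories` field and treating a trailing slash as the only URL canonicalisation step are assumptions.

```python
def passes_filters(row: dict, cfg: dict) -> bool:
    """Return True when a flat job row satisfies every configured filter."""
    keyword = (cfg.get("keywordFilter") or "").lower()
    if keyword:
        haystack = " ".join(
            (row.get(field) or "") for field in ("job_title", "department", "description_text")
        ).lower()
        if keyword not in haystack:
            return False

    if cfg.get("requireTechnologyMatch") and not row.get("detected_technologies"):
        return False

    wanted = set(cfg.get("technologyCategories") or [])
    if wanted:
        found = {c.strip() for c in (row.get("technology_categories") or "").split(",")}
        if not wanted & found:
            return False

    location = (cfg.get("locationFilter") or "").lower()
    # An empty location never contains the filter text, so such rows are excluded.
    if location and location not in (row.get("location") or "").lower():
        return False

    remote = cfg.get("remoteFilter") or "any"
    if remote != "any" and row.get("remote_type") != remote:
        return False

    return row.get("tech_signal_score", 0) >= cfg.get("minTechSignalScore", 0)


def dedupe(rows: list) -> list:
    """Keep the first row for each job ID, job URL, or title/company/location key."""
    seen = set()
    unique = []
    for row in rows:
        keys = {
            ("tcl",
             (row.get("job_title") or "").lower(),
             (row.get("company_name") or "").lower(),
             (row.get("location") or "").lower()),
        }
        if row.get("job_id"):
            keys.add(("id", row.get("source_platform"), row["job_id"]))
        if row.get("job_url"):
            keys.add(("url", row["job_url"].rstrip("/")))
        if keys & seen:
            continue  # any shared key marks the row as a duplicate
        seen |= keys
        unique.append(row)
    return unique
```

Running the predicate before the dedupe pass, and both before any write, mirrors the stated order: rows that fail a filter never reach the dataset or the charge step.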

💰 Pricing

Pay-Per-Event. One flat event per saved row (the per-event price is configured on the Apify console):

Event	Charged when
`job-result`	Once per unique job row that passed all filters and was successfully written to the dataset.

Your bill is simply results_saved × price_per_event. The actor honors the user-configured per-run spending cap (Apify eventChargeLimitReached): before collecting, it limits the number of results to what the cap can pay for, and it stops cleanly as soon as the cap is reached during charging.

Not charged: duplicates, rows filtered out, rows missing a title/durable identifier, and failed or blocked requests.
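
The billing arithmetic can be sketched as follows. The per-event price of $0.0018 is an assumption derived from the listed "from $1.80 / 1,000 job-results" rate; the actual price depends on what is configured for your plan. The function names are hypothetical.

```python
from decimal import Decimal, ROUND_DOWN

# Assumed price: $1.80 per 1,000 job-result events (the listed "from" rate).
PRICE_PER_EVENT = Decimal("1.80") / Decimal("1000")


def run_cost(results_saved: int, price_per_event: Decimal = PRICE_PER_EVENT) -> Decimal:
    """Bill for a run: one job-result event per saved unique row."""
    return results_saved * price_per_event


def affordable_results(spend_limit: Decimal, price_per_event: Decimal = PRICE_PER_EVENT) -> int:
    """Largest number of rows a per-run spending cap can pay for."""
    return int((spend_limit / price_per_event).to_integral_value(rounding=ROUND_DOWN))


def effective_max_results(max_results: int, spend_limit: Decimal) -> int:
    """Combine maxResults with the spending cap, as an up-front collection limit."""
    return min(max_results, affordable_results(spend_limit))
```

At the assumed rate, 200 saved rows cost $0.36, and a $1.00 cap pays for at most 555 rows, so a run with maxResults of 1,000 would be limited to 555.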

🚦 Proxy policy

Use Apify Datacenter proxy or no proxy for normal runs — both work reliably for the public ATS JSON APIs at this actor's conservative concurrency.

Apify Residential proxy is not supported. The actor fails at startup if proxyConfiguration.apifyProxyGroups includes RESIDENTIAL. The reason: in pay-per-event actors, residential bandwidth (billed per GB) is charged to the developer, not the run user, so a single bandwidth-heavy run could cost more than the per-result event revenue it generates.

If you genuinely need residential routing, supply your own residential provider via the proxy editor's Custom proxy URLs field — that traffic goes through your provider, not Apify, and is unaffected:

http://user:pass@proxy.iproyal.com:12321
http://user:pass@proxy.brightdata.com:22225
http://user:pass@proxy.oxylabs.io:7777
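
If you build inputs programmatically, a pre-flight check along these lines can catch a rejected configuration before the run starts. This is a hypothetical client-side sketch of the rule stated above, not the actor's own validation code, and the error message is illustrative.

```python
def check_proxy_configuration(proxy_configuration: dict) -> None:
    """Raise ValueError if the input requests the Apify RESIDENTIAL proxy group."""
    groups = proxy_configuration.get("apifyProxyGroups") or []
    if "RESIDENTIAL" in groups:
        raise ValueError(
            "Apify Residential proxy is not supported; "
            "use Datacenter, no proxy, or your own provider via proxyUrls."
        )


# Accepted: custom residential provider supplied through proxyUrls.
check_proxy_configuration({
    "useApifyProxy": False,
    "proxyUrls": ["http://user:pass@proxy.iproyal.com:12321"],
})
```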

📊 Run summary

After each run, a RUN_SUMMARY entry is written to the key-value store:

{
    "inputs_total": 3,
    "successful_inputs": 3,
    "failed_inputs": 0,
    "unsupported_inputs": 0,
    "raw_results_found": 420,
    "results_saved": 200,
    "duplicates_removed": 12,
    "filtered_out": 208,
    "charged_events": 200,
    "blocked_requests": 0,
    "retry_count": 1,
    "source_platform_counts": { "greenhouse": 120, "lever": 50, "ashby": 30 },
    "technology_counts": { "Python": 88, "AWS": 61, "Kubernetes": 44 },
    "category_counts": { "language": 180, "cloud": 110, "devops": 70 },
    "runtime_seconds": 41,
    "scraped_at": "2026-06-04T06:00:00.000Z"
}

charged_events equals the number of successfully saved unique rows.
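
A downstream job can sanity-check a stored summary with a few lines. The first check follows directly from the statement above. The second is only an observation from the sample (200 + 12 + 208 = 420) and is an assumption, so treat a mismatch there as a prompt to investigate, not as an error. The function name is hypothetical.

```python
def check_run_summary(summary: dict) -> list:
    """Return a list of human-readable inconsistencies found in a RUN_SUMMARY dict."""
    problems = []

    # Stated rule: every saved unique row is charged exactly once.
    if summary["charged_events"] != summary["results_saved"]:
        problems.append("charged_events differs from results_saved")

    # Assumed from the sample only: raw rows are either saved, deduplicated, or filtered.
    accounted = (
        summary["results_saved"]
        + summary["duplicates_removed"]
        + summary["filtered_out"]
    )
    if accounted != summary["raw_results_found"]:
        problems.append(
            f"{summary['raw_results_found'] - accounted} raw results are unaccounted for"
        )
    return problems
```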

🚧 Limitations (V1)

Public data only: no login, cookies, or member-only content. The actor reads each ATS's public JSON board API.
Supported sources: Greenhouse, Lever, and Ashby via public APIs; other career pages only when they expose schema.org/JobPosting JSON-LD or embed one of those boards.
Technology detection uses a transparent local dictionary — it detects common languages, frameworks, cloud platforms, databases, DevOps, data/AI, analytics, CRM, security, and mobile tooling, not every niche tool.
company_domain is best-effort and is often null for ATS-hosted boards (no company website is exposed).
No recruiter/contact extraction, email enrichment, company-site crawling, or AI scoring.

❓ FAQ

Do I need an account or cookies? No. The actor only uses public ATS JSON endpoints.

Which ATS platforms are supported? Greenhouse, Lever, and Ashby via their public APIs. Generic career pages work when they expose JSON-LD JobPosting data or embed one of those boards.

How are technologies detected? A local, transparent keyword dictionary with aliases and word-boundary rules runs against the job title, department, and description. Add your own terms with technologyKeywords.
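
The general technique can be sketched as below. This is an illustration of dictionary matching with aliases and word boundaries, not the actor's dictionary: the entries in `TECH_ALIASES` and the exact boundary rule are assumptions chosen for the example.

```python
import re

# Hypothetical mini-dictionary: canonical name -> aliases to look for.
TECH_ALIASES = {
    "Java": ["java"],
    "JavaScript": ["javascript"],
    "Kubernetes": ["kubernetes", "k8s"],
    "PostgreSQL": ["postgresql", "postgres"],
    "dbt": ["dbt"],
    "scikit-learn": ["scikit-learn", "sklearn"],
}


def _alias_pattern(alias: str):
    """Match the alias only when it is not glued to other letters or digits."""
    return re.compile(
        r"(?<![A-Za-z0-9])" + re.escape(alias) + r"(?![A-Za-z0-9])",
        re.IGNORECASE,
    )


PATTERNS = {
    name: [_alias_pattern(alias) for alias in aliases]
    for name, aliases in TECH_ALIASES.items()
}


def detect_technologies(text: str) -> list:
    """Return canonical names of every technology mentioned in the text."""
    found = [
        name
        for name, patterns in PATTERNS.items()
        if any(pattern.search(text) for pattern in patterns)
    ]
    return sorted(found, key=str.lower)
```

Explicit lookarounds are used instead of `\b` so that names containing punctuation, such as scikit-learn, still get a sensible boundary. With this rule, "JavaScript" does not count as a mention of Java, and "debt" does not count as dbt.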

Can I export to CSV? Yes — every field is flat (no nested objects). Use Apify's CSV / Excel export, or call the dataset API with format=csv.
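
As a small illustration, the CSV export URL for a run's dataset can be built like this. The dataset ID `abc123` is a placeholder and the helper name is hypothetical. The `fields` parameter is an optional Apify dataset API option for limiting columns. Authentication is left out; private datasets need your API token.

```python
from urllib.parse import urlencode


def dataset_csv_url(dataset_id: str, fields=None) -> str:
    """Build the Apify dataset items URL that returns rows as CSV."""
    params = {"format": "csv"}
    if fields:
        # Optional: restrict the export to selected columns.
        params["fields"] = ",".join(fields)
    return f"https://api.apify.com/v2/datasets/{dataset_id}/items?{urlencode(params)}"


# Placeholder dataset ID; replace with the ID of your run's default dataset.
url = dataset_csv_url("abc123", fields=["company_name", "job_title", "tech_signal_score"])
```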

🛠️ Technical notes

Stack: Node.js 22 · Apify SDK 3 · Crawlee CheerioCrawler (HTTP + JSON) · native fetch. No browser.
Endpoints: Greenhouse boards-api, Lever v0/postings, Ashby posting-api/job-board (all public, no auth).
Concurrency: min=1, max=5 (conservative; tune after real runs).
Memory: 1 GB min · 2 GB default · 4 GB max.
Proxy: Apify Proxy enabled by default; custom configs accepted; Apify Residential rejected at startup.
