# Job Postings Scraper - Greenhouse, Lever & Ashby (Multi-ATS) (`bujhmml/ats-jobs-scraper`) Actor

Scrape live job postings from any Greenhouse, Lever, or Ashby careers board by slug, URL, or auto-detect. Returns title, location, department, employment type, remote flag, salary, apply URL and dates. Built-in keyword, location, department and remote filters. HTTP-first, no auth, no anti-bot.

- **URL**: https://apify.com/bujhmml/ats-jobs-scraper.md
- **Developed by:** [Ihor Bielievskiy](https://apify.com/bujhmml) (community)
- **Categories:** Jobs, Lead generation
- **Stats:** 3 total users, 1 monthly users, 100.0% runs succeeded, 0 bookmarks
- **User rating**: No ratings yet

## Pricing

from $0.40 / 1,000 jobs

This Actor is paid per event. You are not charged for Apify platform usage; you pay only a fixed price for specific events.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event
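As a rough budgeting aid, the per-job price can be turned into a run estimate. This is a minimal sketch: `estimate_cost` is a hypothetical helper, and it assumes the listed starting rate of $0.40 per 1,000 jobs applies to every billable row. The word "from" in the listing suggests the actual rate may differ by plan.

```python
from decimal import Decimal

# Assumption: the listed starting rate applies to every billable row.
USD_PER_1000_JOBS = Decimal("0.40")


def estimate_cost(billable_jobs, usd_per_1000=USD_PER_1000_JOBS):
    """Estimate the run cost in USD for a number of billable job rows."""
    if billable_jobs < 0:
        raise ValueError("billable_jobs must be non-negative")
    return Decimal(billable_jobs) / Decimal(1000) * usd_per_1000


# Example: a run that delivers 2,500 valid postings.
print(f"Estimated cost: ${estimate_cost(2500):.2f}")
```

Using `Decimal` avoids binary floating-point rounding when the estimate is summed across many runs.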

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## Job Postings Scraper - Greenhouse, Lever & Ashby (Multi-ATS)

Pull live job openings from any company that hires on **Greenhouse**, **Lever**, or **Ashby**. Give it a company slug, paste a careers-board URL, or let it auto-detect the platform for you. Every posting comes back as one clean, normalized row no matter which ATS it came from: title, location, department, team, employment type, remote flag, salary (where the ATS exposes it), the listing and apply URLs, created/updated dates, and the full description in both plain text and HTML. Export as JSON, CSV, or Excel.

It calls each ATS's own public job-board API directly instead of driving a headless browser, so it's fast, needs no login, and doesn't trip anti-bot. Mix as many boards and ATS types as you want in a single run.

### Three ways to point it at a board

You can put any of these in the `sources` array, and mix them freely:

1. **Explicit object** — `{ "ats": "greenhouse", "company": "stripe" }`
2. **Pasted board URL** — `"https://jobs.lever.co/spotify"` or `"https://boards.greenhouse.io/stripe"` (the ATS and slug are read straight out of the URL)
3. **Bare company slug** — `"ramp"` (the actor probes Greenhouse, Lever, and Ashby and uses whichever one has a board)

The `company` slug is the identifier in the careers-page URL: `boards.greenhouse.io/`**`stripe`**, `jobs.lever.co/`**`spotify`**, `jobs.ashbyhq.com/`**`ramp`**.
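To make the three source forms concrete, here is a minimal sketch of how each one could be reduced to an `(ats, company)` pair. It is not the Actor's own code: `parse_source` and the `HOST_TO_ATS` mapping are hypothetical names, and the hostnames are taken from the board URLs shown above. A bare slug comes back with `ats` set to `None`, because it still has to be probed against all three platforms.

```python
from urllib.parse import urlparse

# Hostnames come from the board URLs in this README; the mapping is an assumption.
HOST_TO_ATS = {
    "boards.greenhouse.io": "greenhouse",
    "jobs.lever.co": "lever",
    "jobs.ashbyhq.com": "ashby",
}


def parse_source(source):
    """Return (ats, company); ats is None for a bare slug that still needs probing."""
    if isinstance(source, dict):
        # Explicit object: both values are given directly.
        return source["ats"], source["company"]
    if "://" in source:
        # Pasted board URL: the host identifies the ATS, the first path segment is the slug.
        parsed = urlparse(source)
        ats = HOST_TO_ATS.get(parsed.netloc.lower())
        segments = [part for part in parsed.path.split("/") if part]
        if ats is None or not segments:
            raise ValueError(f"unrecognized board URL: {source}")
        return ats, segments[0]
    # Bare company slug: the ATS is unknown at this point.
    return None, source.strip()


for item in [
    {"ats": "greenhouse", "company": "stripe"},
    "https://jobs.lever.co/spotify",
    "ramp",
]:
    print(parse_source(item))
```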

Underlying endpoints:

- **`greenhouse`** — `boards-api.greenhouse.io/v1/boards/{company}/jobs`
- **`lever`** — `api.lever.co/v0/postings/{company}`
- **`ashby`** — `api.ashbyhq.com/posting-api/job-board/{company}`
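The endpoint patterns above can be expressed as URL templates. The sketch below assumes the `https://` scheme, which the list does not spell out, and uses hypothetical helper names (`endpoint_url`, `candidate_urls`). For a bare slug, `candidate_urls` lists the three URLs that would be probed. The sketch sends no requests.

```python
from urllib.parse import quote

# Host and path patterns are from the list above; the https:// scheme is an assumption.
ENDPOINTS = {
    "greenhouse": "https://boards-api.greenhouse.io/v1/boards/{company}/jobs",
    "lever": "https://api.lever.co/v0/postings/{company}",
    "ashby": "https://api.ashbyhq.com/posting-api/job-board/{company}",
}


def endpoint_url(ats, company):
    """Build the job-board API URL for one board."""
    if ats not in ENDPOINTS:
        raise ValueError(f"unsupported ATS: {ats}")
    # Percent-encode the slug so unusual characters cannot alter the path.
    return ENDPOINTS[ats].format(company=quote(company, safe=""))


def candidate_urls(slug):
    """For a bare slug, list the URL to try on each platform."""
    return {ats: endpoint_url(ats, slug) for ats in ENDPOINTS}


print(endpoint_url("lever", "spotify"))
for ats, url in candidate_urls("ramp").items():
    print(ats, url)
```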

### Filters

Narrow the run before anything is billed — filtered-out rows never cost you a thing:

- `titleKeyword` — title must contain this text
- `locationKeyword` — location must contain this text
- `department` — department or team must contain this text
- `remoteOnly` — keep only jobs detected as fully remote
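If you want to re-apply the same kind of filtering to a dataset you have already downloaded, the rules above can be approximated client-side. This is a sketch under assumptions: the Actor's exact matching logic is not published here, `matches` is a hypothetical helper, and "fully remote" is taken to mean `remote` is `true` in the output row.

```python
def matches(job, title_keyword=None, location_keyword=None,
            department=None, remote_only=False):
    """Return True if an output row passes all of the given filters."""

    def contains(value, needle):
        # Case-insensitive substring match; a null field never matches.
        return needle.lower() in (value or "").lower()

    if title_keyword and not contains(job.get("title"), title_keyword):
        return False
    if location_keyword and not contains(job.get("location"), location_keyword):
        return False
    if department and not (
        contains(job.get("department"), department)
        or contains(job.get("team"), department)
    ):
        return False
    # Assumption: `remote` is True only for fully remote postings.
    if remote_only and job.get("remote") is not True:
        return False
    return True


job = {
    "title": "Accounts Payable Analyst",
    "location": "New York, NY",
    "department": "Finance",
    "team": "Accounting",
    "remote": False,
}
print(matches(job, title_keyword="analyst", department="accounting"))
print(matches(job, remote_only=True))
```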

### Input

| Field | Type | Description |
|-------|------|-------------|
| `sources` | array | Board objects `{ats, company}`, board URLs, or bare slugs (any mix). Required. |
| `titleKeyword` | string | Keep only jobs whose title contains this (case-insensitive). |
| `locationKeyword` | string | Keep only jobs whose location contains this (case-insensitive). |
| `department` | string | Keep only jobs whose department or team contains this (case-insensitive). |
| `remoteOnly` | boolean | Keep only fully-remote jobs. Default: `false`. |
| `maxItems` | integer | Stop after this many rows across all boards (0 = no limit). Default: `100`. |
| `impersonate` | string | Browser TLS fingerprint. One of `chrome` (default), `chrome131`, `chrome124`, `safari17_0`. |

```json
{
  "sources": [
    { "ats": "greenhouse", "company": "stripe" },
    "https://jobs.lever.co/spotify",
    "ramp"
  ],
  "titleKeyword": "engineer",
  "remoteOnly": true,
  "maxItems": 500
}
````

### Output fields

One row per posting. A check means the field is populated when that ATS provides it; otherwise it's `null`.

| Field | Type | Greenhouse | Lever | Ashby |
|-------|------|:---:|:---:|:---:|
| `source_ats` | string | ✅ | ✅ | ✅ |
| `company` | string | ✅ | ✅ | ✅ |
| `job_id` | string | ✅ | ✅ | ✅ |
| `global_id` | string `{ats}:{company}:{job_id}` | ✅ | ✅ | ✅ |
| `title` | string | ✅ | ✅ | ✅ |
| `location` | string | ✅ | ✅ | ✅ |
| `department` | string | ✅ | ✅ | ✅ |
| `team` | string | — | ✅ | ✅ |
| `employment_type` | string | — | ✅ | ✅ |
| `remote` | boolean | inferred | ✅ | ✅ |
| `remote_type` | `remote`/`hybrid`/`onsite` | inferred | ✅ | ✅ |
| `salary` | string | — | when set | when set |
| `url` | string (listing) | ✅ | ✅ | ✅ |
| `apply_url` | string | — | ✅ | ✅ |
| `created_at` | ISO 8601 | ✅ | ✅ | ✅ |
| `updated_at` | ISO 8601 | ✅ | — | when set |
| `posted_at` | ISO 8601 | ✅ | ✅ | ✅ |
| `scraped_at` | ISO 8601 | ✅ | ✅ | ✅ |
| `description` | plain text | ✅ | ✅ | ✅ |
| `description_html` | HTML | ✅ | ✅ | ✅ |

`global_id` is stable across runs, so you can diff datasets or join against your own systems. Greenhouse's board API has no employment-type, team, or salary, no explicit remote flag, and no separate apply endpoint — `remote`/`remote_type` there are inferred from the location and title text (treat them as best-effort for that source), and `apply_url` is `null` since the listing `url` is the only link Greenhouse exposes.
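Because `global_id` follows the `{ats}:{company}:{job_id}` pattern and stays stable, two runs can be compared with plain set operations. The sketch below uses hypothetical helpers (`make_global_id`, `diff_runs`) and small hand-written rows with made-up job ids in place of real datasets.

```python
def make_global_id(ats, company, job_id):
    """Compose the identifier in the documented {ats}:{company}:{job_id} form."""
    return f"{ats}:{company}:{job_id}"


def diff_runs(previous, current):
    """Return the global_ids that appeared or disappeared between two runs."""
    prev_ids = {row["global_id"] for row in previous}
    curr_ids = {row["global_id"] for row in current}
    return {
        "added": sorted(curr_ids - prev_ids),
        "removed": sorted(prev_ids - curr_ids),
    }


# Illustrative rows only; the job ids are made up.
yesterday = [
    {"global_id": make_global_id("lever", "spotify", "a1")},
    {"global_id": make_global_id("lever", "spotify", "b2")},
]
today = [
    {"global_id": make_global_id("lever", "spotify", "b2")},
    {"global_id": make_global_id("ashby", "ramp", "c3")},
]

print(diff_runs(yesterday, today))
```

Postings in `added` are new since the previous run, and postings in `removed` have been taken down or no longer pass your filters.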

### Example output

```json
{
  "source_ats": "lever",
  "company": "spotify",
  "job_id": "88499546-e9f7-4403-87a5-240050bd7c5b",
  "global_id": "lever:spotify:88499546-e9f7-4403-87a5-240050bd7c5b",
  "title": "Accounts Payable Analyst",
  "location": "New York, NY",
  "department": "Finance",
  "team": "Accounting",
  "employment_type": "Permanent",
  "remote": false,
  "remote_type": "hybrid",
  "salary": null,
  "url": "https://jobs.lever.co/spotify/88499546-e9f7-4403-87a5-240050bd7c5b",
  "apply_url": "https://jobs.lever.co/spotify/88499546-e9f7-4403-87a5-240050bd7c5b/apply",
  "created_at": "2026-05-11T11:20:11.285000+00:00",
  "updated_at": null,
  "posted_at": "2026-05-11T11:20:11.285000+00:00",
  "scraped_at": "2026-06-24T09:15:02.110000+00:00",
  "description": "Spotify is looking for an Accounts Payable Analyst ..."
}
```

### Why this one

- **One shape for every ATS.** Greenhouse, Lever, and Ashby each return a different JSON structure; this maps all of them to the same fields, so you can dump multiple companies into one dataset and not care where each row came from.
- **Paste a URL or just a name.** No need to know which ATS a company uses — paste the careers link or the bare slug and it figures the rest out.
- **Nothing fails silently.** A wrong slug, a failed request, or an API that changes shape gives you a typed error row (`fetch_failed`, `parse_failed`, `invalid_source`, `invalid_input`) instead of a quietly empty run. One bad board never aborts the rest, but if *every* source fails the run is marked failed (not a green empty dataset), and a rate-limit/anti-bot block (HTTP 403/429/5xx) is reported as a block rather than a false "no board found".
- **You're billed per valid posting delivered**, so error rows, duplicates, and rows you filter out don't cost you anything.
- **Duplicates are removed** within a run, per ATS, by job id.
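The Actor deduplicates within a single run. If you merge datasets from several runs yourself, the same rule (per ATS, by job id) can be applied client-side. This is a minimal sketch with a hypothetical `dedupe` helper, not the Actor's internal implementation. It keeps the first occurrence of each `(source_ats, job_id)` pair.

```python
def dedupe(rows):
    """Keep the first row for each (source_ats, job_id) pair, preserving order."""
    seen = set()
    unique = []
    for row in rows:
        key = (row["source_ats"], row["job_id"])
        if key in seen:
            continue
        seen.add(key)
        unique.append(row)
    return unique


# Illustrative rows only; the job ids are made up.
merged = [
    {"source_ats": "lever", "job_id": "1", "title": "Engineer"},
    {"source_ats": "lever", "job_id": "1", "title": "Engineer"},
    {"source_ats": "ashby", "job_id": "1", "title": "Designer"},
]
print(len(dedupe(merged)))
```

The same job id under a different ATS is treated as a different posting, which is why the key includes `source_ats`.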

### Notes

Only public, unauthenticated job-board APIs are used — the same data these companies publish on their own careers pages. Follow each provider's Terms and the laws that apply to you, and use the data responsibly.

### Who built this

I build scrapers for my own projects and publish the ones that turn out genuinely useful. This is one of them. If you need a custom scraper, a data pipeline, or a change to this Actor, I'm available for freelance work.

GitHub: [github.com/bujhmml](https://github.com/bujhmml) · Site: [bujhmml.fun](https://bujhmml.fun)

# Actor input Schema

## `sources` (type: `array`):

Boards to scrape. Each item can be: an object `{ "ats": "greenhouse|lever|ashby", "company": "<slug>" }`, a pasted board URL (e.g. https://boards.greenhouse.io/stripe, https://jobs.lever.co/spotify, https://jobs.ashbyhq.com/ramp), or a bare company slug as a string (the ATS is auto-detected by probing all three platforms). The slug is the identifier in the careers URL: boards.greenhouse.io/<stripe>.

## `titleKeyword` (type: `string`):

Keep only jobs whose title contains this text (case-insensitive). Filtered-out rows are not billed.

## `locationKeyword` (type: `string`):

Keep only jobs whose location contains this text (case-insensitive). Filtered-out rows are not billed.

## `department` (type: `string`):

Keep only jobs whose department or team contains this text (case-insensitive). Filtered-out rows are not billed.

## `remoteOnly` (type: `boolean`):

Keep only jobs detected as fully remote. Filtered-out rows are not billed.

## `maxItems` (type: `integer`):

Stop after this many job rows across all boards. 0 = no limit.

## `impersonate` (type: `string`):

curl\_cffi impersonation target used for requests.

## Actor input object example

```json
{
  "sources": [
    {
      "ats": "greenhouse",
      "company": "stripe"
    },
    {
      "ats": "lever",
      "company": "spotify"
    },
    {
      "ats": "ashby",
      "company": "ramp"
    }
  ],
  "remoteOnly": false,
  "maxItems": 100,
  "impersonate": "chrome"
}
```
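Before starting a run, it can help to check an input object locally against the constraints documented above. The following is a sketch, not the platform's validator: `validate_input` is a hypothetical helper, and it only checks what this page states (required `sources` array, the three item forms, `maxItems` of at least 0, and the `impersonate` values listed in the OpenAPI specification).

```python
ALLOWED_ATS = {"greenhouse", "lever", "ashby"}
ALLOWED_IMPERSONATE = {"chrome", "chrome131", "chrome124", "safari17_0"}


def validate_input(actor_input):
    """Return a list of problems found in an input object (empty if none)."""
    errors = []

    sources = actor_input.get("sources")
    if not isinstance(sources, list):
        errors.append("sources is required and must be an array")
    else:
        for index, source in enumerate(sources):
            if isinstance(source, str):
                if not source.strip():
                    errors.append(f"sources[{index}] is an empty string")
            elif isinstance(source, dict):
                if source.get("ats") not in ALLOWED_ATS or not source.get("company"):
                    errors.append(f"sources[{index}] needs a valid ats and a company")
            else:
                errors.append(f"sources[{index}] must be an object or a string")

    max_items = actor_input.get("maxItems", 100)
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(max_items, bool) or not isinstance(max_items, int) or max_items < 0:
        errors.append("maxItems must be an integer >= 0")

    if actor_input.get("impersonate", "chrome") not in ALLOWED_IMPERSONATE:
        errors.append("impersonate must be one of " + ", ".join(sorted(ALLOWED_IMPERSONATE)))

    return errors


example = {
    "sources": [{"ats": "greenhouse", "company": "stripe"}, "ramp"],
    "remoteOnly": False,
    "maxItems": 100,
    "impersonate": "chrome",
}
print(validate_input(example))
print(validate_input({"sources": [{"ats": "workday", "company": "acme"}], "maxItems": -1}))
```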

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "sources": [
        {
            "ats": "greenhouse",
            "company": "stripe"
        },
        {
            "ats": "lever",
            "company": "spotify"
        },
        {
            "ats": "ashby",
            "company": "ramp"
        }
    ],
    "maxItems": 100
};

// Run the Actor and wait for it to finish
const run = await client.actor("bujhmml/ats-jobs-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "sources": [
        {
            "ats": "greenhouse",
            "company": "stripe",
        },
        {
            "ats": "lever",
            "company": "spotify",
        },
        {
            "ats": "ashby",
            "company": "ramp",
        },
    ],
    "maxItems": 100,
}

# Run the Actor and wait for it to finish
run = client.actor("bujhmml/ats-jobs-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "sources": [
    {
      "ats": "greenhouse",
      "company": "stripe"
    },
    {
      "ats": "lever",
      "company": "spotify"
    },
    {
      "ats": "ashby",
      "company": "ramp"
    }
  ],
  "maxItems": 100
}' |
apify call bujhmml/ats-jobs-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=bujhmml/ats-jobs-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Job Postings Scraper - Greenhouse, Lever & Ashby (Multi-ATS)",
        "description": "Scrape live job postings from any Greenhouse, Lever, or Ashby careers board by slug, URL, or auto-detect. Returns title, location, department, employment type, remote flag, salary, apply URL and dates. Built-in keyword, location, department and remote filters. HTTP-first, no auth, no anti-bot.",
        "version": "1.1",
        "x-build-id": "DCr6fSU6Ei4URnTlJ"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/bujhmml~ats-jobs-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-bujhmml-ats-jobs-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/bujhmml~ats-jobs-scraper/runs": {
            "post": {
                "operationId": "runs-sync-bujhmml-ats-jobs-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/bujhmml~ats-jobs-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-bujhmml-ats-jobs-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "required": [
                    "sources"
                ],
                "properties": {
                    "sources": {
                        "title": "Company boards",
                        "type": "array",
                        "description": "Boards to scrape. Each item can be: an object `{ \"ats\": \"greenhouse|lever|ashby\", \"company\": \"<slug>\" }`, a pasted board URL (e.g. https://boards.greenhouse.io/stripe, https://jobs.lever.co/spotify, https://jobs.ashbyhq.com/ramp), or a bare company slug as a string (the ATS is auto-detected by probing all three platforms). The slug is the identifier in the careers URL: boards.greenhouse.io/<stripe>."
                    },
                    "titleKeyword": {
                        "title": "Title contains",
                        "type": "string",
                        "description": "Keep only jobs whose title contains this text (case-insensitive). Filtered-out rows are not billed."
                    },
                    "locationKeyword": {
                        "title": "Location contains",
                        "type": "string",
                        "description": "Keep only jobs whose location contains this text (case-insensitive). Filtered-out rows are not billed."
                    },
                    "department": {
                        "title": "Department / team contains",
                        "type": "string",
                        "description": "Keep only jobs whose department or team contains this text (case-insensitive). Filtered-out rows are not billed."
                    },
                    "remoteOnly": {
                        "title": "Remote only",
                        "type": "boolean",
                        "description": "Keep only jobs detected as fully remote. Filtered-out rows are not billed.",
                        "default": false
                    },
                    "maxItems": {
                        "title": "Max results",
                        "minimum": 0,
                        "type": "integer",
                        "description": "Stop after this many job rows across all boards. 0 = no limit.",
                        "default": 100
                    },
                    "impersonate": {
                        "title": "Browser TLS fingerprint",
                        "enum": [
                            "chrome",
                            "chrome131",
                            "chrome124",
                            "safari17_0"
                        ],
                        "type": "string",
                        "description": "curl_cffi impersonation target used for requests.",
                        "default": "chrome"
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
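For projects that call the REST API directly, the `run-sync-get-dataset-items` path from the specification above can be wrapped in a small request builder. This sketch uses only the Python standard library and builds the request without sending it. `build_request` is a hypothetical helper, and the token is passed as the `token` query parameter as the specification describes.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

API_BASE = "https://api.apify.com/v2"
ACTOR_PATH = "/acts/bujhmml~ats-jobs-scraper/run-sync-get-dataset-items"


def build_request(token, actor_input):
    """Build (but do not send) the POST request that runs the Actor synchronously."""
    url = f"{API_BASE}{ACTOR_PATH}?{urlencode({'token': token})}"
    body = json.dumps(actor_input).encode("utf-8")
    return Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_request(
    "<YOUR_API_TOKEN>",
    {"sources": [{"ats": "lever", "company": "spotify"}], "maxItems": 100},
)
print(request.get_method(), request.full_url)

# To actually run the Actor, send the request and decode the JSON response:
#   from urllib.request import urlopen
#   with urlopen(request) as response:
#       items = json.load(response)
```

A synchronous call holds the HTTP connection open until the run finishes, so for long runs the `/runs` path together with polling may be a better fit.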
