# Career Site Job Listing API (`shahidirfan/career-site-job-listing-api`) Actor

Extract job listings from major ATS career platforms using one or more URLs in startUrls. Collect clean, structured, and rich job records for sourcing, research, monitoring, and automation workflows.

- **URL**: https://apify.com/shahidirfan/career-site-job-listing-api.md
- **Developed by:** [Shahid Irfan](https://apify.com/shahidirfan) (community)
- **Categories:** Jobs, Automation, Developer tools
- **Stats:** 2 total users, 1 monthly user, 100.0% of runs succeeded, bookmark count unavailable
- **User rating**: No ratings yet

## Pricing

Pay per usage

This Actor is paid per platform usage. The Actor itself is free to use; you pay only for Apify platform usage, which gets cheaper on higher subscription plans.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-usage

## What's an Apify Actor?

Actors are software tools running on the Apify platform, built for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action that can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## Career Site Job Listing Scraper

Extract job listings from major ATS career platforms using one or more URLs in `startUrls`. Collect clean, structured, and rich job records for sourcing, research, monitoring, and automation workflows.

### Features

- **Multi-platform coverage** — One actor supports Lever, Greenhouse, Ashby, SmartRecruiters, Workable, Recruitee, BreezyHR, BambooHR, Workday, TeamTailor, Personio, JazzHR, iCIMS, and Taleo.
- **Automatic platform detection** — Provide a career page URL and the actor routes to the right extraction flow.
- **Rich output fields** — Includes job ID, title, company, location parts, team/department, job type, dates, links, and platform metadata when available.
- **Clean dataset quality** — Removes duplicate records and skips null/empty values for cleaner downstream use.
- **Flexible configuration** — Supports result count and paging caps.

### Use Cases

#### Sourcing and Recruiting
Build company-specific job datasets directly from employer career systems for recruiter outreach, market mapping, and role tracking.

#### Competitive Hiring Intelligence
Track hiring activity across competitors by platform, location, and role family without manually checking each career site.

#### Job Alert Automation
Schedule runs and push fresh roles into your CRM, Airtable, Slack, email digests, or webhooks.

#### Talent Market Research
Analyze demand trends by title, location, function, and employment type using normalized multi-platform output.

---

### Input Parameters

| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| `startUrls` | Array<String> | Yes | — | One or more career board URLs for supported ATS platforms. |
| `results_wanted` | Integer | No | `20` | Maximum jobs to save. |
| `max_pages` | Integer | No | `5` | Page/offset safety cap for paginated sources. |
| `allow_html_detail_fallback` | Boolean | No | `false` | Optional detail-page enrichment for BreezyHR, iCIMS, and Taleo when API/feed descriptions are missing. |
| `proxyConfiguration` | Object | No | — | Optional Apify proxy settings. |
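
The parameters above can be assembled and sanity-checked before a run. The following is a minimal sketch, assuming the lower bound of 1 that the input schema declares for `results_wanted` and `max_pages`. `build_actor_input` is a hypothetical helper written for this README, not part of any Apify library.

```python
from __future__ import annotations

from typing import Any


def build_actor_input(
    start_urls: list[str],
    results_wanted: int = 20,
    max_pages: int = 5,
    allow_html_detail_fallback: bool = False,
) -> dict[str, Any]:
    """Build an input object for the Actor (hypothetical helper)."""
    if not start_urls:
        raise ValueError("startUrls needs at least one career board URL")
    for url in start_urls:
        # Only a basic shape check; it does not verify the platform is supported.
        if not url.startswith(("http://", "https://")):
            raise ValueError(f"not an absolute URL: {url!r}")
    if results_wanted < 1 or max_pages < 1:
        raise ValueError("results_wanted and max_pages must be >= 1")
    return {
        "startUrls": list(start_urls),
        "results_wanted": results_wanted,
        "max_pages": max_pages,
        "allow_html_detail_fallback": allow_html_detail_fallback,
    }
```

The returned dictionary has the same shape as the JSON inputs shown in this README, so it can be passed as the run input to whichever client you use.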

---

### Supported Job Boards

| Platform | Typical URL Format | Example |
|---|---|---|
| Lever | `https://jobs.lever.co/{company}` | `https://jobs.lever.co/spotify` |
| Greenhouse | `https://job-boards.greenhouse.io/{company}` | `https://job-boards.greenhouse.io/airbnb` |
| Ashby | `https://jobs.ashbyhq.com/{company}` | `https://jobs.ashbyhq.com/vercel` |
| SmartRecruiters | `https://careers.smartrecruiters.com/{company}` | `https://careers.smartrecruiters.com/Visa` |
| Workable | `https://apply.workable.com/{company}/` | `https://apply.workable.com/evidence-action/` |
| TeamTailor | `https://{company}.teamtailor.com/` or `https://careers.{company}.com/` | `https://careers.kognity.com/` |
| BreezyHR | `https://{company}.breezy.hr/` | `https://breezy-hr.breezy.hr/` |
| BambooHR | `https://{company}.bamboohr.com/careers/` | `https://bamboohr.bamboohr.com/careers/` |
| Recruitee | `https://{company}.recruitee.com/` | `https://recruitee.recruitee.com/` |
| Workday | `https://{company}.wdN.myworkdayjobs.com/{board}` | `https://sony.wd1.myworkdayjobs.com/SonyCareers` |
| Personio | `https://{company}.jobs.personio.de/` | `https://company.jobs.personio.de/` |
| JazzHR | `https://{company}.jazz.co/` | `https://example.jazz.co/` |
| iCIMS | `https://careers-{company}.icims.com/` | `https://careers-kloveair1.icims.com/jobs/search?ss=1` |
| Taleo | `https://{company}.taleo.net/careersection/{section}/jobsearch.ftl` | `https://nato.taleo.net/careersection/2/jobsearch.ftl?lang=en` |
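
To pre-check a list of URLs against the formats in this table, a hostname-suffix lookup is usually enough. This is only an illustrative sketch and not the Actor's actual detection logic. `guess_platform` is a hypothetical helper, and apart from `lever` (which appears in the sample output) the platform keys are assumptions.

```python
from __future__ import annotations

from urllib.parse import urlparse

# Hostname suffix -> platform key. Only "lever" is confirmed by the sample
# output in this README; the remaining keys are illustrative guesses.
_HOST_SUFFIXES = {
    "jobs.lever.co": "lever",
    "greenhouse.io": "greenhouse",
    "jobs.ashbyhq.com": "ashby",
    "smartrecruiters.com": "smartrecruiters",
    "apply.workable.com": "workable",
    "teamtailor.com": "teamtailor",
    "breezy.hr": "breezyhr",
    "bamboohr.com": "bamboohr",
    "recruitee.com": "recruitee",
    "myworkdayjobs.com": "workday",
    "jobs.personio.de": "personio",
    "jazz.co": "jazzhr",
    "icims.com": "icims",
    "taleo.net": "taleo",
}


def guess_platform(url: str) -> str | None:
    """Return a platform key for a known ATS hostname, else None."""
    host = (urlparse(url).hostname or "").lower()
    for suffix, platform in _HOST_SUFFIXES.items():
        if host == suffix or host.endswith("." + suffix):
            return platform
    # Custom domains (for example a TeamTailor board on careers.{company}.com)
    # cannot be recognized from the hostname alone.
    return None
```

A `None` result does not mean the Actor will reject the URL; it only means this simple lookup cannot classify it.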

---

### Find Company Boards with Google

If you want to find companies for a single platform quickly, use these search commands:

| Platform | Google Search Command |
|---|---|
| Lever | `site:jobs.lever.co "Job Title"` |
| Greenhouse | `site:boards.greenhouse.io "Job Title"` |
| Workday | `site:myworkdayjobs.com "Job Title"` |
| Ashby | `site:jobs.ashbyhq.com "Job Title"` |
| SmartRecruiters | `site:smartrecruiters.com "Job Title"` |
| Workable | `site:apply.workable.com "Job Title"` |
| TeamTailor | `site:teamtailor.com "Job Title"` |
| BreezyHR | `site:breezy.hr "Job Title"` |
| BambooHR | `site:bamboohr.com/careers "Job Title"` |
| Recruitee | `site:recruitee.com "Job Title"` |
| Personio | `site:jobs.personio.de "Job Title"` |
| JazzHR | `site:jazz.co "Job Title"` |
| iCIMS | `site:icims.com/jobs/search "Job Title"` |
| Taleo | `site:taleo.net/careersection "Job Title"` |
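
If you build these searches for many titles, the query strings can be generated. A small sketch using only the Python standard library; `google_query` and `google_search_url` are hypothetical helper names, and the search URL form is an assumption about Google's public search page rather than a documented API.

```python
from urllib.parse import quote_plus


def google_query(site: str, job_title: str) -> str:
    """Build a query in the same form as the table above."""
    return f'site:{site} "{job_title}"'


def google_search_url(site: str, job_title: str) -> str:
    """Turn the query into a URL you can open in a browser."""
    return "https://www.google.com/search?q=" + quote_plus(google_query(site, job_title))
```

For example, `google_query("jobs.lever.co", "Data Engineer")` reproduces the Lever row with a concrete title.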

---

### Output Data

Each dataset item can include:

| Field | Type | Description |
|---|---|---|
| `job_id` | String | Platform job identifier when available. |
| `title` | String | Job title. |
| `company` | String | Company or tenant name. |
| `location` | String | Human-readable location string. |
| `city` | String | City when available. |
| `state` | String | State/region when available. |
| `country` | String | Country when available. |
| `department` | String | Department or function. |
| `team` | String | Team name when provided. |
| `job_type` | String | Employment type. |
| `workplace_type` | String | On-site/remote/hybrid style when provided. |
| `date_posted` | String | Posting date (`YYYY-MM-DD`) when available. |
| `updated_at` | String | Updated date when available. |
| `url` | String | Public job URL. |
| `apply_url` | String | Apply URL. |
| `description` | String | Job description text when available. |
| `remote` | Boolean/String | Remote marker where provided by source. |
| `platform` | String | Detected ATS platform key. |
| `source_host` | String | Source hostname derived from `url`. |
| `scraped_at` | String | Extraction timestamp in ISO format. |

---

### Usage Examples

#### Single Board (Lever)

```json
{
    "startUrls": ["https://jobs.lever.co/spotify"],
    "results_wanted": 20
}
````

#### Multi-board Run

```json
{
    "startUrls": [
        "https://jobs.lever.co/spotify",
        "https://job-boards.greenhouse.io/iherb",
        "https://careersen-hrrh.icims.com/jobs/search?ss=1"
    ],
    "results_wanted": 30,
    "max_pages": 8,
    "allow_html_detail_fallback": true
}
```

***

### Sample Output

```json
{
    "job_id": "58860a10-4a0d-4a21-a495-1f3605b300c1",
    "title": "Backend Engineer - User Platform",
    "company": "spotify",
    "location": "Toronto",
    "department": "Engineering",
    "workplace_type": "remote",
    "date_posted": "2026-01-16",
    "url": "https://jobs.lever.co/spotify/58860a10-4a0d-4a21-a495-1f3605b300c1",
    "apply_url": "https://jobs.lever.co/spotify/58860a10-4a0d-4a21-a495-1f3605b300c1/apply",
    "description": "User Platform is responsible for...",
    "platform": "lever",
    "source_host": "jobs.lever.co",
    "scraped_at": "2026-05-23T10:12:34.567Z"
}
```
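
Because empty values are skipped, downstream code should not assume every field is present. The sketch below reads a dataset item into a typed record and treats every field as optional. `JobRecord` and `parse_item` are hypothetical names, and the date handling assumes the `YYYY-MM-DD` and ISO timestamp formats shown in the sample above.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import date, datetime
from typing import Any, Optional


@dataclass
class JobRecord:
    title: Optional[str]
    company: Optional[str]
    url: Optional[str]
    platform: Optional[str]
    date_posted: Optional[date]
    scraped_at: Optional[datetime]


def parse_item(item: dict[str, Any]) -> JobRecord:
    """Convert one dataset item into a JobRecord, tolerating missing keys."""
    posted = item.get("date_posted")
    scraped = item.get("scraped_at")
    return JobRecord(
        title=item.get("title"),
        company=item.get("company"),
        url=item.get("url"),
        platform=item.get("platform"),
        date_posted=date.fromisoformat(posted) if posted else None,
        # Swap the trailing "Z" for an explicit offset so that
        # datetime.fromisoformat also accepts it on Python versions before 3.11.
        scraped_at=(
            datetime.fromisoformat(scraped.replace("Z", "+00:00")) if scraped else None
        ),
    )
```

With both dates parsed, values such as the age of a posting at scrape time can be computed by subtracting `date_posted` from `scraped_at.date()`.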

***

### Tips for Best Results

#### Use canonical career board URLs

- Start from the platform-native board URL (for example `jobs.lever.co/{company}`).
- Avoid redirect-heavy marketing pages when possible.

#### Start with small runs first

- Use `results_wanted: 20` for quick validation.
- Increase limits after confirming data quality.

#### Handle platform variability

- Some example company slugs on public lists may be outdated or renamed.
- If a board returns zero results, test another company URL for that platform.

***

### Integrations

Connect dataset output with:

- **Google Sheets** — Operational tracking and reporting
- **Airtable** — Searchable talent opportunity database
- **Slack** — Job alert channels
- **Webhooks** — Push records into custom pipelines
- **Make** — No-code automation flows
- **Zapier** — Trigger downstream actions

#### Export Formats

- **JSON** — API and application use
- **CSV** — Spreadsheet workflows
- **Excel** — Business sharing
- **XML** — Legacy integrations
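
These formats can also be requested directly from the dataset items endpoint of the Apify API by setting its `format` query parameter. A minimal sketch, assuming the `https://api.apify.com/v2` base URL used elsewhere in this document; `dataset_export_url` is a hypothetical helper, and authentication (your API token) has to be supplied separately when you fetch the URL.

```python
from urllib.parse import urlencode

# Export format names from the list above -> value of the `format` parameter.
_FORMATS = {"JSON": "json", "CSV": "csv", "Excel": "xlsx", "XML": "xml"}


def dataset_export_url(dataset_id: str, export_format: str) -> str:
    """Build a download URL for a run's dataset in the given format."""
    if export_format not in _FORMATS:
        raise ValueError(f"unsupported export format: {export_format!r}")
    query = urlencode({"format": _FORMATS[export_format]})
    return f"https://api.apify.com/v2/datasets/{dataset_id}/items?{query}"
```

The dataset ID is the `defaultDatasetId` of a finished run, as used in the client examples in the API section.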

***

### Frequently Asked Questions

#### Why do some boards return zero jobs?

Some public examples go stale, change slugs, or disable public listings. Use the Google search commands above to find fresh company board URLs.

#### Does one input work for all platforms?

Yes. Use `startUrls` and the Actor auto-detects the platform for each board URL.

#### Are duplicate records removed?

Yes. The Actor deduplicates by platform, ID/link, title, and company before saving.
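
If you merge datasets from several runs yourself, the same idea can be applied downstream. This is an illustrative sketch of key-based deduplication, not the Actor's internal code. It assumes that "ID/link" means the job ID when present and the URL otherwise; `dedup_key` and `deduplicate` are hypothetical names.

```python
from __future__ import annotations

from typing import Any, Iterable


def dedup_key(item: dict[str, Any]) -> tuple:
    """Key built from platform, ID (or link), title, and company."""
    return (
        item.get("platform"),
        item.get("job_id") or item.get("url"),
        item.get("title"),
        item.get("company"),
    )


def deduplicate(items: Iterable[dict[str, Any]]) -> list[dict[str, Any]]:
    """Keep the first occurrence of each key, preserving input order."""
    seen: set[tuple] = set()
    unique: list[dict[str, Any]] = []
    for item in items:
        key = dedup_key(item)
        if key not in seen:
            seen.add(key)
            unique.append(item)
    return unique
```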

#### Can I run this on a schedule?

Yes. Schedule runs in Apify and consume only the latest dataset items.
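
To act only on jobs that were not seen in earlier scheduled runs, keep a set of known job URLs between runs and filter each new batch against it. A minimal sketch under that assumption; `select_new_jobs` is a hypothetical helper, and where you persist `seen_urls` (a file, a database, a key-value store) is up to you.

```python
from __future__ import annotations

from typing import Any, Iterable


def select_new_jobs(
    items: Iterable[dict[str, Any]], seen_urls: set[str]
) -> list[dict[str, Any]]:
    """Return items whose `url` is not yet in seen_urls, and record them.

    Items without a `url` are skipped because they cannot be tracked.
    """
    fresh: list[dict[str, Any]] = []
    for item in items:
        url = item.get("url")
        if url and url not in seen_urls:
            seen_urls.add(url)
            fresh.append(item)
    return fresh
```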

***

### Support

For issues or feature requests, open the Actor's issue thread in Apify Console.

#### Resources

- [Apify Documentation](https://docs.apify.com/)
- [Apify API Reference](https://docs.apify.com/api/v2)
- [Apify Scheduling](https://docs.apify.com/platform/schedules)

***

### Legal Notice

Use this Actor only for lawful data collection and in compliance with platform terms and applicable regulations.

# Actor input Schema

## `startUrls` (type: `array`):

Add one or more supported career board URLs.

## `results_wanted` (type: `integer`):

Maximum number of job listings to collect.

## `max_pages` (type: `integer`):

Safety cap on number of API pages to fetch.

## `allow_html_detail_fallback` (type: `boolean`):

When enabled, fetches job detail pages only for boards that may miss API description fields.

## `proxyConfiguration` (type: `object`):

Optional proxy settings.

## Actor input object example

```json
{
  "startUrls": [
    "https://jobs.lever.co/spotify"
  ],
  "results_wanted": 20,
  "max_pages": 5,
  "allow_html_detail_fallback": false,
  "proxyConfiguration": {
    "useApifyProxy": false
  }
}
```

# Actor output Schema

## `overview` (type: `string`):

No description

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "startUrls": [
        "https://jobs.lever.co/spotify"
    ],
    "results_wanted": 20,
    "max_pages": 5,
    "allow_html_detail_fallback": false
};

// Run the Actor and wait for it to finish
const run = await client.actor("shahidirfan/career-site-job-listing-api").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "startUrls": ["https://jobs.lever.co/spotify"],
    "results_wanted": 20,
    "max_pages": 5,
    "allow_html_detail_fallback": False,
}

# Run the Actor and wait for it to finish
run = client.actor("shahidirfan/career-site-job-listing-api").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "startUrls": [
    "https://jobs.lever.co/spotify"
  ],
  "results_wanted": 20,
  "max_pages": 5,
  "allow_html_detail_fallback": false
}' |
apify call shahidirfan/career-site-job-listing-api --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=shahidirfan/career-site-job-listing-api",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Career Site Job Listing API",
        "description": "Extract job listings from major ATS career platforms using one or more URLs in startUrls. Collect clean, structured, and rich job records for sourcing, research, monitoring, and automation workflows.",
        "version": "0.0",
        "x-build-id": "B0TfrRvT1TeieCO4h"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/shahidirfan~career-site-job-listing-api/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-shahidirfan-career-site-job-listing-api",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/shahidirfan~career-site-job-listing-api/runs": {
            "post": {
                "operationId": "runs-sync-shahidirfan-career-site-job-listing-api",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/shahidirfan~career-site-job-listing-api/run-sync": {
            "post": {
                "operationId": "run-sync-shahidirfan-career-site-job-listing-api",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "properties": {
                    "startUrls": {
                        "title": "Job Board URLs",
                        "type": "array",
                        "description": "Add one or more supported career board URLs.",
                        "items": {
                            "type": "string"
                        }
                    },
                    "results_wanted": {
                        "title": "Maximum number of jobs",
                        "minimum": 1,
                        "type": "integer",
                        "description": "Maximum number of job listings to collect.",
                        "default": 20
                    },
                    "max_pages": {
                        "title": "Maximum pages",
                        "minimum": 1,
                        "type": "integer",
                        "description": "Safety cap on number of API pages to fetch.",
                        "default": 5
                    },
                    "allow_html_detail_fallback": {
                        "title": "Allow HTML detail fallback (BreezyHR/iCIMS/Taleo only)",
                        "type": "boolean",
                        "description": "When enabled, fetches job detail pages only for boards that may miss API description fields.",
                        "default": false
                    },
                    "proxyConfiguration": {
                        "title": "Proxy configuration",
                        "type": "object",
                        "description": "Optional proxy settings.",
                        "default": {
                            "useApifyProxy": false
                        }
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
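
For projects that cannot use the client libraries, the `run-sync-get-dataset-items` path from the specification above can be called with the Python standard library alone. This is a sketch under stated assumptions: `build_run_request` and `run_and_fetch` are hypothetical helper names, the 300-second timeout is an illustrative default, and the response is assumed to be a JSON array of dataset items.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

API_BASE = "https://api.apify.com/v2"
ACTOR_PATH = "/acts/shahidirfan~career-site-job-listing-api/run-sync-get-dataset-items"


def build_run_request(token: str, actor_input: dict) -> Request:
    """Prepare the POST request; the token goes in the query string as in the spec."""
    url = f"{API_BASE}{ACTOR_PATH}?{urlencode({'token': token})}"
    body = json.dumps(actor_input).encode("utf-8")
    return Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_and_fetch(token: str, actor_input: dict, timeout: float = 300.0) -> list:
    """Run the Actor synchronously and return the parsed response body."""
    with urlopen(build_run_request(token, actor_input), timeout=timeout) as resp:
        return json.load(resp)
```

Keeping request construction separate from the network call makes the URL and payload easy to inspect or unit-test without spending platform usage.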
