# Company Job Listings Scraper (`careybrown/company-jobs-scraper`) Actor

Scrape public Greenhouse company job boards for normalized job listing and hiring-signal records.

- **URL**: https://apify.com/careybrown/company-jobs-scraper.md
- **Developed by:** [Carey Brown](https://apify.com/careybrown) (community)
- **Categories:** Jobs, Lead generation
- **Stats:** 2 total users, 1 monthly user, 100.0% runs succeeded
- **User rating**: No ratings yet

## Pricing

Pay per usage

This Actor is paid per platform usage. The Actor is free to use, and you only pay for the Apify platform usage, which gets cheaper the higher subscription plan you have.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-usage

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with a capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## Company Job Listings Scraper

Company Job Listings Scraper collects public job listings from Greenhouse-hosted company career boards and turns them into normalized hiring-signal records.

Use it to monitor target companies, compare open roles, spot department-level hiring activity, build sales-trigger lists, support recruiting intelligence, or feed market-research dashboards without manually checking each company career page.

The current public version supports Greenhouse public job boards only. It does not scrape LinkedIn, Indeed, Upwork, Lever, Ashby, candidate profiles, emails, phone numbers, outreach systems, or CRM records.

### What It Returns

Each job result includes:

- `companyName`
- `provider`
- `boardToken`
- `jobId`
- `jobTitle`
- `location`
- `department`
- `absoluteUrl`
- `updatedAt`
- `roleFamily`
- `seniority`
- `remoteSignal`
- `hiringIntentScore`
- `sourceUrl`
- `sourceHealth`

The Actor also writes a `RUN_SUMMARY` record with fetched and returned counts plus source-health details for each company board.

### Input

Provide one or more Greenhouse company board tokens.

```json
{
  "companies": [
    {
      "name": "Stripe",
      "provider": "greenhouse",
      "boardToken": "stripe"
    },
    {
      "name": "Figma",
      "provider": "greenhouse",
      "boardToken": "figma"
    }
  ],
  "maxJobsPerCompany": 25,
  "includeDescriptions": false,
  "includeRawRecord": false
}
````

`boardToken` is the public Greenhouse board slug used in URLs such as:

```text
https://boards-api.greenhouse.io/v1/boards/stripe/jobs
```
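As a minimal sketch, the URL above can be derived from a board token as follows. The helper name `greenhouse_jobs_url` is hypothetical, and the URL pattern is taken only from the example shown here, not from the Actor's source code.

```python
from urllib.parse import quote

GREENHOUSE_API_BASE = "https://boards-api.greenhouse.io/v1/boards"


def greenhouse_jobs_url(board_token: str) -> str:
    """Build the public jobs URL for a Greenhouse board token (hypothetical helper)."""
    token = board_token.strip()
    if not token:
        raise ValueError("board_token must not be empty")
    # Percent-encode the token so unexpected characters cannot alter the path.
    return f"{GREENHOUSE_API_BASE}/{quote(token, safe='')}/jobs"
```

Calling `greenhouse_jobs_url("stripe")` reproduces the example URL, which can be a quick way to check a token in a browser before adding it to `companies`.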

### Best-Fit Use Cases

- Sales teams watching account hiring activity
- Recruiting intelligence teams monitoring target companies
- Agencies building company hiring dashboards
- Market researchers tracking role mix and expansion signals
- RevOps and data teams creating repeatable company-job feeds

### Filters

Use the keyword filters and output options below to narrow or shape the returned jobs:

- `roleKeywords`
- `locationKeywords`
- `maxJobsPerCompany`
- `includeDescriptions`
- `includeRawRecord`
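The README does not spell out how keywords are matched. The sketch below assumes case-insensitive substring matching where any keyword may match, and applies it client-side to records that have already been fetched. The function names are hypothetical, and the Actor's own matching rules may differ.

```python
def matches_keywords(text, keywords):
    """Return True when no keywords are given or any keyword occurs in text (case-insensitive)."""
    if not keywords:
        return True
    haystack = (text or "").lower()
    return any(keyword.lower() in haystack for keyword in keywords)


def filter_jobs(jobs, role_keywords=None, location_keywords=None):
    """Keep records that pass both the role filter and the location filter."""
    kept = []
    for job in jobs:
        # Assumption: role keywords are checked against title and department only.
        role_text = f"{job.get('jobTitle') or ''} {job.get('department') or ''}"
        if matches_keywords(role_text, role_keywords) and matches_keywords(
            job.get("location"), location_keywords
        ):
            kept.append(job)
    return kept
```

A helper like this can be useful for re-slicing one broad run in several ways without running the Actor again.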

### Claim Boundaries

This Actor:

- Scrapes public Greenhouse-hosted company job boards.
- Returns normalized company job listing records.
- Includes deterministic role family, seniority, remote signal, hiring intent score, source URL, and source health.
- Supports company board token inputs.
- Supports role and location keyword filters.
- Supports capped results per company board.

This Actor does not:

- Support all ATS providers.
- Scrape LinkedIn, Indeed, Upwork, Lever, or Ashby.
- Provide complete company hiring coverage.
- Verify buyer intent.
- Enrich employee, candidate, email, phone, or contact data.
- Send outreach or sync records to a CRM.
- Provide legal, HR, recruiting, or compliance advice.
- Guarantee revenue, hiring accuracy, or lead quality.

### Local Development

```bash
python3 scripts/run_actor_local.py --input examples/input_default.json --storage-dir .local-storage
```

This writes Apify-style local storage under `.local-storage/datasets/default` and `.local-storage/key_value_stores/default`.
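To inspect a local run, the dataset directory can be read back with a few lines of Python. This is a sketch that assumes the usual Apify local-storage layout of one JSON file per dataset item (for example `000000001.json`); the exact files written by `run_actor_local.py` are not documented here, and `load_local_dataset` is a hypothetical helper.

```python
import json
from pathlib import Path


def load_local_dataset(storage_dir=".local-storage"):
    """Read every JSON item file from the default local dataset, in filename order."""
    dataset_dir = Path(storage_dir) / "datasets" / "default"
    items = []
    for path in sorted(dataset_dir.glob("*.json")):
        # Skip bookkeeping files such as __metadata__.json, if any are present.
        if path.name.startswith("__"):
            continue
        with path.open(encoding="utf-8") as handle:
            items.append(json.load(handle))
    return items
```

After a local run, `len(load_local_dataset())` gives a quick count of the records that were written.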

# Actor input Schema

## `companies` (type: `array`):

Public Greenhouse company job boards to scrape. Use the board token from the public board URL/API.

## `roleKeywords` (type: `array`):

Optional title, department, or description keyword filters.

## `locationKeywords` (type: `array`):

Optional location keyword filters, such as remote, New York, London, or San Francisco.

## `maxJobsPerCompany` (type: `integer`):

Maximum normalized jobs to return for each company board.

## `includeDescriptions` (type: `boolean`):

Fetch and include job description text when the source provides it.

## `includeRawRecord` (type: `boolean`):

Include the original source job object on each output record.

## Actor input object example

```json
{
  "companies": [
    {
      "name": "Stripe",
      "provider": "greenhouse",
      "boardToken": "stripe"
    },
    {
      "name": "Databricks",
      "provider": "greenhouse",
      "boardToken": "databricks"
    },
    {
      "name": "Figma",
      "provider": "greenhouse",
      "boardToken": "figma"
    }
  ],
  "maxJobsPerCompany": 100,
  "includeDescriptions": false,
  "includeRawRecord": false
}
```
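An input object like the one above can also be assembled in code. The following sketch uses a hypothetical `build_input` helper and checks only the constraints stated in the OpenAPI specification further down: `maxJobsPerCompany` between 1 and 1000, and `greenhouse` as the only allowed `provider`.

```python
def build_input(companies, max_jobs_per_company=100, role_keywords=None, location_keywords=None):
    """Assemble an Actor input dict from (name, board_token) pairs."""
    if not 1 <= max_jobs_per_company <= 1000:
        raise ValueError("maxJobsPerCompany must be between 1 and 1000")
    normalized = []
    for name, board_token in companies:
        if not name or not board_token:
            raise ValueError("each company needs a name and a boardToken")
        # "greenhouse" is the only provider value the schema's enum allows.
        normalized.append({"name": name, "provider": "greenhouse", "boardToken": board_token})
    actor_input = {
        "companies": normalized,
        "maxJobsPerCompany": max_jobs_per_company,
        "includeDescriptions": False,
        "includeRawRecord": False,
    }
    # Optional filters are added only when provided, mirroring the example above.
    if role_keywords:
        actor_input["roleKeywords"] = list(role_keywords)
    if location_keywords:
        actor_input["locationKeywords"] = list(location_keywords)
    return actor_input
```

For example, `build_input([("Stripe", "stripe"), ("Figma", "figma")], max_jobs_per_company=25)` yields a dict that can be passed as the run input in the API examples.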

# Actor output Schema

## `results` (type: `string`):

Normalized company job listings written to the default dataset.

## `runSummary` (type: `string`):

Source health, fetched counts, returned counts, and aggregate run details.
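Once the dataset records are loaded, a department-level view of hiring activity takes only a small aggregation. The sketch below relies on the `companyName` and `department` fields listed under "What It Returns"; `summarize_hiring` is a hypothetical helper, not part of the Actor.

```python
from collections import Counter


def summarize_hiring(records):
    """Count open roles per (companyName, department) pair."""
    counts = Counter()
    for record in records:
        # Missing or empty values are grouped under "unknown" rather than dropped.
        company = record.get("companyName") or "unknown"
        department = record.get("department") or "unknown"
        counts[(company, department)] += 1
    return counts
```

`summarize_hiring(items).most_common(10)` then lists the ten busiest company and department pairs in a run.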

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {};

// Run the Actor and wait for it to finish
const run = await client.actor("careybrown/company-jobs-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {}

# Run the Actor and wait for it to finish
run = client.actor("careybrown/company-jobs-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```
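The example above sends an empty input, so the run falls back to the schema defaults. A variant with explicit input might look like the sketch below. `run_scraper` is a hypothetical wrapper that takes an already-constructed `ApifyClient` as its `client` argument and uses only the client calls shown above.

```python
def run_scraper(client, companies, max_jobs_per_company=25):
    """Run the Actor with explicit input and return its dataset items as a list."""
    run_input = {
        "companies": companies,
        "maxJobsPerCompany": max_jobs_per_company,
        "includeDescriptions": False,
        "includeRawRecord": False,
    }
    # call() starts the run and waits for it to finish.
    run = client.actor("careybrown/company-jobs-scraper").call(run_input=run_input)
    if run is None:
        raise RuntimeError("Actor run did not return run details")
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())
```

With the client from the example above, a call could be `run_scraper(client, [{"name": "Stripe", "provider": "greenhouse", "boardToken": "stripe"}])`.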

## CLI example

```bash
echo '{}' |
apify call careybrown/company-jobs-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=careybrown/company-jobs-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Company Job Listings Scraper",
        "description": "Scrape public Greenhouse company job boards for normalized job listing and hiring-signal records.",
        "version": "0.1",
        "x-build-id": "IFWzPRux6uA36VVc3"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/careybrown~company-jobs-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-careybrown-company-jobs-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/careybrown~company-jobs-scraper/runs": {
            "post": {
                "operationId": "runs-sync-careybrown-company-jobs-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/careybrown~company-jobs-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-careybrown-company-jobs-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "properties": {
                    "companies": {
                        "title": "Companies",
                        "type": "array",
                        "description": "Public Greenhouse company job boards to scrape. Use the board token from the public board URL/API.",
                        "items": {
                            "type": "object",
                            "required": [
                                "name",
                                "provider",
                                "boardToken"
                            ],
                            "properties": {
                                "name": {
                                    "title": "Company Name",
                                    "description": "Readable company name to include in output records.",
                                    "type": "string"
                                },
                                "provider": {
                                    "title": "Provider",
                                    "description": "Job-board provider. MVP supports Greenhouse only.",
                                    "type": "string",
                                    "enum": [
                                        "greenhouse"
                                    ]
                                },
                                "boardToken": {
                                    "title": "Board Token",
                                    "description": "Greenhouse board token used in the public jobs API URL.",
                                    "type": "string"
                                }
                            }
                        },
                        "default": [
                            {
                                "name": "Stripe",
                                "provider": "greenhouse",
                                "boardToken": "stripe"
                            },
                            {
                                "name": "Databricks",
                                "provider": "greenhouse",
                                "boardToken": "databricks"
                            },
                            {
                                "name": "Figma",
                                "provider": "greenhouse",
                                "boardToken": "figma"
                            }
                        ]
                    },
                    "roleKeywords": {
                        "title": "Role Keywords",
                        "type": "array",
                        "description": "Optional title, department, or description keyword filters.",
                        "items": {
                            "type": "string"
                        }
                    },
                    "locationKeywords": {
                        "title": "Location Keywords",
                        "type": "array",
                        "description": "Optional location keyword filters, such as remote, New York, London, or San Francisco.",
                        "items": {
                            "type": "string"
                        }
                    },
                    "maxJobsPerCompany": {
                        "title": "Maximum Jobs Per Company",
                        "minimum": 1,
                        "maximum": 1000,
                        "type": "integer",
                        "description": "Maximum normalized jobs to return for each company board.",
                        "default": 100
                    },
                    "includeDescriptions": {
                        "title": "Include Descriptions",
                        "type": "boolean",
                        "description": "Fetch and include job description text when the source provides it.",
                        "default": false
                    },
                    "includeRawRecord": {
                        "title": "Include Raw Record",
                        "type": "boolean",
                        "description": "Include the original source job object on each output record.",
                        "default": false
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
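For projects that call the REST API directly, a request for the `run-sync-get-dataset-items` path from the specification above can be prepared with the Python standard library. This is a sketch: `build_run_sync_request` is a hypothetical helper, and it only builds the request without sending it.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

API_BASE = "https://api.apify.com/v2"
ACTOR_PATH = "/acts/careybrown~company-jobs-scraper/run-sync-get-dataset-items"


def build_run_sync_request(token, actor_input):
    """Build (but do not send) a POST request that runs the Actor synchronously."""
    # The specification declares the token as a required query parameter.
    url = f"{API_BASE}{ACTOR_PATH}?{urlencode({'token': token})}"
    body = json.dumps(actor_input).encode("utf-8")
    return Request(url, data=body, headers={"Content-Type": "application/json"}, method="POST")
```

The request can be sent with `urllib.request.urlopen(request)`. The specification documents the 200 response only as "OK", so decoding the body as a JSON array of dataset items is an assumption to verify against a real run.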
