# Dice.com Tech Jobs Scraper (`crawlerbros/dice-jobs-scraper`) Actor

Scrape US tech jobs from Dice.com with titles, companies, salaries, skills, descriptions, and remote-work flags. HTTP-only, no login required.

- **URL**: https://apify.com/crawlerbros/dice-jobs-scraper.md
- **Developed by:** [Crawler Bros](https://apify.com/crawlerbros) (community)
- **Categories:** Jobs, Automation, Developer tools
- **Stats:** 2 total users, 1 monthly user, 100.0% runs succeeded, 6 bookmarks
- **User rating**: 5.00 out of 5 stars

## Pricing

from $1.00 / 1,000 results

This Actor is billed per event plus usage: you pay a fixed price for specific events and, separately, for Apify platform usage.
Because this Actor supports Apify Store discounts, the price decreases on higher subscription plans.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action that can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server that can be used as a website, an API, or an MCP server.
Actors are written with a capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see the Apify documentation as a [Markdown index](https://docs.apify.com/llms.txt) or as [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## Dice.com Tech Jobs Scraper

Scrape US tech jobs from [Dice.com](https://www.dice.com) — titles, companies, salaries, skills, descriptions, and remote-work flags. Parses `JobPosting` JSON-LD on each `/job-detail/<uuid>` page. HTTP-only; no login, no cookies required.

### Output (per job)

- `type` = `job_dice`
- `id`, `url`, `positionId` (Dice internal ID from `__NEXT_DATA__`)
- `title`, `jobTitleOverride` (Dice-side override, when present)
- `company`, `companyUrl`, `companyLogo`, `companyIdentifier`
- `hiringOrganizationUrl` — from JSON-LD `hiringOrganization.url`
- `location` — `{city, state, postalCode, country}`
- `applicantCountry` — required applicant country (`applicantLocationRequirements.name`)
- `jobLocationType` — `TELECOMMUTE` / `ONSITE` (JSON-LD)
- `salary` / `salaryDetails` (min, max, currency, unit — when published)
- `salaryUnit` — `HOUR` / `DAY` / `WEEK` / `MONTH` / `YEAR`
- `employmentType` — `FULL_TIME`, `PART_TIME`, `CONTRACTS`, `THIRD_PARTY`
- `employmentDetail` — extra employment metadata (Dice inline state)
- `postedAt`, `datePosted` (raw JSON-LD string), `datePostedLocal`
- `applyBefore`, `validThrough`
- `occupationalCategory`, `industry`
- `educationRequirements`, `experienceRequirements`, `experienceMonths`
- `qualifications`, `responsibilities`, `jobBenefits`, `specialCommitments`
- `incentives` — e.g. "Sign-on bonus", "401k match" (from inline state)
- `totalJobOpenings` — number of openings (when published)
- `descriptionHtml`, `descriptionText`
- `skills` (array)
- `isRemote`, `easyApply` (when signaled on page)
- `directApply` — from JSON-LD `directApply` when present
- `scrapedAt`

If no jobs match, a single `job_dice_blocked` sentinel record is emitted so runs exit 0.
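
Because a run with no matches still produces one record, downstream code may want to separate sentinel records from real jobs before processing. The sketch below does this using only the documented `type` values (`job_dice` and `job_dice_blocked`); the helper name `split_items` and the sample items are hypothetical.

```python
from typing import Iterable

JOB_TYPE = "job_dice"               # documented type of a real job record
SENTINEL_TYPE = "job_dice_blocked"  # documented type of the sentinel record


def split_items(items: Iterable[dict]) -> tuple[list[dict], list[dict]]:
    """Separate real job records from sentinel records by their `type` field."""
    jobs: list[dict] = []
    sentinels: list[dict] = []
    for item in items:
        if item.get("type") == SENTINEL_TYPE:
            sentinels.append(item)
        elif item.get("type") == JOB_TYPE:
            jobs.append(item)
    return jobs, sentinels


# Hypothetical sample items; real records carry many more fields.
sample = [
    {"type": "job_dice", "title": "Python Developer", "isRemote": True},
    {"type": "job_dice_blocked"},
]
jobs, sentinels = split_items(sample)
print(f"{len(jobs)} job(s), {len(sentinels)} sentinel record(s)")
```

A run whose dataset contains only a sentinel record can then be treated as "no results" rather than as a failure.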

### Input

| Field | Type | Description |
|---|---|---|
| `startUrls` | string[] | Dice search or `/job-detail/<uuid>` URLs. Prefill: `https://www.dice.com/jobs?q=python`. |
| `searchTerm` | string | Used when no `startUrls` — builds `https://www.dice.com/jobs?q=<term>`. |
| `location` | string | City / ZIP added as `location` query param. |
| `employmentType` | enum | `any / FULL_TIME / PART_TIME / CONTRACTS / THIRD_PARTY` — applied client-side. |
| `workType` | enum | `any / remote / hybrid / on-site` — matched against title + description. |
| `maxItems` | integer | Max jobs per run. Default 3. |
| `salaryMin` | integer | Minimum salary — matched against JSON-LD `baseSalary.value.minValue`. |
| `datePostedDays` | integer | Only include jobs posted within the last N days (from JSON-LD `datePosted`). |
| `includeKeywords` | string[] | Title/description must contain at least one of these (case-insensitive). |
| `excludeKeywords` | string[] | Drop jobs whose title/description contains any of these (case-insensitive). |
| `proxyConfiguration` | object | Apify proxy (datacenter by default). |
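
As an illustration of how `searchTerm` and `location` map onto a search URL, here is a minimal sketch that builds the documented `https://www.dice.com/jobs?q=<term>` form with an optional `location` query parameter. The function name `build_search_url` is hypothetical, and the Actor's own URL construction may differ in detail.

```python
from typing import Optional
from urllib.parse import urlencode

SEARCH_BASE = "https://www.dice.com/jobs"


def build_search_url(search_term: str, location: Optional[str] = None) -> str:
    """Build a Dice search URL from a keyword and an optional city / ZIP."""
    params = {"q": search_term}
    if location:
        params["location"] = location
    # urlencode escapes spaces as '+', matching the documented example URLs.
    return f"{SEARCH_BASE}?{urlencode(params)}"


print(build_search_url("python"))
print(build_search_url("react", "New York"))
```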

### How it works

1. For each `startUrls` entry, classify as search page or direct `/job-detail/` URL.
2. Search pages: extract every `/job-detail/<uuid>` href.
3. For each job URL, fetch detail page and parse `JobPosting` JSON-LD (title, company, salary, location, description). Apply client-side `employmentType` / `workType` filters.
4. Rotate Apify-proxy session per retry on 403 / 429 / 5xx.
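
Steps 1 to 3 above can be sketched with the standard library alone. This is not the Actor's source code: the function names, the regular expressions, and the assumption that the `<uuid>` is a 36-character hyphenated hex string are all illustrative. Fetching, proxy rotation, and retries (step 4) are left out.

```python
import json
import re
from typing import Optional

# Assumption: the <uuid> segment is a 36-character hyphenated hex string.
JOB_DETAIL_RE = re.compile(r"/job-detail/[0-9a-fA-F-]{36}")
JSON_LD_RE = re.compile(
    r'<script[^>]*type="application/ld\+json"[^>]*>(.*?)</script>',
    re.DOTALL | re.IGNORECASE,
)


def is_job_detail(url: str) -> bool:
    """Step 1: classify a start URL as a direct job-detail URL or a search page."""
    return "/job-detail/" in url


def extract_job_links(search_html: str) -> list[str]:
    """Step 2: collect unique /job-detail/<uuid> paths in first-seen order."""
    return list(dict.fromkeys(JOB_DETAIL_RE.findall(search_html)))


def parse_job_posting(detail_html: str) -> Optional[dict]:
    """Step 3: return the first JSON-LD object whose @type is JobPosting."""
    for raw in JSON_LD_RE.findall(detail_html):
        try:
            data = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip malformed JSON-LD blocks
        nodes = data if isinstance(data, list) else [data]
        for node in nodes:
            if isinstance(node, dict) and node.get("@type") == "JobPosting":
                return node
    return None


# Hypothetical HTML fragments standing in for fetched pages.
search_html = (
    '<a href="/job-detail/123e4567-e89b-12d3-a456-426614174000">Job</a>'
    '<a href="/job-detail/123e4567-e89b-12d3-a456-426614174000">Same job</a>'
)
detail_html = (
    '<script type="application/ld+json">'
    '{"@type": "JobPosting", "title": "Python Developer"}'
    "</script>"
)
print(extract_job_links(search_html))
print(parse_job_posting(detail_html))
```

A regex is enough for a sketch; a production parser would more likely use an HTML parser to locate the `<script>` elements.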

### FAQ

**Do I need a proxy?** The Apify proxy is enabled by default to avoid 403 responses on detail pages. The free datacenter proxy is sufficient.

**Why a sentinel record?** When the search has no matches or the provided URL returns a 404, the Actor still emits one record so that downstream pipelines never see an empty output.

# Actor input Schema

## `startUrls` (type: `array`):

Dice search or job-detail URLs. Examples: `https://www.dice.com/jobs?q=python`, `https://www.dice.com/jobs?q=react&location=New+York`

## `searchTerm` (type: `string`):

Keyword used when no `startUrls` are supplied. Builds `https://www.dice.com/jobs?q=<term>`.

## `location` (type: `string`):

City / ZIP used as the `location` query param when building a URL from `searchTerm`.

## `employmentType` (type: `string`):

Filter by employment type (applied client-side).

## `workType` (type: `string`):

Filter by on-site / hybrid / remote (applied client-side).

## `maxItems` (type: `integer`):

Maximum jobs per run.

## `salaryMin` (type: `integer`):

Minimum salary filter. Matched against JSON-LD `baseSalary.value.minValue` (same currency/unit as the posting).

## `datePostedDays` (type: `integer`):

Only include jobs posted within the last N days (based on JSON-LD `datePosted`).

## `includeKeywords` (type: `array`):

Job must contain at least one of these keywords in title or description (case-insensitive).

## `excludeKeywords` (type: `array`):

Drop jobs where title or description contains any of these keywords (case-insensitive).

## `proxyConfiguration` (type: `object`):

Apify proxy. Free Apify datacenter proxy recommended.
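
To make the filter semantics concrete, the sketch below applies `includeKeywords`, `excludeKeywords`, `salaryMin`, and `datePostedDays` to a JSON-LD-style posting. It is an interpretation of the descriptions above, not the Actor's code. The function name is hypothetical, and dropping postings that lack a salary or date when the matching filter is set is an assumption.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def passes_filters(
    posting: dict,
    include_keywords: list[str],
    exclude_keywords: list[str],
    salary_min: Optional[int] = None,
    date_posted_days: Optional[int] = None,
    now: Optional[datetime] = None,
) -> bool:
    """Return True if a JSON-LD-style posting survives all configured filters."""
    text = f"{posting.get('title', '')} {posting.get('description', '')}".lower()
    # includeKeywords: at least one keyword must appear (case-insensitive).
    if include_keywords and not any(k.lower() in text for k in include_keywords):
        return False
    # excludeKeywords: any hit drops the job.
    if any(k.lower() in text for k in exclude_keywords):
        return False
    if salary_min is not None:
        base = posting.get("baseSalary")
        value = base.get("value") if isinstance(base, dict) else None
        min_value = value.get("minValue") if isinstance(value, dict) else None
        # Assumption: postings without a published minimum are dropped.
        if min_value is None or float(min_value) < salary_min:
            return False
    if date_posted_days is not None:
        raw = posting.get("datePosted")
        if not raw:
            return False  # Assumption: undated postings are dropped.
        posted = datetime.fromisoformat(raw.replace("Z", "+00:00"))
        if posted.tzinfo is None:
            posted = posted.replace(tzinfo=timezone.utc)
        current = now or datetime.now(timezone.utc)
        if posted < current - timedelta(days=date_posted_days):
            return False
    return True


# Hypothetical posting used only for illustration.
example = {
    "title": "Senior Python Developer",
    "description": "Remote role using Django.",
    "baseSalary": {"value": {"minValue": 120000}},
    "datePosted": "2025-01-08T00:00:00Z",
}
print(passes_filters(example, ["python"], ["php"], salary_min=100000))
```

Note that `salaryMin` is compared in the posting's own currency and unit, so an hourly posting and a yearly posting are not normalized against each other.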

## Actor input object example

```json
{
  "startUrls": [
    "https://www.dice.com/jobs?q=python"
  ],
  "employmentType": "any",
  "workType": "any",
  "maxItems": 3,
  "includeKeywords": [],
  "excludeKeywords": [],
  "proxyConfiguration": {
    "useApifyProxy": true
  }
}
````

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "startUrls": [
        "https://www.dice.com/jobs?q=python"
    ],
    "maxItems": 3,
    "includeKeywords": [],
    "excludeKeywords": [],
    "proxyConfiguration": {
        "useApifyProxy": true
    }
};

// Run the Actor and wait for it to finish
const run = await client.actor("crawlerbros/dice-jobs-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "startUrls": ["https://www.dice.com/jobs?q=python"],
    "maxItems": 3,
    "includeKeywords": [],
    "excludeKeywords": [],
    "proxyConfiguration": { "useApifyProxy": True },
}

# Run the Actor and wait for it to finish
run = client.actor("crawlerbros/dice-jobs-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "startUrls": [
    "https://www.dice.com/jobs?q=python"
  ],
  "maxItems": 3,
  "includeKeywords": [],
  "excludeKeywords": [],
  "proxyConfiguration": {
    "useApifyProxy": true
  }
}' |
apify call crawlerbros/dice-jobs-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=crawlerbros/dice-jobs-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Dice.com Tech Jobs Scraper",
        "description": "Scrape US tech jobs from Dice.com with titles, companies, salaries, skills, descriptions, and remote-work flags. HTTP-only, no login required.",
        "version": "1.0",
        "x-build-id": "aWdTdKXfGAppdFZr9"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/crawlerbros~dice-jobs-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-crawlerbros-dice-jobs-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/crawlerbros~dice-jobs-scraper/runs": {
            "post": {
                "operationId": "runs-sync-crawlerbros-dice-jobs-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/crawlerbros~dice-jobs-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-crawlerbros-dice-jobs-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "properties": {
                    "startUrls": {
                        "title": "Start URLs",
                        "type": "array",
                        "description": "Dice search or job-detail URLs. Examples: https://www.dice.com/jobs?q=python, https://www.dice.com/jobs?q=react&location=New+York",
                        "default": [
                            "https://www.dice.com/jobs?q=python"
                        ],
                        "items": {
                            "type": "string"
                        }
                    },
                    "searchTerm": {
                        "title": "Search Term",
                        "type": "string",
                        "description": "Keyword used when no startUrls are supplied. Builds https://www.dice.com/jobs?q=<term>."
                    },
                    "location": {
                        "title": "Location",
                        "type": "string",
                        "description": "City / ZIP used as the `location` query param when building a URL from searchTerm."
                    },
                    "employmentType": {
                        "title": "Employment Type",
                        "enum": [
                            "any",
                            "FULL_TIME",
                            "PART_TIME",
                            "CONTRACTS",
                            "THIRD_PARTY"
                        ],
                        "type": "string",
                        "description": "Filter by employment type (applied client-side).",
                        "default": "any"
                    },
                    "workType": {
                        "title": "Work Type",
                        "enum": [
                            "any",
                            "remote",
                            "hybrid",
                            "on-site"
                        ],
                        "type": "string",
                        "description": "Filter by on-site / hybrid / remote (applied client-side).",
                        "default": "any"
                    },
                    "maxItems": {
                        "title": "Max Items",
                        "minimum": 1,
                        "maximum": 500,
                        "type": "integer",
                        "description": "Maximum jobs per run.",
                        "default": 3
                    },
                    "salaryMin": {
                        "title": "Minimum Salary",
                        "minimum": 0,
                        "type": "integer",
                        "description": "Minimum salary filter. Matched against JSON-LD baseSalary.value.minValue (same currency/unit as the posting)."
                    },
                    "datePostedDays": {
                        "title": "Posted Within (days)",
                        "minimum": 1,
                        "maximum": 365,
                        "type": "integer",
                        "description": "Only include jobs posted within the last N days (based on JSON-LD datePosted)."
                    },
                    "includeKeywords": {
                        "title": "Include Keywords",
                        "type": "array",
                        "description": "Job must contain at least one of these keywords in title or description (case-insensitive).",
                        "default": [],
                        "items": {
                            "type": "string"
                        }
                    },
                    "excludeKeywords": {
                        "title": "Exclude Keywords",
                        "type": "array",
                        "description": "Drop jobs where title or description contains any of these keywords (case-insensitive).",
                        "default": [],
                        "items": {
                            "type": "string"
                        }
                    },
                    "proxyConfiguration": {
                        "title": "Proxy Configuration",
                        "type": "object",
                        "description": "Apify proxy. Free Apify datacenter proxy recommended.",
                        "default": {
                            "useApifyProxy": true
                        }
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
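
For projects that call the REST API directly instead of using a client library, the following sketch builds a request for the `run-sync-get-dataset-items` path listed in the specification above, using only the Python standard library. The helper name `build_run_request` is hypothetical, and the network call is left commented out so that running the snippet has no side effects.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.apify.com/v2"
RUN_SYNC_PATH = "/acts/crawlerbros~dice-jobs-scraper/run-sync-get-dataset-items"


def build_run_request(token: str, actor_input: dict) -> urllib.request.Request:
    """Build a POST request that runs the Actor and returns its dataset items."""
    query = urllib.parse.urlencode({"token": token})
    return urllib.request.Request(
        url=f"{API_BASE}{RUN_SYNC_PATH}?{query}",
        data=json.dumps(actor_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_run_request(
    "<YOUR_API_TOKEN>",
    {"startUrls": ["https://www.dice.com/jobs?q=python"], "maxItems": 3},
)
print(request.get_method(), request.full_url)

# To actually run the Actor (this incurs charges), uncomment:
# with urllib.request.urlopen(request) as response:
#     items = json.load(response)
```

The specification passes the token as a `token` query parameter, which is what this sketch uses.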
