Dice Job Scraper

Pricing

Pay per usage

This lightweight, fast actor scrapes job listings from Dice.com. It extracts a focused set of essential fields to produce a clean dataset. Apify Residential Proxies are strongly recommended for reliable runs.

Developer

Shahid Irfan

Maintained by Community

This Apify actor scrapes job listings from Dice.com. It is designed to be fast and lightweight, fetching data directly without the need for a headless browser.

Features

  • Scrapes job listings from Dice.com.
  • Extracts comprehensive job details, including title, company, location, posting date, and full description.
  • Handles pagination to retrieve multiple pages of search results.
  • Saves the collected data to the Apify dataset in a structured format.
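
The pagination behaviour listed above can be pictured with a minimal sketch. It is not the actor's actual code: `fetchPage` is a hypothetical callback standing in for whatever request the actor makes per results page, and `collectListings` is an illustrative name. The loop stops when the requested number of results is reached or a page comes back empty.

```javascript
// Hypothetical helper: `fetchPage(pageNumber)` is assumed to resolve to an
// array of listings for that page (an empty array when there are no more).
async function collectListings(fetchPage, resultsWanted) {
  const collected = [];
  for (let page = 1; collected.length < resultsWanted; page += 1) {
    const listings = await fetchPage(page);
    if (listings.length === 0) break; // No further pages of results.
    collected.push(...listings);
  }
  // The last page may overshoot, so trim to the requested maximum.
  return collected.slice(0, resultsWanted);
}
```

Trimming after the loop keeps the page-fetching logic simple while still honouring the requested maximum.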

Input

The actor requires the following input fields to define the job search criteria.

| Field | Type | Description | Default |
| --- | --- | --- | --- |
| keyword | String | The job title, skill, or keyword to search for (e.g., "Software Engineer"). | |
| location | String | The geographic location for the job search (e.g., "Austin, TX", "Remote"). | |
| posted_date | String | Filters jobs by the date they were posted. | all |
| results_wanted | Number | The maximum number of job listings to scrape. | 100 |
| proxyConfiguration | Object | Specifies the proxy settings to be used for the crawl. | { "useApifyProxy": true } |

Input Example

Here is an example of a valid input object:

{
  "keyword": "Data Scientist",
  "location": "New York, NY",
  "posted_date": "7d",
  "results_wanted": 50,
  "proxyConfiguration": {
    "useApifyProxy": true,
    "proxyUrls": [],
    "groups": ["RESIDENTIAL"]
  }
}
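
As a rough illustration of how the documented defaults combine with user input, the sketch below fills in any missing fields before a run. `normalizeInput` is a hypothetical helper, not part of the actor, and the check that `results_wanted` is a positive integer is an assumption rather than a documented rule.

```javascript
// Defaults taken from the input field descriptions in this README.
function defaultInput() {
  return {
    posted_date: 'all',
    results_wanted: 100,
    proxyConfiguration: { useApifyProxy: true },
  };
}

// Hypothetical helper: merges user input over the defaults.
function normalizeInput(input = {}) {
  const normalized = defaultInput();
  for (const [key, value] of Object.entries(input)) {
    // Ignore null/undefined so they do not wipe out a default.
    if (value !== undefined && value !== null) normalized[key] = value;
  }
  // Assumption: the actor expects a positive whole number here.
  if (!Number.isInteger(normalized.results_wanted) || normalized.results_wanted < 1) {
    throw new RangeError('results_wanted must be a positive integer');
  }
  return normalized;
}
```

Calling `normalizeInput({ keyword: 'Data Scientist' })` would yield an object with `posted_date` set to `all` and `results_wanted` set to `100`, matching the defaults documented for those fields.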

Output

The actor stores the scraped job listings in the Apify dataset. Each item is a JSON object with the following structure:

| Field | Type | Description |
| --- | --- | --- |
| title | String | The job title. |
| company | String | The name of the hiring company. |
| location | String | The location of the job. |
| posted | String | The date the job was posted. |
| updated | String | The date the job listing was last updated. |
| workSetting | String | The work arrangement (e.g., "Remote", "Hybrid", "On Site"). |
| employmentType | String | The type of employment (e.g., "Full-Time", "Contract"). |
| salary | String | The salary information, if available. |
| description_html | String | The full job description in HTML format. |
| description_text | String | The plain text version of the job description. |
| url | String | The URL of the original job posting on Dice.com. |
| dice_id | String | The Dice ID for the job, if available. |
| position_id | String | The Position ID for the job, if available. |
| source | String | The source of the job listing (always "dice.com"). |

Output Example

{
  "title": "Senior Backend Engineer",
  "company": "Tech Solutions Inc.",
  "location": "San Francisco, CA",
  "posted": "2 days ago",
  "updated": "1 day ago",
  "workSetting": "Hybrid",
  "employmentType": "Full-Time",
  "salary": "$150,000 - $180,000 per year",
  "description_html": "...",
  "description_text": "...",
  "url": "https://www.dice.com/job-detail/...",
  "dice_id": "911111w",
  "position_id": "12345",
  "source": "dice.com"
}
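
When combining items from several runs, you may want to drop repeated listings. The sketch below is one possible approach, not something the actor does itself: `dedupeJobs` is a hypothetical helper that keys each item on `dice_id` and falls back to `url`, since `dice_id` is only present "if available".

```javascript
// Hypothetical post-processing helper for downloaded dataset items.
function dedupeJobs(items) {
  const seen = new Set();
  const unique = [];
  for (const item of items) {
    // Prefer the Dice ID; fall back to the posting URL when it is missing.
    const key = item.dice_id || item.url;
    if (key && seen.has(key)) continue; // Already kept this listing.
    if (key) seen.add(key);
    unique.push(item); // Items with no usable key are kept as-is.
  }
  return unique;
}
```

The first occurrence of each listing is kept, so the original ordering of the dataset is preserved.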

Usage

You can run this actor from the Apify platform or locally using the Apify CLI.

Running on Apify

  1. Go to the actor's page on the Apify platform.
  2. Click the Run button.
  3. Fill in the input fields and start the run.
  4. Once the run is finished, you can download the results from the Dataset tab.
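
The same steps can also be scripted. The following is a minimal sketch, assuming the `apify-client` package for Node.js; `runDiceScraper` is a hypothetical wrapper, and the actor ID shown in the comment is a placeholder because this README does not state it.

```javascript
// `client` is expected to be an ApifyClient instance from `apify-client`.
// It is passed in so this function has no hard dependency on the package.
async function runDiceScraper(client, actorId, input) {
  // Start the actor and wait for the run to finish.
  const run = await client.actor(actorId).call(input);
  // Read the scraped listings from the run's default dataset.
  const { items } = await client.dataset(run.defaultDatasetId).listItems();
  return items;
}

// Example wiring (not executed here; requires `npm install apify-client`
// and an API token, assumed to be in the APIFY_TOKEN environment variable):
//
//   const { ApifyClient } = require('apify-client');
//   const client = new ApifyClient({ token: process.env.APIFY_TOKEN });
//   const jobs = await runDiceScraper(client, '<username>/<actor-name>', {
//     keyword: 'Data Scientist',
//     location: 'New York, NY',
//   });
```

Passing the client in as an argument also makes the function easy to exercise with a stub during testing.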

Running Locally

To run the actor locally, you need to have Node.js and the Apify CLI installed.

# Install dependencies
npm install
# Run the actor
apify run

Configuration

Proxy Settings

This actor is configured to use Apify Proxy by default. Residential proxies are recommended for the best results, as they are less likely to be blocked. You can configure the proxy settings in the input.

Performance

The actor is designed for efficiency. You can adjust the maxConcurrency and requestRetries settings in the src/main.js file to fine-tune performance based on your needs.
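
To give a feel for what these two settings typically control, here is a generic sketch of a concurrency-limited task pool with retries. It is not taken from `src/main.js`, which is not shown in this README; `runPool` and `withRetries` are hypothetical names, and the actor's real implementation may differ.

```javascript
// Run `task` once, then retry up to `requestRetries` more times on failure.
async function withRetries(task, requestRetries) {
  let lastError;
  for (let attempt = 0; attempt <= requestRetries; attempt += 1) {
    try {
      return await task();
    } catch (error) {
      lastError = error;
    }
  }
  throw lastError;
}

// Run async tasks with at most `maxConcurrency` in flight at any time.
async function runPool(tasks, { maxConcurrency, requestRetries }) {
  const results = new Array(tasks.length);
  let next = 0;
  async function worker() {
    // Each worker pulls the next unclaimed task until none remain.
    while (next < tasks.length) {
      const index = next;
      next += 1;
      results[index] = await withRetries(tasks[index], requestRetries);
    }
  }
  const workerCount = Math.min(maxConcurrency, tasks.length);
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results; // Results are in the same order as `tasks`.
}
```

In general, a higher `maxConcurrency` finishes sooner but increases the chance of being blocked, while more retries trade run time for resilience against transient failures.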

Support

If you encounter any issues or have questions, please create an issue on the project's GitHub repository.

Author

This actor was developed by Shahid.