Indeed Job Scraper

Pricing

Pay per usage

Developed by

Shahid Irfan

Maintained by Community

A simple Indeed Job Scraper for minimalist, essential data. Uses residential proxies and cookies to prevent blocks, ensuring smooth and reliable runs. Perfect for getting targeted job data without the clutter.

5.0 (1)

Last modified

3 days ago

Indeed Jobs Scraper

A configurable Apify actor that scrapes job listings from Indeed search results. It is designed to run efficiently in production.

Overview

This actor collects job listing metadata and, optionally, full job descriptions. It supports single or multiple search URLs, pagination, and options to control concurrency, proxy usage, and cookies.

Features

  • Scrape job title, company, location, salary, post date, and description (HTML & text).
  • Accepts a full search URL or builds searches from keywords and location.
  • Handles pagination to collect multiple pages of results.
  • Supports configuring concurrency, proxy usage, and cookies for authenticated sessions.
  • Outputs results to the default dataset for further processing.

Inputs

Provide a JSON object with the following properties. Unspecified fields fall back to their defaults.

Search parameters

| Field | Type | Description |
| --- | --- | --- |
| searchUrl | string | Full Indeed search URL (if present, keyword and location are ignored). |
| startUrls | string[] | List of search URLs to process (optional). |
| keyword | string | Search keywords (used when searchUrl is not provided). |
| location | string | Location filter for searches (optional). |
| posted_date | string | Filter by date posted, e.g., Last 24 hours, Last 7 days, Last 30 days. |
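
The precedence between these fields can be illustrated with a short Python sketch. It is not the actor's code: `resolve_search_urls` is a hypothetical helper, the `q` and `l` query parameters are inferred from the example URL in this README, and the way `searchUrl` and `startUrls` combine is an assumption. The `posted_date` filter is left out because its URL mapping is not documented here.

```python
from urllib.parse import urlencode

# Base URL taken from the example input in this README.
INDEED_SEARCH_BASE = "https://www.indeed.com/jobs"


def resolve_search_urls(run_input: dict) -> list[str]:
    """Return the search URLs implied by an input object (illustrative only)."""
    # Assumption: explicit URLs take precedence over keyword/location.
    urls = []
    if run_input.get("searchUrl"):
        urls.append(run_input["searchUrl"])
    urls.extend(run_input.get("startUrls") or [])
    if urls:
        return urls

    keyword = run_input.get("keyword")
    if not keyword:
        raise ValueError("provide searchUrl, startUrls, or keyword")

    # Assumption: 'q' carries the keywords and 'l' the location,
    # as in the example URL.
    params = {"q": keyword}
    if run_input.get("location"):
        params["l"] = run_input["location"]
    return [f"{INDEED_SEARCH_BASE}?{urlencode(params)}"]


print(resolve_search_urls({"keyword": "software engineer", "location": "Remote"}))
```

With a keyword and location only, the sketch builds a single URL of the same shape as the one in the example input.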

Scraping options

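| Field | Type | Description |
| --- | --- | --- |
| maxItems | number | Maximum number of job items to collect. |
| collectDetails | boolean | If true, visits each job detail page to extract the full description. |
| maxConcurrency | number | Maximum parallel requests (tune to avoid rate limits). |
| cookies / cookiesJson | object / string | Cookies to send with requests (as an object or a JSON string), e.g. from an authenticated session. |
| proxyConfiguration | object | Proxy settings (use residential proxies when needed). |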
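
Before starting a run, it can help to check an input object against the documented types. The following is a minimal sketch, not part of the actor: `validate_input` and `EXPECTED_TYPES` are hypothetical names, the rule that numeric fields must be at least 1 is an assumption, and the cookie fields are skipped because their exact format is not specified here.

```python
# Documented field types, mapped to Python types (cookies are omitted on purpose).
EXPECTED_TYPES = {
    "searchUrl": str,
    "startUrls": list,
    "keyword": str,
    "location": str,
    "posted_date": str,
    "maxItems": int,
    "collectDetails": bool,
    "maxConcurrency": int,
    "proxyConfiguration": dict,
}


def validate_input(run_input: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means no issues found."""
    problems = []
    for field, value in run_input.items():
        expected = EXPECTED_TYPES.get(field)
        if expected is None:
            continue  # fields not listed above are not checked
        # bool is a subclass of int in Python, so reject it for numeric fields.
        if expected is int and isinstance(value, bool):
            problems.append(f"{field} should be a number, got a boolean")
        elif not isinstance(value, expected):
            problems.append(f"{field} should be {expected.__name__}")

    # Assumption: limits below 1 make no sense for these two fields.
    for field in ("maxItems", "maxConcurrency"):
        value = run_input.get(field)
        if isinstance(value, int) and not isinstance(value, bool) and value < 1:
            problems.append(f"{field} must be at least 1")
    return problems
```

An empty result only means the types match the tables above; the actor may apply further checks of its own.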

Example input

{
  "startUrls": ["https://www.indeed.com/jobs?q=software+engineer&l=Remote"],
  "maxItems": 200,
  "collectDetails": true,
  "maxConcurrency": 5,
  "proxyConfiguration": { "useApifyProxy": true }
}

Output

The actor writes results to the dataset. Each item includes:

  • title — Job title
  • company — Company name
  • location — Job location
  • postedAt — When the job was posted (human readable)
  • salary — Salary information (if present)
  • description_html — Job description in HTML
  • description_text — Plain text job description
  • url — Job posting URL
  • source — Source identifier (e.g., indeed)
  • search_url — Search page where the job was found
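
For downstream processing, the item shape listed above can be described as a typed dictionary. This is a sketch under assumptions: `JobItem`, `dedupe_by_url`, and `with_salary` are hypothetical names, all fields are treated as optional strings, and whether the actor already removes duplicates is not stated in this README.

```python
from typing import TypedDict


class JobItem(TypedDict, total=False):
    """Fields listed in this README; all treated as optional strings here."""
    title: str
    company: str
    location: str
    postedAt: str
    salary: str
    description_html: str
    description_text: str
    url: str
    source: str
    search_url: str


def dedupe_by_url(items: list[JobItem]) -> list[JobItem]:
    """Keep the first item seen for each url; items without a url are kept as-is."""
    seen: set[str] = set()
    unique: list[JobItem] = []
    for item in items:
        url = item.get("url")
        if url is not None:
            if url in seen:
                continue
            seen.add(url)
        unique.append(item)
    return unique


def with_salary(items: list[JobItem]) -> list[JobItem]:
    """Return only the items that carry non-empty salary information."""
    return [item for item in items if item.get("salary")]
```

Deduplicating by `url` is useful when several search URLs return the same posting.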

How to run

  1. Provide the input JSON (example above) via the platform's run interface or CLI.
  2. Start the actor. Monitor the run and dataset for collected items.
  3. Adjust maxConcurrency, proxyConfiguration, and cookies if you encounter rate limiting.
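
Besides the run interface and CLI, a run can also be started over HTTP. The sketch below only builds the request and does not send it. It assumes the Apify API v2 "run actor" endpoint (`POST /v2/acts/{actorId}/runs`) with a bearer token; `build_run_request` is a hypothetical helper, and the actor ID and token are placeholders to replace with your own values. Check the Apify API reference before relying on it.

```python
import json
import urllib.request

# Assumption: base URL of the Apify API v2.
API_BASE = "https://api.apify.com/v2"


def build_run_request(actor_id: str, token: str, run_input: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request that starts an actor run."""
    body = json.dumps(run_input).encode("utf-8")
    return urllib.request.Request(
        url=f"{API_BASE}/acts/{actor_id}/runs",
        data=body,  # the JSON body is passed to the actor as its input
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
    )


request = build_run_request(
    actor_id="<username>~<actor-name>",  # placeholder, not the real actor ID
    token="<YOUR_API_TOKEN>",            # placeholder
    run_input={"keyword": "software engineer", "location": "Remote", "maxItems": 20},
)
print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` would send it; that call is left out here so the sketch runs without credentials.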

Best practices & troubleshooting

Avoiding blocks

  • Use a proxy pool or residential proxies for large-scale runs.
  • Lower concurrency and add small delays when you see request failures.
  • Provide valid cookies if you want to run authenticated sessions or reduce bot checks.

Common issues

  • Incomplete results: increase maxItems or confirm your search URL parameters.
  • Many HTTP errors: reduce maxConcurrency and/or enable proxy rotation.
  • Captcha / challenges: try using cookies from a valid session and a reliable proxy provider.
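
The advice above can be turned into a small helper that derives a more cautious input for a retry. This is a sketch, not actor behavior: `more_conservative` and its `fallback_concurrency` parameter are hypothetical, halving is an arbitrary choice, and the only proxy setting used is the `useApifyProxy` flag from the example input.

```python
import copy


def more_conservative(run_input: dict, fallback_concurrency: int = 2) -> dict:
    """Return a copy of the input with lower concurrency and the proxy enabled."""
    tuned = copy.deepcopy(run_input)  # never mutate the caller's input

    current = tuned.get("maxConcurrency")
    if isinstance(current, int) and not isinstance(current, bool):
        # Halve the concurrency, but never go below one request at a time.
        tuned["maxConcurrency"] = max(1, current // 2)
    else:
        # The actor's default is not documented here, so use a hypothetical fallback.
        tuned["maxConcurrency"] = fallback_concurrency

    # Enable the proxy flag shown in the example input, keeping other proxy settings.
    tuned.setdefault("proxyConfiguration", {})["useApifyProxy"] = True
    return tuned
```

Calling it repeatedly after failed runs steps the concurrency down toward 1 while leaving the search parameters untouched.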

Notes

Structure your input carefully and run smaller test jobs first to validate settings. Adjust proxies and concurrency for production-scale scraping.