Indeed Job Scraper

A simple Indeed Job Scraper for minimalist, essential data. Uses residential proxies and cookies to prevent blocks, ensuring smooth and reliable runs. Perfect for getting targeted job data without the clutter.

Pricing: Pay per usage
Rating: 5.0 (4)
Developer: Shahid Irfan (maintained by Community)

Actor stats

  • Bookmarked: 2
  • Total users: 165
  • Monthly active users: 24
  • Issues response: 2.6 days
  • Last modified: 16 days ago

Indeed Jobs Scraper

A powerful and configurable scraper for extracting job listings from Indeed.com. Ideal for job market analysis, recruitment automation, and data-driven insights. This actor efficiently collects job metadata and full descriptions, supporting advanced search parameters, pagination, and anti-detection measures for reliable, large-scale scraping.

What is this actor?

The Indeed Jobs Scraper is designed to automate the extraction of job postings from Indeed's search results. Whether you're building a job board, conducting market research, or aggregating employment data, this tool provides a seamless way to gather structured job information without manual effort. It mimics human browsing to avoid blocks, making it suitable for production use.

Key Features

  • Comprehensive Data Extraction: Scrape job titles, companies, locations, salaries, posting dates, and detailed descriptions (both HTML and plain text).
  • JSON-first & API fallbacks: Job cards are parsed from Indeed's provider JSON before HTML selectors; job details try the lightweight rpc/jobdescs API before DOM parsing for speed and resilience.
  • Flexible Search Options: Input full Indeed URLs, keywords, locations, or date filters to target specific job searches.
  • Pagination Support: Automatically handles multiple pages of results to collect extensive datasets.
  • Performance Controls: Configure residential proxies and cookiesJson to improve success rate and reduce blocking.
  • Output to Dataset: Results are stored in a structured JSON format for easy integration with downstream tools.
  • Anti-Bot Measures: Built-in support for proxies and session management to ensure uninterrupted scraping.

Use Cases

  • Job Market Research: Analyze trends in job postings by location, industry, or salary ranges.
  • Recruitment Platforms: Feed job data into custom job boards or matching algorithms.
  • Data Analytics: Collect and process job listings for reports on employment opportunities.
  • Competitive Intelligence: Monitor competitor hiring patterns or industry-specific roles.
  • Automation Workflows: Integrate with tools like Zapier or custom scripts for automated job alerts.

Inputs

Configure the actor with a JSON input object. Defaults are applied for unspecified fields to ensure ease of use.

Search Parameters

Field | Type | Description
----- | ---- | -----------
startUrls | string[] | Array of Indeed search URLs to scrape multiple queries in one run.
keyword | string | Job search keywords (e.g., "software engineer"). Used to build search URLs.
location | string | Geographic filter (e.g., "Remote" or "San Francisco, CA").
posted_date | string | Date filter: 24h, 7d, 30d, or anytime.
results_wanted | number | Maximum number of jobs to collect (recommended default: 20 for fast runs).
maxPages | number | Stop after this many pages; leave empty to rely on results_wanted.
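
To make the relationship between keyword, location, posted_date, and a search URL concrete, here is a minimal sketch of how such a URL could be assembled. The function name build_search_url and the mapping of posted_date values onto Indeed's fromage (age in days) query parameter are assumptions for illustration; the actor's own URL-building logic is not documented here and may differ.

```python
from urllib.parse import urlencode

# Assumed mapping from posted_date values to a "fromage" (days) query value.
# This is illustrative only; the actor's real mapping is not documented.
POSTED_DATE_TO_DAYS = {"24h": 1, "7d": 7, "30d": 30, "anytime": None}


def build_search_url(keyword: str, location: str = "", posted_date: str = "anytime") -> str:
    """Build an Indeed search URL from keyword, location, and date filter."""
    if posted_date not in POSTED_DATE_TO_DAYS:
        raise ValueError(f"posted_date must be one of {sorted(POSTED_DATE_TO_DAYS)}")
    params = {"q": keyword}
    if location:
        params["l"] = location
    days = POSTED_DATE_TO_DAYS[posted_date]
    if days is not None:
        params["fromage"] = str(days)
    # urlencode escapes spaces as "+" and commas as "%2C".
    return "https://www.indeed.com/jobs?" + urlencode(params)


print(build_search_url("software engineer", "Remote", "7d"))
# https://www.indeed.com/jobs?q=software+engineer&l=Remote&fromage=7
```

A URL built this way can also be passed in startUrls, which is handy when one run should cover several keyword and location combinations.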

Scraping Options

Field | Type | Description
----- | ---- | -----------
collectDetails | boolean | Enable to fetch full job descriptions from detail pages (default: false).
cookiesJson | string | Browser cookies in JSON format (recommended). Example format: [{"name":"foo","value":"bar"}].
proxyConfiguration | object | Proxy settings. Residential Apify Proxy is recommended and prefilled by default.
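
Because several of these fields accept only specific values, it can help to check an input object locally before starting a run. The sketch below is one possible pre-flight check; validate_input is a hypothetical helper, and the rule that either startUrls or keyword must be present is an assumption, not something the actor is documented to enforce.

```python
import json
from typing import List

ALLOWED_POSTED_DATES = {"24h", "7d", "30d", "anytime"}


def validate_input(actor_input: dict) -> List[str]:
    """Return a list of problems found; an empty list means nothing was flagged."""
    problems = []
    # Assumed rule: a run needs at least one way to define the search.
    if not actor_input.get("startUrls") and not actor_input.get("keyword"):
        problems.append("provide startUrls or keyword")
    posted = actor_input.get("posted_date", "anytime")
    if posted not in ALLOWED_POSTED_DATES:
        problems.append(f"posted_date {posted!r} is not one of 24h, 7d, 30d, anytime")
    wanted = actor_input.get("results_wanted", 20)
    if isinstance(wanted, bool) or not isinstance(wanted, int) or wanted < 1:
        problems.append("results_wanted must be a positive integer")
    cookies = actor_input.get("cookiesJson")
    if cookies:
        try:
            parsed = json.loads(cookies)
        except json.JSONDecodeError:
            problems.append("cookiesJson is not valid JSON")
        else:
            ok = isinstance(parsed, list) and all(
                isinstance(c, dict) and "name" in c and "value" in c for c in parsed
            )
            if not ok:
                problems.append("cookiesJson must be a JSON array of name/value objects")
    return problems


print(validate_input({"keyword": "software engineer", "posted_date": "7d"}))  # []
```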

Example Input

{
  "startUrls": [
    "https://www.indeed.com/jobs?q=software+engineer&l=Remote",
    "https://www.indeed.com/jobs?q=data+scientist&l=New+York"
  ],
  "results_wanted": 100,
  "collectDetails": true,
  "cookiesJson": "[{\"name\":\"CTK\",\"value\":\"your_cookie_value\"}]",
  "proxyConfiguration": {
    "useApifyProxy": true,
    "groups": ["RESIDENTIAL"],
    "apifyProxyGroups": ["RESIDENTIAL"]
  }
}

This example scrapes up to 100 jobs across two queries, includes full descriptions, and uses both cookies and residential proxy settings for improved stability.

To obtain cookies, export them with a browser cookie extension and paste the export into cookiesJson exactly as JSON text:

  1. Install a cookie editor extension in Chrome/Brave (for example, EditThisCookie or Cookie-Editor).
  2. Open https://www.indeed.com, log in, and complete any consent or captcha prompts in the same browser session.
  3. Open the extension and export the cookies for indeed.com as a JSON array.
  4. If needed, keep only the name and value pairs, for example:
     [
       { "name": "CTK", "value": "..." },
       { "name": "JSESSIONID", "value": "..." }
     ]
  5. Paste that JSON array into the actor's cookiesJson input field.

cookiesJson is converted into a standard Cookie header and sent with list and detail requests. Using fresh login cookies with residential proxies is strongly recommended to reduce 403/429 blocking.
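
The conversion described above can be sketched in a few lines. This is an illustration of the general technique (joining name=value pairs with "; "), not the actor's actual code; cookies_json_to_header is a hypothetical helper name and the cookie values are placeholders.

```python
import json


def cookies_json_to_header(cookies_json: str) -> str:
    """Join the name/value pairs of a JSON cookie array into one Cookie header value."""
    cookies = json.loads(cookies_json)
    return "; ".join(f"{c['name']}={c['value']}" for c in cookies)


header = cookies_json_to_header(
    '[{"name": "CTK", "value": "abc"}, {"name": "JSESSIONID", "value": "xyz"}]'
)
print(header)  # CTK=abc; JSESSIONID=xyz
```

Extra keys that cookie extensions export (domain, path, expiry) are ignored by this sketch, which is why trimming the export down to name and value pairs is optional.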

Output

Data is saved to the default dataset in JSON format. Each record contains:

  • title (string): Job title.
  • company (string): Hiring company.
  • location (string): Job location.
  • postedAt (string): Posting date (e.g., "2 days ago").
  • salary (string): Salary details if available.
  • description_html (string): Full job description in HTML.
  • description_text (string): Plain text version of the description.
  • url (string): Direct link to the job posting.
  • source (string): Always "indeed".
  • search_url (string): The search page URL where the job was found.

Example output item:

{
  "title": "Senior Software Engineer",
  "company": "Tech Corp",
  "location": "Remote",
  "postedAt": "1 day ago",
  "salary": "$120,000 - $150,000 a year",
  "description_html": "<p>We are looking for...</p>",
  "description_text": "We are looking for a skilled engineer...",
  "url": "https://www.indeed.com/viewjob?jk=12345",
  "source": "indeed",
  "search_url": "https://www.indeed.com/jobs?q=software+engineer&l=Remote"
}
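
Since salary arrives as free text, downstream analysis usually needs it as numbers. The sketch below shows one way to pull a minimum, maximum, and pay period out of strings shaped like the example above. parse_salary is a hypothetical helper, and it assumes dollar amounts and phrases such as "a year" or "an hour"; other formats that Indeed may return are not handled.

```python
import re
from typing import Optional, Tuple

# Dollar amounts such as "$120,000" or "$25.50".
_AMOUNT = re.compile(r"\$\s*([\d,]+(?:\.\d+)?)")
# Pay period phrases such as "a year", "an hour", "per month".
_PERIOD = re.compile(r"\b(?:an?|per)\s+(hour|day|week|month|year)\b", re.IGNORECASE)


def parse_salary(salary: Optional[str]) -> Optional[Tuple[float, float, Optional[str]]]:
    """Return (min, max, period) for a salary string, or None if no amount is found."""
    text = salary or ""
    amounts = [float(raw.replace(",", "")) for raw in _AMOUNT.findall(text)]
    if not amounts:
        return None
    match = _PERIOD.search(text)
    period = match.group(1).lower() if match else None
    return min(amounts), max(amounts), period


print(parse_salary("$120,000 - $150,000 a year"))  # (120000.0, 150000.0, 'year')
```

A single amount such as "$25 an hour" yields the same value for both minimum and maximum.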

How to Run

  1. Set Up Input: Use the JSON schema above. Start with results_wanted: 20 for quick validation.
  2. Launch the Actor: Run via Apify Console, API, or CLI. Monitor logs for progress.
  3. Retrieve Results: Access the dataset after completion. Export to CSV/JSON for analysis.
  4. Optimize for Scale: Use cookiesJson + residential proxy configuration for more stable runs.

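For the CLI, start a published actor on the platform with apify call (apify run executes a local actor project in the current directory): apify call your-actor-id --input-file input.json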
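
For runs started through the API, a minimal sketch using only the Python standard library is shown below. It targets Apify's run-sync-get-dataset-items endpoint, which starts a run and returns the dataset items in the response, so it is best suited to short runs. The helper names build_run_request and run_actor, the actor ID format username~actor-name, and the token value are placeholders to replace with your own.

```python
import json
import urllib.request


def build_run_request(actor_id: str, token: str, run_input: dict) -> urllib.request.Request:
    """Prepare a POST request that runs the actor and returns its dataset items."""
    url = f"https://api.apify.com/v2/acts/{actor_id}/run-sync-get-dataset-items"
    return urllib.request.Request(
        url,
        data=json.dumps(run_input).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def run_actor(actor_id: str, token: str, run_input: dict) -> list:
    """Send the request and decode the JSON array of job records."""
    with urllib.request.urlopen(build_run_request(actor_id, token, run_input)) as response:
        return json.load(response)


# Example call (placeholders, not executed here):
# jobs = run_actor("username~actor-name", "YOUR_APIFY_TOKEN",
#                  {"keyword": "software engineer", "results_wanted": 20})
```

Larger scrapes are likely better served by starting the run asynchronously and reading the dataset once the run has finished.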

Best Practices & Troubleshooting

Optimizing Performance

  • Proxies: Always use residential proxies for high-volume scrapes to avoid CAPTCHAs.
  • Cookies: Use fresh login cookies in cookiesJson from a real browser session.
  • Rate Limits: If errors occur, pause runs or rotate IPs.

Common Issues

  • Incomplete Data: Check startUrls validity or increase results_wanted.
  • HTTP Errors (429/403): Enable residential proxies and refresh cookiesJson.
  • CAPTCHAs: Re-export fresh cookies after solving captcha in browser.
  • No Results: Verify keywords/location and ensure your input URL has live listings.
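
When a run fails with 403/429 errors, retrying immediately tends to hit the same block. One common approach, sketched here under assumed values, is to wait progressively longer between re-runs. backoff_delays is a hypothetical helper; the 30-second base and 10-minute cap are illustrative defaults, not recommendations from the actor's author.

```python
import random
from typing import Iterator


def backoff_delays(attempts: int, base: float = 30.0, cap: float = 600.0,
                   jitter: float = 0.2) -> Iterator[float]:
    """Yield one wait time in seconds per retry, doubling each time up to cap."""
    for attempt in range(attempts):
        delay = min(cap, base * 2 ** attempt)
        # Random jitter spreads retries out so they do not fire in lockstep.
        yield delay * (1 + random.uniform(-jitter, jitter))


for wait in backoff_delays(3):
    print(f"wait about {wait:.0f}s, refresh cookiesJson if needed, then re-run")
```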

Limitations

  • Scraping is subject to Indeed's terms of service; use responsibly.
  • Results may vary based on Indeed's layout changes or geo-blocking.
  • Full descriptions require additional requests, increasing run time.

Support

For issues or feature requests, check Apify's documentation or community forums. Ensure your runs comply with legal guidelines.