Indeed Job Scraper

Pricing

from $2.00 / 1,000 results

Extract structured job listings from Indeed across 62 markets with built-in change tracking for recurring monitoring.

Developer

Black Falcon Data

Maintained by Community

🔍 What is Indeed Job Scraper?

Indeed Job Scraper extracts structured job listings from indeed.com, including company metadata. The input is built around keyword search, location filters, and controllable result limits, so you can rerun the same search universe consistently over time.

indeed.com is publicly accessible, but it does not offer the kind of structured export most teams need for recurring data workflows. This actor bridges that gap by turning search results into clean JSON with a consistent schema that is easier to reuse in dashboards, enrichment pipelines, and agent workflows.

🎯 What you can do with this actor

  • Build richer employer datasets with company profiles, ratings, social links, and career-page signals where the source exposes them.
  • Feed compact listing data into AI agents, MCP tools, and ranking workflows without carrying full raw payloads every time.
  • Start with lightweight search runs, then enable detail enrichment only when you need deeper company or listing context.

✨ Why choose this actor?

| Feature | This actor | Typical alternatives |
| --- | --- | --- |
| Enrichment depth | Structured company signals where the source exposes them | Often limited to title, company, and URL |
| Collection strategy | Can stay lightweight or add enrichment only when needed | Often fixed to one scraping mode |
| AI-agent usability | Compact output mode for smaller, more controllable payloads | Often full payload only |
| Schema quality | Keeps company metadata in a consistent output shape | Often inconsistent across runs |

🚀 Quick start

Basic search:

{
  "query": "software developer",
  "country": "US",
  "maxResults": 50,
  "maxPages": 5,
  "sort": "relevance",
  "includeDetails": true,
  "compact": false,
  "includeCompanyProfile": false,
  "incrementalMode": false,
  "emitUnchanged": false,
  "emitExpired": false
}

With enrichment:

{
  "query": "software developer",
  "country": "US",
  "maxResults": 50,
  "maxPages": 5,
  "sort": "relevance",
  "includeDetails": true,
  "compact": false,
  "includeCompanyProfile": true,
  "incrementalMode": false,
  "emitUnchanged": false,
  "emitExpired": false
}

Incremental monitoring:

{
  "query": "software developer",
  "country": "US",
  "maxResults": 50,
  "maxPages": 5,
  "sort": "relevance",
  "includeDetails": true,
  "compact": false,
  "includeCompanyProfile": false,
  "incrementalMode": true,
  "emitUnchanged": false,
  "emitExpired": false,
  "stateKey": "daily-monitor"
}
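
If you assemble these inputs programmatically, a small helper can apply the documented defaults and enforce the two rules stated in the input reference: query is required unless startUrls is provided, and stateKey is required when incrementalMode is true. The sketch below is illustrative only; the function name build_run_input and its keyword arguments are hypothetical, and the element shape of startUrls is passed through untouched because this page does not specify it.

```python
from typing import Any, Optional


def build_run_input(
    query: Optional[str] = None,
    start_urls: Optional[list] = None,
    incremental: bool = False,
    state_key: Optional[str] = None,
    **overrides: Any,
) -> dict:
    """Assemble an input object from the defaults shown in the quick start."""
    if not query and not start_urls:
        raise ValueError("query is required unless startUrls is provided")
    if incremental and not state_key:
        raise ValueError("stateKey is required when incrementalMode is true")

    run_input = {
        "country": "US",
        "maxResults": 50,
        "maxPages": 5,
        "sort": "relevance",
        "includeDetails": True,
        "compact": False,
        "includeCompanyProfile": False,
        "incrementalMode": incremental,
        "emitUnchanged": False,
        "emitExpired": False,
    }
    if query:
        run_input["query"] = query
    if start_urls:
        # Passed through as given; the element shape is not documented here.
        run_input["startUrls"] = start_urls
    if state_key:
        run_input["stateKey"] = state_key
    # Explicit overrides win over the defaults above.
    run_input.update(overrides)
    return run_input


monitor_input = build_run_input(
    "software developer", incremental=True, state_key="daily-monitor"
)
```

Failing early on a missing stateKey keeps a scheduled monitoring run from starting with an input the actor would reject.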

📊 Sample output

{
  "title": "Example title",
  "company": "Example company",
  "location": "Example location",
  "url": "https://indeed.com"
}

⚙️ Input reference

Search

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| query | string | | Job search keywords. Single string or JSON array for multi-query (e.g. ["software engineer", "data analyst"]). Required unless startUrls is provided. |
| country | enum | "US" | Which Indeed domain to search (62 markets). Major markets (US, UK, DE, FR, etc.) are well-tested. Smaller markets are config-supported but listing volume varies. |
| location | string | | City, state, or region to search within. Single string or JSON array for multi-location (e.g. ["New York", "San Francisco"]). Leave empty for nationwide results. |
| startUrls | array | | Direct Indeed search or job detail URLs. Use instead of or alongside query. Accepts search pages (indeed.com/jobs?q=...) and job pages (indeed.com/viewjob?jk=...). |
| maxResults | integer | 50 | Maximum total job listings to return across all search sources. |
| maxPages | integer | 5 | Maximum SERP pages to scrape per search source. Each page typically contains 15 results. |
| postedDays | integer | | Only return jobs posted within this many days. Automatically snapped to nearest valid value: 1, 3, 7, or 14. |
| remoteFilter | enum | | Filter jobs by remote work availability. |
| jobType | enum | | Filter by employment type. |
| radius | integer | | Search radius around the specified location. Only applies when location is set. Valid values: 5, 10, 15, 25, 35, 50, 100. |
| sort | enum | "relevance" | Sort results by relevance (default) or by posting date (newest first). |

Enrichment

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| includeDetails | boolean | true | Fetch each job's detail page for full description, JSON-LD data, requirements, benefits, and hiring signals. Set to false for fast SERP-only scraping. |
| compact | boolean | false | Output only ~12 core fields (jobId, title, company, location, salary, description, URL, dates, remote status). Ideal for AI agents, MCP workflows, and LLM context windows where token budget matters. |
| descriptionMaxLength | integer | | Truncate job descriptions to this many characters (adds '...' suffix). Set to 0 to omit descriptions entirely. Leave empty for full descriptions. Useful for reducing payload size in API integrations and AI pipelines. |
| includeCompanyProfile | boolean | false | Fetch each unique company's /cmp/ page for industry, employee count, headquarters, revenue, and more. Cached per company within each run. |

Incremental Tracking

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| incrementalMode | boolean | false | Compare current results against stored state from a previous run. Each job is classified as NEW, UPDATED, UNCHANGED, EXPIRED, or REAPPEARED. Requires stateKey to be set. |
| stateKey | string | | Stable identifier for the search universe being tracked. Use a descriptive key like "us-software-nyc" or "de-data-berlin". Different queries/locations should use different keys to avoid state cross-contamination. Required when incrementalMode is true. |
| emitUnchanged | boolean | false | When incremental mode is active, also output jobs that haven't changed since the last run. Default: only NEW, UPDATED, and REAPPEARED jobs are emitted. |
| emitExpired | boolean | false | When incremental mode is active, also output jobs from the previous state that were not found in the current run (marked as EXPIRED). |
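
Two of the numeric rules above are easy to check before a run: postedDays is snapped to the nearest of 1, 3, 7, or 14, and a SERP page typically holds 15 results, which bounds how many pages a given maxResults needs. The sketch below illustrates both. The function names are hypothetical, and the tie-breaking rule (an equidistant value snaps to the smaller option) is an assumption, since this page does not say how the actor resolves ties.

```python
import math

VALID_POSTED_DAYS = (1, 3, 7, 14)
RESULTS_PER_PAGE = 15  # "typically", per the input reference; not guaranteed


def snap_posted_days(days: int) -> int:
    """Return the nearest valid postedDays value.

    Assumption: ties (e.g. 2 or 5) resolve to the smaller valid value.
    """
    return min(VALID_POSTED_DAYS, key=lambda v: (abs(v - days), v))


def pages_needed(max_results: int, per_page: int = RESULTS_PER_PAGE) -> int:
    """Estimate the SERP pages required per search source for max_results."""
    return math.ceil(max_results / per_page)


# With the defaults (maxResults=50, maxPages=5), about 4 pages are enough,
# so maxPages is not the limiting factor for a single query and location.
estimated_pages = pages_needed(50)
```

If the estimate exceeds maxPages, the run will likely return fewer listings than maxResults for that search source.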

📦 Output fields

Each result can include company metadata, depending on listing content and the enrichment options enabled for the run.

This actor returns structured dataset items through the default Apify dataset output. See the sample output above for the practical field shape.

⚠️ Known limitations

  • Contact information is only returned when the source exposes it directly; many listings will still rely on apply URLs rather than named contacts.
  • Field population rates always depend on the source site itself, so null values are normal for data points the source does not publish on every listing.

💰 How much does it cost to scrape Indeed?

This actor uses pay-per-event pricing, so you pay a small run-start fee and then only for results that are actually emitted.

| Event | Price | When |
| --- | --- | --- |
| actor-start | $0.01 | Each run |
| result | $0.002 | Per emitted record |

Example costs:

| Scenario | Results | Cost |
| --- | --- | --- |
| Quick test | 10 | $0.03 |
| Daily monitor | 50 | $0.11 |
| Full scrape | 500 | $1.01 |
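
The example costs follow directly from the two event prices: one actor-start fee per run plus the per-result fee for each emitted record. A minimal sketch of that arithmetic, using Decimal to avoid floating-point rounding (the function name estimate_cost is hypothetical):

```python
from decimal import Decimal

ACTOR_START_FEE = Decimal("0.01")   # charged once per run
PER_RESULT_FEE = Decimal("0.002")   # charged per emitted record


def estimate_cost(results: int, runs: int = 1) -> Decimal:
    """Estimated charge in USD for `results` emitted records over `runs` runs."""
    return ACTOR_START_FEE * runs + PER_RESULT_FEE * results


quick_test = estimate_cost(10)      # 0.01 + 10 * 0.002
daily_monitor = estimate_cost(50)   # 0.01 + 50 * 0.002
full_scrape = estimate_cost(500)    # 0.01 + 500 * 0.002
```

Because only emitted records are billed, an incremental run that emits few new or updated listings costs less than a full export of the same search.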

💡 Use cases

Recruiting and sourcing

Pull indeed.com job listings into dashboards, triage queues, or recruiter workflows without re-normalizing the source on every run.

Recurring monitoring

Track only newly posted or changed listings on scheduled runs, which is better suited to alerts and daily pipeline jobs than repeated full exports.

Outreach and hiring-intent research

Use employer, contact, and apply fields to support account research, outreach queues, or company watchlists when the source provides those details.

AI-agent and MCP workflows

Feed compact listing data into ranking, summarization, classification, or agent pipelines without burning unnecessary context on large descriptions.

🤖 AI-agent and MCP usage

This actor is suitable for AI-agent workflows because the output is structured and the input can intentionally reduce payload size for downstream tools.

  • compact returns a smaller core schema for ranking, classification, and MCP tool calls.
  • descriptionMaxLength lets you cap description size so larger batches stay practical in model context windows.

Example input for a compact, agent-facing run:

{
  "query": "software developer",
  "country": "US",
  "maxResults": 10,
  "maxPages": 5,
  "sort": "relevance",
  "includeDetails": true,
  "compact": true,
  "includeCompanyProfile": false,
  "incrementalMode": false,
  "emitUnchanged": false,
  "emitExpired": false,
  "descriptionMaxLength": 300
}
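
The three documented behaviors of descriptionMaxLength (truncate with a '...' suffix, 0 to omit, empty for the full text) can be mirrored client-side, for example when trimming records that were fetched without the option. This is a sketch rather than the actor's own code: the function name is hypothetical, and it assumes the suffix is appended after the cut, so a truncated description is up to three characters longer than the limit.

```python
from typing import Optional


def truncate_description(text: str, max_length: Optional[int]) -> Optional[str]:
    """Apply descriptionMaxLength-style trimming to one description."""
    if max_length is None:
        return text            # option left empty: keep the full description
    if max_length == 0:
        return None            # 0: omit the description entirely
    if len(text) <= max_length:
        return text            # already short enough, no suffix added
    # Assumption: the '...' suffix is added on top of max_length characters.
    return text[:max_length] + "..."


preview = truncate_description("Senior backend role, remote-friendly", 6)
```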

🔄 Incremental mode

Incremental mode is intended for repeated monitoring runs where only new or changed listings should be emitted.

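| Change type | Meaning |
| --- | --- |
| NEW | First time seen in the monitored result set |
| UPDATED | Previously seen listing with updated content |
| UNCHANGED | Same listing and content as a prior run; emitted only when emitUnchanged is enabled |
| EXPIRED | Listing disappeared from the monitored result set; emitted only when emitExpired is enabled |
| REAPPEARED | Listing that dropped out of an earlier run and is found again |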
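
To make the change types concrete, the comparison can be pictured as a diff between stored state and the current result set, keyed by jobId. The sketch below is not the actor's implementation. It assumes a hypothetical state shape (jobId mapped to a content hash and an expired flag) and a SHA-256 fingerprint of the whole record, and it uses the labels from the input reference (NEW, UPDATED, UNCHANGED, EXPIRED, REAPPEARED).

```python
import hashlib
import json


def fingerprint(job: dict) -> str:
    """Stable content hash of a listing (assumed change-detection method)."""
    payload = json.dumps(job, sort_keys=True, default=str)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def classify(previous: dict, current_jobs: list) -> dict:
    """Map each jobId to a change type.

    `previous` is hypothetical stored state: jobId -> {"hash": str, "expired": bool}.
    """
    changes = {}
    seen = set()
    for job in current_jobs:
        job_id = job["jobId"]
        seen.add(job_id)
        old = previous.get(job_id)
        if old is None:
            changes[job_id] = "NEW"
        elif old["expired"]:
            changes[job_id] = "REAPPEARED"
        elif old["hash"] != fingerprint(job):
            changes[job_id] = "UPDATED"
        else:
            changes[job_id] = "UNCHANGED"
    # Anything tracked as live before but missing now has expired.
    for job_id, old in previous.items():
        if job_id not in seen and not old["expired"]:
            changes[job_id] = "EXPIRED"
    return changes
```

Keeping one state per stateKey matters here: mixing two searches in the same state would mark every listing from the other search as EXPIRED.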

📖 How to scrape Indeed with this actor

  1. Open the actor in Apify Console and review the input schema.
  2. Enter your search query and location settings, then set maxResults for the amount of data you need.
  3. Enable optional enrichment (includeDetails, includeCompanyProfile) only when you need richer output such as full descriptions or company data.
  4. Run the actor and export the dataset as JSON, CSV, or Excel for downstream analysis.
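
The same steps can be scripted instead of run from the Console. The sketch below assumes the official apify-client Python package (pip install apify-client) and an API token. ACTOR_ID is a placeholder because this page does not state the actor's ID, and the helper names run_and_fetch and to_csv are hypothetical. The CSV columns are the four fields shown in the sample output.

```python
import csv
import io

ACTOR_ID = "<username>/<actor-name>"  # placeholder: copy the real ID from the Store page


def run_and_fetch(run_input: dict, token: str, actor_id: str = ACTOR_ID) -> list:
    """Start a run, wait for it to finish, and return the dataset items."""
    from apify_client import ApifyClient  # imported lazily; needs apify-client

    client = ApifyClient(token)
    run = client.actor(actor_id).call(run_input=run_input)
    if run is None:
        raise RuntimeError("actor run did not return a result")
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


def to_csv(items: list, fields=("title", "company", "location", "url")) -> str:
    """Serialize selected fields of each item to CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(fields))
    writer.writeheader()
    for item in items:
        # Missing fields become empty cells; extra fields are dropped.
        writer.writerow({name: item.get(name, "") for name in fields})
    return buffer.getvalue()
```

A typical call would be to_csv(run_and_fetch({"query": "software developer", "country": "US"}, token)), with the token read from an environment variable rather than hard-coded.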

❓ FAQ

What data does this actor return from indeed.com?

It returns structured listing records. Core fields include jobId, title, company, location, salary, description, URL, dates, and remote status, with company metadata added when enrichment is enabled and the source exposes it.

Can I fetch full descriptions and detail fields?

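Yes. Set includeDetails to true for full descriptions, requirements, benefits, and hiring signals from each job's detail page, and set includeCompanyProfile to true for employer metadata from the company page. Contact details appear only when the listing exposes them.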

Does it support recurring monitoring?

Yes. Incremental mode is built for recurring runs where you only want newly seen or changed listings instead of a full repeat dataset every time.

Is it suitable for AI agents or MCP workflows?

Yes. The compact and descriptionMaxLength options make it easier to use the actor in AI-agent workflows where predictable fields matter more than raw page size.

Why use this actor instead of scraping the site ad hoc?

Because it already handles repeatable source access, keeps a stable schema, and exposes filters and enrichment options in a form that is easier to automate repeatedly.

This actor is intended for publicly accessible data workflows. Always review the target site terms and your own legal requirements for the way you plan to use the data.