Indeed Jobs Scraper

Extract public Indeed job listings into clean Apify datasets for recruiting intelligence, talent pipeline building, labor-market research, competitor hiring monitoring, and job-board aggregation.

This actor searches Indeed by keyword, location, country site, posting age, radius, job type, remote filter, experience level, and sort order. It can also scrape direct Indeed search URLs or /viewjob URLs. Results are normalized into a stable schema with job identity, company details, location, salary, descriptions, apply links, and scraping diagnostics.

Why Choose This Actor

  • Pay per saved result: pricing is $3.00 / 1,000 saved jobs.
  • Search-only or enriched output: disable detail pages for speed, or enable detail pages for full descriptions and richer company/apply data.
  • Stable dataset schema: missing fields are returned as null or empty arrays instead of changing the output shape.
  • Block-aware request flow: when a plain HTTP request receives a 403 response, the actor retries it with browser TLS impersonation.
  • Deduplication included: duplicate jobs across pages and generated search segments are skipped by default.
  • Buyer-visible diagnostics: each run writes RUN_SUMMARY with saved count, duplicate count, filtered count, stop reason, and field coverage.
  • Migration-friendly inputs: common snake_case aliases are accepted for easier integration.
  • Flexible proxy support: works with Apify Proxy, a custom proxyUrl, DEFAULT_PROXY_URL, and DataImpulse environment variables.

Common Use Cases

  • Talent acquisition: build candidate sourcing lists by role, location, and posting age.
  • Competitor monitoring: track which companies are hiring, where, and for which roles.
  • Compensation research: collect salary text and normalized salary bounds where Indeed exposes them.
  • Labor-market analytics: compare job demand by geography, keyword, company, and date window.
  • Job-board aggregation: feed normalized records into search, alerts, dashboards, or enrichment pipelines.
  • Sales prospecting: find companies actively hiring for roles that indicate buying intent.

Quick Start

Search for jobs by keyword and location:

{
  "query": "data analyst",
  "location": "Toronto, ON",
  "country": "ca.indeed.com",
  "maxItems": 100,
  "scrapeDetailPages": true
}

Use search-only mode for lower cost and faster exports:

{
  "query": "software engineer",
  "location": "San Francisco, CA",
  "country": "www.indeed.com",
  "maxItems": 500,
  "scrapeDetailPages": false,
  "maxConcurrency": 5
}

Scrape a precise Indeed URL:

{
  "startUrls": [
    "https://www.indeed.com/jobs?q=product+manager&l=New+York&fromage=1"
  ],
  "maxItems": 250,
  "scrapeDetailPages": true
}
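
A start URL like the one above can also be assembled in code instead of being pasted by hand. The following is a minimal Python sketch, not part of the actor: `build_search_url` is a hypothetical helper, and the parameter names `q`, `l`, and `fromage` are simply read off the example URL (with `fromage` assumed to carry the posting-age filter in days).

```python
from typing import Optional
from urllib.parse import urlencode


def build_search_url(host: str, query: str, location: str,
                     posted_within_days: Optional[int] = None) -> str:
    """Build an Indeed search URL shaped like the startUrls example."""
    params = {"q": query, "l": location}
    if posted_within_days is not None:
        # Assumption: "fromage" is the posting-age filter seen in the example URL.
        params["fromage"] = posted_within_days
    # urlencode uses quote_plus by default, so spaces become "+".
    return f"https://{host}/jobs?{urlencode(params)}"


run_input = {
    "startUrls": [build_search_url("www.indeed.com", "product manager", "New York", 1)],
    "maxItems": 250,
    "scrapeDetailPages": True,
}
```

The resulting `run_input` dictionary serializes to the same JSON as the example input.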

Output

Each saved item represents one Indeed job listing. The schema stays consistent across search-only and detail-enriched runs.

{
  "data_source": "Indeed",
  "job_key": "abc123",
  "job_url": "https://ca.indeed.com/viewjob?jk=abc123",
  "apply_url": "https://ca.indeed.com/applystart?jk=abc123",
  "title": "Data Analyst",
  "company_name": "Example Co",
  "company_url": "https://ca.indeed.com/cmp/example-co",
  "company_logo_url": null,
  "company_header_url": null,
  "company_rating": 4.2,
  "company_review_count": 128,
  "company_active_jobs_count": null,
  "company_industry": null,
  "company_size": null,
  "company_revenue": null,
  "company_headquarters": null,
  "company_ceo": null,
  "location": "Toronto, ON",
  "city": "Toronto",
  "country": "CA",
  "street_address": null,
  "postal_code": null,
  "latitude": 43.6532,
  "longitude": -79.3832,
  "salary": "$80,000 a year",
  "salary_min": 80000,
  "salary_max": null,
  "salary_currency": "CAD",
  "job_type": "Full-time",
  "posted_at": "2026-06-10",
  "relative_time": "3 days ago",
  "description_text": "Build hiring analytics dashboards.",
  "description_html": "<p>Build hiring analytics dashboards.</p>",
  "benefits": ["Dental insurance"],
  "attributes": ["Python", "SQL"],
  "requirements": [],
  "expired": false,
  "indeed_apply": true,
  "is_sponsored": false,
  "detail_status": "fetched",
  "detail_error": null,
  "scraping_page": 1,
  "scraping_index": 0,
  "input_query": "data analyst",
  "input_location": "Toronto, ON",
  "input_country": "ca.indeed.com"
}

Data Fields

| Group | Fields |
| --- | --- |
| Job identity | data_source, job_key, job_url, apply_url |
| Job content | title, description_text, description_html, job_type, posted_at, relative_time |
| Company | company_name, company_url, company_logo_url, company_header_url, company_rating, company_review_count, company_active_jobs_count, company_industry, company_size, company_revenue, company_headquarters, company_ceo |
| Location | location, city, country, street_address, postal_code, latitude, longitude |
| Compensation | salary, salary_min, salary_max, salary_currency, benefits |
| Classification | attributes, requirements, expired, indeed_apply, is_sponsored |
| Diagnostics | detail_status, detail_error, scraping_page, scraping_index, input_query, input_location, input_country |

Indeed does not expose every field on every job, country site, or result page. Unavailable values are returned as null or [].
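
Because every key is always present, downstream code can index fields directly and only needs to handle null (Python `None`) and empty lists. A minimal sketch of that pattern follows; `summarize` is a hypothetical helper, and the sample record is a trimmed copy of the example output with one extra made-up record for the empty case.

```python
def summarize(item: dict) -> dict:
    """Flatten one job record into a small, null-safe row."""
    return {
        "job_key": item["job_key"],
        "title": item["title"],
        "company": item["company_name"] or "(unknown)",
        # A salary is considered present if either bound was extracted.
        "has_salary": item["salary_min"] is not None or item["salary_max"] is not None,
        # An empty list joins to an empty string, so no special case is needed.
        "benefits": ", ".join(item["benefits"]),
    }


sample = {
    "job_key": "abc123", "title": "Data Analyst", "company_name": "Example Co",
    "salary_min": 80000, "salary_max": None, "benefits": ["Dental insurance"],
}
sparse = {
    "job_key": "def456", "title": "Analyst", "company_name": None,
    "salary_min": None, "salary_max": None, "benefits": [],
}
rows = [summarize(sample), summarize(sparse)]
```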

Detail Pages

Set scrapeDetailPages based on your workflow:

  • false: faster, lower-cost search result extraction. Good for large candidate lists, job counts, and market snapshots.
  • true: fetches each job detail page for full descriptions, richer apply links, and additional company fields where available.

Detail-enriched records include detail_status:

| Status | Meaning |
| --- | --- |
| fetched | Detail page was fetched and a readable description was found. |
| fetched_no_description | Detail page was fetched, but no description field was extractable. |
| failed | Detail fetch or parsing failed. detail_error contains a short diagnostic. |
| no_url | The record did not include a usable job URL. |
| not_requested | scrapeDetailPages was disabled. |
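
One way to use these statuses after a run is to tally them and pull out the failed records with their diagnostics. The sketch below assumes the dataset items have already been loaded as a list of dictionaries; `detail_report` and the sample records (including the error text) are hypothetical.

```python
from collections import Counter


def detail_report(items: list) -> tuple:
    """Count records per detail_status and list (job_key, detail_error) for failures."""
    counts = Counter(item["detail_status"] for item in items)
    failures = [
        (item["job_key"], item["detail_error"])
        for item in items
        if item["detail_status"] == "failed"
    ]
    return counts, failures


items = [
    {"job_key": "a1", "detail_status": "fetched", "detail_error": None},
    {"job_key": "b2", "detail_status": "fetched_no_description", "detail_error": None},
    {"job_key": "c3", "detail_status": "failed", "detail_error": "timeout"},  # made-up diagnostic
    {"job_key": "d4", "detail_status": "fetched", "detail_error": None},
]
counts, failures = detail_report(items)
```

A high share of failed or fetched_no_description records is a hint to lower maxConcurrency or review proxy settings before rerunning.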

Inputs

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| startUrls | array | empty | Indeed search result URLs or direct /viewjob URLs. |
| query | string | data analyst | Job keywords used when startUrls is empty. |
| location | string | New York, NY | City, region, country, or postal code. |
| country | string | www.indeed.com | Indeed host or alias such as us, ca, uk, www.indeed.com, ca.indeed.com. |
| maxItems | integer | 100 | Maximum unique job records to save. |
| startPage | integer | 1 | First search results page, using 1-indexed page numbers. |
| endPage | integer | empty | Last search results page. Leave empty to continue until maxItems is reached. |
| pageSize | integer | 15 | Expected jobs per search page, used for pagination offsets and stopping logic. |
| postedWithinDays | integer | 7 | Indeed date filter. Common values are 1, 3, 7, and 14. |
| radiusKm | integer | 25 | Search radius around the location. |
| jobType | string | any | any, full-time, part-time, contract, internship, or temporary. |
| remoteWorkType | string | any | any, remote, hybrid, or on-site. remote maps to Indeed's remote-only URL filter. |
| sort | string | relevance | relevance or date. |
| experienceLevel | string | any | any, entry-level, mid-level, or senior-level where Indeed supports it. |
| excludeEmployers | array | empty | Employer names or substrings to exclude after extraction. |
| excludeSalaryTypes | array | empty | Salary cadence labels to exclude after extraction, such as hourly. |
| dedupeJobs | boolean | true | Skip duplicate jobs across pages and generated search segments. |
| scrapeDetailPages | boolean | true | Fetch each job page for description and richer apply/company data. |
| maxConcurrency | integer | 20 | Parallel request limit. Lower this if the target site starts rate limiting. |
| requestDelayMillis | integer | 0 | Optional delay after successful requests. |
| requestRetries | integer | 2 | Retry count for failed requests. |
| retryDelayMillis | integer | 1000 | Delay between retries. |
| requestTimeoutSecs | integer | 30 | HTTP timeout per request. |
| proxyUrl | string | empty | Optional custom proxy URL. |
| proxyConfiguration | object | { "useApifyProxy": true } | Apify Proxy settings. |

Country aliases include us, usa, ca, canada, uk, gb, au, de, fr, and es.
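
When choosing maxItems, startPage, and endPage together, it helps to know roughly how many result pages a target needs. The sketch below is plain arithmetic on the documented pageSize default of 15, not the actor's pagination code; `min_pages_needed` is a hypothetical helper, and it gives a lower bound because duplicates and filtered jobs reduce the number of records saved per page.

```python
def min_pages_needed(max_items: int, page_size: int = 15) -> int:
    """Lower bound on search pages required to save max_items unique jobs."""
    if max_items < 0 or page_size <= 0:
        raise ValueError("max_items must be >= 0 and page_size must be > 0")
    # Integer ceiling division avoids floating-point rounding.
    return (max_items + page_size - 1) // page_size


pages_default = min_pages_needed(100)   # default maxItems with default pageSize
pages_large = min_pages_needed(1000)    # a larger export
```

If endPage is set below this bound, the run cannot reach maxItems from a single search.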

Advanced Search Segments

Indeed may stop returning new jobs for one broad search even when more results appear to exist. For larger exports, split the run into smaller search segments and let the actor deduplicate across them.

{
  "query": "data analyst",
  "queryVariants": ["business analyst", "analytics engineer", "reporting analyst"],
  "location": "Toronto, ON",
  "locationVariants": ["Mississauga, ON", "Markham, ON", "Vaughan, ON"],
  "country": "ca.indeed.com",
  "postedWithinDays": 14,
  "postedWithinDaysSegments": [7, 3, 1],
  "radiusKm": 25,
  "radiusKmSegments": [50],
  "maxSearchSegments": 20,
  "maxItems": 1000,
  "scrapeDetailPages": false
}

Segment inputs:

| Option | Description |
| --- | --- |
| queryVariants | Extra keyword searches. The main query is tried first. |
| locationVariants | Extra locations. The main location is tried first. |
| postedWithinDaysSegments | Extra date windows, such as 1, 3, 7, 14. |
| radiusKmSegments | Extra radius values. |
| maxSearchSegments | Maximum generated search URLs from query, location, date, and radius combinations. |
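
To see why maxSearchSegments matters, it can help to count the combinations an input implies. The sketch below enumerates them with a Cartesian product and truncates at the cap; it is an illustration only. The documentation states that the main query and location are tried first, but the exact order in which the actor walks the remaining combinations is an assumption here, and `plan_segments` is a hypothetical helper.

```python
from itertools import islice, product


def plan_segments(run_input: dict) -> tuple:
    """Return (total combinations, capped list of (query, location, days, radius))."""
    queries = [run_input["query"], *run_input.get("queryVariants", [])]
    locations = [run_input["location"], *run_input.get("locationVariants", [])]
    days = [run_input["postedWithinDays"], *run_input.get("postedWithinDaysSegments", [])]
    radii = [run_input["radiusKm"], *run_input.get("radiusKmSegments", [])]
    total = len(queries) * len(locations) * len(days) * len(radii)
    # Assumed ordering: main values first, last dimension varying fastest.
    combos = product(queries, locations, days, radii)
    return total, list(islice(combos, run_input.get("maxSearchSegments")))


total, planned = plan_segments({
    "query": "data analyst",
    "queryVariants": ["business analyst", "analytics engineer", "reporting analyst"],
    "location": "Toronto, ON",
    "locationVariants": ["Mississauga, ON", "Markham, ON", "Vaughan, ON"],
    "postedWithinDays": 14,
    "postedWithinDaysSegments": [7, 3, 1],
    "radiusKm": 25,
    "radiusKmSegments": [50],
    "maxSearchSegments": 20,
})
```

For the example input, 4 queries, 4 locations, 4 date windows, and 2 radii imply 128 combinations, of which only 20 fit under the cap, so the most valuable variants should be listed first.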

Proxy And Blocking Behavior

The actor uses Apify Proxy by default through proxyConfiguration. You can also provide:

  • proxyUrl: direct proxy URL in input.
  • DEFAULT_PROXY_URL: full proxy URL as an environment variable or secret.
  • DATAIMPULSE_PROXY_URL or DATAIMPULSE_URL: full DataImpulse proxy URL.
  • DATAIMPULSE_PROXY_HOST, DATAIMPULSE_PROXY_PORT, DATAIMPULSE_PROXY_USERNAME, DATAIMPULSE_PROXY_PASSWORD: split proxy credentials.
  • Short DataImpulse aliases: DATAIMPULSE_HOST, DATAIMPULSE_PORT, DATAIMPULSE_USER, DATAIMPULSE_PASS, and optional DATAIMPULSE_SCHEME.

When a request receives an HTTP 403 response, the actor retries it with browser TLS impersonation. If Indeed changes its protections, runs may still return fewer jobs or stop early; in that case, check RUN_SUMMARY.stop_reason, detail_status, and detail_error.
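
The split DataImpulse variables have to be combined into a single proxy URL at some point. The sketch below shows one plausible way to do that; it is not the actor's implementation. The precedence between variables, the default `http` scheme, and the `dataimpulse_proxy_url` helper are all assumptions, and the host and credentials are placeholders.

```python
from typing import Mapping, Optional
from urllib.parse import quote


def dataimpulse_proxy_url(env: Mapping[str, str]) -> Optional[str]:
    """Assemble a proxy URL from DataImpulse-style variables (assumed precedence)."""
    full = env.get("DATAIMPULSE_PROXY_URL") or env.get("DATAIMPULSE_URL")
    if full:
        return full
    host = env.get("DATAIMPULSE_PROXY_HOST") or env.get("DATAIMPULSE_HOST")
    port = env.get("DATAIMPULSE_PROXY_PORT") or env.get("DATAIMPULSE_PORT")
    user = env.get("DATAIMPULSE_PROXY_USERNAME") or env.get("DATAIMPULSE_USER")
    password = env.get("DATAIMPULSE_PROXY_PASSWORD") or env.get("DATAIMPULSE_PASS")
    scheme = env.get("DATAIMPULSE_SCHEME", "http")  # assumption: http when unset
    if not (host and port):
        return None
    auth = ""
    if user and password:
        # Percent-encode credentials so characters like "@" do not break the URL.
        auth = f"{quote(user, safe='')}:{quote(password, safe='')}@"
    return f"{scheme}://{auth}{host}:{port}"


proxy_url = dataimpulse_proxy_url({
    "DATAIMPULSE_HOST": "proxy.example.com",  # placeholder host
    "DATAIMPULSE_PORT": "8000",
    "DATAIMPULSE_USER": "user",
    "DATAIMPULSE_PASS": "p@ss",
})
```

In real use the mapping would be `os.environ`; passing it explicitly keeps the helper easy to test.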

Billing And Max Charge Safety

This actor is billed per saved job listing:

  • Price: $0.003 per saved dataset item.
  • Displayed price: $3.00 / 1,000 saved jobs.
  • Empty pages, duplicate-only pages, blocked pages, failed detail requests, and diagnostics are not saved as paid job results.

The actor's RUN_SUMMARY.billable_saved_jobs equals the number of saved dataset records.

If you set a maximum run charge, also set maxItems so that the actor cannot save more paid records than you intended. A safe estimate is:

maxItems = floor(maximum_run_charge / 0.003)

Examples:

| Maximum charge | Safe maxItems |
| --- | --- |
| $3 | 1000 |
| $15 | 5000 |
| $30 | 10000 |
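
The same calculation can be scripted. The sketch below uses `Decimal` rather than floats, since 0.003 has no exact binary representation and a float division could land just below a whole number before flooring. `safe_max_items` is a hypothetical helper; the price comes from the billing section above.

```python
from decimal import Decimal

PRICE_PER_ITEM = Decimal("0.003")  # USD per saved dataset item


def safe_max_items(max_charge_usd: str) -> int:
    """floor(maximum_run_charge / 0.003), computed exactly."""
    charge = Decimal(max_charge_usd)
    if charge < 0:
        raise ValueError("maximum charge must not be negative")
    # For non-negative values, Decimal floor division matches floor().
    return int(charge // PRICE_PER_ITEM)


limits = {charge: safe_max_items(charge) for charge in ("3", "10", "15", "30")}
```

A $10 cap, which is not in the table, works out to 3333 items.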

Run Summary And Quality Checks

Every run writes a RUN_SUMMARY key-value store record with:

  • requested max items
  • pages attempted
  • search URLs planned and attempted
  • saved count
  • duplicate count
  • filtered count
  • no-job pages
  • stop reason
  • field coverage by key field
  • billable saved jobs

Stop reasons:

| Stop reason | Meaning |
| --- | --- |
| max_items_reached | The requested maxItems was saved. |
| end_page_reached | Pagination reached endPage or a short final page. |
| detail_url_completed | A direct /viewjob URL was processed. |
| no_jobs_found | A page produced no extractable jobs. |
| no_new_unique_jobs | A page only contained jobs already saved in the run. |
| completed | All start URLs were processed without another stop reason. |

Use RUN_SUMMARY.field_coverage to check whether descriptions, salaries, company fields, geodata, benefits, apply URLs, and detail statuses are being populated for a run.
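
The internal layout of RUN_SUMMARY.field_coverage is not shown here, so the sketch below does not parse it. Instead it recomputes a comparable figure from the dataset items themselves, treating null and empty lists as unpopulated, which can serve as an independent cross-check. `field_coverage` and the sample records are hypothetical.

```python
def field_coverage(items: list, fields: list) -> dict:
    """Fraction of items in which each field is populated (not None, [] or "")."""
    total = len(items)
    coverage = {}
    for field in fields:
        filled = sum(1 for item in items if item.get(field) not in (None, [], ""))
        coverage[field] = filled / total if total else 0.0
    return coverage


items = [
    {"description_text": "Build dashboards.", "salary_min": 80000, "benefits": ["Dental insurance"]},
    {"description_text": "Maintain reports.", "salary_min": None, "benefits": []},
]
coverage = field_coverage(items, ["description_text", "salary_min", "benefits"])
```

Low coverage on description_text with scrapeDetailPages enabled usually points at detail fetch problems, which detail_status and detail_error can confirm.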

Compatibility Aliases

The actor accepts these snake_case aliases for easier integration:

| Alias | Canonical input |
| --- | --- |
| start_urls | startUrls |
| max_items | maxItems |
| posted_within_days | postedWithinDays |
| radius_km | radiusKm |
| scrape_detail_pages | scrapeDetailPages |
| max_concurrency | maxConcurrency |
| proxy_url | proxyUrl |
| request_retries | requestRetries |
| request_timeout_secs | requestTimeoutSecs |
| retry_delay_millis | retryDelayMillis |
| request_delay_millis | requestDelayMillis |
| start_page | startPage |
| end_page | endPage |
| query_variants | queryVariants |
| location_variants | locationVariants |
| posted_within_days_segments | postedWithinDaysSegments |
| radius_km_segments | radiusKmSegments |
| max_search_segments | maxSearchSegments |
| job_type | jobType |
| remote_work_type | remoteWorkType |
| experience_level | experienceLevel |
| exclude_employers | excludeEmployers |
| exclude_salary_types | excludeSalaryTypes |
| dedupe_jobs | dedupeJobs |

When both spellings are provided, the camelCase input wins.
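
Every alias in the table follows the usual snake_case to camelCase rule, so the documented precedence can be mirrored client-side, for example to preview what a mixed input resolves to. This is a sketch of that rule under the stated precedence, not the actor's own code; `snake_to_camel` and `normalize_input` are hypothetical helpers.

```python
def snake_to_camel(name: str) -> str:
    """start_urls -> startUrls, request_timeout_secs -> requestTimeoutSecs."""
    head, *rest = name.split("_")
    return head + "".join(part.capitalize() for part in rest)


def normalize_input(raw: dict) -> dict:
    """Map snake_case aliases to canonical keys; camelCase values win on conflict."""
    normalized = {}
    # First pass: translate aliases.
    for key, value in raw.items():
        if "_" in key:
            normalized.setdefault(snake_to_camel(key), value)
    # Second pass: canonical keys overwrite any alias with the same meaning.
    for key, value in raw.items():
        if "_" not in key:
            normalized[key] = value
    return normalized


resolved = normalize_input({
    "max_items": 50,
    "maxItems": 100,
    "start_urls": ["https://www.indeed.com/jobs?q=product+manager&l=New+York&fromage=1"],
})
```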

Reliability Notes

This actor is designed to address common buyer concerns with Indeed scrapers:

  • Blocked requests: retries with browser TLS impersonation after 403.
  • Missing descriptions: detail-page mode fetches full descriptions when available and marks failures per record.
  • Duplicate records: deduplication is enabled by default across pages and segments.
  • Unclear output shape: the dataset schema remains stable with nulls or empty arrays for unavailable fields.
  • Overcharging risk: charges apply only to saved dataset items, not to run starts or failed pages.
  • Large result limits: use search segments to fan out broad searches into smaller deduplicated runs.
  • Troubleshooting: RUN_SUMMARY, detail_status, and detail_error show what happened.

Support

Open an issue on this actor's Apify page for bugs, blocked runs, feature requests, missing fields, or country/filter requests. Include the run ID, input JSON, expected result count, and whether detail pages were enabled.

Maintained by Camilo Aguilar, Aguilar Hernandez Consultants Inc.

Compliance

This actor is intended for extracting publicly available job listing information for legitimate research, analytics, and workflow automation. Users are responsible for ensuring that their usage complies with Indeed's terms, applicable law, privacy requirements, and downstream data-processing obligations.

This actor is not affiliated with, endorsed by, or sponsored by Indeed.