Indeed Jobs Scraper
Extract public Indeed job listings into clean Apify datasets for recruiting intelligence, talent pipeline building, labor-market research, competitor hiring monitoring, and job-board aggregation.
This actor searches Indeed by keyword, location, country site, posting age, radius, job type, remote filter, experience level, and sort order. It can also scrape direct Indeed search URLs or /viewjob URLs. Results are normalized into a stable schema with job identity, company details, location, salary, descriptions, apply links, and scraping diagnostics.
Why Choose This Actor
- Pay per saved result: pricing is `$3.00 / 1,000 saved jobs`.
- Search-only or enriched output: disable detail pages for speed, or enable detail pages for full descriptions and richer company/apply data.
- Stable dataset schema: missing fields are returned as `null` or empty arrays instead of changing the output shape.
- Block-aware request flow: when regular HTTP receives a `403`, the actor retries with browser TLS impersonation.
- Deduplication included: duplicate jobs across pages and generated search segments are skipped by default.
- Buyer-visible diagnostics: each run writes `RUN_SUMMARY` with saved count, duplicate count, filtered count, stop reason, and field coverage.
- Migration-friendly inputs: common snake_case aliases are accepted for easier integration.
- Proxy flexible: supports Apify Proxy, custom `proxyUrl`, `DEFAULT_PROXY_URL`, and DataImpulse environment variables.
Common Use Cases
- Talent acquisition: build candidate sourcing lists by role, location, and posting age.
- Competitor monitoring: track which companies are hiring, where, and for which roles.
- Compensation research: collect salary text and normalized salary bounds where Indeed exposes them.
- Labor-market analytics: compare job demand by geography, keyword, company, and date window.
- Job-board aggregation: feed normalized records into search, alerts, dashboards, or enrichment pipelines.
- Sales prospecting: find companies actively hiring for roles that indicate buying intent.
Quick Start
Search for jobs by keyword and location:
```json
{
  "query": "data analyst",
  "location": "Toronto, ON",
  "country": "ca.indeed.com",
  "maxItems": 100,
  "scrapeDetailPages": true
}
```
Use search-only mode for lower cost and faster exports:
```json
{
  "query": "software engineer",
  "location": "San Francisco, CA",
  "country": "www.indeed.com",
  "maxItems": 500,
  "scrapeDetailPages": false,
  "maxConcurrency": 5
}
```
Scrape a precise Indeed URL:
```json
{
  "startUrls": ["https://www.indeed.com/jobs?q=product+manager&l=New+York&fromage=1"],
  "maxItems": 250,
  "scrapeDetailPages": true
}
```
Output
Each saved item represents one Indeed job listing. The schema stays consistent across search-only and detail-enriched runs.
```json
{
  "data_source": "Indeed",
  "job_key": "abc123",
  "job_url": "https://ca.indeed.com/viewjob?jk=abc123",
  "apply_url": "https://ca.indeed.com/applystart?jk=abc123",
  "title": "Data Analyst",
  "company_name": "Example Co",
  "company_url": "https://ca.indeed.com/cmp/example-co",
  "company_logo_url": null,
  "company_header_url": null,
  "company_rating": 4.2,
  "company_review_count": 128,
  "company_active_jobs_count": null,
  "company_industry": null,
  "company_size": null,
  "company_revenue": null,
  "company_headquarters": null,
  "company_ceo": null,
  "location": "Toronto, ON",
  "city": "Toronto",
  "country": "CA",
  "street_address": null,
  "postal_code": null,
  "latitude": 43.6532,
  "longitude": -79.3832,
  "salary": "$80,000 a year",
  "salary_min": 80000,
  "salary_max": null,
  "salary_currency": "CAD",
  "job_type": "Full-time",
  "posted_at": "2026-06-10",
  "relative_time": "3 days ago",
  "description_text": "Build hiring analytics dashboards.",
  "description_html": "<p>Build hiring analytics dashboards.</p>",
  "benefits": ["Dental insurance"],
  "attributes": ["Python", "SQL"],
  "requirements": [],
  "expired": false,
  "indeed_apply": true,
  "is_sponsored": false,
  "detail_status": "fetched",
  "detail_error": null,
  "scraping_page": 1,
  "scraping_index": 0,
  "input_query": "data analyst",
  "input_location": "Toronto, ON",
  "input_country": "ca.indeed.com"
}
```
Data Fields
| Group | Fields |
|---|---|
| Job identity | data_source, job_key, job_url, apply_url |
| Job content | title, description_text, description_html, job_type, posted_at, relative_time |
| Company | company_name, company_url, company_logo_url, company_header_url, company_rating, company_review_count, company_active_jobs_count, company_industry, company_size, company_revenue, company_headquarters, company_ceo |
| Location | location, city, country, street_address, postal_code, latitude, longitude |
| Compensation | salary, salary_min, salary_max, salary_currency, benefits |
| Classification | attributes, requirements, expired, indeed_apply, is_sponsored |
| Diagnostics | detail_status, detail_error, scraping_page, scraping_index, input_query, input_location, input_country |
Indeed does not expose every field on every job, country site, or result page. Unavailable values are returned as null or [].
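Because unavailable values arrive as null, downstream code should skip them explicitly rather than assume every record carries a salary. The following is a minimal sketch of that pattern; the helper name `median_salary_min_by_currency` is hypothetical, and only the field names `salary_min` and `salary_currency` come from the schema above.

```python
from statistics import median


def median_salary_min_by_currency(records):
    """Return the median salary_min per salary_currency, ignoring null values."""
    buckets = {}
    for record in records:
        low = record.get("salary_min")
        currency = record.get("salary_currency")
        if low is None or currency is None:
            continue  # null in the dataset becomes None in Python
        buckets.setdefault(currency, []).append(low)
    return {currency: median(values) for currency, values in buckets.items()}
```

Grouping by currency avoids mixing CAD and USD figures when a run spans several country sites.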
Detail Pages
Set scrapeDetailPages based on your workflow:
- `false`: faster, lower-cost search result extraction. Good for large candidate lists, job counts, and market snapshots.
- `true`: fetches each job detail page for full descriptions, richer apply links, and additional company fields where available.
Detail-enriched records include detail_status:
| Status | Meaning |
|---|---|
| `fetched` | Detail page was fetched and a readable description was found. |
| `fetched_no_description` | Detail page was fetched, but no description field was extractable. |
| `failed` | Detail fetch or parsing failed. `detail_error` contains a short diagnostic. |
| `no_url` | The record did not include a usable job URL. |
| `not_requested` | `scrapeDetailPages` was disabled. |
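The statuses above lend themselves to a quick post-run tally. The sketch below counts records per `detail_status` and collects the `job_url` of every `failed` record, on the assumption that those URLs could be passed back through `startUrls` in a follow-up run (the README states that direct /viewjob URLs are accepted). The function name `summarize_detail_status` is hypothetical.

```python
from collections import Counter


def summarize_detail_status(records):
    """Return (status counts, job URLs whose detail fetch failed)."""
    counts = Counter(record.get("detail_status") for record in records)
    retry_urls = [
        record["job_url"]
        for record in records
        if record.get("detail_status") == "failed" and record.get("job_url")
    ]
    return counts, retry_urls
```

Whether a retry succeeds depends on why the first fetch failed, so it is worth reading `detail_error` before re-running.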
Inputs
| Option | Type | Default | Description |
|---|---|---|---|
| `startUrls` | array | empty | Indeed search result URLs or direct /viewjob URLs. |
| `query` | string | data analyst | Job keywords used when startUrls is empty. |
| `location` | string | New York, NY | City, region, country, or postal code. |
| `country` | string | www.indeed.com | Indeed host or alias such as us, ca, uk, www.indeed.com, ca.indeed.com. |
| `maxItems` | integer | 100 | Maximum unique job records to save. |
| `startPage` | integer | 1 | First search results page, using 1-indexed page numbers. |
| `endPage` | integer | empty | Last search results page. Leave empty to continue until maxItems is reached. |
| `pageSize` | integer | 15 | Expected jobs per search page, used for pagination offsets and stopping logic. |
| `postedWithinDays` | integer | 7 | Indeed date filter. Common values are 1, 3, 7, and 14. |
| `radiusKm` | integer | 25 | Search radius around the location. |
| `jobType` | string | any | any, full-time, part-time, contract, internship, or temporary. |
| `remoteWorkType` | string | any | any, remote, hybrid, or on-site. remote maps to Indeed's remote-only URL filter. |
| `sort` | string | relevance | relevance or date. |
| `experienceLevel` | string | any | any, entry-level, mid-level, or senior-level where Indeed supports it. |
| `excludeEmployers` | array | empty | Employer names or substrings to exclude after extraction. |
| `excludeSalaryTypes` | array | empty | Salary cadence labels to exclude after extraction, such as hourly. |
| `dedupeJobs` | boolean | true | Skip duplicate jobs across pages and generated search segments. |
| `scrapeDetailPages` | boolean | true | Fetch each job page for description and richer apply/company data. |
| `maxConcurrency` | integer | 20 | Parallel request limit. Lower this if the target site starts rate limiting. |
| `requestDelayMillis` | integer | 0 | Optional delay after successful requests. |
| `requestRetries` | integer | 2 | Retry count for failed requests. |
| `retryDelayMillis` | integer | 1000 | Delay between retries. |
| `requestTimeoutSecs` | integer | 30 | HTTP timeout per request. |
| `proxyUrl` | string | empty | Optional custom proxy URL. |
| `proxyConfiguration` | object | { "useApifyProxy": true } | Apify Proxy settings. |
Country aliases include us, usa, ca, canada, uk, gb, au, de, fr, and es.
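Several inputs accept only the fixed values listed in the table, so a client-side check before starting a run can catch typos early. This is a rough sketch, not the actor's own validation (which is not documented here); the function name `find_input_problems` is hypothetical, and the allowed values are copied from the table above.

```python
ALLOWED_VALUES = {
    "jobType": {"any", "full-time", "part-time", "contract", "internship", "temporary"},
    "remoteWorkType": {"any", "remote", "hybrid", "on-site"},
    "sort": {"relevance", "date"},
    "experienceLevel": {"any", "entry-level", "mid-level", "senior-level"},
}


def find_input_problems(run_input):
    """Return a list of human-readable problems; an empty list means none were found."""
    problems = []
    for key, allowed in ALLOWED_VALUES.items():
        value = run_input.get(key)
        if value is not None and value not in allowed:
            problems.append(f"{key}={value!r} is not one of {sorted(allowed)}")

    start_page = run_input.get("startPage", 1)  # table default
    end_page = run_input.get("endPage")  # empty means "until maxItems"
    if start_page < 1:
        problems.append("startPage must be 1 or greater (pages are 1-indexed)")
    if end_page is not None and end_page < start_page:
        problems.append("endPage must not be smaller than startPage")
    return problems
```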
Advanced Search Segments
Indeed may stop returning new jobs for one broad search even when more results appear to exist. For larger exports, split the run into smaller search segments and let the actor deduplicate across them.
```json
{
  "query": "data analyst",
  "queryVariants": ["business analyst", "analytics engineer", "reporting analyst"],
  "location": "Toronto, ON",
  "locationVariants": ["Mississauga, ON", "Markham, ON", "Vaughan, ON"],
  "country": "ca.indeed.com",
  "postedWithinDays": 14,
  "postedWithinDaysSegments": [7, 3, 1],
  "radiusKm": 25,
  "radiusKmSegments": [50],
  "maxSearchSegments": 20,
  "maxItems": 1000,
  "scrapeDetailPages": false
}
```
Segment inputs:
| Option | Description |
|---|---|
| `queryVariants` | Extra keyword searches. The main query is tried first. |
| `locationVariants` | Extra locations. The main location is tried first. |
| `postedWithinDaysSegments` | Extra date windows, such as 1, 3, 7, 14. |
| `radiusKmSegments` | Extra radius values. |
| `maxSearchSegments` | Maximum generated search URLs from query, location, date, and radius combinations. |
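To get a feel for how quickly segment combinations grow, the sketch below enumerates them as a plain Cartesian product and applies the `maxSearchSegments` cap. This is an assumption for illustration: the README only says that the main query and location are tried first, not how the actor orders or prunes the remaining combinations. The function name `plan_segments` is hypothetical.

```python
from itertools import islice, product


def plan_segments(run_input):
    """List (query, location, days, radius) tuples, main values first, capped by maxSearchSegments."""
    queries = [run_input["query"], *run_input.get("queryVariants", [])]
    locations = [run_input["location"], *run_input.get("locationVariants", [])]
    days = [run_input.get("postedWithinDays", 7), *run_input.get("postedWithinDaysSegments", [])]
    radii = [run_input.get("radiusKm", 25), *run_input.get("radiusKmSegments", [])]

    limit = run_input.get("maxSearchSegments")  # None means no cap in this sketch
    return list(islice(product(queries, locations, days, radii), limit))
```

With the example input above there are 4 queries, 4 locations, 4 date windows, and 2 radii, so 128 possible combinations, of which the cap keeps 20.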
Proxy And Blocking Behavior
The actor uses Apify Proxy by default through proxyConfiguration. You can also provide:
- `proxyUrl`: direct proxy URL in input.
- `DEFAULT_PROXY_URL`: full proxy URL as an environment variable or secret.
- `DATAIMPULSE_PROXY_URL` or `DATAIMPULSE_URL`: full DataImpulse proxy URL.
- `DATAIMPULSE_PROXY_HOST`, `DATAIMPULSE_PROXY_PORT`, `DATAIMPULSE_PROXY_USERNAME`, `DATAIMPULSE_PROXY_PASSWORD`: split proxy credentials.
- Short DataImpulse aliases: `DATAIMPULSE_HOST`, `DATAIMPULSE_PORT`, `DATAIMPULSE_USER`, `DATAIMPULSE_PASS`, and optional `DATAIMPULSE_SCHEME`.
When a request receives 403, the actor retries with browser TLS impersonation. If Indeed changes its protections, runs may still return fewer jobs or stop early; check RUN_SUMMARY.stop_reason, detail_status, and detail_error.
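One way to picture how the proxy sources listed in this section might combine into a single URL is sketched below. The precedence order (input first, then full-URL variables, then split credentials) and the `http` default scheme are assumptions made for illustration; the README does not document how the actor actually resolves them. The function name `resolve_proxy_url` is hypothetical.

```python
import os
from urllib.parse import quote


def resolve_proxy_url(run_input, env=None):
    """Return a proxy URL from input or environment, or None if nothing is configured."""
    env = os.environ if env is None else env

    if run_input.get("proxyUrl"):
        return run_input["proxyUrl"]

    # Assumed order: generic variable first, then DataImpulse full URLs.
    for name in ("DEFAULT_PROXY_URL", "DATAIMPULSE_PROXY_URL", "DATAIMPULSE_URL"):
        if env.get(name):
            return env[name]

    host = env.get("DATAIMPULSE_PROXY_HOST") or env.get("DATAIMPULSE_HOST")
    port = env.get("DATAIMPULSE_PROXY_PORT") or env.get("DATAIMPULSE_PORT")
    user = env.get("DATAIMPULSE_PROXY_USERNAME") or env.get("DATAIMPULSE_USER")
    password = env.get("DATAIMPULSE_PROXY_PASSWORD") or env.get("DATAIMPULSE_PASS")
    scheme = env.get("DATAIMPULSE_SCHEME") or "http"  # default scheme is an assumption

    if host and port and user and password:
        # Percent-encode credentials so characters like "@" do not break the URL.
        return f"{scheme}://{quote(user, safe='')}:{quote(password, safe='')}@{host}:{port}"
    return None
```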
Billing And Max Charge Safety
This actor is billed per saved job listing:
- Price: `$0.003` per saved dataset item.
- Displayed price: `$3.00 / 1,000 saved jobs`.
- Empty pages, duplicate-only pages, blocked pages, failed detail requests, and diagnostics are not saved as paid job results.
The actor's RUN_SUMMARY.billable_saved_jobs equals the number of saved dataset records.
If a user sets a maximum run charge, configure maxItems so the actor cannot save more paid records than the user intended. A safe estimate is:
maxItems = floor(maximum_run_charge / 0.003)
Examples:
| Maximum charge | Safe maxItems |
|---|---|
| $3 | 1000 |
| $15 | 5000 |
| $30 | 10000 |
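The same estimate can be computed in code. The sketch below uses `Decimal` rather than floats so the division by 0.003 does not depend on binary floating-point rounding; the function name `safe_max_items` is hypothetical, and the price constant is the per-item price stated above.

```python
from decimal import Decimal

PRICE_PER_ITEM = Decimal("0.003")  # USD per saved dataset item, from this README


def safe_max_items(maximum_run_charge):
    """Return floor(maximum_run_charge / 0.003) as an integer."""
    charge = Decimal(str(maximum_run_charge))
    if charge < 0:
        raise ValueError("maximum_run_charge must not be negative")
    # Floor division on non-negative Decimals is the same as floor().
    return int(charge // PRICE_PER_ITEM)
```

For instance, `safe_max_items(15)` matches the 5000 shown in the table, and a budget that is not a multiple of the price is rounded down.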
Run Summary And Quality Checks
Every run writes a RUN_SUMMARY key-value store record with:
- requested max items
- pages attempted
- search URLs planned and attempted
- saved count
- duplicate count
- filtered count
- no-job pages
- stop reason
- field coverage by key field
- billable saved jobs
Stop reasons:
| Stop reason | Meaning |
|---|---|
| `max_items_reached` | The requested maxItems was saved. |
| `end_page_reached` | Pagination reached endPage or a short final page. |
| `detail_url_completed` | A direct /viewjob URL was processed. |
| `no_jobs_found` | A page produced no extractable jobs. |
| `no_new_unique_jobs` | A page only contained jobs already saved in the run. |
| `completed` | All start URLs were processed without another stop reason. |
Use RUN_SUMMARY.field_coverage to check whether descriptions, salaries, company fields, geodata, benefits, apply URLs, and detail statuses are being populated for a run.
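The exact structure of `RUN_SUMMARY.field_coverage` is not documented in this README, so as an independent cross-check, coverage can also be computed directly from the dataset records. This sketch treats a field as populated when it is neither null nor an empty array, following the schema convention described earlier; the function name `compute_field_coverage` is hypothetical.

```python
def compute_field_coverage(records, fields):
    """Return the fraction of records (0.0 to 1.0) in which each field is populated."""
    total = len(records)
    if total == 0:
        return {field: 0.0 for field in fields}
    coverage = {}
    for field in fields:
        # None and [] are the documented "unavailable" markers; 0 and False still count.
        populated = sum(1 for record in records if record.get(field) not in (None, []))
        coverage[field] = populated / total
    return coverage
```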
Compatibility Aliases
The actor accepts these snake_case aliases for easier integration:
| Alias | Canonical input |
|---|---|
| `start_urls` | `startUrls` |
| `max_items` | `maxItems` |
| `posted_within_days` | `postedWithinDays` |
| `radius_km` | `radiusKm` |
| `scrape_detail_pages` | `scrapeDetailPages` |
| `max_concurrency` | `maxConcurrency` |
| `proxy_url` | `proxyUrl` |
| `request_retries` | `requestRetries` |
| `request_timeout_secs` | `requestTimeoutSecs` |
| `retry_delay_millis` | `retryDelayMillis` |
| `request_delay_millis` | `requestDelayMillis` |
| `start_page` | `startPage` |
| `end_page` | `endPage` |
| `query_variants` | `queryVariants` |
| `location_variants` | `locationVariants` |
| `posted_within_days_segments` | `postedWithinDaysSegments` |
| `radius_km_segments` | `radiusKmSegments` |
| `max_search_segments` | `maxSearchSegments` |
| `job_type` | `jobType` |
| `remote_work_type` | `remoteWorkType` |
| `experience_level` | `experienceLevel` |
| `exclude_employers` | `excludeEmployers` |
| `exclude_salary_types` | `excludeSalaryTypes` |
| `dedupe_jobs` | `dedupeJobs` |
When both spellings are provided, the camelCase input wins.
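For callers who want to normalize inputs on their own side before submitting a run, the precedence rule can be sketched as follows. This is an illustration of the documented behavior, not the actor's source code; `normalize_input` and `CANONICAL_INPUTS` are hypothetical names, and the alias spellings are derived from the canonical names in the table.

```python
import re

CANONICAL_INPUTS = [
    "startUrls", "maxItems", "postedWithinDays", "radiusKm", "scrapeDetailPages",
    "maxConcurrency", "proxyUrl", "requestRetries", "requestTimeoutSecs",
    "retryDelayMillis", "requestDelayMillis", "startPage", "endPage",
    "queryVariants", "locationVariants", "postedWithinDaysSegments",
    "radiusKmSegments", "maxSearchSegments", "jobType", "remoteWorkType",
    "experienceLevel", "excludeEmployers", "excludeSalaryTypes", "dedupeJobs",
]

# Derive each snake_case alias by inserting "_" before every capital letter.
ALIASES = {
    re.sub(r"(?<!^)(?=[A-Z])", "_", name).lower(): name for name in CANONICAL_INPUTS
}


def normalize_input(raw):
    """Rename snake_case aliases to camelCase; an explicit camelCase key wins."""
    normalized = {}
    for key, value in raw.items():
        canonical = ALIASES.get(key, key)
        if key in ALIASES and canonical in raw:
            continue  # both spellings present: keep the camelCase value
        normalized[canonical] = value
    return normalized
```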
Reliability Notes
This actor is designed to address common buyer concerns with Indeed scrapers:
- Blocked requests: retries with browser TLS impersonation after `403`.
- Missing descriptions: detail-page mode fetches full descriptions when available and marks failures per record.
- Duplicate records: deduplication is enabled by default across pages and segments.
- Unclear output shape: the dataset schema remains stable with nulls or empty arrays for unavailable fields.
- Overcharging risk: charges apply only to saved dataset items, not to run starts or failed pages.
- Large result limits: use search segments to fan out broad searches into smaller deduplicated runs.
- Troubleshooting: `RUN_SUMMARY`, `detail_status`, and `detail_error` show what happened.
Support
Open an issue on this actor's Apify page for bugs, blocked runs, feature requests, missing fields, or country/filter requests. Include the run ID, input JSON, expected result count, and whether detail pages were enabled.
Maintained by Camilo Aguilar, Aguilar Hernandez Consultants Inc.
Compliance
This actor is intended for extracting publicly available job listing information for legitimate research, analytics, and workflow automation. Users are responsible for ensuring their usage complies with Indeed terms, applicable law, privacy requirements, and downstream data-processing obligations.
This actor is not affiliated with, endorsed by, or sponsored by Indeed.