Naukri Job Scraper

Pricing: from $0.50 / 1,000 standard job items
Developer: Blynx (Maintained by Community)

Scrape Naukri job listings by keyword, search URL, or job IDs. Standard mode returns search-card data; Detailed mode adds full description, company profile and AmbitionBox reviews. Auto-resolves city names to internal IDs. HTTP-only with TLS impersonation, no browser.

A pay-per-event Apify actor that pulls structured job listings from Naukri through its internal JSON endpoints. No browser, no Chromium, no Selenium - plain HTTP with TLS impersonation, so it stays cheap and fast even at tens of thousands of listings.

What it does

  • Takes one of three inputs as a starting point: a free-text keyword, a full Naukri search URL, or an explicit list of job IDs.
  • Optionally narrows the search by city, experience, posting age, salary, industry and sort order.
  • Paginates through the search results, deduplicates by job ID, stops at the user-supplied maxJobs cap.
  • In Standard mode pushes the trimmed search-card object for each job (title, company, salary, experience, location, skills, URL, AmbitionBox rating, ...).
  • In Detailed mode additionally fetches the per-job endpoint and merges the full HTML description, key skills, role category, education requirements, AmbitionBox reviews/salaries/benefits onto the card.
  • Writes everything to the default Apify dataset and bills one pay-per-event charge per pushed item (job_item or job_item_detailed).
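The paginate/dedupe/cap loop above can be sketched in a few lines. This is a minimal illustration with a stubbed `fetch_page` callable; the real endpoint names, payloads and error handling differ:

```python
# Minimal sketch of the search loop: paginate, dedupe by job ID,
# stop at the maxJobs cap or on the first short page.
PAGE_SIZE = 20  # v1 endpoint default page size

def collect_jobs(fetch_page, max_jobs=100):
    """fetch_page(page_no) -> list of job dicts with a 'jobId' key."""
    seen, results, page = set(), [], 1
    while len(results) < max_jobs:
        cards = fetch_page(page)
        for card in cards:
            if card["jobId"] not in seen:
                seen.add(card["jobId"])
                results.append(card)
                if len(results) >= max_jobs:
                    break
        if len(cards) < PAGE_SIZE:  # short page -> no more results
            break
        page += 1
    return results
```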

Modes

| Mode | Trigger | Output | Billed event |
| --- | --- | --- | --- |
| Standard | keyword or searchUrl, fetchDetails=false | Job summary card from search results | job_item |
| Detailed | fetchDetails=true (with keyword/URL) | Summary plus full description, company profile, AmbitionBox reviews/salaries/benefits | job_item_detailed |
| Direct | non-empty jobIds | Full detail object for each given ID | job_item_detailed |

Precedence: jobIds > searchUrl > keyword. When searchUrl is set, manual filters are ignored, but sortBy is still applied on top.
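As a sketch, the precedence rule reduces to a short mode resolver (illustrative names, not the actor's internals):

```python
# Input precedence sketch: jobIds > searchUrl > keyword.
def resolve_mode(job_ids=None, search_url=None, keyword=None, fetch_details=False):
    if job_ids:
        return "direct"  # per-ID detail fetch, billed as job_item_detailed
    if search_url or keyword:
        return "detailed" if fetch_details else "standard"
    raise ValueError("one of jobIds, searchUrl or keyword is required")
```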

Input fields

Anchors (one of these is required)

| Field | What it does |
| --- | --- |
| keyword | Free-text search term (e.g. "python developer"). |
| searchUrl | A full Naukri search-results URL. Just paste whatever is in your browser's address bar after you've set the filters you want on the Naukri site itself. Supports all three URL shapes (/<keyword>-jobs, /jobs-in-<city>, /<keyword>-jobs-in-<city>) and carries every URL-style filter parameter through to the API (?experience=5&jobAge=7&minSalary=...). When this is set, the structured filter fields below are ignored, but sortBy is still layered on top. |
| jobIds | Direct list of Naukri job IDs to fetch detail pages for. Skips search entirely. |
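The three URL shapes can be told apart with a couple of patterns. This parser is purely illustrative (the actor's actual URL handling may differ), but it shows why the shapes are unambiguous:

```python
import re

# Illustrative parser for the three Naukri search-URL path shapes:
#   /<keyword>-jobs, /jobs-in-<city>, /<keyword>-jobs-in-<city>
def parse_search_path(path):
    m = re.fullmatch(r"/(.+)-jobs-in-(.+)", path)
    if m:
        return {"keyword": m.group(1).replace("-", " "), "city": m.group(2)}
    m = re.fullmatch(r"/jobs-in-(.+)", path)
    if m:
        return {"keyword": None, "city": m.group(1)}
    m = re.fullmatch(r"/(.+)-jobs", path)
    if m:
        return {"keyword": m.group(1).replace("-", " "), "city": None}
    return None
```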

Filters (all optional, all verified to actually work)

| Field | Format | What it does |
| --- | --- | --- |
| cities | ["Mumbai", "Bengaluru"] or ["17", "97"] or mixed | City filter. Names are auto-resolved through a built-in lookup; unknown names are forwarded to Naukri's server-side resolver. Numeric IDs pass through. |
| experience | "5" | Required years of experience. Vacancies whose declared range covers this value match (so "5" matches 3-7 Yrs, 5-10 Yrs, etc.). |
| freshness | "1" / "3" / "7" / "15" / "30" / "all" | How recently the job was posted (in days). |
| salaryRange | ["10to15", "15to25"] | Annual salary buckets in lakhs (LPA). Multiple buckets are merged into one wider range. "75plus" drops the upper bound. |
| industry | ["25", "14"] | Numeric Naukri industry IDs. |
| sortBy | "date" or "relevance" | Sort order. Defaults to relevance. |
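The salary-bucket merge can be sketched as below. Bucket names come from the input schema; the merge logic shown here is an assumption about how "one wider range" is formed (minimum of the lower bounds, maximum of the upper bounds, with "75plus" leaving the range open-ended):

```python
def merge_salary_buckets(buckets):
    """Merge buckets like ["10to15", "25to50"] into one (low, high) range
    in LPA. "75plus" drops the upper bound (returned as None)."""
    low, high, open_ended = None, None, False
    for b in buckets:
        if b.endswith("plus"):
            lo = int(b[: -len("plus")])
            open_ended = True
        else:
            a, c = b.split("to")
            lo, hi = int(a), int(c)
            high = hi if high is None else max(high, hi)
        low = lo if low is None else min(low, lo)
    return (low, None if open_ended else high)
```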

Run controls

| Field | Default | What it does |
| --- | --- | --- |
| maxJobs | 100 | Upper limit on jobs collected (minimum 50). |
| fetchDetails | false | Toggle between Standard and Detailed mode. |
| proxyConfiguration | residential | Apify proxy settings; residential is strongly recommended. |
| maxConcurrency | 5 | Cap on concurrent HTTP requests. |
| maxRetries | 5 | Retry budget per request for transient errors (separate from proxy and block-detection retries, which have their own caps). |

Quick examples

Plain keyword search

```json
{
  "keyword": "data engineer",
  "experience": "5",
  "freshness": "7",
  "sortBy": "date",
  "maxJobs": 100
}
```

Keyword + city filter (auto-resolved by name)

```json
{
  "keyword": "frontend developer",
  "cities": ["Singapore"],
  "freshness": "30",
  "maxJobs": 100,
  "fetchDetails": true
}
```

From a full search URL - paste any Naukri search-results URL (the one in your address bar after picking filters on the site)

```json
{
  "searchUrl": "https://www.naukri.com/python-jobs?experience=3",
  "maxJobs": 100,
  "sortBy": "date"
}
```

Direct fetch by job IDs

```json
{
  "jobIds": ["220126040161", "170424007054"],
  "maxJobs": 50,
  "fetchDetails": true
}
```

Salary band + industry

```json
{
  "keyword": "devops",
  "salaryRange": ["15to25", "25to50"],
  "industry": ["25"],
  "maxJobs": 200
}
```

How it stays unblocked

Naukri sits behind an Akamai-class bot manager. Generic Python HTTP clients are dead on arrival because they fail at the TLS handshake before a single request body goes out. This actor uses a different toolchain:

  • curl_cffi with impersonate="chrome" - TLS handshake, HTTP/2 SETTINGS frame, ALPN, cipher order and JA3/JA4 fingerprint all match a real Chrome.
  • Residential proxies - datacenter exits get challenged constantly.
  • Per-request session rotation - every retry spins up a fresh AsyncSession with a new session_id, which means a new exit IP and an empty cookie jar. No state carries over between attempts.
  • Three-budget retry policy - proxy failures, bot-management blocks (HTTP 401/403 with bot-wall markers, or 406 with recaptcha required), and ordinary 429/5xx/parse errors each have their own counter. A flaky proxy cannot eat the retry budget reserved for transient errors.
  • Browser-shaped headers - Accept, Accept-Language, Sec-Fetch-*, Priority, Upgrade-Insecure-Requests, etc.; the Chrome user agent and sec-ch-ua-* are set automatically by curl_cffi based on the impersonate profile.
  • Internal /jobapi/v1/* endpoint family - chosen specifically because v3+ now requires an invisible reCAPTCHA token only the real frontend can mint. The actor will detect a 406 recaptcha required if Naukri ever closes v1 too, and rotate sessions.
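The three-budget retry policy boils down to classifying each failure before deciding which counter to charge. The classifier below is a sketch under assumptions: the block markers and the use of status 0 for connection-level (proxy) failures are illustrative, not the actor's exact rules:

```python
# Sketch of the three-budget retry policy: each failure class gets its
# own counter, so a flaky proxy can't exhaust the transient-error budget.
def classify_failure(status, body=""):
    if status in (401, 403) or (status == 406 and "recaptcha" in body.lower()):
        return "block"      # bot-management block -> rotate session
    if status == 429 or status >= 500:
        return "transient"  # ordinary backoff-and-retry
    if status == 0:
        return "proxy"      # connection never completed (assumed convention)
    return "fatal"          # anything else is not retried
```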

No Playwright, no Chromium, no Selenium. Default memory is 512 MB.

Output shape

Standard-mode items are trimmed to ~40 useful fields. Legacy v1 keys that are always null, internal flags, recruiter-contact junk and duplicates of derived fields (post, urlStr, addDate, compLogo, keywords, jobDesc, tupleDesc, currencySal, isSavedJob, internship-only fields on non-internship jobs) are stripped before writing to the dataset. The useful surface is:

  • jobId, title, companyName, companyId, groupId, staticCompanyName
  • experience ("5-10 Yrs" derived from numeric minExp + maxExp)
  • salary ("Not disclosed" if the company hid the figure, otherwise the formatted range; raw minSal/maxSal/showSal are also kept)
  • location (best-effort canonical city), cityfield, locality
  • jdURL (with the trailing tracking query stripped)
  • tagsAndSkills, logoPath, currency (always normalised to INR)
  • companyJobsUrl (synthesised from staticCompanyName)
  • companyProfile, employmentType, noOfVacancy
  • ambitionBoxData (object with rating, reviewCount, title, url - when Naukri provides it; expect undefined for smaller employers in Standard mode)
  • createdDate, isSaved, isExpiredJob, isWalkIn, isTopGroup, multipleApply
  • jobtype, jobType1-jobType5 (Naukri's internal listing categorisation)
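Two of the derived fields above are simple enough to sketch: the experience string built from numeric minExp/maxExp, and jdURL with its trailing tracking query stripped. Function names here are hypothetical:

```python
from urllib.parse import urlsplit, urlunsplit

def derive_experience(min_exp, max_exp):
    """Build the "5-10 Yrs" style string from numeric minExp + maxExp."""
    return f"{min_exp}-{max_exp} Yrs"

def strip_tracking(url):
    """Drop the trailing tracking query (and fragment) from a jdURL."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))
```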

Detailed-mode items merge the search-card on top of the full detail-API response, which adds:

  • the entire job object - HTML description, key skills (preferred + other), industry, employment type, role category, education requirements
  • AmbitionBox details (reviews, salaries, benefits) when present

Local development

```shell
pip install -r requirements.txt
python -m src
```

Place an INPUT.json under storage/key_value_stores/default/ and run apify run --purge (Apify CLI required). Output lands in storage/datasets/default/.

Notes & limits

  • maxJobs is clamped to a minimum of 50.
  • Page size is 20 (the v1 endpoint default); pagination stops on the first short page.
  • If a detail-API call fails despite retries, the job is still pushed using only the search summary and billed as job_item rather than dropped.
  • The built-in city table covers ~30 high-volume cities; smaller cities take the free-text fallback path automatically.
  • "Remote" / "Hybrid" inside cities is rejected with a helpful error message because Naukri's workMode filter is broken on v1 and would silently return the full database.
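Taken together, the first and last notes amount to a small input-validation step, sketched here (illustrative only; the actor's actual error message and clamping live in its input handling):

```python
def validate_input(cities=None, max_jobs=100):
    """Clamp maxJobs to the 50 minimum and reject work-mode pseudo-cities."""
    max_jobs = max(50, max_jobs)
    for c in cities or []:
        if c.strip().lower() in ("remote", "hybrid"):
            raise ValueError(
                f"'{c}' is not a city: Naukri's workMode filter is broken on "
                "the v1 API and would silently return the full database."
            )
    return max_jobs
```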