# Naukri Job Scraper
Pricing: from $0.50 / 1,000 standard job items
Scrape Naukri job listings by keyword, search URL, or job IDs. Standard mode returns search-card data; Detailed mode adds full description, company profile and AmbitionBox reviews. Auto-resolves city names to internal IDs. HTTP-only with TLS impersonation, no browser.
Developer: Blynx
Last modified: 3 days ago
A pay-per-event Apify actor that pulls structured job listings from Naukri through its internal JSON endpoints. No browser, no Chromium, no Selenium - plain HTTP with TLS impersonation, so it stays cheap and fast even at tens of thousands of listings.
## What it does
- Takes one of three inputs as a starting point: a free-text keyword, a full Naukri search URL, or an explicit list of job IDs.
- Optionally narrows the search by city, experience, posting age, salary, industry and sort order.
- Paginates through the search results, deduplicates by job ID, and stops at the user-supplied `maxJobs` cap.
- In Standard mode pushes the trimmed search-card object for each job (title, company, salary, experience, location, skills, URL, AmbitionBox rating, ...).
- In Detailed mode additionally fetches the per-job endpoint and merges the full HTML description, key skills, role category, education requirements and AmbitionBox reviews/salaries/benefits onto the card.
- Writes everything to the default Apify dataset and bills one pay-per-event charge per pushed item (`job_item` or `job_item_detailed`).
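The paginate-deduplicate-cap loop above can be sketched roughly like this (a simplified illustration, not the actor's actual code; the hypothetical `fetch_page` callable stands in for the real search-endpoint request):

```python
from typing import Callable

PAGE_SIZE = 20  # the v1 endpoint's default page size


def collect_jobs(fetch_page: Callable[[int], list[dict]], max_jobs: int) -> list[dict]:
    """Paginate through search results, dedupe by jobId, stop at the cap."""
    seen: set[str] = set()
    collected: list[dict] = []
    page = 1
    while len(collected) < max_jobs:
        batch = fetch_page(page)
        for job in batch:
            if job["jobId"] not in seen:
                seen.add(job["jobId"])
                collected.append(job)
                if len(collected) >= max_jobs:
                    break
        if len(batch) < PAGE_SIZE:  # short page => last page of results
            break
        page += 1
    return collected
```

The short-page check is what makes pagination stop naturally once Naukri runs out of results, even when `maxJobs` has not been reached.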
## Modes
| Mode | Trigger | Output | Billed event |
|---|---|---|---|
| Standard | `keyword` or `searchUrl`, `fetchDetails=false` | Job summary card from search results | `job_item` |
| Detailed | `fetchDetails=true` (with keyword/URL) | Summary plus full description, company profile, AmbitionBox reviews/salaries/benefits | `job_item_detailed` |
| Direct | non-empty `jobIds` | Full detail object for each given ID | `job_item_detailed` |
Precedence: `jobIds` > `searchUrl` > `keyword`. When `searchUrl` is set, manual filters are ignored, but `sortBy` is still applied on top.
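That precedence rule can be expressed as a small resolver (a sketch only; the input keys mirror the input schema, the mode names mirror the Modes table):

```python
def resolve_mode(inp: dict) -> str:
    """Pick the run mode, honouring jobIds > searchUrl > keyword."""
    if inp.get("jobIds"):
        return "direct"  # per-ID detail fetch, billed job_item_detailed
    if inp.get("searchUrl") or inp.get("keyword"):
        return "detailed" if inp.get("fetchDetails") else "standard"
    raise ValueError("One of jobIds, searchUrl or keyword is required")
```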
## Input fields
### Anchors (one of these is required)
| Field | What it does |
|---|---|
| `keyword` | Free-text search term (e.g. "python developer"). |
| `searchUrl` | A full Naukri search-results URL. Just paste whatever is in your browser's address bar after you've set the filters you want on the Naukri site itself. Supports all three URL shapes - `/<keyword>-jobs`, `/jobs-in-<city>`, `/<keyword>-jobs-in-<city>` - and carries every URL-style filter parameter through to the API (`?experience=5&jobAge=7&minSalary=...`). When this is set, the structured filter fields below are ignored, but `sortBy` is still layered on top. |
| `jobIds` | Direct list of Naukri job IDs to fetch detail pages for. Skips search entirely. |
### Filters (all optional, all verified to actually work)
| Field | Format | What it does |
|---|---|---|
| `cities` | `["Mumbai", "Bengaluru"]` or `["17", "97"]` or mixed | City filter. Names are auto-resolved through a built-in lookup; unknown names are forwarded to Naukri's server-side resolver. Numeric IDs pass through. |
| `experience` | `"5"` | Required years of experience. Vacancies whose declared range covers this value match (so `"5"` matches 3-7 Yrs, 5-10 Yrs, etc.). |
| `freshness` | `"1"` / `"3"` / `"7"` / `"15"` / `"30"` / `"all"` | How recently the job was posted (in days). |
| `salaryRange` | `["10to15", "15to25"]` | Annual salary buckets in lakhs (LPA). Multiple buckets are merged into one wider range. `"75plus"` drops the upper bound. |
| `industry` | `["25", "14"]` | Numeric Naukri industry IDs. |
| `sortBy` | `"date"` or `"relevance"` | Sort order. Defaults to relevance. |
### Run controls
| Field | Default | What it does |
|---|---|---|
| `maxJobs` | 100 | Upper limit on jobs collected (minimum 50). |
| `fetchDetails` | false | Toggles between Standard and Detailed mode. |
| `proxyConfiguration` | residential | Apify proxy settings; residential is strongly recommended. |
| `maxConcurrency` | 5 | Cap on concurrent HTTP requests. |
| `maxRetries` | 5 | Retry budget per request for transient errors (separate from proxy and block-detection retries, which have their own caps). |
## Quick examples
Plain keyword search:

```json
{
  "keyword": "data engineer",
  "experience": "5",
  "freshness": "7",
  "sortBy": "date",
  "maxJobs": 100
}
```

Keyword + city filter (auto-resolved by name):

```json
{
  "keyword": "frontend developer",
  "cities": ["Singapore"],
  "freshness": "30",
  "maxJobs": 100,
  "fetchDetails": true
}
```

From a full search URL - paste any Naukri search-results URL (the one in your address bar after picking filters on the site):

```json
{
  "searchUrl": "https://www.naukri.com/python-jobs?experience=3",
  "maxJobs": 100,
  "sortBy": "date"
}
```

Direct fetch by job IDs:

```json
{
  "jobIds": ["220126040161", "170424007054"],
  "maxJobs": 50,
  "fetchDetails": true
}
```

Salary band + industry:

```json
{
  "keyword": "devops",
  "salaryRange": ["15to25", "25to50"],
  "industry": ["25"],
  "maxJobs": 200
}
```
## How it stays unblocked
Naukri sits behind an Akamai-class bot manager. Generic Python HTTP clients are dead on arrival because they fail at the TLS handshake before a single request body goes out. This actor uses a different toolchain:
- `curl_cffi` with `impersonate="chrome"` - TLS handshake, HTTP/2 SETTINGS frame, ALPN, cipher order and JA3/JA4 fingerprint all match a real Chrome.
- Residential proxies - datacenter exits get challenged constantly.
- Per-request session rotation - every retry spins up a fresh `AsyncSession` with a new `session_id`, which means a new exit IP and an empty cookie jar. No state carries over between attempts.
- Three-budget retry policy - proxy failures, bot-management blocks (HTTP 401/403 with bot-wall markers, or 406 with `recaptcha required`), and ordinary 429/5xx/parse errors each have their own counter. A flaky proxy cannot eat the retry budget reserved for transient errors.
- Browser-shaped headers - Accept, Accept-Language, Sec-Fetch-*, Priority, Upgrade-Insecure-Requests, etc.; the Chrome user agent and `sec-ch-ua-*` are set automatically by curl_cffi based on the impersonate profile.
- Internal `/jobapi/v1/*` endpoint family - chosen specifically because v3+ now requires an invisible reCAPTCHA token only the real frontend can mint. The actor will detect a `406 recaptcha required` if Naukri ever closes v1 too, and rotate sessions.
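The three-budget policy can be sketched as a failure classifier plus independent counters (the status codes and the `recaptcha required` marker follow the description above; the structure and the budget caps are illustrative, not the actor's actual code):

```python
from dataclasses import dataclass, field


def classify_failure(status: int, body: str) -> str:
    """Route a failed response to the retry budget it should consume."""
    if status in (401, 403) or (status == 406 and "recaptcha required" in body):
        return "block"      # bot-management block: rotate session, new exit IP
    return "transient"      # 429 / 5xx / parse errors and similar flakiness


@dataclass
class RetryBudgets:
    """Separate counters so one failure class cannot starve the others."""
    caps: dict = field(default_factory=lambda: {"proxy": 3, "block": 3, "transient": 5})
    used: dict = field(default_factory=lambda: {"proxy": 0, "block": 0, "transient": 0})

    def spend(self, kind: str) -> bool:
        """Consume one retry from the given budget; False once exhausted."""
        if self.used[kind] >= self.caps[kind]:
            return False
        self.used[kind] += 1
        return True
```

The point of the split is visible in the `spend` method: exhausting the `proxy` budget leaves the `transient` budget untouched.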
No Playwright, no Chromium, no Selenium. Default memory is 512 MB.
## Output shape
Standard-mode items are trimmed to ~40 useful fields. Legacy v1 keys that are always null, internal flags, recruiter-contact junk and duplicates of derived fields (`post`, `urlStr`, `addDate`, `compLogo`, `keywords`, `jobDesc`, `tupleDesc`, `currencySal`, `isSavedJob`, internship-only fields on non-internship jobs) are stripped before writing to the dataset. The useful surface is:

- `jobId`, `title`, `companyName`, `companyId`, `groupId`, `staticCompanyName`
- `experience` (`"5-10 Yrs"`, derived from numeric `minExp` + `maxExp`)
- `salary` (`"Not disclosed"` if the company hid the figure, otherwise the formatted range; raw `minSal`/`maxSal`/`showSal` are also kept)
- `location` (best-effort canonical city), `city` field, `locality`
- `jdURL` (with the trailing tracking query stripped)
- `tagsAndSkills`, `logoPath`, `currency` (always normalised to `INR`)
- `companyJobsUrl` (synthesised from `staticCompanyName`)
- `companyProfile`, `employmentType`, `noOfVacancy`
- `ambitionBoxData` (object with rating, reviewCount, title, url - when Naukri provides it; expect `undefined` for smaller employers in Standard mode)
- `createdDate`, `isSaved`, `isExpiredJob`, `isWalkIn`, `isTopGroup`, `multipleApply`
- `jobtype`, `jobType1`-`jobType5` (Naukri's internal listing categorisation)
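A minimal sketch of that trimming step (the strip-list is a subset of the keys named above; the `experience` derivation from `minExp`/`maxExp` follows the description, but the function itself is illustrative):

```python
# Subset of the always-null / duplicate keys the actor strips (assumption:
# shown here for illustration, not the complete list).
STRIP_KEYS = {"post", "urlStr", "addDate", "compLogo", "keywords",
              "jobDesc", "tupleDesc", "currencySal", "isSavedJob"}


def trim_card(raw: dict) -> dict:
    """Drop legacy/internal keys and derive the human-readable experience range."""
    item = {k: v for k, v in raw.items() if k not in STRIP_KEYS}
    if "minExp" in raw and "maxExp" in raw:
        item["experience"] = f'{raw["minExp"]}-{raw["maxExp"]} Yrs'
    return item
```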
Detailed-mode items merge the search card on top of the full detail-API response, which adds:

- the entire `job` object - HTML description, key skills (preferred + other), industry, employment type, role category, education requirements
- AmbitionBox details (reviews, salaries, benefits) when present
## Local development
```bash
pip install -r requirements.txt
python -m src
```

Place an `INPUT.json` under `storage/key_value_stores/default/` and run `apify run --purge` (Apify CLI required). Output lands in `storage/datasets/default/`.
## Notes & limits
- `maxJobs` is clamped to a minimum of 50.
- Page size is 20 (the v1 endpoint default); pagination stops on the first short page.
- If a detail-API call fails despite retries, the job is still pushed using only the search summary and billed as `job_item` rather than dropped.
- The built-in city table covers ~30 high-volume cities; smaller cities take the free-text fallback path automatically.
- "Remote" / "Hybrid" inside `cities` is rejected with a helpful error message because Naukri's workMode filter is broken on v1 and would silently return the full database.