Under maintenance

Pricing

$20.00/month + usage

Jobs Scrapper

Powerful AmbitionBox Job Scraper that extracts detailed job listings by role and location. Includes responsibilities, skills, qualifications, company insights, and Naukri integration for technical details. Fast, structured, and proxy-supported for large-scale data collection.

Rating

0.0

(0)

Developer

ai-scraper-labs

3 months ago

Last modified

AmbitionBox Job Scraper

✨ Converted to Node.js + Playwright + Apify Actor
This project has been migrated from Python/Scrapy to Node.js/Playwright while preserving 100% of the original scraping logic.
See README_CONVERSION.md for conversion details.

An Apify Actor that scrapes job listings from AmbitionBox using Playwright browser automation, with optional detailed information extraction from Naukri job pages.

Features

AmbitionBox Job Scraping - Extracts comprehensive job listings including:
- Job title, company, location, salary, and experience requirements
- Detailed job descriptions and responsibilities
- Required skills and qualifications
- Employment type and application links
Naukri Detail Extraction - Optionally fetches detailed job information from linked Naukri pages:
- Key responsibilities (structured list)
- Required skills and technologies
- Educational qualifications and experience requirements
- Detailed job descriptions
Company Information - Extracts comprehensive company details:
- Company overview and summary
- Founding year and employee count
- Company website and headquarters
- Work policies (WFH, hybrid, etc.)
- Complete benefits and perks list
Apify Platform Native - Built as a first-class Apify Actor:
- Automatic request scheduling with AutoscaledPool
- Built-in retry logic and error handling
- Cloud-persisted request queue
- Integrated dataset storage
Playwright Integration - Uses browser rendering to bypass anti-bot detection:
- Handles JavaScript-rendered content
- Bypasses AmbitionBox's anti-scraping measures
- Chromium headless browser
- Ensures reliable data extraction
Concurrency Support - Parallel processing for faster scraping:
- Configurable concurrency (1-10 workers)
- Request queue management
- Automatic rate limiting

Input Parameters

Configure the scraper through the Actor input:

role (required, string) - Job role or title to search for
- Example: "software engineer", "python developer", "data scientist"
location (optional, string) - Location to search jobs in
- Example: "bangalore", "mumbai", "delhi"
- Use "all" or "worldwide" for all locations
- Note: AmbitionBox primarily lists jobs in India
- If a specific location returns no results, automatically falls back to all locations
maxPages (optional, integer, default: 2) - Maximum number of listing pages to scrape
- Range: 1-50
maxJobs (optional, integer, default: 20) - Maximum number of jobs to scrape
- Set to 0 for unlimited
includeNaukriDetails (optional, boolean, default: true) - Whether to fetch detailed information from Naukri
- true: Comprehensive data (slower)
- false: Basic data only (3-4x faster)
proxyConfiguration (optional, object) - Proxy settings for the scraper
- Default: Apify Proxy with the RESIDENTIAL group, which gives the best success rates with AmbitionBox
- Datacenter proxies may experience higher timeout rates
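
As a rough illustration of the defaults and ranges listed above, the following sketch normalizes a raw input mapping. The names `ScraperInput` and `normalize_input` are hypothetical and not part of the Actor, and treating a missing location as "all" is an assumption rather than documented behavior.

```python
from dataclasses import dataclass
from typing import Any, Mapping


@dataclass(frozen=True)
class ScraperInput:
    """Normalized Actor input (hypothetical helper, not part of the Actor)."""
    role: str
    location: str = "all"
    max_pages: int = 2
    max_jobs: int = 20  # 0 means unlimited
    include_naukri_details: bool = True


def normalize_input(raw: Mapping[str, Any]) -> ScraperInput:
    """Apply the documented defaults and ranges to a raw input mapping."""
    role = str(raw.get("role") or "").strip()
    if not role:
        raise ValueError("'role' is required")

    # Assumption: a missing location is treated like "all".
    location = str(raw.get("location") or "all").strip().lower()
    if location == "worldwide":
        location = "all"

    max_pages = int(raw.get("maxPages", 2))
    if not 1 <= max_pages <= 50:
        raise ValueError("'maxPages' must be between 1 and 50")

    max_jobs = int(raw.get("maxJobs", 20))
    if max_jobs < 0:
        raise ValueError("'maxJobs' must be 0 (unlimited) or a positive integer")

    return ScraperInput(
        role=role,
        location=location,
        max_pages=max_pages,
        max_jobs=max_jobs,
        include_naukri_details=bool(raw.get("includeNaukriDetails", True)),
    )
```

Validating the input before a run starts makes a typo such as `maxPages: 500` fail immediately instead of after the browser has launched.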

Output Format

The Actor stores data in the default Apify dataset with this structure:

{
  "title": "Senior Software Engineer",
  "company": "Tech Company Pvt Ltd",
  "location": "Bangalore, Karnataka",
  "exp_level": "3-6 years",
  "salary_range": "₹10-18 LPA",
  "url": "https://www.ambitionbox.com/jobs/...",
  "apply_url": "https://www.naukri.com/job-listings-...",
  
  "about_this_role": "Full job description text...",
  
  "key_responsibility": [
    "Design and develop scalable backend systems.",
    "Collaborate with cross-functional teams to define features.",
    "Ensure code quality through reviews and testing."
  ],
  
  "required_skills": [
    "Python",
    "Django",
    "AWS",
    "SQL",
    "Docker"
  ],
  
  "required_qualifications": [
    "Bachelor's degree in Computer Science or related field.",
    "3+ years of experience in backend development."
  ],
  
  "benefits_perks": [
    "Health Insurance",
    "Work From Home",
    "Flexible Hours",
    "Learning & Development",
    "Paid Time Off"
  ],
  
  "company_info": {
    "name": "Tech Company Pvt Ltd",
    "Founded in": "2015",
    "Global Employee Count": "500-1000",
    "Website": "https://techcompany.com",
    "company_summary": "Leading technology company specializing in...",
    "work_policy": "Hybrid: 3 days WFO, Remote: 2 days WFH"
  },
  
  "job_type": "Full-time"
}
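
Because the record mixes scalars, lists, and a nested `company_info` object, a spreadsheet export needs some flattening. The sketch below shows one possible approach using only the standard library. `flatten_job` and `jobs_to_csv` are hypothetical helper names, and joining list values with "; " is an arbitrary choice made for this example.

```python
import csv
import io
from typing import Any, Dict, Iterable, List, Mapping


def flatten_job(job: Mapping[str, Any]) -> Dict[str, str]:
    """Turn one dataset record into a flat mapping of column name to text."""
    row: Dict[str, str] = {}
    for key, value in job.items():
        if isinstance(value, Mapping):
            # Nested objects such as company_info become dotted columns.
            for sub_key, sub_value in value.items():
                row[f"{key}.{sub_key}"] = str(sub_value)
        elif isinstance(value, (list, tuple)):
            # Lists such as required_skills are joined into one cell.
            row[key] = "; ".join(str(item) for item in value)
        else:
            row[key] = "" if value is None else str(value)
    return row


def jobs_to_csv(jobs: Iterable[Mapping[str, Any]]) -> str:
    """Render records as CSV text, using the union of all columns seen."""
    rows = [flatten_job(job) for job in jobs]
    fieldnames: List[str] = []
    for row in rows:
        for name in row:
            if name not in fieldnames:
                fieldnames.append(name)
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, restval="")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

Records scraped with includeNaukriDetails set to false may lack some fields, which is why the column list is built from every row rather than from the first one.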

How It Works

URL Construction - Builds AmbitionBox search URL from role and location parameters
Listing Extraction - Scrapes job listing pages using Scrapy's efficient crawling
Detail Parsing - For each job, extracts comprehensive information from detail pages
Naukri Integration - If enabled, follows "Apply on Naukri" links for additional details
Company Data - Fetches company overview and benefits from dedicated pages
Data Storage - Stores all structured data in Apify dataset
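
The first two steps, together with the location fallback described under Input Parameters, can be sketched as follows. This is a simplified illustration, not the Actor's code: the URL pattern after `/jobs/`, the `page` query parameter, and the `fetch_page` callback are all hypothetical stand-ins for the real search URL and the browser-based page scraper.

```python
import re
from typing import Callable, Dict, List

BASE_URL = "https://www.ambitionbox.com/jobs"
ALL_LOCATIONS = {"", "all", "worldwide"}


def slugify(text: str) -> str:
    """Lowercase the text and replace runs of other characters with '-'."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def build_search_url(role: str, location: str = "all", page: int = 1) -> str:
    """Build a search URL. The path and query layout are HYPOTHETICAL."""
    slug = f"{slugify(role)}-jobs"
    if location.strip().lower() not in ALL_LOCATIONS:
        slug += f"-in-{slugify(location)}"
    return f"{BASE_URL}/{slug}?page={page}"


def collect_jobs(
    role: str,
    location: str,
    max_pages: int,
    max_jobs: int,
    fetch_page: Callable[[str], List[Dict]],
) -> List[Dict]:
    """Crawl listing pages, falling back to all locations if none match."""

    def crawl(loc: str) -> List[Dict]:
        jobs: List[Dict] = []
        for page in range(1, max_pages + 1):
            batch = fetch_page(build_search_url(role, loc, page))
            if not batch:
                break  # no more results for this search
            jobs.extend(batch)
            if max_jobs and len(jobs) >= max_jobs:  # 0 means unlimited
                return jobs[:max_jobs]
        return jobs

    jobs = crawl(location)
    if not jobs and location.strip().lower() not in ALL_LOCATIONS:
        jobs = crawl("all")
    return jobs
```

Passing the page fetcher in as a callback keeps the paging and fallback logic testable without a browser.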

Technologies Used

Scrapy - Fast, high-level web scraping framework
Scrapy-Playwright - Browser automation integration for Scrapy
Playwright - Modern browser automation library
Apify SDK for Python - Actor framework and data storage
BeautifulSoup4 - HTML parsing (for complex extractions)
Regular Expressions - Advanced text extraction and cleaning

Why Playwright?

AmbitionBox employs sophisticated anti-bot detection that blocks standard HTTP requests, even when using proxies. Playwright integration provides:

✅ Real Browser Rendering - Executes JavaScript and renders pages like a real user
✅ Anti-Bot Bypass - Realistic browser fingerprinting and behavior
✅ Reliable Extraction - Ensures all dynamic content is loaded
✅ Scrapy Integration - Maintains all Scrapy benefits (pipelines, items, middlewares)

Advantages Over HTTP-Only Scraping

Reliability - 100% success rate vs 0% with HTTP requests
JavaScript Support - Handles dynamic content loading
Anti-Detection - Bypasses sophisticated bot detection
Future-Proof - Works even as sites add more JavaScript

Local Development

Prerequisites

Python 3.9+
Apify CLI

Installation

# Install Apify CLI
brew install apify-cli  # macOS
# or
npm install -g apify-cli  # via npm (requires Node.js)

# Pull the Actor
apify pull 

# Install dependencies
pip install -r requirements.txt

Running Locally

# Run with default input
apify run

# Or create/edit .actor/INPUT.json with your parameters

Example INPUT.json

{
  "role": "python developer",
  "location": "bangalore",
  "maxPages": 3,
  "maxJobs": 50,
  "includeNaukriDetails": true,
  "proxyConfiguration": {
    "useApifyProxy": true
  }
}

Performance Tips

Start Small - Test with maxPages: 1 and maxJobs: 10 first
Adjust Concurrency - Modify CONCURRENT_REQUESTS in spider settings for faster/slower scraping
Skip Naukri - Set includeNaukriDetails: false for basic info only (much faster)
Use Proxies - Enable Apify proxy to avoid rate limiting

Scrapy Settings

The spider uses these custom settings for optimal performance:

custom_settings = {
    'CONCURRENT_REQUESTS': 8,  # Parallel requests
    'DOWNLOAD_DELAY': 2,  # Delay between requests (seconds)
    'ROBOTSTXT_OBEY': True,  # Respect robots.txt
    'USER_AGENT': 'Mozilla/5.0...',  # Custom user agent
}

You can modify these in src/spiders/ambitionbox.py if needed.

Troubleshooting

No jobs found

The website structure may have changed
Check if the search URL is correct
Try increasing DOWNLOAD_DELAY if pages load slowly

Incomplete data

Enable includeNaukriDetails for comprehensive extraction
Check if company pages are accessible
Review logs for specific errors

Rate limiting

Increase DOWNLOAD_DELAY in settings
Reduce CONCURRENT_REQUESTS
Ensure proxy configuration is enabled
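
A more conservative variant of the spider settings could look like the sketch below. The `throttled` helper and its default factors are hypothetical. `AUTOTHROTTLE_ENABLED` is a standard Scrapy setting that does not appear in this project's documented settings, so enabling it here is an assumption about what may help.

```python
from typing import Any, Dict, Mapping

# Values taken from the "Scrapy Settings" section of this README.
BASE_SETTINGS: Dict[str, Any] = {
    "CONCURRENT_REQUESTS": 8,
    "DOWNLOAD_DELAY": 2,
    "ROBOTSTXT_OBEY": True,
}


def throttled(
    settings: Mapping[str, Any],
    delay_factor: float = 2.0,
    max_concurrency: int = 2,
) -> Dict[str, Any]:
    """Return a copy of the settings with a longer delay and fewer workers."""
    tuned = dict(settings)
    tuned["DOWNLOAD_DELAY"] = settings.get("DOWNLOAD_DELAY", 0) * delay_factor
    # Scrapy's default for CONCURRENT_REQUESTS is 16 when the key is absent.
    current = settings.get("CONCURRENT_REQUESTS", 16)
    tuned["CONCURRENT_REQUESTS"] = max(1, min(current, max_concurrency))
    # Assumption: letting Scrapy adapt the delay to server latency helps here.
    tuned["AUTOTHROTTLE_ENABLED"] = True
    return tuned
```

With the documented settings, `throttled(BASE_SETTINGS)` doubles the delay to 4 seconds and caps concurrency at 2 while leaving the original dictionary unchanged.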

Proxy timeouts

Switch to residential proxies in proxy configuration (highly recommended)
Residential proxies have much better success rates than datacenter proxies
Update the input so that proxyConfiguration includes "apifyProxyGroups": ["RESIDENTIAL"]
Note: Residential proxies consume more proxy credits but significantly improve reliability
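
For illustration, the small sketch below adds the residential group to an existing input and prints the resulting JSON. The helper name `with_residential_proxy` is hypothetical. The `useApifyProxy` and `apifyProxyGroups` keys are the ones this README already uses.

```python
import json
from typing import Any, Dict, Mapping


def with_residential_proxy(actor_input: Mapping[str, Any]) -> Dict[str, Any]:
    """Return a copy of the input that requests the RESIDENTIAL proxy group."""
    updated = dict(actor_input)
    proxy = dict(updated.get("proxyConfiguration") or {})
    proxy["useApifyProxy"] = True
    proxy["apifyProxyGroups"] = ["RESIDENTIAL"]
    updated["proxyConfiguration"] = proxy
    return updated


if __name__ == "__main__":
    example = {
        "role": "python developer",
        "location": "bangalore",
        "proxyConfiguration": {"useApifyProxy": True},
    }
    print(json.dumps(with_residential_proxy(example), indent=2))
```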

Architecture

src/
├── spiders/
│   ├── __init__.py
│   ├── title.py              # Original title spider
│   └── ambitionbox.py        # AmbitionBox job scraper
├── items.py                   # Item definitions
├── pipelines.py               # Data processing pipelines
├── middlewares.py             # Request/response middlewares
├── settings.py                # Scrapy settings
├── main.py                    # Actor entry point
└── __main__.py                # Execution wrapper

License

Apache 2.0.
