Jobs Scrapper

Pricing

$20.00/month + usage

An AmbitionBox job scraper that extracts detailed job listings by role and location, including responsibilities, skills, qualifications, and company information, with optional detail extraction from linked Naukri pages. Output is structured, and proxies are supported for large-scale data collection.

Rating: 0.0 (0)

Developer: ai-scraper-labs (Maintained by Community)

Actor stats

  • Bookmarked: 0
  • Total users: 4
  • Monthly active users: 1
  • Last modified: 8 days ago

AmbitionBox Job Scraper

An Apify Actor that scrapes job listings from AmbitionBox and optionally fetches detailed information from Naukri job pages.

Features

  • AmbitionBox Job Scraping - Extracts job listings including title, company, location, salary, and experience requirements
  • Naukri Detail Extraction - Optionally fetches comprehensive job details from linked Naukri pages including:
    • Key responsibilities
    • Required skills and technologies
    • Educational qualifications
    • Detailed job descriptions
  • Company Information - Extracts company details including ratings, headquarters, employee count, and work policies
  • Parallel Processing - Multi-threaded detail extraction for faster scraping
  • Configurable Limits - Control number of pages and jobs to scrape

Input Parameters

  • role (required, string) - Job role or title to search for (e.g., "software engineer", "python developer")
  • location (optional, string) - Location to search jobs in (e.g., "nagpur", "bangalore"). Leave empty for all locations
  • maxPages (optional, integer, default: 2) - Maximum number of listing pages to scrape (1-50)
  • maxJobs (optional, integer, default: 20) - Maximum number of jobs to scrape. Set to 0 for unlimited
  • includeNaukriDetails (optional, boolean, default: true) - Whether to fetch detailed information from Naukri pages (slower but more comprehensive)
  • maxWorkers (optional, integer, default: 3) - Number of parallel workers for detail extraction (1-10)
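
The defaults and ranges above can be sketched as a small normalization step. This is only an illustration: `normalize_input` and `DEFAULTS` are hypothetical names, and clamping out-of-range values (rather than rejecting them) is an assumption, not necessarily what the Actor does.

```python
from typing import Any, Dict

# Defaults copied from the parameter list above.
DEFAULTS: Dict[str, Any] = {
    "location": "",
    "maxPages": 2,
    "maxJobs": 20,
    "includeNaukriDetails": True,
    "maxWorkers": 3,
}


def normalize_input(raw: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: apply defaults and clamp values to the documented ranges."""
    role = str(raw.get("role") or "").strip()
    if not role:
        raise ValueError("'role' is required")

    # Explicit values win over defaults; None is treated as "not provided".
    cfg = {**DEFAULTS, **{k: v for k, v in raw.items() if v is not None}}
    cfg["role"] = role
    cfg["location"] = str(cfg["location"]).strip()
    cfg["maxPages"] = min(max(int(cfg["maxPages"]), 1), 50)      # documented range 1-50
    cfg["maxWorkers"] = min(max(int(cfg["maxWorkers"]), 1), 10)  # documented range 1-10
    cfg["maxJobs"] = max(int(cfg["maxJobs"]), 0)                 # 0 means unlimited
    cfg["includeNaukriDetails"] = bool(cfg["includeNaukriDetails"])
    return cfg
```

For example, `normalize_input({"role": "python developer", "maxPages": 99})` would keep the role, cap `maxPages` at 50, and fill in the remaining defaults.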

Output

The Actor stores data in the default dataset with the following structure:

{
  "title": "Software Engineer",
  "company": "Tech Company",
  "location": "Bangalore",
  "exp_level": "2-5 years",
  "salary_range": "₹8-15 LPA",
  "url": "https://www.ambitionbox.com/jobs/...",
  "apply_url": "https://www.naukri.com/job-listings-...",
  "about_this_role": "Job description text...",
  "key_responsibility": ["Responsibility 1", "Responsibility 2"],
  "required_skills": ["Python", "SQL", "AWS"],
  "required_qualifications": ["Bachelor's degree in CS"],
  "preferred_qualifications": ["Master's degree preferred"],
  "company_info": {
    "name": "Tech Company",
    "rating": "4.2",
    "founded": "2010",
    "headquarters": "Bangalore",
    "website": "https://example.com"
  },
  "job_type": "Full-time",
  "job_status": "Open",
  "is_active": true
}
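
Once records in this shape have been exported from the dataset (for example as JSON), they can be filtered with plain Python. The sketch below assumes each item is a dict with the fields shown above; `jobs_with_skill` is a hypothetical helper, not part of the Actor.

```python
from typing import Any, Dict, Iterable, List


def jobs_with_skill(items: Iterable[Dict[str, Any]], skill: str) -> List[Dict[str, Any]]:
    """Return the items whose required_skills list contains `skill` (case-insensitive)."""
    wanted = skill.casefold()
    return [
        item
        for item in items
        # required_skills may be missing or None when Naukri details were skipped.
        if any(str(s).casefold() == wanted for s in (item.get("required_skills") or []))
    ]


sample = [
    {"title": "Software Engineer", "company": "Tech Company",
     "required_skills": ["Python", "SQL", "AWS"]},
    {"title": "Data Analyst", "company": "Other Company", "required_skills": None},
]
python_jobs = jobs_with_skill(sample, "python")
```

Guarding against a missing `required_skills` field matters because that field comes from the optional Naukri detail step.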

How It Works

  1. Build Search URL - Constructs the AmbitionBox search URL from the role and location
  2. Scrape Listings - Extracts job listings from multiple pages using Selenium and BeautifulSoup
  3. Extract Details - For each job, visits its detail page to extract comprehensive information
  4. Fetch Naukri Data - If enabled, follows "Apply on Naukri" links to get additional details such as:
    • Detailed responsibilities broken down into bullet points
    • Complete skills list
    • Educational requirements
  5. Store Data - Saves all data to the Apify dataset in structured JSON format
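
How the page and job limits might interact during step 2 can be sketched as follows. This is a minimal illustration under stated assumptions: `collect_listings` and the `fetch_page` callback are hypothetical, and stopping at the first empty page is an assumed behavior rather than something the Actor documents.

```python
from typing import Callable, Dict, List


def collect_listings(
    fetch_page: Callable[[int], List[Dict]],
    max_pages: int,
    max_jobs: int,
) -> List[Dict]:
    """Walk listing pages 1..max_pages, stopping early once max_jobs is reached.

    max_jobs == 0 means unlimited, matching the maxJobs input parameter.
    """
    jobs: List[Dict] = []
    for page in range(1, max_pages + 1):
        page_jobs = fetch_page(page)
        if not page_jobs:
            break  # assumed: an empty page means there are no more results
        jobs.extend(page_jobs)
        if max_jobs and len(jobs) >= max_jobs:
            return jobs[:max_jobs]
    return jobs


def fake_fetch_page(page: int) -> List[Dict]:
    """Stand-in for the real scraper: three pages of ten jobs each."""
    if page > 3:
        return []
    return [{"title": f"job-{page}-{i}"} for i in range(10)]
```

With the stand-in above, `collect_listings(fake_fetch_page, max_pages=5, max_jobs=15)` stops partway through the second page, while `max_jobs=0` keeps going until the page limit or an empty page.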

Technologies Used

  • Apify SDK - Actor framework and data storage
  • Selenium - Browser automation
  • BeautifulSoup4 - HTML parsing
  • Python ThreadPoolExecutor - Parallel detail extraction
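
Parallel detail extraction with `ThreadPoolExecutor` could look roughly like the sketch below. `enrich_jobs` and `fetch_details` are hypothetical names, and keeping the listing data when a detail fetch fails is an assumption about error handling, not confirmed behavior.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def enrich_jobs(
    jobs: List[Dict],
    fetch_details: Callable[[Dict], Dict],
    max_workers: int = 3,
) -> List[Dict]:
    """Fetch details for each job in parallel and merge them into the listing data."""

    def enrich(job: Dict) -> Dict:
        try:
            return {**job, **fetch_details(job)}
        except Exception:
            # Assumed policy: keep the basic listing if the detail page fails.
            return job

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map returns results in the same order as the input jobs.
        return list(pool.map(enrich, jobs))


def fake_fetch_details(job: Dict) -> Dict:
    """Stand-in for the real detail scraper."""
    if job["title"] == "broken":
        raise RuntimeError("detail page unavailable")
    return {"required_skills": ["Python"]}


enriched = enrich_jobs([{"title": "ok"}, {"title": "broken"}], fake_fetch_details, max_workers=2)
```

Threads suit this step because the work is dominated by waiting on page loads rather than CPU time.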

Local Development

Prerequisites

  • Python 3.9+
  • Chrome/Firefox browser
  • ChromeDriver/GeckoDriver

Installation

# Install Apify CLI
brew install apify-cli # macOS (Homebrew)
# or
npm install -g apify-cli # any platform with Node.js
# Pull the Actor source (only if you are fetching it from Apify)
apify pull <ActorId>
# Install dependencies
pip install -r requirements.txt

Running Locally

# Run with default input
apify run
# Or edit .actor/INPUT.json first with your desired parameters

Example INPUT.json

{
  "role": "python developer",
  "location": "bangalore",
  "maxPages": 3,
  "maxJobs": 50,
  "includeNaukriDetails": true,
  "maxWorkers": 5
}

Deployment

# Login to Apify
apify login
# Deploy to Apify platform
apify push

Performance Tips

  • Start small - Test with maxPages: 1 and maxJobs: 10 first
  • Adjust workers - Increase maxWorkers (up to 10) for faster processing on powerful machines
  • Skip Naukri - Set includeNaukriDetails: false for 3-4x faster scraping if you only need basic info
  • Use headless mode - The Actor runs in headless mode on Apify for better performance

Troubleshooting

  • No jobs found - The website structure may have changed. Check the page source
  • Slow performance - Disable includeNaukriDetails, or reduce maxWorkers if the machine is running out of CPU or memory
  • Browser crashes - The Actor automatically falls back from Chrome to Firefox on ARM-based Macs
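
A browser fallback of the kind described above can be sketched generically. `first_working` is a hypothetical helper; the Actor's actual fallback logic is not shown in this README, so treat this as one possible shape rather than its implementation.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def first_working(factories: Sequence[Tuple[str, Callable[[], T]]]) -> Tuple[str, T]:
    """Try each (name, factory) pair in order and return the first one that starts."""
    errors: List[str] = []
    for name, factory in factories:
        try:
            return name, factory()
        except Exception as exc:
            errors.append(f"{name}: {exc}")
    raise RuntimeError("no browser could be started: " + "; ".join(errors))


def broken_chrome() -> str:
    """Stand-in for a Chrome start-up failure."""
    raise OSError("chrome binary not found")


name, driver = first_working([("chrome", broken_chrome), ("firefox", lambda: "fake-driver")])
```

With Selenium installed, the factories could be `lambda: webdriver.Chrome(options=...)` and `lambda: webdriver.Firefox(options=...)`, so a Chrome start-up failure falls through to Firefox.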

License

Apache 2.0