Job-nexus

Pricing

from $0.02 / 1,000 job_listings

This Actor scrapes job listings from public job boards and enriches them into structured, analysis-ready data. It is designed for recruiters, job market analysts, startups, and AI/LLM pipelines that need reliable job data without manual effort.

Rating

5.0 (7)

Developer

sujan shetty

Maintained by Community

Actor stats

Bookmarked: 4
Total users: 8
Monthly active users: 1
Last modified: a day ago

Job Nexus

Job Nexus is an Apify Actor that scrapes job listings from RemoteOK, LinkedIn, and Naukri. You can search for jobs by keyword and optionally clean the job descriptions using Google's Gemini AI.

Note: The LinkedIn and Naukri scrapers are experimental and work best with small volumes. They use headless browsers (Playwright), which makes the Actor heavier and slower than the RemoteOK-only version.

Features

  • Multi-Source Scraping:
    • RemoteOK (Default): Fast, API-based scraping.
    • LinkedIn (Experimental): Scrapes public job search pages. Includes smart filtering to skip "masked" jobs (hidden titles/companies). Note: Strict rate limits apply.
    • Naukri (Experimental): Scrapes job listings from Naukri.com.
  • Keyword Search: Scrape jobs matching specific keywords (e.g., "devops", "react", "data scientist").
  • AI-Powered Cleaning: Optional integration with Google Gemini 2.5 Flash Lite that converts messy HTML job descriptions into clean, readable plain text.
  • Configurable Limit: Set a maximum number of jobs to scrape per source to control costs and runtime.
  • Proxy Support: Built-in proxy configuration to avoid IP blocking.
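
The "masked" job filtering mentioned for LinkedIn can be pictured with a small sketch. The Actor's real rule is not documented here, so this assumes a masked field is either empty or contains a run of three or more asterisks; `is_masked` and `drop_masked` are hypothetical helper names.

```python
import re
from typing import Iterable

# Assumption: a masked value is blank or contains a run of 3+ asterisks.
# The Actor's actual masking rule is not documented and may differ.
_MASK_RE = re.compile(r"\*{3,}")


def is_masked(value: str) -> bool:
    """Return True if a title or company value looks hidden."""
    text = value.strip()
    return not text or bool(_MASK_RE.search(text))


def drop_masked(jobs: Iterable[dict]) -> list[dict]:
    """Keep only jobs whose title and company are both readable."""
    return [
        job
        for job in jobs
        if not is_masked(job.get("title", ""))
        and not is_masked(job.get("company", ""))
    ]
```

A filter like this trades recall for quality: listings with hidden fields are dropped rather than stored with placeholder text.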

Inputs

| Input | Type | Default | Description |
| --- | --- | --- | --- |
| keywords | Array | ["data scientist"] | List of keywords to search for. |
| location | String | "Bangalore" | Job location filter (e.g. "San Francisco", "Remote"). |
| maxJobs | Integer | 10 | Maximum number of jobs to scrape per source. |
| useLLM | Boolean | false | Enable AI cleaning of job descriptions. |
| geminiApiKey | String | "" | Required if useLLM is true. Your Google Gemini API Key. |
| enableRemoteOK | Boolean | true | Enable scraping from RemoteOK. |
| enableLinkedIn | Boolean | false | Enable scraping from LinkedIn (Experimental). |
| enableNaukri | Boolean | false | Enable scraping from Naukri (Experimental). |
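
As an illustration, the inputs can be assembled and checked in Python before starting a run. This is a minimal sketch: the field names and defaults come from the Inputs listing, while `build_run_input`, `run_actor`, the "at least one source" check, and the `actor_id` argument are assumptions made for the example. `run_actor` relies on the `apify-client` package and needs your own API token and the Actor's ID from the Store page.

```python
from typing import Any

SOURCE_FLAGS = ("enableRemoteOK", "enableLinkedIn", "enableNaukri")


def build_run_input(
    keywords: list[str],
    location: str = "Bangalore",
    max_jobs: int = 10,
    use_llm: bool = False,
    gemini_api_key: str = "",
    enable_remoteok: bool = True,
    enable_linkedin: bool = False,
    enable_naukri: bool = False,
) -> dict[str, Any]:
    """Build an input dict with the documented field names and defaults."""
    if not keywords:
        raise ValueError("keywords must contain at least one search term")
    if max_jobs < 1:
        raise ValueError("maxJobs must be a positive integer")
    # Documented rule: geminiApiKey is required when useLLM is true.
    if use_llm and not gemini_api_key:
        raise ValueError("geminiApiKey is required when useLLM is true")
    run_input = {
        "keywords": list(keywords),
        "location": location,
        "maxJobs": max_jobs,
        "useLLM": use_llm,
        "geminiApiKey": gemini_api_key,
        "enableRemoteOK": enable_remoteok,
        "enableLinkedIn": enable_linkedin,
        "enableNaukri": enable_naukri,
    }
    # Assumption: a run with every source disabled is not useful.
    if not any(run_input[flag] for flag in SOURCE_FLAGS):
        raise ValueError("enable at least one source")
    return run_input


def run_actor(token: str, actor_id: str, run_input: dict[str, Any]) -> list[dict]:
    """Start the Actor and return its dataset items.

    actor_id is a placeholder: copy the real ID from the Store page.
    """
    from apify_client import ApifyClient  # imported lazily; pip install apify-client

    client = ApifyClient(token)
    run = client.actor(actor_id).call(run_input=run_input)
    if run is None:
        raise RuntimeError("the Actor run did not return run details")
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())
```

Validating locally catches a missing Gemini key before any compute is spent on a run that would fail.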

Gemini AI Integration

This actor supports Google Gemini 2.5 Flash Lite for cleaning job descriptions. When enabled, the actor sends the raw HTML description to Gemini and replaces it with a clean, plain-text version.

How to use AI Cleaning:

  1. Get a free API Key from Google AI Studio.
  2. Set useLLM to true in the input configuration.
  3. Paste your API Key into the geminiApiKey field.

Note on Rate Limits: The actor includes a built-in delay (4 seconds) between AI requests to respect the free tier rate limits of the Gemini API.
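
To make the effect of the 4-second delay concrete, here is a minimal sketch of a rate-limited cleaning loop. It does not call Gemini: `clean_fn` stands in for whatever function sends one description to the model, and both `clean_descriptions` and `min_added_runtime` are hypothetical names. It assumes the pause happens only between consecutive requests.

```python
import time
from typing import Callable, Iterable

AI_DELAY_SECONDS = 4.0  # delay between AI requests stated in the note above


def clean_descriptions(
    raw_descriptions: Iterable[str],
    clean_fn: Callable[[str], str],
    delay: float = AI_DELAY_SECONDS,
    sleep: Callable[[float], None] = time.sleep,
) -> list[str]:
    """Clean descriptions one at a time, pausing between requests."""
    cleaned: list[str] = []
    for index, raw in enumerate(raw_descriptions):
        if index > 0:
            sleep(delay)  # assumption: wait only between consecutive requests
        cleaned.append(clean_fn(raw))
    return cleaned


def min_added_runtime(job_count: int, delay: float = AI_DELAY_SECONDS) -> float:
    """Seconds spent waiting, excluding the AI calls themselves."""
    return max(job_count - 1, 0) * delay
```

Under these assumptions, a run with the default maxJobs of 10 from a single source spends at least 36 seconds waiting, before counting the time of the AI requests themselves.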

Output

The actor stores results in the default Apify dataset. Each item contains:

  • title
  • company
  • location
  • applyUrl (Direct link to job application)
  • date
  • source (RemoteOK, LinkedIn, or Naukri)
  • description (cleaned plain text if AI cleaning is enabled; otherwise raw HTML, or omitted for some sources)
  • salary (If available)

Pro Tip: If you see "There are no items on this page" in the Overview tab, click on the "All fields" tab to see all scraped data columns.
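
Because description can be raw HTML when AI cleaning is off, downstream code may want a local fallback. The sketch below models an item with the fields listed above and strips tags using only the standard library. `JobItem`, `html_to_text`, and `plain_description` are hypothetical names, and every field is treated as optional, since salary and description are not always present.

```python
from html.parser import HTMLParser
from typing import Optional, TypedDict


class JobItem(TypedDict, total=False):
    """Dataset item shape, based on the field list above."""
    title: str
    company: str
    location: str
    applyUrl: str
    date: str
    source: str
    description: str
    salary: str


class _TextExtractor(HTMLParser):
    """Collects text nodes and ignores tags."""

    def __init__(self) -> None:
        super().__init__()
        self.parts: list[str] = []

    def handle_data(self, data: str) -> None:
        self.parts.append(data)


def html_to_text(html: str) -> str:
    """Strip tags and collapse whitespace into single spaces."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(" ".join(parser.parts).split())


def plain_description(item: JobItem) -> Optional[str]:
    """Return a plain-text description, or None if the item has none."""
    description = item.get("description")
    if not description:
        return None
    # Text without tags is assumed to be already cleaned; leave it untouched.
    if "<" not in description:
        return description
    return html_to_text(description)
```

This is a rough substitute for AI cleaning: it removes markup but does not restructure or summarize the text.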

Usage Tips

  • LinkedIn & Naukri: These sites have strict anti-bot protections. If you see few results or errors, it's likely due to IP blocking. Using residential proxies (configured in Apify) can help.
  • Cost: Enabling LinkedIn/Naukri uses "Compute Units" faster because it launches a full Chrome browser for each request.

License

ISC