Job-nexus
Pricing
from $0.02 / 1,000 job_listings
This Actor scrapes job listings from public job boards and enriches them into structured, analysis-ready data. It is designed for recruiters, job market analysts, startups, and AI/LLM pipelines that need reliable job data without manual effort.
Rating: 5.0 (7)
Developer: sujan shetty
Actor stats
Bookmarked: 4
Total users: 8
Monthly active users: 1
Last modified: a day ago
Job Nexus
A powerful Apify Actor that scrapes job listings from RemoteOK, LinkedIn, and Naukri. It allows you to search for jobs by keywords and optionally clean the job descriptions using Google's Gemini AI.
Note: The LinkedIn and Naukri scrapers are experimental and work best with small volumes. They use headless browsers (Playwright), which makes the actor heavier and slower than the RemoteOK-only version.
Features
- Multi-Source Scraping:
  - RemoteOK (Default): Fast, API-based scraping.
  - LinkedIn (Experimental): Scrapes public job search pages. Includes smart filtering to skip "masked" jobs (hidden titles/companies). Note: Strict rate limits apply.
  - Naukri (Experimental): Scrapes job listings from Naukri.com.
- Keyword Search: Scrape jobs matching specific keywords (e.g., "devops", "react", "data scientist").
- AI-Powered Cleaning: Integrates with Google Gemini 2.5 Flash Lite to convert messy HTML job descriptions into clean, readable plain text.
- Configurable Limit: Set a maximum number of jobs to scrape to control costs and runtime.
- Proxy Support: Built-in proxy configuration to avoid IP blocking.
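The "masked" job filtering mentioned for LinkedIn can be pictured with a small sketch. It assumes a masked listing shows up with a title or company that is empty or made up only of asterisks; that representation, and the helper name `is_masked`, are assumptions for illustration, not the actor's actual code.

```python
import re
from typing import Mapping, Optional

# Assumption: a masked field is empty, whitespace, or asterisks only.
_MASK_PATTERN = re.compile(r"[\s*]*")


def is_masked(job: Mapping[str, Optional[str]]) -> bool:
    """Return True if the title or company looks hidden (hypothetical rule)."""
    return any(
        _MASK_PATTERN.fullmatch(job.get(field) or "") is not None
        for field in ("title", "company")
    )


jobs = [
    {"title": "*******", "company": "Acme"},
    {"title": "Data Scientist", "company": "Acme"},
]
# Keep only listings whose title and company are both visible.
visible = [job for job in jobs if not is_masked(job)]
```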
Inputs
| Input | Type | Default | Description |
|---|---|---|---|
| `keywords` | Array | ["data scientist"] | List of keywords to search for. |
| `location` | String | "Bangalore" | Job location filter (e.g. "San Francisco", "Remote"). |
| `maxJobs` | Integer | 10 | Maximum number of jobs to scrape per source. |
| `useLLM` | Boolean | false | Enable AI cleaning of job descriptions. |
| `geminiApiKey` | String | "" | Required if `useLLM` is true. Your Google Gemini API Key. |
| `enableRemoteOK` | Boolean | true | Enable scraping from RemoteOK. |
| `enableLinkedIn` | Boolean | false | Enable scraping from LinkedIn (Experimental). |
| `enableNaukri` | Boolean | false | Enable scraping from Naukri (Experimental). |
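As a rough illustration of how these inputs fit together, the sketch below merges user overrides onto the documented defaults and enforces the one dependency the table states (`geminiApiKey` is required when `useLLM` is true). The helper `build_input` is hypothetical and is not part of the actor; it only shows one way to prepare an input object before starting a run.

```python
import copy
from typing import Any, Dict, Optional

# Defaults copied from the input table.
DEFAULTS: Dict[str, Any] = {
    "keywords": ["data scientist"],
    "location": "Bangalore",
    "maxJobs": 10,
    "useLLM": False,
    "geminiApiKey": "",
    "enableRemoteOK": True,
    "enableLinkedIn": False,
    "enableNaukri": False,
}


def build_input(overrides: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Merge overrides onto the defaults and check the useLLM/key dependency."""
    run_input = copy.deepcopy(DEFAULTS)
    run_input.update(overrides or {})
    if run_input["useLLM"] and not run_input["geminiApiKey"]:
        raise ValueError("geminiApiKey is required when useLLM is true")
    return run_input


# Example: search two keywords on RemoteOK and LinkedIn, without AI cleaning.
run_input = build_input({"keywords": ["devops", "react"], "enableLinkedIn": True})
```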
Gemini AI Integration
This actor supports Google Gemini 2.5 Flash Lite for cleaning job descriptions. When enabled, the actor sends the raw HTML description to Gemini and replaces it with a clean, plain-text version.
How to use AI Cleaning:
- Get a free API Key from Google AI Studio.
- Set `useLLM` to `true` in the input configuration.
- Paste your API Key into the `geminiApiKey` field.
Note on Rate Limits: The actor includes a built-in delay (4 seconds) between AI requests to respect the free tier rate limits of the Gemini API.
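The fixed delay between AI requests could look roughly like the sketch below. It is an illustration under stated assumptions, not the actor's implementation: `clean_fn` stands in for whatever function calls Gemini, and the `sleep` parameter exists only so the pause can be swapped out in tests.

```python
import time
from typing import Callable, Iterable, List

AI_REQUEST_DELAY_SECONDS = 4.0  # delay stated in the rate-limit note


def clean_descriptions(
    raw_descriptions: Iterable[str],
    clean_fn: Callable[[str], str],  # hypothetical wrapper around the Gemini call
    delay: float = AI_REQUEST_DELAY_SECONDS,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Clean descriptions one at a time, pausing between consecutive requests."""
    cleaned: List[str] = []
    for index, html in enumerate(raw_descriptions):
        if index > 0:
            sleep(delay)  # pause between requests, not before the first one
        cleaned.append(clean_fn(html))
    return cleaned
```

With a 4-second pause, cleaning N descriptions adds at least 4 × (N − 1) seconds to the run, which is worth keeping in mind when choosing `maxJobs`.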
Output
The actor stores results in the default Apify dataset. Each item contains:
- `title`
- `company`
- `location`
- `applyUrl` (Direct link to job application)
- `date`
- `source` (RemoteOK, LinkedIn, or Naukri)
- `description` (Cleaned text if AI is used, otherwise raw HTML or skipped for some sources)
- `salary` (If available)
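For downstream processing, the item fields listed here can be modeled as a typed record. The sketch below is an assumption-laden illustration: the field names come from the list, but the value types (for example, `date` as a string) and the helper `count_by_source` are guesses and not a published schema.

```python
from collections import Counter
from typing import Dict, Iterable, Optional, TypedDict


class JobItem(TypedDict, total=False):
    # Field names from the output list; types are assumed.
    title: str
    company: str
    location: str
    applyUrl: str
    date: str
    source: str  # "RemoteOK", "LinkedIn", or "Naukri"
    description: Optional[str]  # may be skipped for some sources
    salary: Optional[str]  # present only if available


def count_by_source(items: Iterable[JobItem]) -> Dict[str, int]:
    """Count dataset items per source; items without a source count as 'unknown'."""
    return dict(Counter(item.get("source", "unknown") for item in items))


items = [
    {"title": "ML Engineer", "source": "RemoteOK"},
    {"title": "Data Scientist", "source": "LinkedIn"},
    {"title": "DevOps Engineer", "source": "RemoteOK"},
]
summary = count_by_source(items)
```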
Pro Tip: If you see "There are no items on this page" in the Overview tab, click on the "All fields" tab to see all scraped data columns.
Usage Tips
- LinkedIn & Naukri: These sites have strict anti-bot protections. If you see few results or errors, it's likely due to IP blocking. Using residential proxies (configured in Apify) can help.
- Cost: Enabling LinkedIn or Naukri consumes "Compute Units" faster because the actor launches a full Chrome browser for each request.
License
ISC