Pricing

$20.00/month + usage

Developer

charith wijesundara

Last modified: 2 months ago

Advanced LinkedIn Jobs Scraper with Agentic AI

An intelligent, high-performance LinkedIn job scraper powered by a LangGraph multi-agent system, LlamaIndex for semantic search, Crawlee + Playwright for robust web scraping, and multi-LLM support covering OpenAI (GPT-4o) and Google Gemini. Beyond basic job scraping, this Actor provides AI-powered job matching, market insights, and personalized recommendations.

[!NOTE] Versatile use cases: designed for both professional work (HR, recruitment, market research) and personal use (job hunting, career planning). Whether you're building a talent pipeline or looking for your next role, this tool covers both.

🌟 Features

🤖 Agentic AI System

Multi-LLM Support: Built-in support for OpenAI and Google Gemini models
Multi-agent orchestration using LangGraph
Intelligent workflow that scrapes, indexes, analyzes, and matches jobs
Semantic understanding of job requirements and user profiles

🎯 Smart Job Matching

Profile-based matching with relevance scoring (0-1)
Skill analysis with percentage match calculations
Location, salary, and experience matching
Explainable AI - get reasons why each job matches your profile

📊 Market Insights

Top skills in demand with frequency analysis
Salary trends and averages
Location distribution across jobs
Experience level and employment type breakdowns
Remote work availability statistics

🛡️ Robust Scraping

Anti-bot protection with residential proxies
Intelligent pagination and dynamic content handling
Error recovery with retry logic
Rate limiting to avoid blocks
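
The retry and rate-limiting behaviour listed above could look roughly like the following. This is a minimal sketch, not the Actor's actual code: `fetch_with_retry`, its parameters, and the backoff schedule are all hypothetical names and values chosen for illustration.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def fetch_with_retry(
    fetch: Callable[[], T],
    max_attempts: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `fetch`, retrying with exponential backoff plus jitter on failure."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return fetch()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            # Wait 1x, 2x, 4x ... the base delay, stretched by up to 50% jitter
            # so that repeated requests do not arrive at a fixed rhythm.
            delay = base_delay * 2 ** (attempt - 1)
            sleep(delay * (1 + random.uniform(0, 0.5)))
    raise RuntimeError("unreachable")
```

The `sleep` argument is injectable so the waiting can be replaced in tests; in normal use it defaults to `time.sleep`.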

🚀 Quick Start

Basic Usage

{
  "jobTitle": "Python Developer",
  "locations": ["New York, NY", "Remote"],
  "maxJobs": 50,
  "userSkills": ["Python", "FastAPI", "PostgreSQL", "Docker"],
  "userExperience": 3,
  "enableJobMatching": true,
  "enableMarketAnalysis": true
}
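
The same input can be sent from Python with the `apify-client` package. The sketch below makes several assumptions: `ACTOR_ID` is a placeholder to be replaced with this Actor's real ID, the API token is read from an `APIFY_TOKEN` environment variable (a name chosen here), and `build_basic_input` and `run_actor` are hypothetical helper names.

```python
import os
from typing import Any

# Placeholder (assumption): replace with the Actor ID shown on its Apify page.
ACTOR_ID = "<username>/<actor-name>"


def build_basic_input(job_title: str, skills: list[str], max_jobs: int = 50) -> dict[str, Any]:
    """Build a run input using the field names from the basic example above."""
    return {
        "jobTitle": job_title,
        "locations": ["New York, NY", "Remote"],
        "maxJobs": max_jobs,
        "userSkills": skills,
        "userExperience": 3,
        "enableJobMatching": True,
        "enableMarketAnalysis": True,
    }


def run_actor(run_input: dict[str, Any]) -> list[dict[str, Any]]:
    """Start the Actor, wait for it to finish, and return its dataset items."""
    # Imported here so the input builder works without the package installed.
    from apify_client import ApifyClient  # pip install apify-client

    client = ApifyClient(os.environ["APIFY_TOKEN"])
    run = client.actor(ACTOR_ID).call(run_input=run_input)
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())
```

Calling `run_actor(build_basic_input("Python Developer", ["Python", "FastAPI"]))` would then start a run and return the matched jobs as a list of dictionaries.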

Advanced Configuration (Google Gemini)

{
  "jobTitle": "Machine Learning Engineer",
  "locations": ["San Francisco, CA", "Remote"],
  "maxJobs": 100,
  "datePosted": "week",
  "experienceLevel": ["Mid-Senior level", "Director"],
  "remoteFilter": "Remote",
  
  "userSkills": ["Python", "TensorFlow", "PyTorch", "MLOps"],
  "userExperience": 5,
  "salaryMin": 150000,
  "salaryMax": 250000,
  "preferredJobTypes": ["Full-time"],
  "mustHaveSkills": ["Python", "Machine Learning"],
  "remotePreference": "Remote",
  
  "llmProvider": "google",
  "modelName": "gemini-1.5-pro",
  "googleApiKey": "YOUR_GOOGLE_API_KEY",
  "proxyConfiguration": {
    "useApifyProxy": true,
    "apifyProxyGroups": ["RESIDENTIAL"]
  }
}

📥 Input Parameters

Job Search Criteria

Parameter	Type	Description	Required
`jobTitle`	string	Job title or keywords to search	✅
`locations`	array	Job locations (e.g., "New York, NY", "Remote")	❌
`maxJobs`	integer	Maximum jobs to scrape (1-500)	❌ (default: 100)
`datePosted`	enum	Filter by date: any, day, week, month	❌ (default: week)
`experienceLevel`	array	Filter by seniority level	❌
`remoteFilter`	enum	Remote, On-site, Hybrid, or empty	❌
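
The constraints in this table can be checked before starting a run. The following is a sketch of client-side validation only; `validate_search_input` is a hypothetical helper, and the Actor's own input schema may enforce these rules differently.

```python
from typing import Any

DATE_POSTED = {"any", "day", "week", "month"}
REMOTE_FILTER = {"Remote", "On-site", "Hybrid", ""}


def validate_search_input(raw: dict[str, Any]) -> dict[str, Any]:
    """Check search criteria against the table above and apply its defaults."""
    job_title = str(raw.get("jobTitle", "")).strip()
    if not job_title:
        raise ValueError("jobTitle is required")

    max_jobs = raw.get("maxJobs", 100)
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(max_jobs, bool) or not isinstance(max_jobs, int) or not 1 <= max_jobs <= 500:
        raise ValueError("maxJobs must be an integer between 1 and 500")

    date_posted = raw.get("datePosted", "week")
    if date_posted not in DATE_POSTED:
        raise ValueError(f"datePosted must be one of {sorted(DATE_POSTED)}")

    remote_filter = raw.get("remoteFilter", "")
    if remote_filter not in REMOTE_FILTER:
        raise ValueError("remoteFilter must be Remote, On-site, Hybrid, or empty")

    return {
        **raw,
        "jobTitle": job_title,
        "maxJobs": max_jobs,
        "datePosted": date_posted,
        "remoteFilter": remote_filter,
    }
```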

User Profile (for Matching)

Parameter	Type	Description
`userSkills`	array	Your technical skills
`userExperience`	integer	Years of experience
`preferredLocations`	array	Preferred job locations
`salaryMin`	integer	Minimum desired salary (USD yearly)
`salaryMax`	integer	Maximum desired salary (USD yearly)
`preferredJobTypes`	array	Preferred employment types
`mustHaveSkills`	array	Non-negotiable required skills
`remotePreference`	enum	Remote work preference

Configuration

Parameter	Type	Description
`enableJobMatching`	boolean	Enable AI job matching (default: true)
`enableMarketAnalysis`	boolean	Generate market insights (default: true)
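`llmProvider`	string	LLM provider (the Gemini example above uses "google")
`modelName`	enum	Model name. OpenAI: gpt-4o, gpt-4o-mini, o1, o3-mini; the Gemini example above uses gemini-1.5-pro
`googleApiKey`	string	Google API key, as passed in the Gemini example above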
`proxyConfiguration`	object	Proxy settings (recommended: residential)
`debug`	boolean	Enable debug logging

📤 Output

Dataset: Job Matches

Each job includes:

Basic Info: Title, company, location, URL
Match Score: Relevance score (0-1) and reasons
Details: Skills, salary, employment type, seniority level
Metadata: Posted date, number of applicants, remote friendliness

Example:

{
  "title": "Senior Python Developer",
  "company": "TechCorp Inc.",
  "location": "New York, NY",
  "relevance_score": 0.87,
  "match_reasons": ["Strong skill match", "Preferred location", "Salary matches expectations"],
  "skills_required": ["Python", "FastAPI", "PostgreSQL", "Docker", "AWS"],
  "salary_range": {
    "min": 120000,
    "max": 160000,
    "currency": "USD",
    "period": "yearly"
  },
  "employment_type": "Full-time",
  "seniority_level": "Mid-Senior level",
  "posted_date": "2 days ago",
  "num_applicants": 47,
  "url": "https://www.linkedin.com/jobs/view/123456789"
}

Key-Value Store: Market Insights

{
  "total_jobs_analyzed": 100,
  "top_skills": [
    ["Python", 78],
    ["AWS", 56],
    ["Docker", 45]
  ],
  "avg_salary_range": {
    "min": 125000,
    "max": 175000,
    "currency": "USD"
  },
  "remote_jobs_percentage": 65.5,
  "avg_applicants": 42.3
}
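
The `top_skills` pairs can be turned into percentages of the analyzed jobs. This sketch assumes each count is the number of postings that mention the skill, which the output above suggests but does not state; `skill_shares` is a hypothetical helper.

```python
from typing import Any


def skill_shares(insights: dict[str, Any]) -> dict[str, float]:
    """Express each top skill's count as a percentage of all analyzed jobs."""
    total = insights["total_jobs_analyzed"]
    if total == 0:
        return {}
    return {skill: round(100 * count / total, 1) for skill, count in insights["top_skills"]}
```

For the example above, with 100 jobs analyzed, Python would come out at 78.0%, AWS at 56.0%, and Docker at 45.0%.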

🏗️ Architecture

Technology Stack

Agent Framework: LangGraph for multi-agent coordination
Semantic Search: LlamaIndex with OpenAI embeddings
Web Scraping: Crawlee + Playwright for robust scraping
LLM: OpenAI (GPT-4o, GPT-4o-mini) or Google Gemini
Platform: Apify Actors (serverless)

Workflow

1. User Input → Parse criteria and profile
2. Scraper Agent → Scrape LinkedIn jobs with Crawlee
3. Indexer → Create vector index with LlamaIndex
4. Analysis Agent → Generate market insights
5. Matching Agent → Match jobs to profile with AI
6. Output → Ranked jobs + insights
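
The workflow above passes shared state from one step to the next. The sketch below shows that shape in plain Python; it deliberately does not use LangGraph, and every function body is a stand-in (canned jobs instead of scraping, a title lookup instead of a vector index, skill overlap instead of AI matching).

```python
from typing import Any, Callable

State = dict[str, Any]
Step = Callable[[State], State]


def scrape(state: State) -> State:
    # Stand-in for the Scraper Agent: returns canned jobs instead of scraping.
    jobs = [
        {"title": "Data Engineer", "skills_required": ["SQL", "Spark"]},
        {"title": "Python Developer", "skills_required": ["Python", "Docker"]},
    ]
    return {**state, "jobs": jobs[: state["maxJobs"]]}


def index(state: State) -> State:
    # Stand-in for the Indexer: a plain title lookup, not a vector index.
    return {**state, "index": {job["title"]: job for job in state["jobs"]}}


def analyze(state: State) -> State:
    # Stand-in for the Analysis Agent: only counts the jobs.
    return {**state, "insights": {"total_jobs_analyzed": len(state["jobs"])}}


def match(state: State) -> State:
    # Stand-in for the Matching Agent: rank by number of overlapping skills.
    wanted = set(state["userSkills"])
    ranked = sorted(
        state["jobs"],
        key=lambda job: len(wanted & set(job["skills_required"])),
        reverse=True,
    )
    return {**state, "matches": ranked}


def run_pipeline(user_input: State, steps: list[Step]) -> State:
    """Run each step in order, feeding it the state the previous step returned."""
    state = dict(user_input)
    for step in steps:
        state = step(state)
    return state
```

Each step returns a new state dictionary rather than mutating its input, which keeps the steps easy to test in isolation.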

⚙️ How It Works

1. Job Scraping

Crawlee navigates LinkedIn with Playwright
Handles pagination and dynamic content
Extracts comprehensive job details
Uses residential proxies to avoid blocking

2. Intelligent Indexing

LlamaIndex creates semantic embeddings
Jobs indexed for fast similarity search
Enables natural language queries
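
To illustrate what similarity search over indexed jobs means, here is a toy version that ranks documents by cosine similarity. It is not how LlamaIndex works internally: word counts stand in for learned embeddings, and `embed`, `cosine`, and `search` are hypothetical names.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter[str]:
    """Toy 'embedding': lowercase word counts. A real index uses a learned model."""
    return Counter(re.findall(r"[a-z0-9+#]+", text.lower()))


def cosine(a: Counter[str], b: Counter[str]) -> float:
    """Cosine similarity between two sparse count vectors (0.0 if either is empty)."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def search(query: str, documents: list[str], top_k: int = 3) -> list[tuple[float, str]]:
    """Return the `top_k` documents most similar to the query, best first."""
    query_vector = embed(query)
    scored = [(cosine(query_vector, embed(doc)), doc) for doc in documents]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:top_k]
```

Unlike real embeddings, this toy version only matches exact words, so "engineer" and "developer" would not be recognized as related.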

3. Profile Matching

Multi-factor scoring algorithm:
- Skill match (40% weight)
- Location match (20% weight)
- Experience match (15% weight)
- Salary match (15% weight)
- Employment type (10% weight)
Explainable results with match reasons
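
The weights above sum to 100%, so the overall score can be read as a weighted average of per-factor scores. The sketch below assumes each factor is already scored between 0 and 1; how the Actor computes the individual factors is not documented here, and `skill_match` shows just one plausible definition.

```python
WEIGHTS = {
    "skills": 0.40,
    "location": 0.20,
    "experience": 0.15,
    "salary": 0.15,
    "employment_type": 0.10,
}


def skill_match(user_skills: list[str], required: list[str]) -> float:
    """Fraction of required skills the user has (case-insensitive). Assumed definition."""
    if not required:
        return 1.0
    have = {skill.lower() for skill in user_skills}
    return sum(skill.lower() in have for skill in required) / len(required)


def relevance_score(factors: dict[str, float]) -> float:
    """Weighted sum of per-factor scores in [0, 1]; missing factors count as 0."""
    return round(sum(weight * factors.get(name, 0.0) for name, weight in WEIGHTS.items()), 4)
```

For example, a user with four of five required skills gets a skill score of 0.8; with every other factor at 1.0, the overall score is 0.4 × 0.8 + 0.2 + 0.15 + 0.15 + 0.1 = 0.92.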

4. Market Analysis

Aggregates data across all scraped jobs
Identifies trending skills and technologies
Calculates salary benchmarks
Analyzes remote work availability
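
The aggregation described above can be sketched with the standard library. The field names follow the output examples in this README; treating a job as remote when its location contains "remote" is an assumption made for this sketch, and `market_insights` is a hypothetical helper.

```python
from collections import Counter
from statistics import mean
from typing import Any


def market_insights(jobs: list[dict[str, Any]], top_n: int = 3) -> dict[str, Any]:
    """Aggregate skill counts, average salary bounds, and remote share across jobs."""
    skills = Counter(skill for job in jobs for skill in job.get("skills_required", []))
    # Only jobs that list a salary range contribute to the averages.
    salaries = [job["salary_range"] for job in jobs if job.get("salary_range")]
    # Assumption: a job counts as remote if its location text mentions "remote".
    remote = sum("remote" in (job.get("location") or "").lower() for job in jobs)
    avg_salary = None
    if salaries:
        avg_salary = {
            "min": mean(s["min"] for s in salaries),
            "max": mean(s["max"] for s in salaries),
        }
    return {
        "total_jobs_analyzed": len(jobs),
        "top_skills": [[skill, count] for skill, count in skills.most_common(top_n)],
        "avg_salary_range": avg_salary,
        "remote_jobs_percentage": round(100 * remote / len(jobs), 1) if jobs else 0.0,
    }
```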

🛠️ Local Development

Prerequisites

Python 3.14+
Apify CLI: npm install -g apify-cli
OpenAI API key

Setup

# Clone or navigate to directory
cd langraph-linkedin-jobs-scraper

# Set environment variables
export OPENAI_API_KEY="your-openai-api-key"
export APIFY_PROXY_PASSWORD="your-proxy-password"  # Optional

# Install dependencies
pip install -r requirements.txt

# Install Playwright browsers
playwright install chromium

# Run locally
apify run

Input Format

Create storage/key_value_stores/default/INPUT.json:

{
  "jobTitle": "Data Scientist",
  "locations": ["Remote"],
  "maxJobs": 20,
  "userSkills": ["Python", "Machine Learning"],
  "debug": true
}
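
The input file can also be written from a script, which creates the nested directories if they do not exist yet. This is a convenience sketch; `write_local_input` is a hypothetical helper, and the path is the one given above.

```python
import json
from pathlib import Path
from typing import Any

INPUT_PATH = Path("storage/key_value_stores/default/INPUT.json")


def write_local_input(run_input: dict[str, Any], path: Path = INPUT_PATH) -> Path:
    """Write the run input as JSON, creating parent directories as needed."""
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(run_input, indent=2), encoding="utf-8")
    return path
```

Running `write_local_input({"jobTitle": "Data Scientist", "locations": ["Remote"], "maxJobs": 20})` from the project root before `apify run` produces the file shown above.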

🚨 Important Notes

LinkedIn Blocking

LinkedIn actively blocks automated scrapers. To minimize blocking:

✅ Use residential proxies (configured by default)
✅ Keep maxJobs reasonable (<200)
✅ Don't run too frequently
✅ Respect rate limits

Privacy & Ethics

❌ Do not scrape personal data without permission
❌ Do not violate LinkedIn's Terms of Service
❌ Do not use for spam or unauthorized recruiting
✅ Use for personal job searching and market research

API Costs

OpenAI API usage for embeddings and LLM
Apify platform usage and proxy costs
Expect ~$0.10-0.50 per run depending on maxJobs

🐛 Troubleshooting

"No jobs found"

Check if LinkedIn is blocking your IP
Try enabling residential proxies
Verify job title and location are correct

"LinkedIn blocking/captcha"

Use residential proxies (set in proxyConfiguration)
Reduce maxJobs parameter
Increase delays (modify scraper.py)

"OpenAI API errors"

Verify OPENAI_API_KEY is set correctly
Check API quota and billing
Try switching to gpt-4o-mini for lower cost

"Scraping timeout"

Reduce maxJobs parameter
Check internet connection
LinkedIn might be experiencing issues

🤝 Support

For issues, questions, or feedback:

Open an issue on Apify
Contact via Apify platform
Check Apify documentation

Made with ❤️ using LangGraph, LlamaIndex, and Crawlee
