Pricing

from $1.00 / 1,000 results

LinkedIn Jobs Scraper

Scrape LinkedIn job postings with full enrichment. Works without cookies using the public guest API. Provide an optional li_at cookie to unlock applicant counts, recruiter details, and benefits.
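A run input matching this description might look like the sketch below. The field names come from the actor's input schema; the helper `build_run_input` and the values of the second search are illustrative assumptions, not part of the actor.

```python
import json


def build_run_input(searches, li_at_cookie=None, request_delay=2.0):
    """Assemble an input dict using the field names from the actor's input schema.

    `build_run_input` is a hypothetical helper written for this example.
    """
    run_input = {
        "searches": list(searches),
        "requestDelay": request_delay,
        "deduplicateAcrossSearches": True,
        "proxyConfiguration": {
            "useApifyProxy": True,
            "apifyProxyGroups": ["RESIDENTIAL"],
        },
    }
    # The cookie is optional: leave the key out to stay on the public guest API.
    if li_at_cookie:
        run_input["liAtCookie"] = li_at_cookie
    return run_input


example_input = build_run_input(
    searches=[
        {"keywords": "MBA intern", "location": "London, United Kingdom",
         "publishedWithin": "week", "maxResults": 50},
        # Illustrative second search; these values are made up for the example.
        {"keywords": "data analyst", "location": "Berlin, Germany",
         "publishedWithin": "day", "maxResults": 25},
    ]
)
print(json.dumps(example_input, indent=2))
```

Omitting `liAtCookie` keeps the run on the guest API; passing a cookie value adds the key so that the cookie-only fields (applicant counts, recruiter details, benefits) can be requested.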

Developer

Ani

Last modified

2 months ago

.actor/INPUT_SCHEMA.json

{
  "title": "LinkedIn Jobs Scraper",
  "type": "object",
  "schemaVersion": 1,
  "properties": {
    "searches": {
      "title": "Search queries",
      "type": "array",
      "description": "One or more searches to run. Each entry specifies keywords, location, and filters. Results from all searches are deduplicated and merged into a single dataset.",
      "editor": "json",
      "prefill": [
        {
          "keywords": "MBA intern",
          "location": "London, United Kingdom",
          "publishedWithin": "week",
          "maxResults": 50
        }
      ]
    },
    "liAtCookie": {
      "title": "LinkedIn li_at cookie (optional)",
      "type": "string",
      "description": "Your LinkedIn session cookie. NOT required — the scraper works without it using LinkedIn's public guest API. Providing it does two things: (1) improves search pagination so you get a broader, more varied set of results (without it, LinkedIn tends to recycle the same top results across pages for narrow searches); (2) unlocks bonus fields: applicant count, recruiter name & profile URL, and benefits. Get it from: DevTools → Application → Cookies → linkedin.com → li_at.",
      "editor": "textfield",
      "isSecret": true
    },
    "proxyConfiguration": {
      "title": "Proxy configuration",
      "type": "object",
      "description": "Recommended: use Apify residential proxies to avoid rate limiting. The scraper will work without proxies for small runs.",
      "editor": "proxy",
      "prefill": {
        "useApifyProxy": true,
        "apifyProxyGroups": ["RESIDENTIAL"]
      }
    },
    "requestDelay": {
      "title": "Delay between requests (seconds)",
      "type": "number",
      "description": "Wait time between requests. Increase if you see 429 rate-limit errors.",
      "default": 2.0,
      "minimum": 1.0,
      "maximum": 15.0,
      "editor": "number"
    },
    "deduplicateAcrossSearches": {
      "title": "Deduplicate across searches",
      "type": "boolean",
      "description": "When running multiple searches, skip jobs already seen in earlier searches (matched by LinkedIn job ID).",
      "default": true,
      "editor": "checkbox"
    }
  },
  "required": ["searches"]
}
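The defaults and bounds declared in this schema can be applied to a raw input before a run starts. The sketch below is one way to do that; `normalize_input` is a hypothetical helper, and clamping an out-of-range `requestDelay` (rather than rejecting it) is an assumption made for the example, not documented actor behavior.

```python
def normalize_input(raw):
    """Apply the schema's defaults and bounds to a raw input dict.

    Hypothetical helper: the actor itself may validate differently.
    """
    if not raw.get("searches"):
        raise ValueError("'searches' is required and must contain at least one entry")

    delay = float(raw.get("requestDelay", 2.0))
    # The schema declares minimum 1.0 and maximum 15.0 for requestDelay.
    delay = min(max(delay, 1.0), 15.0)

    return {
        "searches": raw["searches"],
        # `or ""` also covers an explicit null for the optional cookie.
        "liAtCookie": (raw.get("liAtCookie") or "").strip(),
        "requestDelay": delay,
        "deduplicateAcrossSearches": raw.get("deduplicateAcrossSearches", True),
        "proxyConfiguration": raw.get("proxyConfiguration"),
    }


normalized = normalize_input({"searches": [{"keywords": "MBA intern"}], "requestDelay": 0.2})
print(normalized["requestDelay"], normalized["deduplicateAcrossSearches"])
```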

.actor/actor.json

{
  "actorSpecification": 1,
  "name": "linkedin-jobs-scraper",
  "title": "LinkedIn Jobs Scraper – No Cookie Required",
  "description": "Scrape LinkedIn job postings with full enrichment. Works without cookies using the public guest API. Provide an optional li_at cookie to unlock applicant counts, recruiter details, and benefits.",
  "version": "0.1",
  "buildTag": "latest",
  "dockerfile": "../Dockerfile",
  "input": "./INPUT_SCHEMA.json",
  "storages": {
    "dataset": {
      "actorSpecification": 1,
      "title": "LinkedIn Jobs",
      "views": {
        "overview": {
          "title": "Jobs Overview",
          "transformation": {
            "fields": [
              "id",
              "title",
              "companyName",
              "location",
              "workplaceTypes",
              "employmentType",
              "seniorityLevel",
              "salary",
              "postedAt",
              "applicantsCount",
              "applyMethod",
              "jobUrl",
              "companyLogo"
            ]
          },
          "display": {
            "component": "table",
            "columns": [
              { "label": "Title", "field": "title" },
              { "label": "Company", "field": "companyName" },
              { "label": "Location", "field": "location" },
              { "label": "Work Type", "field": "workplaceTypes" },
              { "label": "Contract", "field": "employmentType" },
              { "label": "Level", "field": "seniorityLevel" },
              { "label": "Salary", "field": "salary" },
              { "label": "Posted", "field": "postedAt" },
              { "label": "Applicants", "field": "applicantsCount" },
              { "label": "Apply", "field": "applyMethod" },
              { "label": "URL", "field": "jobUrl" }
            ]
          }
        }
      }
    }
  }
}
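The "overview" view selects 13 fields and displays 11 of them as table columns. The sketch below mimics that projection locally; `overview_row` is a hypothetical helper and the sample record is made up, so treat it as an illustration of the mapping rather than of how the platform renders the table.

```python
# (label, field) pairs copied from the view's "columns" list.
OVERVIEW_COLUMNS = [
    ("Title", "title"),
    ("Company", "companyName"),
    ("Location", "location"),
    ("Work Type", "workplaceTypes"),
    ("Contract", "employmentType"),
    ("Level", "seniorityLevel"),
    ("Salary", "salary"),
    ("Posted", "postedAt"),
    ("Applicants", "applicantsCount"),
    ("Apply", "applyMethod"),
    ("URL", "jobUrl"),
]


def overview_row(job):
    """Project a job record onto the overview columns; missing fields become None."""
    return {label: job.get(field) for label, field in OVERVIEW_COLUMNS}


# Made-up record for illustration only.
sample_job = {
    "id": "4012345678",
    "title": "MBA Intern",
    "companyName": "Example Co",
    "location": "London, England, United Kingdom",
    "workplaceTypes": ["Hybrid"],
    "postedAt": "2024-05-01",
    "jobUrl": "https://www.linkedin.com/jobs/view/mba-intern-at-example-co-4012345678",
}

row = overview_row(sample_job)
print(row["Title"], "|", row["Company"], "|", row["Salary"])
```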

src/main.py

"""
Apify actor entry point for the LinkedIn Jobs Scraper.

Flow:
  1. Read actor input (searches, optional li_at cookie, proxy config)
  2. For each search: build a scraper with a fresh proxy session, run it,
     push each job to the dataset as it arrives (streaming output)
  3. Deduplicate across searches by LinkedIn job ID (optional)
"""

import asyncio
import logging
from urllib.parse import urlencode

from apify import Actor

from .scraper import LinkedInScraper

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)


async def main() -> None:
    async with Actor:
        actor_input = await Actor.get_input() or {}

        searches = actor_input.get("searches") or []
        # `or ""` guards against an explicit null for the optional cookie.
        li_at = (actor_input.get("liAtCookie") or "").strip()
        delay = float(actor_input.get("requestDelay", 2.0))
        dedup = actor_input.get("deduplicateAcrossSearches", True)
        proxy_cfg = actor_input.get("proxyConfiguration")

        if not searches:
            await Actor.fail(status_message="No searches provided. Add at least one entry to the 'searches' input.")
            return

        proxy_configuration = None
        if proxy_cfg:
            proxy_configuration = await Actor.create_proxy_configuration(
                actor_proxy_input=proxy_cfg,
            )

        seen_ids: set[str] = set()
        total_pushed = 0

        for search_index, search in enumerate(searches):
            keywords = (search.get("keywords") or "").strip()
            location = (search.get("location") or "").strip()
            published_within = search.get("publishedWithin", "any")
            max_results = int(search.get("maxResults", 25))

            if not keywords:
                logger.warning("Search #%d has no keywords — skipping.", search_index + 1)
                continue

            label = f'"{keywords}"' + (f' in {location}' if location else '')
            await Actor.set_status_message(
                f"Search {search_index + 1}/{len(searches)}: {label} — up to {max_results} jobs"
            )
            logger.info("Starting search %d/%d: %s", search_index + 1, len(searches), label)

            # Get a fresh proxy URL per search (rotates IP between searches)
            proxy_url = None
            if proxy_configuration:
                session_id = f"linkedin_search_{search_index}"
                proxy_url = await proxy_configuration.new_url(session_id=session_id)

            scraper = LinkedInScraper(
                li_at_cookie=li_at,
                proxy_url=proxy_url,
                request_delay=delay,
            )

            input_url = "https://www.linkedin.com/jobs/search/?" + urlencode({
                "keywords": keywords,
                "location": location,
            })

            search_count = 0
            try:
                # search_and_enrich is a synchronous generator, so it blocks
                # the event loop while each request and delay runs.
                for job in scraper.search_and_enrich(
                    keywords=keywords,
                    location=location,
                    published_within=published_within,
                    max_results=max_results,
                    input_url=input_url,
                ):
                    # The scraper stores the job ID under "id"; it never sets
                    # "linkedinId", so reading only that key disabled dedup.
                    linkedin_id = job.get("linkedinId") or job.get("id")

                    if dedup and linkedin_id:
                        if linkedin_id in seen_ids:
                            logger.debug("Duplicate skipped: %s", linkedin_id)
                            continue
                        seen_ids.add(linkedin_id)

                    # Tag which search produced this result
                    job["_search"] = {
                        "keywords": keywords,
                        "location": location,
                        "publishedWithin": published_within,
                    }

                    await Actor.push_data(job)
                    search_count += 1
                    total_pushed += 1

                    await Actor.set_status_message(
                        f"Search {search_index + 1}/{len(searches)}: {label} — "
                        f"{search_count} jobs found (total: {total_pushed})"
                    )

            except Exception as exc:
                logger.error("Search %d failed: %s", search_index + 1, exc)
                await Actor.set_status_message(
                    f"Search {search_index + 1}/{len(searches)} error: {exc}"
                )

            logger.info("Search %d done — %d jobs", search_index + 1, search_count)

        await Actor.set_status_message(
            f"Done. {total_pushed} job(s) saved across {len(searches)} search(es)."
        )


if __name__ == "__main__":
    asyncio.run(main())
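The deduplication step in the entry point can be isolated as a small generator. This is a minimal sketch, assuming each job dict carries its LinkedIn job ID under the `id` key (the key the scraper's `enrich` method sets); `dedupe_across_searches` and the sample records are hypothetical.

```python
from typing import Dict, Iterable, Iterator, Tuple


def dedupe_across_searches(
    results_per_search: Iterable[Iterable[Dict]],
) -> Iterator[Tuple[int, Dict]]:
    """Yield (search_index, job), skipping job IDs already seen in any search."""
    seen_ids = set()
    for search_index, jobs in enumerate(results_per_search):
        for job in jobs:
            job_id = job.get("id")
            if job_id:
                if job_id in seen_ids:
                    continue  # already emitted earlier
                seen_ids.add(job_id)
            # Jobs without an ID cannot be matched, so they always pass through.
            yield search_index, job


# Made-up records: job "102" is returned by both searches.
first = [{"id": "101", "title": "Analyst"}, {"id": "102", "title": "Intern"}]
second = [{"id": "102", "title": "Intern"}, {"id": "103", "title": "Associate"}]

kept = list(dedupe_across_searches([first, second]))
print([(index, job["id"]) for index, job in kept])
```

Because a single set is shared across all searches, a job that appears twice within the same search is dropped as well, which matches how the entry point uses `seen_ids`.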

src/scraper.py

"""
LinkedIn job scraper for the Apify actor.

Guest API (no cookie) covers the core fields.
Voyager API (li_at cookie) adds deep enrichment: company profile,
structured salary, job poster photo/title, workplace type, expiry, and more.
"""

import json
import logging
import re
import time
from datetime import datetime, timezone
from typing import Iterator, Optional
from urllib.parse import parse_qs, urlparse, unquote

import requests
from bs4 import BeautifulSoup

logger = logging.getLogger(__name__)

# Values for LinkedIn's f_TPR query parameter: "r" + window length in seconds.
TIME_FILTERS = {
    "day": "r86400",
    "week": "r604800",
    "month": "r2592000",
    "any": "",
}

WORKPLACE_TYPE_MAP = {"1": "On-site", "2": "Remote", "3": "Hybrid"}

APPLY_METHOD_MAP = {
    "OffsiteApply": "OffsiteApply",
    "ComplexOnsiteApply": "ComplexOnsiteApply",
    "EasyApplyMethod": "EasyApply",
}


class LinkedInScraper:
    GUEST_SEARCH_URL = "https://www.linkedin.com/jobs-guest/jobs/api/seeMoreJobPostings/search"
    GUEST_DETAIL_URL = "https://www.linkedin.com/jobs-guest/jobs/api/jobPosting/{job_id}"
    VOYAGER_JOB_URL = "https://www.linkedin.com/voyager/api/jobs/jobPostings/{job_id}"
    VOYAGER_DECORATION = "com.linkedin.voyager.deco.jobs.web.shared.WebFullJobPosting-65"

    BASE_HEADERS = {
        "User-Agent": (
            "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) "
            "AppleWebKit/537.36 (KHTML, like Gecko) "
            "Chrome/124.0.0.0 Safari/537.36"
        ),
        "Accept-Language": "en-US,en;q=0.9",
    }

    def __init__(
        self,
        li_at_cookie: str = "",
        proxy_url: Optional[str] = None,
        request_delay: float = 2.0,
        max_retries: int = 3,
    ):
        self.li_at_cookie = li_at_cookie.strip()
        self.proxy_url = proxy_url
        self.request_delay = request_delay
        self.max_retries = max_retries
        self._csrf_token: Optional[str] = None

        self.session = requests.Session()
        self.session.headers.update(self.BASE_HEADERS)

        if proxy_url:
            self.session.proxies = {"http": proxy_url, "https": proxy_url}

        if self.li_at_cookie:
            self.session.cookies.set("li_at", self.li_at_cookie, domain=".linkedin.com")
            self._init_csrf()

    def _init_csrf(self) -> None:
        # Voyager expects the JSESSIONID cookie value echoed in a csrf-token header.
        try:
            self.session.get("https://www.linkedin.com/jobs/", timeout=15)
            csrf = self.session.cookies.get("JSESSIONID", "").strip('"')
            if csrf:
                self._csrf_token = csrf
                self.session.headers["csrf-token"] = csrf
        except Exception as exc:
            logger.warning("Could not fetch CSRF token: %s", exc)

    def _get(self, url: str, params: Optional[dict] = None, accept: Optional[str] = None) -> requests.Response:
        headers = {"Accept": accept} if accept else {}
        for attempt in range(1, self.max_retries + 1):
            try:
                resp = self.session.get(url, params=params, headers=headers, timeout=15)
                if resp.status_code == 429:
                    wait = 30 * attempt
                    logger.warning("Rate-limited — waiting %ds (attempt %d/%d)", wait, attempt, self.max_retries)
                    time.sleep(wait)
                    continue
                resp.raise_for_status()
                return resp
            except requests.RequestException as exc:
                if attempt == self.max_retries:
                    raise
                logger.warning("Request failed (%s) — retry %d/%d", exc, attempt, self.max_retries)
                time.sleep(5 * attempt)
        # Every attempt was rate-limited: raise instead of implicitly returning None.
        raise requests.HTTPError(f"Rate-limited on all {self.max_retries} attempts: {url}")

    # ── Search ────────────────────────────────────────────────────────────

    def search(
        self,
        keywords: str,
        location: str = "",
        published_within: str = "any",
        max_results: int = 25,
        input_url: str = "",
    ) -> Iterator[dict]:
        time_filter = TIME_FILTERS.get(published_within.lower(), "")
        collected = 0
        start = 0

        while collected < max_results:
            params = {"keywords": keywords, "location": location, "start": start}
            if time_filter:
                params["f_TPR"] = time_filter

            logger.info("Search page start=%d (collected %d/%d)", start, collected, max_results)
            resp = self._get(self.GUEST_SEARCH_URL, params=params)
            time.sleep(self.request_delay)

            soup = BeautifulSoup(resp.text, "html.parser")
            cards = soup.find_all("div", class_=re.compile(r"base-card"))
            if not cards:
                break

            for card in cards:
                if collected >= max_results:
                    break
                job = self._parse_card(card, input_url=input_url)
                if job:
                    yield job
                    collected += 1

            start += 25

    def _parse_card(self, card: BeautifulSoup, input_url: str = "") -> Optional[dict]:
        job: dict = {}

        # Full link (with tracking params) + clean jobUrl
        link_el = (
            card.find("a", class_=re.compile(r"base-card__full-link"))
            or card.find("a", href=re.compile(r"/jobs/view/"))
        )
        if not link_el:
            return None

        raw_href = link_el.get("href", "")
        job["link"] = raw_href

        parsed = urlparse(raw_href)
        params = parse_qs(parsed.query)
        job["jobUrl"] = f"{parsed.scheme}://{parsed.netloc}{parsed.path}"
        job["trackingId"] = unquote(params.get("trackingId", [""])[0])
        job["refId"] = unquote(params.get("refId", [""])[0])

        title_el = card.find(["h3", "span"], class_=re.compile(r"base-search-card__title"))
        if title_el:
            job["title"] = title_el.get_text(strip=True)

        company_el = card.find(["h4", "a"], class_=re.compile(r"base-search-card__subtitle"))
        if company_el:
            job["companyName"] = company_el.get_text(strip=True)
            if company_el.get("href"):
                job["companyLinkedinUrl"] = company_el["href"].split("?")[0]

        # Company logo (lazy-loaded image)
        img = card.find("img", class_=re.compile(r"artdeco-entity-image"))
        if img:
            logo = img.get("data-delayed-url") or img.get("src", "")
            if logo and "ghost" not in logo:
                job["companyLogo"] = logo

        loc_el = card.find("span", class_=re.compile(r"job-search-card__location"))
        if loc_el:
            job["location"] = loc_el.get_text(strip=True)

        time_el = card.find("time")
        if time_el:
            dt_str = time_el.get("datetime", "")
            job["postedAt"] = dt_str
            # Convert date string to epoch ms for timestamp field
            try:
                dt = datetime.strptime(dt_str, "%Y-%m-%d").replace(tzinfo=timezone.utc)
                job["postedAtTimestamp"] = int(dt.timestamp() * 1000)
            except (ValueError, TypeError):
                job["postedAtTimestamp"] = None

        job["inputUrl"] = input_url
        job["scrapedAt"] = datetime.now(timezone.utc).isoformat()
        return job

    # ── Enrichment orchestration ──────────────────────────────────────────

    def enrich(self, job: dict) -> dict:
        job_id = self._extract_job_id(job.get("jobUrl"))
        if not job_id:
            return job

        job["id"] = job_id

        try:
            self._enrich_guest(job, job_id)
        except Exception as exc:
            logger.warning("Guest enrich failed for %s: %s", job_id, exc)

        # Voyager needs both the li_at cookie and a CSRF token.
        if self.li_at_cookie and self._csrf_token:
            try:
                self._enrich_voyager(job, job_id)
            except Exception as exc:
                logger.warning("Voyager enrich failed for %s: %s", job_id, exc)

        return job

    @staticmethod
    def _extract_job_id(url: Optional[str]) -> Optional[str]:
        if not url:
            return None
        # Slug URLs end in "-<id>"; bare URLs use "/view/<id>".
        m = re.search(r"-(\d{7,})(?:[/?]|$)", url) or re.search(r"/view/(\d+)", url)
        return m.group(1) if m else None

    # ── Guest enrichment ─────────────────────────────────────────────────

    def _enrich_guest(self, job: dict, job_id: str) -> None:
        resp = self._get(self.GUEST_DETAIL_URL.format(job_id=job_id))
        time.sleep(self.request_delay)
        soup = BeautifulSoup(resp.text, "html.parser")

        # JSON-LD
        for script in soup.find_all("script", {"type": "application/ld+json"}):
            try:
                ld = json.loads(script.string or "")
                if ld.get("@type") == "JobPosting":
                    self._apply_jsonld(ld, job)
            except (json.JSONDecodeError, AttributeError):
                pass

        # Description — both HTML and plain text
        desc_el = soup.find("div", class_=re.compile(r"description__text|show-more-less-html"))
        if desc_el:
            if not job.get("descriptionHtml"):
                job["descriptionHtml"] = str(desc_el)
            if not job.get("descriptionText"):
                job["descriptionText"] = desc_el.get_text(separator="\n", strip=True)

        # Applicant count
        if not job.get("applicantsCount"):
            found = soup.find(string=re.compile(r"\d+\s+applicant", re.I))
            if found:
                raw = found.parent.get_text(strip=True)
                count = re.search(r"[\d,]+", raw)
                job["applicantsCount"] = count.group() if count else raw

        # Criteria sidebar
        for item in soup.find_all("li", class_=re.compile(r"description__job-criteria-item")):
            h3 = item.find("h3")
            span = item.find("span", class_=re.compile(r"description__job-criteria-text--criteria"))
            if not h3 or not span:
                continue
            header = h3.get_text(strip=True).lower()
            value = span.get_text(strip=True)
            if "seniority" in header:
                job.setdefault("seniorityLevel", value)
            elif "employment" in header:
                job.setdefault("employmentType", value)
            elif "function" in header:
                job.setdefault("jobFunction", value)
            elif "industr" in header:
                job.setdefault("industries", value)

        # Work type badge
        if not job.get("workplaceTypes"):
            badge = soup.find(string=re.compile(r"\b(Remote|Hybrid|On-site)\b", re.I))
            if badge:
                text = badge.strip()
                if re.search(r"hybrid", text, re.I):
                    job["workplaceTypes"] = ["Hybrid"]
                elif re.search(r"remote", text, re.I):
                    job["workplaceTypes"] = ["Remote"]
                elif re.search(r"on.?site", text, re.I):
                    job["workplaceTypes"] = ["On-site"]

        # Company LinkedIn URL + ID
        if not job.get("companyLinkedinUrl"):
            link = soup.find("a", href=re.compile(r"linkedin\.com/company/"))
            if link:
                m = re.search(r"company/([^/?]+)", link["href"])
                if m:
                    job.setdefault("companyId", m.group(1))
                job["companyLinkedinUrl"] = link["href"].split("?")[0]

        # Apply URL + method
        if not job.get("applyUrl"):
            btn = soup.find("a", class_=re.compile(r"apply-button"))
            if btn:
                job["applyUrl"] = btn.get("href") or job.get("jobUrl", "")
                job["applyMethod"] = "EasyApply" if "easy" in btn.get_text(strip=True).lower() else "OffsiteApply"
            else:
                job["applyUrl"] = job.get("jobUrl", "")
                job["applyMethod"] = "Unknown"

    def _apply_jsonld(self, data: dict, job: dict) -> None:
        job.setdefault("title", data.get("title"))

        # Dates
        date_posted = data.get("datePosted")
        if date_posted and not job.get("postedAt"):
            job["postedAt"] = date_posted
        if date_posted and not job.get("postedAtTimestamp"):
            try:
                dt = datetime.strptime(date_posted, "%Y-%m-%d").replace(tzinfo=timezone.utc)
                job["postedAtTimestamp"] = int(dt.timestamp() * 1000)
            except (ValueError, TypeError):
                pass

        valid_through = data.get("validThrough")
        if valid_through and not job.get("expireAt"):
            try:
                dt = datetime.fromisoformat(valid_through.replace("Z", "+00:00"))
                job["expireAt"] = int(dt.timestamp() * 1000)
            except (ValueError, TypeError):
                pass

        # Description
        raw = data.get("description", "")
        if raw and not job.get("descriptionText"):
            job["descriptionHtml"] = raw
            job["descriptionText"] = BeautifulSoup(raw, "html.parser").get_text(separator="\n", strip=True)

        # Company
        org = data.get("hiringOrganization", {})
        job.setdefault("companyName", org.get("name"))
        job.setdefault("companyLinkedinUrl", org.get("sameAs"))

        # Salary — string + structured insights
        sal = data.get("baseSalary", {})
        if sal and not job.get("salary"):
            val = sal.get("value", {})
            mn = val.get("minValue")
            mx = val.get("maxValue")
            curr = sal.get("currency", "")
            unit = val.get("unitText", "YEAR")
            if mn or mx:
                period_suffix = "/yr" if "YEAR" in unit.upper() else f"/{unit.lower()}"
                # Note: the "$" prefix is hard-coded regardless of `curr`.
                parts = [f"${mn:,.2f}" if mn else "", f"${mx:,.2f}" if mx else ""]
                job["salary"] = f"{' - '.join(p for p in parts if p)}{period_suffix}"
                job["salaryInsights"] = {
                    "compensationBreakdown": [{
                        "minSalary": str(mn) if mn else None,
                        "maxSalary": str(mx) if mx else None,
                        "payPeriod": unit.upper(),
                        "currencyCode": curr,
                        "compensationType": "BASE_SALARY",
                    }],
                    "compensationSource": "JOB_POSTER_PROVIDED",
                }

        # Employment type
        emp = data.get("employmentType")
        if emp and not job.get("employmentType"):
            job["employmentType"] = emp if isinstance(emp, str) else ", ".join(emp)

        # Industry
        job.setdefault("industries", data.get("industry"))

        # Remote flag
        if not job.get("workplaceTypes"):
            if data.get("jobLocationType") == "TELECOMMUTE":
                job["workplaceTypes"] = ["Remote"]
                job["workRemoteAllowed"] = True

        # Location
        job_loc = data.get("jobLocation") or {}
        if isinstance(job_loc, list):
            job_loc = job_loc[0] if job_loc else {}
        addr = job_loc.get("address", {})
        if addr and not job.get("location"):
            parts = filter(None, [addr.get("addressLocality"), addr.get("addressRegion"), addr.get("addressCountry")])
            loc = ", ".join(parts)
            if loc:
                job["location"] = loc

        # Country
        if addr and not job.get("country"):
            job["country"] = addr.get("addressCountry", "")

        # Direct apply
        if not job.get("applyMethod") and data.get("directApply"):
            job["applyMethod"] = "EasyApply"

    # ── Voyager enrichment ────────────────────────────────────────────────

    def _enrich_voyager(self, job: dict, job_id: str) -> None:
        resp = self._get(
            self.VOYAGER_JOB_URL.format(job_id=job_id),
            params={"decorationId": self.VOYAGER_DECORATION},
            accept="application/vnd.linkedin.normalized+json+2.1",
        )
        time.sleep(self.request_delay)
        payload = resp.json()
        data = payload.get("data", {})
        included = payload.get("included", [])

        # Description (Voyager HTML is cleaner)
        desc = data.get("description", {})
        if isinstance(desc, dict):
            html = desc.get("text", "")
            if html:
                job["descriptionHtml"] = html
                job["descriptionText"] = BeautifulSoup(html, "html.parser").get_text(separator="\n", strip=True)

        # Timestamps
        listed_at = data.get("listedAt")
        if listed_at:
            job["postedAtTimestamp"] = listed_at
            dt = datetime.fromtimestamp(listed_at / 1000, tz=timezone.utc)
            job["postedAt"] = dt.strftime("%Y-%m-%dT%H:%M:%S.000Z")

        expire_at = data.get("expireAt")
        if expire_at:
            job["expireAt"] = expire_at

        # Applicant count
        applies = data.get("applies")
        if applies is not None:
            job["applicantsCount"] = str(applies)

        # Employment type
        job.setdefault("employmentType", data.get("formattedEmploymentStatus") or "")

        # Workplace types (array of strings)
        workplace_urns = data.get("workplaceTypes", [])
        if workplace_urns:
            # Each URN ends in ":<code>"; map the numeric code to a label.
            matches = (re.search(r":(\d+)$", u) for u in workplace_urns)
            codes = [m.group(1) for m in matches if m]
            labels = [WORKPLACE_TYPE_MAP[c] for c in codes if c in WORKPLACE_TYPE_MAP]
            if labels:
                job["workplaceTypes"] = labels
        remote = data.get("workRemoteAllowed")
        if remote is not None:
            job["workRemoteAllowed"] = remote
            if not job.get("workplaceTypes"):
                job["workplaceTypes"] = ["Remote"] if remote else ["On-site"]

        # Industries / sector
        industries = data.get("formattedIndustries") or []
        if industries:
            job.setdefault("industries", ", ".join(industries))

        # Salary — structured insights + formatted string
        sal = data.get("salary") or {}
        if sal and not job.get("salary"):
            mn = sal.get("min")
            mx = sal.get("max")
            curr = sal.get("currencyCode", "")
            per = sal.get("payPeriod", "YEAR")
            if mn or mx:
                suffix = "/yr" if "YEAR" in per.upper() else f"/{per.lower()}"
                parts = [f"${mn:,.2f}" if mn else "", f"${mx:,.2f}" if mx else ""]
                job["salary"] = f"{curr} {' - '.join(p for p in parts if p)}{suffix}".strip()
                job.setdefault("salaryInsights", {
                    "compensationBreakdown": [{
                        "minSalary": str(mn) if mn else None,
                        "maxSalary": str(mx) if mx else None,
                        "payPeriod": per.upper(),
                        "currencyCode": curr,
                        "compensationType": "BASE_SALARY",
                    }],
                    "compensationSource": "JOB_POSTER_PROVIDED",
                })

        if not job.get("salaryInsights"):
            job["salaryInsights"] = {}

        # Apply method
        apply_method = data.get("applyMethod") or {}
        raw_type = apply_method.get("$type", "")
        for key, label in APPLY_METHOD_MAP.items():
            if key in raw_type:
                job["applyMethod"] = label
                break
        if "EasyApply" in raw_type:
            job["applyUrl"] = job.get("jobUrl", "")
        else:
            job.setdefault("applyUrl", apply_method.get("companyApplyUrl") or job.get("jobUrl", ""))

        # Country from jobGeoLocation
        if not job.get("country"):
            geo = data.get("jobGeoLocation", {}) or {}
            country_urn = geo.get("country", "")
            if country_urn:
                m = re.search(r":([A-Z]{2})$", country_urn)
                if m:
                    job["country"] = m.group(1)

        # Standardized title + seniority — from included Title
        for item in included:
            if item.get("$type") == "com.linkedin.voyager.jobs.shared.Title":
                job.setdefault("standardizedTitle", item.get("localizedName", ""))
                seniority = item.get("experienceLevel", {})
                if isinstance(seniority, dict):
                    job.setdefault("seniorityLevel", seniority.get("localizedName", ""))
                break

        # Job poster — name, title, photo, profile URL
        for item in included:
            if item.get("$type") == "com.linkedin.voyager.jobs.JobHiringTeam":
                members = item.get("hiringTeamMembers", [])
                if members:
                    first = members[0]
                    name = f"{first.get('firstName', '')} {first.get('lastName', '')}".strip()
                    slug = first.get("publicIdentifier", "")
                    job.setdefault("jobPosterName", name or None)
                    job.setdefault("jobPosterTitle", first.get("occupation") or None)
                    job.setdefault("jobPosterProfileUrl", f"https://www.linkedin.com/in/{slug}" if slug else None)
                    # Photo URL from miniProfile picture
                    picture = first.get("picture", {}) or {}
                    artifacts = picture.get("artifacts", [])
                    if artifacts:
                        root = picture.get("rootUrl", "")
                        best = max(artifacts, key=lambda a: a.get("width", 0))
                        job.setdefault("jobPosterPhoto", root + best.get("fileIdentifyingUrlPathSegment", ""))
                break

        # Company profile — from included Company object
        for item in included:
            if item.get("$type") == "com.linkedin.voyager.organization.Company":
                slug = item.get("universalName", "")
                if slug:
                    job.setdefault("companyId", slug)
                    job.setdefault("companyLinkedinUrl", f"https://www.linkedin.com/company/{slug}")

                job.setdefault("companyWebsite", item.get("companyPageUrl") or item.get("websiteUrl") or "")
                job.setdefault("companySlogan", item.get("tagline") or "")
                job.setdefault("companyDescription", item.get("description") or "")
                job.setdefault(
                    "companyEmployeesCount",
                    item.get("staffCount") or (item.get("staffCountRange") or {}).get("start"),
                )

                # Company logo
                logo_obj = item.get("logoV2") or item.get("logo") or {}
                artifacts = logo_obj.get("artifacts", [])
                if artifacts:
                    root = logo_obj.get("rootUrl", "")
                    best = max(artifacts, key=lambda a: a.get("width", 0))
                    job.setdefault("companyLogo", root + best.get("fileIdentifyingUrlPathSegment", ""))

                # Headquarters address
                hq = item.get("headquarter") or {}
                if hq:
                    job.setdefault("companyAddress", {
                        "type": "PostalAddress",
                        "streetAddress": " ".join(filter(None, [hq.get("street1"), hq.get("street2")])),
                        "addressLocality": hq.get("city", ""),
                        "addressRegion": hq.get("geographicArea", ""),
                        "postalCode": hq.get("postalCode", ""),
                        "addressCountry": hq.get("country", ""),
                    })
                break

        # Benefits — as array
        try:
            ben = self.session.get(
                f"https://www.linkedin.com/voyager/api/jobs/jobPostings/{job_id}/benefits",
                headers={"Accept": "application/vnd.linkedin.normalized+json+2.1"},
                timeout=10,
            )
            if ben.status_code == 200:
                ben_data = ben.json().get("data", {})
                items = ben_data.get("benefits") or ben_data.get("elements") or []
                job["benefits"] = [
                    i.get("localizedName") or i.get("name") or ""
                    for i in items if isinstance(i, dict)
                ]
            else:
                job.setdefault("benefits", [])
        except Exception:
            job.setdefault("benefits", [])

    # ── Public interface ──────────────────────────────────────────────────

    def search_and_enrich(
        self,
        keywords: str,
        location: str = "",
        published_within: str = "any",
        max_results: int = 25,
        input_url: str = "",
    ) -> Iterator[dict]:
        for job in self.search(keywords, location, published_within, max_results, input_url):
            yield self.enrich(job)
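The job-ID extraction used by the scraper's `enrich` step can be tried on its own. The sketch below copies the two regular expressions from `_extract_job_id` into a standalone function; the example URLs are made up for illustration.

```python
import re
from typing import Optional


def extract_job_id(url: Optional[str]) -> Optional[str]:
    """Return the numeric job ID from a LinkedIn job URL, or None if absent."""
    if not url:
        return None
    # First pattern: slug URLs ending in "-<id>" (7+ digits), optionally
    # followed by "/" or "?". Second pattern: bare "/view/<id>" URLs.
    m = re.search(r"-(\d{7,})(?:[/?]|$)", url) or re.search(r"/view/(\d+)", url)
    return m.group(1) if m else None


# Hypothetical URLs covering the two shapes the patterns target.
slug_url = "https://www.linkedin.com/jobs/view/mba-intern-at-example-co-4012345678"
bare_url = "https://www.linkedin.com/jobs/view/4012345678/"
other_url = "https://www.linkedin.com/company/example-co"

print(extract_job_id(slug_url))
print(extract_job_id(bare_url))
print(extract_job_id(other_url))
```

The first pattern requires at least seven digits, which keeps short numbers inside a job-title slug from being mistaken for the ID.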

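The query that the scraper's `search` method sends to the guest endpoint can be reproduced without any network access. This is a minimal sketch: `TIME_FILTERS` and `GUEST_SEARCH_URL` are copied from the scraper, while `guest_search_params` is a hypothetical helper that mirrors how the parameters are assembled. The `f_TPR` values are "r" followed by the window length in seconds (86,400 per day, 604,800 per week, 2,592,000 for 30 days).

```python
from urllib.parse import urlencode

GUEST_SEARCH_URL = "https://www.linkedin.com/jobs-guest/jobs/api/seeMoreJobPostings/search"

TIME_FILTERS = {
    "day": "r86400",
    "week": "r604800",
    "month": "r2592000",
    "any": "",
}


def guest_search_params(keywords, location="", published_within="any", start=0):
    """Build the query parameters for one page of guest search results."""
    params = {"keywords": keywords, "location": location, "start": start}
    # Unknown or "any" windows map to "", in which case f_TPR is left out.
    time_filter = TIME_FILTERS.get(published_within.lower(), "")
    if time_filter:
        params["f_TPR"] = time_filter
    return params


params = guest_search_params("MBA intern", "London, United Kingdom", "week")
query = urlencode(params)
print(GUEST_SEARCH_URL + "?" + query)
```

Successive pages reuse the same parameters with a larger `start` offset; the scraper advances it by 25 per page.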
.gitignore

Dockerfile

requirements.txt

.actor/INPUT_SCHEMA.json

.actor/actor.json

src/__init__.py

src/main.py

src/scraper.py
