Linkedin Company Jobs Scraper
Pricing
from $5.00 / 1,000 results
Bulk scrape public job listings from lists of LinkedIn Company pages. Bypasses guest limits and data obfuscation (*****) to retrieve hundreds of clean results per company. Features smart proxy rotation and automated batch processing to scale your data collection.
Rating: 0.0 (0)
Developer: Gyanendra Thakur
Maintained by Community
Actor stats
- Bookmarked: 4
- Total users: 118
- Monthly active users: 35
- Last modified: 13 days ago
Collect, filter, enrich, and monitor public jobs from LinkedIn company pages.
This Apify Actor is designed for people who need dependable company-level hiring data without maintaining their own scraper. Add one or more public LinkedIn company URLs, choose a run profile, and receive clean job records in an Apify Dataset. Advanced controls are available when you need tighter filtering, predictable runtime, persistent new-job monitoring, or job-detail enrichment.
The Actor uses public LinkedIn company pages and public guest job endpoints. It does not sign in to LinkedIn or access profiles, messages, private pages, or member-only data.
What you can do
- Scrape one company or up to 100 company pages in a run.
- Start quickly with Fast, Balanced, or Enriched profiles.
- Limit results by posting age, keywords, location, remote status, workplace type, employment type, or seniority.
- Fetch descriptions, applicant text, apply URLs, and other public detail fields.
- Run scheduled monitoring workflows that emit only newly discovered matching jobs.
- Choose whether the first monitoring run emits or silently stores its baseline.
- Control pages, total results, request attempts, retries, pacing, detail concurrency, and whole-run duration.
- Continue past an unavailable company or stop immediately.
- Keep partial job results when the configured run-time budget is reached.
- Export jobs as JSON, CSV, Excel, XML, HTML, or through the Apify Dataset API.
Quick start
- Open the Actor's Input tab.
- Add at least one URL such as https://www.linkedin.com/company/google.
- Leave Balanced selected for a normal discovery run.
- Optionally add filters.
- Start the Actor.
- Open Output > Job records to inspect or export the dataset.
- Open Run summary when you need counts, failures, stop reasons, or the resolved configuration.
Minimal API input:
{"companyUrls": ["https://www.linkedin.com/company/google"],"runProfile": "balanced"}
Choose a run profile
Profiles provide defaults. Any advanced field included in the input overrides the selected profile.
| Profile | Best for | Descriptions | Jobs/company | Request budget | Run budget | Detail concurrency |
|---|---|---|---|---|---|---|
| fast | Quick discovery and inexpensive tests | No | 50 | 100 | 5 minutes | 1 |
| balanced | Most regular scraping workflows | No | 50 | 300 | 15 minutes | 2 |
| enriched | Research requiring descriptions and detail metadata | Yes | 50 | 500 | 30 minutes | 3 |
| custom | Fully controlled API inputs | No | 50 | 300 | 15 minutes | 2 |
For an unfiltered run, the Actor derives a conservative page allowance from maxJobs instead of automatically scanning the profile's full page ceiling. Selective filters and monitoring retain the wider profile allowance. An explicit maxPagesPerCompany always wins.
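The override rule can be illustrated with a minimal Python sketch. The default values are copied from the profile table; the `PROFILE_DEFAULTS` dictionary and the `resolve_config` helper are hypothetical names used only for illustration, and mapping the "Request budget" and "Run budget" columns to `maxRequestsPerRun` and `maxRunTimeSeconds` is an assumption based on the input reference. The page-allowance derivation is not documented in enough detail to reproduce, so it is left out.

```python
from typing import Any, Dict

# Defaults copied from the profile table. The structure is illustrative,
# not the Actor's actual implementation.
PROFILE_DEFAULTS: Dict[str, Dict[str, Any]] = {
    "fast": {"scrapeDescription": False, "maxJobs": 50,
             "maxRequestsPerRun": 100, "maxRunTimeSeconds": 300, "detailConcurrency": 1},
    "balanced": {"scrapeDescription": False, "maxJobs": 50,
                 "maxRequestsPerRun": 300, "maxRunTimeSeconds": 900, "detailConcurrency": 2},
    "enriched": {"scrapeDescription": True, "maxJobs": 50,
                 "maxRequestsPerRun": 500, "maxRunTimeSeconds": 1800, "detailConcurrency": 3},
    "custom": {"scrapeDescription": False, "maxJobs": 50,
               "maxRequestsPerRun": 300, "maxRunTimeSeconds": 900, "detailConcurrency": 2},
}


def resolve_config(actor_input: Dict[str, Any]) -> Dict[str, Any]:
    """Start from the profile defaults, then let explicit input fields win."""
    profile = actor_input.get("runProfile", "balanced")
    if profile not in PROFILE_DEFAULTS:
        raise ValueError(f"Unknown run profile: {profile!r}")
    resolved = dict(PROFILE_DEFAULTS[profile])
    # Any field present in the input overrides the profile value.
    resolved.update(actor_input)
    resolved["runProfile"] = profile
    return resolved
```

For example, `resolve_config({"runProfile": "fast", "maxJobs": 100})` keeps the Fast request and run budgets but raises the per-company job limit to 100.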
Input reference
Companies
| Input | Type | Default | Behavior |
|---|---|---|---|
| companyUrls | string[] | Required | Public https://www.linkedin.com/company/... URLs. Duplicates, query strings, fragments, and trailing slashes are normalized. |
The input form accepts up to 100 unique company URLs. Invalid URLs are rejected before network requests begin.
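A rough sketch of this kind of normalization, using only the Python standard library, is shown below. The function names are hypothetical, and the exact acceptance rules (for example, requiring `https` and the `www.linkedin.com` host, and keeping only the company slug) are assumptions rather than the Actor's documented behavior.

```python
from typing import Iterable, List
from urllib.parse import urlsplit


def normalize_company_url(raw: str) -> str:
    """Drop query string, fragment, and trailing slash; keep only the company slug."""
    parts = urlsplit(raw.strip())
    host = (parts.hostname or "").lower()
    segments = [s for s in parts.path.split("/") if s]
    # Assumption: only https://www.linkedin.com/company/<slug> is accepted.
    if (parts.scheme != "https" or host != "www.linkedin.com"
            or len(segments) < 2 or segments[0] != "company"):
        raise ValueError(f"Not a public LinkedIn company URL: {raw!r}")
    return f"https://www.linkedin.com/company/{segments[1]}"


def normalize_company_urls(urls: Iterable[str], limit: int = 100) -> List[str]:
    """Normalize, de-duplicate while preserving order, and enforce the URL limit."""
    unique = list(dict.fromkeys(normalize_company_url(u) for u in urls))
    if len(unique) > limit:
        raise ValueError(f"At most {limit} unique company URLs are allowed")
    return unique
```

Because validation happens while normalizing, an invalid URL raises before any network request would be made, which mirrors the behavior described above.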
Profile and enrichment
| Input | Type | Default | Behavior |
|---|---|---|---|
| runProfile | string | balanced | Selects Fast, Balanced, Enriched, or Custom defaults. |
| maxJobs | integer | Profile value | Maximum jobs that pass all filters per company, from 1 to 2,000. |
| maxTotalJobs | integer | 1,000 | Hard dataset-row cap across the entire run. Explicit maxJobs can raise it. |
| scrapeDescription | boolean | Profile value | Fetches descriptions and public detail fields only for candidates still needed. |
Some filters require detail metadata. If remoteFilter, workplaceTypes, employmentTypes, or seniorityLevels is active, the Actor fetches job details even when scrapeDescription is false. In that case, metadata is used for filtering but description fields remain unrequested.
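The rule above can be expressed as a small predicate. This is a minimal sketch, assuming the configuration has already been resolved against the selected profile; `needs_detail_requests` is a hypothetical helper name, not part of the Actor.

```python
from typing import Any, Dict


def needs_detail_requests(config: Dict[str, Any]) -> bool:
    """Return True when job-detail pages must be fetched for this configuration."""
    if config.get("scrapeDescription"):
        return True
    # remoteFilter defaults to "all", which does not require detail metadata.
    if config.get("remoteFilter", "all") != "all":
        return True
    # A non-empty list for any metadata filter also forces detail requests.
    metadata_filters = ("workplaceTypes", "employmentTypes", "seniorityLevels")
    return any(bool(config.get(name)) for name in metadata_filters)
```

This is useful for estimating cost before a run: a Balanced run with `remoteFilter` set to `remote` issues detail requests even though descriptions stay unrequested.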
Date, keyword, and location filters
| Input | Type | Default | Behavior |
|---|---|---|---|
| postedWithinDays | integer | 0 | Keeps jobs posted in the last N days. 0 disables the date filter. |
| includeJobsWithUnknownDate | boolean | true | Keeps records whose public card has no valid posting date. |
| keywordInclude | string[] | [] | Terms matched case-insensitively against title and location. |
| keywordMatchMode | string | any | Requires any or all include terms. |
| keywordExclude | string[] | [] | Rejects a job when any term appears in title or location. |
| locationInclude | string[] | [] | Requires the raw location to contain at least one value. |
| locationExclude | string[] | [] | Rejects the raw location when it contains any value. |
Exclusion always wins. The Actor rejects an input when the same normalized term is present in both an include and exclude list because that configuration cannot produce an intuitive result.
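The keyword semantics can be sketched as follows. Substring matching, whitespace trimming, and the helper names are assumptions made for illustration; the table only states that terms are matched case-insensitively against title and location.

```python
from typing import Sequence


def _norm(term: str) -> str:
    return term.strip().casefold()


def validate_terms(include: Sequence[str], exclude: Sequence[str]) -> None:
    """Reject inputs where the same normalized term is both included and excluded."""
    overlap = {_norm(t) for t in include} & {_norm(t) for t in exclude}
    if overlap:
        raise ValueError(f"Terms in both include and exclude lists: {sorted(overlap)}")


def passes_keyword_filter(title: str, location: str,
                          include: Sequence[str], exclude: Sequence[str],
                          match_mode: str = "any") -> bool:
    fields = [title.casefold(), location.casefold()]

    def appears(term: str) -> bool:
        needle = _norm(term)
        return any(needle in field for field in fields)

    # Exclusion always wins, so it is checked first.
    if any(appears(t) for t in exclude):
        return False
    if not include:
        return True
    hits = [appears(t) for t in include]
    return all(hits) if match_mode == "all" else any(hits)
```

Under the substring assumption, a short exclude term such as `intern` would also reject titles containing "International", so more specific terms are safer.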
Detail metadata filters
| Input | Type | Default | Behavior |
|---|---|---|---|
| remoteFilter | string | all | Accepts all, remote, or nonRemote. |
| workplaceTypes | string[] | [] | Exact match against Remote, Hybrid, or On-site. |
| employmentTypes | string[] | [] | Exact match against public values such as Full-time, Contract, or Internship. |
| seniorityLevels | string[] | [] | Exact match against public LinkedIn seniority values. |
| includeJobsWithUnknownMetadata | boolean | true | Controls whether jobs missing a selected detail field pass that filter. |
LinkedIn does not publish every metadata field on every job. For strict datasets, set includeJobsWithUnknownMetadata to false. For broader discovery, leave it enabled.
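The interaction between an exact-match filter and missing metadata can be sketched like this. `passes_exact_filter` is a hypothetical helper, and treating an empty list as "filter disabled" is an assumption based on the `[]` defaults in the table.

```python
from typing import Optional, Sequence


def passes_exact_filter(value: Optional[str], allowed: Sequence[str],
                        include_unknown: bool = True) -> bool:
    """Apply one exact-match metadata filter to one job field."""
    if not allowed:
        return True  # assumption: an empty list disables the filter
    if value is None:
        # LinkedIn did not publish the field; the unknown-metadata switch decides.
        return include_unknown
    return value in allowed
```

With `includeJobsWithUnknownMetadata` set to false, a job with no published employment type is rejected by an `employmentTypes` filter; with the default of true, it passes.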
Monitoring
| Input | Type | Default | Behavior |
|---|---|---|---|
| newJobsOnly | boolean | false | Emits only matching job keys not already stored for this company and namespace. |
| monitoringStoreId | string | Named persistent store | Select an existing Apify key-value store. When omitted, the Actor uses or creates linkedin-company-scraper-monitoring. |
| monitoringNamespace | string | default | Isolates schedules or filter sets sharing one monitoring store. |
| monitoringFirstRunPolicy | string | emit | emit returns and stores the baseline. storeOnly stores it without writing baseline jobs to the dataset. |
| resetMonitoring | boolean | false | Clears state for the requested companies and namespace before scraping. |
Monitoring is keyed by company and namespace. Use a different namespace whenever two schedules should have independent history, for example:
- engineering_daily
- sales_us_weekly
- all_jobs_archive
The Actor updates monitoring state only after a company completes successfully. If a company fails or the whole-run budget interrupts that company, its incomplete pass does not become the new baseline.
State uses Apify-safe record keys such as monitoring_daily_engineering_1441_job_keys and monitoring_daily_engineering_1441_last_run.
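The example keys suggest a pattern of `monitoring_<namespace>_<companyId>_<suffix>`. The sketch below follows that inferred pattern and shows one way a first-run policy could be applied. Both helpers are hypothetical, the character whitelist is a conservative assumption, and merging old and new keys into the stored state is also an assumption, since the text does not say whether older keys are retained.

```python
import re
from typing import List, Optional, Set, Tuple


def monitoring_record_key(namespace: str, company_id: str, suffix: str) -> str:
    """Build a record key shaped like the documented examples."""
    raw = f"monitoring_{namespace}_{company_id}_{suffix}"
    # Assumption: replace anything outside a conservative safe character set.
    return re.sub(r"[^A-Za-z0-9_.\-]", "_", raw)


def plan_emission(current_keys: List[str], stored_keys: Optional[Set[str]],
                  first_run_policy: str = "emit") -> Tuple[List[str], Set[str]]:
    """Return (keys_to_emit, keys_to_store) for one successfully completed company."""
    is_first_run = stored_keys is None
    known = stored_keys or set()
    new_keys = [k for k in current_keys if k not in known]
    if is_first_run and first_run_policy == "storeOnly":
        to_emit: List[str] = []  # baseline is stored silently
    else:
        to_emit = new_keys
    return to_emit, known | set(current_keys)
```

The returned state should only be written once the company has completed, which matches the rule that an incomplete pass never becomes the new baseline.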
Reliability and cost controls
| Input | Type | Profile default | Behavior |
|---|---|---|---|
| maxPagesPerCompany | integer | Automatic, max 100 | Hard company page limit. Unfiltered runs derive a smaller allowance from maxJobs. |
| maxRequestsPerRun | integer | 100, 300, or 500 | Hard attempt budget covering company, listing, detail, and retry requests. |
| maxRunTimeSeconds | integer | 300, 900, or 1,800 | Whole-run budget. The Actor reserves a shutdown buffer and writes partial-run records. |
| requestTimeoutSeconds | integer | 30, 45, or 60 | Per-request timeout. |
| maxRequestRetries | integer | 2 or 3 | Bounded retries after transient request failures. |
| retryDelaySeconds | integer | 1 or 2 | Initial backoff delay. Later retries use exponential backoff with jitter. |
| requestDelayMillis | integer | 750 to 1,250 | Randomized pacing before public LinkedIn requests. |
| detailConcurrency | integer | 1 to 3 | Parallel job-detail requests, capped at 5. |
| failurePolicy | string | continue | Continue to other companies or stop with failFast. |
| proxyConfiguration | object | Apify Residential Proxy | Standard Apify proxy editor. The Actor respects the supplied setting instead of forcing a proxy group. |
The Actor stops pagination on empty pages, exact repeated page signatures, explicit page limits, request limits, result limits, or the whole-run time budget. Short pages are not treated as final because LinkedIn can return fewer cards while later pages still exist.
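A simplified sketch of these stop conditions is shown below. It covers empty pages, repeated page signatures, the page limit, and the per-company job limit; the request and time budgets are omitted for brevity. The `fetch_page` callback, the SHA-256 signature over job IDs, and the choice of which stop reason to report are all assumptions for illustration.

```python
import hashlib
from typing import Callable, List, Tuple


def page_signature(job_ids: List[str]) -> str:
    """Fingerprint a listing page by its ordered job IDs."""
    return hashlib.sha256("\n".join(job_ids).encode("utf-8")).hexdigest()


def paginate(fetch_page: Callable[[int], List[str]],
             max_pages: int, max_jobs: int) -> Tuple[List[str], str]:
    seen_signatures = set()
    seen_ids = set()
    collected: List[str] = []
    for page_index in range(max_pages):
        job_ids = fetch_page(page_index)
        if not job_ids:
            return collected, "no_more_jobs"
        signature = page_signature(job_ids)
        if signature in seen_signatures:
            return collected, "repeated_page_detected"
        seen_signatures.add(signature)
        # A short page is NOT treated as final; only the checks above end the loop.
        for job_id in job_ids:
            if job_id in seen_ids:
                continue
            seen_ids.add(job_id)
            collected.append(job_id)
            if len(collected) >= max_jobs:
                return collected, "max_jobs_reached"
    return collected, "max_pages_reached"
```

Note that a page with a single card does not stop the loop; only an empty page, an exact repeat, or a configured limit does.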
The Actor definition defaults to 1,024 MB of memory and bounds automatic allocation between 512 MB and 2,048 MB. Because the scraper is HTTP-based and processes companies sequentially, a larger default such as 4,096 MB would add cost without improving extraction quality.
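The retryDelaySeconds row in the table above mentions exponential backoff with jitter. One common way to compute such delays is sketched here; the doubling factor, the 25% additive jitter, and the 60-second cap are assumptions, not documented values, and `retry_delays` is a hypothetical helper.

```python
import random
from typing import List, Optional


def retry_delays(initial_delay_seconds: float, max_retries: int,
                 rng: Optional[random.Random] = None,
                 cap_seconds: float = 60.0) -> List[float]:
    """Return the wait before each retry: initial delay first, then doubled with jitter."""
    rng = rng or random.Random()
    delays = []
    for attempt in range(max_retries):
        base = min(cap_seconds, initial_delay_seconds * (2 ** attempt))
        # Assumption: the first retry uses the plain initial delay;
        # later retries add up to 25% random jitter.
        jitter = rng.uniform(0, base * 0.25) if attempt > 0 else 0
        delays.append(base + jitter)
    return delays
```

With `retryDelaySeconds` of 2 and three retries, this yields waits of 2 seconds, then between 4 and 5 seconds, then between 8 and 10 seconds.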
Example inputs
Fast company hiring scan
{"companyUrls": ["https://www.linkedin.com/company/google", "https://www.linkedin.com/company/microsoft"],"runProfile": "fast","maxJobs": 100,"postedWithinDays": 30,"keywordExclude": ["intern"]}
Enriched recruiting research
{"companyUrls": ["https://www.linkedin.com/company/tesla-motors"],"runProfile": "enriched","maxJobs": 200,"postedWithinDays": 14,"keywordInclude": ["engineer", "developer"],"keywordMatchMode": "any","locationInclude": ["United States", "Remote"],"employmentTypes": ["Full-time"],"includeJobsWithUnknownMetadata": false}
Remote-only jobs
{"companyUrls": ["https://www.linkedin.com/company/github"],"runProfile": "balanced","remoteFilter": "remote","workplaceTypes": ["Remote"],"postedWithinDays": 30,"maxJobs": 100}
Remote and workplace filters trigger detail requests even with the Balanced profile.
Daily new-job monitoring
{"companyUrls": ["https://www.linkedin.com/company/google", "https://www.linkedin.com/company/anthropicresearch"],"runProfile": "balanced","newJobsOnly": true,"monitoringNamespace": "daily_engineering","monitoringFirstRunPolicy": "storeOnly","keywordInclude": ["engineer"],"maxJobs": 500}
The first run stores matching jobs without emitting them. Later runs using the same monitoring store and namespace emit newly discovered matching jobs.
Conservative custom run
{"companyUrls": ["https://www.linkedin.com/company/google"],"runProfile": "custom","maxJobs": 250,"maxPagesPerCompany": 20,"maxRunTimeSeconds": 1200,"requestTimeoutSeconds": 60,"maxRequestRetries": 4,"retryDelaySeconds": 2,"requestDelayMillis": 2000,"detailConcurrency": 1,"failurePolicy": "continue","proxyConfiguration": {"useApifyProxy": true,"apifyProxyGroups": ["RESIDENTIAL"]}}
Job output
The default dataset contains job records only. It does not mix run summaries into CSV or API exports.
Core fields
| Field | Description |
|---|---|
| jobId | LinkedIn job ID, or null when it cannot be derived safely. |
| title | Public job title. |
| company | Company name shown on the listing. |
| companyId | LinkedIn company identifier used for public job pagination. |
| companyNameNormalized | Clean company name for matching and CRM workflows. |
| companyLinkedinUrl | Normalized input company URL. |
| location | Raw public LinkedIn location. |
| city, state, country | Best-effort structured location components. |
| isRemote | Remote signal inferred from public location and detail metadata. |
| url | Canonical public LinkedIn job URL. |
| datePosted | Public posting date when available. |
| scrapedAt | UTC extraction timestamp. |
| inputUrl | Company URL that produced the record. |
Detail fields
| Field | Description |
|---|---|
| employmentType | Public employment type when available. |
| seniorityLevel | Public seniority value when available. |
| workplaceType | Remote, Hybrid, or On-site when inferred or published. |
| applicants | Public applicant text when exposed. |
| applyUrl | Public apply link when exposed. |
| descriptionText | Plain-text description when requested and available. |
| descriptionHtml | Original description HTML fragment when requested and available. |
| detailStatus | not_requested, loaded, unavailable, or failed. |
| descriptionStatus | not_requested, loaded, not_available, unavailable, or failed. |
| detailError | Short error message when a detail request fails. |
Example:
{"jobId": "4201938821","title": "Senior Software Engineer","company": "Example Company","companyId": "1441","companyNameNormalized": "Example Company","companyLinkedinUrl": "https://www.linkedin.com/company/example-company","location": "San Francisco, CA","city": "San Francisco","state": "CA","country": "United States","isRemote": false,"workplaceType": "Hybrid","employmentType": "Full-time","seniorityLevel": "Mid-Senior level","applicants": "25 applicants","applyUrl": "https://www.linkedin.com/jobs/apply/4201938821","url": "https://www.linkedin.com/jobs/view/4201938821","datePosted": "2026-06-05","scrapedAt": "2026-06-06T08:30:00.000Z","inputUrl": "https://www.linkedin.com/company/example-company","detailStatus": "loaded","descriptionStatus": "loaded","descriptionText": "About the role...","descriptionHtml": "<p>About the role...</p>"}
Null detail fields are normal when LinkedIn does not publish them.
Run summary and diagnostics
The default key-value store contains:
| Record | Purpose |
|---|---|
| RUN_SUMMARY | Final status, stop reason, totals, timing, resolved non-secret configuration, per-company outcomes, and failures. |
| RUN_DIAGNOSTICS | Compact failure, malformed-card, and filter-reason diagnostics. |
| MONITORING_INFO | Monitoring store and namespace used by the run, or { "enabled": false }. |
RUN_SUMMARY.status can be:
- succeeded: all processed companies completed without an error.
- partial: at least one company failed, but jobs were emitted.
- failed: failures occurred and no jobs were emitted.
- timed_out: the configured whole-run budget was reached and partial results were preserved.
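For downstream automation, the four statuses can be read as a simple decision rule. The sketch below is an interpretation of the definitions above, not the Actor's code; in particular, giving `timed_out` precedence over the other outcomes is an assumption, and `derive_run_status` is a hypothetical name.

```python
def derive_run_status(failed_companies: int, emitted_jobs: int,
                      time_budget_reached: bool) -> str:
    """Map run counters to one of the documented RUN_SUMMARY.status values."""
    if time_budget_reached:
        return "timed_out"  # assumption: the time budget takes precedence
    if failed_companies == 0:
        return "succeeded"
    return "partial" if emitted_jobs > 0 else "failed"
```

A scheduler could, for instance, alert on `failed` and `timed_out` while treating `partial` as a warning.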
Common company stop reasons include:
- max_jobs_reached
- max_pages_reached
- no_more_jobs
- no_public_jobs_found
- repeated_page_detected
- duplicate_company_skipped
- max_total_jobs_reached
- request_budget_reached
- unparseable_page
- company_error
Filtering and counting semantics
maxJobs means matching jobs accepted per company, not raw cards scanned.
The Actor processes records in this order:
- Parse and normalize the public job card.
- Remove duplicates across pages, company aliases, and the entire run.
- Skip known monitoring keys when newJobsOnly is enabled.
- Apply posting-date, keyword, and raw-location filters.
- Fetch detail metadata when requested or required by filters.
- Apply remote, workplace, employment, and seniority filters.
- Save accepted jobs, unless the first monitoring baseline is configured as storeOnly.
The summary separates scanned, accepted, emitted, filtered, duplicate, malformed-card, and HTTP-attempt counts. requestsPerEmittedJob makes low-yield or unusually expensive runs visible without inspecting every log line.
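The requestsPerEmittedJob figure reads naturally as a ratio of HTTP attempts to emitted jobs. A minimal sketch under that assumption follows; the rounding and the handling of runs that emit nothing are illustrative choices, and the helper name is hypothetical.

```python
from typing import Optional


def requests_per_emitted_job(request_attempts: int, emitted_jobs: int) -> Optional[float]:
    """Average HTTP attempts spent per emitted job, or None when nothing was emitted."""
    if emitted_jobs <= 0:
        return None  # assumption: the ratio is undefined for an empty dataset
    return round(request_attempts / emitted_jobs, 2)
```

A value that climbs between otherwise similar runs usually points to more retries, stricter filters, or additional detail requests.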
Scheduling new-job alerts
- Run the Actor once with newJobsOnly: true.
- Select monitoringFirstRunPolicy: "storeOnly" if the initial backlog should not trigger alerts.
- Create an Apify schedule with the same input.
- Keep the same monitoringStoreId and monitoringNamespace.
- Connect the dataset to a webhook, Make, Zapier, Slack, Google Sheets, a database, or your own API.
Changing filters while reusing a namespace changes what can be considered new. For independent alert logic, create a new namespace.
To intentionally rebuild a baseline, set resetMonitoring: true for one run, then turn it off again.
API and local use
The Actor accepts the same JSON through Apify Console, the Apify API, schedules, tasks, and integrations.
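As an illustration of the API route, the sketch below builds (but does not send) a request to the Apify API's run-Actor endpoint using only the Python standard library. The Actor ID is a placeholder because this page does not state it, and `build_run_request` is a hypothetical helper name.

```python
import json
import urllib.request
from typing import Any, Dict

API_BASE = "https://api.apify.com/v2"


def build_run_request(actor_id: str, token: str,
                      actor_input: Dict[str, Any]) -> urllib.request.Request:
    """Prepare a POST that starts an Actor run with the given JSON input."""
    return urllib.request.Request(
        url=f"{API_BASE}/acts/{actor_id}/runs",
        data=json.dumps(actor_input).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


# Placeholder values: replace with the real Actor ID and your own API token.
request = build_run_request(
    "username~actor-name",
    "YOUR_APIFY_TOKEN",
    {"companyUrls": ["https://www.linkedin.com/company/google"], "runProfile": "balanced"},
)
```

Passing the prepared object to `urllib.request.urlopen(request)` would start the run. Keep the token out of source control and logs.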
For local development, create:
storage/key_value_stores/default/INPUT.json
Example local input:
{"companyUrls": ["https://www.linkedin.com/company/google"],"runProfile": "fast","maxJobs": 10,"maxPagesPerCompany": 1,"maxRunTimeSeconds": 120,"proxyConfiguration": {"useApifyProxy": false}}
Then run:
apify run
Local storage remains on your machine and is not uploaded to Apify Console.
Cost and performance
The main cost drivers are:
- Number of companies.
- Number of listing pages.
- Number of accepted candidate jobs.
- Whether detail requests are required.
- Proxy traffic.
- Retry volume and total runtime.
For lower cost:
- Start with the Fast profile.
- Keep scrapeDescription disabled.
- Apply date, keyword, and location filters before detail enrichment.
- Set a realistic maxJobs.
- Keep maxTotalJobs and maxRequestsPerRun bounded.
- Keep maxPagesPerCompany bounded.
- Use monitoring so recurring runs skip known matching jobs before detail requests.
For higher data completeness:
- Use the Enriched profile.
- Keep includeJobsWithUnknownMetadata enabled unless strict filtering is required.
- Use a Residential proxy group.
- Allow a larger whole-run budget.
Troubleshooting
The Actor says the URL is invalid
Use a full public company URL:
https://www.linkedin.com/company/google
Individual job URLs, personal profiles, /jobs/search pages, and non-LinkedIn URLs are not company targets.
A company fails before jobs are collected
The company may be unavailable, renamed, geographically restricted, or served behind an authentication wall. Check RUN_SUMMARY.failures and RUN_DIAGNOSTICS.
With failurePolicy: "continue", other companies are still processed.
The dataset is empty
Check:
- RUN_SUMMARY.companies[].stoppedReason
- jobsFilteredOut
- filteredBy
- duplicatesSkipped
- monitoringBaselineStoredOnly
- failureCount
An empty dataset can be expected when the first monitoring run uses storeOnly, all matches are already known, filters reject all jobs, or LinkedIn exposes no public jobs.
Some detail fields are null
LinkedIn does not expose every field for every job. Inspect detailStatus and descriptionStatus. A null field with detailStatus: "loaded" usually means the public page omitted it.
Monitoring emits old jobs again
Confirm that scheduled runs use the same persistent monitoring store and namespace. A new store, namespace, or reset creates a new baseline.
The run stops with time_budget_reached
Already emitted jobs remain in the dataset. Increase maxRunTimeSeconds, lower maxJobs, reduce the number of companies, disable detail enrichment, or reduce retries.
The run stops at a request or result budget
request_budget_reached protects against retry, pagination, or enrichment amplification. max_total_jobs_reached protects against unexpectedly large datasets across many companies. Already emitted jobs remain available. Increase the relevant limit only after checking requestAttempts, requestsPerEmittedJob, filters, monitoring settings, and the number of companies.
Requests are blocked or unstable
Use Apify Residential Proxy, lower detail concurrency, increase request pacing, and avoid aggressive retry settings. Public LinkedIn page behavior can vary by region and over time.
Data quality and limitations
- The Actor only returns jobs visible through LinkedIn's public company and guest job surfaces.
- Public availability, selectors, fields, and pagination behavior can change without notice.
- Parsed location fields are best-effort. Always retain the raw location.
- Remote classification is based on public text and may not capture every employer-specific arrangement.
- Applicant counts and apply links are not always exposed.
- Job IDs are preferred for deduplication; the canonical job URL is used when an ID is unavailable.
- Deleted or expired postings may disappear between runs.
- The Actor cannot guarantee that LinkedIn publishes every open role for a company.
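The deduplication rule in the list above (job ID first, canonical URL as fallback) can also be applied when merging exports from several runs. This is a minimal sketch using the documented `jobId` and `url` fields; the key prefixes and the decision to keep records that have neither field are assumptions.

```python
from typing import Any, Dict, List, Optional


def dedup_key(record: Dict[str, Any]) -> Optional[str]:
    """Prefer the job ID; fall back to the canonical job URL."""
    job_id = record.get("jobId")
    if job_id:
        return f"id:{job_id}"
    url = record.get("url")
    return f"url:{url}" if url else None


def deduplicate(records: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Keep the first record seen for each key, preserving input order."""
    seen = set()
    unique = []
    for record in records:
        key = dedup_key(record)
        if key is not None:
            if key in seen:
                continue
            seen.add(key)
        # Assumption: records with neither field are kept rather than dropped.
        unique.append(record)
    return unique
```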
Responsible use
Use the Actor only for lawful purposes and public data you are permitted to process. Respect applicable privacy, employment, anti-discrimination, database, and contract rules. Avoid collecting or retaining personal information from job descriptions unless your workflow has a legitimate need and appropriate safeguards.
This Actor is not affiliated with, endorsed by, or sponsored by LinkedIn Corporation or Microsoft. LinkedIn is a trademark of its respective owner.
Support
When reporting an issue, include:
- Apify run ID.
- A minimal input that reproduces the problem.
- Affected company URL.
- Relevant RUN_SUMMARY and RUN_DIAGNOSTICS fields.
- Expected and actual behavior.
Do not include Apify tokens, proxy passwords, cookies, or other credentials.