# LinkedIn Company Scraper — Exact Employee Count (`bovi/linkedin-company-scraper`) Actor

Scrape LinkedIn public company pages: exact employee count, company type, size range, specialties, tagline, industry, HQ, website, founded year, funding rounds + investors, and featured employees. No login required. parse\_confidence on every row.

- **URL**: https://apify.com/bovi/linkedin-company-scraper.md
- **Developed by:** [Vitalii Bondarev](https://apify.com/bovi) (community)
- **Categories:** Lead generation, Business
- **Stats:** 3 total users, 2 monthly users, 100.0% runs succeeded, 0 bookmarks
- **User rating**: No ratings yet

## Pricing

from $1.99 / 1,000 company-records

This Actor is paid per event. You are not charged for Apify platform usage; you pay only a fixed price for specific events.
Because this Actor supports Apify Store discounts, the price decreases on higher subscription plans.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## LinkedIn Company Scraper — Exact Employee Count & Firmographics

Scrape LinkedIn public company pages for **exact employee counts** (not buckets), company type, size range, specialties, tagline, industry, full headquarters address, company description, website, founded year, funding rounds + investors, logo, and featured employee IDs. Optionally enrich featured employees with full person profiles.

No login required. No cookies. Scrapes only public data.

---

### What it does

For each LinkedIn company slug or URL you provide, the Actor:

1. Fetches the **public** company home page (`/company/{slug}/`) — the only LinkedIn surface that is not auth-walled.
2. Parses the `Organization` node from `<script type="application/ld+json">` for core firmographics (exact employee count, industry, HQ, website, founded year).
3. Reads the public **about-us overview card** for company type, size range, specialties, tagline, and funding history (rounds, last amount/date, investors, Crunchbase link).
4. Extracts featured employee public-profile IDs from `trk=org-employees` links.
5. Optionally enriches featured employees via a managed proxy (the `enrichEmployees` mode).
6. Outputs one flat row per company, charged as a single PPE event.
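
The ld+json step in the list above can be sketched with the Python standard library alone. This is a minimal illustration, not the Actor's actual parser: the names `LdJsonCollector` and `find_organization` are hypothetical, and the handling of list-shaped blocks and `@graph` wrappers is an assumption about how such blocks might be laid out.

```python
import json
from html.parser import HTMLParser


class LdJsonCollector(HTMLParser):
    """Collect the raw text of every <script type="application/ld+json"> tag."""

    def __init__(self):
        super().__init__()
        self._in_ldjson = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_ldjson = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_ldjson:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_ldjson:
            self.blocks.append("".join(self._buffer))
            self._in_ldjson = False


def find_organization(html):
    """Return the first ld+json node whose @type is Organization, or None."""
    collector = LdJsonCollector()
    collector.feed(html)
    collector.close()
    for raw in collector.blocks:
        try:
            data = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip malformed blocks instead of failing the whole page
        # Assumption: a block is a single node, a list of nodes, or an @graph wrapper.
        if isinstance(data, list):
            nodes = data
        elif isinstance(data, dict):
            graph = data.get("@graph")
            nodes = graph if isinstance(graph, list) else [data]
        else:
            continue
        for node in nodes:
            if isinstance(node, dict) and node.get("@type") == "Organization":
                return node
    return None
```

Returning `None` rather than raising leaves room for a caller to fall back to HTML heuristics, which is roughly what the `parse_source` field (`"ldjson"` or `"html_fallback"`) suggests happens.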

#### Auth-walled surfaces (NOT scraped)
- `/company/{slug}/about/` → redirects to login
- `/company/{slug}/people/` → redirects to login

Only the home page is public. This is documented to set accurate expectations.

---

### Edge: EXACT employee count

Other LinkedIn company scrapers typically return **employee buckets** ("1001–5000") from the UI text. This Actor reads `numberOfEmployees.value` from the structured `Organization` ld+json, which LinkedIn embeds with the **exact integer** (e.g. `231732`).

The output field `employee_count_source` is labeled `"ldjson_exact"` so buyers know they are getting the real figure, not a range.

Competitors tested return the bucket string, wrong member counts, or null.
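
As a rough sketch of how the exact count and its source label could be derived from an already-parsed `Organization` node: the function name `exact_employee_count` is hypothetical, and the strict integer check is an assumption about what should count as an exact value.

```python
def exact_employee_count(org):
    """Return (employee_count, employee_count_source) from an Organization node."""
    employees = (org or {}).get("numberOfEmployees")
    value = employees.get("value") if isinstance(employees, dict) else None
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(value, int) and not isinstance(value, bool) and value >= 0:
        return value, "ldjson_exact"
    # Anything else (missing node, bucket string, null) is reported as unknown.
    return None, None
```

A bucket string such as `"1001-5000"` deliberately yields `(None, None)` here, so a range is never mislabeled as an exact figure.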

---

### Input

| Field | Type | Description |
|---|---|---|
| `companies` | array | Slugs (`"microsoft"`), full `/company/` URLs, or numeric IDs. |
| `maxCompanies` | integer | Max companies to scrape. `0` = all. |
| `enrichEmployees` | boolean | Enrich 1–4 featured employees with full profile data (default: false). |
| `proxyConfiguration` | object | Proxy settings. Defaults to Apify Residential (buyer-paid). |

**Example input:**
```json
{
  "companies": ["microsoft", "stripe", "openai", "https://www.linkedin.com/company/anthropic/"],
  "enrichEmployees": false
}
````
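
Because `companies` accepts slugs, full `/company/` URLs, and numeric IDs side by side, it can help to normalize a list before submitting it. The helper below is a sketch under stated assumptions: `normalize_company` and `prepare_companies` are hypothetical names, and lowercasing and de-duplication are choices made for this example, not documented Actor behavior.

```python
from urllib.parse import urlparse


def normalize_company(entry):
    """Reduce a slug, a /company/ URL, or a numeric ID to the bare identifier."""
    text = entry.strip()
    if "://" in text:
        parts = [p for p in urlparse(text).path.split("/") if p]
        # Expect a path such as /company/<slug>/...
        if len(parts) >= 2 and parts[0] == "company":
            return parts[1].lower()
        raise ValueError(f"not a LinkedIn company URL: {entry!r}")
    return text.strip("/").lower()


def prepare_companies(entries, max_companies=0):
    """De-duplicate while preserving order; max_companies=0 keeps everything."""
    seen = []
    for entry in entries:
        slug = normalize_company(entry)
        if slug not in seen:
            seen.append(slug)
    return seen if max_companies == 0 else seen[:max_companies]
```

The `max_companies=0` convention mirrors the `maxCompanies` input, where `0` means no limit.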

***

### Output schema (one row per company)

| Field | Type | Notes |
|---|---|---|
| `company_name` | string | From ld+json |
| `slug` | string | Normalized slug |
| `linkedin_url` | string | Full company URL |
| `tagline` | string\|null | Short company tagline (about-us card / og:description) |
| `description` | string | Company description |
| `industry` | string\|null | Opportunistic HTML extraction |
| `company_type` | string\|null | e.g. `"Public Company"`, `"Privately Held"` (about-us card) |
| `company_size` | string\|null | Employee range, e.g. `"5,001-10,000 employees"` (about-us card) |
| `employee_count` | integer\|null | **Exact count from ld+json** (our edge) |
| `employee_count_source` | string\|null | `"ldjson_exact"` or null |
| `specialties` | array | Company specialties / focus areas (about-us card) |
| `website` | string\|null | Company website (decoded from the LinkedIn redirect) |
| `hq_address` | string\|null | Flat formatted HQ address |
| `hq` | object | Structured: street, city, region, postal, country |
| `logo_url` | string\|null | LinkedIn CDN logo URL |
| `follower_count` | integer\|null | Extracted from page HTML |
| `funding` | object\|null | Funding history: total rounds, last round type/date/amount, investors, Crunchbase URL |
| `featured_employee_ids` | array | Public profile IDs (1–4 per company) |
| `featured_employees` | array | Enriched person dicts (when `enrichEmployees=true`) |
| `founded_year` | integer\|null | Opportunistic HTML extraction |
| `scraped_at` | string | ISO-8601 UTC timestamp |
| `parse_source` | string | `"ldjson"` or `"html_fallback"` |
| `parse_confidence` | float | 0.0–1.0 quality signal |
| `warnings` | array | Machine-readable parse warnings |
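
The `website` field is described as decoded from the LinkedIn redirect. A minimal sketch of that kind of decoding follows, assuming the outbound link carries its target in a `url` query parameter. That link shape and the name `decode_redirect` are assumptions for illustration, not confirmed details of the Actor.

```python
from urllib.parse import parse_qs, urlparse


def decode_redirect(href):
    """Return the target of a redirect-style link, or the link itself if none."""
    query = parse_qs(urlparse(href).query)
    targets = query.get("url")
    # parse_qs already percent-decodes the parameter values.
    return targets[0] if targets else href
```

Links without a `url` parameter pass through unchanged, so the helper is safe to apply to every candidate link.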

***

### `enrichEmployees` mode (hybrid)

When `enrichEmployees: true`, for each company the Actor takes the `featured_employee_ids` (1–4 employees featured on the company home page) and fetches each profile via a managed proxy.

**Each enriched employee includes:**

```json
{
  "full_name": "Reid Hoffman",
  "headline": "Co-Founder, LinkedIn...",
  "location": "United States, US",
  "current_title": "Co-Founder, Board Chair",
  "current_company": "Manas AI",
  "profile_url": "https://www.linkedin.com/in/reidhoffman",
  "photo_url": "https://media.licdn.com/...",
  "followers": 2767834
}
```

**Honest limitation:** This enriches only the **1–4 publicly-featured employees** shown on the company home page — not the full roster. The full `/people/` list requires authentication and is not accessible. Billing remains 1 charge per company (employees are embedded in the row, not separate rows).

Proxy infrastructure is handled automatically via your Apify Proxy configuration — no external credentials required.
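
Since enriched employees are embedded in the company row, downstream tools that want one row per person need to flatten them. The sketch below uses only field names shown in this README; the function name `employee_rows` and the choice of columns are hypothetical.

```python
def employee_rows(company_row):
    """Yield one flat dict per enriched employee, tagged with the parent company."""
    # featured_employees may be missing or empty when enrichEmployees is false.
    for person in company_row.get("featured_employees") or []:
        yield {
            "company_slug": company_row.get("slug"),
            "full_name": person.get("full_name"),
            "current_title": person.get("current_title"),
            "profile_url": person.get("profile_url"),
        }
```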

***

### Proxy note

LinkedIn requires **residential IPs**: datacenter IPs typically receive HTTP 999 or a login redirect. By default the Actor uses Apify Residential proxy (billed to your Apify account), which provides the residential IP rotation LinkedIn expects. Configure it via the `proxyConfiguration` input.
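
If you test your own proxy setup, a simple heuristic can flag the two failure modes named above. This is an illustrative sketch only: `looks_blocked` is a hypothetical helper, and the URL markers it checks are assumptions about what a login redirect might look like, not a documented list.

```python
def looks_blocked(status_code, final_url):
    """Heuristic: True when a response resembles a block or a login redirect."""
    if status_code == 999:
        return True
    # Assumption: a login redirect ends on an auth-style path
    # instead of the requested company page.
    markers = ("/login", "/authwall", "/uas/")
    return any(marker in final_url for marker in markers)
```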

***

### `parse_confidence`

Every row includes a `parse_confidence` float (0.0–1.0). Values below 0.7 suggest structural drift: LinkedIn may have changed the page layout, and the parser may have fallen back to HTML heuristics. Treat the field as an early-warning signal and watch it in aggregate across runs.
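
One way to watch the signal in aggregate is to summarize each run's dataset rows. The following is a sketch, assuming every row carries `slug` and `parse_confidence` as listed in the output schema; `confidence_report` is a hypothetical name and the summary fields are a choice made for this example.

```python
from statistics import mean


def confidence_report(rows, threshold=0.7):
    """Summarize parse_confidence across one run's dataset rows."""
    scores = [row["parse_confidence"] for row in rows]
    low = [row["slug"] for row in rows if row["parse_confidence"] < threshold]
    return {
        "mean_confidence": mean(scores) if scores else None,
        "low_confidence_share": len(low) / len(rows) if rows else 0.0,
        "low_confidence_slugs": low,
    }
```

A rising `low_confidence_share` from run to run is a stronger hint of layout drift than a single low row.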

***

### Pricing

**$0.00205 per company record** ($2.05 / 1,000 companies). Pay only for the rows you get.

Scraping 1,000 companies costs approximately **$2.05** in Actor fees. Residential proxy usage is billed by Apify to your own account on top (typically a fraction of a cent per company). This **undercuts the niche leader** (≈$4 / 1,000 companies) while returning a richer flat row: an exact-count integer (vs competitors' bucket strings), company type, size range, specialties, tagline, funding history, a structured `hq` object, `parse_confidence`, and optional featured-employee enrichment.
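
The fee arithmetic can be checked with a few lines. The sketch uses the per-record rate stated in this section and excludes residential proxy usage, which is billed separately; `actor_fee` is a hypothetical helper, and plan discounts are not modeled.

```python
from decimal import Decimal

# Per-record rate as stated in this README; plan discounts are not modeled.
PRICE_PER_RECORD = Decimal("0.00205")


def actor_fee(company_count):
    """Return the Actor fee in USD for a number of company records."""
    return PRICE_PER_RECORD * company_count
```

`Decimal` avoids binary floating-point rounding, so `actor_fee(1000)` compares equal to `Decimal("2.05")`.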

***

### Integrations

Built for sales and market-intelligence teams enriching firmographics with exact employee counts and company details from LinkedIn — the JSON/dataset output drops into the tools you already run, no glue code:

- **n8n / Make / Zapier** — trigger a run or pipe every new dataset item into 500+ apps (Google Sheets, Airtable, Slack, HubSpot, your database) with no code: [n8n](https://docs.apify.com/platform/integrations/n8n), [Make](https://docs.apify.com/platform/integrations/make), [Zapier](https://docs.apify.com/platform/integrations/zapier).
- **Webhooks** — fire your own endpoint the moment a run finishes, to push results straight into your pipeline ([docs](https://docs.apify.com/platform/integrations/webhooks)).
- **MCP server** — expose this actor as a tool to Claude, Cursor, or any [MCP client](https://mcp.apify.com) so an AI agent can pull this data mid-conversation ([guide](https://blog.apify.com/how-to-use-mcp/)).
- **API & SDKs** — fetch the dataset as JSON, CSV, or Excel through the Apify REST API or the Python / JS SDKs.

See all [Apify integrations](https://apify.com/integrations).

### Legal

Scrapes **public company pages only** — no login, no authentication, no data behind a paywall. LinkedIn company pages are publicly accessible to any web browser.

Not affiliated with, endorsed by, or officially connected to LinkedIn Corporation.

***

### Competitors

| Actor | Stars | Employee count | parse\_confidence | Enrich |
|---|---|---|---|---|
| harvestapi/linkedin-company | 4.7★ | Bucket string | No | No |
| This actor | — | **Exact integer** | Yes | Yes (1–4) |

# Actor input Schema

## `companies` (type: `array`):

LinkedIn company slugs, full /company/ URLs, or numeric IDs to scrape. Examples: "microsoft", "https://www.linkedin.com/company/stripe/", "1441". Each item is a string.

## `maxCompanies` (type: `integer`):

Maximum number of companies to scrape. Set to 0 (default) to scrape all companies in the list.

## `enrichEmployees` (type: `boolean`):

When enabled, fetches the LinkedIn profile of each featured employee (1–4 per company) via proxy and adds enriched Person data to the featured\_employees field. Note: enriches only the 1–4 publicly-featured employees on the company home page, NOT the full staff roster (which is auth-walled).

## `proxyConfiguration` (type: `object`):

Proxy settings. Defaults to Apify Residential proxy (buyer-paid). Leave as-is unless you have a specific proxy requirement.

## `debugDumpHtml` (type: `boolean`):

Internal diagnostic. When enabled, stores the raw fetched HTML for each company to the key-value store under RAW\_HTML\_<slug>. Used for parser tuning; leave disabled for normal runs.

## Actor input object example

```json
{
  "companies": [
    "microsoft",
    "stripe",
    "openai"
  ],
  "maxCompanies": 0,
  "enrichEmployees": false,
  "proxyConfiguration": {
    "useApifyProxy": true,
    "apifyProxyGroups": [
      "RESIDENTIAL"
    ]
  },
  "debugDumpHtml": false
}
```

# Actor output Schema

## `results` (type: `string`):

Dataset containing LinkedIn Company Scraper records (company\_name, slug, tagline, description, industry, company\_type, company\_size, employee\_count, employee\_count\_source, specialties, website, hq\_address, follower\_count, founded\_year, funding, parse\_confidence).

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "companies": [
        "microsoft",
        "stripe",
        "openai"
    ]
};

// Run the Actor and wait for it to finish
const run = await client.actor("bovi/linkedin-company-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "companies": [
        "microsoft",
        "stripe",
        "openai",
    ],
}

# Run the Actor and wait for it to finish
run = client.actor("bovi/linkedin-company-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "companies": [
    "microsoft",
    "stripe",
    "openai"
  ]
}' |
apify call bovi/linkedin-company-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=bovi/linkedin-company-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "LinkedIn Company Scraper — Exact Employee Count",
        "description": "Scrape LinkedIn public company pages: exact employee count, company type, size range, specialties, tagline, industry, HQ, website, founded year, funding rounds + investors, and featured employees. No login required. parse_confidence on every row.",
        "version": "0.1",
        "x-build-id": "xhmmID3Fjkdsah6EK"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/bovi~linkedin-company-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-bovi-linkedin-company-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/bovi~linkedin-company-scraper/runs": {
            "post": {
                "operationId": "runs-sync-bovi-linkedin-company-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/bovi~linkedin-company-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-bovi-linkedin-company-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "properties": {
                    "companies": {
                        "title": "Company slugs or URLs",
                        "type": "array",
                        "description": "LinkedIn company slugs, full /company/ URLs, or numeric IDs to scrape. Examples: \"microsoft\", \"https://www.linkedin.com/company/stripe/\", \"1441\". Each item is a string.",
                        "items": {
                            "type": "string"
                        }
                    },
                    "maxCompanies": {
                        "title": "Max companies",
                        "minimum": 0,
                        "type": "integer",
                        "description": "Maximum number of companies to scrape. Set to 0 (default) to scrape all companies in the list.",
                        "default": 0
                    },
                    "enrichEmployees": {
                        "title": "Enrich featured employees",
                        "type": "boolean",
                        "description": "When enabled, fetches the LinkedIn profile of each featured employee (1–4 per company) via proxy and adds enriched Person data to the featured_employees field. Note: enriches only the 1–4 publicly-featured employees on the company home page, NOT the full staff roster (which is auth-walled).",
                        "default": false
                    },
                    "proxyConfiguration": {
                        "title": "Proxy configuration",
                        "type": "object",
                        "description": "Proxy settings. Defaults to Apify Residential proxy (buyer-paid). Leave as-is unless you have a specific proxy requirement.",
                        "default": {
                            "useApifyProxy": true,
                            "apifyProxyGroups": [
                                "RESIDENTIAL"
                            ]
                        }
                    },
                    "debugDumpHtml": {
                        "title": "Debug: dump raw HTML",
                        "type": "boolean",
                        "description": "Internal diagnostic. When enabled, stores the raw fetched HTML for each company to the key-value store under RAW_HTML_<slug>. Used for parser tuning; leave disabled for normal runs.",
                        "default": false
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
