# LinkedIn Top Content & Top Voices Scraper (`logiover/linkedin-top-content-scraper`) Actor

Scrapes LinkedIn's public Top Content directory to extract curated high-engagement posts and Top Voice influencers across 40+ categories. Get post text, author profiles, follower counts, reaction metrics, and Top Voice badges. No login, no cookies, no account ban risk. $2 per 1,000 posts.

- **URL**: https://apify.com/logiover/linkedin-top-content-scraper.md
- **Developed by:** [Logiover](https://apify.com/logiover) (community)
- **Categories:** Social media, Lead generation, Automation
- **Stats:** 2 total users, 1 monthly user, 100.0% runs succeeded
- **User rating**: No ratings yet

## Pricing

from $2.00 / 1,000 results

This Actor is paid per event. You are not charged for Apify platform usage, only a fixed price for specific events.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).
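
As a rough illustration of the Python route, the sketch below starts this Actor through `apify-client` and streams the resulting dataset. `build_run_input` and `run_and_fetch` are hypothetical helper names chosen for this example; the input fields are taken from the input table in the README, and `token` is assumed to be a valid Apify API token.

```python
from typing import Any, Iterator

ACTOR_ID = "logiover/linkedin-top-content-scraper"


def build_run_input(category: str, only_top_voices: bool = False,
                    max_posts: int = 500) -> dict[str, Any]:
    """Assemble an input object using field names from the input table."""
    return {
        "mode": "category",
        "category": category,
        "onlyTopVoices": only_top_voices,
        "maxPosts": max_posts,
    }


def run_and_fetch(token: str, run_input: dict[str, Any]) -> Iterator[dict[str, Any]]:
    """Start the Actor, wait for it to finish, and yield dataset items."""
    # Imported lazily so build_run_input works without the package installed.
    from apify_client import ApifyClient

    client = ApifyClient(token)
    run = client.actor(ACTOR_ID).call(run_input=run_input)
    if run is None:
        raise RuntimeError("Actor run did not return a result")
    yield from client.dataset(run["defaultDatasetId"]).iterate_items()
```

A call such as `list(run_and_fetch(token, build_run_input("sales", only_top_voices=True)))` would then return the saved posts as plain dictionaries.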


# README

## 🎯 LinkedIn Top Content & Top Voices Scraper — No Login Required

**Discover the highest-engagement LinkedIn posts and find verified Top Voice influencers across 40+ topics — without login, cookies, or risking your LinkedIn account.**

This scraper extracts data from LinkedIn's official **Top Content directory** — a curated, hand-picked archive of the platform's best-performing posts organized by category. Built for B2B content marketers, influencer marketing teams, sales prospecting, lead generation, market research, and competitive content intelligence. **$2.00 per 1,000 posts.** Free 50 posts per run. Pure HTTP, no browser, no proxy required.

---

### 🚀 Why this scraper?

| Feature | This actor | Most LinkedIn scrapers |
|---|---|---|
| Login required | ❌ No | ✅ Yes |
| Cookies needed | ❌ No | ✅ Yes |
| Risk of LinkedIn account ban | ❌ None | ⚠️ High |
| Top Voice badge filter | ✅ Yes | ❌ No |
| Engagement metrics (reactions + comments) | ✅ Yes | ⚠️ Partial |
| Topic-based discovery | ✅ 40+ categories, 1,000+ topics | ❌ Manual URL only |
| Author follower counts | ✅ Yes | ⚠️ Auth required |
| Multi-locale support | ✅ 9 languages | ⚠️ English only |
| Pure HTTP (no browser) | ✅ Fast | ❌ Slow Playwright |
| Auto-deduplication | ✅ By post ID | ❌ Manual |
| Price per 1,000 posts | **$2.00** | $5–15 |

**Used by:** B2B SaaS content teams, growth marketers, influencer marketing agencies, sales prospecting tools, market research firms, AI training data engineers, social listening platforms, content marketing studios, PR agencies, M&A scouts.

---

### 💎 What makes this different — The Top Voice signal

Most LinkedIn scrapers give you posts. This one gives you **posts + verified influencer signal**.

LinkedIn's **Top Voice** badge is the platform's official premium influencer designation — earned through consistent high-quality posting and engagement, not paid. It's the closest thing LinkedIn has to an "expert verification" mark, and it's curated category-by-category. A "Top Voice in Marketing" is recognized by LinkedIn's editorial team as a credible thought leader specifically in marketing.

**This scraper detects Top Voices in 9 languages** (English, Turkish, Spanish, French, German, Portuguese, Italian, Dutch, Polish), captures the badge state in `isTopVoice: true/false` for every post, and lets you filter to *only* Top Voices in a single click. No other LinkedIn scraper on Apify does this.

For influencer marketing teams alone, this filter is a $5,000-tool replacement. Modash, Upfluence, BuzzSumo all charge thousands per month for what amounts to "find me category-relevant LinkedIn influencers." This actor delivers the equivalent at pay-per-result pricing.

---

### 💡 What you can do with this data

#### 1. **Discover LinkedIn Top Voices in your niche**
Filter `onlyTopVoices: true` on any category and instantly get a curated list of verified thought leaders. Use for influencer outreach, podcast guest sourcing, advisory board discovery, conference speaker scouting, content collaboration, or executive thought-leader hiring. Each result includes the influencer's profile URL, follower count, bio, and a high-engagement sample post — everything you need to qualify and reach out.

#### 2. **Find what's actually working in B2B content**
LinkedIn's Top Content directory is the platform's algorithmic ranking of highest-engagement posts in each topic. Scraping it gives you a real-time benchmark of which post formats, hooks, hashtags, and themes are driving conversation in your industry — without guessing. Top-performing posts in your category become a continuously refreshed swipe file for your content team.

#### 3. **Build a lead list of active LinkedIn voices in your TAM**
Each post includes the author's profile URL, headline, follower count, and bio. Filter by topic and a minimum reaction threshold (e.g., `minReactions: 100`), then export a contact list of professionals actively publishing in your target vertical. The input table lists no follower-count filter, so apply any follower cutoff to `authorFollowerCount` after the run. These are warm leads: already engaged with your topic, already publishing publicly, already accustomed to professional outreach. Pipe directly into Apollo, HubSpot, Salesforce, or Outreach.
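
Since the Actor's inputs filter on reactions rather than followers, a follower cutoff can be applied client-side once the dataset is downloaded. The sketch below collapses posts into one row per author and keeps that author's best-performing post; `build_lead_list` and the `samplePostUrl` key are hypothetical names for this example, while the other field names come from the output table.

```python
def build_lead_list(posts: list[dict], min_followers: int = 0) -> list[dict]:
    """Return one row per author, keeping the author's highest-reaction post."""
    best: dict[str, dict] = {}
    for post in posts:
        handle = post.get("authorHandle")
        followers = post.get("authorFollowerCount") or 0  # count may be missing
        if not handle or followers < min_followers:
            continue
        current = best.get(handle)
        if current is None or (post.get("reactionCount") or 0) > (current.get("reactionCount") or 0):
            best[handle] = post
    return [
        {
            "authorName": p.get("authorName"),
            "authorProfileUrl": p.get("authorProfileUrl"),
            "authorBio": p.get("authorBio"),
            "authorFollowerCount": p.get("authorFollowerCount"),
            "samplePostUrl": p.get("postUrl"),
        }
        for p in best.values()
    ]
```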

#### 4. **Industry trend research and conversation tracking**
Track what topics are gaining traction across Marketing, AI, Leadership, Sales, Finance, etc. Run on a schedule (daily/weekly) and watch which categories see the highest engagement growth — a leading indicator for emerging industry conversations. Compare quarter-over-quarter to detect rising interest in topics like "AI agents," "RevOps," or "ESG reporting" before they hit mainstream awareness.

#### 5. **AI training data for content generation models**
Building an AI tool that writes B2B social posts? This scraper assembles a clean, structured corpus of high-engagement post text + author metadata + engagement scores — perfect for fine-tuning content generators or RAG pipelines focused on B2B social copy. Each record is a self-contained document with `postContent` (full text), `reactionCount` (engagement signal), and topic taxonomy.

#### 6. **Competitor content benchmarking**
What does great content in your category look like? Scrape every post in your category (`mode: "category"` returns ~500-1500 curated posts) and analyze average reaction counts, common topics, post lengths, hashtag usage, and author types. Set realistic engagement targets for your own content team based on real category data, not vanity metrics from your own bubble.
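
The kind of category benchmark described here can be computed from the exported records with the standard library alone. This is a minimal sketch, assuming each record carries the `reactionCount` and `postContent` fields from the output table; `benchmark` and its result keys are illustrative names.

```python
from statistics import mean, median


def benchmark(posts: list[dict]) -> dict[str, float]:
    """Summarize reaction counts and post lengths for a set of posts."""
    reactions = [p["reactionCount"] for p in posts if p.get("reactionCount") is not None]
    lengths = [len(p["postContent"]) for p in posts if p.get("postContent")]
    if not reactions:
        return {}  # nothing to benchmark against
    return {
        "posts": len(reactions),
        "meanReactions": mean(reactions),
        "medianReactions": median(reactions),
        "meanPostLength": mean(lengths) if lengths else 0,
    }
```

The median is reported alongside the mean because a few viral posts can pull the average well above what a typical post in the category achieves.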

#### 7. **Top Voice tracking and influencer relationship management**
Run on a schedule and track which authors keep appearing across multiple topic pages — these are the platform's most consistently impactful creators. Build an influencer watchlist, monitor when new Top Voices emerge in your category, and identify rising stars before they're saturated with brand partnerships.

#### 8. **Content idea engine**
Stuck on what to post? Scrape the top posts from your topic category, study the hooks (first line) and formats (carousel, single-image, video, text-only), and reverse-engineer winning content angles. Feed the scraped text into an AI summarizer to surface trending themes, emerging memes, and underexplored sub-topics in your space.

#### 9. **PR and media outreach prospecting**
Need to pitch a story to journalists who cover your industry? Or find media voices to comment on a press release? Scrape the relevant topic (`journalism`, `writing`, `communication`) and filter by Top Voice status — instantly get a list of credible voices already publishing on the topic. Their LinkedIn engagement metrics tell you who has actual reach.

#### 10. **Market research for VCs and M&A**
Track which founders, executives, and operators are most influential in a given vertical. Use the data to identify candidates for advisory roles, board seats, or as targets for diligence interviews. Founders running viral LinkedIn content in HR-tech are visible operators worth knowing — this scraper surfaces them in minutes.

---

### 📦 Output fields

Every post record includes:

| Field | Type | Description | Example |
|-------|------|-------------|---------|
| `activityId` | string | LinkedIn numeric activity ID | `"7442833215054266369"` |
| `activityUrn` | string | Full LinkedIn URN | `"urn:li:activity:7442833215054266369"` |
| `postUrl` | string | Direct LinkedIn post URL | `"https://www.linkedin.com/posts/vikaschawla_..."` |
| `authorName` | string | Author display name | `"Vikas Chawla"` |
| `authorHandle` | string | LinkedIn URL slug | `"vikaschawla"` |
| `authorProfileUrl` | string | Author's LinkedIn profile URL | `"https://in.linkedin.com/in/vikaschawla"` |
| `authorAvatar` | string | Profile photo URL | `"https://media.licdn.com/dms/image/..."` |
| `authorBio` | string | Author headline / professional bio | `"Helping large consumer brands drive business outcomes via Digital & AI..."` |
| `authorFollowerCount` | number | Author follower count (parsed integer) | `64193` |
| `isTopVoice` | boolean | Whether author has the Top Voice badge | `true` |
| `authorBadge` | string | Specific badge label when present | `"Thought Leader"` |
| `postAge` | string | Relative post age (locale-aware) | `"1mo"`, `"3w"`, `"6h"` |
| `postContent` | string | Full post body text | `"Amazon's $68 billion ad machine now has access..."` |
| `reactionCount` | number | Total reaction count (parsed) | `7094` |
| `commentCount` | number | Total comment count (parsed) | `342` |
| `imageUrls` | array | Embedded image URLs in post | `["https://media.licdn.com/..."]` |
| `linkedCompanies` | array | Companies tagged in post body | `[{"slug": "amazon", "name": "Amazon"}]` |
| `topicCategory` | string | Top-level category slug | `"marketing"` |
| `topicSubcategory` | string | Sub-category slug (if applicable) | `"social-media-engagement-tactics"` |
| `topicLeaf` | string | Leaf topic slug (if applicable) | `"conscious-social-media-practices-for-professionals"` |
| `topicTitle` | string | Topic page heading | `"Conscious Social Media Practices for Professionals"` |
| `topicUrl` | string | Source topic page URL | `"https://www.linkedin.com/top-content/marketing/..."` |
| `scrapedAt` | string | ISO timestamp of scrape | `"2026-05-07T14:00:00.000Z"` |

---

### ⚙️ Input configuration

| Field | Type | Default | Description |
|-------|------|---------|-------------|
| `mode` | string | `"category"` | Scraping strategy: `"directory"` (full crawl, all 40+ categories), `"category"` (one category, ~500-1500 posts), `"url-list"` (specific URLs only) |
| `category` | string | `"marketing"` | Category slug for `mode="category"`. See full list below. |
| `startUrls` | array | `[]` | For `mode="url-list"`. Array of LinkedIn `/top-content/` URLs to scrape directly. |
| `maxPosts` | integer | `500` | Max posts to save (deduplicated by activity ID). `0` = unlimited. |
| `maxDepth` | integer | `4` | Crawl depth. `2` = categories only (~360 posts). `3` = + subcategories (~3K posts). `4` = + leaf topics (full crawl, ~30K posts). |
| `onlyTopVoices` | boolean | `false` | Filter to authors with Top Voice / Thought Leader / Influencer badge only. The killer feature. |
| `minReactions` | integer | `0` | Filter out posts with fewer than this many reactions. Useful for surfacing only the highest-engagement content. |
| `locale` | string | `"en"` | Page rendering language: `en`, `tr`, `es`, `fr`, `de`, `pt`, `it`, `nl`, `pl`. Post content is always in original language. |

#### Available categories

`marketing` · `artificial-intelligence` · `career` · `leadership` · `sales` · `finance` · `technology` · `recruitment-hr` · `productivity` · `communication` · `customer-experience` · `training-development` · `innovation` · `project-management` · `business-strategy` · `consulting` · `engineering` · `science` · `economics` · `negotiation` · `change-management` · `corporate-social-responsibility` · `workplace-trends` · `organizational-culture` · `supply-chain-management` · `writing` · `employee-experience` · `hospitality-tourism` · `networking` · `ecommerce` · `education` · `user-experience` · `soft-skills-emotional-intelligence` · `design` · `real-estate` · `retail-merchandising` · `future-of-work` · `fundraising` · `healthcare` · `event-planning`

Most categories have 5–30 sub-topics, though the largest have far more (e.g., Marketing has 91, Career has 67), and each sub-topic has 5–15 leaf topics. Total scrape-able post universe across the entire directory: ~25,000–35,000 unique high-engagement posts and ~5,000–8,000 unique authors.
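
A pre-flight check along these lines can catch typos before a paid run starts. It is a minimal sketch based only on the input table above: `validate_input` is a hypothetical helper, and the rule that `maxDepth` must be 2, 3, or 4 is inferred from the documented values rather than confirmed by the Actor.

```python
VALID_MODES = {"directory", "category", "url-list"}
VALID_LOCALES = {"en", "tr", "es", "fr", "de", "pt", "it", "nl", "pl"}


def validate_input(cfg: dict) -> list[str]:
    """Return a list of problems; an empty list means the input looks sane."""
    problems = []
    mode = cfg.get("mode", "category")
    if mode not in VALID_MODES:
        problems.append(f"unknown mode: {mode!r}")
    if mode == "url-list":
        urls = cfg.get("startUrls", [])
        if not urls:
            problems.append("url-list mode needs at least one startUrls entry")
        for url in urls:
            if "/top-content/" not in url:
                problems.append(f"not a /top-content/ URL: {url}")
    if cfg.get("locale", "en") not in VALID_LOCALES:
        problems.append(f"unsupported locale: {cfg['locale']!r}")
    # Assumption: only the three documented depths are accepted.
    if not 2 <= cfg.get("maxDepth", 4) <= 4:
        problems.append("maxDepth should be 2, 3, or 4")
    for key in ("maxPosts", "minReactions"):
        if cfg.get(key, 0) < 0:
            problems.append(f"{key} must not be negative")
    return problems
```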

---

### 💡 Example inputs

#### Find Top Voices in AI for influencer outreach
```json
{
  "mode": "category",
  "category": "artificial-intelligence",
  "onlyTopVoices": true,
  "maxPosts": 200
}
````

Returns up to 200 posts by Top Voice authors in AI, each with the author's profile URL and a top-performing sample post.

#### Build a Marketing content swipe file (high engagement only)

```json
{
  "mode": "category",
  "category": "marketing",
  "minReactions": 500,
  "maxPosts": 1000
}
```

Captures only marketing posts that crossed 500 reactions — your category's actually-working content.

#### Full directory crawl for Top Voice discovery

```json
{
  "mode": "directory",
  "onlyTopVoices": true,
  "maxDepth": 3,
  "maxPosts": 5000
}
```

Crawls all 40+ categories at depth 3 and keeps only posts by Top Voice authors, up to the 5,000-post cap. Maximum-coverage influencer database build.

#### Specific topic deep-dive for content research

```json
{
  "mode": "url-list",
  "startUrls": [
    "https://www.linkedin.com/top-content/marketing/social-media-engagement-tactics/",
    "https://www.linkedin.com/top-content/marketing/content-marketing-strategy/",
    "https://www.linkedin.com/top-content/marketing/influencer-marketing/"
  ],
  "maxPosts": 100
}
```

Surgical scrape of three specific sub-topics — minimum cost, maximum precision.

#### Top sales influencers (high reaction threshold)

```json
{
  "mode": "category",
  "category": "sales",
  "onlyTopVoices": true,
  "minReactions": 1000,
  "maxPosts": 300
}
```

Only Top Voices in Sales whose posts cross 1,000 reactions — the elite of B2B sales LinkedIn.

#### Career & HR thought leaders (recruiter use case)

```json
{
  "mode": "category",
  "category": "recruitment-hr",
  "onlyTopVoices": true,
  "maxPosts": 500
}
```

Recruiter / talent leader influencer database for employer branding partnerships.

#### AI content training corpus

```json
{
  "mode": "directory",
  "minReactions": 200,
  "maxPosts": 10000,
  "maxDepth": 4
}
```

Cross-category corpus of high-engagement posts for fine-tuning a B2B content LLM.

***

### 📊 Output sample

```json
{
  "activityId": "7442833215054266369",
  "activityUrn": "urn:li:activity:7442833215054266369",
  "postUrl": "https://www.linkedin.com/posts/vikaschawla_amazons-68-billion-ad-machine-now-has-access-activity-7442833215054266369-MUOk",
  "authorName": "Vikas Chawla",
  "authorHandle": "vikaschawla",
  "authorProfileUrl": "https://in.linkedin.com/in/vikaschawla",
  "authorAvatar": "https://media.licdn.com/dms/image/v2/.../profile-displayphoto",
  "authorBio": "Helping large consumer brands drive business outcomes via Digital & AI. A Founder, Author, Angel Investor, Speaker & LinkedIn Top Voice",
  "authorFollowerCount": 64193,
  "isTopVoice": true,
  "authorBadge": null,
  "postAge": "1mo",
  "postContent": "Amazon's $68 billion ad machine now has access to 190 million Netflix viewers. Here's what it means for advertisers...",
  "reactionCount": 710,
  "commentCount": 29,
  "imageUrls": ["https://media.licdn.com/dms/image/..."],
  "linkedCompanies": [
    {"slug": "amazon", "name": "Amazon"},
    {"slug": "netflix", "name": "Netflix"}
  ],
  "topicCategory": "marketing",
  "topicSubcategory": null,
  "topicLeaf": null,
  "topicTitle": "Marketing",
  "topicUrl": "https://www.linkedin.com/top-content/marketing/",
  "scrapedAt": "2026-05-07T14:00:00.000Z"
}
```

***

### 💰 Pricing

Pay-per-event model. **Pay only for posts actually saved — deduplicated, filtered, and ready to use.**
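
The smaller estimates below are consistent with a simple rule: the first 50 saved posts in a run are free and the remainder is billed at the tier's per-1,000 rate. The sketch assumes exactly that, including that the free allowance applies on every tier, which this README does not state. `estimate_cost` and the tier keys are illustrative names.

```python
FREE_POSTS_PER_RUN = 50
# Per-1,000 prices from the subscription tier table ("free" = Free / Starter).
PRICE_PER_1000 = {"free": 2.00, "bronze": 1.80, "silver": 1.60, "gold": 1.30}


def estimate_cost(saved_posts: int, tier: str = "free") -> float:
    """Estimate the charge in USD for one run that saves `saved_posts` posts."""
    billable = max(0, saved_posts - FREE_POSTS_PER_RUN)
    return round(billable * PRICE_PER_1000[tier] / 1000, 2)
```

Under these assumptions `estimate_cost(500)` gives `0.9` and `estimate_cost(1000)` gives `1.9`, matching the corresponding rows of the volume table.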

| Volume | Estimated cost |
|--------|---------------|
| 50 posts | **FREE** (every run) |
| 100 posts | $0.10 |
| 500 posts | $0.90 |
| 1,000 posts | $1.90 |
| 5,000 posts | $9.90 |
| 10,000 posts | $19.90 |
| Full directory (~30K posts) | $59.90 |

| Subscription tier | Effective price per 1,000 posts |
|---|---|
| Free / Starter | $2.00 |
| Bronze | $1.80 |
| Silver | $1.60 |
| Gold | $1.30 |

**Cost comparison vs alternatives:**

- This scraper at scale: **$60 for 30,000 posts**
- BuzzSumo Pro: **$199/month** for ~5,000 posts of comparable depth
- Modash: **$120/month** for influencer discovery (single platform)
- Sales Navigator: **$99/month** per seat, no bulk export
- Manual research: **40+ hours** of analyst time

***

### ⚡ Performance

- **Pure HTTP, no browser** — server-rendered HTML parsed with Cheerio. 10× faster than Playwright-based scrapers.
- **No login or cookies** — uses LinkedIn's public Top Content directory endpoints.
- **No residential proxy required** for most workloads; Apify Datacenter proxy is sufficient.
- **9-10 posts per page** with full author and engagement data.
- **Throughput**: ~1,000–2,000 posts per minute.
- **Auto-deduplication** by `activityId` — same post appearing on multiple topic pages is counted once.
- **Anti-block**: realistic Chrome desktop browser fingerprint, polite request spacing, automatic 429/403 detection with friendly errors.
- **Memory footprint**: comfortably runs in 256 MB.
- **Multi-locale parsing** for Top Voice detection across 9 languages.

***

### 🔗 Integrations

Export as **JSON**, **CSV**, **Excel**, or **XML**. Connect via:

- **Zapier / Make / n8n** — auto-add new Top Voices to your CRM as warm leads
- **Google Sheets** — live influencer database, refreshed weekly via Apify Schedules
- **Slack / Discord** — daily digest of top posts in your category
- **REST API** — programmatic access from Python, Node.js, any language
- **Airtable / Notion** — visual content swipe file for creative teams
- **LangChain / LlamaIndex** — feed posts into RAG pipelines for AI content generation
- **HubSpot / Salesforce** — enrich leads with LinkedIn engagement signals
- **Apollo / Outreach / SalesLoft** — feed handles into sales sequencer
- **BigQuery / Snowflake / PostgreSQL** — data warehouse for analytics
- **Webhooks** — push every new post to your backend in real time
- **MCP (Model Context Protocol)** — usable by Claude, ChatGPT, and other AI assistants for natural-language scraping

***

### 🆚 LinkedIn Top Content Scraper vs alternatives

#### vs LinkedIn's native Top Content website

The native `linkedin.com/top-content/` interface is great for browsing but offers no bulk export, no filtering by Top Voice, no engagement threshold, no scheduled monitoring, and no API access. This scraper turns the same public data into a queryable, exportable, automatable feed.

#### vs LinkedIn API / Sales Navigator

LinkedIn's official APIs require approved developer access (rare for non-enterprise) and only return your own connections. Sales Navigator costs $99-$149/month per seat and limits exports per month. This scraper uses the public Top Content directory — no approval, no per-seat fees, no usage limits.

#### vs cookie-based LinkedIn scrapers (HarvestAPI, Curious Coder)

Most LinkedIn Apify scrapers require you to provide your `li_at` cookie or session token, putting your account at ban risk. This scraper is **fully anonymous** — never touches your LinkedIn account or any user's. Zero account ban risk, ever.

#### vs paid social listening tools (Brandwatch, Sprinklr, Talkwalker)

Enterprise social listening platforms cost $5,000–$50,000/year with annual contracts. This actor delivers comparable LinkedIn coverage at pay-per-result pricing — typically 100× cheaper for the same data volume, with no annual commitment.

#### vs influencer discovery tools (Modash, Upfluence, BuzzSumo)

Influencer marketing platforms cost $120–$500/month per seat and most lack LinkedIn-specific Top Voice data. This scraper specifically targets LinkedIn's official Top Voice signal — the closest thing to verified influencer status on the platform — at a fraction of the cost.

#### vs other LinkedIn scrapers on Apify

Most existing alternatives focus on profile or company scraping with cookies. This scraper is the **only one targeting the Top Content directory specifically** — a separate, public LinkedIn surface that the others ignore.

***

### ❓ Frequently asked questions

#### Does this require a LinkedIn account or login?

**No.** LinkedIn's Top Content directory is publicly accessible — no login, cookies, session, or account required. Your LinkedIn account is never involved in any way. The scraper accesses pages anyone can view in an incognito browser.

#### Is this legal?

This scraper accesses **only publicly available data** from LinkedIn's public Top Content directory — content explicitly published by LinkedIn for public discovery and engagement. You are responsible for complying with LinkedIn's Terms of Service and applicable privacy laws (GDPR, CCPA) when processing the scraped data.

#### What's a "Top Voice"?

LinkedIn Top Voice is the platform's official badge for credible thought leaders in specific topics. It's earned through consistent high-quality posting and engagement — not paid. Top Voices are LinkedIn's premium influencer signal, ideal for outreach, partnership, and thought-leadership use cases. Each Top Voice is recognized in a specific category (Marketing Top Voice, AI Top Voice, etc.).

#### How is `isTopVoice` detected?

The scraper detects Top Voice status using a combination of (a) LinkedIn's accessibility metadata on the badge icon, (b) the author's bio text (many Top Voices mention it explicitly: "LinkedIn Top Voice in Marketing"), and (c) localized badge labels in 9 languages. This multi-signal approach achieves ~95%+ accuracy across categories.
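
To make the multi-signal idea concrete, here is a simplified sketch of how a badge check and a bio fallback might be combined. It is not the Actor's actual implementation: `detect_top_voice` is a hypothetical function, and only the English label is included because the localized badge strings are not listed in this README.

```python
import re
from typing import Optional

# Only the English label appears in the README; entries for the other eight
# locales would have to be collected from the rendered pages.
BADGE_LABELS = {"en": ("top voice",)}
BIO_PATTERN = re.compile(r"\blinkedin top voice\b", re.IGNORECASE)


def detect_top_voice(badge_text: Optional[str], bio: Optional[str],
                     locale: str = "en") -> bool:
    """Combine a badge-label check with a bio-text fallback."""
    # Fall back to the English label when the locale has no entry.
    labels = BADGE_LABELS.get(locale, ()) + BADGE_LABELS["en"]
    if badge_text and any(label in badge_text.lower() for label in labels):
        return True  # signals (a) and (c): badge metadata or localized label
    return bool(bio and BIO_PATTERN.search(bio))  # signal (b): bio mention
```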

#### How many posts will I get from a category run?

Roughly 500–1,500 posts per category, depending on `maxDepth` and category breadth. Marketing has 91+ sub-topics so it returns more. Event Planning has fewer. Set `maxPosts: 0` for unlimited (capped only by directory size).

#### Will the same post appear multiple times?

The same high-performing post often appears on multiple topic pages (e.g., a post about "AI in Marketing" might surface in both Marketing and AI categories). The scraper auto-deduplicates by `activityId` — each post saved once, regardless of how many topic pages it appears on.
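
The same rule is useful on the client side when merging datasets from several runs, since deduplication inside the Actor only covers a single run. A minimal sketch, with `dedupe_by_activity_id` as a hypothetical helper name:

```python
from typing import Iterable


def dedupe_by_activity_id(posts: Iterable[dict]) -> list[dict]:
    """Keep the first record seen for each activityId, preserving order."""
    seen: set[str] = set()
    unique = []
    for post in posts:
        activity_id = post.get("activityId")
        if activity_id is None or activity_id in seen:
            continue  # skip duplicates and records without an ID
        seen.add(activity_id)
        unique.append(post)
    return unique
```

Keeping the first occurrence means the surviving record carries the topic fields of whichever page was seen first.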

#### Can I scrape posts older than what the directory shows?

LinkedIn's Top Content directory is curated by recency × engagement. Older posts roll off as new ones replace them, generally within 12-18 months. There's no "all-time" archive view — what you see is what's in the current curation window. For historical analysis, run on a schedule and accumulate the dataset over time.

#### Why is `authorFollowerCount` sometimes empty?

LinkedIn shows follower counts for most authors but occasionally hides them for privacy or anti-spam reasons (typically newer accounts under 500 followers). Top Voices almost always have visible counts because LinkedIn surfaces this data prominently for credibility.

#### Can I get author email addresses?

No. LinkedIn never exposes email addresses publicly. For email enrichment, pipe `authorHandle` and `authorName` into a separate enrichment service (Hunter, Apollo, ZoomInfo, etc.). This scraper provides the discovery — enrichment is the next pipeline step.

#### Does the scraper return the full post text?

Yes. `postContent` contains the full post body. Long posts may end with `…more` / `…see more` markers left over from LinkedIn's UI; these are benign and the text before them is complete. The exception is posts longer than ~3,000 characters, which LinkedIn itself may truncate slightly before they reach the page.
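
If the leftover markers get in the way of downstream text processing, they can be trimmed after export. The sketch below assumes the marker sits at the very end of `postContent` and is in English; `strip_truncation_marker` is a hypothetical helper, and localized variants would need their own patterns.

```python
import re

# Matches a trailing "…more" or "…see more" (also with three ASCII dots).
TRUNCATION_MARKER = re.compile(r"\s*(?:…|\.\.\.)\s*(?:see\s+)?more\s*$", re.IGNORECASE)


def strip_truncation_marker(text: str) -> str:
    """Remove a trailing UI truncation marker, leaving other text untouched."""
    return TRUNCATION_MARKER.sub("", text)
```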

#### How fresh is the data?

LinkedIn updates the Top Content directory continuously as engagement metrics change. New high-performing posts appear within hours of going viral. Lower-engagement posts roll off the curation. Schedule daily runs for near-real-time tracking, or weekly for trend analysis.

#### Does this work for non-English LinkedIn?

**Yes.** Use the `locale` parameter (`en`, `tr`, `es`, `fr`, `de`, `pt`, `it`, `nl`, `pl`). Post content is returned in its original language. Author bios and badges are localized to the requested language. Top Voice detection works across all 9 supported locales.

#### Can I monitor a specific Top Voice over time?

Yes — run on a schedule with `mode="category"` and `onlyTopVoices: true`. Compare results week-over-week to track which Top Voices keep surfacing in your target categories. Useful for influencer relationship management and for tracking emerging stars before they're saturated with brand partnerships.
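
The week-over-week comparison can be reduced to set operations on author handles. A minimal sketch, assuming two exported datasets loaded as lists of records; `compare_runs` and its result keys are illustrative names.

```python
def compare_runs(previous: list[dict], current: list[dict]) -> dict[str, set[str]]:
    """Compare Top Voice authors between two runs of the same category."""

    def top_voice_handles(posts: list[dict]) -> set[str]:
        return {
            p["authorHandle"]
            for p in posts
            if p.get("isTopVoice") and p.get("authorHandle")
        }

    prev, curr = top_voice_handles(previous), top_voice_handles(current)
    return {
        "new": curr - prev,        # surfaced this run only
        "dropped": prev - curr,    # no longer in the directory
        "recurring": prev & curr,  # present in both runs
    }
```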

#### What if LinkedIn rate-limits me?

The scraper handles 429 responses gracefully with friendly error messages. For very high-volume runs (10,000+ posts per session), enable Apify's Residential proxy. For most users, free Datacenter proxy is sufficient. The polite 600ms delay between requests keeps you well under LinkedIn's anonymous tolerance threshold.

#### Can I integrate with Make / Zapier / n8n?

**Yes.** All three platforms have native Apify integrations. Set up webhooks on run completion or poll the dataset via the Apify API. Common automations: new Top Voices → Slack alert, posts above 5K reactions → Airtable, viral posts in your category → email digest.

#### Is the output AI-ready / RAG-friendly?

**Yes.** Output is clean structured JSON with consistent field types. Each post is a self-contained document with author metadata + content + engagement metrics — ideal for vector databases (Pinecone, Weaviate, Chroma) and RAG pipelines (LangChain, LlamaIndex). Use `postContent` as the embedding document and `topicCategory` + `isTopVoice` as filter metadata.
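
One way to shape records for a vector store is a plain text-plus-metadata dictionary, which most libraries can ingest with little glue code. This sketch is library-agnostic on purpose; `to_document`, `to_documents`, and the `id`/`text`/`metadata` layout are assumptions for this example, not a format required by any of the tools named above.

```python
def to_document(post: dict) -> dict:
    """Map one scraped record to a text + metadata pair."""
    return {
        "id": post["activityId"],
        "text": post.get("postContent") or "",
        "metadata": {
            "topicCategory": post.get("topicCategory"),
            "isTopVoice": bool(post.get("isTopVoice")),
            "reactionCount": post.get("reactionCount") or 0,
            "postUrl": post.get("postUrl"),
        },
    }


def to_documents(posts: list[dict]) -> list[dict]:
    """Skip records with no body text, since there is nothing to embed."""
    return [to_document(p) for p in posts if p.get("postContent")]
```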

#### What's the rate of completeness for fields?

- `activityId`, `postUrl`, `authorName`, `authorHandle`: ~99%
- `authorBio`, `postContent`, `reactionCount`, `commentCount`: ~95%
- `authorFollowerCount`: ~85% (hidden on some accounts)
- `isTopVoice`: 100% (true or false, never null)
- `imageUrls`: ~60% (only when post has images)
- `linkedCompanies`: ~30% (only when post tags companies)
- `topicSubcategory` / `topicLeaf`: depends on how deep the source page sat in the directory tree

#### How do I find which categories exist?

The full list is in this README above. You can also browse `https://www.linkedin.com/top-content/` to see them all. The scraper accepts the slug from the URL (e.g., `marketing`, `artificial-intelligence`, `recruitment-hr`).
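
To recover the slugs from a directory URL, the path can be split after the `top-content` segment. The sketch assumes the layout `/top-content/{category}/{subcategory}/{leaf}/`, which matches the example URLs in this README but is not otherwise documented here; `parse_topic_url` is a hypothetical helper.

```python
from urllib.parse import urlparse


def parse_topic_url(url: str) -> dict:
    """Split a Top Content URL into category, subcategory, and leaf slugs."""
    parts = [p for p in urlparse(url).path.split("/") if p]
    if "top-content" not in parts:
        raise ValueError(f"not a Top Content URL: {url}")
    slugs = parts[parts.index("top-content") + 1:]
    padded = (slugs + [None, None, None])[:3]  # missing levels become None
    return dict(zip(("topicCategory", "topicSubcategory", "topicLeaf"), padded))
```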

#### Can I find Top Voices outside the official 40+ categories?

The Top Voice badge is awarded category-by-category, but a single author may hold the badge in multiple categories. Run the scraper once per category with `mode="category"` and join the results on `authorHandle` to find authors who appear in your target combinations.

#### Does it support carousel posts and videos?

Yes. All post formats (single-image, carousel, video, text-only, document) are captured. `imageUrls` returns the cover/preview image for each format. Full carousel sub-images and video files are not extracted — for that level of detail, scrape the individual post URL.

#### Can I track the same authors across categories?

Yes. After running across multiple categories, group by `authorHandle` to see which authors appear in multiple topics. Authors appearing in 3+ categories are "cross-vertical influencers" — typically the most powerful for broad campaigns.
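
The grouping step might look like the following, assuming the records from all category runs have been concatenated into one list. `cross_vertical_authors` is a hypothetical helper; the threshold of three categories mirrors the rule of thumb above.

```python
from collections import defaultdict


def cross_vertical_authors(posts: list[dict], min_categories: int = 3) -> dict[str, list[str]]:
    """Map each author handle to its categories, keeping multi-category authors."""
    categories_by_author = defaultdict(set)
    for post in posts:
        handle, category = post.get("authorHandle"), post.get("topicCategory")
        if handle and category:
            categories_by_author[handle].add(category)
    return {
        handle: sorted(cats)
        for handle, cats in categories_by_author.items()
        if len(cats) >= min_categories
    }
```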

#### What payment methods does Apify support?

Credit card, invoicing for enterprise, and platform credits. New users get **$5 free credits** monthly — enough to scrape ~2,500 posts for free. No credit card required to start.

***

### ⚖️ Legal & Compliance

This scraper accesses **only publicly available data** from LinkedIn's official Top Content directory — content explicitly published by LinkedIn for public discovery, engagement, and editorial promotion. No private user data is accessed. No login or authentication is performed. No internal LinkedIn API is exploited.

**You are responsible** for ensuring your specific use of the scraped data complies with:

- **LinkedIn's Terms of Service** — review their Public Information Use policies
- **GDPR** (EU/UK) — lawful basis for processing personal data of EU/UK individuals
- **CCPA** (California) — consumer rights for CA residents
- **Local data protection laws** in any jurisdiction where you operate or where data subjects reside
- **Anti-spam laws** — CAN-SPAM (US), CASL (Canada), GDPR consent (EU) for any outreach use

This scraper is a **general-purpose tool**. The actor author and Apify provide no warranty regarding the legality of any specific use case. When in doubt, consult legal counsel.

**Not affiliated with LinkedIn Corporation.** LinkedIn® and Top Voice® are registered trademarks of LinkedIn Corporation. All trademarks belong to their respective owners.

***

### 🛠️ Technical details

- **Endpoints used**: `linkedin.com/top-content/`, `linkedin.com/top-content/{category}/`, plus all sub-categories and leaf pages
- **Method**: Pure HTTP GET requests, server-rendered HTML responses
- **Parsing**: Cheerio (jQuery-like DOM traversal), no JavaScript execution required
- **Pagination**: N/A — each topic page contains a fixed 9–10 curated posts; depth is via directory tree traversal
- **Authentication**: None. Anonymous public access.
- **Headers**: Realistic Chrome desktop browser fingerprint, locale-aware Accept-Language
- **Concurrency**: Sequential by default to be polite with LinkedIn infrastructure
- **Memory**: Runs comfortably in 256 MB
- **Average post processing time**: ~70ms per post including parsing
- **Top Voice detection**: Multi-signal (sr-only metadata + bio fallback) across 9 locales
- **Tech stack**: Apify SDK v3, Crawlee v3, Cheerio v1, Node.js 20+

***

### 🚦 Getting started in 30 seconds

1. **Click "Try for free"** on this actor's page
2. **Set `mode: "category"` and pick a category** (e.g., `marketing`, `artificial-intelligence`, `sales`)
3. **Optional**: Toggle `onlyTopVoices: true` for influencer-only results
4. **Click "Start"**
5. **Wait ~60 seconds** for the first results
6. **Download** as JSON / CSV / Excel from the Storage tab

No credit card required. First 50 posts per run are always free. Paid usage starts after that, billed monthly via Apify.
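
To budget a run, the free allowance and the listed price can be combined into a rough estimate. This is only a sketch: it assumes the listed rate of $2.00 per 1,000 results applies to every post beyond the first 50 in a run and that nothing else is billed. Actual pay-per-event charges may differ, and the function name is hypothetical.

```python
PRICE_PER_1000_POSTS_USD = 2.00  # from the listing: "from $2.00 / 1,000 results"
FREE_POSTS_PER_RUN = 50          # from the text above: first 50 posts per run are free


def estimate_run_cost_usd(posts: int) -> float:
    """Estimate the charge for one run that saves `posts` posts."""
    if posts < 0:
        raise ValueError("posts must be non-negative")
    billable = max(0, posts - FREE_POSTS_PER_RUN)
    return round(billable * PRICE_PER_1000_POSTS_USD / 1000, 4)


print(estimate_run_cost_usd(500))  # 0.9
```

Under these assumptions, a run with the default `maxPosts` of 500 bills 450 posts, or about $0.90.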

***

### 💬 Support

- **Issues / feature requests**: Open a ticket in the **Issues** tab on this Actor's page
- **Custom scraping needs**: Contact the Actor author for tailored solutions
- **General Apify support**: [help.apify.com](https://help.apify.com)

***

### 🔍 Search keywords

LinkedIn top content scraper, LinkedIn top voices scraper, LinkedIn influencer scraper, LinkedIn influencer discovery, LinkedIn thought leaders, scrape LinkedIn posts, LinkedIn post scraper, LinkedIn engagement scraper, LinkedIn B2B content, LinkedIn content marketing, LinkedIn topic scraper, LinkedIn directory scraper, LinkedIn no login scraper, LinkedIn anonymous scraper, LinkedIn post extractor, LinkedIn content database, LinkedIn high engagement posts, LinkedIn viral posts, LinkedIn post analytics, LinkedIn content intelligence, LinkedIn social listening, LinkedIn outreach list, LinkedIn lead generation, LinkedIn marketing research, LinkedIn AI training data, LinkedIn content trends, LinkedIn category posts, LinkedIn top posts API alternative, LinkedIn Top Voice finder, LinkedIn opinion leader scraper, LinkedIn category influencer, LinkedIn marketing influencer, LinkedIn AI influencer, LinkedIn sales influencer, LinkedIn HR influencer, LinkedIn finance influencer, LinkedIn leadership influencer, LinkedIn content swipe file, LinkedIn benchmark posts, LinkedIn engagement benchmark, LinkedIn industry research, LinkedIn corpus, LinkedIn dataset, LinkedIn JSON export, LinkedIn CSV export, LinkedIn bulk download, LinkedIn export tool, LinkedIn data extraction.

***

**Ready to find Top Voices in your category?** Hit "Try for free" above. First 50 posts are on us. No credit card. No login. No risk.

# Actor input Schema

## `mode` (type: `string`):

How the scraper traverses LinkedIn's Top Content directory. 'directory' = crawl all 40+ categories (largest run, ~30K posts). 'category' = crawl one category and its subtopics (~500-1000 posts). 'url-list' = scrape only the specific top-content URLs you provide (most precise).

## `category` (type: `string`):

For mode='category'. The category slug from LinkedIn's directory. Common values: marketing, artificial-intelligence, career, leadership, sales, finance, technology, hr, productivity, communication, customer-experience, training-development, innovation, project-management, business-strategy. Browse the full list at https://www.linkedin.com/top-content/.

## `startUrls` (type: `array`):

For mode='url-list'. Paste specific LinkedIn Top Content URLs to scrape. Each URL will be scraped without sub-page traversal. Example: https://www.linkedin.com/top-content/marketing/social-media-engagement-tactics/

## `maxPosts` (type: `integer`):

Maximum number of posts to save to the dataset. The scraper deduplicates by post ID across pages, so the same post appearing on multiple topic pages is only counted once. Set to 0 for unlimited.
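
If you merge datasets from several runs, you may want the same behavior on the client side. The sketch below illustrates deduplicating by `activityId` and applying a cap where `0` means unlimited. It illustrates the documented behavior and is not the Actor's own code; the function name `dedupe_and_cap` is hypothetical.

```python
from typing import Iterable, Iterator


def dedupe_and_cap(posts: Iterable[dict], max_posts: int = 0) -> Iterator[dict]:
    """Yield each post once per `activityId`, stopping at `max_posts`.

    `max_posts == 0` means no cap, mirroring the documented input.
    Posts without an `activityId` are skipped.
    """
    seen: set[str] = set()
    for post in posts:
        post_id = post.get("activityId")
        if post_id is None or post_id in seen:
            continue
        seen.add(post_id)
        yield post
        if max_posts and len(seen) >= max_posts:
            return
```

Because the function is a generator, it stops reading its input as soon as the cap is reached.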

## `maxDepth` (type: `integer`):

How deep to traverse the directory tree. 2 = categories only (40 pages, ~360 posts). 3 = + subcategories (~3K posts). 4 = + leaf topics (full crawl, ~30K posts). Only applies to 'directory' and 'category' modes.

## `onlyTopVoices` (type: `boolean`):

When enabled, only returns posts whose authors carry LinkedIn's official 'Top Voice' / 'Thought Leader' badge. Use this for premium influencer discovery or B2B outreach prospect lists where credibility signal matters.

## `minReactions` (type: `integer`):

Filter out posts with fewer than this many reactions. Useful for surfacing only the highest-engagement content. Set to 0 to keep all posts (LinkedIn's directory already curates for engagement, so even '0' returns quality posts).

## `locale` (type: `string`):

Locale for page rendering. LinkedIn returns category names and metadata in the requested language. Post content is always in the original language the author wrote.

## Actor input object example

```json
{
  "mode": "category",
  "category": "marketing",
  "startUrls": [],
  "maxPosts": 500,
  "maxDepth": 4,
  "onlyTopVoices": false,
  "minReactions": 0,
  "locale": "en"
}
```
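
Before starting a run, it can help to check an input object against the constraints published in the schema (the `mode` and `locale` enums, `maxDepth` between 2 and 4, non-negative counts). The following sketch does that in plain Python. The helper name `build_input` is hypothetical, and the rule that `url-list` mode needs at least one URL is an assumption rather than something the schema states.

```python
ALLOWED_MODES = {"directory", "category", "url-list"}
ALLOWED_LOCALES = {"en", "tr", "es", "fr", "de", "pt", "it", "nl", "pl"}


def build_input(mode="category", category="marketing", start_urls=None,
                max_posts=500, max_depth=4, only_top_voices=False,
                min_reactions=0, locale="en") -> dict:
    """Build an input object, rejecting values the published schema disallows."""
    if mode not in ALLOWED_MODES:
        raise ValueError(f"mode must be one of {sorted(ALLOWED_MODES)}")
    if locale not in ALLOWED_LOCALES:
        raise ValueError(f"locale must be one of {sorted(ALLOWED_LOCALES)}")
    if not 2 <= max_depth <= 4:
        raise ValueError("maxDepth must be between 2 and 4")
    if max_posts < 0 or min_reactions < 0:
        raise ValueError("maxPosts and minReactions must be >= 0")
    # Assumption: an empty URL list is pointless in url-list mode.
    if mode == "url-list" and not start_urls:
        raise ValueError("url-list mode needs at least one start URL")
    return {
        "mode": mode,
        "category": category,
        "startUrls": list(start_urls or []),
        "maxPosts": max_posts,
        "maxDepth": max_depth,
        "onlyTopVoices": only_top_voices,
        "minReactions": min_reactions,
        "locale": locale,
    }


run_input = build_input(category="sales", max_posts=100, only_top_voices=True)
```

The resulting dictionary can be passed as `run_input` to the Python client's `call` method.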

# Actor output Schema

## `activityId` (type: `string`):

LinkedIn numeric activity ID

## `postUrl` (type: `string`):

Full LinkedIn post URL

## `authorName` (type: `string`):

Author display name

## `authorHandle` (type: `string`):

LinkedIn URL slug

## `authorProfileUrl` (type: `string`):

LinkedIn profile URL

## `authorBio` (type: `string`):

Author headline / bio

## `authorFollowerCount` (type: `string`):

Author follower count

## `isTopVoice` (type: `string`):

Whether author has Top Voice badge

## `authorBadge` (type: `string`):

Specific badge label

## `postContent` (type: `string`):

Full post body text

## `reactionCount` (type: `string`):

Reaction count

## `commentCount` (type: `string`):

Comment count

## `topicCategory` (type: `string`):

Top-level topic category

## `topicTitle` (type: `string`):

Topic page heading

## `scrapedAt` (type: `string`):

ISO scrape timestamp
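
The schema above declares every field as a string, including counts and the Top Voice flag, so downstream code may need to normalize values before filtering or sorting. The sketch below is defensive on purpose: the exact formats the Actor emits are not documented here, so it accepts native numbers and booleans as well as plain or comma-separated digit strings, and returns `None` for anything else. Both helper names are hypothetical.

```python
import re
from typing import Optional


def parse_count(value) -> Optional[int]:
    """Convert a count such as 1234, "1234" or "1,234" to an int.

    Returns None for empty or non-numeric values. Abbreviated forms
    like "1.2K" are not handled; whether the Actor emits them is unknown.
    """
    if isinstance(value, bool):
        return None
    if isinstance(value, int):
        return value
    if not isinstance(value, str):
        return None
    digits = re.sub(r"[,\s]", "", value)
    return int(digits) if re.fullmatch(r"[0-9]+", digits) else None


def parse_flag(value) -> bool:
    """Interpret a boolean that may arrive as a real bool or as "true"/"false"."""
    if isinstance(value, bool):
        return value
    return str(value).strip().lower() == "true"


def top_voice_posts(items, min_reactions=100):
    """Keep Top Voice posts whose reaction count is known and large enough."""
    return [
        item for item in items
        if parse_flag(item.get("isTopVoice"))
        and (parse_count(item.get("reactionCount")) or 0) >= min_reactions
    ]
```

Treating unparseable counts as zero in `top_voice_posts` is a design choice that drops ambiguous rows instead of raising; adjust it if you would rather inspect them.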

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

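// Prepare Actor input (values mirror the input object example above;
// an empty object would fall back to the schema defaults)
const input = {
    mode: 'category',
    category: 'marketing',
    maxPosts: 500,
    maxDepth: 4,
    onlyTopVoices: false,
    minReactions: 0,
    locale: 'en',
};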

// Run the Actor and wait for it to finish
const run = await client.actor("logiover/linkedin-top-content-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

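# Prepare the Actor input (values mirror the input object example above;
# an empty dict would fall back to the schema defaults)
run_input = {
    "mode": "category",
    "category": "marketing",
    "maxPosts": 500,
    "maxDepth": 4,
    "onlyTopVoices": False,
    "minReactions": 0,
    "locale": "en",
}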

# Run the Actor and wait for it to finish
run = client.actor("logiover/linkedin-top-content-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{}' |
apify call logiover/linkedin-top-content-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=logiover/linkedin-top-content-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "LinkedIn Top Content & Top Voices Scraper",
        "description": "Scrapes LinkedIn's public Top Content directory to extract curated high-engagement posts and Top Voice influencers across 40+ categories. Get post text, author profiles, follower counts, reaction metrics, and Top Voice badges. No login, no cookies, no account ban risk. $2 per 1,000 posts.",
        "version": "0.0",
        "x-build-id": "8p3O6WRQIpnBs7xVP"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/logiover~linkedin-top-content-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-logiover-linkedin-top-content-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/logiover~linkedin-top-content-scraper/runs": {
            "post": {
                "operationId": "runs-sync-logiover-linkedin-top-content-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/logiover~linkedin-top-content-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-logiover-linkedin-top-content-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "properties": {
                    "mode": {
                        "title": "Scraping Mode",
                        "enum": [
                            "directory",
                            "category",
                            "url-list"
                        ],
                        "type": "string",
                        "description": "How the scraper traverses LinkedIn's Top Content directory. 'directory' = crawl all 40+ categories (largest run, ~30K posts). 'category' = crawl one category and its subtopics (~500-1000 posts). 'url-list' = scrape only the specific top-content URLs you provide (most precise).",
                        "default": "category"
                    },
                    "category": {
                        "title": "Category Slug",
                        "type": "string",
                        "description": "For mode='category'. The category slug from LinkedIn's directory. Common values: marketing, artificial-intelligence, career, leadership, sales, finance, technology, hr, productivity, communication, customer-experience, training-development, innovation, project-management, business-strategy. Browse the full list at https://www.linkedin.com/top-content/.",
                        "default": "marketing"
                    },
                    "startUrls": {
                        "title": "Start URLs",
                        "type": "array",
                        "description": "For mode='url-list'. Paste specific LinkedIn Top Content URLs to scrape. Each URL will be scraped without sub-page traversal. Example: https://www.linkedin.com/top-content/marketing/social-media-engagement-tactics/",
                        "default": [],
                        "items": {
                            "type": "string"
                        }
                    },
                    "maxPosts": {
                        "title": "Max Posts",
                        "minimum": 0,
                        "type": "integer",
                        "description": "Maximum number of posts to save to the dataset. The scraper deduplicates by post ID across pages, so the same post appearing on multiple topic pages is only counted once. Set to 0 for unlimited.",
                        "default": 500
                    },
                    "maxDepth": {
                        "title": "Max Crawl Depth",
                        "minimum": 2,
                        "maximum": 4,
                        "type": "integer",
                        "description": "How deep to traverse the directory tree. 2 = categories only (40 pages, ~360 posts). 3 = + subcategories (~3K posts). 4 = + leaf topics (full crawl, ~30K posts). Only applies to 'directory' and 'category' modes.",
                        "default": 4
                    },
                    "onlyTopVoices": {
                        "title": "Top Voices Only",
                        "type": "boolean",
                        "description": "When enabled, only returns posts whose authors carry LinkedIn's official 'Top Voice' / 'Thought Leader' badge. Use this for premium influencer discovery or B2B outreach prospect lists where credibility signal matters.",
                        "default": false
                    },
                    "minReactions": {
                        "title": "Min Reactions",
                        "minimum": 0,
                        "type": "integer",
                        "description": "Filter out posts with fewer than this many reactions. Useful for surfacing only the highest-engagement content. Set to 0 to keep all posts (LinkedIn's directory already curates for engagement, so even '0' returns quality posts).",
                        "default": 0
                    },
                    "locale": {
                        "title": "Page Language",
                        "enum": [
                            "en",
                            "tr",
                            "es",
                            "fr",
                            "de",
                            "pt",
                            "it",
                            "nl",
                            "pl"
                        ],
                        "type": "string",
                        "description": "Locale for page rendering. LinkedIn returns category names and metadata in the requested language. Post content is always in the original language the author wrote.",
                        "default": "en"
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
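
For environments where installing a client library is not an option, the `run-sync-get-dataset-items` path from the specification above can be called over plain HTTP. The following sketch uses only the Python standard library and builds the request without sending it, so it can be inspected first. The function name `build_sync_request` and the sample input values are illustrative.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.apify.com/v2"
ACTOR_PATH = "logiover~linkedin-top-content-scraper"


def build_sync_request(token: str, actor_input: dict) -> urllib.request.Request:
    """Build (but do not send) a POST to run-sync-get-dataset-items."""
    query = urllib.parse.urlencode({"token": token})
    url = f"{API_BASE}/acts/{ACTOR_PATH}/run-sync-get-dataset-items?{query}"
    return urllib.request.Request(
        url,
        data=json.dumps(actor_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_sync_request(
    "<YOUR_API_TOKEN>",
    {"mode": "category", "category": "sales", "maxPosts": 50},
)
print(request.get_method(), request.full_url.split("?")[0])

# To actually run the Actor, send the request:
# with urllib.request.urlopen(request) as response:
#     items = json.load(response)
```

Synchronous endpoints hold the connection open until the run finishes, so they are best suited to small runs; for larger crawls, the `/runs` path starts the run and returns immediately.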
