LinkedIn Keyword Posts Monitor

Pricing

from $2.40 / 1,000 post-results


Monitor LinkedIn posts by keyword, hashtag, brand, or topic. Extract matching public posts with text, author, URL, date, engagement counts, matched keywords, and a monitoring score. Built for B2B content research, social listening, competitor tracking, and market intelligence.

Rating: 0.0 (0)

Developer: Delowar Munna (Maintained by Community)

Actor stats: 0 bookmarked · 3 total users · 2 monthly active users · last modified 7 days ago

LinkedIn Keyword Posts Monitor

Monitor LinkedIn posts by keyword, hashtag, brand, or topic and export matching public posts as clean, flat, CSV-ready rows — with author, engagement counts, matched-keyword context, and a lightweight monitoring score. Built for B2B marketers, sales intelligence teams, content researchers, agencies, and competitor monitoring.

No LinkedIn login, no cookies, no session IDs. Discovery runs through Google (site:linkedin.com/posts "<keyword>" via Apify's apify/google-search-scraper); each LinkedIn post URL is then fetched as a public page. You pay one flat event per saved unique post row.
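The query shape described above can be sketched as follows. This is a minimal illustration, not the actor's actual code: the helper names buildDiscoveryQuery and toGoogleSearchUrl are hypothetical, and stripping embedded double quotes is an assumption made so the phrase stays a single quoted token.

```javascript
// Hypothetical helpers; names are illustrative, not the actor's internals.

// Wrap a keyword or hashtag in quotes and scope it to public post URLs.
function buildDiscoveryQuery(keyword) {
  // Assumption: embedded double quotes are dropped to keep one quoted phrase.
  const term = String(keyword).trim().replace(/"/g, "");
  if (!term) throw new Error("keyword must be a non-empty string");
  return `site:linkedin.com/posts "${term}"`;
}

// Form-encode the query in the same shape as the source_url output field.
function toGoogleSearchUrl(query) {
  const params = new URLSearchParams({ q: query, hl: "en" });
  return `http://www.google.com/search?${params.toString()}`;
}

for (const keyword of ["AI agents", "SOC 2", "#fintech"]) {
  console.log(toGoogleSearchUrl(buildDiscoveryQuery(keyword)));
}
```

For the keyword "SOC 2" this produces the same string that appears as source_url in the sample records.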

✨ Why this scraper

  • Keyword-and-hashtag first, monitoring focused — not a generic LinkedIn scraper. Each row carries the keyword that found it, every keyword that matched it (matched_keywords), and a monitoring_score so you can rank what to read.
  • Hashtag inputs in the same field: #fintech and AI agents mix freely in one run.
  • 30 flat fields — post identity, author, company (when visible), engagement counts, match context, derived signals. CSV/Sheets/CRM friendly.
  • Pay-Per-Event — one flat post-result event per saved unique post. Duplicates and filtered rows are never charged.
  • No login / cookies / session IDs — uses public surfaces only.
  • Transparent monitoring score — rule-based (no AI), explained below.

🚀 Quick start — sample inputs

Example 1 — multi-keyword + hashtag, fresh + high-engagement

{
"keywords": ["AI agents", "SOC 2", "#fintech"],
"maxResults": 100,
"sortBy": "recent",
"dateFilter": "pastWeek",
"authorType": "any",
"minReactions": 10,
"minComments": 0,
"includeReposts": false,
"companyOrProfileUrls": [],
"dedupe": true,
"proxyConfiguration": { "useApifyProxy": true }
}

Example 2 — company-scoped monitoring

{
"keywords": ["hiring", "we are hiring", "series a", "launch"],
"maxResults": 50,
"sortBy": "recent",
"dateFilter": "pastMonth",
"authorType": "company",
"minReactions": 0,
"minComments": 0,
"includeReposts": true,
"companyOrProfileUrls": ["https://www.linkedin.com/company/openai/", "https://www.linkedin.com/company/anthropic/"],
"dedupe": true,
"proxyConfiguration": { "useApifyProxy": true }
}

Discovery delegates to apify/google-search-scraper via Actor.call(...). Local runs need APIFY_TOKEN set (or apify login once); platform runs use the run user's token automatically.


📦 Output

One dataset view: LinkedIn posts — a 30-column flat table.

Output fields (30)

search_keyword, matched_keywords, source_constraint_url, post_id, post_url, post_text, post_text_preview, posted_at_text, posted_at_iso, author_name, author_type, author_profile_url, author_headline, company_name, company_url, reactions_count, comments_count, reposts_count, engagement_total, post_type, contains_external_link, external_link_domain, keyword_match_in_text, keyword_match_location, monitoring_score, monitoring_label, reason_tags, scrape_status, source_url, scraped_at.

Sample record — person-authored post

(Real run output; post_text truncated for readability.)

{
"search_keyword": "SOC 2",
"matched_keywords": "SOC 2",
"source_constraint_url": null,
"post_id": "urn:li:activity:7399469516180738049",
"post_url": "https://www.linkedin.com/posts/mycroftmike_soc-2-auditors-be-like-its-actually-soc-activity-7399469516180738049-KPkH",
"post_text": "SOC 2 auditors be like: \"It's actually SOC 2, not SOC2\" Meanwhile, every startup founder: Types \"HIPPA compliance\" into ChatGPT or Google · Writes \"SOC2 certifications\" in their pitch deck · Says \"We're ISO certified\" (Which one? There's a lot)...",
"post_text_preview": "SOC 2 auditors be like: \"It's actually SOC 2, not SOC2\" Meanwhile, every startup founder: Types \"HIPPA compliance\" into ChatGPT or Google · Writes \"SOC2 certifications\" in their pitch deck...",
"posted_at_text": "6mo",
"posted_at_iso": "2025-11-26T15:30:08.126Z",
"author_name": "Mike Kim",
"author_type": "person",
"author_profile_url": "https://www.linkedin.com/in/mycroftmike",
"author_headline": "Compliance & security at Mycroft · ex-auditor",
"company_name": null,
"company_url": null,
"reactions_count": 408,
"comments_count": 99,
"reposts_count": 0,
"engagement_total": 507,
"post_type": "external_link",
"contains_external_link": true,
"external_link_domain": "lnkd.in",
"keyword_match_in_text": true,
"keyword_match_location": "text",
"monitoring_score": 60,
"monitoring_label": "Medium",
"reason_tags": "keyword_in_text,high_engagement,has_comments,external_link",
"scrape_status": "ok",
"source_url": "http://www.google.com/search?q=site%3Alinkedin.com%2Fposts+%22SOC+2%22&hl=en",
"scraped_at": "2026-05-28T01:37:45.672Z"
}

Sample record — company-authored post

{
"search_keyword": "SOC 2",
"matched_keywords": "SOC 2",
"source_constraint_url": null,
"post_id": "urn:li:activity:7424327031087120384",
"post_url": "https://www.linkedin.com/posts/hrvcertpro_soc-2-audit-checklist-evidence-controls-activity-7424327031087120384-Y3at",
"post_text": "SOC 2 audits reward preparation, not last-minute fixes. A well-defined checklist ensures your policies, controls, and audit evidence align with Trust Services Criteria — before auditors ask. Read more: https://lnkd.in/g9RawZD3 #SOC2Compliance #ComplianceChecklist #RiskManagement #DataProtection",
"post_text_preview": "SOC 2 audits reward preparation, not last-minute fixes. A well-defined checklist ensures your policies, controls, and audit evidence align with Trust Services Criteria...",
"posted_at_text": "3mo",
"posted_at_iso": "2026-02-03T05:45:01.511Z",
"author_name": "CertPro",
"author_type": "company",
"author_profile_url": "https://www.linkedin.com/company/hrvcertpro",
"author_headline": null,
"company_name": "CertPro",
"company_url": "https://www.linkedin.com/company/hrvcertpro",
"reactions_count": 2,
"comments_count": 0,
"reposts_count": 0,
"engagement_total": 2,
"post_type": "article_share",
"contains_external_link": true,
"external_link_domain": "lnkd.in",
"keyword_match_in_text": true,
"keyword_match_location": "text",
"monitoring_score": 40,
"monitoring_label": "Low",
"reason_tags": "keyword_in_text,company_author,external_link",
"scrape_status": "ok",
"source_url": "http://www.google.com/search?q=site%3Alinkedin.com%2Fposts+%22SOC+2%22&hl=en",
"scraped_at": "2026-05-28T01:37:45.281Z"
}
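In both sample records the post_id matches the numeric activity segment of post_url. A small sketch of that relationship, assuming the "activity-<digits>" segment is always present in public post URLs (the function name postIdFromUrl is hypothetical):

```javascript
// Hypothetical helper: derive the urn-style post_id from a public post URL.
// Assumption: the URL path contains an "activity-<digits>" segment.
function postIdFromUrl(postUrl) {
  const { pathname } = new URL(postUrl);
  const match = pathname.match(/activity-(\d+)/);
  return match ? `urn:li:activity:${match[1]}` : null;
}

console.log(
  postIdFromUrl(
    "https://www.linkedin.com/posts/hrvcertpro_soc-2-audit-checklist-evidence-controls-activity-7424327031087120384-Y3at"
  )
);
```

URLs without an activity segment return null, which a caller could treat as a missing post_id.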

🎯 Monitoring score

Transparent rule-based score (0–100). No AI, no external enrichment.

| Signal | Points |
| --- | --- |
| Keyword found directly in post_text | +25 |
| Keyword found in a hashtag | +15 |
| engagement_total >= 100 | +20 |
| engagement_total >= 25 (and < 100) | +15 |
| comments_count >= 5 | +10 |
| Author is a company / page | +10 |
| Post falls inside the selected dateFilter window | +10 |
| Post includes an external link | +5 |

Score capped at 100.

Labels: High (80–100) · Medium (50–79) · Low (0–49).

reason_tags is a comma-separated list explaining the score — e.g. keyword_in_text, keyword_in_hashtag, high_engagement, moderate_engagement, has_comments, company_author, recent_post, external_link, repost.
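The scoring rules above can be sketched as a single function. This is an illustration under assumptions, not the actor's source: the input property names (keywordInText, isCompanyAuthor, and so on) are hypothetical, and the pairing of each reason tag with a rule is inferred from the two sample records.

```javascript
// Rule-based score following the points table; input names are hypothetical.
function scorePost(signals) {
  const {
    keywordInText = false, keywordInHashtag = false,
    reactions = 0, comments = 0, reposts = 0,
    isCompanyAuthor = false, inDateWindow = false, hasExternalLink = false,
  } = signals;

  const engagementTotal = reactions + comments + reposts;
  const tags = [];
  let score = 0;
  const award = (points, tag) => { score += points; tags.push(tag); };

  if (keywordInText) award(25, "keyword_in_text");
  if (keywordInHashtag) award(15, "keyword_in_hashtag");
  // The two engagement tiers are mutually exclusive.
  if (engagementTotal >= 100) award(20, "high_engagement");
  else if (engagementTotal >= 25) award(15, "moderate_engagement");
  if (comments >= 5) award(10, "has_comments");
  if (isCompanyAuthor) award(10, "company_author");
  if (inDateWindow) award(10, "recent_post");
  if (hasExternalLink) award(5, "external_link");

  score = Math.min(score, 100);
  const label = score >= 80 ? "High" : score >= 50 ? "Medium" : "Low";
  return {
    engagement_total: engagementTotal,
    monitoring_score: score,
    monitoring_label: label,
    reason_tags: tags.join(","),
  };
}

// Signals taken from the person-authored sample record.
console.log(scorePost({
  keywordInText: true, reactions: 408, comments: 99, reposts: 0, hasExternalLink: true,
}));
```

With the person-authored sample's signals this yields 25 + 20 + 10 + 5 = 60 (Medium), matching the monitoring_score in that record.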


💰 Pricing

Pay-Per-Event. One flat event per saved row (per-event price configured on the Apify console):

| Event | Charged when |
| --- | --- |
| post-result | Once per unique post row that passed all filters and was successfully written to the dataset. |

So your bill is simply results_saved × price_per_event. The actor honours the user-configured per-run spending cap (Apify eventChargeLimitReached): it caps how many results it discovers up-front to what the limit can pay for, and stops cleanly the moment the cap is reached during charging.
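As a rough sketch of the arithmetic above: the price constant below is the listed "from $2.40 / 1,000 post-results" figure, used here only as an assumption (your console may show a different per-event price), and both function names are hypothetical.

```javascript
// Assumption: the listed "from" price; check the Apify console for your actual price.
const PRICE_PER_1000_USD = 2.4;

// Bill = results_saved x price_per_event.
function estimateCostUsd(resultsSaved, pricePer1000 = PRICE_PER_1000_USD) {
  return (resultsSaved * pricePer1000) / 1000;
}

// How many results a per-run spending cap can pay for, bounded by maxResults.
function affordableResults(spendingCapUsd, maxResults, pricePer1000 = PRICE_PER_1000_USD) {
  // The small epsilon guards against floating-point results such as 49.999999999.
  const affordable = Math.floor((spendingCapUsd * 1000) / pricePer1000 + 1e-9);
  return Math.max(0, Math.min(maxResults, affordable));
}

console.log(estimateCostUsd(100).toFixed(2)); // "0.24"
console.log(affordableResults(1, 1000));
```

Under that assumed price, 100 saved rows cost about $0.24, and a $1.00 cap covers at most 416 rows.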

Not charged:

  • Duplicates (deduped by post_id, canonical post_url, and author+text keys; see PRD §9).
  • Rows filtered out by author / engagement / repost / date filters.
  • Failed or authwalled requests.
  • Rows that have none of the identifying keys: no post_url, no post_id, and no post_text + author_name pair.
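A minimal sketch of deduplication on the three keys named above. It assumes matched_keywords is a comma-separated string (as reason_tags is), that URL canonicalisation means dropping the query string and trailing slash, and that rows with no usable key are skipped. The function names are hypothetical.

```javascript
// Build every identity key a row can offer; names and key formats are illustrative.
function dedupeKeys(row) {
  const keys = [];
  if (row.post_id) keys.push(`id:${row.post_id}`);
  if (row.post_url) keys.push(`url:${row.post_url.split("?")[0].replace(/\/+$/, "")}`);
  if (row.author_name && row.post_text) {
    keys.push(`author-text:${row.author_name.toLowerCase()}|${row.post_text.trim().toLowerCase()}`);
  }
  return keys;
}

// Keep the first occurrence of each post and merge keywords from later duplicates.
function dedupeRows(rows) {
  const seen = new Map(); // key -> saved row
  const saved = [];
  for (const row of rows) {
    const keys = dedupeKeys(row);
    if (keys.length === 0) continue; // no identity at all: never saved
    const existing = keys.map((key) => seen.get(key)).find(Boolean);
    if (existing) {
      const merged = new Set([
        ...existing.matched_keywords.split(","),
        ...row.matched_keywords.split(","),
      ]);
      existing.matched_keywords = [...merged].join(",");
      for (const key of keys) seen.set(key, existing);
      continue;
    }
    const copy = { ...row };
    saved.push(copy);
    for (const key of keys) seen.set(key, copy);
  }
  return saved;
}

const rows = [
  { post_id: "urn:li:activity:1", post_url: "https://www.linkedin.com/posts/a_x-activity-1-AbCd", matched_keywords: "SOC 2" },
  { post_id: "urn:li:activity:1", post_url: "https://www.linkedin.com/posts/a_x-activity-1-AbCd?trk=x", matched_keywords: "#fintech" },
];
console.log(dedupeRows(rows));
```

Only rows returned by a step like this would be pushed to the dataset and charged.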

🚦 Proxy policy

Use Apify Datacenter proxy or no proxy for normal runs — both work reliably for the public LinkedIn post pages at this actor's conservative concurrency.

Apify Residential proxy is not supported. The actor will fail at startup if proxyConfiguration.apifyProxyGroups includes RESIDENTIAL. Reason: in pay-per-event actors, residential bandwidth (billed per GB) is charged to the developer, not the run user, so a single bandwidth-heavy run could cost more than the per-result event revenue it generates.

If you genuinely need residential routing, supply your own provider via the proxy editor's Custom proxy URLs field — that traffic goes through your provider, not Apify, and is unaffected:

http://user:pass@proxy.iproyal.com:12321
http://user:pass@proxy.brightdata.com:22225
http://user:pass@proxy.oxylabs.io:7777
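The startup check described in this section might look roughly like the sketch below. The function name and error message are hypothetical; apifyProxyGroups and proxyUrls are the standard fields of an Apify proxy configuration input, and the case-insensitive comparison is an assumption.

```javascript
// Hypothetical startup guard: reject Apify Residential, allow everything else.
function assertProxyAllowed(proxyConfiguration = {}) {
  const groups = proxyConfiguration.apifyProxyGroups ?? [];
  const usesResidential = groups.some((group) => String(group).toUpperCase() === "RESIDENTIAL");
  if (usesResidential) {
    throw new Error(
      "Apify Residential proxy is not supported. Use Datacenter, no proxy, or Custom proxy URLs."
    );
  }
}

// Allowed: Apify Proxy with default groups, and custom provider URLs.
assertProxyAllowed({ useApifyProxy: true });
assertProxyAllowed({ useApifyProxy: false, proxyUrls: ["http://user:pass@proxy.iproyal.com:12321"] });
console.log("proxy configuration accepted");
```

Custom proxy URLs pass the check because that traffic never touches Apify's residential pool.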

📊 Run summary

After each run, a RUN_SUMMARY entry is written to the key-value store:

{
"inputs_total": 3,
"successful_inputs": 2,
"failed_inputs": 1,
"keywords_total": 3,
"raw_results_found": 180,
"results_saved": 100,
"duplicates_removed": 25,
"filtered_out": 55,
"charged_events": 100,
"charge_failures": 0,
"blocked_requests": 2,
"retry_count": 6,
"empty_result_keywords": 0,
"runtime_seconds": 148,
"scraped_at": "2026-05-28T04:00:00.000Z"
}

charged_events equals the number of successfully saved unique rows.
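If you post-process RUN_SUMMARY, a small consistency check can catch billing surprises early. In this sketch only the first rule comes from the statement above; the other two identities are assumptions inferred from the sample numbers (2 + 1 = 3 and 100 + 25 + 55 = 180) and may not hold for every run, for example when blocked requests are skipped. The function name is hypothetical.

```javascript
// Returns a list of human-readable problems; an empty list means the summary is consistent.
function checkRunSummary(summary) {
  const problems = [];
  if (summary.charged_events !== summary.results_saved) {
    problems.push("charged_events should equal results_saved");
  }
  // Assumption inferred from the sample: inputs split into successful and failed.
  if (summary.successful_inputs + summary.failed_inputs !== summary.inputs_total) {
    problems.push("successful_inputs + failed_inputs should equal inputs_total");
  }
  // Assumption inferred from the sample: every raw result is saved, deduped, or filtered.
  const accounted = summary.results_saved + summary.duplicates_removed + summary.filtered_out;
  if (accounted !== summary.raw_results_found) {
    problems.push("saved + duplicates + filtered should equal raw_results_found");
  }
  return problems;
}

console.log(checkRunSummary({
  inputs_total: 3, successful_inputs: 2, failed_inputs: 1,
  raw_results_found: 180, results_saved: 100, duplicates_removed: 25, filtered_out: 55,
  charged_events: 100,
}));
```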


⚙️ Filters

| Filter | Stage | Effect |
| --- | --- | --- |
| authorType | Pre-extraction | any / person / company. Rows with a detected type the opposite of the filter are dropped. unknown is kept (we do not guess by filtering). |
| minReactions | Post-extraction | Drop rows with reactions_count < minReactions (missing counts as 0). |
| minComments | Post-extraction | Drop rows with comments_count < minComments (missing counts as 0). |
| includeReposts | Post-extraction | When false, drop post_type = repost. |
| dateFilter | Post-extraction | any / past24h / pastWeek / pastMonth. Rows with unparseable dates are kept but marked scrape_status = partial. |
| dedupe | Per-keyword merge | When true (default), duplicate posts found by more than one keyword are saved once with merged matched_keywords. |
| companyOrProfileUrls | Discovery scope | Optional list of LinkedIn company/profile URLs. Discovery candidates are kept only if the post URL slug matches one. Empty list = no constraint. |

Filters are applied before any dataset push or event charge.
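The row-level filters can be sketched as one function that returns the row to save or null to drop it. This is an illustration, not the actor's code: the function name is hypothetical, the window lengths (in particular pastMonth as 30 days) are assumptions, and so are the defaults used when an input field is omitted.

```javascript
// Assumed window lengths in milliseconds; pastMonth = 30 days is a guess.
const DATE_WINDOW_MS = {
  past24h: 24 * 60 * 60 * 1000,
  pastWeek: 7 * 24 * 60 * 60 * 1000,
  pastMonth: 30 * 24 * 60 * 60 * 1000,
};

// Returns the row to save (possibly marked partial) or null when it is filtered out.
function applyFilters(row, input, now = Date.now()) {
  const {
    authorType = "any", minReactions = 0, minComments = 0,
    includeReposts = true, dateFilter = "any",
  } = input;

  // Unknown author types are kept; only a detected mismatch is dropped.
  const detectedType = row.author_type ?? "unknown";
  if (authorType !== "any" && detectedType !== "unknown" && detectedType !== authorType) return null;

  // Missing counts are treated as 0.
  if ((row.reactions_count ?? 0) < minReactions) return null;
  if ((row.comments_count ?? 0) < minComments) return null;
  if (!includeReposts && row.post_type === "repost") return null;

  if (dateFilter !== "any") {
    const postedAt = Date.parse(row.posted_at_iso ?? "");
    // Unparseable dates are kept but flagged.
    if (Number.isNaN(postedAt)) return { ...row, scrape_status: "partial" };
    if (now - postedAt > DATE_WINDOW_MS[dateFilter]) return null;
  }
  return row;
}

const row = { author_type: "company", reactions_count: 2, comments_count: 0, post_type: "article_share", posted_at_iso: "2026-02-03T05:45:01.511Z" };
console.log(applyFilters(row, { minReactions: 10 })); // null: below the reactions threshold
```

Because the function runs before any dataset push, a null result means the row is neither saved nor charged.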


🚧 Limitations (V1)

  • Public surfaces only. No login, cookies, or member-only content. LinkedIn does not expose a no-auth post-search endpoint, so discovery runs through Google (site:linkedin.com/posts "<keyword>") via the apify/google-search-scraper actor. Coverage depends on Google's index for linkedin.com/posts.
  • Page authwall. Some LinkedIn post pages, when fetched without cookies, return an authwall HTML even for previously-indexed posts. Those rows are counted as blocked_requests and retried; persistent failures are skipped, not fatal.
  • Best-effort fields. Author identity is parsed from JSON-LD when present, then from og:title / page slug / og:image hints. When LinkedIn returns only minimal authwall metadata, author_type may remain unknown and company_name / company_url may be null.
  • No comment or reaction-user scraping. V1 returns post-level rows only.
  • No media download.
  • maxResults caps saved unique rows across the whole run (not per keyword).
  • No recent ordering at the LinkedIn level. sortBy influences Google's ranking, not LinkedIn's, so true recency depends on Google's index freshness.

❓ FAQ

Do I need a LinkedIn account or cookies? No. Discovery uses Google (site:linkedin.com/posts "<keyword>"); extraction reads each LinkedIn post page over public HTTP.

How does discovery work technically? The actor builds one Google query per keyword and submits them to Apify's apify/google-search-scraper actor in a single call per run. Result URLs are filtered to public LinkedIn post URLs, canonicalised, and deduped across keywords. Each remaining URL is then fetched with Crawlee CheerioCrawler and parsed.
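The "filter, canonicalise, dedupe" step on search results could look roughly like this. The exact canonical form is an assumption (https, the www host, no query string, no trailing slash), and both function names are hypothetical.

```javascript
// Returns a canonical public post URL, or null when the URL is not a LinkedIn post.
function canonicalPostUrl(rawUrl) {
  let url;
  try {
    url = new URL(rawUrl);
  } catch {
    return null; // not a valid absolute URL
  }
  const host = url.hostname.toLowerCase();
  if (host !== "linkedin.com" && !host.endsWith(".linkedin.com")) return null;
  const path = url.pathname.replace(/\/+$/, "");
  if (!path.startsWith("/posts/")) return null;
  // Assumption: country subdomains and tracking parameters are normalised away.
  return `https://www.linkedin.com${path}`;
}

// Dedupe across all keywords by canonical URL, preserving first-seen order.
function collectPostUrls(resultUrls) {
  const unique = new Set();
  for (const raw of resultUrls) {
    const canonical = canonicalPostUrl(raw);
    if (canonical) unique.add(canonical);
  }
  return [...unique];
}

console.log(collectPostUrls([
  "https://www.linkedin.com/posts/hrvcertpro_soc-2-audit-checklist-evidence-controls-activity-7424327031087120384-Y3at?trk=public",
  "https://uk.linkedin.com/posts/hrvcertpro_soc-2-audit-checklist-evidence-controls-activity-7424327031087120384-Y3at/",
  "https://www.linkedin.com/pulse/some-article",
]));
```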

Why are some author_type values unknown? LinkedIn's no-cookie response sometimes contains only minimal Open Graph metadata. When neither JSON-LD, the post URL slug, nor the og:image give a confident person-vs-company signal, author_type is left unknown rather than guessed.

Why is company_name sometimes null even when the author works at a known company? For person-authored posts we only populate company_name when the company is clearly visible in the author headline (e.g. "VP Sales at Acme"). We never visit the author's profile page to enrich it in V1.

Can I scope a run to specific companies or people? Yes — list their LinkedIn URLs in companyOrProfileUrls. Discovery candidates whose post URL slug matches one of those handles are kept; everything else is skipped. The matched constraint URL is stamped on every kept row as source_constraint_url.
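One way to picture the slug matching, as a sketch under assumptions: the handle of a post URL is taken to be the part of the slug before the first underscore, and the handle of a constraint URL is the segment after /company/ or /in/. The function names are hypothetical.

```javascript
// Handle from a constraint URL such as https://www.linkedin.com/company/openai/
function handleFromConstraintUrl(constraintUrl) {
  const match = new URL(constraintUrl).pathname.match(/^\/(?:company|in)\/([^/]+)/);
  return match ? decodeURIComponent(match[1]).toLowerCase() : null;
}

// Handle from a post URL; assumption: the slug is "<handle>_<title>-activity-<id>-<suffix>".
function handleFromPostUrl(postUrl) {
  const match = new URL(postUrl).pathname.match(/^\/posts\/([^_/]+)_/);
  return match ? decodeURIComponent(match[1]).toLowerCase() : null;
}

// Decide whether a candidate is kept and which constraint URL to stamp on it.
function matchConstraint(postUrl, constraintUrls) {
  if (constraintUrls.length === 0) return { keep: true, source_constraint_url: null };
  const postHandle = handleFromPostUrl(postUrl);
  const hit = constraintUrls.find(
    (url) => postHandle !== null && handleFromConstraintUrl(url) === postHandle
  );
  return { keep: Boolean(hit), source_constraint_url: hit ?? null };
}

console.log(matchConstraint(
  "https://www.linkedin.com/posts/hrvcertpro_soc-2-audit-checklist-evidence-controls-activity-7424327031087120384-Y3at",
  ["https://www.linkedin.com/company/openai/", "https://www.linkedin.com/company/hrvcertpro/"]
));
```

With an empty constraint list every candidate is kept and source_constraint_url stays null, as in the sample records.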

Can I export to CSV? Yes — every field is flat. Use Apify's CSV / Excel export, or call the dataset API with format=csv.
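A short sketch of the dataset API route, assuming you already have a dataset ID and an API token. The helper names are hypothetical; the items endpoint with format=csv and the optional fields parameter are part of the Apify API.

```javascript
// Build the dataset items URL for a CSV export, optionally limited to some columns.
function datasetCsvUrl(datasetId, fields = []) {
  const url = new URL(`https://api.apify.com/v2/datasets/${encodeURIComponent(datasetId)}/items`);
  url.searchParams.set("format", "csv");
  if (fields.length > 0) url.searchParams.set("fields", fields.join(","));
  return url.toString();
}

// Download the CSV text; the token is sent as a bearer header.
async function downloadCsv(datasetId, token, fields = []) {
  const response = await fetch(datasetCsvUrl(datasetId, fields), {
    headers: { Authorization: `Bearer ${token}` },
  });
  if (!response.ok) throw new Error(`Dataset export failed: HTTP ${response.status}`);
  return response.text();
}

// "abc123" is a placeholder dataset ID.
console.log(datasetCsvUrl("abc123", ["post_url", "monitoring_score"]));
```

Limiting fields keeps the export narrow when only a few of the 30 columns are needed downstream.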

Will I get blocked? Concurrency is min=1 / max=5 with retries, session rotation, and randomised user agents. Apify Datacenter Proxy is sufficient for typical runs. For large runs, split keywords across runs or supply your own proxy provider via Custom proxy URLs.

Hashtag vs keyword — same input? Yes. Drop them into the same keywords list. Hashtags keep their # (e.g. "#fintech"); the actor quotes them inside the Google query so they match as literal tokens.


🛠️ Technical notes

  • Stack: Node.js 22 · Apify SDK 3 · Crawlee CheerioCrawler · Cheerio + native fetch. No browser.
  • Discovery: apify/google-search-scraper via Actor.call(...). Requires APIFY_TOKEN for local runs.
  • Extraction: each LinkedIn post URL is fetched directly. Parsing is layered: JSON-LD (SocialMediaPosting / Article) → Open Graph meta tags → visible markup fallback.
  • Concurrency: min=1, max=5 (conservative, LinkedIn is blocking-sensitive).
  • Memory: 1 GB min · 2 GB default · 4 GB max.
  • Proxy: Apify Proxy enabled by default; custom configs accepted; Apify Residential rejected at startup.
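The layered extraction order noted above (JSON-LD first, then Open Graph) can be illustrated with a dependency-free sketch. The actor itself uses Cheerio; the regular expressions here are a simplification for illustration, the use of og:description as the text fallback is an assumption, the visible-markup layer is omitted, and all function names are hypothetical.

```javascript
// Layer 1: find a SocialMediaPosting or Article object in any JSON-LD script block.
function parseJsonLd(html) {
  const pattern = /<script[^>]*type=["']application\/ld\+json["'][^>]*>([\s\S]*?)<\/script>/gi;
  for (const [, body] of html.matchAll(pattern)) {
    try {
      const data = JSON.parse(body);
      const items = Array.isArray(data) ? data : [data];
      const post = items.find(
        (item) => item && ["SocialMediaPosting", "Article"].includes(item["@type"])
      );
      if (post) return post;
    } catch {
      // Malformed JSON-LD block: ignore it and try the next one.
    }
  }
  return null;
}

// Layer 2: read an Open Graph meta tag (assumes property comes before content).
function metaContent(html, property) {
  const pattern = new RegExp(`<meta[^>]*property=["']${property}["'][^>]*content="([^"]*)"`, "i");
  const match = html.match(pattern);
  return match ? match[1] : null;
}

// Try each layer in order and report which one produced the text.
function extractPostText(html) {
  const jsonLd = parseJsonLd(html);
  if (jsonLd?.articleBody) return { text: jsonLd.articleBody, source: "json-ld" };
  const ogDescription = metaContent(html, "og:description");
  if (ogDescription) return { text: ogDescription, source: "open-graph" };
  return { text: null, source: "none" };
}

console.log(extractPostText('<meta property="og:description" content="SOC 2 audits reward preparation">'));
```

Recording the source layer alongside the text makes it easier to explain why some rows end up with scrape_status = partial.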

Local run

cd actor
npm install
# Either: apify login (writes APIFY_TOKEN to your env), or:
# $env:APIFY_TOKEN = "..." (PowerShell) / export APIFY_TOKEN=... (bash)
npm start