VC & PE Intel — Know Your Investor Before You Pitch

Under maintenance

Pricing: from $100.00 / 1,000 VC firm intelligence reports

Before your next fundraise, see what VC and PE firms are saying in public. The actor collects blog posts, best-effort public LinkedIn signals, portfolio companies, and partner emails from 95 India-focused firms. For each firm you get the investment thesis, sentiment, focus areas, and per-post summaries: 79 columns per firm, ready to export as Excel.

Rating: 5.0 (1)

Developer: Charu Somani (maintained by Community)

Actor stats: 0 bookmarked, 4 total users, 0 monthly active users, last modified 4 days ago

VC & PE Firm Intelligence

Apify actor for low-cost VC/PE intelligence from free public sources.

It does not use paid scraper actors, residential proxies, login cookies, or paid LLM calls by default. It produces one dataset row per firm with thesis, focus areas, portfolio signal, team/outreach signal, blog/news summaries, and best-effort LinkedIn public signals.

What changed

  • LinkedIn is best-effort only. Anonymous LinkedIn post scraping is unreliable and often blocked. This actor now collects public company metadata when available and indexed public mentions from Google News RSS.
  • Blog/news collection prefers the firm's own RSS feeds and blog pages before falling back to Google News.
  • Bot-protected or JavaScript-heavy pages use respectful free fallbacks: RSS, common page paths, and Jina Reader text extraction for public pages.
  • Summaries are pure JavaScript extractive summaries. No LLM is required.
  • Groq (an optional LLM provider) is called at most once per firm, and only when an API key is supplied.
  • Actor.charge() is called once per firm through the firm-analyzed pay-per-event (PPE) event and respects the user's run spending limit.
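
To make "pure JavaScript extractive summaries" concrete, here is a minimal sketch of one common approach: score each sentence by the average frequency of its words and keep the top few in their original order. This is an illustration only; the function names, the stopword list, and the scoring rule are assumptions, not the actor's actual implementation.

```javascript
// Illustrative frequency-based extractive summarizer (not the actor's code).
// The stopword list is a small hypothetical sample.
const STOPWORDS = new Set([
  'the', 'a', 'an', 'and', 'or', 'of', 'to', 'in', 'on', 'for', 'with',
  'is', 'are', 'was', 'were', 'it', 'this', 'that', 'as', 'at', 'by', 'be',
]);

// Lowercase the text and keep alphanumeric words that are not stopwords.
function tokenize(text) {
  return (text.toLowerCase().match(/[a-z0-9]+/g) || []).filter((w) => !STOPWORDS.has(w));
}

function summarize(text, maxSentences = 2) {
  // Split after sentence-ending punctuation that is followed by whitespace.
  const sentences = text.split(/(?<=[.!?])\s+/).map((s) => s.trim()).filter(Boolean);
  if (sentences.length <= maxSentences) return sentences.join(' ');

  // Count how often each non-stopword appears in the whole text.
  const freq = new Map();
  for (const w of tokenize(text)) freq.set(w, (freq.get(w) || 0) + 1);

  // Score each sentence by the average frequency of its words.
  const scored = sentences.map((sentence, index) => {
    const words = tokenize(sentence);
    const total = words.reduce((sum, w) => sum + freq.get(w), 0);
    return { sentence, index, score: words.length ? total / words.length : 0 };
  });

  // Keep the best-scoring sentences, then restore document order.
  return scored
    .sort((a, b) => b.score - a.score || a.index - b.index)
    .slice(0, maxSentences)
    .sort((a, b) => a.index - b.index)
    .map((s) => s.sentence)
    .join(' ');
}
```

Because the summary is built only from sentences already present in the post, it cannot introduce claims the firm did not make, which is the main trade-off against an LLM summary.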

Inputs

{
  "firms": ["peakxv", "blume", "elevation"],
  "maxBlogPosts": 10,
  "maxLinkedInPosts": 5,
  "includePortfolio": true,
  "includeLinkedIn": true,
  "includeTeamEmails": true
}

Custom firm:

{
  "name": "Acme Ventures",
  "website": "https://acmevc.com",
  "blogUrl": "https://acmevc.com/blog",
  "portfolioUrl": "https://acmevc.com/portfolio",
  "thesisUrl": "https://acmevc.com/about",
  "linkedinSlug": "acme-ventures"
}
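
Before starting a run with a custom firm, it can help to sanity-check the object locally. The sketch below is a hypothetical helper, not part of the actor: it assumes that name is required, that the URL fields are optional but must be http(s) when present, and that linkedinSlug is a bare company slug rather than a full URL.

```javascript
// Hypothetical pre-flight check for a custom firm object (not part of the actor).
const URL_FIELDS = ['website', 'blogUrl', 'portfolioUrl', 'thesisUrl'];

function validateCustomFirm(firm) {
  const errors = [];

  if (typeof firm.name !== 'string' || firm.name.trim() === '') {
    errors.push('name is required');
  }

  for (const field of URL_FIELDS) {
    const value = firm[field];
    if (value === undefined) continue; // assumption: URL fields may be omitted
    try {
      const { protocol } = new URL(value);
      if (protocol !== 'http:' && protocol !== 'https:') {
        errors.push(`${field} must be http(s)`);
      }
    } catch {
      errors.push(`${field} is not a valid URL`);
    }
  }

  // Assumption: the slug is the bare company identifier, not a full LinkedIn URL.
  if (firm.linkedinSlug !== undefined && !/^[a-z0-9-]+$/i.test(firm.linkedinSlug)) {
    errors.push('linkedinSlug should be a bare slug, not a URL');
  }

  return errors; // an empty array means the object passed every check
}
```

Calling validateCustomFirm on the Acme Ventures object above returns an empty array under these assumptions.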

Output

Each firm gets one flat row with identity, source status, thesis/focus/sentiment fields, portfolio counts, partner names/emails/LinkedIns, 15 blog slots, and 15 LinkedIn/free-signal slots.

Check sourceStatus first. It tells you which sources were actually useful for that firm, for example:

{"blog":"ok:3","thesis":"ok:direct","portfolio":"ok:60","linkedin":"linkedin-news-mentions","team":"ok:2","groq":"disabled"}
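
A short sketch of how a consumer might read this field follows. It assumes, based only on the example above, that values starting with "ok" mark a source that returned usable data and that anything else (such as "linkedin-news-mentions" or "disabled") marks a fallback or a skipped source. The helper name is hypothetical, and the field is accepted either as a JSON string or as an already-parsed object because the export format is not specified here.

```javascript
// Hypothetical helper: split sourceStatus into sources that succeeded and the rest.
function usefulSources(sourceStatus) {
  // Assumption: the column may arrive as a JSON string (e.g. after export) or as an object.
  const status = typeof sourceStatus === 'string' ? JSON.parse(sourceStatus) : sourceStatus;

  const ok = [];
  const other = [];
  for (const [source, value] of Object.entries(status)) {
    // Assumption: an "ok" prefix means the source produced usable data.
    if (String(value).startsWith('ok')) ok.push(source);
    else other.push(source);
  }
  return { ok, other };
}
```

For the example row, this returns blog, thesis, portfolio, and team under ok, with linkedin and groq under other.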

Free-mode limits

  • LinkedIn: reliable scraping usually requires LinkedIn authorization, a paid data provider, or a paid Apify Store actor. This project avoids those costs, so LinkedIn rows are public/indexed signals, not guaranteed recent posts.
  • Bot protection: the actor does not bypass CAPTCHAs or access controls. It falls back to public feeds, public text renderers, and indexed sources.
  • Portfolio pages: many VC sites render logos with messy alt text. The actor filters aggressively, but some cleanup may still be needed for edge-case sites.
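
The fallback order described under bot protection (feeds first, then common page paths, then a public text renderer) can be sketched as an ordered list of candidate URLs. The path lists below are hypothetical examples, not the actor's real candidates; the only external detail relied on is that Jina Reader serves a text rendering of a public page when the page URL is appended to https://r.jina.ai/.

```javascript
// Hypothetical candidate paths; the actor's real lists are not documented here.
const FEED_PATHS = ['/feed', '/rss', '/rss.xml', '/feed.xml', '/atom.xml'];
const PAGE_PATHS = ['/blog', '/insights', '/news'];

// Build the ordered list of URLs to try for one firm website.
function fallbackCandidates(website) {
  const origin = new URL(website).origin;

  // 1. RSS/Atom feeds are the cheapest and least intrusive source.
  const feeds = FEED_PATHS.map((path) => ({ kind: 'rss', url: origin + path }));

  // 2. Common blog/news page paths, fetched directly.
  const pages = PAGE_PATHS.map((path) => ({ kind: 'page', url: origin + path }));

  // 3. The same pages through Jina Reader, for JavaScript-heavy sites.
  const reader = pages.map((page) => ({ kind: 'reader', url: `https://r.jina.ai/${page.url}` }));

  return [...feeds, ...pages, ...reader];
}
```

A caller would try each candidate in order and stop at the first one that returns usable content, which keeps requests to a firm's site to a minimum.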

Run locally

npm install --omit=dev --omit=optional
npm start

For local Apify input, edit storage/key_value_stores/default/INPUT.json.