# VC & PE Intel — Know Your Investor Before You Pitch (`som_coder11/vc-intel-prefundraise`) Actor

Before your next fundraise, know exactly what every VC and PE firm is thinking. Scrapes blogs, LinkedIn posts, portfolio companies, and partner emails from 95 India-focused firms. Get investment thesis, sentiment, focus areas, and per-post summaries — 79 columns per firm, ready to export as Excel.

- **URL**: https://apify.com/som_coder11/vc-intel-prefundraise.md
- **Developed by:** [Charu Somani](https://apify.com/som_coder11) (community)
- **Categories:** Automation, Lead generation, AI
- **Stats:** 4 total users, 0 monthly users, 88.9% runs succeeded
- **User rating**: 5.00 out of 5 stars

## Pricing

From $100.00 per 1,000 VC firm intelligence reports

This Actor is paid per event and usage. You are charged both the fixed price for specific events and for Apify platform usage.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event
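
The listed rate works out to $0.10 per firm report. The snippet below is a minimal sketch of that arithmetic, assuming the base rate of $100.00 per 1,000 reports applies and that one report is charged per firm. The function name `estimate_event_cost` is hypothetical, and Apify platform usage (billed separately) is not modelled.

```python
from decimal import Decimal

# Base rate from the pricing line above: $100.00 per 1,000 firm reports.
PRICE_PER_1000_REPORTS = Decimal("100.00")


def estimate_event_cost(firm_count: int) -> Decimal:
    """Return the estimated pay-per-event charge in USD for `firm_count` firms.

    Assumes one report is charged per firm at the base rate. Apify platform
    usage is billed on top of this and is not included.
    """
    if firm_count < 0:
        raise ValueError("firm_count must be non-negative")
    per_report = PRICE_PER_1000_REPORTS / Decimal(1000)
    return (per_report * firm_count).quantize(Decimal("0.01"))


print(estimate_event_cost(5))   # the five default firms
print(estimate_event_cost(95))  # the 95 firms mentioned in the description
```

Treat the result as a lower bound, since the price is quoted as a "from" rate.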

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action that can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server that can be used as a website, an API, or an MCP server.
Actors are written with a capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## VC & PE Firm Intelligence

Apify Actor for low-cost VC/PE intelligence from free public sources.

By default, it does not use paid scraper Actors, residential proxies, login cookies, or paid LLM calls. It produces one dataset row per firm with thesis, focus areas, portfolio signal, team/outreach signal, blog/news summaries, and best-effort LinkedIn public signals.

### What changed

- LinkedIn is best-effort only. Anonymous LinkedIn post scraping is unreliable and often blocked. This Actor now collects public company metadata when available, plus indexed public mentions from Google News RSS.
- Blog/news collection prefers the firm's own RSS feeds and blog pages before falling back to Google News.
- Bot-protected or JavaScript-heavy pages use respectful free fallbacks: RSS, common page paths, and Jina Reader text extraction for public pages.
- Summaries are pure JavaScript extractive summaries. No LLM is required.
- Groq is optional and used at most once per firm when a key is supplied.
- `Actor.charge()` is called once per firm through the `firm-analyzed` PPE event and respects the user's run spending limit.

### Inputs

```json
{
  "firms": ["peakxv", "blume", "elevation"],
  "maxBlogPosts": 10,
  "maxLinkedInPosts": 5,
  "includePortfolio": true,
  "includeLinkedIn": true,
  "includeTeamEmails": true
}
````

Custom firm:

```json
{
  "name": "Acme Ventures",
  "website": "https://acmevc.com",
  "blogUrl": "https://acmevc.com/blog",
  "portfolioUrl": "https://acmevc.com/portfolio",
  "thesisUrl": "https://acmevc.com/about",
  "linkedinSlug": "acme-ventures"
}
```

### Output

Each firm gets one flat row with identity, source status, thesis/focus/sentiment fields, portfolio counts, partner names/emails/LinkedIns, 15 blog slots, and 15 LinkedIn/free-signal slots.

Check `sourceStatus` first. It tells you which sources were actually useful for that firm, for example:

```json
{"blog":"ok:3","thesis":"ok:direct","portfolio":"ok:60","linkedin":"linkedin-news-mentions","team":"ok:2","groq":"disabled"}
```
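
To make that check programmatic, the values can be split into a status and a detail. The sketch below assumes the `status:detail` shape inferred from the single example above; other value formats may exist. The helper names `parse_source_status` and `usable_sources` are hypothetical.

```python
import json


def parse_source_status(raw) -> dict:
    """Split each `status:detail` value into a (status, detail) tuple.

    Accepts either a JSON string or an already-decoded dict.
    """
    status = json.loads(raw) if isinstance(raw, str) else raw
    parsed = {}
    for source, value in status.items():
        # "ok:3" -> ("ok", "3"); "disabled" -> ("disabled", "")
        state, _, detail = str(value).partition(":")
        parsed[source] = (state, detail)
    return parsed


def usable_sources(raw) -> list[str]:
    """Return the sources whose status is exactly `ok`."""
    return [name for name, (state, _) in parse_source_status(raw).items() if state == "ok"]


example = (
    '{"blog":"ok:3","thesis":"ok:direct","portfolio":"ok:60",'
    '"linkedin":"linkedin-news-mentions","team":"ok:2","groq":"disabled"}'
)
print(usable_sources(example))  # ['blog', 'thesis', 'portfolio', 'team']
```

In this example the LinkedIn value is not `ok`, which signals that the LinkedIn columns for that firm hold indexed news mentions rather than company posts.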

### Free-mode limits

- LinkedIn: reliable scraping usually requires LinkedIn authorization, a paid data provider, or an Apify Store Actor. This project avoids those costs, so LinkedIn rows are public/indexed signals, not guaranteed recent posts.
- Bot protection: the Actor does not bypass CAPTCHAs or access controls. It falls back to public feeds, public text renderers, and indexed sources.
- Portfolio pages: many VC sites render logos with messy alt text. The Actor filters aggressively, but some cleanup may still be needed for edge-case sites.
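
For the portfolio cleanup mentioned in the last item, a small post-processing pass on your side may help. The following is only an illustrative sketch, not the Actor's own filter: the noise words, the length bounds, and the function name `clean_company_names` are all assumptions chosen for the example.

```python
import re

# Hypothetical noise words often found in logo alt text.
NOISE_WORDS = re.compile(r"\b(logo|image|img|icon)\b", re.IGNORECASE)
# Alt text that is really a file name is discarded outright.
FILE_NAME = re.compile(r"\.(png|jpe?g|svg|webp|gif)$", re.IGNORECASE)


def clean_company_names(alt_texts: list[str]) -> list[str]:
    """Return de-duplicated company names extracted from raw alt-text strings."""
    names = []
    seen = set()
    for text in alt_texts:
        candidate = text.strip()
        if FILE_NAME.search(candidate):
            continue
        candidate = NOISE_WORDS.sub(" ", candidate)
        candidate = re.sub(r"\s+", " ", candidate).strip(" -_|")
        # Assumed bounds: very short or very long strings are rarely names.
        if not (2 <= len(candidate) <= 40):
            continue
        key = candidate.lower()
        if key not in seen:
            seen.add(key)
            names.append(candidate)
    return names


print(clean_company_names(["Acme logo", "acme", "hero-banner.png", "", "Example Labs Logo"]))
# ['Acme', 'Example Labs']
```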

### Run locally

```bash
npm install --omit=dev --omit=optional
npm start
```

For local Apify input, edit `storage/key_value_stores/default/INPUT.json`.

# Actor input Schema

## `firms` (type: `array`):

Use built-in keys like peakxv, blume, elevation, 3one4, stellaris, accel, kkr; or pass custom objects with name, website, blogUrl, portfolioUrl, thesisUrl, and linkedinSlug.

## `groqApiKey` (type: `string`):

Optional free-tier enhancement for six intelligence fields. Leave blank for pure JavaScript extractive summaries with zero external LLM calls.

## `maxBlogPosts` (type: `integer`):

How many public posts or news items to collect per firm. Maximum 15 to keep compute low.

## `maxLinkedInPosts` (type: `integer`):

How many LinkedIn company posts to keep when LinkedIn collection is enabled.

## `includePortfolio` (type: `boolean`):

Collect company names from public portfolio pages.

## `includeLinkedIn` (type: `boolean`):

Uses the configured Apify LinkedIn posts actor to fetch recent company-page posts. Disable to avoid downstream LinkedIn actor cost.

## `includeTeamEmails` (type: `boolean`):

Collect public team names, titles, LinkedIn profile links, and visible/guessed work emails where possible.

## `linkedinScraperActorId` (type: `string`):

Default is harvestapi/linkedin-company-posts. Must accept LinkedIn company URLs and return one item per post.

## `linkedinScraperProvider` (type: `string`):

Choose the input schema used by the configured LinkedIn actor.

## `linkedinScraperWaitSecs` (type: `integer`):

How long to wait for the downstream LinkedIn actor run.

## `useLinkedInFallback` (type: `boolean`):

If the paid LinkedIn actor fails or returns no posts, fill LinkedIn columns from Google News indexed mentions instead of leaving them blank. Keep disabled for honest LinkedIn output.

## Actor input object example

```json
{
  "firms": [
    "peakxv",
    "blume",
    "elevation",
    "3one4",
    "stellaris"
  ],
  "maxBlogPosts": 10,
  "maxLinkedInPosts": 5,
  "includePortfolio": true,
  "includeLinkedIn": true,
  "includeTeamEmails": true,
  "linkedinScraperActorId": "harvestapi/linkedin-company-posts",
  "linkedinScraperProvider": "harvestapi",
  "linkedinScraperWaitSecs": 180,
  "useLinkedInFallback": false
}
```
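
Before starting a paid run, it can be worth checking the input locally. The sketch below mirrors the integer bounds and custom-firm fields listed in this document; the function name `validate_input` is hypothetical, and flagging unknown keys on custom firm objects is an assumption (the schema does not say such keys are rejected).

```python
# Bounds copied from the input schema in the OpenAPI specification.
INTEGER_BOUNDS = {
    "maxBlogPosts": (1, 15),
    "maxLinkedInPosts": (1, 15),
    "linkedinScraperWaitSecs": (30, 600),
}
CUSTOM_FIRM_KEYS = {"name", "website", "blogUrl", "portfolioUrl", "thesisUrl", "linkedinSlug"}


def validate_input(actor_input: dict) -> list[str]:
    """Return a list of problems; an empty list means the input looks valid."""
    problems = []
    firms = actor_input.get("firms")
    if not isinstance(firms, list):
        problems.append("firms must be an array")
        firms = []
    for index, firm in enumerate(firms):
        if isinstance(firm, str):
            continue  # built-in key such as "peakxv"
        if isinstance(firm, dict):
            # Assumption: unknown keys are most likely typos, so report them.
            unknown = sorted(set(firm) - CUSTOM_FIRM_KEYS)
            if unknown:
                problems.append(f"firms[{index}] has unknown keys: {unknown}")
            continue
        problems.append(f"firms[{index}] must be a string key or an object")
    for field, (low, high) in INTEGER_BOUNDS.items():
        value = actor_input.get(field)
        if value is None:
            continue  # the Actor applies its default
        is_integer = isinstance(value, int) and not isinstance(value, bool)
        if not (is_integer and low <= value <= high):
            problems.append(f"{field} must be an integer between {low} and {high}")
    return problems


print(validate_input({"firms": ["peakxv", "blume"], "maxBlogPosts": 20}))
# ['maxBlogPosts must be an integer between 1 and 15']
```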

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "firms": [
        "peakxv",
        "blume",
        "elevation",
        "3one4",
        "stellaris"
    ]
};

// Run the Actor and wait for it to finish
const run = await client.actor("som_coder11/vc-intel-prefundraise").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "firms": [
        "peakxv",
        "blume",
        "elevation",
        "3one4",
        "stellaris",
    ],
}

# Run the Actor and wait for it to finish
run = client.actor("som_coder11/vc-intel-prefundraise").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```
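
Because each firm is returned as one flat row, the items can be written to a spreadsheet-friendly file with the standard library alone. This is a minimal sketch: the helper name `rows_to_csv` and the sample column names (`firmName`, `thesis`, `portfolioCount`) are hypothetical placeholders, not the Actor's real column names.

```python
import csv
import io


def rows_to_csv(items: list[dict]) -> str:
    """Serialize flat dataset rows to CSV text, keeping first-seen column order."""
    columns: list[str] = []
    for item in items:
        for key in item:
            if key not in columns:
                columns.append(key)
    buffer = io.StringIO()
    # restval fills columns that a given row does not have.
    writer = csv.DictWriter(buffer, fieldnames=columns, restval="", lineterminator="\n")
    writer.writeheader()
    writer.writerows(items)
    return buffer.getvalue()


# Hypothetical rows; real rows come from the run's default dataset.
sample = [
    {"firmName": "Acme Ventures", "thesis": "B2B SaaS"},
    {"firmName": "Example Capital", "portfolioCount": 60},
]
print(rows_to_csv(sample))
```

With the client example above, `list(client.dataset(run["defaultDatasetId"]).iterate_items())` would supply `items`. Nested values, if a row contains any, are written using their Python string form, so you may want to serialize them to JSON first.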

## CLI example

```bash
echo '{
  "firms": [
    "peakxv",
    "blume",
    "elevation",
    "3one4",
    "stellaris"
  ]
}' |
apify call som_coder11/vc-intel-prefundraise --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=som_coder11/vc-intel-prefundraise",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "VC & PE Intel — Know Your Investor Before You Pitch",
        "description": "Before your next fundraise, know exactly what every VC and PE firm is thinking. Scrapes blogs, LinkedIn posts, portfolio companies, and partner emails from 95 India-focused firms. Get investment thesis, sentiment, focus areas, and per-post summaries — 79 columns per firm, ready to export as Excel.",
        "version": "1.1",
        "x-build-id": "9bmMgy3iL6xJrTkNc"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/som_coder11~vc-intel-prefundraise/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-som_coder11-vc-intel-prefundraise",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/som_coder11~vc-intel-prefundraise/runs": {
            "post": {
                "operationId": "runs-sync-som_coder11-vc-intel-prefundraise",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/som_coder11~vc-intel-prefundraise/run-sync": {
            "post": {
                "operationId": "run-sync-som_coder11-vc-intel-prefundraise",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "required": [
                    "firms"
                ],
                "properties": {
                    "firms": {
                        "title": "Firms to analyze",
                        "type": "array",
                        "description": "Use built-in keys like peakxv, blume, elevation, 3one4, stellaris, accel, kkr; or pass custom objects with name, website, blogUrl, portfolioUrl, thesisUrl, and linkedinSlug.",
                        "default": [
                            "peakxv",
                            "blume",
                            "elevation",
                            "3one4",
                            "stellaris"
                        ]
                    },
                    "groqApiKey": {
                        "title": "Groq API key (optional)",
                        "type": "string",
                        "description": "Optional free-tier enhancement for six intelligence fields. Leave blank for pure JavaScript extractive summaries with zero external LLM calls."
                    },
                    "maxBlogPosts": {
                        "title": "Blog/news posts per firm",
                        "minimum": 1,
                        "maximum": 15,
                        "type": "integer",
                        "description": "How many public posts or news items to collect per firm. Maximum 15 to keep compute low.",
                        "default": 10
                    },
                    "maxLinkedInPosts": {
                        "title": "LinkedIn public signals per firm",
                        "minimum": 1,
                        "maximum": 15,
                        "type": "integer",
                        "description": "How many LinkedIn company posts to keep when LinkedIn collection is enabled.",
                        "default": 5
                    },
                    "includePortfolio": {
                        "title": "Scrape portfolio pages",
                        "type": "boolean",
                        "description": "Collect company names from public portfolio pages.",
                        "default": true
                    },
                    "includeLinkedIn": {
                        "title": "Collect LinkedIn company posts",
                        "type": "boolean",
                        "description": "Uses the configured Apify LinkedIn posts actor to fetch recent company-page posts. Disable to avoid downstream LinkedIn actor cost.",
                        "default": true
                    },
                    "includeTeamEmails": {
                        "title": "Scrape team and public emails",
                        "type": "boolean",
                        "description": "Collect public team names, titles, LinkedIn profile links, and visible/guessed work emails where possible.",
                        "default": true
                    },
                    "linkedinScraperActorId": {
                        "title": "LinkedIn posts actor ID",
                        "type": "string",
                        "description": "Default is harvestapi/linkedin-company-posts. Must accept LinkedIn company URLs and return one item per post.",
                        "default": "harvestapi/linkedin-company-posts"
                    },
                    "linkedinScraperProvider": {
                        "title": "LinkedIn actor input format",
                        "enum": [
                            "harvestapi",
                            "data-slayer",
                            "scrapier"
                        ],
                        "type": "string",
                        "description": "Choose the input schema used by the configured LinkedIn actor.",
                        "default": "harvestapi"
                    },
                    "linkedinScraperWaitSecs": {
                        "title": "LinkedIn actor wait seconds",
                        "minimum": 30,
                        "maximum": 600,
                        "type": "integer",
                        "description": "How long to wait for the downstream LinkedIn actor run.",
                        "default": 180
                    },
                    "useLinkedInFallback": {
                        "title": "Fallback to indexed mentions",
                        "type": "boolean",
                        "description": "If the paid LinkedIn actor fails or returns no posts, fill LinkedIn columns from Google News indexed mentions instead of leaving them blank. Keep disabled for honest LinkedIn output.",
                        "default": false
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
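
For projects without an Apify client, the `run-sync-get-dataset-items` path from the specification can be called with the standard library. The sketch below only builds the request so it can be inspected without starting a billable run; the function name `build_run_request` is hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Server URL and path taken from the OpenAPI specification above.
API_BASE = "https://api.apify.com/v2"
ACTOR_PATH = "/acts/som_coder11~vc-intel-prefundraise/run-sync-get-dataset-items"


def build_run_request(token: str, actor_input: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request described by the specification."""
    query = urllib.parse.urlencode({"token": token})
    return urllib.request.Request(
        url=f"{API_BASE}{ACTOR_PATH}?{query}",
        data=json.dumps(actor_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_run_request("<YOUR_API_TOKEN>", {"firms": ["peakxv"]})
print(request.get_method(), request.full_url)
# To execute the run, pass the request to urllib.request.urlopen(request)
# and decode the JSON response body. This starts a real, billable run.
```

Since the synchronous endpoint waits for the Actor to finish, set a generous `timeout` on `urlopen` when you do send the request.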
