# Crunchbase Scraper — Funding, Investors & Profiles ✅ (`themineworks/crunchbase-companies`) Actor

Scrape Crunchbase company profiles by name or organization slug: company name, description, total funding raised, last/largest round, number of investors, location, founded year and website. No login, no API key. Works in Claude, ChatGPT & any MCP-compatible AI agent.

- **URL**: https://apify.com/themineworks/crunchbase-companies.md
- **Developed by:** [The Mine Works](https://apify.com/themineworks) (community)
- **Categories:** Business, Developer tools, MCP servers
- **Stats:** 2 total users, 1 monthly users, 100.0% runs succeeded, 0 bookmarks
- **User rating**: No ratings yet

## Pricing

$2.00 / 1,000 companies scraped

This Actor is paid per event. You are not charged for the Apify platform usage, but only a fixed price for specific events.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## Crunchbase Scraper - Companies, Funding & Investors

Pull clean, structured **company profiles from Crunchbase** by company name, keyword, or exact organization slug — **no login, no Crunchbase API key, no paid subscription**. For every matching organization you get the company name, description, total funding raised, the last/largest funding round, number of investors, headquarters location, founded year, and website, ready to drop into your CRM, lead list, investment tracker, or AI agent.

This Actor is built for the real world: Crunchbase sits behind a Cloudflare *managed challenge*. It runs a **real, stealth-hardened Chromium browser** (Crawlee PlaywrightCrawler + puppeteer-extra stealth) on a residential IP, lets the browser execute Cloudflare's JS challenge to mint a `cf_clearance` token, then operates entirely inside that cleared browser context: it calls Crunchbase's **own native search (autocomplete) endpoint** to resolve your query into organizations, and navigates to each company's profile page to read the embedded data. No third-party search engines, no Crunchbase API key, no Enterprise contract.

> **Heads-up on access (read before running):** Crunchbase applies a Cloudflare *managed challenge* to its search endpoint and profile pages — one of the toughest anti-bot walls in production. A real browser on residential clears it far more reliably than plain HTTP, but on a hot IP pool Cloudflare can still refuse to issue `cf_clearance`. When that happens the run does not crash, does not spin, and charges nothing: it rotates a few residential sessions, then exits cleanly with `cloudflare_challenge_blocked` / `unblocker_required` flags in the summary. If your account hits persistent blocks, point `proxyConfiguration` at a true unblocker tier (a Bright Data Web Unlocker endpoint, or Apify's Anti-Cloudflare / Unblocker proxy group).

### What you can do with it

- Build **investor & funding lists** — see how much each company raised, over how many rounds, and the size and date of their largest round.
- **Enrich a list of company names** with descriptions, websites, locations and founded years.
- Feed an **AI agent / MCP client** (Claude, ChatGPT, Cursor, any MCP-compatible tool) a company name and get a structured funding profile back.
- Track competitors, portfolio companies, or acquisition targets.

### Input

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `query` | string | one of | Company name or keyword to search Crunchbase for, e.g. `openai`, `stripe`, `fintech payments`. Each matching organization becomes one record. |
| `organizationSlug` | string | one of | Exact Crunchbase slug from `crunchbase.com/organization/<slug>`, e.g. `openai`, `databricks`. When set, scrapes that one company directly and ignores `query`. |
| `maxResults` | integer | no | Max companies to return. Default `25`, min `1`, max `1000`. |
| `proxyConfiguration` | object | recommended | Proxy settings. **Residential US** proxies with session rotation are strongly recommended, because a real browser on a residential IP is what clears Cloudflare's challenge. Defaults to Apify Residential, US. |

#### Example input

```json
{
  "query": "openai",
  "maxResults": 10,
  "proxyConfiguration": { "useApifyProxy": true, "apifyProxyGroups": ["RESIDENTIAL"], "apifyProxyCountry": "US" }
}
````

To scrape one exact company:

```json
{ "organizationSlug": "stripe", "proxyConfiguration": { "useApifyProxy": true, "apifyProxyGroups": ["RESIDENTIAL"], "apifyProxyCountry": "US" } }
```
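The rules in the input table can be checked client-side before starting a run. The following is a minimal sketch, not part of the Actor: `build_input` is a hypothetical helper name, and the bounds and proxy defaults are copied from the table above.

```python
from typing import Any, Optional


def build_input(
    query: Optional[str] = None,
    organization_slug: Optional[str] = None,
    max_results: int = 25,
) -> dict[str, Any]:
    """Build an Actor input object (hypothetical helper, not part of the Actor)."""
    if not query and not organization_slug:
        raise ValueError("Provide either query or organization_slug")
    if not 1 <= max_results <= 1000:
        raise ValueError("max_results must be between 1 and 1000")

    actor_input: dict[str, Any] = {
        "maxResults": max_results,
        # Documented default: Apify Residential proxy, US.
        "proxyConfiguration": {
            "useApifyProxy": True,
            "apifyProxyGroups": ["RESIDENTIAL"],
            "apifyProxyCountry": "US",
        },
    }
    # The Actor ignores query when organizationSlug is set,
    # so only one of the two fields is sent.
    if organization_slug:
        actor_input["organizationSlug"] = organization_slug
    else:
        actor_input["query"] = query
    return actor_input
```

The returned dictionary can be passed as `run_input` to the Python client call shown in the API section.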

### Output

One dataset item per company, plus a final `summary` record. Sample:

```json
{
  "name": "OpenAI",
  "slug": "openai",
  "description": "OpenAI is an AI research and deployment company dedicated to advancing artificial intelligence safely and beneficially.",
  "website": "https://openai.com",
  "founded": "2015",
  "location": "San Francisco, California, United States",
  "funding_total_usd": "$180B",
  "num_funding_rounds": 15,
  "last_round": { "type": "Series G round", "amount": "$122B", "date": "Feb 2026" },
  "num_investors": 32,
  "url": "https://www.crunchbase.com/organization/openai",
  "scraped_at": "2026-06-16T00:00:00.000Z"
}
```

Every record carries a `scraped_at` ISO-8601 timestamp. Fields that could not be resolved for a given company are omitted rather than returned empty. The funding fields (`funding_total_usd`, `num_funding_rounds`, `last_round`) come from cached Crunchbase funding captions; `website`, `founded`, `location` and `num_investors` are best-effort from the live profile and depend on the residential IP clearing Cloudflare on that request.
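Because unresolved fields are omitted and amounts arrive as display strings such as `"$180B"`, downstream code usually needs defensive access and a small parser. The sketch below is one possible approach; `parse_usd` and `flatten_company` are hypothetical names, and the `K`/`M`/`B`/`T` suffix convention is an assumption inferred from the sample record, not a documented format.

```python
import re
from typing import Any, Optional

# Assumed suffix convention, inferred from sample values like "$180B".
_MULTIPLIERS = {"": 1.0, "K": 1e3, "M": 1e6, "B": 1e9, "T": 1e12}
_AMOUNT_RE = re.compile(r"\$?\s*(\d+(?:,\d{3})*(?:\.\d+)?)\s*([KMBT]?)", re.IGNORECASE)


def parse_usd(amount: Optional[str]) -> Optional[float]:
    """Convert a display amount such as "$180B" to a number, or None if unparseable."""
    if not amount:
        return None
    match = _AMOUNT_RE.fullmatch(amount.strip())
    if match is None:
        return None
    number = float(match.group(1).replace(",", ""))
    return number * _MULTIPLIERS[match.group(2).upper()]


def flatten_company(record: dict[str, Any]) -> dict[str, Any]:
    """Read a company record with .get() so omitted fields become None."""
    last_round = record.get("last_round") or {}
    return {
        "name": record.get("name"),
        "slug": record.get("slug"),
        "website": record.get("website"),
        "funding_total": parse_usd(record.get("funding_total_usd")),
        "last_round_type": last_round.get("type"),
        "last_round_amount": parse_usd(last_round.get("amount")),
    }
```

Returning `None` for anything the parser does not recognize keeps a single unexpected value from failing a whole batch.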

### Pricing

This Actor is **Pay-Per-Event**: you are charged **$0.004 per company** delivered.

- The **first 25 companies are free** for each Apify account (lifetime), so you can evaluate the Actor before paying anything.
- Empty searches, failed lookups, and the final summary record are **never charged**.
- You also pay Apify's standard platform usage (compute + residential proxy) as normal.
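The event charge for a run can be estimated from the figures in this section. This is a rough sketch: `estimate_event_cost` is a hypothetical helper, the default rate is the $0.004 stated above, and the rate is a parameter because the listing header quotes $2.00 per 1,000 companies, so use whichever price your account actually shows. Platform usage (compute and proxy traffic) is not included.

```python
from decimal import Decimal


def estimate_event_cost(
    companies: int,
    price_per_company: Decimal = Decimal("0.004"),
    free_companies: int = 25,
    free_already_used: int = 0,
) -> Decimal:
    """Estimate the pay-per-event charge in USD for delivered companies."""
    # The free allowance is per account and lifetime, so subtract what was used before.
    remaining_free = max(free_companies - free_already_used, 0)
    billable = max(companies - remaining_free, 0)
    # Decimal avoids binary floating-point rounding in currency arithmetic.
    return billable * price_per_company
```

For example, a first run that delivers 1,000 companies bills 975 of them, which comes to $3.90 at the default rate.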

### How it works (transparency)

1. **Browser warm-up**: the Actor opens a real stealth-patched Chromium tab (Crawlee fingerprints + puppeteer-extra stealth: no `navigator.webdriver`, realistic UA/viewport/locale, automation flags off) on a residential exit and navigates to `crunchbase.com/`. It waits for Cloudflare's JS challenge to resolve and confirms the `cf_clearance` cookie was issued. Crawlee's `blockedStatusCodes` is emptied so the 403 challenge page reaches the handler instead of retiring the session early.
2. **Discovery**: from inside the cleared browser context, a `fetch()` hits Crunchbase's **own autocomplete API** (`/v4/data/autocompletes?...&collection_ids=organization.organizations`), the same endpoint the site's search box calls, carrying `cf_clearance` and the real fingerprint. Each result yields the org name, canonical slug, and short description.
3. **Profile**: for each slug, the browser navigates to `crunchbase.com/organization/<slug>` and mines the embedded Apollo cache (`apollo.state` JSON island) from the rendered DOM for website, founded year, location, funding total, last round type and investor count, with og/meta fallbacks.

Results are deduped by organization slug. If Cloudflare's challenge never clears (no `cf_clearance` after several residential rotations), the run never crashes, never charges, and reports `cloudflare_challenge_blocked` / `unblocker_required` plus the last `cf-mitigated` header value in the summary record.
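On the consumer side, a run's dataset items can be checked for the block flags and deduplicated again when several runs are merged. The sketch below assumes the flags appear as truthy top-level keys on the summary record and that company records always carry `slug`; neither detail is specified here, so treat `split_items` as a hypothetical helper to adapt to the real output.

```python
from typing import Any

# Flag names taken from the text above; their exact placement is an assumption.
BLOCK_FLAGS = ("cloudflare_challenge_blocked", "unblocker_required")


def split_items(items: list[dict[str, Any]]) -> tuple[list[dict[str, Any]], list[str]]:
    """Return (companies deduplicated by slug, block flags that were set)."""
    companies: dict[str, dict[str, Any]] = {}
    flags_seen: set[str] = set()
    for item in items:
        for flag in BLOCK_FLAGS:
            if item.get(flag):
                flags_seen.add(flag)
        slug = item.get("slug")
        # Keep the first record per slug; items without a slug are not companies.
        if slug and slug not in companies:
            companies[slug] = item
    return list(companies.values()), sorted(flags_seen)
```

If the returned flag list is not empty, the run was blocked by Cloudflare, and retrying with an unblocker-tier proxy is the remedy suggested above.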

### FAQ

**Do I need a Crunchbase account or API key?** No. Discovery uses Crunchbase's own public autocomplete endpoint and the public profile page.

**Why residential proxies (and why might I still need an unblocker)?** Crunchbase is behind a Cloudflare managed challenge. A real browser on residential mints the `cf_clearance` token the challenge demands far more reliably than plain HTTP. Datacenter proxies are always blocked. If your account still hits persistent challenges (hot IP pool), point `proxyConfiguration` at a true unblocker tier (Bright Data Web Unlocker, or Apify's Anti-Cloudflare / Unblocker proxy group).

**Why are some fields missing on some companies?** `website`, `location`, `founded`, funding and `num_investors` are mined from the live Crunchbase profile page. The browser reads it after the challenge clears; if a particular page does not fully render those fields, the record still ships the discovery fields (name, slug, description).

**Can I scrape a specific company directly?** Yes — pass `organizationSlug` with the exact slug from the company's Crunchbase URL.

**Is this an official Crunchbase product?** No. This is an independent scraper for publicly available data. Respect Crunchbase's terms and applicable law in your jurisdiction.

### MCP

Works in Claude, ChatGPT & any MCP-compatible AI agent. Expose it as a tool and ask for a company by name to get a structured funding profile back.

# Actor input Schema

## `query` (type: `string`):

Company name or keyword to find on Crunchbase (e.g. "openai", "fintech payments", "stripe"). Each matching Crunchbase organization is returned as one company record. Use this OR organizationSlug.

## `organizationSlug` (type: `string`):

Exact Crunchbase organization slug from the URL crunchbase.com/organization/<slug> (e.g. "openai", "stripe", "databricks"). When set, scrapes that one company directly and ignores the search query.

## `maxResults` (type: `integer`):

Maximum number of company records to return.

## `proxyConfiguration` (type: `object`):

Crunchbase sits behind Cloudflare. RESIDENTIAL US proxies with session rotation are strongly recommended for reliable access.

## Actor input object example

```json
{
  "query": "openai",
  "maxResults": 10,
  "proxyConfiguration": {
    "useApifyProxy": true,
    "apifyProxyGroups": [
      "RESIDENTIAL"
    ],
    "apifyProxyCountry": "US"
  }
}
```

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "query": "openai",
    "maxResults": 10,
    "proxyConfiguration": {
        "useApifyProxy": true,
        "apifyProxyGroups": [
            "RESIDENTIAL"
        ],
        "apifyProxyCountry": "US"
    }
};

// Run the Actor and wait for it to finish
const run = await client.actor("themineworks/crunchbase-companies").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "query": "openai",
    "maxResults": 10,
    "proxyConfiguration": {
        "useApifyProxy": True,
        "apifyProxyGroups": ["RESIDENTIAL"],
        "apifyProxyCountry": "US",
    },
}

# Run the Actor and wait for it to finish
run = client.actor("themineworks/crunchbase-companies").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "query": "openai",
  "maxResults": 10,
  "proxyConfiguration": {
    "useApifyProxy": true,
    "apifyProxyGroups": [
      "RESIDENTIAL"
    ],
    "apifyProxyCountry": "US"
  }
}' |
apify call themineworks/crunchbase-companies --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=themineworks/crunchbase-companies",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Crunchbase Scraper — Funding, Investors & Profiles ✅",
        "description": "Scrape Crunchbase company profiles by name or organization slug: company name, description, total funding raised, last/largest round, number of investors, location, founded year and website. No login, no API key. Works in Claude, ChatGPT & any MCP-compatible AI agent.",
        "version": "0.2",
        "x-build-id": "hZySe6vp5eE2SJYrK"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/themineworks~crunchbase-companies/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-themineworks-crunchbase-companies",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/themineworks~crunchbase-companies/runs": {
            "post": {
                "operationId": "runs-sync-themineworks-crunchbase-companies",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/themineworks~crunchbase-companies/run-sync": {
            "post": {
                "operationId": "run-sync-themineworks-crunchbase-companies",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "properties": {
                    "query": {
                        "title": "Search query (company name / keyword)",
                        "type": "string",
                        "description": "Company name or keyword to find on Crunchbase (e.g. \"openai\", \"fintech payments\", \"stripe\"). Each matching Crunchbase organization is returned as one company record. Use this OR organizationSlug."
                    },
                    "organizationSlug": {
                        "title": "Organization slug (optional, exact)",
                        "type": "string",
                        "description": "Exact Crunchbase organization slug from the URL crunchbase.com/organization/<slug> (e.g. \"openai\", \"stripe\", \"databricks\"). When set, scrapes that one company directly and ignores the search query."
                    },
                    "maxResults": {
                        "title": "Max companies",
                        "minimum": 1,
                        "maximum": 1000,
                        "type": "integer",
                        "description": "Maximum number of company records to return.",
                        "default": 25
                    },
                    "proxyConfiguration": {
                        "title": "Proxy configuration",
                        "type": "object",
                        "description": "Crunchbase sits behind Cloudflare. RESIDENTIAL US proxies with session rotation are strongly recommended for reliable access.",
                        "default": {
                            "useApifyProxy": true,
                            "apifyProxyGroups": [
                                "RESIDENTIAL"
                            ],
                            "apifyProxyCountry": "US"
                        }
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
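
For projects without the client library, the `run-sync-get-dataset-items` path in the specification above can be called directly. This is a minimal standard-library sketch: `build_request` and `run_sync` are hypothetical helper names, the 300-second timeout is an arbitrary choice, and the response is assumed to be a JSON body because the specification does not define a response schema for this path.

```python
import json
import urllib.parse
import urllib.request
from typing import Any

API_BASE = "https://api.apify.com/v2"
ACTOR_PATH = "/acts/themineworks~crunchbase-companies/run-sync-get-dataset-items"


def build_request(token: str, actor_input: dict[str, Any]) -> urllib.request.Request:
    """Build the POST request: token as a query parameter, input as the JSON body."""
    query_string = urllib.parse.urlencode({"token": token})
    return urllib.request.Request(
        f"{API_BASE}{ACTOR_PATH}?{query_string}",
        data=json.dumps(actor_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_sync(token: str, actor_input: dict[str, Any], timeout: float = 300.0) -> Any:
    """Send the request and decode the response (assumed to be JSON)."""
    request = build_request(token, actor_input)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `run_sync("<YOUR_API_TOKEN>", {"query": "openai", "maxResults": 10})` blocks until the run finishes, so for long runs the asynchronous `/runs` path may be the better fit.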
