Crunchbase Scraper [$8💰] - Companies + Funding Rounds

Pricing

from $8.00 / 1,000 results

Crunchbase scraper returning ONE clean structured row per company (funding, people, M&A, tech stack, traffic, IT spend, growth/IPO predictions), not a 1,500-line raw blob. Now also reads Discover saved-search URLs for funding-round signals. Cloudflare bypass built in, no token. $8/1k.

Rating

0.0

(0)

Developer

Muhamed Didovic

Maintained by Community

Actor stats

Bookmarked: 0

Total users: 20

Monthly active users: 19

Last modified: 6 hours ago

Turn any Crunchbase company URL or slug into one clean, structured row, with no raw blob to untangle. Paste a company link, a bare slug, or an organization/... path and get a ready-to-use profile: identity, funding, people, M&A, tech stack, web traffic, IT spend, and Crunchbase's own growth/funding/acquisition/IPO predictions. Or paste a Crunchbase Discover / saved-search URL (e.g. crunchbase.com/discover/funding_rounds/…) and get one funding-round signal row per result (company, round type, and Crunchbase links), with each company enriched from its org page. Output is JSON or CSV. There is no unblocker token to manage: Cloudflare is handled for you.

Why Use This Scraper?

  • ✅ Clean structured output: 34 grouped fields, ready for a spreadsheet or a model, not a 1,500-line raw dump
  • ✅ Two modes in one actor: company-page enrichment and Discover/saved-search funding-round signals
  • ✅ One row per company: paste a URL, a slug, or a path and get a single tidy profile back
  • ✅ Surfaces the buried gold: Aberdeen IT spend, SEMrush traffic, BuiltWith + Siftery tech stack, Crunchbase ML predictions
  • ✅ Cloudflare handled for you: built-in managed unblocker, nothing to configure
  • ✅ Optional raw passthrough: flip one switch to also get the full unprocessed Crunchbase cards
  • ✅ Flat JSON / CSV export for analysis, CRM enrichment, or lead scoring

Overview

The Crunchbase Company Scraper is built for sales and revenue teams, investors and analysts, market researchers, and data engineers who need structured company intelligence from Crunchbase without paying for an enterprise API seat.

The actor runs in two modes. Company mode is the core: each company input (a full URL, a bare slug, or an organization/... path) resolves to exactly one company-shaped row, with investor names, executives, acquisition targets, and funding-round counts nested inside it. Discover mode is optional: paste a Crunchbase Discover / saved-search URL and the actor returns one funding-round signal row per result (company + round type + Crunchbase links), with each company enriched from its org page. It is not a people- or investor-search crawler.

Most Crunchbase actors on the Store dump the raw cards object Crunchbase ships to its own front-end: roughly 236 KB and 1,500 lines per company, full of 540-element history arrays, ten-times-duplicated competitor trees, and internal query stubs. This actor returns a clean ~24 KB structured row instead, about 10× smaller, with the noise dropped and the useful signals lifted to the top level. If you still want everything, rawMode adds the full cards back as a passthrough field.

Supported Inputs

Input types

Input type | Pattern | Example
Full company URL | https://www.crunchbase.com/organization/{slug} | https://www.crunchbase.com/organization/openai
Bare slug | {slug} | stripe
Organization path | organization/{slug} | organization/anthropic

Copy-pasteable startUrls

{
  "startUrls": [
    "https://www.crunchbase.com/organization/openai",
    "stripe",
    "organization/anthropic"
  ]
}

Discover / saved-search URLs (funding-round signals)

Paste a Crunchbase Discover URL of the form https://www.crunchbase.com/discover/<collection>/<hash> (e.g. a saved Funding Rounds search) and the actor switches to signal mode for that input: one row per result with the funding round, round type, funded company, and Crunchbase links, each company enriched from its org page.

{
  "startUrls": [
    "https://www.crunchbase.com/discover/funding_rounds/a0620e0d48eb17727ffdd27d9afa1807"
  ]
}

Anonymous cap: Crunchbase returns the first 15 results per search to anonymous callers and gates the funding amount, announced date, investors, and pagination beyond 15 behind a paid login. So signal rows carry company + round type + links; the $ amount/date come through only in optional logged-in mode with a Crunchbase Pro session. Need a tighter list? Narrow the saved search itself.

Unsupported inputs

  • ❌ Person profiles: crunchbase.com/person/{slug}
  • ❌ Individual funding-round, acquisition, investor, hub, or event entity pages: crunchbase.com/funding_round/..., /acquisition/... (the Discover saved-search URL above is supported)
  • ❌ Ad-hoc search pages with no saved-search hash: save the search first to get a /discover/<collection>/<hash> URL
  • ❌ Any host outside crunchbase.com
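
The supported and unsupported patterns above can be checked before a run. The following is a minimal sketch, not the actor's own parsing logic: `classify_input` and `SLUG_RE` are hypothetical names, and the slug character set (letters, digits, hyphens) is an assumption about what Crunchbase permalinks look like.

```python
import re
from urllib.parse import urlparse

# Assumption: slugs are letters, digits, and hyphens; adjust if yours differ.
SLUG_RE = re.compile(r"^[a-z0-9][a-z0-9-]*$", re.IGNORECASE)


def classify_input(raw: str) -> dict:
    """Classify one startUrls entry as 'company', 'discover', or 'unsupported'."""
    text = raw.strip()
    is_url = "://" in text
    if is_url:
        parsed = urlparse(text)
        host = parsed.netloc.lower()
        if host != "crunchbase.com" and not host.endswith(".crunchbase.com"):
            return {"kind": "unsupported", "reason": "host outside crunchbase.com"}
        path = parsed.path
    else:
        path = text

    parts = [part for part in path.split("/") if part]

    # Bare slug, e.g. "stripe" (only valid when no scheme/host was given).
    if not is_url and len(parts) == 1 and SLUG_RE.match(parts[0]):
        return {"kind": "company", "slug": parts[0]}
    # organization/{slug}, with or without the crunchbase.com host.
    if len(parts) == 2 and parts[0] == "organization" and SLUG_RE.match(parts[1]):
        return {"kind": "company", "slug": parts[1]}
    # discover/<collection>/<hash>: a saved search, which needs the hash segment.
    if len(parts) == 3 and parts[0] == "discover":
        return {"kind": "discover", "collection": parts[1], "hash": parts[2]}
    return {"kind": "unsupported", "reason": "not a company or saved-search input"}


if __name__ == "__main__":
    for entry in ["stripe", "organization/anthropic",
                  "https://www.crunchbase.com/person/some-person"]:
        print(entry, "->", classify_input(entry)["kind"])
```

Filtering inputs this way before submitting them keeps obviously unsupported entries out of a run; the actor may still accept or reject inputs differently from this sketch.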

Use Cases

Audience | Use case
Sales / RevOps teams | Enrich CRM accounts with funding stage, headcount band, tech stack, and IT spend for lead scoring
Investors / analysts | Pull funding history, investor lists, and Crunchbase growth/IPO predictions for deal sourcing
Market researchers | Bulk-export competitor sets with categories, rank, and web-traffic signals
Data / growth engineers | Feed clean company rows into a warehouse or model without writing a Crunchbase parser
Agencies | Deliver client-ready company datasets without an enterprise Crunchbase license

How It Works

  1. Input: provide Crunchbase company URLs, slugs, organization/... paths, or a Discover/saved-search URL
  2. Unblock: each page is fetched through a built-in managed unblocker that clears Crunchbase's Cloudflare protection
  3. Extract: the actor reads Crunchbase's own hydration state (the data its front-end renders from) for complete, accurate fields
  4. Structure: raw cards are parsed into one clean, grouped row; history bloat and duplicated trees are dropped
  5. Output: export as structured JSON or flattened CSV, with optional rawMode for the full unprocessed cards

Input Configuration

Input fields

Field | Type | Required | Notes
startUrls | array<string> | yes | Crunchbase company URLs, slugs, organization/... paths, or Discover/saved-search URLs (/discover/<collection>/<hash>)
rawMode | boolean | optional | When true, also include the full raw Crunchbase cards as _rawCards. Default false
maxItems | integer | optional | Hard cap on company rows emitted. Default 1000
maxConcurrency | integer | optional | Parallel unblocker requests. Default 3 (keep low: the premium pool is metered per success)
maxRequestRetries | integer | optional | Retries on transient unblocker errors before giving up on a company. Default 2
sdoKey | string (secret) | optional | Leave blank: the actor uses its built-in unblocker by default. Advanced: paste your own scrape.do token to bill unblocker requests to your own account
crunchbaseCookie | string (secret) | optional | Logged-in mode. Paste your own Crunchbase Cookie header to unlock the gated funding amount / date / investors on Discover results and lift the 15-result cap. Requires a Crunchbase Pro account; the session expires every few minutes, so it's for manual one-off runs, not scheduled jobs. Leave blank for anonymous signal mode; a free or stale cookie safely falls back to anonymous instead of erroring
proxy | object | optional | Reserved for a future direct-fetch path; not required for normal runs
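
If you build inputs in code, a small helper can apply the documented defaults and catch obvious mistakes early. This is a sketch under stated assumptions: `build_run_input` is a hypothetical helper, the field names and defaults come from the table above, and the validation rules (non-empty list, positive caps) are guesses at sensible limits, not the actor's own schema.

```python
import json


def build_run_input(start_urls, raw_mode=False, max_items=1000,
                    max_concurrency=3, max_request_retries=2):
    """Assemble an input object using the field names and defaults from the table."""
    # Drop empty entries and surrounding whitespace before validating.
    urls = [url.strip() for url in start_urls if url and url.strip()]
    if not urls:
        raise ValueError("startUrls is required and must contain at least one entry")
    # Assumed limits: the caps must be positive, retries may be zero.
    if max_items < 1 or max_concurrency < 1:
        raise ValueError("maxItems and maxConcurrency must be at least 1")
    if max_request_retries < 0:
        raise ValueError("maxRequestRetries must not be negative")
    return {
        "startUrls": urls,
        "rawMode": bool(raw_mode),
        "maxItems": max_items,
        "maxConcurrency": max_concurrency,
        "maxRequestRetries": max_request_retries,
    }


if __name__ == "__main__":
    print(json.dumps(build_run_input(["openai", "stripe"], max_items=2), indent=2))
```

The secret fields (sdoKey, crunchbaseCookie) are left out on purpose, since the table recommends leaving them blank for normal runs.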

Common scenarios

1. A few companies, clean output

{
  "startUrls": ["openai", "stripe", "anthropic"]
}

2. Clean row plus the full raw cards

{
  "startUrls": ["https://www.crunchbase.com/organization/databricks"],
  "rawMode": true
}

3. A larger batch with a cap

{
  "startUrls": ["openai", "stripe", "anthropic", "databricks", "figma"],
  "maxItems": 5,
  "maxConcurrency": 3
}

4. A Discover / saved-search URL (funding-round signals)

{
  "startUrls": ["https://www.crunchbase.com/discover/funding_rounds/a0620e0d48eb17727ffdd27d9afa1807"],
  "maxItems": 15
}

Output Overview

Each dataset item is a single company row containing:

  • Identity: name, permalink, UUID, description, type, operating status, IPO status, global rank, aliases
  • Location: city, region, country, continent, offices
  • Web & contact: website, LinkedIn / Facebook / Twitter, contact email, phone, contact count
  • Categories: category tags and per-category rank
  • Funding: total (when public), round count, investor count, rounds, investor list
  • People: employee band, current executives, advisors/board, alumni
  • M&A: acquisitions, acquired-by, exits, IPO fields
  • Tech stack: technology count, BuiltWith stack, Siftery products
  • Signals: heat score, SEMrush traffic, Aberdeen IT spend, mobile apps
  • Predictions: Crunchbase ML scores for growth, funding, acquisition, IPO
  • Products / Similar / Press: products, similar companies with similarity score, recent press timeline

Some fields are null when Crunchbase no longer ships them on the default page load (see FAQ). Set rawMode: true to additionally receive the full unprocessed cards as _rawCards.

Output Samples

Bare slug start ("openai"), trimmed

{
  "name": "OpenAI",
  "permalink": "openai",
  "uuid": "cf2c678c-b81a-80c3-10d1-9c5e76448e51",
  "url": "https://www.crunchbase.com/organization/openai",
  "description": "OpenAI is an AI research and deployment company that develops advanced AI models, including ChatGPT.",
  "operatingStatus": "active",
  "companyType": "for_profit",
  "ipoStatus": "private",
  "rank": 4,
  "aliases": ["OpenAI LP", "OpenAI Group PBC"],
  "city": "San Francisco",
  "region": "California",
  "country": "United States",
  "website": "https://www.openai.com",
  "socials": {
    "linkedin": "https://www.linkedin.com/company/openai",
    "twitter": "https://x.com/OpenAI"
  },
  "contactEmail": "support@openai.com",
  "numContacts": 1384,
  "categories": [
    { "name": "Agentic AI", "permalink": "agentic-ai-17fa" },
    { "name": "Artificial Intelligence (AI)", "permalink": "artificial-intelligence" }
  ],
  "funding": {
    "totalUsd": null,
    "numFundingRounds": 14,
    "numInvestors": 95,
    "investors": [ { "name": "Blackstone Group investment in Venture Round - OpenAI", "permalink": "blackstone-invested-in-openai-..." } ]
  },
  "people": {
    "employeeRange": "1001-5000",
    "current": [
      { "name": "Sam Altman Co-Founder and CEO @ OpenAI", "permalink": "sam-altman-executive-openai--cdec28a8" },
      { "name": "Greg Brockman President, Chairman, & Co-Founder @ OpenAI", "permalink": "greg-brockman-executive-openai--d0858d5a" }
    ]
  },
  "techStack": {
    "numTechnologies": 94,
    "builtwith": [ { "name": "Cloudflare CDN", "category": "cdn" } ],
    "siftery": [ { "name": "HTML5", "status": "using" } ]
  },
  "signals": {
    "heatScore": 92,
    "heatScoreDelta90": -2,
    "semrush": { "globalRank": null, "monthlyVisits": 487467460 },
    "aberdeenItSpendUsd": 285484278,
    "apps": { "total": 4 }
  },
  "predictions": {
    "growth": { "score": 0.7599, "tier": "p200_positive_low", "generatedOn": "2026-05-30" },
    "funding": { "score": 0.6439, "generatedOn": "2026-05-09" },
    "acquisition": { "score": 0.0368, "tier": "p500_negative_high" },
    "ipo": { "score": 0.9337, "tier": "p200_positive_low" }
  },
  "products": [
    { "name": "ChatGPT", "description": "An AI conversational agent…" }
  ],
  "similar": [
    { "name": "Anthropic", "permalink": "anthropic", "score": 100 },
    { "name": "Google", "permalink": "google", "score": 99.64 }
  ],
  "pressTimeline": [
    { "title": "ChatGPT tests a new jobs interface", "publisher": "AIM Group", "date": "2026-06-02", "url": "https://aimgroup.com/2026/06/02/chatgpt-tests-a-new-jobs-interface/" }
  ],
  "scrapedAt": "2026-06-02T16:32:32.297Z"
}
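
A nested row like the one above can be flattened into spreadsheet columns if you post-process the JSON yourself. The sketch below is one possible approach, not the actor's or the platform's own CSV export: `flatten` and `rows_to_csv` are hypothetical helpers, and the dot-separated column names (for example `socials.twitter`, `aliases.0`) are an assumed convention that may differ from the export you download.

```python
import csv
import io


def flatten(value, prefix=""):
    """Flatten nested dicts/lists into a single dict keyed by dot-separated paths."""
    flat = {}
    if isinstance(value, dict):
        for key, child in value.items():
            flat.update(flatten(child, f"{prefix}.{key}" if prefix else key))
    elif isinstance(value, list):
        for index, child in enumerate(value):
            flat.update(flatten(child, f"{prefix}.{index}" if prefix else str(index)))
    else:
        # Scalars (including None for gated or missing fields) become one column.
        flat[prefix] = value
    return flat


def rows_to_csv(rows):
    """Write flattened rows to CSV text, using the union of all columns."""
    flat_rows = [flatten(row) for row in rows]
    columns = []
    for flat in flat_rows:
        for key in flat:
            if key not in columns:
                columns.append(key)
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, restval="")
    writer.writeheader()
    writer.writerows(flat_rows)
    return buffer.getvalue()


if __name__ == "__main__":
    sample = {"name": "OpenAI", "funding": {"totalUsd": None, "numFundingRounds": 14}}
    print(rows_to_csv([sample]))
```

Null fields such as funding.totalUsd come out as empty cells, so downstream tools can tell "not available" apart from zero.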

Discover search start (".../discover/funding_rounds/..."): one signal row per result, trimmed

{
  "searchCollection": "funding_rounds",
  "name": "Series D - Factorial",
  "investmentType": "series_d",
  "moneyRaisedUsd": null, // gated for anonymous runs; unlocked in logged-in Pro mode
  "announcedOn": null, // gated for anonymous runs
  "companyName": "Factorial",
  "companyPermalink": "factorial",
  "companyUrl": "https://www.crunchbase.com/organization/factorial",
  "gatedFields": ["announced_on", "money_raised"],
  "company": { // each result enriched from its org page (same shape as above)
    "name": "Factorial",
    "country": "Spain",
    "rank": 64,
    "funding": { "numFundingRounds": 8 }
  }
}
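
When consuming signal rows, it helps to know how many still have their funding details gated. The following sketch uses the field names from the sample above (`gatedFields`, `moneyRaisedUsd`, `investmentType`); the helpers `is_gated` and `summarize_signals` are hypothetical, and the assumption that ungated rows have an empty or missing `gatedFields` list is not confirmed by the listing.

```python
from collections import Counter


def is_gated(row: dict) -> bool:
    """True when a signal row is missing the login-gated funding details."""
    # Assumption: gated rows name the hidden fields in gatedFields and carry
    # null values; ungated rows have an empty or absent gatedFields list.
    return bool(row.get("gatedFields")) or row.get("moneyRaisedUsd") is None


def summarize_signals(rows):
    """Count rows overall, gated rows, and rows per round type."""
    rows = list(rows)
    by_type = Counter(row.get("investmentType") or "unknown" for row in rows)
    return {
        "total": len(rows),
        "gated": sum(1 for row in rows if is_gated(row)),
        "byInvestmentType": dict(by_type),
    }


if __name__ == "__main__":
    sample = [{
        "name": "Series D - Factorial",
        "investmentType": "series_d",
        "moneyRaisedUsd": None,
        "gatedFields": ["announced_on", "money_raised"],
    }]
    print(summarize_signals(sample))
```

A summary where every row is gated is the expected result of an anonymous run, not a sign that the scrape failed.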

Key Output Fields

Identity

  • name, permalink, uuid, url, description
  • operatingStatus, companyType, ipoStatus, rank, aliases[]

Location & contact

  • city, region, country, continent, offices[]
  • website, socials.linkedin, socials.facebook, socials.twitter, contactEmail, phone

Categories & funding

  • categories[].name, categoryRanks[].rank
  • funding.totalUsd, funding.numFundingRounds, funding.numInvestors, funding.investors[], funding.rounds[]

People & M&A

  • people.employeeRange, people.current[], people.advisors[], people.alumni[], numContacts
  • ma.acquisitions[], ma.acquiredBy, ma.exits[], ma.ipo

Tech stack & signals

  • techStack.numTechnologies, techStack.builtwith[], techStack.siftery[]
  • signals.heatScore, signals.semrush.monthlyVisits, signals.aberdeenItSpendUsd, signals.apps

Predictions, products & press

  • predictions.growth, predictions.funding, predictions.acquisition, predictions.ipo (each { score, tier, generatedOn })
  • products[], similar[].score, pressTimeline[]
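
The dotted field names listed above map directly onto nested lookups in the JSON row. As a minimal sketch, assuming rows shaped like the sample output, a null-safe accessor can pull a handful of fields for lead scoring; `get_path`, `LEAD_FIELDS`, and `lead_record` are hypothetical names chosen for illustration.

```python
def get_path(row, path, default=None):
    """Read a dotted path such as 'signals.semrush.monthlyVisits' from a row."""
    current = row
    for key in path.split("."):
        if not isinstance(current, dict):
            return default
        current = current.get(key)
        # Missing keys and explicit nulls (gated fields) both yield the default.
        if current is None:
            return default
    return current


# Illustrative selection of fields; pick whichever paths matter for your scoring.
LEAD_FIELDS = [
    "name",
    "funding.numFundingRounds",
    "funding.totalUsd",
    "people.employeeRange",
    "signals.semrush.monthlyVisits",
    "predictions.growth.score",
]


def lead_record(row):
    """Project a full company row onto the selected lead-scoring fields."""
    return {path: get_path(row, path) for path in LEAD_FIELDS}


if __name__ == "__main__":
    row = {
        "name": "OpenAI",
        "funding": {"totalUsd": None, "numFundingRounds": 14},
        "people": {"employeeRange": "1001-5000"},
        "signals": {"semrush": {"globalRank": None, "monthlyVisits": 487467460}},
        "predictions": {"growth": {"score": 0.7599}},
    }
    print(lead_record(row))
```

Because fields such as funding.totalUsd can be null on the default page load, reading through an accessor with a default avoids KeyError and TypeError in downstream code.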

FAQ

Which Crunchbase URLs are supported?

Two kinds. First, company (organization) pages, given as a full URL (https://www.crunchbase.com/organization/openai), a bare slug (openai), or a path (organization/openai); each yields one company row. Second, Discover / saved-search URLs (https://www.crunchbase.com/discover/<collection>/<hash>); each yields one funding-round signal row per result. Individual person, funding-round, acquisition, investor, hub, and event entity pages are not supported as inputs.

What do Discover / saved-search URLs return?

One row per search result: the funding round, round type (investmentType), funded company, Crunchbase links, and a gatedFields marker, with the company enriched from its org page under company. Anonymous runs return the first 15 results and leave the $ amount, announced date, and investors null (Crunchbase gates those behind a paid Pro login). To unlock them, supply a Pro session via crunchbaseCookie (see below).

Do I get company rows or people / investor rows?

Company inputs give company rows, one per input. Discover / saved-search inputs give funding-round signal rows, up to 15 per search in anonymous mode, each with the company enriched under company. Either way, people, investors, and acquisition targets appear as nested fields (e.g. people.current[], funding.investors[], ma.acquisitions[]), never as separate people/investor dataset items.

Do I need a proxy or an unblocker token?

No. Crunchbase is Cloudflare-protected, but the actor ships with a built-in managed unblocker, so a normal run needs nothing extra. The optional sdoKey field only exists for advanced users who want to bill unblocker requests to their own scrape.do account.

Why are funding.totalUsd and signals.semrush.globalRank sometimes null?

Crunchbase moved a few fields (notably total funding amount and the SEMrush global rank) behind a secondary request that no longer ships on the default page load. Rather than double the per-company cost, the actor returns these as null and populates everything else: round counts, investor counts, SEMrush monthly visits, IT spend, predictions, and the full tech stack all still come through.

What does rawMode do?

When true, each row keeps all the clean structured fields and adds _rawCards, the full, unprocessed Crunchbase cards object. Use it when you need a field the structured output doesn't surface. It makes rows roughly 10× larger, so leave it off unless you need it.

What's gated, and can logged-in mode unlock it?

By default the actor reads only public Crunchbase data, so Discover funding amounts / dates / investors and a few company fields (e.g. funding.totalUsd) come back null, and each search returns its first 15 results. Those are gated by Crunchbase behind a paid Pro login; this is a Crunchbase limit, not a scraper one. The optional crunchbaseCookie field lets you supply your own logged-in Crunchbase Pro session to unlock them. Caveat: the session token expires every few minutes, so it suits manual one-off runs, not scheduled jobs, and a free-tier or stale cookie simply falls back to the public signal output instead of erroring. Leave it blank for normal public runs.

How is it priced and how fast is it?

Each company is one dataset item and one unblocker request, billed per result (see the Apify Store pricing on this actor's page). In testing, batches run at a few companies per second with default concurrency.
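
For budgeting, the per-result pricing reduces to simple arithmetic. The sketch below assumes the listed starting price of $8.00 per 1,000 results applies uniformly; `PRICE_PER_1000_USD` and `estimate_cost_usd` are hypothetical names, and your actual charge depends on the pricing shown on the actor's Store page.

```python
from decimal import Decimal

# Assumption: the listed "from $8.00 / 1,000 results" rate, applied to every result.
PRICE_PER_1000_USD = Decimal("8.00")


def estimate_cost_usd(results: int) -> Decimal:
    """Estimate the charge in USD for a given number of dataset items."""
    if results < 0:
        raise ValueError("results must not be negative")
    cost = Decimal(results) * PRICE_PER_1000_USD / Decimal(1000)
    # Round to whole cents for display.
    return cost.quantize(Decimal("0.01"))


if __name__ == "__main__":
    for count in (15, 250, 1000):
        print(f"{count} results: about ${estimate_cost_usd(count)}")
```

Setting maxItems to the number of results you budgeted for keeps a run from exceeding the estimate.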

Support

Additional Services

  • Need a custom export shape, additional Crunchbase fields, or scheduled monitoring? Email muhamed.didovic@gmail.com.
  • For a direct API of this scraper (no Apify fee, usage-based), contact the same address.

Explore More Scrapers

If you found this useful, you might also like the developer's other scrapers. Full list at apify.com/memo23.


⚠️ Disclaimer

This Actor is an independent tool and is not affiliated with, endorsed by, or sponsored by Crunchbase, Inc. or any of its subsidiaries. All trademarks mentioned are the property of their respective owners.

By default the scraper accesses only publicly available Crunchbase pages, with no authenticated endpoints or content behind the crunchbase.com login wall. The optional logged-in mode is opt-in and uses your own Crunchbase session and subscription that you choose to supply; if you enable it, you are responsible for using it within your own Crunchbase account terms. Users are responsible for ensuring their use complies with Crunchbase's Terms of Service, applicable data-protection law (GDPR, CCPA, etc.), and any contractual obligations of their own organization.

