PitchBook Data Extractor

Pricing

from $2.99 / 1,000 results

PitchBook investor scraper that pulls firm profiles by investor ID, so you can build sourcing lists and keep your CRM current without clicking through profiles manually.

Developer

Kawsar

Maintained by Community

PitchBook Data Extractor scrapes public investor profile pages from PitchBook by investor ID or profile URL. Feed it a list of IDs and it returns structured JSON with firm details, deal counts, contact info, social links, and a sample of recent investments -- no manual browsing required.


What is PitchBook?

PitchBook is a financial data platform covering private equity, venture capital, and M&A activity. Each investor on the platform has a public profile page showing the firm's overview, investment history, portfolio companies, and contact details. This actor collects the data visible on those public pages.


What you get

Each scraped profile includes:

Identity and overview

  • Firm name and logo URL
  • Investor type (Venture Capital, Private Equity, Angel, Corporate, etc.)
  • Active/Inactive status
  • Investor status (e.g. Actively Seeking New Investments)
  • Professionals count
  • Total investments count
  • Portfolio companies count
  • Exits count

Company details

  • Firm description / bio
  • Website
  • Year founded
  • Trade association membership
  • Primary and other investor types
  • Full corporate office address (street, city, state, zip, country)
  • LinkedIn profile link
  • Twitter / X profile link

Recent investments (public sample)

  • Up to 10 most recent deals showing: company name, PitchBook company URL, deal date, deal type, industry, company stage, and lead partner (where publicly available)

Metadata

  • Profile URL
  • Scraped timestamp (UTC ISO 8601)
  • Error field (null on success, message on failure)

Input

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| investorIds | array of strings | Yes | One or more PitchBook investor IDs or full profile URLs |
| maxItems | integer | No | Max profiles to process (default: 100, max: 1000) |
| requestTimeoutSecs | integer | No | Per-request timeout in seconds (default: 30, max: 120) |
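
The limits in the input table can be checked before a run is started. The following is a minimal Python sketch, not part of the actor: `build_input` is a hypothetical helper name, and the lower bound of 1 for both numeric fields is an assumption, since the table only states defaults and maximums.

```python
from typing import Any

# Upper bounds copied from the input table.
MAX_ITEMS_LIMIT = 1000
TIMEOUT_LIMIT_SECS = 120


def build_input(
    investor_ids: list[str],
    max_items: int = 100,
    request_timeout_secs: int = 30,
) -> dict[str, Any]:
    """Return a run-input dict, raising ValueError on out-of-range values."""
    # Drop empty or whitespace-only entries and trim the rest.
    ids = [value.strip() for value in investor_ids if value and value.strip()]
    if not ids:
        raise ValueError("investorIds must contain at least one ID or URL")
    # The lower bound of 1 is an assumption; the table gives only the maximum.
    if not 1 <= max_items <= MAX_ITEMS_LIMIT:
        raise ValueError(f"maxItems must be between 1 and {MAX_ITEMS_LIMIT}")
    if not 1 <= request_timeout_secs <= TIMEOUT_LIMIT_SECS:
        raise ValueError(
            f"requestTimeoutSecs must be between 1 and {TIMEOUT_LIMIT_SECS}"
        )
    return {
        "investorIds": ids,
        "maxItems": max_items,
        "requestTimeoutSecs": request_timeout_secs,
    }


run_input = build_input(["41716-90", "11295-73"], max_items=50)
```

The returned dict has the same shape as the example inputs, so it can be serialized with `json.dumps` and pasted into the actor's input editor.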

How to find a PitchBook investor ID

Open any investor profile on PitchBook. The ID is the last segment of the URL path:

https://pitchbook.com/profiles/investor/41716-90
                                        ^^^^^^^^
                                        investor ID

You can paste either the full URL or just the ID (for example, 41716-90) into investorIds; both work.
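
If you want to normalize inputs yourself before a run, the rule "the ID is the last segment of the URL path" can be sketched in Python as follows. The helper name `to_investor_id` is hypothetical, and the ID pattern (digits, a hyphen, digits) is an assumption inferred from the two IDs listed on this page.

```python
import re
from urllib.parse import urlsplit

# Assumed ID shape, inferred from examples such as 41716-90.
_ID_PATTERN = re.compile(r"\d+-\d+")


def to_investor_id(value: str) -> str:
    """Return the investor ID from a raw ID or a full profile URL."""
    value = value.strip()
    if value.startswith(("http://", "https://")):
        # Take the last path segment, ignoring any trailing slash or query.
        path = urlsplit(value).path.rstrip("/")
        candidate = path.rsplit("/", 1)[-1]
    else:
        candidate = value
    if not _ID_PATTERN.fullmatch(candidate):
        raise ValueError(f"not a recognised investor ID or profile URL: {value!r}")
    return candidate


print(to_investor_id("https://pitchbook.com/profiles/investor/41716-90"))
print(to_investor_id("11295-73"))
```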

Known investor IDs (confirmed working)

| Investor ID | Firm Name | Type |
| --- | --- | --- |
| 41716-90 | Andreessen Horowitz (a16z) | Venture Capital |
| 11295-73 | Sequoia Capital | Venture Capital |

To find IDs for other firms, open the firm's PitchBook profile page and copy the ID from the URL.

Example input -- minimal

{
  "investorIds": ["41716-90"]
}

Example input -- batch with IDs only

{
  "investorIds": [
    "41716-90",
    "11295-73"
  ],
  "maxItems": 100
}

Example input -- batch with full URLs

{
  "investorIds": [
    "https://pitchbook.com/profiles/investor/41716-90",
    "https://pitchbook.com/profiles/investor/11295-73"
  ],
  "maxItems": 100
}

Example input -- mixed IDs and URLs

{
  "investorIds": [
    "41716-90",
    "https://pitchbook.com/profiles/investor/11295-73"
  ],
  "maxItems": 50,
  "requestTimeoutSecs": 60
}

Output

Each item in the dataset looks like this:

{
  "investorId": "41716-90",
  "profileUrl": "https://pitchbook.com/profiles/investor/41716-90",
  "name": "Andreessen Horowitz",
  "logoUrl": "https://image.pitchbook.com/KQfgZcIVkUmergLPYcA33weU7tH...",
  "investorType": "Venture Capital",
  "status": "Active",
  "investorStatus": "Actively Seeking New Investments",
  "professionalsCount": 159,
  "investmentsCount": 2702,
  "portfolioCount": 1154,
  "exitsCount": 564,
  "firmDescription": "Founded in 2009, Andreessen Horowitz is a venture capital firm based in Menlo Park, California. The firm prefers to invest in bio healthcare, artificial intelligence, consumer, crypto, enterprise, fintech, games, infrastructure, and American dynamism sectors.",
  "website": "https://www.a16z.com",
  "yearFounded": 2009,
  "tradeAssociation": "National Venture Capital Association (NVCA)",
  "primaryInvestorType": "Venture Capital",
  "otherInvestorTypes": "Accelerator/Incubator",
  "address": {
    "street": "2865 Sand Hill Road, Suite 101",
    "city": "Menlo Park",
    "stateRegion": "CA",
    "postalCode": "94025",
    "country": "United States"
  },
  "linkedinUrl": "https://www.linkedin.com/company/a16z",
  "twitterUrl": "https://twitter.com/a16z",
  "recentInvestments": [
    {
      "companyName": "Sparq",
      "companyProfileUrl": "https://pitchbook.com/profiles/company/1396426-69",
      "dealDate": "21-May-2026",
      "dealType": "Seed Round",
      "industry": "Software Development Applications",
      "companyStage": "Startup",
      "leadPartner": null
    },
    {
      "companyName": "Catena Labs",
      "companyProfileUrl": "https://pitchbook.com/profiles/company/531088-39",
      "dealDate": "20-May-2026",
      "dealType": null,
      "industry": "Other Financial Services",
      "companyStage": "Generating Revenue",
      "leadPartner": null
    }
  ],
  "scrapedAt": "2026-05-24T10:00:00+00:00",
  "error": null
}
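
Because every item carries an error field (null on success, a message on failure), a downstream script can separate good profiles from ones worth retrying. The sketch below is one way to do that in Python; `split_results` is a hypothetical helper, the failure message in the sample data is made up, and it assumes failed items still include investorId, which this page does not confirm.

```python
from typing import Any

Item = dict[str, Any]


def split_results(items: list[Item]) -> tuple[list[Item], list[Item]]:
    """Partition dataset items into (succeeded, failed) using the error field."""
    succeeded: list[Item] = []
    failed: list[Item] = []
    for item in items:
        # error is null (None in Python) on success and a message on failure.
        if item.get("error") is None:
            succeeded.append(item)
        else:
            failed.append(item)
    return succeeded, failed


# Illustrative items only; the failure message below is invented.
items = [
    {"investorId": "41716-90", "name": "Andreessen Horowitz", "error": None},
    {"investorId": "11295-73", "name": None, "error": "request timed out"},
]
ok, bad = split_results(items)
# Assumes failed items keep their investorId so they can be resubmitted.
retry_ids = [item["investorId"] for item in bad]
print(retry_ids)
```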

Output field reference

| Field | Type | Notes |
| --- | --- | --- |
| investorId | string | Numeric ID extracted from the URL |
| profileUrl | string | Full PitchBook profile URL |
| name | string | Firm name |
| logoUrl | string | Absolute URL to the firm's logo image |
| investorType | string | e.g. Venture Capital, Private Equity, Angel |
| status | string | Active or Inactive |
| investorStatus | string | e.g. Actively Seeking New Investments |
| professionalsCount | integer | Number of listed professionals |
| investmentsCount | integer | Total investment count shown on profile |
| portfolioCount | integer | Active portfolio company count |
| exitsCount | integer | Total exit count |
| firmDescription | string | Company bio paragraph |
| website | string | Firm website URL |
| yearFounded | integer | Year the firm was founded |
| tradeAssociation | string | e.g. NVCA |
| primaryInvestorType | string | Main investor classification |
| otherInvestorTypes | string | Additional investor type labels |
| address | object | Street, city, stateRegion, postalCode, country |
| linkedinUrl | string or null | LinkedIn company page URL |
| twitterUrl | string or null | Twitter/X profile URL |
| recentInvestments | array | Up to 10 recent deals (see below) |
| scrapedAt | string | UTC ISO 8601 timestamp |
| error | string or null | null on success, error message on failure |

recentInvestments item fields:

| Field | Type | Notes |
| --- | --- | --- |
| companyName | string | Portfolio company name |
| companyProfileUrl | string or null | Link to the company's PitchBook profile |
| dealDate | string | e.g. 21-May-2026 |
| dealType | string or null | e.g. Seed Round, Series A. null if paywalled |
| industry | string or null | Industry classification |
| companyStage | string or null | e.g. Startup, Generating Revenue, Profitable |
| leadPartner | string or null | null if paywalled or not listed |
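
dealDate arrives as a string such as 21-May-2026, so sorting or filtering by date needs a parsing step. Here is a minimal Python sketch, assuming the day-abbreviated month-year format shown in the example holds for every deal; `parse_deal_date` is a hypothetical helper name.

```python
from datetime import date, datetime
from typing import Optional


def parse_deal_date(raw: Optional[str]) -> Optional[date]:
    """Parse a dealDate like '21-May-2026'; return None if missing or malformed."""
    if not raw:
        return None
    try:
        # %b matches abbreviated month names in the current locale; Python's
        # default "C" locale uses English names (Jan, Feb, ..., Dec).
        return datetime.strptime(raw, "%d-%b-%Y").date()
    except ValueError:
        return None


# Deals trimmed from the example output above.
deals = [
    {"companyName": "Catena Labs", "dealDate": "20-May-2026"},
    {"companyName": "Sparq", "dealDate": "21-May-2026"},
]
# Newest first; deals without a parseable date sort last.
deals.sort(key=lambda d: parse_deal_date(d["dealDate"]) or date.min, reverse=True)
print([d["companyName"] for d in deals])
```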

Use cases

Typical uses include building VC and PE firm lists by sector, keeping CRM records fresh with current descriptions and social links, comparing portfolio sizes across funds, and pulling addresses and contact links for outreach in bulk. In short, the actor fits any task that would otherwise mean clicking through dozens of profiles manually.
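
For the CRM use case, the nested address object usually has to be flattened into columns before import. The following Python sketch shows one way to do it with the standard csv module; the column selection and the helper names `to_row` and `to_csv` are illustrative choices, not something the actor provides.

```python
import csv
import io
from typing import Any

# Top-level fields and address sub-fields to export (an arbitrary selection).
TOP_LEVEL = ["investorId", "name", "investorType", "website", "linkedinUrl", "twitterUrl"]
ADDRESS = ["street", "city", "stateRegion", "postalCode", "country"]
COLUMNS = TOP_LEVEL + ADDRESS


def to_row(item: dict[str, Any]) -> dict[str, Any]:
    """Flatten one dataset item into a single-level dict keyed by COLUMNS."""
    # Guard against a missing or null address.
    address = item.get("address") or {}
    row = {col: item.get(col) for col in TOP_LEVEL}
    row.update({col: address.get(col) for col in ADDRESS})
    return row


def to_csv(items: list[dict[str, Any]]) -> str:
    """Render items as CSV text; None values become empty cells."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS)
    writer.writeheader()
    for item in items:
        writer.writerow(to_row(item))
    return buffer.getvalue()


sample = {
    "investorId": "41716-90",
    "name": "Andreessen Horowitz",
    "investorType": "Venture Capital",
    "website": "https://www.a16z.com",
    "linkedinUrl": "https://www.linkedin.com/company/a16z",
    "twitterUrl": "https://twitter.com/a16z",
    "address": {"street": "2865 Sand Hill Road, Suite 101", "city": "Menlo Park",
                "stateRegion": "CA", "postalCode": "94025", "country": "United States"},
}
print(to_csv([sample]))
```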


Limitations

  • Only data visible on public PitchBook profile pages is collected. Subscription-gated content (deal sizes, fund performance metrics, full team rosters, LP data) is not available.
  • PitchBook shows up to 10 recent investments on the public profile. Full investment history requires a PitchBook account.
  • Some deal types and lead partner names are behind a paywall, so dealType and leadPartner can return null. This is expected.
  • For large batch runs (500+ profiles), increase requestTimeoutSecs to 60 if you see timeout errors.
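
Since maxItems tops out at 1000, a list longer than that has to be split across several runs. The sketch below plans those runs in Python. `plan_runs` is a hypothetical helper, and raising the timeout to 60 up front for batches of 500 or more is a design choice made here; the advice above only suggests raising it once timeout errors appear.

```python
from typing import Any

MAX_ITEMS_LIMIT = 1000      # maximum maxItems per run, from the input table
LARGE_BATCH_THRESHOLD = 500  # "large batch" size mentioned in the limitations


def plan_runs(ids: list[str], batch_size: int = MAX_ITEMS_LIMIT) -> list[dict[str, Any]]:
    """Split ids into run inputs of at most batch_size profiles each."""
    if not 1 <= batch_size <= MAX_ITEMS_LIMIT:
        raise ValueError(f"batch_size must be between 1 and {MAX_ITEMS_LIMIT}")
    runs = []
    for start in range(0, len(ids), batch_size):
        batch = ids[start:start + batch_size]
        # Pre-emptively use the longer timeout for large batches (a choice
        # made in this sketch, not a documented requirement).
        timeout = 60 if len(batch) >= LARGE_BATCH_THRESHOLD else 30
        runs.append({
            "investorIds": batch,
            "maxItems": len(batch),
            "requestTimeoutSecs": timeout,
        })
    return runs


# Placeholder IDs purely to exercise the batching logic.
placeholder_ids = [f"{n}-00" for n in range(1200)]
for run in plan_runs(placeholder_ids):
    print(len(run["investorIds"]), run["requestTimeoutSecs"])
```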

Valid input formats

All of the following are accepted in investorIds:

41716-90
11295-73
https://pitchbook.com/profiles/investor/41716-90
https://pitchbook.com/profiles/investor/11295-73

Mixed formats in one run also work:

{
  "investorIds": [
    "41716-90",
    "11295-73",
    "https://pitchbook.com/profiles/investor/41716-90"
  ]
}

Duplicate IDs (same ID entered as both a raw ID and a full URL) are de-duplicated automatically.
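
The actor handles this for you, and how it does so internally is not documented here. If you want to know the real profile count before a run (for example, to set maxItems), the same idea can be approximated client-side. This Python sketch uses a hypothetical helper, `dedupe_inputs`, that reduces each entry to its last path segment and keeps the first occurrence.

```python
def dedupe_inputs(values: list[str]) -> list[str]:
    """Return unique investor IDs in first-seen order."""
    seen: set[str] = set()
    unique: list[str] = []
    for value in values:
        # A raw ID has no slash, so rsplit leaves it unchanged; for a URL
        # this keeps only the last path segment.
        key = value.strip().rstrip("/").rsplit("/", 1)[-1]
        if key and key not in seen:
            seen.add(key)
            unique.append(key)
    return unique


mixed = [
    "41716-90",
    "11295-73",
    "https://pitchbook.com/profiles/investor/41716-90",
]
print(dedupe_inputs(mixed))
```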