Sequoia Capital Portfolio Scraper

Pricing

from $750.00 / 1,000 portfolio companies

The only structured Sequoia + Peak XV portfolio feed (~710 companies).

Rating

0.0

(0)

Developer

Stephan Corbeil

Maintained by Community

Actor stats

0

Bookmarked

2

Total users

1

Monthly active users

3 days ago

Last modified

The only structured feed of the full Sequoia + Peak XV portfolio. Returns ~710 companies across Sequoia Capital (US, ~406) and Peak XV Partners (former Sequoia India / Southeast Asia, ~304). One dataset row per portfolio company, with the fields VC sourcing analysts actually look up: sector, partnered year, founded year, current status (Active / Acquired / Public), partner name, founders, website, twitter, linkedin, geography.

Why this exists (and why no one else has it)

Sequoia's public site at https://www.sequoiacap.com/our-companies/ shows only ~52 curated highlights out of a 400+ company portfolio. The Peak XV site is similarly limited. Neither firm offers a public portfolio export: no CSV, no API, no downloadable list.

This scraper consumes the canonical XML sitemap (/company-sitemap.xml) that Sequoia uses to feed search engines, then visits each /companies/{slug}/ detail page in parallel. Result: a clean, structured, complete portfolio you can import into a CRM, an outbound sourcing pipeline, or a CB Insights / PitchBook replacement.
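The sitemap step described above can be sketched roughly as follows. This is a minimal illustration, not the actor's actual code: it assumes the sitemap is a standard `urlset` document in the sitemaps.org namespace, and the function name `parse_company_sitemap`, the sample XML, and the `example-co` slug are all hypothetical.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Standard sitemaps.org namespace (assumed to be what the sitemap uses).
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def parse_company_sitemap(xml_text):
    """Return one dict per /companies/{slug}/ URL found in a sitemap."""
    root = ET.fromstring(xml_text)
    entries = []
    for url_node in root.findall("sm:url", SITEMAP_NS):
        loc = url_node.findtext("sm:loc", default="", namespaces=SITEMAP_NS).strip()
        lastmod = url_node.findtext("sm:lastmod", default=None, namespaces=SITEMAP_NS)
        # Keep only detail pages shaped like /companies/{slug}/.
        parts = [p for p in urlparse(loc).path.split("/") if p]
        if len(parts) == 2 and parts[0] == "companies":
            entries.append(
                {"slug": parts[1], "url": loc, "source_sitemap_lastmod": lastmod}
            )
    return entries


# Hypothetical sample, not real sitemap content.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://www.sequoiacap.com/companies/example-co/</loc>
    <lastmod>2024-01-15T00:00:00+00:00</lastmod>
  </url>
  <url><loc>https://www.sequoiacap.com/our-companies/</loc></url>
</urlset>"""

entries = parse_company_sitemap(SAMPLE)
```

Each returned entry carries the slug, the detail-page URL, and the sitemap's `lastmod` value, which would then feed the per-company detail fetch.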

Premium positioning: $0.75 per company. That is roughly 5–10× the price of entry-tier VC scrapers, because (a) Sequoia is a top-three brand, (b) the data is genuinely scarce, and (c) an analyst recoups the cost the first time this feed replaces a lookup in a $5K/seat database subscription.

What you get

Per portfolio company:

  • name — company name
  • slug — URL-safe identifier (matches the Sequoia detail page slug)
  • url — Sequoia / Peak XV detail page URL
  • description — company one-liner
  • sector / sectors[] — Sequoia's own sector taxonomy (Consumer, Enterprise, Healthcare, Financial Services, GTM, Bio, Crypto, AI, Infrastructure, FinTech, HealthTech, EdTech, D2C, etc.)
  • founded_year — year the company was founded
  • partnered_year / year_invested — year Sequoia first invested
  • acquired_year — year of acquisition (if exited)
  • ipo_year — year listed publicly (if IPO'd)
  • status — derived label: Active (Private) / Acquired (YYYY) / Public/IPO (YYYY)
  • partner / partners[] — Sequoia investment partner(s) on the deal (Bryan Schreier, Andrew Reed, Jim Goetz, Shailendra Singh, etc.)
  • founders[] — founder names (when surfaced on the detail page)
  • website — company website
  • twitter_url, linkedin_url — official social profiles
  • geography — US (Sequoia Capital), India, SEA, or India/SEA (Peak XV)
  • source_firm — Sequoia Capital or Peak XV (Sequoia India/SEA)
  • source_sitemap_lastmod — when Sequoia last updated this company's page
  • scraped_at — UTC ISO timestamp of this run
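The status field is described as a label derived from the exit-year fields. One way such a derivation could look is sketched below. The function name `derive_status` is hypothetical, and the rule that an IPO year takes precedence when both years are present is an assumption; the actor's real precedence is not documented here.

```python
def derive_status(acquired_year=None, ipo_year=None):
    """Build a status label from the exit-year fields of one dataset row."""
    # Assumption: if both years exist, the IPO label wins.
    if ipo_year is not None:
        return f"Public/IPO ({ipo_year})"
    if acquired_year is not None:
        return f"Acquired ({acquired_year})"
    # No exit year recorded: treat the company as still private.
    return "Active (Private)"
```

Because the label embeds the year, a substring match on "Acquired" or "Public" still works against it.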

Input

| Field | Type | Description |
| --- | --- | --- |
| geographyFilter | enum | Global (default) / US / India / SEA. Picks which sitemap to crawl. |
| sectorFilter | string[] | Optional substring filter on sector pills (e.g. ["FinTech", "AI"]). |
| stageFilter | string[] | Optional substring filter on derived status (e.g. ["Acquired"] for M&A leads, ["Public"] for IPOs). |
| yearFromTo | object | Inclusive bounds on partnered_year, e.g. {"from": 2020, "to": 2026} for recent vintage. |
| maxResults | integer | Cap on dataset rows (1–1000, default 100). |
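To make the filter semantics concrete, here is a rough sketch of how the input fields could be applied to scraped rows. The helper names `matches_filters` and `apply_filters` are hypothetical, and case-sensitive substring matching is an assumption (chosen so that "AI" does not match "Retail"). The actor may behave differently.

```python
def matches_filters(row, sector_filter=None, stage_filter=None, year_from_to=None):
    """Return True if a dataset row passes all supplied filters."""
    sectors = row.get("sectors") or []
    if sector_filter and not any(f in s for f in sector_filter for s in sectors):
        return False
    status = row.get("status") or ""
    if stage_filter and not any(f in status for f in stage_filter):
        return False
    if year_from_to:
        year = row.get("partnered_year")
        if year is None:
            return False  # assumption: rows without a year fail a year filter
        low, high = year_from_to.get("from"), year_from_to.get("to")
        if low is not None and year < low:
            return False
        if high is not None and year > high:
            return False
    return True


def apply_filters(rows, run_input):
    """Filter rows using the input fields, stopping at maxResults."""
    limit = run_input.get("maxResults", 100)
    kept = []
    for row in rows:
        if matches_filters(
            row,
            run_input.get("sectorFilter"),
            run_input.get("stageFilter"),
            run_input.get("yearFromTo"),
        ):
            kept.append(row)
            if len(kept) >= limit:
                break
    return kept
```

For example, an input of `{"sectorFilter": ["AI"], "yearFromTo": {"from": 2020, "to": 2026}}` would keep only rows with an AI sector pill and a partnered_year in that inclusive range.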

Use cases

  • VC sourcing & competitive intel. Pull the full Sequoia + Peak XV roster, dedupe against your CRM, find overlap with YC, a16z, Founders Fund, Bessemer, Greylock, Lightspeed.
  • M&A target lists. Filter stageFilter=["Acquired"] to study Sequoia's exit playbook, or stageFilter=["Active"] with a vintage window to find growth-stage targets approaching their exit window.
  • Founder enrichment. Cross-reference founders[] and partners[] with LinkedIn / Crunchbase enrichment actors to build a deal-team map.
  • BD prospecting. Sequoia-backed companies are pre-qualified buyers for many enterprise SaaS categories. Filter by sector, send a sequence.
  • Market mapping. Group by sector + partnered_year to see where Sequoia leaned in each vintage.
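As an illustration of the market-mapping use case, the sketch below counts companies per (partnered_year, sector) pair from dataset rows. It assumes rows are plain dicts with the `sectors` and `partnered_year` fields listed earlier; the function name `sector_by_vintage` is hypothetical.

```python
from collections import Counter


def sector_by_vintage(rows):
    """Count companies per (partnered_year, sector) pair."""
    counts = Counter()
    for row in rows:
        year = row.get("partnered_year")
        if year is None:
            continue  # skip rows with no recorded investment year
        for sector in row.get("sectors") or []:
            counts[(year, sector)] += 1
    return counts
```

A company tagged with several sectors is counted once under each, so totals can exceed the number of companies.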

Companion actors

  • YC Companies Directory — Y Combinator alumni; large overlap with Sequoia's early-stage portfolio (Stripe, Airbnb, and DoorDash were all backed by both YC and Sequoia).
  • a16z Portfolio Scraper — direct competitor portfolio for cross-firm comparison.
  • Startup Funding Tracker — recent rounds + valuations to enrich Sequoia-backed companies.
  • 500 Global Companies Directory — accelerator alumni at the early-stage entry point.

Combine any of these to build the most complete sourcing pipeline on the market.

Pricing

Pay-per-event:

  • Per company — $0.75 (primary event, charged on every dataset row)
  • Actor start — $0.00005 (negligible)

You only pay for the rows you actually receive. Filtering by sector / status / year does not cost extra.
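The arithmetic for a run can be estimated from the two event prices above. This is a small sketch that assumes exactly one actor-start event per run and covers only the listed events; the function name `estimate_run_cost` is hypothetical.

```python
from decimal import Decimal

PER_COMPANY = Decimal("0.75")     # price per dataset row
ACTOR_START = Decimal("0.00005")  # assumed to be charged once per run


def estimate_run_cost(row_count):
    """Estimate the pay-per-event charge in USD for one run."""
    return PER_COMPANY * row_count + ACTOR_START


default_run = estimate_run_cost(100)  # default maxResults
full_crawl = estimate_run_cost(710)   # approximate full Global crawl
```

With the default maxResults of 100 this comes to about $75, and a full ~710-company crawl to about $532.50.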

Technical notes

  • Static HTML parsing. No JS rendering, no headless browser.
  • Apify residential proxy used by default for IP rotation.
  • Concurrent fetch (8-way) — full Global crawl (~710 companies) completes in ~3–5 minutes.
  • Detail-page selectors handle both the Sequoia clist__item markup and the Peak XV company__milestone markup.
  • Robust to either site reordering the milestones / partners / categories blocks.
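The 8-way concurrent fetch noted above could be approximated with a bounded thread pool, as in this sketch. It is not the actor's implementation: `fetch_all` is a hypothetical helper, the HTTP call is passed in as a `fetch` callable so no particular client library is assumed, and mapping failed pages to None is an illustrative choice.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_all(urls, fetch, max_workers=8):
    """Fetch every URL with at most max_workers requests in flight."""

    def safe_fetch(url):
        # One failing page should not abort the whole crawl.
        try:
            return url, fetch(url)
        except Exception:
            return url, None

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in input order.
        return dict(pool.map(safe_fetch, urls))
```

Passing the fetch function in keeps the concurrency logic separate from proxy and retry concerns, which can live inside the callable.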

Limitations

  • Sequoia China (HongShan) is not in scope. Sequoia US, Sequoia China, and Peak XV (India/SEA) split into three independent firms in 2023; HongShan runs separate infrastructure that does not expose a comparable public sitemap. This actor covers Sequoia US + Peak XV only.
  • The original stage-invested label (Series A vs B vs C at time of investment) is not surfaced on Sequoia detail pages and therefore not in the output. We expose partnered_year (first-investment year) and status (current state) instead — sufficient for almost all sourcing use cases.
  • For companies that were acquired and later spun back out, acquired_year reflects what Sequoia shows on the page, which may lag real-world events by 0–6 months.