Y Combinator Companies API Scraper

Pricing

Pay per event

Query the YC company directory by batch, industry, or status via the same Algolia index the website uses. Returns name, batch, status, industries, team size, website, one-liner, and YC profile URL as clean structured rows.

Developer

DevilScrapes

Maintained by Community


🎯 What this scrapes

Y Combinator publishes every funded company at ycombinator.com/companies. The directory is backed by a public Algolia search index, the same one the website uses to power its filter UI. This Actor queries that index directly, so you can pull the YC companies database by free-text query, batch (e.g. W24, S23), industry tag, or status, and receive one clean dataset row per company.

No official API, no CSV download, and no bulk export exists from YC. That gap is exactly what this Actor fills.

🔥 What we handle for you

  • 🛡️ Browser fingerprint rotation: curl-cffi impersonates real Chrome / Firefox / Safari TLS handshakes so the target sees a genuine browser, not a Python script.
  • 🌐 Residential proxy rotation via Apify Proxy (optional, off by default): fresh session and exit IP on every block or rate-limit response.
  • 🔁 Retries with exponential backoff on 408 / 429 / 5xx: up to 5 attempts per page, Retry-After header honoured.
  • 🧱 Rate-limit-aware pacing: when the target pushes back, we slow down and surface a clear status message instead of returning an empty dataset.
  • 🧊 Clean, typed dataset rows: Pydantic-validated, ISO-8601 timestamps, stable IDs, JSON / CSV / Excel export straight from the Apify Console.
  • 💰 Pay-Per-Event pricing: you only pay for results that hit your dataset. No data, no charge (beyond the small warm-up fee).
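
The retry behaviour in the list above (exponential backoff on 408 / 429 / 5xx, up to 5 attempts, Retry-After honoured) can be illustrated with a minimal sketch. This is not the Actor's actual code: the function names, the 1-second base delay, the 60-second cap, and the jitter strategy are all assumptions made for illustration.

```python
import random
from typing import Optional

MAX_ATTEMPTS = 5          # from the feature list; everything else is assumed
RETRYABLE = {408, 429}    # plus any 5xx status, see should_retry()


def should_retry(status: int) -> bool:
    """Return True for the status codes the feature list names as retryable."""
    return status in RETRYABLE or 500 <= status <= 599


def backoff_delay(
    attempt: int,
    retry_after: Optional[str] = None,
    base: float = 1.0,      # assumed base delay in seconds
    cap: float = 60.0,      # assumed upper bound in seconds
    jitter: bool = True,
) -> float:
    """Seconds to wait before retry number `attempt` (1-based).

    A numeric Retry-After header wins over the computed delay. The
    HTTP-date form of Retry-After is not handled in this sketch.
    """
    if retry_after is not None:
        try:
            return max(0.0, float(retry_after))
        except ValueError:
            pass  # fall through to exponential backoff
    delay = min(cap, base * 2 ** (attempt - 1))
    if jitter:
        # "Full jitter": pick a random delay up to the exponential bound.
        delay = random.uniform(0, delay)
    return delay
```

With jitter disabled, attempts 1 to 5 wait 1, 2, 4, 8, and 16 seconds under these assumed defaults.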

💡 Use cases

  • Lead gen — pull every W24 Developer Tools company with website, team size, and one-liner, then enrich downstream in Clay or Apollo.
  • Investor prospecting — surface YC companies in a given vertical sorted by team size to prioritise outreach.
  • Hiring — track active YC companies in your city or tech stack.
  • Competitive intel — schedule a weekly run, diff against last week, and catch newly-added companies the day they appear.
  • Sales ops — pipe filtered YC company lists straight into HubSpot, Attio, or Smartlead as a live lead source.
  • Research — build a quarterly batch report (YC W24: companies by industry, location, team size) without hand-copying rows from the directory.

⚙️ How to use it

  1. Click Try for free at the top of the page.
  2. Fill in the input form — most fields have sensible defaults. Start with just a batch value like W24 to pull a full batch.
  3. Click Start. Output streams into the run's dataset in real time.
  4. Export from Storage → Dataset as JSON, CSV, or Excel — or pull via the Apify API into your own pipeline.
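
For step 4, pulling the dataset through the Apify API can look roughly like the sketch below. It uses the public dataset-items endpoint of the Apify API v2; the helper names `dataset_items_url` and `fetch_items` are hypothetical, and the dataset ID and token are values you take from your own run and account.

```python
import json
import urllib.parse
import urllib.request
from typing import Optional

API_BASE = "https://api.apify.com/v2"


def dataset_items_url(dataset_id: str, fmt: str = "json",
                      limit: Optional[int] = None) -> str:
    """Build the URL for a dataset's items (hypothetical helper)."""
    params = {"format": fmt, "clean": "true"}
    if limit is not None:
        params["limit"] = str(limit)
    dataset = urllib.parse.quote(dataset_id, safe="")
    return f"{API_BASE}/datasets/{dataset}/items?{urllib.parse.urlencode(params)}"


def fetch_items(dataset_id: str, token: str) -> list:
    """Download all items of a finished run's dataset as parsed JSON."""
    request = urllib.request.Request(
        dataset_items_url(dataset_id),
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

Calling `fetch_items("<your dataset id>", "<your API token>")` returns a list of dicts shaped like the rows in the Output section.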

📥 Input

Field               Type     Required  Default                   Notes
searchQuery         string   no        ''                        Free-text search across company name and one-liner. Leave empty to use filters only.
batch               string   no        ''                        YC batch slug, e.g. W24, S23, W25. Leave empty for all batches.
industry            string   no        ''                        Industry tag exactly as YC uses it, e.g. Developer Tools, B2B, Fintech.
status              string   no        'any'                     YC status field. Leave on any to include all companies regardless of status.
maxResults          integer  no        50                        Maximum number of companies to return.
proxyConfiguration  object   no        {"useApifyProxy": false}  Proxy settings. Algolia is served publicly; proxy is optional but available for high-volume runs.

Example input

{
  "batch": "W24",
  "maxResults": 3,
  "proxyConfiguration": {
    "useApifyProxy": false
  }
}
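
To make the input fields more concrete, here is one plausible way such an input could translate into Algolia search parameters. `query`, `facetFilters`, `hitsPerPage`, and `page` are standard Algolia search parameters, but the facet attribute names (`batch`, `industries`, `status`), the page size, and the function itself are assumptions; the Actor's real mapping is not documented here.

```python
def build_search_params(actor_input: dict, page: int = 0,
                        page_size: int = 100) -> dict:
    """Map Actor input to Algolia-style search parameters (illustrative only)."""
    facet_filters = []
    if actor_input.get("batch"):
        # Assumed facet attribute name: "batch"
        facet_filters.append(f"batch:{actor_input['batch']}")
    if actor_input.get("industry"):
        # Assumed facet attribute name: "industries"
        facet_filters.append(f"industries:{actor_input['industry']}")
    status = actor_input.get("status", "any")
    if status and status != "any":
        # Assumed facet attribute name: "status"; 'any' means no filter
        facet_filters.append(f"status:{status}")

    max_results = actor_input.get("maxResults", 50)  # documented default
    return {
        "query": actor_input.get("searchQuery", ""),
        "facetFilters": facet_filters,
        "hitsPerPage": min(page_size, max_results),
        "page": page,
    }
```

For the example input above, this sketch yields an empty query, a single `batch:W24` filter, and a page size of 3.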

📤 Output

Every row is one dataset item. Export as JSON, CSV, or Excel from the Apify Console.

Field             Type            Notes
slug              string          YC slug, used in the directory URL.
name              string          Company name.
one_liner         string | null   YC-curated one-liner pitch.
long_description  string | null   Longer YC description if the Algolia index exposes it.
batch             string | null   YC batch, e.g. W24, S23.
status            string | null   YC status: Active, Acquired, Inactive, etc.
industries        array           Industry tags as assigned by YC.
tags              array           Additional tags from the Algolia index.
regions           array           Region tags, e.g. United States, San Francisco.
location          string | null   Headquarters string.
team_size         integer | null  YC-reported team size.
website           string | null   Company website URL.
small_logo_url    string | null   Logo URL (small).
yc_url            string          Full YC directory profile URL for this company.
yc_team_link      string | null   YC founders page link.
scraped_at        string          ISO-8601 timestamp of when this row was recorded.

Example output

{
  "slug": "example-startup",
  "name": "Example",
  "one_liner": "AI-powered example builder.",
  "batch": "W24",
  "status": "Active",
  "industries": ["B2B", "Developer Tools"],
  "regions": ["United States", "San Francisco"],
  "team_size": 4,
  "website": "https://example.com",
  "yc_url": "https://www.ycombinator.com/companies/example-startup",
  "scraped_at": "2026-06-01T12:00:00Z"
}
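
If you consume rows in Python, a small typed wrapper helps catch missing or misnamed fields early. The sketch below mirrors the output table with a standard-library dataclass; the class name `CompanyRow` and the helper `parse_row` are hypothetical and are not part of the Actor.

```python
from dataclasses import dataclass, field, fields
from datetime import datetime
from typing import List, Optional


@dataclass
class CompanyRow:
    # Fields the output table lists as plain strings (never null)
    slug: str
    name: str
    yc_url: str
    scraped_at: datetime
    # Nullable fields default to None when absent from a row
    one_liner: Optional[str] = None
    long_description: Optional[str] = None
    batch: Optional[str] = None
    status: Optional[str] = None
    location: Optional[str] = None
    team_size: Optional[int] = None
    website: Optional[str] = None
    small_logo_url: Optional[str] = None
    yc_team_link: Optional[str] = None
    # Array fields default to empty lists
    industries: List[str] = field(default_factory=list)
    tags: List[str] = field(default_factory=list)
    regions: List[str] = field(default_factory=list)


def parse_row(raw: dict) -> CompanyRow:
    """Convert one dataset item into a CompanyRow, ignoring unknown keys."""
    known = {f.name for f in fields(CompanyRow)}
    data = {key: value for key, value in raw.items() if key in known}
    # Replace the trailing "Z" so older Python versions can parse the timestamp.
    data["scraped_at"] = datetime.fromisoformat(
        raw["scraped_at"].replace("Z", "+00:00")
    )
    return CompanyRow(**data)
```

Passing the example output above to `parse_row` gives a row whose `team_size` is 4 and whose `location` is `None`, because that key is absent from the example.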

💰 Pricing

Pay-Per-Event — you pay only when these events fire:

Event        USD     What it is
actor-start  $0.005  One-off warm-up charge per run
result       $0.003  Per dataset item pushed

Example: 1 000 results at the rates above ≈ $3.00. No subscription, no minimum — Apify gives every new account $5 of free credit, so your first run costs nothing.
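
The arithmetic behind that estimate is simple enough to script when budgeting several runs. This sketch hard-codes the two rates from the table above; the function name `estimate_cost` is hypothetical, and the rates may change on the Store page.

```python
from decimal import Decimal

ACTOR_START = Decimal("0.005")  # per run, from the pricing table
PER_RESULT = Decimal("0.003")   # per dataset item, from the pricing table


def estimate_cost(results: int, runs: int = 1) -> Decimal:
    """Estimated USD cost for `results` items spread over `runs` runs."""
    return ACTOR_START * runs + PER_RESULT * results
```

For a single run returning 1,000 results this gives 3.005, i.e. the roughly $3.00 quoted above plus the half-cent start charge.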

🚧 Limitations

  • Algolia search ranks results — the order reflects YC's internal relevance scoring, not alphabetical or chronological.
  • Founder names, bios, and full team rosters live on individual profile pages, not in the directory index. Use a follow-up Actor to fetch individual profiles if needed.
  • Revenue figures, funding amounts, and valuation data are not published in the YC directory and are not in scope.
  • YC occasionally refreshes its Algolia configuration. We run weekly cloud QA; if the index changes we ship a fix within 24 hours.
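
Because results arrive in relevance order, any other ordering has to be applied after export. A minimal sketch for sorting exported rows by team size follows; the function name is hypothetical, and rows with a null `team_size` are placed last by choice.

```python
from typing import Dict, List


def sort_by_team_size(rows: List[Dict], descending: bool = True) -> List[Dict]:
    """Sort rows by team_size, keeping rows with a null team_size at the end."""
    known = [row for row in rows if row.get("team_size") is not None]
    unknown = [row for row in rows if row.get("team_size") is None]
    known.sort(key=lambda row: row["team_size"], reverse=descending)
    return known + unknown
```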

❓ FAQ

Is there an official Y Combinator companies API?

No. YC has never published an official API or export endpoint. The directory is a public Algolia index used by the ycombinator.com website; this Actor queries it and hands you the data in a structured dataset. That is the gap this Actor fills.

Can I get a YC companies CSV or Excel export?

Yes. Once the run finishes, open Storage → Dataset in the Apify Console and click Export → CSV or Export → Excel. The export is one click, with no post-processing needed.

Is this a complete YC directory export?

The Actor returns everything that Algolia exposes for the query you provide. Set maxResults high enough (or remove the cap) to pull the full YC companies database for a given batch or industry filter.

Does this need a login or YC account?

No. The directory is fully public. We query the same Algolia endpoint the ycombinator.com website uses — no credentials required.

Are founder names or emails included?

Not from the directory index. YC's Algolia data covers company metadata only — name, batch, industry, team size, website, one-liner. Founder contact information is not in the directory and is not returned.

Why are some fields null?

YC does not fill every field for every company. We surface null, never fabricate.

Can I filter by city?

Use the regions field for post-filtering. The Algolia index uses region tags (e.g. San Francisco, New York) rather than free-text city search, so filtering in your spreadsheet or downstream tool is the most reliable approach.
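
If you post-filter in Python rather than a spreadsheet, the region check is a one-liner over the `regions` array. The helper name below is hypothetical, and the match is case-insensitive but otherwise exact against each region tag.

```python
from typing import Dict, List


def filter_by_region(rows: List[Dict], region: str) -> List[Dict]:
    """Keep rows whose regions array contains `region` (case-insensitive)."""
    wanted = region.casefold()
    return [
        row for row in rows
        if any(wanted == tag.casefold() for tag in (row.get("regions") or []))
    ]
```

For example, `filter_by_region(rows, "san francisco")` keeps the example output row shown earlier, since its `regions` array contains `San Francisco`.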

How do I track new YC companies added each week?

Schedule the Actor on a weekly cadence via Apify Schedules with no batch filter. Diff the output against your previous run to surface newly added companies. Match rows on slug, the identifier used in the directory URL; scraped_at changes on every run and records when each row was captured.
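
The week-over-week diff can be sketched as a set comparison on `slug`. This assumes that a company's slug does not change between runs; the function name is hypothetical.

```python
from typing import Dict, List


def new_companies(previous: List[Dict], current: List[Dict]) -> List[Dict]:
    """Return rows from `current` whose slug was absent from `previous`."""
    seen = {row["slug"] for row in previous}
    return [row for row in current if row["slug"] not in seen]
```

Feed it last week's and this week's exported rows; the result is the list of companies that appeared in between.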

💬 Your feedback

Spotted a bug, hit a weird edge case, or need a new field? Open an issue on the Actor's Issues tab in the Apify Console — we ship fixes weekly and read every report.