Y Combinator Companies API Scraper
Pricing
Pay per event
Query the YC company directory by batch, industry, or status via the same Algolia index the website uses. Returns name, batch, status, industries, team size, website, one-liner, and YC profile URL as clean structured rows.
Developer: DevilScrapes
Maintained by Community
🎯 What this scrapes
Y Combinator publishes every funded company at ycombinator.com/companies. The directory is backed by a public Algolia search index, the same one the website uses to power its filter UI. This Actor queries that index directly so you can pull the YC company directory by free-text query, batch (e.g. W24, S23), industry tag, or status, and receive one clean dataset row per company.
YC offers no official API, CSV download, or bulk export. That gap is exactly what this Actor fills.
🔥 What we handle for you
- 🛡️ Browser fingerprint rotation: `curl-cffi` impersonates real Chrome / Firefox / Safari TLS handshakes so the target sees a genuine browser, not a Python script.
- 🌐 Residential proxy rotation via Apify Proxy (when enabled in `proxyConfiguration`): fresh session and exit IP on every block or rate-limit response.
- 🔁 Retries with exponential backoff on `408 / 429 / 5xx`: up to 5 attempts per page, `Retry-After` header honoured.
- 🧱 Rate-limit-aware pacing: when the target pushes back, we slow down and surface a clear status message instead of returning an empty dataset.
- 🧊 Clean, typed dataset rows: Pydantic-validated, ISO-8601 timestamps, stable IDs, JSON / CSV / Excel export straight from the Apify Console.
- 💰 Pay-Per-Event pricing: you only pay for results that hit your dataset. No data, no charge (beyond the small warm-up fee).
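The retry behaviour described above (exponential backoff on `408 / 429 / 5xx`, up to 5 attempts, `Retry-After` honoured) can be sketched roughly as follows. This is an illustration, not the Actor's actual code: the function names, the 1-second base delay, and the 30-second cap are assumptions, and only the delta-seconds form of `Retry-After` is handled.

```python
import time
from typing import Callable, Dict, Optional, Tuple

RETRYABLE = {408, 429} | set(range(500, 600))  # statuses named in the list above
MAX_ATTEMPTS = 5


def backoff_delay(attempt: int, retry_after: Optional[str] = None,
                  base: float = 1.0, cap: float = 30.0) -> float:
    """Seconds to wait after failed attempt number `attempt` (1-based).

    `base` and `cap` are assumed values, not documented Actor settings.
    """
    if retry_after is not None and retry_after.strip().isdigit():
        return float(retry_after)  # the server's hint wins over our schedule
    return min(base * 2 ** (attempt - 1), cap)  # 1, 2, 4, 8, ... capped


def fetch_with_retries(
    fetch: Callable[[], Tuple[int, Dict[str, str], bytes]],
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, bytes]:
    """Call `fetch` until it returns a non-retryable status or attempts run out.

    `fetch` is a hypothetical callable returning (status, headers, body).
    """
    for attempt in range(1, MAX_ATTEMPTS + 1):
        status, headers, body = fetch()
        if status not in RETRYABLE:
            return status, body
        if attempt < MAX_ATTEMPTS:
            sleep(backoff_delay(attempt, headers.get("Retry-After")))
    raise RuntimeError(f"gave up after {MAX_ATTEMPTS} attempts (last status {status})")
```

Passing `sleep` as a parameter keeps the loop easy to exercise without real waiting.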
💡 Use cases
- Lead gen: pull every `W24` Developer Tools company with website, team size, and one-liner, then enrich downstream in Clay or Apollo.
- Investor prospecting: surface YC companies in a given vertical sorted by team size to prioritise outreach.
- Hiring: track active YC companies in your city or tech stack.
- Competitive intel: schedule a weekly run, diff against last week, and catch newly-added companies the day they appear.
- Sales ops: pipe filtered YC company lists straight into HubSpot, Attio, or Smartlead as a live lead source.
- Research: build a quarterly batch report (YC W24: companies by industry, location, team size) without hand-copying rows from the directory.
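As a small illustration of the prospecting use case, the sketch below filters exported rows to one industry (and optionally one batch) and sorts them by `team_size`, largest first, with unknown sizes last. The helper name `by_team_size` is hypothetical; the field names come from the output table further down.

```python
from typing import Any, Dict, List, Optional

Row = Dict[str, Any]


def by_team_size(rows: List[Row], industry: str,
                 batch: Optional[str] = None) -> List[Row]:
    """Rows tagged with `industry` (and `batch`, if given), largest team first."""
    matches = [
        r for r in rows
        if industry in (r.get("industries") or [])
        and (batch is None or r.get("batch") == batch)
    ]
    # team_size may be null, so sort those rows to the end.
    return sorted(
        matches,
        key=lambda r: (r.get("team_size") is None, -(r.get("team_size") or 0)),
    )
```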
⚙️ How to use it
- Click Try for free at the top of the page.
- Fill in the input form; most fields have sensible defaults. Start with just a `batch` value like `W24`, and raise `maxResults` above its default of 50 if you want the whole batch.
- Click Start. Output streams into the run's dataset in real time.
- Export from Storage → Dataset as JSON, CSV, or Excel, or pull via the Apify API into your own pipeline.
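For the API route, a minimal sketch using only the Python standard library and Apify's synchronous `run-sync-get-dataset-items` endpoint might look like this. The Actor ID is a placeholder (this page does not state it), the function names are illustrative, and the synchronous endpoint is best suited to short runs.

```python
import json
import urllib.parse
import urllib.request
from typing import Any, Dict, List

API_BASE = "https://api.apify.com/v2"


def build_run_request(actor_id: str, token: str,
                      run_input: Dict[str, Any]) -> urllib.request.Request:
    """Build a POST request that runs the Actor and returns its dataset items.

    `actor_id` uses the `username~actor-name` form; the real ID is not
    given on this page, so callers must supply it.
    """
    path = urllib.parse.quote(actor_id, safe="~")
    url = f"{API_BASE}/acts/{path}/run-sync-get-dataset-items?format=json"
    return urllib.request.Request(
        url,
        data=json.dumps(run_input).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def run_actor(actor_id: str, token: str,
              run_input: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Send the request and decode the JSON array of dataset rows."""
    request = build_run_request(actor_id, token, run_input)
    with urllib.request.urlopen(request, timeout=300) as response:
        return json.load(response)
```

A call such as `run_actor("<username>~<actor-name>", token, {"batch": "W24", "maxResults": 3})` would then return the rows as a list of dictionaries.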
📥 Input
| Field | Type | Required | Default | Notes |
|---|---|---|---|---|
| `searchQuery` | string | no | `''` | Free-text search across company name and one-liner. Leave empty to use filters only. |
| `batch` | string | no | `''` | YC batch slug, e.g. `W24`, `S23`, `W25`. Leave empty for all batches. |
| `industry` | string | no | `''` | Industry tag exactly as YC uses it, e.g. Developer Tools, B2B, Fintech. |
| `status` | string | no | `'any'` | YC status field. Leave on `any` to include all companies regardless of status. |
| `maxResults` | integer | no | `50` | Maximum number of companies to return. |
| `proxyConfiguration` | object | no | `{"useApifyProxy": false}` | Proxy settings. Algolia is served publicly; proxy is optional but available for high-volume runs. |
Example input
{
  "batch": "W24",
  "maxResults": 3,
  "proxyConfiguration": {"useApifyProxy": false}
}
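To make the mapping from input fields to an Algolia query more concrete, here is one plausible translation into Algolia search parameters. `query`, `facetFilters`, and `hitsPerPage` are standard Algolia parameters, but the facet attribute names (`batch`, `industries`, `status`) and the function itself are assumptions; the Actor's internal request is not documented on this page.

```python
from typing import Any, Dict, List


def build_search_params(search_query: str = "", batch: str = "",
                        industry: str = "", status: str = "any",
                        max_results: int = 50) -> Dict[str, Any]:
    """Translate Actor input into a hypothetical Algolia search payload."""
    facet_filters: List[str] = []
    if batch:
        facet_filters.append(f"batch:{batch}")  # assumed attribute name
    if industry:
        facet_filters.append(f"industries:{industry}")  # assumed attribute name
    if status and status != "any":
        facet_filters.append(f"status:{status}")  # assumed attribute name
    return {
        "query": search_query,  # empty string means filters only
        "facetFilters": facet_filters,  # top-level entries are ANDed by Algolia
        "hitsPerPage": max_results,
    }
```

With the example input above, only the `batch:W24` filter would be set and the query string would stay empty.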
📤 Output
Every row is one dataset item. Export as JSON, CSV, or Excel from the Apify Console.
| Field | Type | Notes |
|---|---|---|
| `slug` | string | YC slug, used in the directory URL. |
| `name` | string | Company name. |
| `one_liner` | string \| null | YC-curated one-liner pitch. |
| `long_description` | string \| null | Longer YC description if the Algolia index exposes it. |
| `batch` | string \| null | YC batch, e.g. W24, S23. |
| `status` | string \| null | YC status: Active, Acquired, Inactive, etc. |
| `industries` | array | Industry tags as assigned by YC. |
| `tags` | array | Additional tags from the Algolia index. |
| `regions` | array | Region tags, e.g. United States, San Francisco. |
| `location` | string \| null | Headquarters string. |
| `team_size` | integer \| null | YC-reported team size. |
| `website` | string \| null | Company website URL. |
| `small_logo_url` | string \| null | Logo URL (small). |
| `yc_url` | string | Full YC directory profile URL for this company. |
| `yc_team_link` | string \| null | YC founders page link. |
| `scraped_at` | string | ISO-8601 timestamp of when this row was recorded. |
Example output
{
  "slug": "example-startup",
  "name": "Example",
  "one_liner": "AI-powered example builder.",
  "batch": "W24",
  "status": "Active",
  "industries": ["B2B", "Developer Tools"],
  "regions": ["United States", "San Francisco"],
  "team_size": 4,
  "website": "https://example.com",
  "yc_url": "https://www.ycombinator.com/companies/example-startup",
  "scraped_at": "2026-06-01T12:00:00Z"
}
💰 Pricing
Pay-Per-Event — you pay only when these events fire:
| Event | USD | What it is |
|---|---|---|
| `actor-start` | $0.005 | One-off warm-up charge per run |
| `result` | $0.003 | Per dataset item pushed |
Example: one run returning 1,000 results costs $0.005 + 1,000 × $0.003 = $3.005, roughly $3.00. No subscription, no minimum. Apify gives every new account $5 of free credit, so your first run costs nothing.
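The same arithmetic as a tiny estimator, using the two event prices from the table above. The function name is illustrative, and `Decimal` is used only to avoid floating-point rounding in the totals.

```python
from decimal import Decimal

ACTOR_START = Decimal("0.005")  # one-off charge per run
PER_RESULT = Decimal("0.003")   # charge per dataset item pushed


def estimate_cost(results: int, runs: int = 1) -> Decimal:
    """Estimated USD cost for `results` rows spread over `runs` runs."""
    return ACTOR_START * runs + PER_RESULT * results


print(estimate_cost(1000))  # one run, 1,000 rows
print(estimate_cost(50))    # one run at the default maxResults
```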
🚧 Limitations
- Algolia search ranks results — the order reflects YC's internal relevance scoring, not alphabetical or chronological.
- Founder names, bios, and full team rosters live on individual profile pages, not in the directory index. Use a follow-up Actor to fetch individual profiles if needed.
- Revenue figures, funding amounts, and valuation data are not published in the YC directory and are not in scope.
- YC occasionally refreshes its Algolia configuration. We run weekly cloud QA; if the index changes we ship a fix within 24 hours.
❓ FAQ
Is there an official Y Combinator companies API?
No. YC has never published an official API or export endpoint. The directory is a public Algolia index used by the ycombinator.com website; this Actor queries it and hands you the data as a structured dataset, which fills the gap left by the missing official API.
Can I get a YC companies CSV or Excel export?
Yes. Once the run finishes, open Storage → Dataset in the Apify Console and click Export → CSV or Export → Excel. The export is one click and needs no post-processing.
Is this a complete YC directory export?
The Actor returns everything that Algolia exposes for the query you provide. Set `maxResults` high enough (or remove the cap) to pull every company matching a given batch or industry filter.
Does this need a login or YC account?
No. The directory is fully public. We query the same Algolia endpoint the ycombinator.com website uses — no credentials required.
Are founder names or emails included?
Not from the directory index. YC's Algolia data covers company metadata only — name, batch, industry, team size, website, one-liner. Founder contact information is not in the directory and is not returned.
Why are some fields null?
YC does not fill every field for every company. We surface null, never fabricate.
Can I filter by city?
Use the regions field for post-filtering. The Algolia index uses region tags (e.g. San Francisco, New York) rather than free-text city search, so filtering in your spreadsheet or downstream tool is the most reliable approach.
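If you post-filter in code rather than in a spreadsheet, a region match over exported rows could look like the sketch below. The helper name `in_region` is hypothetical, and the case-insensitive comparison is a convenience choice rather than documented behaviour.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]


def in_region(rows: List[Row], region: str) -> List[Row]:
    """Keep rows whose `regions` tags include `region`, ignoring case."""
    wanted = region.casefold()
    return [
        row for row in rows
        if any(tag.casefold() == wanted for tag in (row.get("regions") or []))
    ]
```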
How do I track new YC companies added each week?
Schedule the Actor on a weekly cadence via Apify Schedules with no batch filter and a `maxResults` high enough to cover the whole directory. Diff the output against your previous run to surface newly-added companies. Match rows on `slug`; the `scraped_at` field tells you which run each row came from.
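A weekly diff can be as simple as comparing slugs between two exported runs. The sketch below assumes `slug` is a stable key for a company (the output table describes it as the directory URL slug); the function name is illustrative.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]


def newly_added(previous: List[Row], current: List[Row]) -> List[Row]:
    """Rows in `current` whose slug did not appear in `previous`.

    Duplicate slugs within `current` are collapsed to their first occurrence.
    """
    seen = {row["slug"] for row in previous}
    fresh: Dict[str, Row] = {}
    for row in current:
        if row["slug"] not in seen:
            fresh.setdefault(row["slug"], row)
    return list(fresh.values())
```

Feeding it last week's and this week's dataset exports yields only the companies that appeared in between.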
💬 Your feedback
Spotted a bug, hit a weird edge case, or need a new field? Open an issue on the Actor's Issues tab in the Apify Console — we ship fixes weekly and read every report.