Pricing

$4.49 / 1,000 companies

Y Combinator Scraper With Emails | $4.5 / 1K

Scrape the Y Combinator directory and get rich company profiles with socials, founder details + emails, hiring status/job links, and news mentions. Perfect for lead gen, market mapping, recruiting, and competitor tracking.

Rating

4.3

(4)

Developer

Fatih Tahta

Actor stats

Bookmarked

111

Total users

Monthly active users

7 days ago

Last modified

Y Combinator Directory Scraper with Founder Emails

Slug: fatihtahta/y-combinator-directory-scraper

Overview

This actor collects structured Y Combinator company profiles, founder details (including emails), and related public context such as news and jobs when present.

This actor is built upon the ValidatedMails.com architecture for email enrichment workflows.

It captures key company attributes like name, industry, batch, team size, locations, and status, along with social links and founder roles. The data is sourced from https://www.ycombinator.com/companies, a widely used directory for tracking YC-backed companies and market activity. Results are delivered as consistent JSON records, enabling repeatable analysis and reliable downstream use. The automation saves manual research time while keeping outputs structured and easy to process.

Why Use This Actor

Market research / analytics: Build datasets of YC companies to analyze batches, industries, locations, and growth patterns.
Product & content teams: Discover companies by keyword or category to inform content planning and product positioning.
Developers / data engineering pipelines: Feed structured company records into analytics tools, warehouses, or internal directories.
Lead gen / enrichment: Identify founders and company context to enrich outreach and qualification workflows.
Monitoring / competitive tracking: Track changes in hiring status, public/private status, or category coverage over time.

Input Parameters

Provide any combination of URLs, queries, and filters…

Parameter	Type	Description	Default
`topCompanies`	`boolean`	When enabled, return only YC’s top companies.	–
`isHiring`	`boolean`	When enabled, return only companies actively hiring.	–
`nonprofit`	`boolean`	When enabled, return only companies marked as nonprofit.	–
`queries`	`string[]`	Keywords to discover companies (e.g., product, market, or location). Use when you don’t already have URLs.	`["AI Assistant"]`
`batches`	`string[]`	Filter by YC batch. Allowed values: `All Batches`, `Spring 2026`, `Winter 2026`, `Fall 2025`, `Summer 2025`, `Spring 2025`, `Winter 2025`, `Fall 2024`, `Summer 2024`, `Winter 2024`, `Summer 2023`, `Winter 2023`, `Summer 2022`, `Winter 2022`, `Summer 2021`, `Winter 2021`, `Summer 2020`, `Winter 2020`, `Summer 2019`, `Winter 2019`, `Summer 2018`, `Winter 2018`, `Summer 2017`, `Winter 2017`, `Summer 2016`, `Winter 2016`, `Summer 2015`, `Winter 2015`, `Summer 2014`, `Winter 2014`, `Summer 2013`, `Winter 2013`, `Summer 2012`, `Winter 2012`, `Summer 2011`, `Winter 2011`, `Summer 2010`, `Winter 2010`, `Summer 2009`, `Winter 2009`, `Summer 2008`, `Winter 2008`, `Summer 2007`, `Winter 2007`, `Summer 2006`, `Winter 2006`, `Summer 2005`.	`["All Batches"]`
`industries`	`string[]`	Filter by industry. Allowed values: `All industries`, `B2B`, `Consumer`, `Fintech`, `Healthcare`, `Education`, `Industrials`, `Real Estate and Construction`, `Government`, `Unspecified`.	`["All industries"]`
`regions`	`string[]`	Filter by company region. Allowed values: `Anywhere`, `America / Canada`, `Remote`, `Europe`, `South Asia`, `Latin America`, `Southeast Asia`, `Africa`, `Middle East and North Africa`, `East Asia`, `Oceania`.	`["Anywhere"]`
`minEmployeeSize`	`string`	Minimum company size. Allowed values: `1+`, `5+`, `10+`, `25+`, `50+`, `100+`, `250+`, `500+`, `1000+`.	`"1+"`
`maxEmployeeSize`	`string`	Maximum company size. Allowed values: `1+`, `5`, `10`, `25`, `50`, `100`, `250`, `500`, `1000+`.	`"1000+"`
`limit`	`integer`	Maximum companies to save per query. Minimum: `10`.	`50000`
`getEmails`	`boolean`	When enabled, include founder email permutations where available.	`false`
`includeRiskyEmails`	`boolean`	When enabled, also include lower-confidence email candidates. Each returned email is labeled `verified` or `risky`.	`true`
`proxyConfiguration`	`object`	Optional connection settings to improve reliability on larger runs.	`{ "useApifyProxy": true, "apifyProxyGroups": [] }`

Example Input

{
  "queries": ["climate", "fintech"],
  "batches": ["Summer 2024"],
  "industries": ["Fintech"],
  "regions": ["America / Canada"],
  "minEmployeeSize": "10+",
  "getEmails": true,
  "includeRiskyEmails": false,
  "limit": 250
}
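Before starting a run, it can help to check an input against the allowed values listed in the table above. The following is a minimal sketch, not part of the actor: `validate_run_input` and the constant names are hypothetical helpers, and only a few parameters (`industries`, `regions`, `minEmployeeSize`, `limit`) are checked.

```python
from typing import Any

# Allowed values copied from the Input Parameters table.
ALLOWED_INDUSTRIES = {
    "All industries", "B2B", "Consumer", "Fintech", "Healthcare", "Education",
    "Industrials", "Real Estate and Construction", "Government", "Unspecified",
}
ALLOWED_REGIONS = {
    "Anywhere", "America / Canada", "Remote", "Europe", "South Asia",
    "Latin America", "Southeast Asia", "Africa",
    "Middle East and North Africa", "East Asia", "Oceania",
}
ALLOWED_MIN_SIZES = {"1+", "5+", "10+", "25+", "50+", "100+", "250+", "500+", "1000+"}
MIN_LIMIT = 10  # documented minimum for `limit`


def validate_run_input(run_input: dict[str, Any]) -> list[str]:
    """Hypothetical helper: return a list of problems (empty if none were found)."""
    problems: list[str] = []
    for value in run_input.get("industries", []):
        if value not in ALLOWED_INDUSTRIES:
            problems.append(f"unknown industry: {value!r}")
    for value in run_input.get("regions", []):
        if value not in ALLOWED_REGIONS:
            problems.append(f"unknown region: {value!r}")
    min_size = run_input.get("minEmployeeSize")
    if min_size is not None and min_size not in ALLOWED_MIN_SIZES:
        problems.append(f"unknown minEmployeeSize: {min_size!r}")
    limit = run_input.get("limit")
    if limit is not None and limit < MIN_LIMIT:
        problems.append(f"limit must be at least {MIN_LIMIT}, got {limit}")
    return problems


example_input = {
    "queries": ["climate", "fintech"],
    "batches": ["Summer 2024"],
    "industries": ["Fintech"],
    "regions": ["America / Canada"],
    "minEmployeeSize": "10+",
    "getEmails": True,
    "includeRiskyEmails": False,
    "limit": 250,
}
print(validate_run_input(example_input))
```

The actor validates input on its own side as well; a local check like this only surfaces typos earlier.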

Output

6.1 Output destination

The actor writes results to an Apify dataset as JSON records. The dataset is designed for direct consumption by analytics tools, ETL pipelines, and downstream APIs without post-processing.

6.2 Record envelope (all items)

Every record includes stable identifiers:

type (string, required)
id (number, required)
url (string, required)

Recommended idempotency key: type + ":" + id. Use this key to deduplicate and upsert records when the same company appears across multiple runs or inputs.
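As a rough illustration of the key in use, the sketch below upserts records into an in-memory dictionary. The function names `idempotency_key` and `upsert` are hypothetical, and a real pipeline would likely target a database table rather than a dict.

```python
from typing import Any, Iterable


def idempotency_key(record: dict[str, Any]) -> str:
    """Build the recommended key: type + ":" + id."""
    return f'{record["type"]}:{record["id"]}'


def upsert(
    store: dict[str, dict[str, Any]],
    records: Iterable[dict[str, Any]],
) -> dict[str, dict[str, Any]]:
    """Insert new records and overwrite existing ones that share a key."""
    for record in records:
        store[idempotency_key(record)] = record
    return store


store: dict[str, dict[str, Any]] = {}
first_run = [{"type": "company", "id": 731, "url": "https://www.ycombinator.com/companies/oklo"}]
second_run = [{"type": "company", "id": 731, "url": "https://www.ycombinator.com/companies/oklo"}]
upsert(store, first_run)
upsert(store, second_run)
print(list(store))  # one key, even though the company appeared in two runs
```

With this scheme the most recent record wins. If older values need to be preserved, keep a history keyed by the same identifier instead of overwriting.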

6.3 Examples

Example: company (type = "company")

{
  "type": "company",
  "id": 731,
  "url": "https://www.ycombinator.com/companies/oklo",
  "Company General Info": {
    "Company Name": "Oklo",
    "One-liner Description": "Emission free, always on power from advanced fission power plants.",
    "Full Description": "About Oklo Inc.: \r\n\r\nOklo Inc. (Oklo) is developing advanced fission power plants to provide emission-free, reliable, and affordable energy. \r\n\r\nOklo received a Site Use Permit from the U.S Department of Energy, has performed successful prototypic fuel fabrication, was awarded fuel material from Idaho National Laboratory, developed the first advanced fission combined license application accepted and docketed by the U.S. Nuclear Regulatory Commission, and is developing advanced fuel recycling technologies in collaboration with the U.S. Department of Energy and national laboratories.\r\n\r\nOklo has been featured in Time, Newsweek, Wall Street Journal, CNBC, Popular Mechanics, Wired, Architectural Digest, Hyperallergic, POWER Magazine, has been the subject of a Harvard Business School case, and is featured in the Oliver Stone documentary Nuclear, among other features.",
    "Website": "http://oklo.com",
    "Year Founded": null,
    "Team Size": 50,
    "Company Stage": "Growth",
    "Industry": "Industrials",
    "Sub-industry": "Industrials -> Energy",
    "YC Batch": "Summer 2014",
    "Public / Private Status": "Public",
    "Hiring Status": false,
    "Top Company": true,
    "Nonprofit": false,
    "Regions": [
      "United States of America",
      "America / Canada",
      "Remote",
      "Partly Remote"
    ],
    "Headquarters City": "Santa Clara",
    "Headquarters Country": "US",
    "All Known Locations": "Santa Clara, CA, USA; Sunnyvale, CA, USA",
    "Tags / Categories": [
      "Small Modular Reactors",
      "Climate"
    ]
  },
  "Company Socials & External Links": {
    "Website": "http://oklo.com",
    "LinkedIn": "https://www.linkedin.com/company/oklo/",
    "Twitter / X": "http://www.twitter.com/oklo",
    "Facebook": "http://www.facebook.com/okloinc",
    "GitHub": null,
    "Crunchbase": "",
    "YC Company Page URL": "https://www.ycombinator.com/companies/oklo",
    "Jobs Page URL": "https://bookface.ycombinator.com/workatastartup",
    "News Page URL": "https://bookface.ycombinator.com/company/731/company_news"
  },
  "Founders": [
    {
      "Label": "Founder 1",
      "Founder Name": "Jacob DeWitte",
      "Title / Role": "Founder/CEO",
      "Bio": null,
      "LinkedIn URL": "http://linkedin.com/in/jacob-dewitte-90062132",
      "Twitter URL": "",
      "Email Available": true,
      "Email": "jacob@oklo.com",
      "Email Status": "verified",
      "Avatar Image URL": "https://bookface-images.s3.us-west-2.amazonaws.com/avatars/6147b6a370516ec8d93ef1234b50853dec23e50a.jpg",
      "Latest YC Company": "Oklo"
    },
    {
      "Label": "Founder 2",
      "Founder Name": "Caroline Cochran",
      "Title / Role": "Founder",
      "Bio": "Caroline Cochran is the Co-Founder and Chief Operating Officer of Oklo Inc., a company developing advanced fission cleantech for cleaner air and human development. She was one of the youngest recipients of the University of Oklahoma Regent's Alumni Award. Caroline received her S.M. in Nuclear Engineering from MIT, a B.A. in Economics and a B.S. in Mechanical Engineering from the University of Oklahoma.",
      "LinkedIn URL": "https://www.linkedin.com/in/caorilne",
      "Twitter URL": "https://www.twitter.com/caorilne",
      "Email Available": true,
      "Email": "caroline@oklo.com",
      "Email Status": "verified",
      "Avatar Image URL": "https://bookface-images.s3.us-west-2.amazonaws.com/avatars/db9d311bc6421042f8b6d63bd74ddf14772adb4f.jpg",
      "Latest YC Company": "Oklo"
    }
  ],
  "Jobs": [],
  "Company News": [
    {
      "Headline": "Oklo’s microreactor project pipeline jumps 93% ahead of 2027 planned deployment | Utility Dive",
      "Source": "Utility Dive",
      "Publication Date": "Aug 15, 2024",
      "Article URL": "https://www.utilitydive.com/news/oklo-advanced-nuclear-microreactor-project-pipeline-nrc/724343/"
    },
    {
      "Headline": "Oklo starts trading on NYSE",
      "Source": null,
      "Publication Date": "May 10, 2024",
      "Article URL": "https://www.cnbc.com/2024/05/10/sam-altman-takes-nuclear-startup-oklo-public-to-power-ai-ambitions.html"
    },
    {
      "Headline": "A Sam Altman-backed nuclear startup is going public via SPAC.",
      "Source": null,
      "Publication Date": "Jul 11, 2023",
      "Article URL": "https://www.axios.com/pro/climate-deals/2023/07/11/sam-altman-backed-nuclear-startup-to-go-public-via-500m-spac"
    }
  ],
  "Metadata": {
    "YC Company ID": 731,
    "Company Slug": "oklo",
    "Source URL": "https://www.ycombinator.com/companies?top_company=true",
    "Seed Type": "filters",
    "Seed Value": "default",
    "Scrape Timestamp": "2026-01-16T17:44:49+00:00",
    "Data Completeness Flags": [
      "missing founder bio",
      "missing jobs"
    ]
  }
}

Field reference

Company record fields (`type = "company"`)

type (string, required): Record type.
id (number, required): Company identifier.
url (string, required): Canonical company page URL.
Company General Info (object, optional): High-level company details.
- Company General Info.Company Name (string, optional): Company name.
- Company General Info.One-liner Description (string, optional): Short description.
- Company General Info.Full Description (string, optional): Extended description.
- Company General Info.Website (string, optional): Company website URL.
- Company General Info.Year Founded (number, optional): Year founded when available.
- Company General Info.Team Size (number, optional): Team size estimate.
- Company General Info.Company Stage (string, optional): Company stage.
- Company General Info.Industry (string, optional): Primary industry.
- Company General Info.Sub-industry (string, optional): More specific industry label.
- Company General Info.YC Batch (string, optional): YC batch.
- Company General Info.Public / Private Status (string, optional): Public or private status.
- Company General Info.Hiring Status (boolean, optional): Hiring status when listed.
- Company General Info.Top Company (boolean, optional): Top company flag.
- Company General Info.Nonprofit (boolean, optional): Nonprofit flag.
- Company General Info.Regions (array[string], optional): Regions list.
- Company General Info.Headquarters City (string, optional): HQ city.
- Company General Info.Headquarters Country (string, optional): HQ country code.
- Company General Info.All Known Locations (string, optional): Aggregated locations string.
- Company General Info.Tags / Categories (array[string], optional): Tags or categories.
Company Socials & External Links (object, optional): External links and social profiles.
- Company Socials & External Links.Website (string, optional)
- Company Socials & External Links.LinkedIn (string, optional)
- Company Socials & External Links.Twitter / X (string, optional)
- Company Socials & External Links.Facebook (string, optional)
- Company Socials & External Links.GitHub (string, optional)
- Company Socials & External Links.Crunchbase (string, optional)
- Company Socials & External Links.YC Company Page URL (string, optional)
- Company Socials & External Links.Jobs Page URL (string, optional)
- Company Socials & External Links.News Page URL (string, optional)
Founders (array[object], optional): Founder list.
- Founders.Label (string, optional): Founder label.
- Founders.Founder Name (string, optional): Founder name.
- Founders.Title / Role (string, optional): Title or role.
- Founders.Bio (string, optional): Founder bio.
- Founders.LinkedIn URL (string, optional)
- Founders.Twitter URL (string, optional)
- Founders.Email Available (boolean, optional): Email availability flag.
- Founders.Email (string, optional): Email address when present.
- Founders.Email Status (string, optional): Email confidence label.
- Founders.Avatar Image URL (string, optional)
- Founders.Latest YC Company (string, optional)
Jobs (array[object], optional): Job listings when present.
Company News (array[object], optional): News coverage.
- Company News.Headline (string, optional)
- Company News.Source (string, optional)
- Company News.Publication Date (string, optional)
- Company News.Article URL (string, optional)
Metadata (object, optional): Run-level metadata.
- Metadata.YC Company ID (number, optional)
- Metadata.Company Slug (string, optional)
- Metadata.Source URL (string, optional)
- Metadata.Seed Type (string, optional)
- Metadata.Seed Value (string, optional)
- Metadata.Scrape Timestamp (string, optional)
- Metadata.Data Completeness Flags (array[string], optional)
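Date fields arrive as strings. Judging only from the example record, `Publication Date` looks like `Aug 15, 2024` and `Scrape Timestamp` looks like an ISO 8601 timestamp; other formats may occur. The sketch below parses both under that assumption, with hypothetical helper names, and returns `None` when a value is missing or does not match.

```python
from datetime import date, datetime
from typing import Optional


def parse_publication_date(value: Optional[str]) -> Optional[date]:
    """Parse strings like 'Aug 15, 2024'; return None if missing or unparseable.

    Note: %b depends on the process locale (English month names assumed).
    """
    if not value:
        return None
    try:
        return datetime.strptime(value, "%b %d, %Y").date()
    except ValueError:
        return None


def parse_scrape_timestamp(value: Optional[str]) -> Optional[datetime]:
    """Parse ISO 8601 strings like '2026-01-16T17:44:49+00:00'."""
    if not value:
        return None
    try:
        return datetime.fromisoformat(value)
    except ValueError:
        return None


print(parse_publication_date("Aug 15, 2024"))
print(parse_scrape_timestamp("2026-01-16T17:44:49+00:00"))
```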

Data guarantees & handling

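Best-effort extraction: field availability may vary with region, session, source availability, and UI experiments on the source site.
Optional fields: any optional field can be missing, null, or an empty string (as in the example record), so check for all three in downstream code.
Deduplication: key records on type + ":" + id.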
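One way to handle optional fields is a small accessor that treats a missing key, `null`, and an empty string the same way. This is a sketch with a hypothetical helper name (`get_nested`), not something the actor provides.

```python
from typing import Any


def get_nested(record: Any, *path: str, default: Any = None) -> Any:
    """Walk nested objects; return default for missing, None, or "" values."""
    current = record
    for key in path:
        if not isinstance(current, dict):
            return default
        current = current.get(key)
    if current is None or current == "":
        return default
    return current


record = {
    "Company General Info": {"Hiring Status": False, "Year Founded": None},
    "Company Socials & External Links": {
        "LinkedIn": "https://www.linkedin.com/company/oklo/",
        "GitHub": None,
        "Crunchbase": "",
    },
}

print(get_nested(record, "Company Socials & External Links", "LinkedIn"))
print(get_nested(record, "Company Socials & External Links", "Crunchbase", default="n/a"))
print(get_nested(record, "Company General Info", "Hiring Status"))
```

Legitimate falsy values such as `false` are returned unchanged. Only absent, `null`, and empty-string values fall back to the default.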

How to Run on Apify

Open the Actor in Apify Console.
Configure your search parameters (e.g., queries, batches, industries, regions, and employee size).
Set the maximum number of companies to collect per query with `limit`.
Click Start and wait for the run to finish.
Download results in JSON, CSV, Excel, or other supported formats.
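The same steps can also be scripted. The sketch below assumes the official `apify-client` Python package is installed and that you supply your own API token; `run_actor` and `dataset_items_url` are hypothetical helper names. The export URL follows the public Apify API dataset items endpoint.

```python
from typing import Any

ACTOR_ID = "fatihtahta/y-combinator-directory-scraper"


def run_actor(token: str, run_input: dict[str, Any]) -> list[dict[str, Any]]:
    """Start the actor, wait for it to finish, and return all dataset items."""
    # Imported here so the module loads even when apify-client is not installed.
    from apify_client import ApifyClient

    client = ApifyClient(token)
    run = client.actor(ACTOR_ID).call(run_input=run_input)
    if run is None:
        raise RuntimeError("Actor run did not return run details")
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


def dataset_items_url(dataset_id: str, fmt: str = "json") -> str:
    """Build a download URL for a dataset in a given format (e.g. json, csv, xlsx)."""
    return f"https://api.apify.com/v2/datasets/{dataset_id}/items?format={fmt}"


# Example call (requires a valid token and consumes platform credits):
# items = run_actor("<YOUR_APIFY_TOKEN>", {"queries": ["climate"], "limit": 10})
```

Private datasets also need an authenticated request when you download from the export URL.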

Scheduling & Automation

This actor is designed for continuous YC intelligence, not just one-off pulls.

Whether you’re tracking new batches, monitoring hiring companies, or running ongoing founder email enrichment, scheduling turns this into a live data pipeline.

Scheduling Recurring YC Snapshots

Use Apify Schedules to automatically re-run the actor with the same filters over time.

Common use cases:

🗓 Track new companies in a specific batch (e.g., Winter 2026)
📈 Monitor hiring YC companies in a region (e.g., America / Canada)
🧠 Rebuild enriched founder-email datasets weekly
🕵️ Watch specific industries (e.g., Fintech, Healthcare) for new entrants

How to set it up:

Open the actor in Apify Console
Go to Schedules → Create schedule
Choose frequency (daily, weekly, or custom cron)
Paste your production input JSON
Enable notifications (optional)

Each run writes to a new dataset, allowing you to:

Compare snapshots over time
Track deltas (new companies, hiring changes, status changes)
Re-enrich previously missing emails
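A snapshot comparison can be sketched as follows, assuming each snapshot is a list of company records loaded from two runs' datasets. The helper names are hypothetical, and only new companies and `Hiring Status` changes are tracked here.

```python
from typing import Any

Record = dict[str, Any]


def index_by_id(snapshot: list[Record]) -> dict[Any, Record]:
    """Index company records by their stable id."""
    return {record["id"]: record for record in snapshot}


def hiring_status(record: Record) -> Any:
    """Read Hiring Status, tolerating a missing or null info block."""
    return (record.get("Company General Info") or {}).get("Hiring Status")


def diff_snapshots(previous: list[Record], current: list[Record]) -> dict[str, list[Any]]:
    """Return ids of companies that are new or whose hiring status changed."""
    prev, curr = index_by_id(previous), index_by_id(current)
    new_ids = sorted(curr.keys() - prev.keys())
    hiring_changed = sorted(
        company_id
        for company_id in curr.keys() & prev.keys()
        if hiring_status(prev[company_id]) != hiring_status(curr[company_id])
    )
    return {"new": new_ids, "hiring_changed": hiring_changed}


last_week = [
    {"type": "company", "id": 1, "Company General Info": {"Hiring Status": False}},
    {"type": "company", "id": 2, "Company General Info": {"Hiring Status": True}},
]
this_week = [
    {"type": "company", "id": 1, "Company General Info": {"Hiring Status": True}},
    {"type": "company", "id": 2, "Company General Info": {"Hiring Status": True}},
    {"type": "company", "id": 3, "Company General Info": {"Hiring Status": False}},
]
print(diff_snapshots(last_week, this_week))
```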

Automation & Downstream Workflows

This actor is structured for ETL, lead generation, and analytics pipelines.

Because each record includes stable identifiers (type, id), you can safely deduplicate and upsert into your database using:

type + ":" + id

🔁 Webhooks (Recommended for Production)

Trigger automations immediately after a run finishes:

Push results into your CRM
Insert/update rows in a warehouse
Send founder emails to a verification pipeline
Trigger AI enrichment or scoring
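On the receiving side, a webhook handler mostly needs to find the dataset that the finished run produced. The sketch below assumes the webhook uses Apify's default payload template (with `eventType` and a `resource` object describing the run) and is subscribed to the `ACTOR.RUN.SUCCEEDED` event. A customized payload template would need different parsing, and `dataset_id_from_webhook` is a hypothetical name.

```python
import json
from typing import Any, Optional


def dataset_id_from_webhook(payload: dict[str, Any]) -> Optional[str]:
    """Return the run's default dataset id for successful runs, else None."""
    if payload.get("eventType") != "ACTOR.RUN.SUCCEEDED":
        return None
    resource = payload.get("resource") or {}
    return resource.get("defaultDatasetId")


# Illustrative request body; the ids are made up.
body = json.dumps({
    "eventType": "ACTOR.RUN.SUCCEEDED",
    "eventData": {"actorRunId": "run-123"},
    "resource": {"id": "run-123", "defaultDatasetId": "dataset-456"},
})
print(dataset_id_from_webhook(json.loads(body)))
```

Once the dataset id is known, the handler can fetch the items and pass them to whichever downstream step applies (CRM sync, warehouse load, verification, scoring).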

Typical Integration Patterns

CRM sync: Upsert founders + companies into HubSpot, Salesforce, or internal tools
Lead generation: Extract founders with Email Available = true and feed into outreach systems
Warehouse ingestion: Stream dataset into BigQuery / Snowflake for batch-level analytics
Hiring alerts: Notify Slack when new YC companies are marked Hiring Status = true
Batch monitoring: Run per-batch schedules and compare growth patterns over time
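As an example of the lead generation pattern, the sketch below flattens company records into one CSV row per founder with an available email. The column names and helper names are illustrative choices, not a format the actor emits.

```python
import csv
import io
from typing import Any, Iterable, Iterator

FIELDS = ["company", "name", "email", "status"]


def founder_leads(records: Iterable[dict[str, Any]]) -> Iterator[dict[str, Any]]:
    """Yield one flat row per founder that has Email Available = true."""
    for record in records:
        company = (record.get("Company General Info") or {}).get("Company Name")
        for founder in record.get("Founders") or []:
            if founder.get("Email Available") and founder.get("Email"):
                yield {
                    "company": company,
                    "name": founder.get("Founder Name"),
                    "email": founder["Email"],
                    "status": founder.get("Email Status"),
                }


def leads_to_csv(records: Iterable[dict[str, Any]], verified_only: bool = True) -> str:
    """Serialize leads as CSV text, optionally keeping only verified emails."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for lead in founder_leads(records):
        if verified_only and lead["status"] != "verified":
            continue
        writer.writerow(lead)
    return buffer.getvalue()


sample = [{
    "type": "company",
    "id": 731,
    "Company General Info": {"Company Name": "Oklo"},
    "Founders": [
        {"Founder Name": "Jacob DeWitte", "Email Available": True,
         "Email": "jacob@oklo.com", "Email Status": "verified"},
        {"Founder Name": "No Email Founder", "Email Available": False, "Email": None},
    ],
}]
print(leads_to_csv(sample))
```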

Performance

Estimated run times:

Small runs (< 1,000 outputs): ~5–10 minutes
Medium runs (1,000–5,000 outputs): ~15–25 minutes
Large runs (5,000+ outputs): ~25–90 minutes

Execution time varies based on filters, result volume, and how much information is returned per record.

Compliance & Ethics

Responsible Data Collection

This actor collects publicly available startup, company, and founder metadata from Y Combinator for legitimate business, research, and analytical purposes, including:

Startup ecosystem research, trend analysis, and market mapping
Venture intelligence, portfolio analysis, and competitive landscape monitoring
Data enrichment workflows for internal databases, CRM systems, analytics dashboards, and research pipelines

The actor is designed to operate on non-authenticated, publicly accessible pages and does not attempt to bypass access controls.

This section is informational and not legal advice.

Best Practices

Use collected data in accordance with applicable laws, regulations, and the target site’s terms
Respect individual privacy and personal information
Use data responsibly and avoid disruptive or excessive collection
Do not use this actor for spamming, harassment, or other harmful purposes
Follow relevant data protection requirements where applicable (e.g., GDPR, CCPA)

Support

For help or troubleshooting, open an issue on the actor page in Apify Console. Include the input you used (redacted), the run ID, expected vs. actual behavior, and a small output sample if possible.

