YC Startup & Jobs Scraper — Companies, Jobs & Founders

Pricing

$18.99/month + usage

Scrape YC-backed startup data from workatastartup.com. Returns company details, active jobs with salary ranges, founder LinkedIn profiles, team size and YC batch. Filter by industry, batch, remote-only and hiring status. Perfect for recruiting, sales and investor research.

Developer

Scrape Pilot

Maintained by Community

🚀 Y Combinator Startup & Jobs Scraper v1 — Official YC Data, No API Key

Extract real startup data from Y Combinator’s public API and workatastartup.com.
Get company profiles, job listings, founder details, funding info, batch, valuation, and more — without any API key. Perfect for investors, job seekers, market researchers, and data analysts.


💡 What is Y Combinator Startup & Jobs Scraper?

Y Combinator Startup & Jobs Scraper is a powerful automation tool that extracts publicly available startup data directly from Y Combinator’s official endpoints:

  • https://www.workatastartup.com/companies – company list + current jobs
  • https://api.ycombinator.com/v0.1/companies – YC company metadata (batch, valuation, founders, etc.)
  • https://www.workatastartup.com/jobs – detailed job listings

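No authentication or API key is required. The scraper applies respectful delays between requests, although large runs may still hit IP-based rate limiting (see the FAQ on residential proxies). The scraper can: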

  • Search for startups by keyword, industry, batch, remote/hiring filters.
  • Fetch specific companies by slug (e.g., airbnb, stripe, dropbox).
  • Extract rich metadata – description, website, team size, batch, valuation, total raised, stage, location, logo, HN link.
  • List open jobs – title, salary range, location, required skills, job URL.
  • List founders – names, LinkedIn URLs, titles.

All data is returned as clean JSON, ready for dashboards, CRM enrichment, or investment research.


📦 What Data Can You Extract?

| 🧩 Data Type | 📋 Description |
| --- | --- |
| 🏢 Company Profile | Name, slug, description, website, industry, subindustry, team size, batch, status, stage |
| 💰 Funding & Valuation | Valuation, total raised, country, city |
| 🖼️ Logo & Media | Logo URL, HN discussion URL |
| 👥 Founders | Full name, LinkedIn URL, title |
| 💼 Open Jobs | Job title, salary range, location, remote flag, required skills, job URL |
| 🏷️ Tags & Categories | Industry tags (e.g., AI, Fintech, SaaS) |
| 📊 Metadata | Data source (companies, yc_api, company_detail, html), processing timestamp |

⚙️ Key Features

  • Official YC Data – Uses Y Combinator’s public API (no reverse engineering).
  • No API Key Required – No Y Combinator credentials are needed (respectful request delays are applied).
  • Search & Direct Fetch – Search by keyword/industry/batch or fetch specific company slugs.
  • Rich Filters – Remote‑only, hiring‑only, industry, batch.
  • Job Extraction – Captures salary, location, skills, and job URL for each open position.
  • Founder Intelligence – Names and LinkedIn profiles (when available).
  • Residential Proxy Ready – Avoid IP bans when scraping at scale.
  • Clean JSON Output – Structured, documented schema.

📥 Input Parameters

The actor accepts a JSON object with the following fields:

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| company_slugs | array or string | No | | List of company slugs (e.g., ["airbnb", "stripe"]). Can also be a comma‑ or newline‑separated string. |
| search_query | string | No | | Keyword to search companies (e.g., "AI", "fintech"). |
| industry | string | No | | Filter by industry (e.g., "SaaS", "Healthcare"). |
| batch | string | No | | Filter by YC batch (e.g., W24, S23). |
| remote_only | boolean | No | false | Show only companies with remote jobs. |
| hiring_only | boolean | No | false | Show only companies that are currently hiring. |
| max_results | integer | No | 20 | Maximum number of companies to return. |
| proxyConfiguration | object | No | | Apify proxy configuration. Residential proxies recommended for large runs. |
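
Because company_slugs accepts either an array or a comma‑ or newline‑separated string, it can help to normalize the value before building the input. The following is a minimal sketch of one way to do that on the caller's side; the helper name normalize_slugs is hypothetical, and lowercasing and de-duplicating slugs are assumptions based on the lowercase examples above, not documented actor behavior.

```python
import re
from typing import Iterable, List, Optional, Union


def normalize_slugs(value: Optional[Union[str, Iterable[str]]]) -> List[str]:
    """Turn a list, or a comma/newline-separated string, into a clean slug list."""
    if value is None:
        return []
    if isinstance(value, str):
        # Split on commas and on any newline style; runs of separators collapse.
        parts = re.split(r"[,\r\n]+", value)
    else:
        parts = list(value)

    seen = set()
    slugs = []
    for part in parts:
        slug = part.strip().lower()  # assumption: slugs are lowercase
        if slug and slug not in seen:  # drop blanks and duplicates, keep order
            seen.add(slug)
            slugs.append(slug)
    return slugs


run_input = {
    "company_slugs": normalize_slugs("airbnb, stripe\ndropbox,,Stripe"),
    "max_results": 10,
}
print(run_input["company_slugs"])  # ['airbnb', 'stripe', 'dropbox']
```

Passing an already clean list through the same helper is harmless, so it can sit in front of every run regardless of where the slugs came from.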

Example Input (Search Mode)

{
  "search_query": "AI",
  "industry": "SaaS",
  "batch": "W24",
  "remote_only": true,
  "hiring_only": true,
  "max_results": 10,
  "proxyConfiguration": {
    "useApifyProxy": true,
    "apifyProxyGroups": ["RESIDENTIAL"]
  }
}

Example Input (Direct Fetch by Slug)

{
  "company_slugs": ["airbnb", "stripe", "dropbox"],
  "max_results": 10
}

📤 Output Format

Each company is returned as a JSON object. Fields vary slightly depending on the data source, but the core schema is consistent.

Company Object Fields

| Field | Type | Description |
| --- | --- | --- |
| data_source | string | companies, yc_api, company_detail, or html. |
| id | integer | Internal YC company ID (if available). |
| name | string | Company name. |
| slug | string | URL slug (e.g., airbnb). |
| website | string | Company website URL. |
| description | string | Short description (one‑liner or long description). |
| industry | string | Primary industry. |
| subindustry | string | Sub‑industry (if available). |
| team_size | integer | Number of employees (approx). |
| batch | string | YC batch (e.g., W24). |
| status | string | Active, Acquired, Inactive, etc. |
| is_hiring | boolean | Whether the company has open jobs. |
| valuation | string | Valuation (if disclosed). |
| total_raised | string | Total funding raised. |
| stage | string | Funding stage (e.g., Seed, Series A). |
| country | string | Country of headquarters. |
| city | string | City of headquarters. |
| logo_url | string | URL to company logo. |
| hn_url | string | Hacker News discussion link. |
| jobs | array | List of open job postings (see below). |
| founders | array | List of founders (see below). |
| tags | array | Industry/topic tags. |
| processed_at | string | ISO timestamp of extraction. |
| source_url | string | Slug or name used as identifier. |
| status_result | string | success or not_found. |

Job Object Structure

| Field | Type | Description |
| --- | --- | --- |
| id | integer | Job ID. |
| title | string | Job title. |
| salary_range | string | e.g., "$120K - $180K". |
| location | string | City or "Remote". |
| type | string | e.g., Full-time, Contract. |
| remote | boolean | Remote friendly. |
| skills | array | Required skills/tags. |
| job_url | string | Direct application link. |
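
Since salary_range is returned as a string, comparing or sorting jobs by pay requires parsing it first. The sketch below is one possible approach, assuming the value follows the "$120K - $180K" pattern shown in the table; the function name parse_salary_range and the handling of K/M suffixes and thousands separators are illustrative assumptions, and values in other formats simply yield None.

```python
import re
from typing import Optional, Tuple

# A dollar sign, a number (optionally with thousands separators or decimals),
# and an optional K/M suffix directly after the number.
_AMOUNT = re.compile(r"\$\s*(\d[\d,]*(?:\.\d+)?)([KkMm]?)")
_MULTIPLIER = {"": 1, "k": 1_000, "m": 1_000_000}


def parse_salary_range(text: Optional[str]) -> Optional[Tuple[int, int]]:
    """Return (low, high) in whole dollars, or None if no amount is found."""
    if not text:
        return None
    amounts = [
        int(round(float(number.replace(",", "")) * _MULTIPLIER[suffix.lower()]))
        for number, suffix in _AMOUNT.findall(text)
    ]
    if not amounts:
        return None
    # A single amount such as "$150K" becomes a zero-width range.
    return min(amounts), max(amounts)


print(parse_salary_range("$120K - $180K"))  # (120000, 180000)
```

Returning None for unparseable strings keeps the caller in control of how to treat jobs without a disclosed salary.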

Founder Object Structure

| Field | Type | Description |
| --- | --- | --- |
| full_name | string | Founder’s full name. |
| linkedin | string | LinkedIn profile URL (if available). |
| title | string | Title/role at the company. |
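
For CRM enrichment, the founders array usually needs to be flattened into one row per founder. The sketch below shows one way to do that, assuming a missing LinkedIn profile may arrive either as an empty string or as an absent key; the function name founder_contacts and the output column names are hypothetical, and the sample record is a trimmed illustration rather than real actor output.

```python
from typing import Dict, List, Optional


def founder_contacts(company: dict) -> List[Dict[str, Optional[str]]]:
    """Flatten a company record into one contact row per founder."""
    rows = []
    for founder in company.get("founders") or []:
        linkedin = (founder.get("linkedin") or "").strip()
        rows.append({
            "company": company.get("name"),
            "full_name": founder.get("full_name"),
            "title": founder.get("title"),
            # Normalize "" and missing values to None so they are easy to filter.
            "linkedin": linkedin or None,
        })
    return rows


sample_company = {
    "name": "Y Combinator",
    "founders": [
        {"full_name": "Paul Graham", "linkedin": "", "title": "Co-founder"},
        {
            "full_name": "Jessica Livingston",
            "linkedin": "https://www.linkedin.com/in/jessicalivingston1",
            "title": "Co-founder",
        },
    ],
}

contacts = founder_contacts(sample_company)
with_profile = [row for row in contacts if row["linkedin"]]
print(len(contacts), len(with_profile))  # 2 1
```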

Example Output

[
  {
    "data_source": "companies",
    "id": 64,
    "name": "Y Combinator",
    "slug": "ycombinator",
    "website": "https://www.ycombinator.com",
    "description": "A startup accelerator that funds and mentors early-stage companies.",
    "industry": "Venture Capital",
    "subindustry": "Accelerator",
    "team_size": 100,
    "batch": null,
    "status": "Active",
    "is_hiring": true,
    "valuation": null,
    "total_raised": null,
    "stage": null,
    "country": "USA",
    "city": "Mountain View",
    "logo_url": "https://...",
    "hn_url": "https://news.ycombinator.com/item?id=123456",
    "jobs": [
      {
        "id": 78637,
        "title": "Product Engineer - App ops",
        "salary_range": "$180K - $270K",
        "location": "San Francisco, CA, US",
        "type": "Full-time",
        "remote": false,
        "skills": ["React", "Ruby on Rails", "PostgreSQL"],
        "job_url": "https://www.workatastartup.com/jobs/78637"
      }
    ],
    "founders": [
      {
        "full_name": "Paul Graham",
        "linkedin": "",
        "title": "Co-founder"
      },
      {
        "full_name": "Jessica Livingston",
        "linkedin": "https://www.linkedin.com/in/jessicalivingston1",
        "title": "Co-founder"
      }
    ],
    "tags": ["startup", "accelerator"],
    "processed_at": "2026-04-04T12:00:00Z",
    "source_url": "ycombinator",
    "status_result": "success"
  }
]
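
Because jobs are nested inside each company object, building a job board or spreadsheet typically means flattening the output into one row per job. The following sketch assumes the dataset has already been loaded as a list of company dictionaries shaped like the example output; the column selection and the function names job_rows and jobs_to_csv are illustrative choices, not part of the actor.

```python
import csv
import io
from typing import Iterable, Iterator, List

JOB_COLUMNS = [
    "company", "batch", "title", "salary_range",
    "location", "remote", "skills", "job_url",
]


def job_rows(companies: Iterable[dict]) -> Iterator[dict]:
    """Yield one flat row per open job, carrying a few company fields along."""
    for company in companies:
        for job in company.get("jobs") or []:
            yield {
                "company": company.get("name"),
                "batch": company.get("batch"),
                "title": job.get("title"),
                "salary_range": job.get("salary_range"),
                "location": job.get("location"),
                "remote": job.get("remote"),
                # Join the skills array so it fits in a single CSV cell.
                "skills": "; ".join(job.get("skills") or []),
                "job_url": job.get("job_url"),
            }


def jobs_to_csv(companies: List[dict]) -> str:
    """Render the flattened job rows as CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=JOB_COLUMNS)
    writer.writeheader()
    writer.writerows(job_rows(companies))
    return buffer.getvalue()


# Trimmed sample record for illustration only.
sample = [{
    "name": "Y Combinator",
    "batch": None,
    "jobs": [{
        "title": "Product Engineer - App ops",
        "salary_range": "$180K - $270K",
        "location": "San Francisco, CA, US",
        "remote": False,
        "skills": ["React", "Ruby on Rails", "PostgreSQL"],
        "job_url": "https://www.workatastartup.com/jobs/78637",
    }],
}]

print(jobs_to_csv(sample))
```

Companies with an empty or missing jobs array contribute no rows, which is usually the desired behavior for a jobs-only export.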

🛠 How to Use on Apify

  1. Create a task with this actor.
  2. Provide input – either a list of company slugs or search parameters.
  3. Configure proxies – optional but recommended for large‑scale runs.
  4. Run – the actor will fetch data and push it to the Dataset.
  5. Export – download results as JSON, CSV, or Excel.

Running via API

curl -X POST "https://api.apify.com/v2/acts/your-username~yc-startup-scraper/runs" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_TOKEN" \
  -d '{
    "search_query": "fintech",
    "batch": "W24",
    "max_results": 10
  }'
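
The same request can be prepared from Python using only the standard library. This is a minimal sketch that mirrors the curl call above; the actor ID and token are the same placeholders and must be replaced with real values, and the helper name build_run_request is hypothetical. The request is only constructed here, and the network call is left commented out.

```python
import json
import urllib.request

API_BASE = "https://api.apify.com/v2"


def build_run_request(actor_id: str, token: str, run_input: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request that starts an actor run."""
    return urllib.request.Request(
        url=f"{API_BASE}/acts/{actor_id}/runs",
        data=json.dumps(run_input).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


request = build_run_request(
    actor_id="your-username~yc-startup-scraper",  # placeholder from the curl example
    token="YOUR_API_TOKEN",                       # placeholder, not a real token
    run_input={"search_query": "fintech", "batch": "W24", "max_results": 10},
)
print(request.get_method(), request.full_url)

# To actually start a run, send the request with a real token and actor ID:
# with urllib.request.urlopen(request) as response:
#     run = json.load(response)
```

Separating request construction from sending makes the input easy to inspect or log before any usage is billed.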

🎯 Use Cases

  • Investor Research – Identify promising YC startups by batch, industry, or funding stage.
  • Job Aggregation – Build a job board focused on YC‑backed companies.
  • Competitive Intelligence – Track hiring trends, salary ranges, and skill demands.
  • Founder Database – Enrich CRM with founder names and LinkedIn profiles.
  • Market Analysis – Study which industries are most active in recent YC batches.
  • Data Enrichment – Add startup metadata to existing datasets.

❓ Frequently Asked Questions

Q1. Do I need a Y Combinator API key?

No. This scraper uses the public APIs that power workatastartup.com and the YC company directory. No authentication required.

Q2. Is the data real?

Yes. All data comes directly from Y Combinator’s official public endpoints. It is the same data you see when visiting workatastartup.com.

Q3. Can I get historical data (older batches)?

Yes. The API returns companies from all batches. You can filter by batch (e.g., W15, S10) to get older startups.

Q4. Why do I need residential proxies?

For small runs (under 50 companies), proxies are optional. For large‑scale scraping (hundreds of companies), residential proxies help avoid IP‑based rate limiting. Datacenter IPs may be blocked by Cloudflare.
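
As a rough illustration of that guidance, the input can be built so that residential proxies are only requested for larger runs. In this sketch the cutoff of 50 is taken from the "under 50 companies" figure above, but treating it as a hard threshold is an assumption, and the helper name with_proxy is hypothetical. The default of 20 for max_results comes from the input parameters table.

```python
RESIDENTIAL_THRESHOLD = 50  # assumption: derived from "under 50 companies" above

RESIDENTIAL_PROXY = {
    "useApifyProxy": True,
    "apifyProxyGroups": ["RESIDENTIAL"],
}


def with_proxy(run_input: dict) -> dict:
    """Return a copy of the input, adding residential proxies for large runs."""
    result = dict(run_input)
    # max_results defaults to 20 when omitted, per the parameter table.
    if result.get("max_results", 20) >= RESIDENTIAL_THRESHOLD:
        result["proxyConfiguration"] = dict(RESIDENTIAL_PROXY)
    return result


small = with_proxy({"search_query": "AI", "max_results": 10})
large = with_proxy({"industry": "SaaS", "max_results": 300})
print("proxyConfiguration" in small, "proxyConfiguration" in large)  # False True
```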

Q5. What happens if a company slug is not found?

The actor returns an object with status_result set to "not_found" and only basic fields. The run continues with the remaining slugs.
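
When processing results, it is often useful to separate successful records from slugs that were not found. The sketch below assumes each item carries status_result, and that source_url (the slug or name used as identifier) is among the basic fields present on not_found items, which this README does not explicitly confirm. The function name split_by_status and the sample items are illustrative.

```python
from typing import List, Tuple


def split_by_status(items: List[dict]) -> Tuple[List[dict], List[str]]:
    """Separate successful company records from identifiers that were not found."""
    found = [item for item in items if item.get("status_result") == "success"]
    missing = [
        item.get("source_url", "")  # assumption: present on not_found items
        for item in items
        if item.get("status_result") == "not_found"
    ]
    return found, missing


# Illustrative items only, not real actor output.
items = [
    {"name": "Y Combinator", "source_url": "ycombinator", "status_result": "success"},
    {"source_url": "no-such-company", "status_result": "not_found"},
]

found, missing = split_by_status(items)
print(len(found), missing)  # 1 ['no-such-company']
```

The list of missing identifiers can then be reviewed for typos or retried in a later run.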

Q6. Can I extract full job descriptions?

The job description is not directly available via the API. However, you can combine this actor with a job detail scraper (using the job_url field) to fetch full descriptions.

Q7. How accurate is the team size?

Team size is as reported by the company on their YC profile. It may not be real‑time, but it is usually up‑to‑date for active startups.

Q8. How fast is the scraper?

Search returns up to 20 companies in 5–10 seconds. Direct fetching of 10 specific slugs takes 10–20 seconds (including delays).


