YC Startup & Jobs Scraper — Companies, Jobs & Founders
Pricing
$18.99/month + usage
Scrape YC-backed startup data from workatastartup.com. Returns company details, active jobs with salary ranges, founder LinkedIn profiles, team size and YC batch. Filter by industry, batch, remote-only and hiring status. Perfect for recruiting, sales and investor research.
Developer
Scrape Pilot
🚀 Y Combinator Startup & Jobs Scraper v1 — Official YC Data, No API Key
Extract real startup data from Y Combinator’s public API and workatastartup.com.
Get company profiles, job listings, founder details, funding info, batch, valuation, and more — without any API key. Perfect for investors, job seekers, market researchers, and data analysts.
💡 What is Y Combinator Startup & Jobs Scraper?
Y Combinator Startup & Jobs Scraper is a powerful automation tool that extracts publicly available startup data directly from Y Combinator’s official endpoints:
- https://www.workatastartup.com/companies – company list + current jobs
- https://api.ycombinator.com/v0.1/companies – YC company metadata (batch, valuation, founders, etc.)
- https://www.workatastartup.com/jobs – detailed job listings
No authentication or API key is required, and the scraper applies respectful delays between requests. The scraper can:
- Search for startups by keyword, industry, batch, remote/hiring filters.
- Fetch specific companies by slug (e.g., airbnb, stripe, dropbox).
- Extract rich metadata – description, website, team size, batch, valuation, total raised, stage, location, logo, HN link.
- List open jobs – title, salary range, location, required skills, job URL.
- List founders – names, LinkedIn URLs, titles.
All data is returned as clean JSON, ready for dashboards, CRM enrichment, or investment research.
📦 What Data Can You Extract?
| 🧩 Data Type | 📋 Description |
|---|---|
| 🏢 Company Profile | Name, slug, description, website, industry, subindustry, team size, batch, status, stage |
| 💰 Funding & Valuation | Valuation, total raised, country, city |
| 🖼️ Logo & Media | Logo URL, HN discussion URL |
| 👥 Founders | Full name, LinkedIn URL, title |
| 💼 Open Jobs | Job title, salary range, location, remote flag, required skills, job URL |
| 🏷️ Tags & Categories | Industry tags (e.g., AI, Fintech, SaaS) |
| 📊 Metadata | Data source (companies, yc_api, company_detail, or html), processing timestamp |
⚙️ Key Features
- Official YC Data – Uses Y Combinator’s public API (no reverse engineering).
- No API Key Required – The YC endpoints need no key or authentication (respectful delays are applied between requests).
- Search & Direct Fetch – Search by keyword/industry/batch or fetch specific company slugs.
- Rich Filters – Remote‑only, hiring‑only, industry, batch.
- Job Extraction – Captures salary, location, skills, and job URL for each open position.
- Founder Intelligence – Names and LinkedIn profiles (when available).
- Residential Proxy Ready – Avoid IP bans when scraping at scale.
- Clean JSON Output – Structured, documented schema.
📥 Input Parameters
The actor accepts a JSON object with the following fields:
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| `company_slugs` | array or string | No | – | List of company slugs (e.g., ["airbnb", "stripe"]). Can also be a comma‑ or newline‑separated string. |
| `search_query` | string | No | – | Keyword to search companies (e.g., "AI", "fintech"). |
| `industry` | string | No | – | Filter by industry (e.g., "SaaS", "Healthcare"). |
| `batch` | string | No | – | Filter by YC batch (e.g., W24, S23). |
| `remote_only` | boolean | No | false | Show only companies with remote jobs. |
| `hiring_only` | boolean | No | false | Show only companies that are currently hiring. |
| `max_results` | integer | No | 20 | Maximum number of companies to return. |
| `proxyConfiguration` | object | No | – | Apify proxy configuration. Residential proxies recommended for large runs. |
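
Because `company_slugs` accepts either an array or a comma/newline-separated string, it can help to normalize the value before building the input. The following is a minimal sketch of one way to do that on the caller's side; `normalize_slugs` is a hypothetical helper, not part of the actor, and lowercasing and de-duplicating the slugs are assumptions about what a caller would want.

```python
import re
from typing import Iterable, List, Union


def normalize_slugs(value: Union[str, Iterable[str], None]) -> List[str]:
    """Turn a list or a comma/newline-separated string into a clean slug list."""
    if value is None:
        return []
    # A plain string may mix commas and newlines as separators.
    parts = re.split(r"[,\n]", value) if isinstance(value, str) else list(value)
    seen = set()
    slugs = []
    for part in parts:
        # Assumption: slugs are case-insensitive, so lowercase them.
        slug = part.strip().lower()
        # Skip blanks and duplicates while preserving the original order.
        if slug and slug not in seen:
            seen.add(slug)
            slugs.append(slug)
    return slugs


print(normalize_slugs("airbnb, stripe\ndropbox"))  # ['airbnb', 'stripe', 'dropbox']
print(normalize_slugs(["Airbnb", "stripe", "airbnb"]))  # ['airbnb', 'stripe']
```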
Example Input (Search Mode)

    {
      "search_query": "AI",
      "industry": "SaaS",
      "batch": "W24",
      "remote_only": true,
      "hiring_only": true,
      "max_results": 10,
      "proxyConfiguration": {
        "useApifyProxy": true,
        "apifyProxyGroups": ["RESIDENTIAL"]
      }
    }

Example Input (Direct Fetch by Slug)

    {
      "company_slugs": ["airbnb", "stripe", "dropbox"],
      "max_results": 10
    }

📤 Output Format
Each company is returned as a JSON object. Fields vary slightly depending on the data source, but the core schema is consistent.
Company Object Fields
| Field | Type | Description |
|---|---|---|
| `data_source` | string | companies, yc_api, company_detail, or html. |
| `id` | integer | Internal YC company ID (if available). |
| `name` | string | Company name. |
| `slug` | string | URL slug (e.g., airbnb). |
| `website` | string | Company website URL. |
| `description` | string | Short description (one‑liner or long description). |
| `industry` | string | Primary industry. |
| `subindustry` | string | Sub‑industry (if available). |
| `team_size` | integer | Number of employees (approx). |
| `batch` | string | YC batch (e.g., W24). |
| `status` | string | Active, Acquired, Inactive, etc. |
| `is_hiring` | boolean | Whether the company has open jobs. |
| `valuation` | string | Valuation (if disclosed). |
| `total_raised` | string | Total funding raised. |
| `stage` | string | Funding stage (e.g., Seed, Series A). |
| `country` | string | Country of headquarters. |
| `city` | string | City of headquarters. |
| `logo_url` | string | URL to company logo. |
| `hn_url` | string | Hacker News discussion link. |
| `jobs` | array | List of open job postings (see below). |
| `founders` | array | List of founders (see below). |
| `tags` | array | Industry/topic tags. |
| `processed_at` | string | ISO timestamp of extraction. |
| `source_url` | string | Slug or name used as identifier. |
| `status_result` | string | success or not_found. |
Job Object Structure
| Field | Type | Description |
|---|---|---|
| `id` | integer | Job ID. |
| `title` | string | Job title. |
| `salary_range` | string | e.g., "$120K - $180K". |
| `location` | string | City or "Remote". |
| `type` | string | e.g., Full-time, Contract. |
| `remote` | boolean | Remote friendly. |
| `skills` | array | Required skills/tags. |
| `job_url` | string | Direct application link. |
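
Since `salary_range` is a string such as "$120K - $180K", downstream analysis usually needs numbers. Here is a rough sketch of a parser, assuming the K/M suffix format shown in the table. `parse_salary_range` is a hypothetical helper, and its handling of comma-formatted figures or single values is an assumption, not documented actor behavior.

```python
import re
from typing import Optional, Tuple

# A number (optionally with thousands separators or decimals) plus an optional K/M suffix.
_AMOUNT = re.compile(r"(\d[\d,]*(?:\.\d+)?)([KkMm]?)")
_MULTIPLIER = {"": 1, "k": 1_000, "m": 1_000_000}


def parse_salary_range(text: Optional[str]) -> Optional[Tuple[int, int]]:
    """Return (low, high) in whole dollars, or None if no amount is found."""
    if not text:
        return None
    amounts = [
        int(float(number.replace(",", "")) * _MULTIPLIER[suffix.lower()])
        for number, suffix in _AMOUNT.findall(text)
    ]
    if not amounts:
        return None
    # A single figure is treated as both the low and the high end.
    return min(amounts), max(amounts)


print(parse_salary_range("$120K - $180K"))  # (120000, 180000)
print(parse_salary_range(""))  # None
```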
Founder Object Structure
| Field | Type | Description |
|---|---|---|
| `full_name` | string | Founder’s full name. |
| `linkedin` | string | LinkedIn profile URL (if available). |
| `title` | string | Title/role at the company. |
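
For CRM enrichment, the nested `founders` arrays can be flattened into one row per founder. The sketch below writes CSV text with the standard library; `founders_to_csv` and the column order are illustrative choices, and the sample record is a trimmed version of the documented fields.

```python
import csv
import io
from typing import Any, Dict, Iterable


def founders_to_csv(companies: Iterable[Dict[str, Any]]) -> str:
    """Return CSV text with one row per founder across all companies."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["company", "full_name", "title", "linkedin"])
    for company in companies:
        # `founders` may be missing or null, so fall back to an empty list.
        for founder in company.get("founders") or []:
            writer.writerow([
                company.get("name"),
                founder.get("full_name"),
                founder.get("title"),
                founder.get("linkedin") or "",  # LinkedIn is only present when available
            ])
    return buffer.getvalue()


sample = [{
    "name": "Y Combinator",
    "founders": [
        {"full_name": "Paul Graham", "linkedin": "", "title": "Co-founder"},
        {"full_name": "Jessica Livingston",
         "linkedin": "https://www.linkedin.com/in/jessicalivingston1",
         "title": "Co-founder"},
    ],
}]
print(founders_to_csv(sample))
```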
Example Output

    [
      {
        "data_source": "companies",
        "id": 64,
        "name": "Y Combinator",
        "slug": "ycombinator",
        "website": "https://www.ycombinator.com",
        "description": "A startup accelerator that funds and mentors early-stage companies.",
        "industry": "Venture Capital",
        "subindustry": "Accelerator",
        "team_size": 100,
        "batch": null,
        "status": "Active",
        "is_hiring": true,
        "valuation": null,
        "total_raised": null,
        "stage": null,
        "country": "USA",
        "city": "Mountain View",
        "logo_url": "https://...",
        "hn_url": "https://news.ycombinator.com/item?id=123456",
        "jobs": [
          {
            "id": 78637,
            "title": "Product Engineer - App ops",
            "salary_range": "$180K - $270K",
            "location": "San Francisco, CA, US",
            "type": "Full-time",
            "remote": false,
            "skills": ["React", "Ruby on Rails", "PostgreSQL"],
            "job_url": "https://www.workatastartup.com/jobs/78637"
          }
        ],
        "founders": [
          {
            "full_name": "Paul Graham",
            "linkedin": "",
            "title": "Co-founder"
          },
          {
            "full_name": "Jessica Livingston",
            "linkedin": "https://www.linkedin.com/in/jessicalivingston1",
            "title": "Co-founder"
          }
        ],
        "tags": ["startup", "accelerator"],
        "processed_at": "2026-04-04T12:00:00Z",
        "source_url": "ycombinator",
        "status_result": "success"
      }
    ]

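
Each job sits inside its parent company object, so a job board or spreadsheet typically needs the records flattened to one row per job. The following is a small sketch of that step; `flatten_jobs` is a hypothetical helper, and the sample is a trimmed copy of the example output.

```python
from typing import Any, Dict, Iterable, List


def flatten_jobs(companies: Iterable[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Return one flat row per open job, tagged with its company."""
    rows = []
    for company in companies:
        # `jobs` may be missing or null for companies that are not hiring.
        for job in company.get("jobs") or []:
            rows.append({
                "company": company.get("name"),
                "slug": company.get("slug"),
                "title": job.get("title"),
                "salary_range": job.get("salary_range"),
                "location": job.get("location"),
                "remote": job.get("remote"),
                "job_url": job.get("job_url"),
            })
    return rows


# Trimmed copy of the example output.
companies = [{
    "name": "Y Combinator",
    "slug": "ycombinator",
    "jobs": [{
        "id": 78637,
        "title": "Product Engineer - App ops",
        "salary_range": "$180K - $270K",
        "location": "San Francisco, CA, US",
        "remote": False,
        "job_url": "https://www.workatastartup.com/jobs/78637",
    }],
}]

for row in flatten_jobs(companies):
    print(row["company"], "|", row["title"], "|", row["salary_range"])
```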
🛠 How to Use on Apify
- Create a task with this actor.
- Provide input – either a list of company slugs or search parameters.
- Configure proxies – optional but recommended for large‑scale runs.
- Run – the actor will fetch data and push it to the Dataset.
- Export – download results as JSON, CSV, or Excel.
Running via API

    curl -X POST "https://api.apify.com/v2/acts/your-username~yc-startup-scraper/runs" \
      -H "Content-Type: application/json" \
      -H "Authorization: Bearer YOUR_API_TOKEN" \
      -d '{"search_query": "fintech", "batch": "W24", "max_results": 10}'

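
The same request can be built from Python with only the standard library. This sketch mirrors the curl call: the actor ID and token are the same placeholders, `build_run_request` is a hypothetical helper, and the network call itself is left commented out so the snippet runs without credentials.

```python
import json
import urllib.request

APIFY_RUNS_URL = "https://api.apify.com/v2/acts/{actor_id}/runs"


def build_run_request(actor_id: str, token: str, actor_input: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request that starts an actor run."""
    return urllib.request.Request(
        APIFY_RUNS_URL.format(actor_id=actor_id),
        data=json.dumps(actor_input).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


request = build_run_request(
    "your-username~yc-startup-scraper",  # placeholder actor ID from the curl example
    "YOUR_API_TOKEN",                    # placeholder token
    {"search_query": "fintech", "batch": "W24", "max_results": 10},
)
print(request.get_method(), request.full_url)

# To actually start the run, replace the placeholders and uncomment:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```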
🎯 Use Cases
- Investor Research – Identify promising YC startups by batch, industry, or funding stage.
- Job Aggregation – Build a job board focused on YC‑backed companies.
- Competitive Intelligence – Track hiring trends, salary ranges, and skill demands.
- Founder Database – Enrich CRM with founder names and LinkedIn profiles.
- Market Analysis – Study which industries are most active in recent YC batches.
- Data Enrichment – Add startup metadata to existing datasets.
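
As an illustration of the competitive-intelligence use case, skill demand can be tallied from the `skills` arrays of all open jobs. This is only a sketch: `skill_counts` is a hypothetical helper, the sample records are made up for the demonstration, and lowercasing skill names is an assumption about how you would want to group them.

```python
from collections import Counter
from typing import Any, Dict, Iterable


def skill_counts(companies: Iterable[Dict[str, Any]]) -> Counter:
    """Count how often each skill appears across all open jobs."""
    counts: Counter = Counter()
    for company in companies:
        for job in company.get("jobs") or []:
            # Normalize case so "React" and "react" are counted together.
            counts.update(skill.strip().lower() for skill in job.get("skills") or [])
    return counts


# Made-up records that only carry the fields this helper reads.
companies = [
    {"jobs": [{"skills": ["React", "Ruby on Rails", "PostgreSQL"]}]},
    {"jobs": [{"skills": ["react", "Python"]}, {"skills": None}]},
    {"jobs": None},
]
print(skill_counts(companies).most_common(1))  # [('react', 2)]
```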
❓ Frequently Asked Questions
Q1. Do I need a Y Combinator API key?
No. This scraper uses the public APIs that power workatastartup.com and the YC company directory. No authentication required.
Q2. Is the data real?
Yes. All data comes directly from Y Combinator’s official public endpoints. It is the same data you see when visiting workatastartup.com.
Q3. Can I get historical data (older batches)?
Yes. The API returns companies from all batches. You can filter by batch (e.g., W15, S10) to get older startups.
Q4. Why do I need residential proxies?
For small runs (under 50 companies), proxies are optional. For large‑scale scraping (hundreds of companies), residential proxies help avoid IP‑based rate limiting. Datacenter IPs may be blocked by Cloudflare.
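
One way to apply this guidance is to attach a residential proxy configuration only when the run is large. The sketch below uses the 50-company threshold from the answer above and the proxy object from the example input; `proxy_configuration_for` is a hypothetical helper, and the threshold should be adjusted to your own experience.

```python
from typing import Any, Dict, Optional

# Threshold taken from the FAQ answer; treat it as a starting point, not a rule.
PROXY_OPTIONAL_BELOW = 50


def proxy_configuration_for(company_count: int) -> Optional[Dict[str, Any]]:
    """Return a residential proxy config for large runs, or None for small ones."""
    if company_count < PROXY_OPTIONAL_BELOW:
        return None
    return {"useApifyProxy": True, "apifyProxyGroups": ["RESIDENTIAL"]}


actor_input: Dict[str, Any] = {"search_query": "AI", "max_results": 200}
proxy = proxy_configuration_for(actor_input["max_results"])
if proxy is not None:
    actor_input["proxyConfiguration"] = proxy
print(actor_input)
```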
Q5. What happens if a company slug is not found?
The actor returns an object with status_result set to "not_found" and only basic fields. The run continues with the remaining slugs.
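
When post-processing a dataset, it is usually worth separating these placeholder records from real results. Here is a minimal sketch; `split_by_status` is a hypothetical helper, and it assumes that `source_url` is among the basic fields kept on a not_found record.

```python
from typing import Any, Dict, Iterable, List, Tuple


def split_by_status(
    items: Iterable[Dict[str, Any]],
) -> Tuple[List[Dict[str, Any]], List[Any]]:
    """Return (found company objects, identifiers that were not found)."""
    found, missing = [], []
    for item in items:
        if item.get("status_result") == "not_found":
            # Assumption: `source_url` holds the slug that was requested.
            missing.append(item.get("source_url"))
        else:
            found.append(item)
    return found, missing


results = [
    {"slug": "airbnb", "source_url": "airbnb", "status_result": "success"},
    {"source_url": "no-such-company", "status_result": "not_found"},
]
found, missing = split_by_status(results)
print(len(found), missing)  # 1 ['no-such-company']
```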
Q6. Can I extract full job descriptions?
The job description is not directly available via the API. However, you can combine this actor with a job detail scraper (using the job_url field) to fetch full descriptions.
Q7. How accurate is the team size?
Team size is as reported by the company on their YC profile. It may not be real‑time, but it is usually up‑to‑date for active startups.
Q8. How fast is the scraper?
Search returns up to 20 companies in 5–10 seconds. Direct fetching of 10 specific slugs takes 10–20 seconds (including delays).