Wellfound Startup Scraper With Emails | AngelList Directory
Pricing
$3.99 / 1,000 results
Wellfound Startup Scraper With Emails | AngelList Directory
Extract structured Wellfound startup profiles including company details, email adresses, phone numbers, social media accounts, hiring signal and more. Built for startup sourcing, market intelligence, and automated CRM or analytics pipelines.
Pricing
$3.99 / 1,000 results
Rating
0.0
(0)
Developer
Fatih Tahta
Actor stats
1
Bookmarked
18
Total users
11
Monthly active users
2 days
Issues response
2 days ago
Last modified
Categories
Share
Slug: fatihtahta/wellfound-startup-scraper-with-contacts
Overview
Wellfound Startup Scraper With Emails collects structured startup records from Wellfound, including company identity, profile details, hiring signals, market tags, location tags, reputation indicators, and optional contact details such as business emails, phone numbers, and social links. It supports both broad directory discovery and focused collection from specific startup or search pages. Wellfound is a widely used startup directory, making it a valuable source for company discovery, market mapping, hiring research, and enrichment workflows. The actor automates recurring collection so teams can replace manual copy-paste work with consistent, analysis-ready output. This helps save time across research, outreach, and pipeline operations while keeping datasets easier to maintain.
Why Use This Actor
- Market research and analytics teams: Build startup landscape datasets, compare industries and regions, and track hiring or category shifts over time.
- Product and content teams: Discover emerging companies, validate coverage opportunities, and source examples for reports, newsletters, and market pages.
- Developers and data engineers: Feed structured startup records into ETL jobs, internal catalogs, data warehouses, and downstream APIs.
- Lead generation and enrichment teams: Collect company profiles and optional contact details for prospecting, CRM enrichment, and account research.
- Monitoring and competitive intelligence teams: Re-run targeted searches on a schedule to watch specific sectors, technologies, or locations for changes.
Input Parameters
Provide any combination of URLs, queries, and filters to control what the actor collects.
| Parameter | Type | Description | Default |
|---|---|---|---|
getAll | boolean | When true, collects from the full Wellfound startup directory and ignores startUrls, industry, location, and tech. Use for broad platform-wide discovery. Allowed values: true, false. | false |
startUrls | string[] | One or more Wellfound URLs to collect directly. Supported inputs include startup directory search pages and individual startup profile pages. Example page types: industry pages, location-filtered pages, tech-filtered pages, and company pages. | - |
industry | string | Optional industry keyword or slug used to narrow results, such as fintech, health-tech, or artificial-intelligence. | - |
location | string | Optional hiring location keyword or slug, such as new-york, london, or remote. | - |
tech | string | Optional technology keyword or slug used to find startups by technology footprint, such as python, react, management, or microsoft-office. | - |
getContacts | boolean | When true, adds available contact details such as business emails, phone numbers, and social profiles. Use false when you only need core startup directory data. Allowed values: true, false. | false |
limit | integer | Maximum number of listings to save per query. Minimum value: 10. Use smaller values for sampling and larger values for deeper coverage. | 50000 |
proxyConfiguration | object | Optional Apify connection settings for larger or longer runs. For most runs, the default configuration is sufficient. | Apify proxy with RESIDENTIAL group |
Example Input
{"startUrls": ["https://wellfound.com/startups/industry/fintech-2"],"industry": "fintech","location": "new-york","tech": "python","getContacts": true,"limit": 500,"getAll": false}
Output
Output destination
The actor writes results to an Apify dataset as JSON records. And the dataset is designed for direct consumption by analytics tools, ETL pipelines, and downstream APIs without post-processing.
Record envelope (all items)
Every record includes these stable identifiers:
- type (string, required): Record category. Current value:
startup. - id (number, required): Stable Wellfound identifier for the startup.
- url (string, required): Canonical Wellfound URL for the startup.
Recommended idempotency key: type + ":" + id
Use this key to deduplicate repeated discoveries and perform reliable upserts in downstream systems.
Examples
Example: startup (type = "startup")
{"type": "startup","id": 28804,"url": "https://wellfound.com/company/homelight","title": "HomeLight","company_profile": {"website_url": "https://www.homelight.com","logo_url": "https://photos.wellfound.com/startups/i/28804-00e4e7d1013076846b81724c74c417e0-medium_jpg.jpg?buster=1589639645","one_liner": "Our vision is a world where every real estate transaction is simple, certain, and satisfying.","description": "We're the essential technology platform used by hundreds of thousands of homebuyers and sellers to partner with top real estate agents and win at any step of the real estate journey.","company_size": "501 to 1,000 Employees"},"hiring_overview": {"open_roles_total": 9,"roles_breakdown": [{"positionId": "154874","positionName": "Management","totalJobPostings": 1},{"positionId": "80488","positionName": "Sales","totalJobPostings": 3}],"is_actively_hiring": true},"market_presence": {"market_tags": [{"marketTagId": "16","marketTagName": "real estate","marketTagSlug": "real-estate-1","marketTagDisplayName": "Real Estate","marketTagNoindex": false}],"location_tags": [{"id": "1692","name": "san francisco","slug": "san-francisco","displayName": "San Francisco","locationTagNoindex": false}],"business_models": ["B2B","B2C"]},"reputation_signals": {"badges": [{"badgeId": "ACTIVELY_HIRING","badgeName": "ACTIVELY_HIRING_BADGE","badgeLabel": "Actively Hiring","badgeTooltip": "Actively processing applications"}],"high_ratings": [{"label": "Highly rated","rating": "4.8","tooltip": "HomeLight is highly rated on Glassdoor, with 4.8 out of 5 stars"}]},"crawl_metadata": {"listing_url": "https://wellfound.com/company/homelight","source_url": "https://wellfound.com/startups","seed": {"type": "all","value": "all-startups"},"scraped_time": "2026-03-03T17:31:40.067257+00:00"},"contact_details": {"emails": ["pr@homelight.com"],"phone_numbers": ["855-999-7971"],"social_media": {"facebook": "https://www.facebook.com/gohomelight","twitter": "https://twitter.com/gohomelight?lang=en","linkedin": "https://www.linkedin.com/company/homelight"}}}
Field reference
Startup fields (type = "startup")
- type (string, required): Record category.
- id (number, required): Stable startup identifier.
- url (string, required): Canonical startup URL.
- title (string, required): Startup name.
- company_profile (object, optional): Core company profile block.
- company_profile.website_url (string, optional): Company website URL.
- company_profile.logo_url (string, optional): Logo image URL.
- company_profile.one_liner (string, optional): Short company summary.
- company_profile.description (string, optional): Longer company description.
- company_profile.company_size (string, optional): Company size as displayed.
- hiring_overview (object, optional): Hiring summary block.
- hiring_overview.open_roles_total (number, optional): Total open roles shown.
- hiring_overview.roles_breakdown (array[object], optional): Hiring breakdown by function.
- hiring_overview.roles_breakdown.positionId (string, optional): Role category identifier.
- hiring_overview.roles_breakdown.positionName (string, optional): Role category name.
- hiring_overview.roles_breakdown.totalJobPostings (number, optional): Open postings in that role category.
- hiring_overview.is_actively_hiring (boolean, optional): Whether the startup is actively hiring.
- market_presence (object, optional): Market, location, and business model signals.
- market_presence.market_tags (array[object], optional): Startup market/category tags.
- market_presence.market_tags.marketTagId (string, optional): Market tag identifier.
- market_presence.market_tags.marketTagName (string, optional): Raw market tag name.
- market_presence.market_tags.marketTagSlug (string, optional): Market tag slug.
- market_presence.market_tags.marketTagDisplayName (string, optional): Display label for the tag.
- market_presence.market_tags.marketTagNoindex (boolean, optional): Visibility flag from the source.
- market_presence.location_tags (array[object], optional): Associated location tags.
- market_presence.location_tags.id (string, optional): Location tag identifier.
- market_presence.location_tags.name (string, optional): Raw location name.
- market_presence.location_tags.slug (string, optional): Location slug.
- market_presence.location_tags.displayName (string, optional): Display label for the location.
- market_presence.location_tags.locationTagNoindex (boolean, optional): Visibility flag from the source.
- market_presence.business_models (array[string], optional): Business model labels such as
B2BorB2C. - reputation_signals (object, optional): Badges and rating-related signals.
- reputation_signals.badges (array[object], optional): Reputation and status badges.
- reputation_signals.badges.badgeId (string, optional): Badge identifier.
- reputation_signals.badges.badgeName (string, optional): Badge type name.
- reputation_signals.badges.badgeLabel (string, optional): Badge display label.
- reputation_signals.badges.badgeTooltip (string, optional): Badge explanatory text.
- reputation_signals.badges.badgeData (string, optional): Additional badge metadata when present.
- reputation_signals.badges.badgeRating (string, optional): Badge rating value when present.
- reputation_signals.high_ratings (array[object], optional): Highlighted ratings.
- reputation_signals.high_ratings.label (string, optional): Rating label.
- reputation_signals.high_ratings.rating (string, optional): Rating value.
- reputation_signals.high_ratings.tooltip (string, optional): Rating context text.
- crawl_metadata (object, optional): Collection metadata for traceability.
- crawl_metadata.listing_url (string, optional): Source startup page URL.
- crawl_metadata.source_url (string, optional): Source listing or directory URL.
- crawl_metadata.seed (object, optional): Input seed metadata.
- crawl_metadata.seed.type (string, optional): Seed category.
- crawl_metadata.seed.value (string, optional): Seed value used for discovery.
- crawl_metadata.scraped_time (string, optional): ISO 8601 timestamp for record collection.
- contact_details (object, optional): Available contact enrichment block.
- contact_details.emails (array[string], optional): Business email addresses.
- contact_details.phone_numbers (array[string], optional): Business phone numbers.
- contact_details.social_media (object, optional): Social profile links.
- contact_details.social_media.facebook (string, optional): Facebook URL.
- contact_details.social_media.twitter (string, optional): Twitter/X URL.
- contact_details.social_media.linkedin (string, optional): LinkedIn URL.
Data guarantees & handling
- Best-effort extraction: fields may vary by region, session, availability, or UI experiments.
- Optional fields: null-check in downstream code.
- Deduplication: recommend
type + ":" + id.
How to Run on Apify
- Open the Actor in Apify Console.
- Configure your search parameters, such as start URLs, industry, hiring location, or technology filters.
- Set the maximum number of outputs to collect.
- Click Start and wait for the run to finish.
- Download the results in JSON, CSV, Excel, or another supported format.
Scheduling & Automation
Scheduling
Automated Data Collection
You can schedule recurring runs to keep your startup dataset fresh without rerunning jobs manually. This is useful for ongoing market monitoring, enrichment, and change tracking workflows.
- Navigate to Schedules in Apify Console
- Create a new schedule (daily, weekly, or custom cron)
- Configure input parameters
- Enable notifications for run completion
- Add webhooks for automated processing
Integration Options
- Webhooks: Trigger downstream actions when a run completes
- Zapier: Connect to 5,000+ apps without coding
- Make (Integromat): Build multi-step automation workflows
- Google Sheets: Export results to a spreadsheet
- Slack/Discord: Receive notifications and summaries
- Email: Send automated reports via email
Performance
Estimated run times:
When contact enrichment is disabled:
- Small runs (< 1,000 outputs): ~2-3 minutes
- Medium runs (1,000-5,000 outputs): ~5-15 minutes
- Large runs (5,000+ outputs): ~15-30 minutes
When contact enrichment is enabled:
- Small runs (< 1,000 outputs): ~5-10 minutes
- Medium runs (1,000-5,000 outputs): ~15-30 minutes
- Large runs (5,000+ outputs): ~30+ minutes
Execution time varies based on filters, result volume, and how much information is returned per record.
Compliance & Ethics
Responsible Data Collection
This actor collects publicly available startup company, hiring, and business contact information from Wellfound for legitimate business purposes. Common use cases include:
- Startup ecosystem research and market analysis
- Lead enrichment and account research
- Competitive monitoring and change tracking
Users are responsible for ensuring their collection and use of data complies with applicable laws, regulations, and the target site's terms. This section is informational and not legal advice.
Best Practices
- Use collected data in accordance with applicable laws, regulations, and the target site's terms
- Respect individual privacy and personal information
- Use data responsibly and avoid disruptive or excessive collection
- Do not use this actor for spamming, harassment, or other harmful purposes
- Follow relevant data protection requirements where applicable (for example, GDPR and CCPA)
Support
For help, use the Issues tab or the actor page in Apify Console. Include the input you used with sensitive values redacted, the run ID, the expected behavior, the actual behavior, and an optional small output sample if it helps reproduce the issue.