Equity Crowdfunding Leads Scraper
Pricing: Pay per event
Scrape unified equity crowdfunding leads from Wefunder, Republic, and StartEngine in one schema — founder name, company, tagline, region, raise progress, valuation — export to JSON or CSV. Built for VC scouts and SDRs targeting funded startups.
Developer: DevilScrapes
Wefunder Scraper — Equity Crowdfunding Leads
We do the dirty work so your dataset stays clean. 😈
$5.05 / 1,000 rows — pay only for results, no credit card to try. Unify currently-raising and recently-funded campaigns from the three biggest US equity-crowdfunding platforms (Wefunder, Republic, StartEngine) into one flat founder-lead dataset. Built for VC scouts, SDRs targeting founders, and competitive-intel analysts who today need to hand-visit three platforms with no shared schema.
This Actor pulls from Wefunder's internal JSON API, Republic's SSR shell, and StartEngine's offering sitemap, parses each with Pydantic-validated models, and pushes a single normalized dataset to Apify. Wefunder alone surfaces 4,800+ currently-listed companies with founder name, tagline, raise progress, investor count, and pre-money valuation in a single run.
🎯 What this scrapes
Three equity-crowdfunding platforms, one schema. This Wefunder scraper — and the Republic and StartEngine scrapers bundled alongside it — collapses three separate platforms into a single normalized equity crowdfunding leads dataset:
- Wefunder: `wefunder.com` (primary source; full founder + tagline + raise progress + valuation payload via the internal JSON API the Wefunder SPA itself calls)
- Republic: `republic.com` (secondary; slug + company name from trending carousel + JSON-LD breadcrumb)
- StartEngine: `startengine.com` (secondary; slug + company name from offering sitemap; financials nullable in v1 because detail pages are JS-gated)
Per-source coverage matrix (v1):
| Source | Slug | Company name | Tagline | Founder | Raised $ | Valuation $ | Investors |
|---|---|---|---|---|---|---|---|
| Wefunder | yes | yes | yes | yes | yes | yes | yes |
| Republic | yes | yes | — | — | — | — | — |
| StartEngine | yes | yes | — | — | — | — | — |
Output fields:
| Field | Type | Description |
|---|---|---|
| `source` | string | Source platform (`wefunder`, `republic`, `startengine`) |
| `campaign_slug` | string | URL slug (e.g. `riserobotics`, `atari-hotels`, `ai-frontier-fund`) |
| `company_name` | string | Display name |
| `tagline` | string \| null | Short pitch one-liner |
| `industry` | string \| null | Vertical / category label |
| `location` | string \| null | Region, state, or city |
| `founders` | array | List of founder names (empty when none disclosed) |
| `website_url` | string \| null | External company website when linked |
| `target_amount_usd` | number \| null | Funding goal in USD |
| `raised_amount_usd` | number \| null | Running total raised in USD |
| `num_investors` | integer \| null | Current investor count |
| `valuation_usd` | number \| null | Pre-money valuation in USD |
| `revenue_usd` | number \| null | Latest annual revenue (Form C) when published |
| `funding_stage` | string \| null | Derived stage label (`raising`, `funded`, `closed`) |
| `campaign_url` | string | Canonical campaign detail URL |
| `scraped_at` | string | ISO 8601 UTC timestamp |
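To illustrate how sparse sources can share the same schema as Wefunder, here is a minimal sketch of a row builder that defaults every undisclosed field to `null`. The field names come from the table above; the `make_row` helper and its signature are hypothetical and not taken from the Actor's code.

```python
from datetime import datetime, timezone

# Nullable fields from the output table; founders is handled separately
# because it defaults to an empty list rather than None.
NULLABLE_FIELDS = (
    "tagline", "industry", "location", "website_url",
    "target_amount_usd", "raised_amount_usd", "num_investors",
    "valuation_usd", "revenue_usd", "funding_stage",
)


def make_row(source, campaign_slug, company_name, campaign_url, **known):
    """Build one dataset row; anything the source did not disclose stays None."""
    unexpected = set(known) - set(NULLABLE_FIELDS) - {"founders"}
    if unexpected:
        raise ValueError(f"unexpected fields: {sorted(unexpected)}")
    row = {
        "source": source,
        "campaign_slug": campaign_slug,
        "company_name": company_name,
    }
    for name in NULLABLE_FIELDS:
        row[name] = known.get(name)
    row["founders"] = list(known.get("founders", []))
    row["campaign_url"] = campaign_url
    # isoformat() writes the UTC offset as +00:00 rather than a trailing Z.
    row["scraped_at"] = datetime.now(timezone.utc).isoformat()
    return row


rich = make_row(
    "wefunder", "riserobotics", "RISE Robotics",
    "https://wefunder.com/riserobotics",
    tagline="Electrifying heavy machines", founders=["Hiten Sonpal"],
)
```

A Republic or StartEngine row built the same way, with no keyword arguments, would carry the same 16 keys with the financial fields set to `None`.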
🔥 Features
- Three platforms, one schema: drop straight into your CRM, spreadsheet, or BI tool with no per-source normalization.
- Wefunder primary path: full founder + tagline + raise progress + valuation per row via the internal JSON API the Wefunder SPA itself calls.
- Pre-money valuation parsing: Wefunder's `$62.1M`-style shorthand is auto-converted to a numeric USD float.
- Per-source isolation: one source going down does not abort the run; the remaining sources still produce data and stream results.
- Pydantic v2 validation: both input and dataset rows are model-validated; invalid input fails fast with a clear error before any network call.
- Filter knobs: restrict by source list, funding status (active / funded / all), industry substring, or hard row cap per source.
- Exponential backoff on `429`/`503` with `Retry-After` honored; up to 5 attempts per page.
- We rotate browser fingerprints: curl-cffi impersonation (Chrome 131 / Chrome 124 / Firefox 147) means the platforms see real-browser TLS handshakes, not Python.
- We rotate residential proxies through Apify Proxy on every block: fresh session ID, fresh exit IP.
- You pay only for results that land. No data pushed to the dataset means no result-row charges.
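The retry behavior listed above (exponential backoff on `429`/`503`, `Retry-After` honored, up to 5 attempts) could look roughly like the sketch below. The function names, the 1-second base delay, and the 60-second cap are assumptions for illustration; the Actor's real values are not documented here. The `fetch` callable is injected so the logic can be exercised without any network access.

```python
import time

RETRYABLE = {429, 503}
MAX_ATTEMPTS = 5  # "up to 5 attempts per page"


def backoff_delay(attempt, retry_after=None, base=1.0, cap=60.0):
    """Seconds to wait after failed attempt number `attempt` (1-based)."""
    if retry_after is not None:
        try:
            # Honor a numeric Retry-After header, clamped to [0, cap].
            return min(cap, max(0.0, float(retry_after)))
        except ValueError:
            pass  # the HTTP-date form of Retry-After is not handled in this sketch
    return min(cap, base * 2 ** (attempt - 1))


def fetch_with_retry(fetch, url, sleep=time.sleep):
    """Call fetch(url) -> (status, headers, body), retrying on 429/503."""
    for attempt in range(1, MAX_ATTEMPTS + 1):
        status, headers, body = fetch(url)
        if status not in RETRYABLE:
            return status, body
        if attempt < MAX_ATTEMPTS:
            sleep(backoff_delay(attempt, headers.get("Retry-After")))
    return status, body  # still failing after the final attempt
```

With the assumed base of 1 second, a page that keeps returning `503` without a `Retry-After` header would wait 1, 2, 4, and 8 seconds between the five attempts.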
💡 Use cases
- VC scout pipelines: schedule weekly runs, enrich with LinkedIn lookups on `founders[]`, and build a "first dollar in" tracker for sub-seed deals currently raising on Wefunder.
- SDR founder outreach: filter by `industry` substring (e.g. `"fintech"`, `"climate"`, `"AI"`), drop founders into your email tool, and reach them before a Series A bump.
- Competitive intelligence: track which sectors are over-raising vs under-raising quarter-over-quarter; spot whitespace before incumbents do.
- Crowdfunding leaderboards: publish a public site ranking the fastest-growing campaigns this week; update daily on a scheduled run.
- Cap-table benchmarks: pre-money valuation distribution by sector at the Reg CF stage, cleanly covered by Wefunder's `terms.nb` field.
- Form C deep dives: pair this Actor with `sec-edgar-filings-scraper` to follow the slug from Wefunder back to the canonical Form C / Form C-AR PDF on EDGAR for revenue, expense, and SAFE term extraction.
- Equity crowdfunding data API replacement: no Crunchbase contract, no manual exports. One Apify run returns the same structured equity crowdfunding data an API subscription would, on pay-per-result pricing with no contract.
⚙️ How to use it
- Open the Actor input form.
- (Optional) Pick Sources: leave empty to scrape all three, or list a subset like `["wefunder"]`.
- (Optional) Set Max rows per source: default 50, cap 500.
- (Optional) Pick Status filter: `active` (default) for currently raising, `funded` for recently funded, `all` for both.
- (Optional) Set Industry filter: a case-insensitive substring matched against `tagline` or `industry`.
- Leave Use Apify Proxy on (default). Wefunder and Republic block plain datacenter IPs; we handle those blocks for you.
- Click Start. Results stream into the default dataset.
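The industry filter and the per-source row cap from the steps above can be reproduced on an exported dataset with a few lines of Python. This is a sketch of the documented behavior (case-insensitive substring over `tagline` or `industry`, then a hard cap), not the Actor's own code; `matches_industry` and `filter_rows` are hypothetical names.

```python
from itertools import islice


def matches_industry(row, needle):
    """True when needle occurs, case-insensitively, in tagline or industry."""
    if not needle:
        return True  # no filter set: keep every row
    needle = needle.lower()
    haystacks = (row.get("tagline"), row.get("industry"))
    return any(needle in text.lower() for text in haystacks if text)


def filter_rows(rows, industry_filter=None, max_rows=50):
    """Apply the industry filter, then keep at most max_rows rows."""
    matching = (row for row in rows if matches_industry(row, industry_filter))
    return list(islice(matching, max_rows))


rows = [
    {"tagline": "Electrifying heavy machines", "industry": None},
    {"tagline": None, "industry": "FinTech"},
    {"tagline": "Payments for fintech teams", "industry": None},
]
fintech = filter_rows(rows, "fintech")  # keeps the second and third rows
```

Because the match is a plain substring, very short filters such as `"AI"` also match inside longer words (for example "retail"), so a longer keyword usually gives cleaner results.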
Quick examples
All three sources, default settings:

    {
      "sources": [],
      "maxPerSource": 50,
      "statusFilter": "active",
      "useProxy": true
    }

Wefunder only, fintech rows, currently raising (the highest-signal default for SDR pipelines):

    {
      "sources": ["wefunder"],
      "maxPerSource": 100,
      "statusFilter": "active",
      "industryFilter": "fintech",
      "useProxy": true
    }

Wefunder + Republic, 3 rows each:

    {
      "sources": ["wefunder", "republic"],
      "maxPerSource": 3,
      "statusFilter": "active",
      "useProxy": false
    }
📥 Input
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| `sources` | array of source literals | no | `[]` (all three) | Subset of `wefunder`, `republic`, `startengine` |
| `maxPerSource` | integer | no | `50` | Hard cap per source; range 1..500 |
| `statusFilter` | enum string | no | `"active"` | One of `active` / `funded` / `all` |
| `industryFilter` | string \| null | no | `null` | Case-insensitive substring filter over `tagline`/`industry` |
| `useProxy` | boolean | no | `true` | Route via Apify Proxy (`BUYPROXIES94952`) |
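If you build the input programmatically, it can help to check it against the table above before starting a run. The sketch below mirrors the documented fields, defaults, and ranges using only the standard library. It is a stand-in for illustration, not the Actor's Pydantic model, and the `ActorInput` class name is hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional

ALLOWED_SOURCES = ("wefunder", "republic", "startengine")
ALLOWED_STATUS = ("active", "funded", "all")


@dataclass
class ActorInput:
    sources: list = field(default_factory=list)
    maxPerSource: int = 50
    statusFilter: str = "active"
    industryFilter: Optional[str] = None
    useProxy: bool = True

    def __post_init__(self):
        unknown = [s for s in self.sources if s not in ALLOWED_SOURCES]
        if unknown:
            raise ValueError(f"unknown sources: {unknown}")
        if not self.sources:
            # An empty list means "all three", per the input table.
            self.sources = list(ALLOWED_SOURCES)
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(self.maxPerSource, bool) or not isinstance(self.maxPerSource, int):
            raise ValueError("maxPerSource must be an integer")
        if not 1 <= self.maxPerSource <= 500:
            raise ValueError("maxPerSource must be in the range 1..500")
        if self.statusFilter not in ALLOWED_STATUS:
            raise ValueError(f"statusFilter must be one of {ALLOWED_STATUS}")


defaults = ActorInput()
wefunder_only = ActorInput(sources=["wefunder"], maxPerSource=100, industryFilter="fintech")
```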
📤 Output
One dataset row per discovered campaign. Example record (Wefunder, RISE Robotics, as of 2026-05-16):

    {
      "source": "wefunder",
      "campaign_slug": "riserobotics",
      "company_name": "RISE Robotics",
      "tagline": "Electrifying heavy machines",
      "industry": null,
      "location": "MA",
      "founders": ["Hiten Sonpal"],
      "website_url": null,
      "target_amount_usd": null,
      "raised_amount_usd": 17448682.0,
      "num_investors": 417,
      "valuation_usd": 62100000.0,
      "revenue_usd": null,
      "funding_stage": "raising",
      "campaign_url": "https://wefunder.com/riserobotics",
      "scraped_at": "2026-05-16T13:40:00.000Z"
    }
Download the dataset as JSON, CSV, Excel, or XML from the Export button on the run page.
💰 Pricing
Pay-Per-Event (PPE) — you pay only for what lands:
| Event | Rate (USD) | Trigger |
|---|---|---|
| `actor-start` | $0.05 | Once per Actor run at boot |
| `result-row` | $0.005 | Per campaign row pushed |
A typical run (all three sources, default 50/source = 150 rows) costs about $0.80: $0.05 for the start event plus 150 × $0.005. A 1,000-row run costs $5.05 ($0.05 + 1,000 × $0.005). Rows are sourced directly from public crowdfunding campaigns, with no subscription and no per-seat contract.
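The arithmetic behind those figures follows directly from the two event rates in the table. A small estimator, sketched here with a hypothetical `run_cost` helper and `Decimal` to avoid floating-point rounding on currency:

```python
from decimal import Decimal

ACTOR_START_USD = Decimal("0.05")   # charged once per run
RESULT_ROW_USD = Decimal("0.005")   # charged per row pushed


def run_cost(rows_pushed, runs=1):
    """Estimated charge in USD for a number of runs and total rows pushed."""
    return ACTOR_START_USD * runs + RESULT_ROW_USD * rows_pushed


print(run_cost(150))    # 0.800
print(run_cost(1000))   # 5.050
```

A run that pushes no rows still incurs the `actor-start` event, so `run_cost(0)` is $0.05.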
🚧 Limitations
- StartEngine detail pages are JavaScript-gated. v1 emits one row per offering slug from the public `sitemap-private-offerings.xml` with name derived from the slug; live raise progress, valuation, and investor count stay nullable on this source. A v2 Camoufox-backed full-render path is planned.
- Republic financials are client-rendered. v1 surfaces ~10 trending campaign slugs per run from the SSR shell with company name from the JSON-LD breadcrumb. Raised amount, valuation, and investor count are out of reach without a real browser.
- Wefunder is the data-rich source. Run Wefunder-only (`sources: ["wefunder"]`) when you need the most fields per row.
- Authoritative campaign data only. No SEC EDGAR Form C parsing (use `sec-edgar-filings-scraper`), no investor identity scraping (privacy), no comment threads or campaign updates.
- 7-day default-storage retention on the Apify free plan. Schedule runs and export to your own storage for time-series tracking.
- No historical tracking. Every run is a fresh snapshot. Pipeline runs into BigQuery / S3 / Snowflake to build deltas.
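To make the StartEngine limitation concrete, the sketch below shows one way a sitemap entry can be turned into a slug and a slug-derived display name. The sample URLs use a placeholder `example.com` host and an assumed `/offering/<slug>` path; the real sitemap layout and the Actor's naming rules may differ.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Standard namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def slugs_from_sitemap(xml_text):
    """Return the last path segment of every <loc> URL in a sitemap."""
    root = ET.fromstring(xml_text)
    slugs = []
    for loc in root.findall("sm:url/sm:loc", SITEMAP_NS):
        path = urlparse((loc.text or "").strip()).path.rstrip("/")
        if path:
            slugs.append(path.rsplit("/", 1)[-1])
    return slugs


def name_from_slug(slug):
    """Derive a display name by capitalizing each hyphen-separated part."""
    return " ".join(part.capitalize() for part in slug.split("-") if part)


# Hypothetical sitemap fragment; the URLs are placeholders.
SAMPLE = (
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    "<url><loc>https://example.com/offering/atari-hotels</loc></url>"
    "<url><loc>https://example.com/offering/ai-frontier-fund/</loc></url>"
    "</urlset>"
)
names = [name_from_slug(slug) for slug in slugs_from_sitemap(SAMPLE)]
```

Slug-derived names are approximate: `atari-hotels` becomes "Atari Hotels", but `ai-frontier-fund` becomes "Ai Frontier Fund", so acronyms and brand casing are lost until a detail page can be rendered.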
❓ FAQ
Is there a Wefunder API I can use instead?
Wefunder does not publish an official public API. The Wefunder API this Actor uses is the internal JSON endpoint that the Wefunder SPA calls — /-/companies/explore — which returns the full founder + raise progress + valuation payload. We handle the session management, fingerprint rotation, and retry logic so you get clean structured rows without reverse-engineering anything yourself.
Does this work as a Republic startups scraper and a StartEngine scraper too?
Yes. The Actor bundles a Republic startups scraper path (trending campaigns from the SSR carousel) and a StartEngine scraper path (active offering slugs from the public sitemap) alongside the primary Wefunder path. Republic and StartEngine return sparser rows in v1 because their financial data is client-rendered and requires a full browser to unlock. Browser rendering is on our v2 roadmap.
Can I use this as an equity crowdfunding data API?
That is exactly the use case it was built for. Schedule it weekly, export to your storage of choice (S3, BigQuery, Snowflake via Apify integrations), and treat the output as a continuously refreshed equity crowdfunding data feed. No Crunchbase subscription, no manual exports, no per-platform schema mapping.
What is the Wefunder API address and how does the valuation field work?
The Actor calls https://wefunder.com/-/companies/explore (paginated). Wefunder encodes pre-money valuation as dollar-shorthand text ("$62.1M", "$700K", "$1.2B") in the terms.nb field. This Actor parses that shorthand into a numeric USD float in valuation_usd. Malformed values emit null rather than crashing.
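The shorthand conversion described above can be sketched as follows. The accepted formats are inferred from the three examples given (`$62.1M`, `$700K`, `$1.2B`); the `parse_usd_shorthand` name and the exact regular expression are assumptions, not the Actor's implementation.

```python
import re
from decimal import Decimal

_MULTIPLIERS = {"": 1, "K": 10**3, "M": 10**6, "B": 10**9}
_SHORTHAND = re.compile(
    r"^\$\s*([0-9][0-9,]*(?:\.[0-9]+)?)\s*([KMB])?$", re.IGNORECASE
)


def parse_usd_shorthand(text):
    """Convert '$62.1M'-style text to a float in USD, or None if malformed."""
    if not isinstance(text, str):
        return None
    match = _SHORTHAND.match(text.strip())
    if match is None:
        return None
    number, suffix = match.groups()
    # Decimal keeps 62.1 * 1,000,000 exact before the final float conversion.
    amount = Decimal(number.replace(",", "")) * _MULTIPLIERS[(suffix or "").upper()]
    return float(amount)


print(parse_usd_shorthand("$62.1M"))  # 62100000.0
print(parse_usd_shorthand("$700K"))   # 700000.0
print(parse_usd_shorthand("N/A"))     # None
```

Returning `None` for anything that does not match keeps a single malformed value from failing the whole row, which matches the null-on-malformed behavior described in the answer.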
What about the Republic.com API and StartEngine API?
Neither Republic nor StartEngine publishes an official API for campaign data. This Actor uses the public, unauthenticated surfaces each platform exposes: Republic's SSR page JSON-LD for company names, and StartEngine's public XML sitemap for offering slugs. Full financial details require a browser render; see Limitations above.
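For readers curious what reading a JSON-LD breadcrumb involves, here is a standard-library sketch. It assumes the page embeds a schema.org `BreadcrumbList` in a `<script type="application/ld+json">` tag and that the last breadcrumb is the company name; both are assumptions about Republic's markup, and the helper names are hypothetical.

```python
import json
from html.parser import HTMLParser


class _JsonLdCollector(HTMLParser):
    """Collect the parsed contents of every application/ld+json script tag."""

    def __init__(self):
        super().__init__()
        self._inside = False
        self._chunks = []
        self.documents = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._inside = True
            self._chunks = []

    def handle_data(self, data):
        if self._inside:
            self._chunks.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._inside:
            self._inside = False
            try:
                self.documents.append(json.loads("".join(self._chunks)))
            except json.JSONDecodeError:
                pass  # skip blocks that are not valid JSON


def breadcrumb_names(html):
    """Return breadcrumb names in position order from any BreadcrumbList."""
    parser = _JsonLdCollector()
    parser.feed(html)
    parser.close()
    names = []
    for doc in parser.documents:
        if isinstance(doc, dict) and doc.get("@type") == "BreadcrumbList":
            items = [i for i in doc.get("itemListElement", []) if isinstance(i, dict)]
            items.sort(key=lambda item: item.get("position", 0))
            names.extend(item["name"] for item in items if item.get("name"))
    return names


def company_name_from_breadcrumb(html):
    """Assumption: the final breadcrumb entry is the company name."""
    names = breadcrumb_names(html)
    return names[-1] if names else None
```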
Why only these three platforms?
Wefunder, Republic, and StartEngine are the three largest US equity-crowdfunding portals by total capital deployed. NextSeed, MicroVentures, and LATAM portals are deliberately out of scope for v1; open a feature request if you need them.
Does this Actor track valuation changes across runs?
No. Every run is a fresh snapshot. Schedule runs and export to your own storage to build a time series. Apify's default run-scoped storage is purged after 7 days on the free plan.
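Once two exported snapshots sit in your own storage, the change between them can be computed by joining on `source` and `campaign_slug`. A minimal sketch, assuming each snapshot is a list of rows in the output schema; `raise_deltas` is a hypothetical helper, not part of the Actor.

```python
def raise_deltas(previous, current):
    """Map (source, campaign_slug) to the change in raised_amount_usd."""

    def index(rows):
        return {
            (row["source"], row["campaign_slug"]): row.get("raised_amount_usd")
            for row in rows
        }

    before, after = index(previous), index(current)
    deltas = {}
    for key, now in after.items():
        then = before.get(key)
        # Skip campaigns that are new in this snapshot or have null amounts.
        if now is not None and then is not None:
            deltas[key] = now - then
    return deltas
```

The same join works for `valuation_usd` or `num_investors`. Rows from Republic and StartEngine drop out of the result in v1 because their amounts are null.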
Companion Actor?
Yes — sec-edgar-filings-scraper is the natural follow-up for any campaign you want a deep dive on. Take the slug from this Actor, look up the issuer's CIK on EDGAR, and parse the Form C / Form C-AR PDFs for revenue, expense, share count, and SAFE-term detail.
💬 Your feedback
Found a parser that broke after Wefunder, Republic, or StartEngine restructured their page? Want a fourth platform added (NextSeed, MicroVentures, etc.)? Open an issue on the Actor's Apify Store page or contact us at apify.com/DevilScrapes. We monitor publish-day failures and ship patches the same week.