Pricing

Pay per event

Equity Crowdfunding Leads Scraper

Scrape unified equity crowdfunding leads from Wefunder, Republic, and StartEngine in one schema — founder name, company, tagline, region, raise progress, valuation — export to JSON or CSV. Built for VC scouts and SDRs targeting funded startups.

Developer

DevilScrapes

Last modified: 2 months ago

Wefunder Scraper — Equity Crowdfunding Leads

We do the dirty work so your dataset stays clean. 😈

$5.05 / 1,000 rows — pay only for results, no credit card to try. Unify currently-raising and recently-funded campaigns from the three biggest US equity-crowdfunding platforms (Wefunder, Republic, StartEngine) into one flat founder-lead dataset. Built for VC scouts, SDRs targeting founders, and competitive-intel analysts who today need to hand-visit three platforms with no shared schema.

This Actor pulls from Wefunder's internal JSON API, Republic's SSR shell, and StartEngine's offering sitemap, parses each with Pydantic-validated models, and pushes a single normalized dataset to Apify. Wefunder alone surfaces 4,800+ currently-listed companies with founder name, tagline, raise progress, investor count, and pre-money valuation in a single run.

🎯 What this scrapes

Three equity-crowdfunding platforms, one schema. This Wefunder scraper — and the Republic and StartEngine scrapers bundled alongside it — collapses three separate platforms into a single normalized equity crowdfunding leads dataset:

Wefunder — wefunder.com (primary source; full founder + tagline + raise progress + valuation payload via the internal JSON API the Wefunder SPA itself calls)
Republic — republic.com (secondary; slug + company name from trending carousel + JSON-LD breadcrumb)
StartEngine — startengine.com (secondary; slug + company name from offering sitemap; financials nullable in v1 — detail pages are JS-gated)

Per-source coverage matrix (v1):

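| Source | Slug | Company name | Tagline | Founder | Raised $ | Valuation $ | Investors |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Wefunder | yes | yes | yes | yes | yes | yes | yes |
| Republic | yes | yes | — | — | — | — | — |
| StartEngine | yes | yes | — | — | — | — | — |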

Output fields:

| Field | Type | Description |
| --- | --- | --- |
| `source` | string | Source platform (`wefunder`, `republic`, `startengine`) |
| `campaign_slug` | string | URL slug (e.g. `riserobotics`, `atari-hotels`, `ai-frontier-fund`) |
| `company_name` | string | Display name |
| `tagline` | string \| null | Short pitch one-liner |
| `industry` | string \| null | Vertical / category label |
| `location` | string \| null | Region, state, or city |
| `founders` | array | List of founder names (empty when none disclosed) |
| `website_url` | string \| null | External company website when linked |
| `target_amount_usd` | number \| null | Funding goal in USD |
| `raised_amount_usd` | number \| null | Running total raised in USD |
| `num_investors` | integer \| null | Current investor count |
| `valuation_usd` | number \| null | Pre-money valuation in USD |
| `revenue_usd` | number \| null | Latest annual revenue (Form C) when published |
| `funding_stage` | string \| null | Derived stage label (`raising`, `funded`, `closed`) |
| `campaign_url` | string | Canonical campaign detail URL |
| `scraped_at` | string | ISO 8601 UTC timestamp |

🔥 Features

Three platforms, one schema — drop straight into your CRM, spreadsheet, or BI tool with no per-source normalization.
Wefunder primary path — full founder + tagline + raise progress + valuation per row via the internal JSON API the Wefunder SPA itself calls.
Pre-money valuation parsing — Wefunder's $62.1M-style shorthand auto-converted to a numeric USD float.
Per-source isolation — one source going down does not abort the run; the remaining sources still produce data and stream results.
Pydantic v2 validation — both input and dataset rows are model-validated; invalid input fails fast with a clear error before any network call.
Filter knobs — restrict by source list, funding status (active / funded / all), industry substring, or hard row cap per source.
Exponential backoff on 429 / 503 with Retry-After honored; up to 5 attempts per page.
We rotate browser fingerprints — curl-cffi impersonation (Chrome 131 / Chrome 124 / Firefox 147) so the platforms see real-browser TLS handshakes, not Python.
We rotate residential proxies through Apify Proxy on every block — fresh session ID, fresh exit IP.
You pay only for results that land. No data pushed to the dataset means no result-row charges.
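The retry behaviour described above (backoff on 429 / 503, `Retry-After` honored, at most 5 attempts per page) can be sketched roughly as follows. This is an illustration under stated assumptions, not the Actor's actual code: `backoff_delay`, `fetch_with_retry`, the 1-second base, the 60-second cap, and the `fetch` callable returning `(status, headers, body)` are all hypothetical names and values.

```python
import time
from typing import Callable, Dict, Optional, Tuple

RETRYABLE = {429, 503}
MAX_ATTEMPTS = 5  # "up to 5 attempts per page"


def backoff_delay(attempt: int, retry_after: Optional[str],
                  base: float = 1.0, cap: float = 60.0) -> float:
    """Seconds to wait after failed attempt number `attempt` (1-based).

    A numeric Retry-After header wins; otherwise the delay doubles each
    attempt. The base and cap values are assumptions for this sketch.
    """
    if retry_after is not None:
        try:
            return min(cap, max(0.0, float(retry_after)))
        except ValueError:
            pass  # HTTP-date form of Retry-After is not handled here
    return min(cap, base * 2 ** (attempt - 1))


def fetch_with_retry(fetch: Callable[[], Tuple[int, Dict[str, str], str]],
                     sleep: Callable[[float], None] = time.sleep) -> str:
    """Call the hypothetical `fetch` hook until it stops returning 429/503."""
    for attempt in range(1, MAX_ATTEMPTS + 1):
        status, headers, body = fetch()
        if status not in RETRYABLE:
            return body  # other error statuses are left to the caller
        if attempt < MAX_ATTEMPTS:
            sleep(backoff_delay(attempt, headers.get("Retry-After")))
    raise RuntimeError(f"gave up after {MAX_ATTEMPTS} attempts")
```

Passing `sleep` as a parameter keeps the loop easy to exercise without real waiting; a production version would likely add jitter to the computed delay.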

💡 Use cases

VC scout pipelines — schedule weekly runs, enrich with LinkedIn lookups on founders[], build a "first dollar in" tracker for sub-seed deals currently raising on Wefunder.
SDR founder outreach — filter by industry substring (e.g. "fintech", "climate", "AI"), drop founders into your email tool, reach them before a Series A bump.
Competitive intelligence — track which sectors are over-raising vs under-raising quarter-over-quarter; spot whitespace before incumbents do.
Crowdfunding leaderboards — publish a public site ranking the fastest-growing campaigns this week; update daily on a scheduled run.
Cap-table benchmarks — pre-money valuation distribution by sector at the Reg CF stage, cleanly covered by Wefunder's terms.nb field.
Form C deep dives — pair this Actor with sec-edgar-filings-scraper to follow the slug from Wefunder back to the canonical Form C / Form C-AR PDF on EDGAR for revenue, expense, and SAFE term extraction.
Equity crowdfunding data API replacement — no Crunchbase contract, no manual exports. One Apify run returns the same structured equity crowdfunding data an API subscription would, on pay-per-result pricing with no contract.

⚙️ How to use it

Open the Actor input form.
(Optional) Pick Sources — leave empty to scrape all three, or list a subset like ["wefunder"].
(Optional) Set Max rows per source — default 50, cap 500.
(Optional) Pick Status filter — active (default) for currently raising, funded for recently funded, all for both.
(Optional) Set Industry filter — a case-insensitive substring matched against tagline or industry.
Leave Use Apify Proxy on (default) — Wefunder and Republic block plain datacenter IPs; we handle those blocks for you.
Click Start. Results stream into the default dataset.

Quick examples

All three sources, default settings:

{
  "sources": [],
  "maxPerSource": 50,
  "statusFilter": "active",
  "useProxy": true
}

Wefunder only, fintech rows, currently raising — highest-signal default for SDR pipelines:

{
  "sources": ["wefunder"],
  "maxPerSource": 100,
  "statusFilter": "active",
  "industryFilter": "fintech",
  "useProxy": true
}

Wefunder + Republic, 3 rows each:

{
  "sources": ["wefunder", "republic"],
  "maxPerSource": 3,
  "statusFilter": "active",
  "useProxy": false
}

📥 Input

| Field | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| `sources` | array of source literals | no | `[]` (all three) | Subset of `wefunder`, `republic`, `startengine` |
| `maxPerSource` | integer | no | `50` | Hard cap per source; range 1..500 |
| `statusFilter` | enum string | no | `"active"` | One of `active` / `funded` / `all` |
| `industryFilter` | string \| null | no | `null` | Case-insensitive substring filter over tagline/industry |
| `useProxy` | boolean | no | `true` | Route via Apify Proxy (`BUYPROXIES94952`) |
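To check an input payload locally before starting a run, the constraints in the table can be mirrored in a few lines. The sketch below uses a standard-library dataclass as a dependency-free stand-in; the Actor itself is described as using Pydantic v2, and the class name `ActorInput` and the method `effective_sources` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional

SOURCES = ("wefunder", "republic", "startengine")
STATUSES = ("active", "funded", "all")


@dataclass
class ActorInput:
    """Stand-in for the Actor's input model (field names from the table)."""
    sources: List[str] = field(default_factory=list)  # empty means all three
    maxPerSource: int = 50
    statusFilter: str = "active"
    industryFilter: Optional[str] = None
    useProxy: bool = True

    def __post_init__(self) -> None:
        # Fail fast, before any network call would be made.
        unknown = [s for s in self.sources if s not in SOURCES]
        if unknown:
            raise ValueError(f"unknown sources: {unknown}")
        if not 1 <= self.maxPerSource <= 500:
            raise ValueError("maxPerSource must be in 1..500")
        if self.statusFilter not in STATUSES:
            raise ValueError(f"statusFilter must be one of {STATUSES}")

    def effective_sources(self) -> List[str]:
        """Resolve the 'empty list means all three' default."""
        return list(self.sources) or list(SOURCES)
```

For example, `ActorInput(sources=["wefunder"], maxPerSource=100)` passes, while `ActorInput(maxPerSource=501)` raises `ValueError`.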

📤 Output

One dataset row per discovered campaign. Example record — Wefunder RISE Robotics as of 2026-05-16:

{
  "source": "wefunder",
  "campaign_slug": "riserobotics",
  "company_name": "RISE Robotics",
  "tagline": "Electrifying heavy machines",
  "industry": null,
  "location": "MA",
  "founders": ["Hiten Sonpal"],
  "website_url": null,
  "target_amount_usd": null,
  "raised_amount_usd": 17448682.0,
  "num_investors": 417,
  "valuation_usd": 62100000.0,
  "revenue_usd": null,
  "funding_stage": "raising",
  "campaign_url": "https://wefunder.com/riserobotics",
  "scraped_at": "2026-05-16T13:40:00.000Z"
}

Download the dataset as JSON, CSV, Excel, or XML from the Export button on the run page.
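Once exported, rows in this shape are straightforward to turn into an outreach list. The following is a minimal sketch of one way to do that, assuming rows have already been loaded as dictionaries (for instance with `json.load` on a JSON export); `matches_industry` and `founder_leads` are hypothetical helper names, and the substring match only approximates the Actor's own `industryFilter`.

```python
from typing import Dict, Iterable, Iterator, Optional


def matches_industry(row: Dict, needle: Optional[str]) -> bool:
    """Case-insensitive substring match against tagline or industry."""
    if not needle:
        return True
    needle = needle.lower()
    return any(needle in (row.get(key) or "").lower()
               for key in ("tagline", "industry"))


def founder_leads(rows: Iterable[Dict],
                  needle: Optional[str] = None) -> Iterator[Dict]:
    """Yield one lead per founder; rows with no founders yield nothing."""
    for row in rows:
        if not matches_industry(row, needle):
            continue
        for founder in row.get("founders") or []:
            yield {
                "founder": founder,
                "company_name": row["company_name"],
                "source": row["source"],
                "campaign_url": row["campaign_url"],
            }
```

Republic and StartEngine rows carry an empty `founders` list in v1, so they drop out of the lead list automatically.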

💰 Pricing

Pay-Per-Event (PPE) — you pay only for what lands:

| Event | Rate (USD) | Trigger |
| --- | --- | --- |
| `actor-start` | $0.05 | Once per Actor run at boot |
| `result-row` | $0.005 | Per campaign row pushed |

A typical run (all three sources, default 50/source = 150 rows) costs ~$0.80. Per-1,000-row extrapolation: $5.05 — sourced directly from public crowdfunding campaigns, with no subscription and no per-seat contract.
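The arithmetic behind those figures is one start fee plus a per-row fee. A small sketch for estimating a run, using the two rates from the table; the function name `run_cost` and the `runs` parameter are illustrative only.

```python
from decimal import Decimal

ACTOR_START = Decimal("0.05")   # charged once per run
RESULT_ROW = Decimal("0.005")   # charged per row pushed


def run_cost(rows: int, runs: int = 1) -> Decimal:
    """Estimated USD cost of pushing `rows` rows across `runs` runs."""
    if rows < 0 or runs < 1:
        raise ValueError("rows must be >= 0 and runs >= 1")
    return ACTOR_START * runs + RESULT_ROW * rows


print(run_cost(150))    # 0.800 -> the ~$0.80 typical run
print(run_cost(1000))   # 5.050 -> the $5.05 per-1,000-row figure
```

The $5.05 figure assumes all 1,000 rows come from a single run; splitting them across several runs adds one $0.05 start fee per extra run.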

🚧 Limitations

StartEngine detail pages are JavaScript-gated. v1 emits one row per offering slug from the public sitemap-private-offerings.xml with name derived from the slug; live raise progress, valuation, and investor count stay nullable on this source. A v2 Camoufox-backed full-render path is planned.
Republic financials are client-rendered. v1 surfaces ~10 trending campaign slugs per run from the SSR shell with company name from the JSON-LD breadcrumb. Raised amount, valuation, and investor count are out of reach without a real browser.
Wefunder is the data-rich source. Run Wefunder-only (sources: ["wefunder"]) when you need the most fields per row.
Authoritative campaign data only. No SEC EDGAR Form C parsing (use sec-edgar-filings-scraper), no investor identity scraping (privacy), no comment threads or campaign updates.
7-day default-storage retention on the Apify free plan. Schedule runs and export to your own storage for time-series tracking.
No historical tracking. Every run is a fresh snapshot. Pipe each run's output into BigQuery / S3 / Snowflake to build deltas.
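Because each run is a snapshot, deltas have to be computed downstream. One way to do that, sketched under the assumption that `(source, campaign_slug)` uniquely identifies a campaign across runs (the listing does not state this); `index_rows` and `raise_deltas` are hypothetical helper names.

```python
from typing import Dict, Iterable, List, Tuple

Key = Tuple[str, str]


def index_rows(rows: Iterable[Dict]) -> Dict[Key, Dict]:
    """Index a snapshot by (source, campaign_slug); assumed unique."""
    return {(r["source"], r["campaign_slug"]): r for r in rows}


def raise_deltas(previous: Iterable[Dict], current: Iterable[Dict]) -> List[Dict]:
    """Compare two snapshots and report the change in raised amount."""
    before = index_rows(previous)
    report = []
    for key, row in index_rows(current).items():
        old = before.get(key)
        new_amt = row.get("raised_amount_usd")
        old_amt = old.get("raised_amount_usd") if old else None
        # Nullable financials (Republic, StartEngine in v1) yield no delta.
        delta = (new_amt - old_amt
                 if new_amt is not None and old_amt is not None else None)
        report.append({
            "source": key[0],
            "campaign_slug": key[1],
            "is_new": old is None,
            "raised_delta_usd": delta,
        })
    return report
```

Campaigns that disappear between runs are not reported here; detecting those would need a pass over the keys present only in the earlier snapshot.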

❓ FAQ

Is there a Wefunder API I can use instead? Wefunder does not publish an official public API. The Wefunder API this Actor uses is the internal JSON endpoint that the Wefunder SPA calls — /-/companies/explore — which returns the full founder + raise progress + valuation payload. We handle the session management, fingerprint rotation, and retry logic so you get clean structured rows without reverse-engineering anything yourself.

Does this work as a Republic startups scraper and a StartEngine scraper too? Yes. The Actor bundles a Republic startups scraper path (trending campaigns from the SSR carousel) and a StartEngine scraper path (active offering slugs from the public sitemap) alongside the primary Wefunder path. Republic and StartEngine return sparser rows in v1 because their financial data is client-rendered and requires a full browser to unlock — that is on our v2 roadmap.

Can I use this as an equity crowdfunding data API? Exactly the use case it was built for. Schedule it weekly, export to your storage of choice (S3, BigQuery, Snowflake via Apify integrations), and treat the output as a continuously refreshed equity crowdfunding data feed. No Crunchbase subscription, no manual exports, no per-platform schema mapping.

What is the Wefunder API address and how does the valuation field work? The Actor calls https://wefunder.com/-/companies/explore (paginated). Wefunder encodes pre-money valuation as dollar-shorthand text ("$62.1M", "$700K", "$1.2B") in the terms.nb field. This Actor parses that shorthand into a numeric USD float in valuation_usd. Malformed values emit null rather than crashing.
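The shorthand conversion described above can be illustrated with a short parser. This is a sketch of the general technique, not the Actor's code: the function name `parse_valuation` and the accepted grammar (a `$`, a number with optional thousands separators, and an optional `K` / `M` / `B` suffix) are assumptions.

```python
import re
from decimal import Decimal
from typing import Optional

_MULTIPLIERS = {"": 1, "K": 10**3, "M": 10**6, "B": 10**9}
_PATTERN = re.compile(
    r"^\$\s*(\d+(?:,\d{3})*(?:\.\d+)?)\s*([KMB])?$", re.IGNORECASE
)


def parse_valuation(text: object) -> Optional[float]:
    """Convert '$62.1M'-style shorthand to USD; None when malformed."""
    if not isinstance(text, str):
        return None
    match = _PATTERN.match(text.strip())
    if match is None:
        return None
    number = Decimal(match.group(1).replace(",", ""))
    suffix = (match.group(2) or "").upper()
    # Decimal keeps 62.1 * 1,000,000 exact before the final float cast.
    return float(number * _MULTIPLIERS[suffix])


print(parse_valuation("$62.1M"))  # 62100000.0
print(parse_valuation("TBD"))     # None
```

Returning `None` on anything unrecognised mirrors the "malformed values emit null" behaviour, so one odd row cannot abort a run.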

What about the Republic.com API and StartEngine API? Neither platform publishes an official Republic.com API or StartEngine API for campaign data. This Actor uses the public, unauthenticated surfaces each platform exposes: Republic's SSR page JSON-LD for company names, and StartEngine's public XML sitemap for offering slugs. Full financial details require a browser render; see Limitations above.

Why only these three platforms? Wefunder, Republic, and StartEngine are the three largest US equity-crowdfunding portals by total capital deployed. NextSeed, MicroVentures, and LATAM portals are deliberately out of scope for v1 — open a feature request if you need them.

Does this Actor track valuation changes across runs? No — every run is a fresh snapshot. Schedule runs and export to your own storage to build a time series. Apify's default run-scoped storage is purged after 7 days on the free plan.

Companion Actor? Yes — sec-edgar-filings-scraper is the natural follow-up for any campaign you want a deep dive on. Take the slug from this Actor, look up the issuer's CIK on EDGAR, and parse the Form C / Form C-AR PDFs for revenue, expense, share count, and SAFE-term detail.

💬 Your feedback

Found a parser that broke after Wefunder, Republic, or StartEngine restructured their page? Want a fourth platform added (NextSeed, MicroVentures, etc.)? Open an issue on the Actor's Apify Store page or contact us at apify.com/DevilScrapes. We monitor publish-day failures and ship patches the same week.
