SEC Form D Leads Scraper

Scrape US SEC Form D filings — Regulation D private offerings — with issuer, offering amount, related persons, industry, state, and date filed. We handle EDGAR's pagination, retries, and rate-limit pacing. Typed rows ready for a B2B lead pipeline.

Developer: DevilScrapes (maintained by Community)

We do the dirty work so your dataset stays clean. 😈

$5.05 / 1,000 rows — Turn the U.S. SEC EDGAR Form D filing stream into a structured B2B lead list. Every U.S. company raising private capital under Regulation D must file Form D with the SEC within 15 days of the first sale, making this a real-time feed of "we just raised money" announcements. Every row carries the issuer's legal name, address, phone number, industry group, offering amount, investor count, and the named officers / directors / promoters. No API key, no login, no browser automation — the data is U.S. government public domain.

The Apify Store has exactly one active competitor for Form D (logiover/sec-edgar-form-d-scraper — 9 monthly active users, zero reviews, no state or raise-size filters). This Actor ships the same primary source with deeper extraction (officers and addresses, not just headline numbers), proper amendment handling, post-filters for state and minimum raise size, and the EDGAR rate-limit + User-Agent compliance baked in.

🎯 What this scrapes

One ResultRow per qualifying Form D filing. Each row has 27 top-level fields plus a nested related_persons list of officers, directors, and promoters. Data is U.S. government public domain — there are no terms-of-service restrictions on redistribution.

| Field | Type | Description |
|---|---|---|
| accession_number | string | SEC accession number with dashes |
| cik | string | Zero-padded 10-character Central Index Key |
| entity_name | string | Issuer legal name |
| entity_type | string or null | Corporation, Limited Liability Company, Limited Partnership, etc. |
| jurisdiction_of_incorporation | string or null | Full state / country name of incorporation |
| year_of_incorporation | integer or null | Null when issuer reports overFiveYears=true |
| issuer_street_1, issuer_street_2, issuer_city, issuer_state_or_country, issuer_zip_code | string or null (each) | Issuer postal address |
| issuer_phone_number | string or null | Issuer phone as filed (raw) |
| industry_group_type | string or null | E.g. Other Technology, Pooled Investment Fund, Biotechnology |
| total_offering_amount_usd | number or null | Null when issuer reports Indefinite |
| total_amount_sold_usd | number or null | Amount sold to date in USD |
| total_remaining_usd | number or null | Null when Indefinite |
| is_indefinite_amount | boolean | True when issuer reported Indefinite |
| total_number_already_invested | integer or null | Investor count to date |
| minimum_investment_usd | number or null | Null when no minimum (raw "0") |
| exemption_claimed | string[] | Federal exemption codes (e.g. ["06B", "3C", "3C.1"]) |
| is_new_notice | boolean | True for new D, False for D/A amendment |
| date_of_first_sale | string or null | ISO date; null when yetToOccur=true |
| related_persons | object[] | Officers / directors / promoters with names and locations |
| filing_date | string | EDGAR acceptance date (ISO) |
| filing_url | string | Canonical EDGAR company-filings page |
| scraped_at | string | ISO 8601 UTC timestamp |
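
The null rules in the table can be sketched in a few lines of Python. This is a minimal illustration rather than the Actor's actual code: the function names are hypothetical, and the inputs are assumed to be the raw strings as they appear in the filing.

```python
from typing import Optional


def parse_offering_amount(raw: Optional[str]) -> tuple[Optional[float], bool]:
    """Return (amount_usd, is_indefinite) for a raw offering-amount string.

    Hypothetical helper: the literal "Indefinite" maps to (None, True).
    """
    if raw is None or not raw.strip():
        return None, False
    text = raw.strip()
    if text.lower() == "indefinite":
        return None, True
    return float(text), False


def parse_minimum_investment(raw: Optional[str]) -> Optional[float]:
    """Map a raw "0" (no minimum) to None; otherwise parse the value as USD."""
    if raw is None or not raw.strip():
        return None
    value = float(raw.strip())
    return None if value == 0 else value
```

Returning the flag alongside the amount keeps "no number because Indefinite" distinguishable from "no number because the element was missing".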

🔥 Features

  • Real-time SEC EDGAR Form D feed — every Reg D capital raise in the U.S. lands here within 15 days of the first sale.
  • 27-field extraction including the structured B2B contact fields (phone, address, named officers) that the general sec-edgar-filings-scraper does not surface.
  • Free-text query, date range, U.S. state filter, and minimum-raise-size filter. All four inputs are optional, and filtering is applied before the per-row charge, so you pay only for kept rows.
  • Amendment toggle: includeAmendments=false (default) skips Form D/A entries before fetching the XML — saves both bandwidth and rate-limit budget.
  • XML schema quirks handled correctly — "Indefinite" offering amounts, overFiveYears year-of-inc, yetToOccur first-sale flag, raw "0" minimum investment → null.
  • EDGAR Fair Access Policy compliant — mandatory User-Agent header set on every request; 0.1 s inter-fetch sleep keeps the run under the 10 req/s/IP limit.
  • Exponential backoff with Retry-After for 429 / 503 responses; max 5 attempts.
  • Pydantic v2 input validation: ISO date shape checked before any network call, start_date <= end_date enforced, state filter normalised to uppercase.
  • curl-cffi with Chrome 131 TLS impersonation (ADR-0002 house default) — robust against any future EDGAR fingerprint tightening.
  • Apify Proxy support via the BUYPROXIES94952 group (opt-in via useProxy). Off by default because EDGAR does not block datacenter IPs when the User-Agent is set.
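
The retry behaviour listed above (exponential backoff, Retry-After honoured on 429 / 503, at most 5 attempts) could look roughly like the sketch below. The base delay, the 60 s cap, and both function names are assumptions made for illustration; the Actor's real values are not documented here.

```python
from typing import Optional

RETRYABLE_STATUSES = {429, 503}
MAX_ATTEMPTS = 5  # from the feature list above


def retry_delay(attempt: int, retry_after: Optional[str] = None,
                base: float = 1.0, cap: float = 60.0) -> float:
    """Seconds to wait before retry number `attempt` (1-based).

    A numeric Retry-After header wins; otherwise the delay doubles each attempt.
    `base` and `cap` are assumed values, not the Actor's documented settings.
    """
    if retry_after is not None:
        try:
            return min(cap, max(0.0, float(retry_after)))
        except ValueError:
            pass  # HTTP-date form of Retry-After: fall back to backoff
    return min(cap, base * 2 ** (attempt - 1))


def should_retry(status: int, attempt: int) -> bool:
    """Retry only on 429 / 503 and only while attempts remain."""
    return status in RETRYABLE_STATUSES and attempt < MAX_ATTEMPTS
```

Keeping the delay computation separate from the HTTP call makes the policy easy to unit-test without touching the network.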

💡 Use cases

  • B2B sales prospecting: pull every Form D filed in the last 7-30 days in your target state (e.g. stateFilter=CA, minOfferingAmountUsd=1000000) to build a fresh lead list for SaaS and professional-services teams selling to newly funded startups.
  • VC deal-flow tracking: set query to "biotechnology" and stateFilter to your geography, then filter the output on industry_group_type to surface early-stage biotech raises.
  • Competitor monitoring: pass a query term matching a competitor's product category to detect new entrants raising private capital.
  • Journalism + research: bulk-export Form D over a multi-month window to measure private-market capital flow by state, industry, or exemption code.
  • Crunchbase / PitchBook alternative for early-stage: Form D is the original-source filing every paid private-markets database derives from, available here at $5/1,000 rows vs $299-999/month.
  • Compliance + KYC: verify a counterparty's Reg D filing history by issuer name or CIK before signing a master service agreement.

⚙️ How to use it

  1. Open the Actor input form on the Apify Console.
  2. (Optional) Set query to a free-text search term (e.g. "artificial intelligence"). The term is passed through to EDGAR Full-Text Search.
  3. (Optional) Set startDate and endDate (ISO YYYY-MM-DD). Defaults: last 30 days.
  4. (Optional) Set stateFilter to a two-letter U.S. state code (e.g. CA, NY, TX) to keep only matching issuers.
  5. (Optional) Set minOfferingAmountUsd to filter out small raises (indefinite-amount filings always pass).
  6. Set maxResults (1-5000). Default 100.
  7. Leave includeAmendments=false to skip Form D/A entries; set true to include them with is_new_notice=false.
  8. Toggle useProxy=true if you see 429 / 403 responses (rare: EDGAR does not currently block datacenter IPs when the User-Agent is set).
  9. Click Start. Results stream into the default dataset as JSON / CSV / Excel / XML.

California raises ≥ $1M in the last 30 days

{
  "stateFilter": "CA",
  "minOfferingAmountUsd": 1000000,
  "maxResults": 500
}

Last week's biotech filings nationally

{
  "query": "biotechnology",
  "startDate": "2026-05-09",
  "endDate": "2026-05-16",
  "maxResults": 200
}

Every Form D + amendments filed today

{
  "startDate": "2026-05-16",
  "endDate": "2026-05-16",
  "includeAmendments": true,
  "maxResults": 5000
}

📥 Input

| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| query | string | no | (none) | Free-text EDGAR search (1-200 chars) |
| startDate | string | no | 30 days ago | Inclusive lower bound, ISO YYYY-MM-DD |
| endDate | string | no | today | Inclusive upper bound, ISO YYYY-MM-DD |
| stateFilter | string | no | (none) | Two-letter issuer state code (uppercased) |
| minOfferingAmountUsd | integer | no | (none) | Discard filings below this amount (USD) |
| maxResults | integer | no | 100 | Cap on emitted rows (1-5000) |
| includeAmendments | boolean | no | false | Include Form D/A amendments |
| useProxy | boolean | no | false | Route via Apify Proxy (BUYPROXIES94952) |

startDate and endDate are validated for ISO shape and ordering before any network call — invalid input raises immediately and the run exits non-zero.
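
The checks described here can be approximated with the standard library alone. The Actor itself uses Pydantic v2; the sketch below is a stdlib stand-in meant only to show the order of checks, and the ActorInput class and parse_input function are hypothetical names.

```python
import re
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

ISO_DATE = re.compile(r"\d{4}-\d{2}-\d{2}")


@dataclass
class ActorInput:
    start_date: date
    end_date: date
    state_filter: Optional[str] = None
    max_results: int = 100


def _parse_iso(value: str) -> date:
    # Check the YYYY-MM-DD shape first, then let the stdlib reject impossible dates.
    if not ISO_DATE.fullmatch(value):
        raise ValueError(f"expected YYYY-MM-DD, got {value!r}")
    return date.fromisoformat(value)


def parse_input(raw: dict, today: Optional[date] = None) -> ActorInput:
    """Validate raw input before any network call (stdlib stand-in for Pydantic)."""
    today = today or date.today()
    start = _parse_iso(raw["startDate"]) if "startDate" in raw else today - timedelta(days=30)
    end = _parse_iso(raw["endDate"]) if "endDate" in raw else today
    if start > end:
        raise ValueError("startDate must be on or before endDate")
    max_results = int(raw.get("maxResults", 100))
    if not 1 <= max_results <= 5000:
        raise ValueError("maxResults must be between 1 and 5000")
    state = raw.get("stateFilter")
    return ActorInput(start, end, state.strip().upper() if state else None, max_results)
```

For example, `parse_input({"stateFilter": "ca"})` yields a 30-day window ending today with `state_filter == "CA"`, while a reversed date range raises ValueError before anything is fetched.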

📤 Output

One row per qualifying Form D filing, pushed to the default dataset and available as JSON, CSV, Excel, or XML.

{
  "accession_number": "0002122627-26-000001",
  "cik": "0002122627",
  "entity_name": "AM-0304 Fund II, a series of Delk-SPV, LP",
  "entity_type": "Limited Partnership",
  "jurisdiction_of_incorporation": "DELAWARE",
  "year_of_incorporation": 2026,
  "issuer_street_1": "2006 196TH ST SW",
  "issuer_street_2": "SUITE 114",
  "issuer_city": "LYNNWOOD",
  "issuer_state_or_country": "WA",
  "issuer_zip_code": "98036",
  "issuer_phone_number": "2068016359",
  "industry_group_type": "Pooled Investment Fund",
  "total_offering_amount_usd": 577807.0,
  "total_amount_sold_usd": 577807.0,
  "total_remaining_usd": 0.0,
  "is_indefinite_amount": false,
  "total_number_already_invested": 35,
  "minimum_investment_usd": 1000.0,
  "exemption_claimed": ["06B", "3C", "3C.1"],
  "is_new_notice": true,
  "date_of_first_sale": "2026-05-14",
  "related_persons": [
    {
      "first_name": "N/A",
      "last_name": "Fund GP, LLC",
      "city": "Wilmington",
      "state_or_country": "DE",
      "relationships": ["Director"]
    }
  ],
  "filing_date": "2026-05-15",
  "filing_url": "https://www.sec.gov/cgi-bin/browse-edgar?action=getcompany&CIK=0002122627&type=D",
  "scraped_at": "2026-05-16T12:00:00.000000+00:00"
}

Export formats

  • JSON: full fidelity, newline-delimited.
  • CSV: flat; nested related_persons serialised as JSON string per Apify convention.
  • Excel: .xlsx via the Apify dataset converter.
  • XML: structured per-item.

All formats are available via the Apify API: GET /datasets/{id}/items?format=csv&clean=true.
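
If you work from the CSV export, the related_persons column needs decoding back into a list. The sketch below assumes the column arrives as a JSON string, as described in the export-format list above; read_rows is a hypothetical helper, not part of the Actor.

```python
import csv
import io
import json


def read_rows(csv_text: str) -> list[dict]:
    """Parse an exported CSV and decode the related_persons JSON column."""
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Assumption: an empty cell means no related persons were filed.
        raw = row.get("related_persons") or "[]"
        row["related_persons"] = json.loads(raw)
        rows.append(row)
    return rows
```

All other columns stay as strings, so numeric fields such as total_offering_amount_usd still need their own conversion.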

💰 Pricing

Pay-Per-Event (PPE) — you pay only for what you use:

| Event | Price (USD) | When |
|---|---|---|
| actor-start | $0.05 | Once per run, at boot |
| result-row | $0.005 | Per Form D row written to dataset (after filters) |

Example costs

| Run | Rows | Cost |
|---|---|---|
| 7-day scan, ~50 filings | 50 | $0.30 |
| California ≥$1M, 30 days, ~200 filings | 200 | $1.05 |
| AI keyword, 90 days, ~500 filings | 500 | $2.55 |
| 1,000 filings | 1,000 | $5.05 |
| Maximum cap, 5,000 filings | 5,000 | $25.05 |

At scale, the per-row charge dominates: ~$5 per 1,000 Form D rows. Crunchbase, PitchBook, and DealRoom expose the same underlying SEC data inside subscriptions priced $299-999/month with annual contracts — this Actor delivers the raw filings on a pay-as-you-go basis with no minimums.
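
The example costs follow directly from the two event prices: one actor-start charge plus one result-row charge per kept row. A small sketch (run_cost_usd is a hypothetical helper) reproduces the figures using Decimal to avoid floating-point rounding:

```python
from decimal import Decimal

ACTOR_START_USD = Decimal("0.05")   # charged once per run
RESULT_ROW_USD = Decimal("0.005")   # charged per row written to the dataset


def run_cost_usd(rows: int) -> Decimal:
    """Estimated cost of one run that writes `rows` rows."""
    if rows < 0:
        raise ValueError("rows must be non-negative")
    return ACTOR_START_USD + RESULT_ROW_USD * rows
```

For instance, `run_cost_usd(200)` gives 1.05 USD, matching the California example.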

🚧 Limitations

  • Public EDGAR Full-Text Search only. No EDGAR Online, no authenticated EDGAR features, no premium SEC data products.
  • Form D and Form D/A only. Forms 3/4/5 (insider transactions), Form 10-K/Q (public-company financials), and Form S-1/424 (public offerings) are out of scope — use sec-edgar-filings-scraper for those.
  • No email address extraction. Form D does not contain email addresses; only the issuer phone and the named officers' city + state are filed.
  • No use-of-proceeds narrative. Free-text fields beyond the structured XML elements listed above are out of scope.
  • No sales-compensation extraction. Broker / finder fee data is present in the XML but excluded — out of scope for lead generation.
  • No cross-run deduplication. Overlapping date ranges may surface the same filing multiple times across separate runs.
  • EDGAR rate limit applies. The Actor enforces a 0.1 s inter-fetch sleep, which caps practical throughput at about 10 filings per second (about 600 per minute), so a full 5,000-row run takes roughly 9 minutes.
  • 7-day default storage retention on the Apify FREE plan. Export the dataset immediately after the run or move to a paid plan for longer retention.
  • U.S. government public domain. SEC EDGAR data has no copyright; attribution is optional but appreciated.
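
Because the Actor does not deduplicate across runs, overlapping date ranges are best merged on your side. A minimal sketch, assuming accession_number uniquely identifies a filing (an amendment carries its own accession number, so it is kept as a separate row); dedupe_by_accession is a hypothetical helper:

```python
from typing import Iterable, Iterator


def dedupe_by_accession(rows: Iterable[dict]) -> Iterator[dict]:
    """Yield each filing once, keeping the first occurrence of every accession number."""
    seen: set[str] = set()
    for row in rows:
        key = row["accession_number"]
        if key in seen:
            continue
        seen.add(key)
        yield row
```

Chain the datasets from several runs into one iterable and pass them through this generator before loading them into a CRM or warehouse.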

❓ FAQ

Do I need an API key?

No. The EDGAR Full-Text Search endpoint at https://efts.sec.gov/LATEST/search-index?forms=D and the https://www.sec.gov/Archives/edgar/data/... static archive are both fully public. The Actor sends a User-Agent header identifying Devil Scrapes per EDGAR Fair Access Policy §2.4 — this is required by SEC ToS, not an authentication step.

Why does the Archives URL use the CIK without leading zeros?

The EDGAR Archives URL path requires the integer form of the CIK (no padding), while the JSON / search APIs return the zero-padded 10-character form. The Actor handles both: every output row's cik is the canonical zero-padded form ("0002122627"), and the internal URL builder strips the leading zeros for the XML fetch. Using the zero-padded CIK in the Archives path returns HTTP 404.
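
The two CIK forms can be converted as sketched below. Both function names are hypothetical, and the directory layout (integer CIK, then the accession number with dashes removed) is an assumption about the public Archives path rather than something taken from the Actor's code.

```python
from typing import Union


def pad_cik(cik: Union[str, int]) -> str:
    """Canonical zero-padded 10-character CIK, as used in the output rows."""
    return str(int(cik)).zfill(10)


def archive_dir_url(cik: Union[str, int], accession_number: str) -> str:
    """Archives directory for one filing: unpadded CIK, accession number without dashes."""
    return (
        "https://www.sec.gov/Archives/edgar/data/"
        f"{int(cik)}/{accession_number.replace('-', '')}/"
    )
```

Going through `int()` in both directions means either form of the CIK is accepted as input.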

What's the difference between Form D and Form D/A?

Form D is the initial notice of a Reg D offering; Form D/A is an amendment (typically to update the total amount sold, add investors, or correct an error). When includeAmendments=false (default), only new D notices are returned. When true, amendments are included with is_new_notice=false.

Why is total_offering_amount_usd sometimes null?

The issuer reported the literal string "Indefinite" (common for evergreen funds and certain real-estate offerings). The is_indefinite_amount flag is True on those rows, and total_offering_amount_usd + total_remaining_usd are both null. Indefinite-amount rows always pass the minOfferingAmountUsd filter — we never throw them away on a numeric comparison they can't satisfy.
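
The filter rule can be sketched as follows. passes_filters is a hypothetical helper; discarding rows that have no amount and are not flagged indefinite is an assumption of this sketch, not documented Actor behaviour.

```python
from typing import Optional


def passes_filters(row: dict, state: Optional[str] = None,
                   min_usd: Optional[float] = None) -> bool:
    """Apply the state and minimum-raise post-filters to one output row."""
    if state and (row.get("issuer_state_or_country") or "").upper() != state.upper():
        return False
    # Indefinite-amount rows skip the numeric comparison and always pass it.
    if min_usd is not None and not row.get("is_indefinite_amount"):
        amount = row.get("total_offering_amount_usd")
        # Assumption: a missing amount without the indefinite flag is discarded.
        if amount is None or amount < min_usd:
            return False
    return True
```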

Why is year_of_incorporation sometimes null?

The issuer reported overFiveYears=true in the XML. The SEC does not require an exact year when the entity is more than 5 years old. Issuers younger than 5 years report withinFiveYears=true together with a specific <value>YYYY</value>, which the Actor parses to an integer.
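
A rough sketch of that rule using the standard-library XML parser is shown below. The wrapper element name yearOfInc and the exact nesting are assumptions about the Form D XML layout, and parse_year_of_incorporation is a hypothetical name.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def parse_year_of_incorporation(xml_fragment: str) -> Optional[int]:
    """Return the year as an int, or None when overFiveYears=true or no year is given."""
    node = ET.fromstring(xml_fragment)
    if (node.findtext("overFiveYears") or "").strip().lower() == "true":
        return None
    value = (node.findtext("value") or "").strip()
    return int(value) if value.isdigit() else None
```

A fragment such as `<yearOfInc><withinFiveYears>true</withinFiveYears><value>2026</value></yearOfInc>` yields 2026, while one containing only `<overFiveYears>true</overFiveYears>` yields None.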

Can I redistribute the data?

Yes. SEC EDGAR data is U.S. government public domain — there are no licensing restrictions on redistribution or commercial use. Attribution to the U.S. Securities and Exchange Commission is optional but recommended.

💬 Your feedback

Found a bug, hit a rate limit, or need a new field on the output row? Open an issue on the Actor's Apify Store page or contact the Devil Scrapes team at apify.com/DevilScrapes. We ship updates within days of validated reports.

Funding Intelligence Suite

Cross-link this Actor with the rest of the Devil Scrapes funding + research vertical:

  • SEC EDGAR Filings Scraper — general EDGAR filings across all form types (10-K, 10-Q, 8-K, 4, etc.) for established public-company coverage.
  • EU CORDIS Grants Scraper — Horizon Europe + Horizon 2020 grant projects (EU public R&D funding).
  • arXiv Papers Scraper — preprint metadata across all arXiv categories.
  • PubMed Papers Scraper — biomedical literature from the NIH PubMed index.
  • USPTO Patents Scraper — U.S. patent metadata for IP landscape work.

All suite Actors share consistent PPE pricing ($0.001-$0.005 per row, $0.01-$0.05 per run) and scraped_at ISO 8601 UTC timestamps so cross-source joins work cleanly.