SEC EDGAR Filings Scraper — Structured for AI & RAG

Pull SEC EDGAR filings (10-K, 10-Q, 8-K, more) as clean structured JSON, ready for AI/RAG pipelines and fintech. Financial facts, filing text, zero charge on empty runs.

Developer: The Mine Works (maintained by Community)

SEC EDGAR Filings Scraper — Structured for AI, RAG & Fintech

Pull SEC EDGAR filings (10-K, 10-Q, 8-K, S-1, Form 4, DEF 14A, and every other form type) as clean, structured JSON that drops straight into an AI/RAG pipeline, a vector database, or a fintech data model. The actor resolves any stock ticker to its SEC CIK automatically, filters by form type and date, and can optionally attach XBRL financial facts and RAG-ready filing text. You pay only for filings actually delivered; empty runs and unknown tickers are never charged.

Keywords: SEC EDGAR API, SEC filings scraper, 10-K scraper, 10-Q scraper, 8-K scraper, EDGAR full-text, XBRL financials, company filings API, financial data for RAG, fintech data pipeline.


Why this actor

The SEC publishes everything through EDGAR, but the raw endpoints are awkward: you have to know a company's zero-padded CIK, walk a parallel-array JSON structure, reconstruct document URLs by hand, and respect the SEC's fair-access User-Agent and rate-limit rules. Most teams burn a day writing glue code before they get a single clean filing.

This actor does all of that for you and returns one tidy record per filing:

  • Ticker → CIK resolution built in. Pass AAPL, get Apple's filings. No CIK lookups.
  • Form and date filtering server-side, so you only get the 10-Ks (or 8-Ks, or insider Form 4s) you asked for, in the window you asked for.
  • Reconstructed, click-ready URLs for both the primary document and the filing index.
  • Optional XBRL financial facts — revenue, net income, total assets, liabilities, equity, cash — pulled from the SEC's structured company-facts API.
  • Optional RAG-ready text — the primary document fetched and cleaned to plain text, ready for chunking and embedding.
  • Fair-access compliant — descriptive User-Agent with your contact email, polite request pacing, automatic backoff on 403/429.

It targets the SEC's official open data endpoints (data.sec.gov and www.sec.gov/Archives). No API key. No scraping of rendered HTML search pages. No anti-bot fragility.
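
As a rough illustration of the glue code described above, the sketch below zero-pads a CIK, resolves a ticker from a mapping shaped like the SEC's company_tickers.json, zips the parallel arrays found under filings.recent into one record per filing, and rebuilds the document and index URLs. The function names are hypothetical, the sample data is made up to match the example record in this README, and none of this is the actor's actual source.

```python
ARCHIVES = "https://www.sec.gov/Archives/edgar/data"


def pad_cik(cik):
    """Zero-pad a CIK to the 10 digits used in data.sec.gov URLs."""
    return str(int(cik)).zfill(10)


def resolve_ticker(ticker, ticker_map):
    """Look up a ticker in a dict shaped like company_tickers.json."""
    wanted = ticker.upper()
    for entry in ticker_map.values():
        if entry["ticker"].upper() == wanted:
            return pad_cik(entry["cik_str"])
    return None  # unknown ticker


def filings_from_recent(cik, recent):
    """Zip the parallel arrays under filings.recent into one dict per filing."""
    cik_int = int(cik)  # archive paths use the CIK without leading zeros
    rows = zip(recent["accessionNumber"], recent["form"],
               recent["filingDate"], recent["primaryDocument"])
    records = []
    for accession, form, filed, doc in rows:
        # The archive folder name is the accession number without dashes.
        folder = f"{ARCHIVES}/{cik_int}/{accession.replace('-', '')}"
        records.append({
            "cik": pad_cik(cik),
            "form": form,
            "filing_date": filed,
            "accession_number": accession,
            "filing_url": f"{folder}/{doc}",
            "index_url": f"{folder}/{accession}-index.htm",
        })
    return records


# Made-up sample data in the shape of the SEC responses.
ticker_map = {"0": {"cik_str": 320193, "ticker": "AAPL", "title": "Apple Inc."}}
recent = {
    "accessionNumber": ["0000320193-25-000123"],
    "form": ["10-K"],
    "filingDate": ["2025-11-01"],
    "primaryDocument": ["aapl-20250927.htm"],
}

cik = resolve_ticker("aapl", ticker_map)
filings = filings_from_recent(cik, recent)
print(filings[0]["filing_url"])
```

The point of the sketch is only to show why the raw endpoints need glue code: three different CIK and accession-number spellings appear in a single URL.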


What you can build with it

  • AI / RAG knowledge bases: ingest a company's full 10-K and 10-Q history as clean text, chunk it, embed it, and let an LLM answer questions grounded in real filings.
  • Fintech dashboards: track new 8-Ks (material events), insider Form 4 transactions, or quarterly financials across a watchlist.
  • Quant & research pipelines: pull standardized XBRL financial facts across hundreds of tickers for screening and modelling.
  • Compliance & monitoring: schedule a daily run over a portfolio and capture every filing that has appeared on EDGAR since the previous run.
  • Due diligence: assemble a complete filing history for a target company in seconds, with direct links to every source document.

Input

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| tickers | string[] | ["AAPL"] | Stock tickers, resolved to CIK automatically. |
| ciks | string[] | | Direct SEC CIK numbers, if you already have them. |
| formTypes | string[] | ["10-K","10-Q","8-K"] | Form types to return. Empty = all forms. Prefix-matched, case-insensitive. |
| maxFilingsPerCompany | integer | 10 | Max filings per company, newest first. |
| dateFrom | string | | Only filings filed on/after this YYYY-MM-DD. |
| dateTo | string | | Only filings filed on/before this YYYY-MM-DD. |
| includeFinancials | boolean | false | Attach a summary of key XBRL financial facts per company. |
| includeDocumentText | boolean | false | Attach cleaned plain text of the primary document (RAG-ready). |
| contactEmail | string | | Your email, used in the SEC fair-access User-Agent. |
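
To make the filter semantics above concrete, here is a minimal sketch of how formTypes, dateFrom, dateTo, and maxFilingsPerCompany could combine. The helper names are hypothetical and the filings are invented. One consequence of prefix matching as described is that "10-K" would also select amendments such as "10-K/A"; whether the actor behaves exactly this way is an assumption.

```python
from datetime import date


def matches_form(form, form_types):
    """Case-insensitive prefix match; an empty list means all forms."""
    if not form_types:
        return True
    form = form.upper()
    return any(form.startswith(t.upper()) for t in form_types)


def in_window(filing_date, date_from=None, date_to=None):
    """Inclusive YYYY-MM-DD window; either bound may be omitted."""
    filed = date.fromisoformat(filing_date)
    if date_from and filed < date.fromisoformat(date_from):
        return False
    if date_to and filed > date.fromisoformat(date_to):
        return False
    return True


def select_filings(filings, form_types, max_per_company,
                   date_from=None, date_to=None):
    """Filter by form and date, sort newest first, then cap the count."""
    kept = [f for f in filings
            if matches_form(f["form"], form_types)
            and in_window(f["filing_date"], date_from, date_to)]
    kept.sort(key=lambda f: f["filing_date"], reverse=True)
    return kept[:max_per_company]


# Invented filings for one company.
sample = [
    {"form": "10-K", "filing_date": "2024-11-01"},
    {"form": "10-K/A", "filing_date": "2025-01-15"},
    {"form": "8-K", "filing_date": "2025-02-01"},
    {"form": "10-Q", "filing_date": "2025-05-02"},
]

annual = select_filings(sample, ["10-k"], 10)
print([f["form"] for f in annual])
```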

Example input

{
  "tickers": ["AAPL", "MSFT", "NVDA"],
  "formTypes": ["10-K", "10-Q"],
  "maxFilingsPerCompany": 4,
  "includeFinancials": true,
  "contactEmail": "you@yourcompany.com"
}

Output

Each filing is one dataset record:

{
  "ticker": "AAPL",
  "cik": "0000320193",
  "company_name": "Apple Inc.",
  "sic": "Electronic Computers",
  "form": "10-K",
  "filing_date": "2025-11-01",
  "report_date": "2025-09-27",
  "accession_number": "0000320193-25-000123",
  "primary_document": "aapl-20250927.htm",
  "primary_doc_description": "10-K",
  "items": null,
  "is_xbrl": true,
  "size_bytes": 12849302,
  "filing_url": "https://www.sec.gov/Archives/edgar/data/320193/000032019325000123/aapl-20250927.htm",
  "index_url": "https://www.sec.gov/Archives/edgar/data/320193/000032019325000123/0000320193-25-000123-index.htm",
  "financial_facts": {
    "revenue": { "value": 391035000000, "unit": "USD", "period_end": "2025-09-27", "form": "10-K", "fy": 2025 },
    "net_income": { "value": 93736000000, "unit": "USD", "period_end": "2025-09-27", "form": "10-K", "fy": 2025 }
  },
  "scraped_at": "2026-06-10T14:00:00.000Z"
}

financial_facts appears only when includeFinancials is on; document_text only when includeDocumentText is on. A final {"_type": "summary"} record reports how many companies and filings were processed.
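
When consuming the dataset, the trailing summary record has to be separated from the filing records before anything is embedded. The sketch below shows one way to do that and to cut document_text into overlapping chunks. The chunk sizes, the chunk id format, and the helper names are assumptions made for illustration; only the `_type`, `document_text`, and filing field names come from the output described above.

```python
def split_records(items):
    """Separate filing records from the final {"_type": "summary"} record."""
    filings = [r for r in items if r.get("_type") != "summary"]
    summaries = [r for r in items if r.get("_type") == "summary"]
    return filings, summaries


def chunk_text(text, size=1000, overlap=200):
    """Cut text into windows of `size` characters that overlap by `overlap`."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be >= 0 and smaller than size")
    chunks, start, step = [], 0, size - overlap
    while start < len(text):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # the last window already reached the end of the text
        start += step
    return chunks


def to_chunks(filings, size=1000, overlap=200):
    """Build embedding-ready chunk dicts that keep the filing metadata."""
    out = []
    for f in filings:
        text = f.get("document_text")
        if not text:
            continue  # includeDocumentText was off for this record
        for n, piece in enumerate(chunk_text(text, size, overlap)):
            out.append({
                "id": f"{f['accession_number']}#{n}",  # hypothetical id scheme
                "text": piece,
                "ticker": f.get("ticker"),
                "form": f.get("form"),
                "filing_date": f.get("filing_date"),
            })
    return out


# Invented dataset items: two filings plus the summary record.
items = [
    {"ticker": "AAPL", "form": "10-K", "filing_date": "2025-11-01",
     "accession_number": "0000320193-25-000123", "document_text": "abcdefghij"},
    {"ticker": "AAPL", "form": "8-K", "filing_date": "2025-10-30",
     "accession_number": "0000320193-25-000120"},
    {"_type": "summary"},
]

filings, summaries = split_records(items)
chunks = to_chunks(filings, size=4, overlap=1)
print(len(filings), len(summaries), [c["text"] for c in chunks])
```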


Pricing

Every Apify account gets its first 25 filings free, with no card and no trial clock. After that, the price is a flat $0.004 per filing delivered.

  • First 25 filings free per account (lifetime), then $0.004/filing
  • Zero charge on empty runs — unknown tickers, no matching filings, or fetch failures cost you nothing
  • No monthly minimum, no rental, no per-second compute surprises
  • A run that delivers 100 filings costs $0.40 (100 × $0.004) once the free allowance is used up

You pay for outcomes — filings in your dataset — not for time on the platform.
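
The arithmetic behind these numbers can be sketched as follows. The function name is hypothetical, and the model assumes the free allowance is consumed before any filing is billed; it does not attempt to reproduce how the platform meters usage across runs.

```python
from decimal import Decimal

PRICE_PER_FILING = Decimal("0.004")  # flat price per delivered filing, in USD
FREE_FILINGS = 25                    # lifetime free allowance per account


def run_cost(filings_delivered, free_remaining=FREE_FILINGS):
    """Cost in USD of a run, given how much free allowance is still left."""
    billable = max(0, filings_delivered - free_remaining)
    return billable * PRICE_PER_FILING


first_run = run_cost(100)                    # 25 of the 100 filings are free
later_run = run_cost(100, free_remaining=0)  # allowance already used up
empty_run = run_cost(0)                      # nothing delivered, nothing charged
print(first_run, later_run, empty_run)
```

Decimal is used instead of float so that fractions of a cent add up exactly.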


Notes on SEC fair access

The SEC's EDGAR system is free and open, but it asks every automated client to:

  1. Send a descriptive User-Agent that includes a contact email (set contactEmail).
  2. Stay under roughly 10 requests per second (this actor paces itself well below that).

This actor honours both. It does not attempt to bypass any access control, because EDGAR has none to bypass — the data is public by law. Please use it responsibly and within the SEC's fair-access policy.
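
For readers who call EDGAR directly, the two rules above plus a retry policy might look like the sketch below. Everything here is illustrative: the application name, the 5 requests per second pace, and the backoff constants are assumptions, not values taken from this actor.

```python
import time


def sec_headers(contact_email, app_name="example-filings-client"):
    """Build a descriptive User-Agent that carries a contact email."""
    return {"User-Agent": f"{app_name} {contact_email}"}


class Pacer:
    """Enforce a minimum gap between requests to stay under a rate limit."""

    def __init__(self, max_per_second=5.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / max_per_second
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous request, if any

    def wait(self):
        """Block until at least min_interval has passed since the last call."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


def backoff_delay(attempt, base=1.0, cap=60.0):
    """Exponential backoff in seconds for a retry after a 403 or 429."""
    return min(cap, base * 2 ** attempt)


headers = sec_headers("you@yourcompany.com")
pacer = Pacer(max_per_second=5.0)  # call pacer.wait() before each request
print(headers["User-Agent"], backoff_delay(3))
```

The clock and sleep functions are injectable so the pacing logic can be checked without actually waiting.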


FAQ

Do I need an SEC API key? No. EDGAR open data requires no key — only a contact email in the User-Agent.

Which companies are covered? Every company with an EDGAR CIK — over 10,000 tickers and many more CIK-only filers (funds, trusts, individuals filing Form 4s).

Can I get the actual financial numbers, not just the filing link? Yes — turn on includeFinancials for structured XBRL facts, or includeDocumentText for the full cleaned filing text.
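
For context on where such numbers come from, the SEC's company-facts JSON nests values under facts, then taxonomy, concept, units, and unit, with one entry per reported period. A minimal sketch of picking the latest annual value from that shape follows. The function name is hypothetical, the sample uses placeholder data, and concept names vary by filer, so this is not necessarily how the actor selects its facts.

```python
def latest_annual_fact(companyfacts, concept, unit="USD", taxonomy="us-gaap"):
    """Return the most recent full-year 10-K value for one XBRL concept."""
    try:
        entries = companyfacts["facts"][taxonomy][concept]["units"][unit]
    except KeyError:
        return None  # the company does not report this concept or unit
    annual = [e for e in entries
              if e.get("form") == "10-K" and e.get("fp") == "FY"]
    if not annual:
        return None
    # Latest period end wins; the filing date breaks ties between restatements.
    best = max(annual, key=lambda e: (e["end"], e.get("filed", "")))
    return {
        "value": best["val"],
        "unit": unit,
        "period_end": best["end"],
        "form": best["form"],
        "fy": best.get("fy"),
    }


# Placeholder data in the company-facts shape (values are illustrative only).
companyfacts = {
    "facts": {"us-gaap": {"Revenues": {"units": {"USD": [
        {"end": "2024-09-28", "val": 1000, "fy": 2024, "fp": "FY",
         "form": "10-K", "filed": "2024-11-01"},
        {"end": "2025-06-28", "val": 250, "fy": 2025, "fp": "Q3",
         "form": "10-Q", "filed": "2025-08-01"},
        {"end": "2025-09-27", "val": 391035000000, "fy": 2025, "fp": "FY",
         "form": "10-K", "filed": "2025-11-01"},
    ]}}}}
}

revenue = latest_annual_fact(companyfacts, "Revenues")
print(revenue)
```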

Is the filing text good enough for RAG? Yes — includeDocumentText strips scripts, styles, and markup and returns clean plain text ready to chunk and embed.
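
The kind of cleaning described here can be approximated with Python's standard library alone. The sketch below drops script and style content, treats a small set of block-level tags as whitespace, and collapses runs of whitespace. It is a simplified stand-in written for this README, not the actor's own cleaner, and the tag list is an assumption.

```python
import re
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    SKIP = {"script", "style"}
    BLOCK = {"p", "div", "br", "li", "tr", "td", "th", "table",
             "h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self._skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1
        elif tag in self.BLOCK:
            self.parts.append("\n")  # block boundaries become whitespace

    def handle_endtag(self, tag):
        if tag in self.SKIP:
            if self._skip_depth:
                self._skip_depth -= 1
        elif tag in self.BLOCK:
            self.parts.append("\n")

    def handle_data(self, data):
        if not self._skip_depth:
            self.parts.append(data)


def html_to_text(html):
    """Strip scripts, styles, and markup; return whitespace-collapsed text."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return re.sub(r"\s+", " ", "".join(parser.parts)).strip()


page = ("<html><head><style>p{color:red}</style><script>var x=1;</script></head>"
        "<body><h1>Item 1.</h1><p><b>Business</b> &amp; overview</p></body></html>")
text = html_to_text(page)
print(text)
```

Collapsing all whitespace loses paragraph boundaries, so a pipeline that chunks by paragraph would want to keep the newlines instead.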

How fresh is the data? Near real-time. The actor reads EDGAR's live submissions feed, so a filing can be retrieved as soon as the SEC has published it to that feed.