# SEC EDGAR Filings Scraper — Structured for AI & RAG (`themineworks/sec-edgar-filings`) Actor

Pull SEC EDGAR filings (10-K, 10-Q, 8-K, more) as clean structured JSON, ready for AI/RAG pipelines and fintech. Financial facts, filing text, zero charge on empty runs.

- **URL**: https://apify.com/themineworks/sec-edgar-filings.md
- **Developed by:** [The Mine Works](https://apify.com/themineworks) (community)
- **Categories:** Business, Developer tools
- **Stats:** 2 total users, 1 monthly user, 100.0% runs succeeded
- **User rating**: No ratings yet

## Pricing

Pay per usage

This Actor is paid per platform usage. The Actor itself is free to use; you pay only for the Apify platform usage it consumes, which gets cheaper on higher subscription plans.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-usage

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## SEC EDGAR Filings Scraper — Structured for AI, RAG & Fintech

Pull **SEC EDGAR filings** — 10-K, 10-Q, 8-K, S-1, Form 4, DEF 14A and every other form type — as clean, structured JSON that drops straight into an **AI/RAG pipeline**, a vector database, or a fintech data model. Resolve any **stock ticker to its SEC CIK** automatically, filter by form type and date, optionally attach **XBRL financial facts** and **RAG-ready filing text**. You only pay for filings actually delivered — empty runs and unknown tickers are never charged.

**Keywords:** SEC EDGAR API, SEC filings scraper, 10-K scraper, 10-Q scraper, 8-K scraper, EDGAR full-text, XBRL financials, company filings API, financial data for RAG, fintech data pipeline.

---

### Why this Actor

The SEC publishes everything through EDGAR, but the raw endpoints are awkward: you have to know a company's zero-padded CIK, walk a parallel-array JSON structure, reconstruct document URLs by hand, and respect the SEC's fair-access User-Agent and rate-limit rules. Most teams burn a day writing glue code before they get a single clean filing.
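
To make the "parallel-array" point concrete, here is a minimal sketch of turning a columnar filings listing into one record per filing. It assumes an object shaped like the `filings.recent` part of the SEC submissions JSON, where each key holds a list and position `i` in every list describes the same filing. The helper name `rows_from_parallel_arrays` and the sample values are made up for illustration and are not taken from this Actor's code.

```python
from typing import Any


def rows_from_parallel_arrays(recent: dict[str, list[Any]]) -> list[dict[str, Any]]:
    """Turn {"form": [...], "filingDate": [...]} into one dict per filing."""
    keys = list(recent)
    lengths = {len(recent[key]) for key in keys}
    if len(lengths) > 1:
        raise ValueError("parallel arrays have different lengths")
    # zip(*columns) walks all columns in lockstep, one filing at a time.
    columns = (recent[key] for key in keys)
    return [dict(zip(keys, values)) for values in zip(*columns)]


# Hypothetical sample data; the second filing is invented for the example.
sample_recent = {
    "accessionNumber": ["0000320193-25-000123", "0000320193-25-000098"],
    "form": ["10-K", "10-Q"],
    "filingDate": ["2025-11-01", "2025-08-01"],
    "primaryDocument": ["aapl-20250927.htm", "aapl-20250628.htm"],
}

rows = rows_from_parallel_arrays(sample_recent)
print(rows[0]["form"], rows[0]["filingDate"])  # 10-K 2025-11-01
```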

This Actor does all of that for you and returns **one tidy record per filing**:

- **Ticker → CIK resolution** built in. Pass `AAPL`, get Apple's filings. No CIK lookups.
- **Form and date filtering** server-side, so you only get the 10-Ks (or 8-Ks, or insider Form 4s) you asked for, in the window you asked for.
- **Reconstructed, click-ready URLs** for both the primary document and the filing index.
- **Optional XBRL financial facts** — revenue, net income, total assets, liabilities, equity, cash — pulled from the SEC's structured company-facts API.
- **Optional RAG-ready text** — the primary document fetched and cleaned to plain text, ready for chunking and embedding.
- **Fair-access compliant** — descriptive User-Agent with your contact email, polite request pacing, automatic backoff on 403/429.

It targets the SEC's **official open data endpoints** (`data.sec.gov` and `www.sec.gov/Archives`). No API key. No scraping of rendered HTML search pages. No anti-bot fragility.
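
For readers curious what "reconstructed URLs" means in practice, the sketch below rebuilds the two links from a CIK, an accession number, and a primary document name. The URL pattern is inferred from the example output record in this README; the function name `build_filing_urls` is hypothetical and this is not necessarily how the Actor implements it.

```python
ARCHIVES_BASE = "https://www.sec.gov/Archives/edgar/data"


def build_filing_urls(cik: str, accession_number: str, primary_document: str) -> dict[str, str]:
    """Rebuild document and index URLs for one filing."""
    cik_path = str(int(cik))  # the path uses the CIK without zero padding
    accession_path = accession_number.replace("-", "")  # folder name has no dashes
    folder = f"{ARCHIVES_BASE}/{cik_path}/{accession_path}"
    return {
        "filing_url": f"{folder}/{primary_document}",
        "index_url": f"{folder}/{accession_number}-index.htm",
    }


urls = build_filing_urls("0000320193", "0000320193-25-000123", "aapl-20250927.htm")
print(urls["filing_url"])
print(urls["index_url"])
```

With the values from the example record, this yields the same `filing_url` and `index_url` shown in the Output section.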

---

### What you can build with it

- **AI / RAG knowledge bases** — ingest a company's full 10-K and 10-Q history as clean text, chunk it, embed it, and let an LLM answer questions grounded in real filings.
- **Fintech dashboards** — track new 8-Ks (material events), insider Form 4 transactions, or quarterly financials across a watchlist.
- **Quant & research pipelines** — pull standardized XBRL financial facts across hundreds of tickers for screening and modelling.
- **Compliance & monitoring** — schedule a daily run over a portfolio and capture every new filing the moment it hits EDGAR.
- **Due diligence** — assemble a complete filing history for a target company in seconds, with direct links to every source document.
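
The "chunk it, embed it" step in the first use case is left to your pipeline. As a rough starting point, here is a word-window chunker with overlap. The function name `chunk_words` and the default sizes are assumptions chosen for illustration; real pipelines often chunk by tokens or by filing section instead.

```python
def chunk_words(text: str, chunk_size: int = 200, overlap: int = 40) -> list[str]:
    """Split text into overlapping chunks of at most chunk_size words."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


demo_text = " ".join(f"w{i}" for i in range(10))
print(chunk_words(demo_text, chunk_size=4, overlap=1))
# ['w0 w1 w2 w3', 'w3 w4 w5 w6', 'w6 w7 w8 w9']
```

Each chunk would then go to whatever embedding model and vector store you use; that part is outside the scope of this Actor.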

---

### Input

| Field | Type | Default | Description |
|---|---|---|---|
| `tickers` | string[] | `["AAPL"]` | Stock tickers, resolved to CIK automatically. |
| `ciks` | string[] | — | Direct SEC CIK numbers, if you already have them. |
| `formTypes` | string[] | `["10-K","10-Q","8-K"]` | Form types to return. Empty = all forms. Prefix-matched, case-insensitive. |
| `maxFilingsPerCompany` | integer | 10 | Max filings per company, newest first. |
| `dateFrom` | string | — | Only filings filed on/after this `YYYY-MM-DD`. |
| `dateTo` | string | — | Only filings filed on/before this `YYYY-MM-DD`. |
| `includeFinancials` | boolean | false | Attach a summary of key XBRL financial facts per company. |
| `includeDocumentText` | boolean | false | Attach cleaned plain text of the primary document (RAG-ready). |
| `contactEmail` | string | `contact@example.com` | Your email, used in the SEC fair-access User-Agent. Supplying your own address is recommended. |

#### Example input

```json
{
  "tickers": ["AAPL", "MSFT", "NVDA"],
  "formTypes": ["10-K", "10-Q"],
  "maxFilingsPerCompany": 4,
  "includeFinancials": true,
  "contactEmail": "you@yourcompany.com"
}
````

***

### Output

Each filing is one dataset record:

```json
{
  "ticker": "AAPL",
  "cik": "0000320193",
  "company_name": "Apple Inc.",
  "sic": "Electronic Computers",
  "form": "10-K",
  "filing_date": "2025-11-01",
  "report_date": "2025-09-27",
  "accession_number": "0000320193-25-000123",
  "primary_document": "aapl-20250927.htm",
  "primary_doc_description": "10-K",
  "items": null,
  "is_xbrl": true,
  "size_bytes": 12849302,
  "filing_url": "https://www.sec.gov/Archives/edgar/data/320193/000032019325000123/aapl-20250927.htm",
  "index_url": "https://www.sec.gov/Archives/edgar/data/320193/000032019325000123/0000320193-25-000123-index.htm",
  "financial_facts": {
    "revenue": { "value": 391035000000, "unit": "USD", "period_end": "2025-09-27", "form": "10-K", "fy": 2025 },
    "net_income": { "value": 93736000000, "unit": "USD", "period_end": "2025-09-27", "form": "10-K", "fy": 2025 }
  },
  "scraped_at": "2026-06-10T14:00:00.000Z"
}
```

`financial_facts` appears only when `includeFinancials` is on; `document_text` only when `includeDocumentText` is on. A final `{"_type": "summary"}` record reports how many companies and filings were processed.
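
Because the dataset mixes filing records with one trailing summary record, and some fields are optional, consumer code may want to separate them defensively. A minimal sketch follows; the helper name `split_dataset_items` and the sample items are hypothetical, and only the `_type` key of the summary record is assumed since its other fields are not documented here.

```python
def split_dataset_items(items):
    """Separate filing records from the trailing summary record, if any."""
    filings, summary = [], None
    for item in items:
        if item.get("_type") == "summary":
            summary = item
        else:
            filings.append(item)
    return filings, summary


# Made-up items that mimic the documented record shape.
sample_items = [
    {"ticker": "AAPL", "form": "10-K", "filing_date": "2025-11-01"},
    {"ticker": "AAPL", "form": "10-Q", "filing_date": "2025-08-01",
     "financial_facts": {"revenue": {"value": 1, "unit": "USD"}}},
    {"_type": "summary"},
]

filings, summary = split_dataset_items(sample_items)
for filing in filings:
    facts = filing.get("financial_facts") or {}  # absent unless includeFinancials is on
    text = filing.get("document_text")           # absent unless includeDocumentText is on
    print(filing["form"], sorted(facts), text is not None)
```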

***

### Pricing

**Your first 25 filings are free — every Apify account, no card, no trial clock.** After that it is a flat **$0.004 per filing delivered**.

- First 25 filings free per account (lifetime), then $0.004/filing
- **Zero charge on empty runs** — unknown tickers, no matching filings, or fetch failures cost you nothing
- No monthly minimum, no rental, no per-second compute surprises
- A run pulling 100 filings costs $0.40 once the 25 free filings are used up (100 × $0.004)

You pay for outcomes — filings in your dataset — not for time on the platform.
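
The arithmetic behind these numbers can be sketched as follows. The price and free allowance come from the list above; the function name `estimate_cost` and the `free_filings_left` parameter are illustrative assumptions, and actual billing is determined by the platform, not by this snippet.

```python
from decimal import Decimal

PRICE_PER_FILING = Decimal("0.004")  # USD per delivered filing
FREE_FILINGS = 25                    # lifetime free allowance per account


def estimate_cost(filings_delivered: int, free_filings_left: int = FREE_FILINGS) -> Decimal:
    """Estimate the charge in USD for one run."""
    billable = max(0, filings_delivered - max(0, free_filings_left))
    return billable * PRICE_PER_FILING


print(estimate_cost(100, free_filings_left=0))  # 0.400
print(estimate_cost(100))                       # 0.300
print(estimate_cost(10))                        # 0.000
```

`Decimal` is used instead of floats so that small per-filing amounts add up without rounding noise.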

***

### Notes on SEC fair access

The SEC's EDGAR system is free and open, but it asks every automated client to:

1. Send a descriptive **User-Agent that includes a contact email** (set `contactEmail`).
2. Stay under roughly **10 requests per second** (this Actor paces itself well below that).

This Actor honours both. It does not attempt to bypass any access control, because EDGAR has none to bypass: the data is public by law. Please use it responsibly and within the SEC's [fair-access policy](https://www.sec.gov/os/webmaster-faq#developers).
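
If you ever call EDGAR directly alongside this Actor, the two rules above can be approximated with a few lines of standard-library Python. This is a sketch under assumptions: the names `build_user_agent`, `Pacer`, and `backoff_delays` are hypothetical, the "name followed by email" User-Agent layout is one common reading of the SEC guidance, and none of this is taken from the Actor's source.

```python
import time


def build_user_agent(app_name: str, contact_email: str) -> str:
    """Descriptive User-Agent that carries a contact email."""
    return f"{app_name} {contact_email}"


class Pacer:
    """Blocks so that consecutive wait() calls are at least 1/max_per_second apart."""

    def __init__(self, max_per_second: float) -> None:
        self.min_interval = 1.0 / max_per_second
        self._last = None

    def wait(self) -> None:
        now = time.monotonic()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                time.sleep(remaining)
        self._last = time.monotonic()


def backoff_delays(retries: int, base_seconds: float = 1.0) -> list[float]:
    """Exponential delays to sleep between retries after a 403/429 response."""
    return [base_seconds * 2 ** attempt for attempt in range(retries)]


print(build_user_agent("my-pipeline", "you@yourcompany.com"))  # my-pipeline you@yourcompany.com
print(backoff_delays(3))  # [1.0, 2.0, 4.0]
```

Calling `Pacer(5).wait()` before each request keeps a client at about 5 requests per second, comfortably under the limit.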

***

### FAQ

**Do I need an SEC API key?** No. EDGAR open data requires no key — only a contact email in the User-Agent.

**Which companies are covered?** Every company with an EDGAR CIK — over 10,000 tickers and many more CIK-only filers (funds, trusts, individuals filing Form 4s).

**Can I get the actual financial numbers, not just the filing link?** Yes — turn on `includeFinancials` for structured XBRL facts, or `includeDocumentText` for the full cleaned filing text.

**Is the filing text good enough for RAG?** Yes — `includeDocumentText` strips scripts, styles, and markup and returns clean plain text ready to chunk and embed.

**How fresh is the data?** Real-time. The Actor reads EDGAR's live submissions feed, so a filing is available the moment the SEC publishes it.

# Actor input Schema

## `tickers` (type: `array`):

Stock ticker symbols to pull filings for (e.g. AAPL, MSFT, NVDA). Resolved to SEC CIK automatically. Use this OR direct CIKs below.

## `ciks` (type: `array`):

Direct SEC Central Index Key numbers, if you already have them (e.g. 320193 or 0000320193). Use instead of, or in addition to, tickers.
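
Both spellings in the example refer to the same company; the difference is only zero padding to 10 digits. A small sketch of normalising either form follows. The helper name `normalize_cik` is hypothetical, and how the Actor validates CIKs internally is not documented here.

```python
from typing import Union


def normalize_cik(raw: Union[str, int]) -> str:
    """Return the CIK as a 10-digit, zero-padded string."""
    digits = str(raw).strip()
    if not (digits.isascii() and digits.isdigit()) or len(digits) > 10:
        raise ValueError(f"not a valid CIK: {raw!r}")
    return digits.zfill(10)


print(normalize_cik("320193"))      # 0000320193
print(normalize_cik("0000320193"))  # 0000320193
```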

## `formTypes` (type: `array`):

Which SEC form types to return (e.g. 10-K, 10-Q, 8-K, S-1, 4, DEF 14A). Leave empty to return all form types. Matching is case-insensitive and prefix-based.
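
Prefix matching has a consequence worth seeing in code: `10-K` also picks up amendments such as `10-K/A`, and a short value like `4` can match unrelated forms that merely start with that digit. The sketch below assumes a plain "starts with, ignoring case" rule; the function name `form_matches` is hypothetical and the Actor's exact matching logic may differ.

```python
def form_matches(form: str, form_types: list[str]) -> bool:
    """True if form starts with any requested type, ignoring case."""
    wanted = [t.strip().upper() for t in form_types if t.strip()]
    if not form_types:
        return True  # an empty filter means "all form types"
    return any(form.upper().startswith(prefix) for prefix in wanted)


print(form_matches("10-K/A", ["10-k"]))      # True
print(form_matches("424B2", ["4"]))          # True
print(form_matches("8-K", ["10-K", "10-Q"])) # False
```

If the second result is not what you want, filtering the returned records by an exact `form` value on your side is a simple workaround.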

## `maxFilingsPerCompany` (type: `integer`):

Maximum number of filings to return per company, newest first.

## `dateFrom` (type: `string`):

Only return filings filed on or after this date. Leave empty for no lower bound.

## `dateTo` (type: `string`):

Only return filings filed on or before this date. Leave empty for no upper bound.
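
For scheduled monitoring runs, `dateFrom` and `dateTo` can be computed rather than hard-coded. The following sketch builds an input object covering the last day; the helper name `daily_window_input`, the `lookback_days` parameter, and the choice of 8-K as the form type are illustrative assumptions.

```python
from datetime import date, timedelta


def daily_window_input(tickers, today=None, lookback_days=1):
    """Build an Actor input that covers the last lookback_days days."""
    today = today or date.today()
    start = today - timedelta(days=lookback_days)
    return {
        "tickers": list(tickers),
        "formTypes": ["8-K"],
        "dateFrom": start.isoformat(),  # isoformat() gives YYYY-MM-DD
        "dateTo": today.isoformat(),
    }


window = daily_window_input(["AAPL", "MSFT"], today=date(2026, 6, 10))
print(window["dateFrom"], window["dateTo"])  # 2026-06-09 2026-06-10
```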

## `includeFinancials` (type: `boolean`):

Attach a summary of key XBRL financial facts (revenue, net income, assets, etc.) for each company. Adds one API call per company.
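
To give a feel for what "a summary of key XBRL facts" involves, here is a sketch that picks the most recent annual value for one concept. It assumes a nested layout like the SEC company-facts JSON (`facts` → taxonomy → concept → `units` → unit → list of entries with `end`, `val`, `fy`, `form`). The function name `latest_fact`, the selection rule, and the sample numbers are all made up for illustration; the Actor's actual selection logic is not documented here.

```python
# Hypothetical sample with invented values.
sample_facts = {
    "facts": {
        "us-gaap": {
            "NetIncomeLoss": {
                "units": {
                    "USD": [
                        {"end": "2024-09-28", "val": 100, "fy": 2024, "form": "10-K"},
                        {"end": "2025-06-28", "val": 40, "fy": 2025, "form": "10-Q"},
                        {"end": "2025-09-27", "val": 120, "fy": 2025, "form": "10-K"},
                    ]
                }
            }
        }
    }
}


def latest_fact(company_facts, concept, unit="USD", form="10-K", taxonomy="us-gaap"):
    """Return the newest entry for a concept reported on the given form, or None."""
    entries = (
        company_facts.get("facts", {})
        .get(taxonomy, {})
        .get(concept, {})
        .get("units", {})
        .get(unit, [])
    )
    candidates = [entry for entry in entries if entry.get("form") == form]
    if not candidates:
        return None
    newest = max(candidates, key=lambda entry: entry["end"])  # ISO dates sort as text
    return {
        "value": newest["val"],
        "unit": unit,
        "period_end": newest["end"],
        "form": newest["form"],
        "fy": newest.get("fy"),
    }


print(latest_fact(sample_facts, "NetIncomeLoss"))
```

The returned dictionary mirrors the shape of the `financial_facts` entries in the example output record.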

## `includeDocumentText` (type: `boolean`):

Fetch the primary filing document and attach cleaned plain text, ready for LLM/RAG ingestion. Slower and larger output.
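
As an illustration of what "cleaned plain text" can mean, the sketch below strips scripts, styles, and markup using only the standard library. It is not the Actor's implementation; the names `_TextExtractor` and `html_to_text` are hypothetical, and real filings (tables, inline XBRL tags) usually need more careful handling.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects text nodes, skipping anything inside <script> or <style>."""

    SKIP = {"script", "style"}

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self._skip_depth = 0
        self._parts = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self._parts.append(data)

    def text(self) -> str:
        # Join text nodes with spaces, then collapse all runs of whitespace.
        return " ".join(" ".join(self._parts).split())


def html_to_text(html: str) -> str:
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


sample_html = (
    "<html><head><style>p{color:red}</style><script>var x = 1;</script></head>"
    "<body><p>Total net sales</p><p>increased &amp; stabilised.</p></body></html>"
)
print(html_to_text(sample_html))  # Total net sales increased & stabilised.
```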

## `contactEmail` (type: `string`):

SEC requires a contact email in the User-Agent for fair-access. Your own email is recommended. A default is used if left blank.

## Actor input object example

```json
{
  "tickers": [
    "AAPL"
  ],
  "formTypes": [
    "10-K",
    "10-Q",
    "8-K"
  ],
  "maxFilingsPerCompany": 5,
  "includeFinancials": false,
  "includeDocumentText": false,
  "contactEmail": "contact@example.com"
}
```

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "tickers": [
        "AAPL"
    ],
    "formTypes": [
        "10-K",
        "10-Q",
        "8-K"
    ],
    "maxFilingsPerCompany": 5
};

// Run the Actor and wait for it to finish
const run = await client.actor("themineworks/sec-edgar-filings").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "tickers": ["AAPL"],
    "formTypes": [
        "10-K",
        "10-Q",
        "8-K",
    ],
    "maxFilingsPerCompany": 5,
}

# Run the Actor and wait for it to finish
run = client.actor("themineworks/sec-edgar-filings").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "tickers": [
    "AAPL"
  ],
  "formTypes": [
    "10-K",
    "10-Q",
    "8-K"
  ],
  "maxFilingsPerCompany": 5
}' |
apify call themineworks/sec-edgar-filings --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=themineworks/sec-edgar-filings",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "SEC EDGAR Filings Scraper — Structured for AI & RAG",
        "description": "Pull SEC EDGAR filings (10-K, 10-Q, 8-K, more) as clean structured JSON, ready for AI/RAG pipelines and fintech. Financial facts, filing text, zero charge on empty runs.",
        "version": "0.1",
        "x-build-id": "072cvpZMKWcMvezU2"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/themineworks~sec-edgar-filings/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-themineworks-sec-edgar-filings",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/themineworks~sec-edgar-filings/runs": {
            "post": {
                "operationId": "runs-sync-themineworks-sec-edgar-filings",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/themineworks~sec-edgar-filings/run-sync": {
            "post": {
                "operationId": "run-sync-themineworks-sec-edgar-filings",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "properties": {
                    "tickers": {
                        "title": "Stock tickers",
                        "type": "array",
                        "description": "Stock ticker symbols to pull filings for (e.g. AAPL, MSFT, NVDA). Resolved to SEC CIK automatically. Use this OR direct CIKs below.",
                        "items": {
                            "type": "string"
                        }
                    },
                    "ciks": {
                        "title": "CIK numbers (optional)",
                        "type": "array",
                        "description": "Direct SEC Central Index Key numbers, if you already have them (e.g. 320193 or 0000320193). Use instead of, or in addition to, tickers.",
                        "items": {
                            "type": "string"
                        }
                    },
                    "formTypes": {
                        "title": "Filing form types",
                        "type": "array",
                        "description": "Which SEC form types to return (e.g. 10-K, 10-Q, 8-K, S-1, 4, DEF 14A). Leave empty to return all form types. Matching is case-insensitive and prefix-based.",
                        "items": {
                            "type": "string"
                        }
                    },
                    "maxFilingsPerCompany": {
                        "title": "Max filings per company",
                        "minimum": 1,
                        "maximum": 1000,
                        "type": "integer",
                        "description": "Maximum number of filings to return per company, newest first.",
                        "default": 10
                    },
                    "dateFrom": {
                        "title": "Filed on or after (YYYY-MM-DD)",
                        "type": "string",
                        "description": "Only return filings filed on or after this date. Leave empty for no lower bound."
                    },
                    "dateTo": {
                        "title": "Filed on or before (YYYY-MM-DD)",
                        "type": "string",
                        "description": "Only return filings filed on or before this date. Leave empty for no upper bound."
                    },
                    "includeFinancials": {
                        "title": "Include financial facts (XBRL)",
                        "type": "boolean",
                        "description": "Attach a summary of key XBRL financial facts (revenue, net income, assets, etc.) for each company. Adds one API call per company.",
                        "default": false
                    },
                    "includeDocumentText": {
                        "title": "Include filing text (RAG-ready)",
                        "type": "boolean",
                        "description": "Fetch the primary filing document and attach cleaned plain text, ready for LLM/RAG ingestion. Slower and larger output.",
                        "default": false
                    },
                    "contactEmail": {
                        "title": "Contact email (SEC fair-access)",
                        "type": "string",
                        "description": "SEC requires a contact email in the User-Agent for fair-access. Your own email is recommended. A default is used if left blank.",
                        "default": "contact@example.com"
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
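
For projects that cannot use a client library, the `run-sync-get-dataset-items` path from the specification above can be called with the standard library alone. This is a sketch, assuming the endpoint returns the dataset items as a JSON array by default; the helper names `build_run_request` and `run_actor` are hypothetical, and `<YOUR_API_TOKEN>` is a placeholder you must replace.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.apify.com/v2"
ACTOR_PATH = "/acts/themineworks~sec-edgar-filings/run-sync-get-dataset-items"


def build_run_request(token: str, actor_input: dict) -> urllib.request.Request:
    """Prepare the POST request; the token goes in the query string per the spec."""
    query = urllib.parse.urlencode({"token": token})
    body = json.dumps(actor_input).encode("utf-8")
    return urllib.request.Request(
        f"{API_BASE}{ACTOR_PATH}?{query}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_actor(token: str, actor_input: dict, timeout: float = 300):
    """Send the request and decode the JSON response (performs a network call)."""
    with urllib.request.urlopen(build_run_request(token, actor_input), timeout=timeout) as resp:
        return json.load(resp)


example_input = {"tickers": ["AAPL"], "formTypes": ["10-K"], "maxFilingsPerCompany": 5}
request = build_run_request("<YOUR_API_TOKEN>", example_input)
print(request.get_method(), request.full_url)
```

Only `build_run_request` is exercised here; calling `run_actor` starts a real run and may incur charges.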
