# SEC EDGAR Data Scraper (`thescrapelab/apify-sec-edgar-data`) Actor

High-speed, browserless extraction of SEC EDGAR filings (10-K, 10-Q, 8-K, Form 4) by ticker symbol. Get structured company data, document manifests, and historical records in seconds without the overhead of a headless browser.

- **URL**: https://apify.com/thescrapelab/apify-sec-edgar-data.md
- **Developed by:** [Inus Grobler](https://apify.com/thescrapelab) (community)
- **Categories:** Developer tools, Automation, Other
- **Stats:** 2 total users, 1 monthly user, 100.0% runs succeeded
- **User rating**: No ratings yet

## Pricing

from $1.00 / 1,000 results

This Actor uses pay-per-event pricing. You are not charged for Apify platform usage; you pay only a fixed price for specific events.
Because this Actor supports Apify Store discounts, the price decreases on higher subscription plans.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event

## What's an Apify Actor?

Actors are software tools that run on the Apify platform and cover all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action that can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server that can be used as a website, an API, or an MCP server.
Actors are written with a capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows (PowerShell)
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

### SEC EDGAR filings scraper for Apify

Extract structured SEC EDGAR filing data for one or more stock tickers using direct SEC JSON and archive endpoints. This Actor resolves tickers to CIKs, expands historical filing archives when needed, filters filings by date and filing focus, and writes normalized filing records to the default dataset.

It is designed for analysts, investors, quants, compliance teams, and data engineers who need SEC filing data for forms such as `10-K`, `10-Q`, `8-K`, `DEF 14A`, `DEFA14A`, `S-1`, `Form 4`, and related filings.

### Why use this Actor

- Uses direct SEC endpoints instead of a browser, which keeps runs simpler and more stable.
- Resolves tickers such as `AAPL`, `MSFT`, and `BRK-B` to official SEC CIK identifiers automatically.
- Expands paginated submission history when a company has older filing archive pages.
- Supports simple filters for filing focus, filing categories, explicit form types, and filing dates.
- Offers two enrichment modes:
  - `filing-detail` for filing metadata and document manifests
  - `full` for parsed SEC complete submission text when text extraction is available
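
The ticker-to-CIK step above can be illustrated offline. The following is a minimal sketch, not the Actor's actual implementation: it assumes a lookup table shaped like SEC's public `company_tickers.json` file (rows with `cik_str`, `ticker`, and `title`), and the helper names `normalize_ticker` and `build_cik_index` are hypothetical. Trimming whitespace is an extra assumption; the README only states that tickers are uppercased.

```python
def normalize_ticker(ticker: str) -> str:
    """Uppercase the ticker (per the README); trimming is an added assumption."""
    return ticker.strip().upper()


def build_cik_index(company_tickers: dict) -> dict:
    """Map each ticker to a 10-digit, zero-padded CIK string."""
    return {
        normalize_ticker(row["ticker"]): str(row["cik_str"]).zfill(10)
        for row in company_tickers.values()
    }


# Inline sample in the assumed shape of SEC's company_tickers.json.
SAMPLE = {
    "0": {"cik_str": 320193, "ticker": "AAPL", "title": "Apple Inc."},
    "1": {"cik_str": 789019, "ticker": "MSFT", "title": "MICROSOFT CORP"},
}

index = build_cik_index(SAMPLE)

# Unknown tickers resolve to None instead of raising.
resolved = {t: index.get(normalize_ticker(t)) for t in ["aapl", " msft ", "ZZZZ"]}
print(resolved)
```

The zero-padded form matches the `cik` value shown in the sample dataset item (`"0000320193"`).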

### Quick start

Run the Actor with only a ticker list for a fast default scrape:

```json
{
  "tickers": ["AAPL", "MSFT"]
}
````

Default behavior:

- `filingFocus`: `investor`
- `enrichmentMode`: `filing-detail`
- `dateFrom` / `dateTo`: the most recent December window
- `maxFilingsPerTicker`: not capped unless you provide a value

For a broader, slower run with parsed submission text:

```json
{
  "tickers": ["AAPL", "MSFT", "NVDA"],
  "filingFocus": "investor_plus_governance",
  "dateFrom": "2025-01-01",
  "dateTo": "2025-12-01",
  "enrichmentMode": "full"
}
```

### Input reference

`tickers`

- Required.
- Array of stock tickers.
- Tickers are normalized to uppercase before lookup.

`filingFocus`

- Optional high-level filing filter.
- Supported values:
  - `investor`
  - `investor_plus_governance`
  - `company_filings`
  - `ownership`
  - `all`

`dateFrom` and `dateTo`

- Optional filing date range in `YYYY-MM-DD` format.
- When omitted, the Actor defaults to the most recent December date range, which keeps a default run small and quick.

`enrichmentMode`

- `filing-detail`
  Returns filing metadata, filing detail URLs, primary document URLs, and normalized document manifests.
- `full`
  Also fetches and parses the SEC complete submission text file when the filing content is text-based and extractable.

`maxFilingsPerTicker`

- Optional positive integer.
- Leave empty to return all matching filings for each ticker.

`formTypes`

- Optional advanced override for exact SEC form types such as `10-K`, `10-Q`, `8-K`, or `DEF 14A`.
- When provided, this overrides the high-level focus selection.

`filingCategories`

- Optional advanced override for normalized categories:
  - `financial_reports`
  - `company_updates`
  - `governance`
  - `capital_markets`
  - `ownership`
  - `other`
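
Checking an input object locally before starting a run can catch typos early. The sketch below is one possible way to do that, based only on the constraints documented in this page (allowed `filingFocus` and `enrichmentMode` values, the `YYYY-MM-DD` date format, and a positive `maxFilingsPerTicker`). The helper name `build_input` and its snake_case parameters are hypothetical and not part of the Actor or the Apify client.

```python
import re

FILING_FOCUS = {"investor", "investor_plus_governance", "company_filings", "ownership", "all"}
ENRICHMENT_MODES = {"filing-detail", "full"}
DATE_RE = re.compile(r"\d{4}-\d{2}-\d{2}")


def build_input(tickers, filing_focus=None, date_from=None, date_to=None,
                enrichment_mode=None, max_filings_per_ticker=None):
    """Hypothetical helper: validate values locally and return an Actor input dict."""
    if not tickers or not all(isinstance(t, str) and t for t in tickers):
        raise ValueError("tickers must be a non-empty list of non-empty strings")
    actor_input = {"tickers": list(tickers)}

    if filing_focus is not None:
        if filing_focus not in FILING_FOCUS:
            raise ValueError(f"unsupported filingFocus: {filing_focus!r}")
        actor_input["filingFocus"] = filing_focus

    for key, value in (("dateFrom", date_from), ("dateTo", date_to)):
        if value is not None:
            if not DATE_RE.fullmatch(value):
                raise ValueError(f"{key} must use YYYY-MM-DD format")
            actor_input[key] = value
    # ISO dates compare correctly as plain strings.
    if date_from and date_to and date_from > date_to:
        raise ValueError("dateFrom must not be later than dateTo")

    if enrichment_mode is not None:
        if enrichment_mode not in ENRICHMENT_MODES:
            raise ValueError(f"unsupported enrichmentMode: {enrichment_mode!r}")
        actor_input["enrichmentMode"] = enrichment_mode

    if max_filings_per_ticker is not None:
        if not isinstance(max_filings_per_ticker, int) or max_filings_per_ticker < 1:
            raise ValueError("maxFilingsPerTicker must be a positive integer")
        actor_input["maxFilingsPerTicker"] = max_filings_per_ticker

    return actor_input


print(build_input(["AAPL", "MSFT"], enrichment_mode="full",
                  date_from="2025-01-01", date_to="2025-12-01"))
```

Omitted options are left out of the dictionary so that the Actor's own defaults apply.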

### Output

Each dataset item represents one matched filing. Output fields include:

- Ticker, company name, CIK, form type, filing category, filing date, report date
- Filing detail URL, primary document URL, filing header URL
- Normalized list of filing documents and data files
- Submission text status and error details
- In `full` mode, structured `submissionText` data and extracted primary-document text when available

Sample dataset item:

```json
{
  "ticker": "AAPL",
  "companyName": "Apple Inc.",
  "cik": "0000320193",
  "formType": "10-K",
  "filingCategory": "financial_reports",
  "filingDate": "2025-10-31",
  "reportDate": "2025-09-27",
  "acceptedAt": "2025-10-31 06:01:26",
  "periodOfReport": "2025-09-27",
  "accessionNumber": "0000320193-25-000079",
  "filingDetailUrl": "https://www.sec.gov/Archives/edgar/data/320193/000032019325000079/0000320193-25-000079-index.html",
  "primaryDocumentUrl": "https://www.sec.gov/Archives/edgar/data/320193/000032019325000079/aapl-20250927.htm",
  "filingHeaderUrl": "https://www.sec.gov/Archives/edgar/data/320193/000032019325000079/0000320193-25-000079.hdr.sgml",
  "documentCount": 16,
  "dataFileCount": 8,
  "submissionTextStatus": "parsed",
  "submissionTextTextTruncated": true
}
```

Dataset items can be exported from Apify in JSON, CSV, Excel, XML, and other supported formats.
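
The URLs in the sample item appear to follow a regular pattern built from `cik` and `accessionNumber`. The sketch below reconstructs `filingDetailUrl` from those two fields; the pattern is inferred from the sample item above rather than documented by the Actor, and the function name `filing_detail_url` is hypothetical.

```python
def filing_detail_url(cik: str, accession_number: str) -> str:
    """Rebuild the filing index URL from a CIK and an accession number."""
    cik_number = int(cik)                        # "0000320193" -> 320193
    folder = accession_number.replace("-", "")   # accession number without dashes
    return (
        "https://www.sec.gov/Archives/edgar/data/"
        f"{cik_number}/{folder}/{accession_number}-index.html"
    )


# Fields copied from the sample dataset item above.
item = {"cik": "0000320193", "accessionNumber": "0000320193-25-000079"}
url = filing_detail_url(item["cik"], item["accessionNumber"])
print(url)
```

For the sample item, the result matches the `filingDetailUrl` value shown above.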

### Pricing model

This Actor uses Apify pay-per-event pricing.

Take the exact prices from the Pricing tab on Apify, because pricing may change over time. The charging model is:

- `ticker-search`
  Charged once per successfully resolved ticker in `filing-detail` mode.
- `ticker-search-premium`
  Charged once per successfully resolved ticker in `full` mode.
- `apify-default-dataset-item`
  Charged once per dataset item written to the default dataset.
- `apify-actor-start`
  Synthetic Apify start event handled by the platform.

Important charging notes:

- Invalid or unresolved tickers are not charged as ticker search events.
- Internal pagination and retry requests are not charged as separate search events.
- Result charges are tied to items actually written to the default dataset.
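
To see how the events above add up, a rough estimate can be computed from the number of resolved tickers and dataset items. This is a sketch only: every unit price below is a placeholder chosen for illustration, not the Actor's real price list, and the function name `estimate_cost` is hypothetical.

```python
from decimal import Decimal

# Placeholder unit prices in USD. These are NOT the Actor's real prices;
# replace them with the values from the Pricing tab.
PLACEHOLDER_PRICES = {
    "apify-actor-start": Decimal("0.00"),
    "ticker-search": Decimal("0.01"),
    "ticker-search-premium": Decimal("0.05"),
    "apify-default-dataset-item": Decimal("0.001"),
}


def estimate_cost(resolved_tickers, dataset_items,
                  enrichment_mode="filing-detail", prices=PLACEHOLDER_PRICES):
    """Sum the per-event charges for one run."""
    # `full` mode uses the premium search event; otherwise the standard one.
    search_event = "ticker-search-premium" if enrichment_mode == "full" else "ticker-search"
    return (
        prices["apify-actor-start"]
        + prices[search_event] * resolved_tickers
        + prices["apify-default-dataset-item"] * dataset_items
    )


detail_cost = estimate_cost(resolved_tickers=2, dataset_items=50)
full_cost = estimate_cost(resolved_tickers=2, dataset_items=50, enrichment_mode="full")
print(detail_cost, full_cost)
```

Only successfully resolved tickers should be counted in `resolved_tickers`, since invalid or unresolved tickers are not charged as search events.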

### Reliability and SEC handling

- Uses a conservative global throttle for direct SEC requests.
- Retries common transient SEC and upstream failures such as `403`, `429`, and `503`.
- Sends a proper SEC `User-Agent` header.
- Avoids proxy-specific behavior and works through direct SEC access on Apify.
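
The retry behavior described above is a common pattern. The sketch below shows one generic way to retry the listed status codes with exponential backoff; it is not the Actor's source code. The `fetch` callable, the delay values, and the `example.test` URL are assumptions made for illustration, and `sleep` is injectable so the example runs without waiting.

```python
import time

RETRYABLE_STATUSES = {403, 429, 503}  # status codes named in the list above


def fetch_with_retry(fetch, url, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call fetch(url) -> (status, body), retrying retryable statuses with backoff."""
    for attempt in range(1, max_attempts + 1):
        status, body = fetch(url)
        if status not in RETRYABLE_STATUSES:
            return status, body
        if attempt < max_attempts:
            # Wait 1x, 2x, 4x, ... the base delay between attempts.
            sleep(base_delay * 2 ** (attempt - 1))
    # All attempts used; return the last response as-is.
    return status, body


# Simulated server: two transient failures, then success.
responses = iter([(429, ""), (503, ""), (200, "ok")])
delays = []
result = fetch_with_retry(lambda url: next(responses),
                          "https://example.test/filing", sleep=delays.append)
print(result, delays)
```

In real use, `fetch` would wrap an HTTP client call and `sleep` would stay at its default.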

### Known limitations

- Some filings, especially annual reports in PDF form such as `ARS`, may be returned with `submissionTextStatus: "not_extractable"` in `full` mode. The filing metadata and document URLs are still included.
- `full` mode is materially slower and more expensive than `filing-detail` because it fetches and parses complete submission text files.
- Very broad date ranges across many tickers can produce large datasets and longer runtimes.
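
One way to keep individual runs small is to split a broad date range into monthly windows and start one run per window. The sketch below assumes that `dateFrom` and `dateTo` are both inclusive, as stated in the input schema; the helper name `month_windows` is hypothetical.

```python
import calendar
from datetime import date


def month_windows(date_from: str, date_to: str):
    """Split an inclusive YYYY-MM-DD range into per-month (dateFrom, dateTo) pairs."""
    start, end = date.fromisoformat(date_from), date.fromisoformat(date_to)
    if start > end:
        raise ValueError("date_from must not be later than date_to")

    windows = []
    cursor = start
    while cursor <= end:
        last_day = calendar.monthrange(cursor.year, cursor.month)[1]
        window_end = min(date(cursor.year, cursor.month, last_day), end)
        windows.append((cursor.isoformat(), window_end.isoformat()))
        # Move to the first day of the next month, rolling over the year in December.
        if cursor.month == 12:
            cursor = date(cursor.year + 1, 1, 1)
        else:
            cursor = date(cursor.year, cursor.month + 1, 1)
    return windows


for window_from, window_to in month_windows("2025-01-15", "2025-03-10"):
    print({"tickers": ["AAPL"], "dateFrom": window_from, "dateTo": window_to})
```

Each printed dictionary is a valid input object for a separate run.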

### Best use cases

- SEC EDGAR filing search by ticker
- Historical `10-K`, `10-Q`, and `8-K` extraction
- Proxy statement and governance filing collection
- Ownership and insider filing monitoring
- Financial research datasets for backtesting and analysis

# Actor input Schema

## `tickers` (type: `array`):

One or more stock tickers to resolve to SEC CIKs. Tickers are normalized to uppercase.

## `filingFocus` (type: `string`):

Choose the kind of filings you want without needing to know SEC form codes.

## `dateFrom` (type: `string`):

Inclusive lower bound for filingDate in YYYY-MM-DD format. Defaults to the most recent December when omitted.

## `dateTo` (type: `string`):

Inclusive upper bound for filingDate in YYYY-MM-DD format. Defaults to the most recent December when omitted.

## `enrichmentMode` (type: `string`):

Choose how much extra SEC enrichment to fetch. More enrichment means more requests, larger dataset items, slower runs, and higher cost.

## `maxFilingsPerTicker` (type: `integer`):

Optional cap on the number of matching filings to keep per ticker after filtering. Leave empty to return all matching filings. Most recent filings are kept first.

## `formTypes` (type: `array`):

Explicit list of SEC form types to include (e.g. 10-K, 10-Q, 8-K, DEF 14A). When provided, overrides the Filing Focus filter. Leave empty to use Filing Focus instead.

## `filingCategories` (type: `array`):

Explicit list of filing categories to include. Valid values: `financial_reports`, `company_updates`, `governance`, `capital_markets`, `ownership`, `other`. When provided, overrides the Filing Focus filter. Leave empty to use Filing Focus instead.

## Actor input object example

```json
{
  "tickers": [
    "AAPL"
  ],
  "filingFocus": "investor",
  "dateFrom": "2025-12-01",
  "dateTo": "2025-12-31",
  "enrichmentMode": "filing-detail"
}
```

# Actor output Schema

## `results` (type: `string`):

API URL for the default dataset items containing the filtered SEC filings.

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "tickers": [
        "AAPL"
    ],
    "dateFrom": "2025-12-01",
    "dateTo": "2025-12-31"
};

// Run the Actor and wait for it to finish
const run = await client.actor("thescrapelab/apify-sec-edgar-data").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "tickers": ["AAPL"],
    "dateFrom": "2025-12-01",
    "dateTo": "2025-12-31",
}

# Run the Actor and wait for it to finish
run = client.actor("thescrapelab/apify-sec-edgar-data").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "tickers": [
    "AAPL"
  ],
  "dateFrom": "2025-12-01",
  "dateTo": "2025-12-31"
}' |
apify call thescrapelab/apify-sec-edgar-data --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=thescrapelab/apify-sec-edgar-data",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "SEC EDGAR Data Scraper",
        "description": "High-speed, browserless extraction of SEC EDGAR filings (10-K, 10-Q, 8-K, Form 4) by ticker symbol. Get structured company data, document manifests, and historical records in seconds without the overhead of a headless browser.",
        "version": "1.0",
        "x-build-id": "uSPjjHTA3hghudUKl"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/thescrapelab~apify-sec-edgar-data/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-thescrapelab-apify-sec-edgar-data",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/thescrapelab~apify-sec-edgar-data/runs": {
            "post": {
                "operationId": "runs-sync-thescrapelab-apify-sec-edgar-data",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/thescrapelab~apify-sec-edgar-data/run-sync": {
            "post": {
                "operationId": "run-sync-thescrapelab-apify-sec-edgar-data",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "required": [
                    "tickers"
                ],
                "properties": {
                    "tickers": {
                        "title": "Tickers",
                        "minItems": 1,
                        "type": "array",
                        "description": "One or more stock tickers to resolve to SEC CIKs. Tickers are normalized to uppercase.",
                        "items": {
                            "type": "string",
                            "minLength": 1
                        }
                    },
                    "filingFocus": {
                        "title": "Filing Focus",
                        "enum": [
                            "investor",
                            "investor_plus_governance",
                            "company_filings",
                            "ownership",
                            "all"
                        ],
                        "type": "string",
                        "description": "Choose the kind of filings you want without needing to know SEC form codes.",
                        "default": "investor"
                    },
                    "dateFrom": {
                        "title": "Date From",
                        "pattern": "^\\d{4}-\\d{2}-\\d{2}$",
                        "type": "string",
                        "description": "Inclusive lower bound for filingDate in YYYY-MM-DD format. Defaults to the most recent December when omitted."
                    },
                    "dateTo": {
                        "title": "Date To",
                        "pattern": "^\\d{4}-\\d{2}-\\d{2}$",
                        "type": "string",
                        "description": "Inclusive upper bound for filingDate in YYYY-MM-DD format. Defaults to the most recent December when omitted."
                    },
                    "enrichmentMode": {
                        "title": "Enrichment Mode",
                        "enum": [
                            "filing-detail",
                            "full"
                        ],
                        "type": "string",
                        "description": "Choose how much extra SEC enrichment to fetch. More enrichment means more requests, larger dataset items, slower runs, and higher cost.",
                        "default": "filing-detail"
                    },
                    "maxFilingsPerTicker": {
                        "title": "Max Filings Per Ticker",
                        "minimum": 1,
                        "type": "integer",
                        "description": "Optional cap on the number of matching filings to keep per ticker after filtering. Leave empty to return all matching filings. Most recent filings are kept first."
                    },
                    "formTypes": {
                        "title": "Form Types (advanced override)",
                        "type": "array",
                        "description": "Explicit list of SEC form types to include (e.g. 10-K, 10-Q, 8-K, DEF 14A). When provided, overrides the Filing Focus filter. Leave empty to use Filing Focus instead.",
                        "items": {
                            "type": "string",
                            "minLength": 1
                        }
                    },
                    "filingCategories": {
                        "title": "Filing Categories (advanced override)",
                        "type": "array",
                        "description": "Explicit list of filing categories to include. Valid values: financial_reports, company_updates, governance, capital_markets, ownership, other. When provided, overrides the Filing Focus filter. Leave empty to use Filing Focus instead.",
                        "items": {
                            "type": "string",
                            "minLength": 1
                        }
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
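
The specification above can also be used without a client library. The following sketch builds a request for the `run-sync-get-dataset-items` path using only the Python standard library; the server URL, path, and `token` query parameter come from the specification, while the function name `build_run_sync_request` is hypothetical. The request is only constructed here, not sent.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.apify.com/v2"
ACTOR_PATH = "thescrapelab~apify-sec-edgar-data"


def build_run_sync_request(token: str, actor_input: dict) -> urllib.request.Request:
    """Build a POST request that runs the Actor and returns its dataset items."""
    query = urllib.parse.urlencode({"token": token})
    url = f"{API_BASE}/acts/{ACTOR_PATH}/run-sync-get-dataset-items?{query}"
    return urllib.request.Request(
        url,
        data=json.dumps(actor_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_run_sync_request("<YOUR_API_TOKEN>", {"tickers": ["AAPL"]})
print(request.get_method(), request.full_url)

# To send the request (requires a valid token and network access):
# with urllib.request.urlopen(request) as response:
#     items = json.load(response)
```

Because the call waits for the run to finish, long runs may be better served by the `/runs` path, which returns run information immediately.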
