SEC EDGAR Earnings Transcripts Scraper
Pricing
Pay per event
Extract earnings transcripts and press releases from SEC EDGAR 8-K filings (Exhibit 99.1 / 99.2). Search by ticker, CIK, or date range. Returns structured records with full exhibit text, speaker labels, and filing metadata.
Developer
BowTiedRaccoon
Maintained by Community
6 days ago
Last modified
Extracts earnings transcripts and press releases from SEC EDGAR 8-K filings. Returns structured records with full exhibit text, speaker labels, and filing metadata for Exhibit 99.1 (earnings press release) and Exhibit 99.2 (verbatim transcript) — the official government-filed source for US public company earnings calls.
SEC EDGAR Earnings Transcripts Scraper Features
- Extracts Exhibit 99.1 (earnings press releases) and Exhibit 99.2 (verbatim transcripts) as plain text from 8-K filings
- Filters to Item 2.02 filings — Results of Operations — so you get earnings-relevant records without wading through every 8-K
- Detects verbatim transcripts heuristically and extracts speaker labels (OPERATOR, CEO, CFO, analyst names)
- Supports three input modes: ticker list, CIK list, or full-text search query
- Returns complete filing metadata: accession number, filing date, period of report, fiscal quarter, SIC code
- Handles iXBRL-wrapped exhibits gracefully — strips inline XBRL tags before returning clean text
- Respects SEC fair-access policy (5 req/sec, descriptive User-Agent) out of the box
- Works without proxy or credentials — SEC EDGAR is public-domain US government data
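The exhibit clean-up and speaker detection listed above could look roughly like the following. This is a minimal sketch, not the actor's actual code: the class `ExhibitText`, the functions `html_to_text`, `extract_speakers`, and `looks_like_transcript`, the "Label: text" regular expression, and the three-turn threshold are all assumptions made for illustration.

```python
import re
from html.parser import HTMLParser


class ExhibitText(HTMLParser):
    """Collect visible text; tags (including ix:* wrappers) are dropped."""

    SKIP = {"script", "style", "ix:header"}   # content that is never displayed
    BLOCK = {"p", "div", "br", "tr"}          # tags that start a new line

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._skip_depth = 0
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names, so ix:nonFraction arrives as ix:nonfraction.
        if tag in self.SKIP:
            self._skip_depth += 1
        elif tag in self.BLOCK:
            self._chunks.append("\n")

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self._chunks.append(data)

    def text(self):
        joined = "".join(self._chunks)
        lines = (re.sub(r"[ \t\xa0]+", " ", ln).strip() for ln in joined.splitlines())
        return "\n".join(ln for ln in lines if ln)


def html_to_text(html):
    parser = ExhibitText()
    parser.feed(html)
    parser.close()
    return parser.text()


# Hypothetical heuristic: a speaker turn is a line starting with "Label: ".
SPEAKER_RE = re.compile(r"^([A-Z][A-Za-z.'\- ]{1,60}?):\s", re.MULTILINE)


def extract_speakers(text):
    """Return distinct speaker labels in order of first appearance."""
    seen = []
    for match in SPEAKER_RE.finditer(text):
        label = match.group(1).strip()
        if label not in seen:
            seen.append(label)
    return seen


def looks_like_transcript(text, min_turns=3):
    """Treat text as a transcript when it has at least min_turns speaker turns."""
    return len(SPEAKER_RE.findall(text)) >= min_turns
```

A pattern this loose will also match lines such as "Note: ...", so any real detector would likely need extra signals (turn density, an operator line, a Q&A section marker).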
Who Uses SEC EDGAR Earnings Transcripts Data?
- Quantitative researchers — Build NLP models on verbatim management language; sentiment analysis across quarters; tone-shift detection
- Equity analysts — Pull earnings call text programmatically instead of paying Bloomberg Terminal rates for the same data
- Financial AI startups — Fine-tune LLMs on earnings calls; build RAG pipelines over the official source corpus
- Hedge funds — Systematic analysis of 8-K Item 2.02 filings across the entire US-listed universe
- Academic researchers — Longitudinal studies of earnings disclosure language from 2001 to present
How SEC EDGAR Earnings Transcripts Scraper Works
- Resolves input (ticker symbols, CIK numbers, or search query) to a list of SEC filers using the official company_tickers.json registry
- Fetches the submissions/CIK{n}.json API endpoint for each filer to enumerate recent 8-K filings and filters to Item 2.02 entries within the requested date range
- For each matching filing, fetches the filing index page to discover Exhibit 99.1 and 99.2 filenames, then fetches each exhibit HTML body
- Extracts plain text, detects verbatim transcripts, pulls speaker labels, and saves a structured record per filing
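The first two steps can be sketched without any network calls. The function names here (`pad_cik`, `submissions_url`, `resolve_tickers`, `filter_item_202`) are hypothetical. The data shapes are assumptions about the SEC payloads: rows keyed by `cik_str` and `ticker` in company_tickers.json, and parallel arrays under `filings.recent` in a submissions file. The actor's internals may differ.

```python
def pad_cik(cik):
    """Zero-pad a CIK to the 10-digit form used in submissions URLs."""
    return str(int(cik)).zfill(10)


def submissions_url(cik):
    """URL of the per-filer submissions JSON on data.sec.gov."""
    return f"https://data.sec.gov/submissions/CIK{pad_cik(cik)}.json"


def resolve_tickers(tickers, registry):
    """Map tickers to padded CIKs using a parsed company_tickers.json payload.

    Assumes rows shaped like
    {"cik_str": 320193, "ticker": "AAPL", "title": "Apple Inc."}.
    Unknown tickers are skipped.
    """
    by_ticker = {row["ticker"].upper(): row["cik_str"] for row in registry.values()}
    return {
        t.upper(): pad_cik(by_ticker[t.upper()])
        for t in tickers
        if t.upper() in by_ticker
    }


def filter_item_202(recent, date_from, date_to):
    """Keep 8-K filings that report Item 2.02 and fall inside the date range.

    Assumes `recent` holds parallel arrays (form, filingDate, items,
    accessionNumber). ISO dates compare correctly as plain strings.
    """
    rows = zip(recent["form"], recent["filingDate"],
               recent["items"], recent["accessionNumber"])
    matches = []
    for form, filed, items, accession in rows:
        if form != "8-K":
            continue
        if "2.02" not in [code.strip() for code in items.split(",")]:
            continue
        if not (date_from <= filed <= date_to):
            continue
        matches.append({
            "accession_number": accession,
            "filing_date": filed,
            "items_reported": items,
        })
    return matches
```

Keeping these helpers pure makes them easy to test against saved JSON fixtures before pointing anything at the live endpoints.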
Input
{
  "tickers": ["AAPL", "MSFT"],
  "dateFrom": "2025-01-01",
  "dateTo": "2025-12-31",
  "fetchExhibitText": true,
  "maxItems": 50
}
| Field | Type | Default | Description |
|---|---|---|---|
| tickers | array | ["AAPL"] | Ticker symbols to look up (resolves to CIKs automatically) |
| cikNumbers | array | — | SEC CIK numbers (use instead of tickers when you know the CIK) |
| searchQuery | string | — | Full-text search query across all 8-K filings (e.g. "earnings call") |
| dateFrom | string | — | Earliest filing date to include (YYYY-MM-DD) |
| dateTo | string | — | Latest filing date to include (YYYY-MM-DD) |
| fetchExhibitText | boolean | true | Fetch and extract plain text from exhibits. Set to false for metadata-only runs |
| maxItems | integer | 10 | Maximum number of filing records to return |
Mode selection: Provide tickers, cikNumbers, or searchQuery — exactly one is required. If you provide tickers, the actor resolves them to CIKs via the SEC company registry automatically.
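A caller-side pre-flight check for the "exactly one mode" rule might look like this. It is a sketch only: `select_mode` is a hypothetical helper, and how the actor itself treats the `tickers` default when another mode is supplied is not stated here.

```python
MODE_FIELDS = ("tickers", "cikNumbers", "searchQuery")


def select_mode(run_input):
    """Return the single mode field that is set, or raise ValueError.

    Empty lists and empty strings count as "not provided".
    """
    provided = [field for field in MODE_FIELDS if run_input.get(field)]
    if len(provided) != 1:
        raise ValueError(
            f"expected exactly one of {MODE_FIELDS}, got {provided or 'none'}"
        )
    return provided[0]
```

For example, `select_mode({"tickers": ["AAPL", "MSFT"], "maxItems": 50})` returns `"tickers"`, while passing both `tickers` and `searchQuery` raises.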
Output Fields
{
  "cik": "0000320193",
  "ticker": "AAPL",
  "company_name": "Apple Inc.",
  "sic_code": "3571",
  "form_type": "8-K",
  "accession_number": "0000320193-26-000011",
  "filing_date": "2026-04-30",
  "period_of_report": "2026-03-28",
  "items_reported": "2.02,9.01",
  "has_item_202": true,
  "fiscal_quarter": "Q1 2026",
  "exhibit_99_1_url": "https://www.sec.gov/Archives/edgar/data/320193/000032019326000011/a8-kex991q2202603282026.htm",
  "exhibit_99_1_text": "Apple reports second quarter results...",
  "exhibit_99_2_url": "",
  "exhibit_99_2_text": "",
  "transcript_present": false,
  "transcript_speakers": "",
  "filing_index_url": "https://www.sec.gov/Archives/edgar/data/320193/000032019326000011/",
  "filing_html_url": "https://www.sec.gov/Archives/edgar/data/320193/000032019326000011/aapl-20260430.htm",
  "filer_full_filename": "aapl-20260430.htm",
  "source_endpoint": "ticker"
}
| Field | Type | Description |
|---|---|---|
| cik | string | 10-digit zero-padded SEC Central Index Key |
| ticker | string | Primary common-stock ticker symbol |
| company_name | string | Company legal name as registered with the SEC |
| sic_code | string | Standard Industrial Classification code |
| form_type | string | SEC form type (always 8-K) |
| accession_number | string | Canonical SEC filing identifier |
| filing_date | string | Date the filing was accepted by the SEC (YYYY-MM-DD) |
| period_of_report | string | Quarter-end date the filing reports on (YYYY-MM-DD) |
| items_reported | string | Comma-separated 8-K item codes (e.g. 2.02,9.01) |
| has_item_202 | boolean | True when the filing includes Item 2.02 (Results of Operations) |
| fiscal_quarter | string | Human-readable fiscal quarter (e.g. Q2 2026) |
| exhibit_99_1_url | string | Direct URL to Exhibit 99.1 (earnings press release) when present |
| exhibit_99_1_text | string | Plain text extracted from Exhibit 99.1 |
| exhibit_99_2_url | string | Direct URL to Exhibit 99.2 (verbatim transcript) when present |
| exhibit_99_2_text | string | Plain text extracted from Exhibit 99.2 |
| transcript_present | boolean | True when Exhibit 99.2 appears to be a verbatim earnings call transcript |
| transcript_speakers | string | Pipe-separated speaker labels extracted from the transcript |
| filing_index_url | string | URL of the SEC filing index page |
| filing_html_url | string | URL of the primary 8-K HTML document |
| filer_full_filename | string | Primary submission filename |
| source_endpoint | string | Which input mode produced this record (ticker, cik, search) |
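Because `items_reported` and `transcript_speakers` arrive as delimited strings, downstream code will usually want them as lists. The helpers below are an illustrative sketch (`split_field`, `normalize_record`, and `filing_index_url` are hypothetical names). The URL layout is inferred from the sample record above: the CIK without leading zeros, then the accession number with dashes removed.

```python
def split_field(value, sep):
    """Split a delimited string into trimmed, non-empty parts."""
    return [part.strip() for part in value.split(sep) if part.strip()]


def normalize_record(record):
    """Return a copy of a record with list-valued `items` and `speakers`."""
    out = dict(record)
    out["items"] = split_field(record.get("items_reported", ""), ",")
    out["speakers"] = split_field(record.get("transcript_speakers", ""), "|")
    return out


def filing_index_url(cik, accession_number):
    """Rebuild the filing folder URL from a CIK and a dashed accession number."""
    folder = accession_number.replace("-", "")
    return f"https://www.sec.gov/Archives/edgar/data/{int(cik)}/{folder}/"
```

An empty `transcript_speakers` string yields an empty list, so records without a transcript need no special casing.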
FAQ
Does this require an API key or login?
No. SEC EDGAR is public-domain US government data. The actor handles the required User-Agent header automatically.
What is Exhibit 99.1 vs Exhibit 99.2?
Exhibit 99.1 is the earnings press release — the formatted document companies publish alongside quarterly results. Exhibit 99.2, when filed, is a verbatim transcript of the earnings call. Roughly 20-30% of filers include a verbatim transcript; the rest file only the press release.
How far back does the data go?
EDGAR HTML-format filings extend to approximately 2001. Older filings may lack proper Exhibit 99.x labeling — the actor handles these gracefully and returns transcript_present: false rather than failing.
Can I scrape the full S&P 500 earnings history?
Yes. Provide a ticker list of up to 500 companies and set dateFrom/dateTo to the range you need. Each company typically has 4 quarterly 8-K Item 2.02 filings per year, so the full S&P 500 for one year is roughly 500 × 4 = 2,000 records. Raise maxItems to at least that figure, since the default of 10 would cut the run short.
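The sizing arithmetic can be folded into a small helper that builds the run input. This is a sketch under stated assumptions: `build_run_input` is a hypothetical function, four filings per company per year is the typical figure quoted above, and the 25% headroom is an arbitrary safety margin for amended or off-cycle filings.

```python
import math
from datetime import date


def build_run_input(tickers, date_from, date_to, filings_per_year=4, headroom=1.25):
    """Build an input object with maxItems sized to the expected record count."""
    start = date.fromisoformat(date_from)
    end = date.fromisoformat(date_to)
    if end < start:
        raise ValueError("dateTo must not precede dateFrom")
    years = end.year - start.year + 1  # calendar years touched by the range
    max_items = math.ceil(len(tickers) * filings_per_year * years * headroom)
    return {
        "tickers": list(tickers),
        "dateFrom": date_from,
        "dateTo": date_to,
        "fetchExhibitText": True,
        "maxItems": max_items,
    }
```

For 500 tickers over calendar year 2025 this gives 500 × 4 × 1 × 1.25 = 2,500 as the maxItems value.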
What rate limits apply?
SEC EDGAR enforces a 10 req/sec fair-access ceiling. This actor defaults to 5 req/sec with automatic backoff on 429 responses.
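For readers writing their own client against EDGAR, a request throttle with exponential backoff could be sketched as follows. The `Throttle` class and `backoff_delay` function are hypothetical, and the base delay and cap are assumed values; the actor's actual backoff schedule is not documented here.

```python
import time


class Throttle:
    """Space calls at least 1 / rate_per_sec seconds apart."""

    def __init__(self, rate_per_sec=5.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / rate_per_sec
        self._clock = clock      # injectable for testing
        self._sleep = sleep
        self._last = None

    def wait(self):
        """Block until the next request is allowed, then record its time."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


def backoff_delay(attempt, base=1.0, cap=60.0):
    """Seconds to pause after the Nth consecutive 429 (attempt starts at 0)."""
    return min(cap, base * (2 ** attempt))
```

Call `throttle.wait()` before each request, and on a 429 response sleep for `backoff_delay(attempt)` before retrying. At 5 req/sec the throttle leaves headroom under the 10 req/sec ceiling.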
Need More Features?
To request additional fields or input modes, file an issue on the actor page. Custom builds for specific filer sets or date ranges are available.
Why Use SEC EDGAR Earnings Transcripts Scraper?
- Canonical source — Official SEC filings, not third-party republication. The same documents Bloomberg Terminal charges SaaS prices for.
- No paywalls — Public-domain US government data. The only gate is a proper User-Agent header, which the actor handles.
- Per-exhibit extraction — Returns Exhibit 99.1 and 99.2 separately with distinct fields, not a raw dump of the full filing HTML.
- Speaker detection — Extracts analyst and executive names from verbatim transcripts so you can filter by speaker without NLP preprocessing.
- Three input modes — Ticker lookup, direct CIK, or full-text search — covering every discovery workflow.