Us Court Records Scraper
Pricing
from $20.00 / 1,000 results
A unified Apify Actor that scrapes opinions and oral arguments from 200+ US court websites (all federal appellate courts, the Supreme Court, and most state courts of last resort).
Developer
Logical Vivacity
What it does
- Dynamically loads any supported court by short id (`scotus`, `ca1`, `ny`, `cal`, ...) and parses the latest batch of opinions or oral arguments the court has published.
- Normalises every record into a JSON dataset row with stable field names (`case_name`, `docket_number`, `date_filed`, `judge`, `neutral_citation`, `download_url`, etc.).
- Optionally downloads the underlying PDF / audio file and stores it in the default key-value store.
- Wraps each court in its own timeout and try/except so one broken court never aborts the run.
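The per-court isolation described in the last bullet can be sketched roughly as follows. This is an illustration only, not the Actor's actual code: `scrape_courts`, the `scrape_one` callable, and the 60-second default are hypothetical names and values chosen for the example.

```python
import asyncio


async def scrape_courts(court_ids, scrape_one, timeout_s=60.0):
    """Run scrape_one(court_id) for each court, isolating failures.

    scrape_one is a hypothetical coroutine function that returns a list
    of record dicts for a single court.
    """
    results = {}   # court_id -> list of records
    failures = {}  # court_id -> short error description
    for court_id in court_ids:
        try:
            # Bound the time spent on any single court.
            results[court_id] = await asyncio.wait_for(
                scrape_one(court_id), timeout=timeout_s
            )
        except asyncio.TimeoutError:
            failures[court_id] = f"timed out after {timeout_s}s"
        except Exception as exc:
            # A broken court is recorded and the loop moves on.
            failures[court_id] = f"{type(exc).__name__}: {exc}"
    return results, failures
```

Returning the failures alongside the results, rather than raising, is what lets the run continue past a court whose page layout has changed.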
Inputs
| Field | Type | Default | Description |
|---|---|---|---|
| `courtIds` | string[] | `["scotus"]` | Court short names (e.g. `scotus`, `ca1`, `ny`). Ignored when `scrapeAll` is true. |
| `scrapeAll` | boolean | `false` | If true, auto-discover and scrape every court module under the chosen `documentType`. |
| `documentType` | enum | `opinions` | `opinions` or `oral_arguments`. |
| `maxOpinionsPerCourt` | integer | `100` | Hard cap on records per court. |
| `downloadDocuments` | boolean | `false` | Fetch the PDF / audio and save it to the key-value store as `<courtId>-<index>`. |
| `proxyConfiguration` | object | Apify Proxy on | Strongly recommended. Many state court sites throttle or block repeat hits from a single IP. Defaults to Apify Proxy with the `RESIDENTIAL` group. Set `useApifyProxy: false` to disable. |
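As an illustration of how these fields fit together, the sketch below builds an input object with the defaults from the table. The helper name `build_run_input` and its validation rules are hypothetical. The `apifyProxyGroups` key is the standard Apify proxy input field; that this Actor reads the group from it is an assumption.

```python
from typing import Any, Optional, Sequence

DOCUMENT_TYPES = {"opinions", "oral_arguments"}


def build_run_input(
    court_ids: Optional[Sequence[str]] = None,
    scrape_all: bool = False,
    document_type: str = "opinions",
    max_per_court: int = 100,
    download_documents: bool = False,
    use_apify_proxy: bool = True,
) -> dict[str, Any]:
    """Build an Actor input dict using the field names from the table."""
    if document_type not in DOCUMENT_TYPES:
        raise ValueError(f"documentType must be one of {sorted(DOCUMENT_TYPES)}")
    if max_per_court < 1:
        raise ValueError("maxOpinionsPerCourt must be a positive integer")

    proxy: dict[str, Any] = {"useApifyProxy": use_apify_proxy}
    if use_apify_proxy:
        # Assumption: the RESIDENTIAL group is selected via apifyProxyGroups.
        proxy["apifyProxyGroups"] = ["RESIDENTIAL"]

    return {
        "courtIds": list(court_ids) if court_ids is not None else ["scotus"],
        "scrapeAll": scrape_all,
        "documentType": document_type,
        "maxOpinionsPerCourt": max_per_court,
        "downloadDocuments": download_documents,
        "proxyConfiguration": proxy,
    }
```

With the official `apify-client` package, a dict like this would typically be passed as `run_input` to `ApifyClient(token).actor(actor_id).call(...)`. The Actor ID is not stated here, so it is left out of the sketch.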
Output sample
Each dataset record looks like:

    {
      "court_id": "scotus",
      "document_type": "opinions",
      "scraped_at": "2026-04-26T12:00:00Z",
      "case_name": "United States v. Doe",
      "case_names": "United States v. Doe",
      "docket_number": "23-145",
      "docket_numbers": "23-145",
      "date_filed": "2026-04-15",
      "case_dates": "2026-04-15",
      "judge": "Roberts",
      "judges": "Roberts",
      "neutral_citation": "601 U.S. ___",
      "citations": "601 U.S. ___",
      "download_url": "https://www.supremecourt.gov/opinions/25pdf/23-145_8njq.pdf",
      "download_urls": "https://www.supremecourt.gov/opinions/25pdf/23-145_8njq.pdf",
      "precedential_statuses": "Published",
      "document_storage_key": "scotus-0"
    }

When downloadDocuments=true, the binary at download_url is stored under
document_storage_key in the default key-value store.
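A consumer of the dataset might load each row into a small typed structure, as in the sketch below. The `CourtRecord` class and `parse_record` helper are hypothetical. The sketch assumes `date_filed` is always an ISO `YYYY-MM-DD` string, as in the sample, and that `document_storage_key` may be missing when documents are not downloaded.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class CourtRecord:
    court_id: str
    document_type: str
    case_name: str
    docket_number: str
    date_filed: date
    download_url: str
    document_storage_key: Optional[str]  # assumed absent without downloads


def parse_record(row: dict) -> CourtRecord:
    """Convert one dataset row (a dict) into a CourtRecord."""
    return CourtRecord(
        court_id=row["court_id"],
        document_type=row["document_type"],
        case_name=row["case_name"],
        docket_number=row["docket_number"],
        # Assumption: dates are ISO formatted, as in the sample record.
        date_filed=date.fromisoformat(row["date_filed"]),
        download_url=row["download_url"],
        document_storage_key=row.get("document_storage_key"),
    )
```

Parsing the date up front surfaces malformed rows early: `date.fromisoformat` raises `ValueError` on anything that is not a valid ISO date.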
Why a proxy matters
Federal appellate courts (ca1-ca11, scotus) usually tolerate repeated
hits from a single IP. Many state court portals (CA, NY, TX, FL and
others) will throttle or block after a handful of requests from the same IP
within a short window. Routing through Apify's residential proxy pool keeps
runs reliable across the long tail of state courts. Datacenter proxies work
for federal courts but get blocked by the strictest state portals.
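The rule of thumb in this section can be written as a small helper that suggests a proxy tier for a list of court ids. It is only a sketch: `recommended_proxy_tier` is a hypothetical name, the set of "federal" ids is limited to the ones named above (`ca1`-`ca11` and `scotus`), and anything else is conservatively treated as needing residential proxies.

```python
import re

# Matches only the federal ids named in the text: scotus and ca1..ca11.
FEDERAL_ID = re.compile(r"scotus|ca(?:1[01]|[1-9])")


def recommended_proxy_tier(court_ids):
    """Return "datacenter" if every id is a listed federal court,
    otherwise "residential"."""
    ids = list(court_ids)
    if ids and all(FEDERAL_ID.fullmatch(court_id) for court_id in ids):
        return "datacenter"
    # Empty lists (for example with scrapeAll) and any unlisted court
    # fall back to the safer residential tier.
    return "residential"
```

For example, `recommended_proxy_tier(["scotus", "ca9"])` yields `"datacenter"`, while adding `"ny"` to the list switches the suggestion to `"residential"`.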
Limitations
- Court sites change frequently. Individual court scrapers break regularly when courts redesign their pages. Expect some courts to fail on any given run; failures are logged and the run continues.
- Most court sites only expose the most recent batch (typically the last 10-100 opinions). This Actor does not perform historical backfill; use CourtListener for archival data.
- Some sources (notably PACER and a handful of state portals) require authentication or JavaScript rendering and are not supported by this Actor configuration.
- Georgia state court is not currently supported.
