Us Court Records Scraper

Pricing

from $20.00 / 1,000 results

A unified Apify Actor that scrapes opinions and oral arguments from 200+ US court websites (all federal appellate courts, the Supreme Court, and most state courts of last resort).


Developer

Logical Vivacity

Maintained by Community

What it does

  • Dynamically loads any supported court by short id (scotus, ca1, ny, cal, ...) and parses the latest batch of opinions or oral arguments the court has published.
  • Normalises every record into a JSON dataset row with stable field names (case_name, docket_number, date_filed, judge, neutral_citation, download_url, etc.).
  • Optionally downloads the underlying PDF / audio file and stores it in the default key-value store.
  • Wraps each court in its own timeout and try/except, so one broken court never aborts the run.
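
The per-court isolation described in the last bullet can be sketched roughly as follows. This is a minimal illustration, not the Actor's actual code: `scrape_court` is a hypothetical stand-in for a real court scraper, and the `scrape_all` function and its one-second default timeout are assumptions made for the example.

```python
import asyncio


async def scrape_court(court_id: str) -> list[dict]:
    """Hypothetical stand-in for a real per-court scraper."""
    if court_id == "broken":
        raise RuntimeError("page layout changed")
    if court_id == "slow":
        await asyncio.sleep(10)  # simulates a court site that hangs
    return [{"court_id": court_id, "case_name": "United States v. Doe"}]


async def scrape_all(court_ids, timeout_s: float = 1.0):
    """Scrape courts one by one; a timeout or error in one court is logged, not fatal."""
    records, failures = [], {}
    for court_id in court_ids:
        try:
            rows = await asyncio.wait_for(scrape_court(court_id), timeout=timeout_s)
        except asyncio.TimeoutError:
            failures[court_id] = "timeout"
        except Exception as exc:  # any scraper bug stays contained to this court
            failures[court_id] = f"{type(exc).__name__}: {exc}"
        else:
            records.extend(rows)
    return records, failures


if __name__ == "__main__":
    recs, errs = asyncio.run(scrape_all(["scotus", "broken", "slow", "ca1"], timeout_s=0.1))
    print(len(recs), "records;", "failed courts:", sorted(errs))
```

Collecting failures in a dictionary rather than raising keeps the run going while still leaving a per-court record of what went wrong.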

Inputs

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| courtIds | string[] | ["scotus"] | Court short names (e.g. scotus, ca1, ny). Ignored when scrapeAll is true. |
| scrapeAll | boolean | false | If true, auto-discover and scrape every court module under the chosen documentType. |
| documentType | enum | opinions | opinions or oral_arguments. |
| maxOpinionsPerCourt | integer | 100 | Hard cap on records per court. |
| downloadDocuments | boolean | false | Fetch the PDF / audio and save it to the key-value store as <courtId>-<index>. |
| proxyConfiguration | object | Apify Proxy on | Strongly recommended. Many state court sites throttle or block repeat hits from a single IP. Defaults to Apify Proxy with the RESIDENTIAL group. Set useApifyProxy: false to disable. |
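
As an illustration of how these inputs fit together, the sketch below builds a run input from the documented fields and defaults. The helper names `build_run_input` and `start_run` are hypothetical, and the actor ID passed to `start_run` is a placeholder you would need to supply. `start_run` assumes the `apify-client` Python package is installed; `proxyConfiguration` is omitted so the Actor's documented default applies.

```python
def build_run_input(court_ids=None, scrape_all=False, document_type="opinions",
                    max_per_court=100, download_documents=False):
    """Assemble an input object using the field names and defaults from the Inputs table."""
    if document_type not in ("opinions", "oral_arguments"):
        raise ValueError("documentType must be 'opinions' or 'oral_arguments'")
    if max_per_court < 1:
        raise ValueError("maxOpinionsPerCourt must be a positive integer")
    return {
        "courtIds": list(court_ids or ["scotus"]),  # ignored by the Actor when scrapeAll is true
        "scrapeAll": scrape_all,
        "documentType": document_type,
        "maxOpinionsPerCourt": max_per_court,
        "downloadDocuments": download_documents,
    }


def start_run(token, actor_id, run_input):
    """Start the Actor and return its dataset items (requires the apify-client package)."""
    from apify_client import ApifyClient  # imported lazily so the builder works without it

    client = ApifyClient(token)
    run = client.actor(actor_id).call(run_input=run_input)
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


if __name__ == "__main__":
    print(build_run_input(["ca1", "ny"], max_per_court=25))
```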

Output sample

Each dataset record looks like:

{
  "court_id": "scotus",
  "document_type": "opinions",
  "scraped_at": "2026-04-26T12:00:00Z",
  "case_name": "United States v. Doe",
  "case_names": "United States v. Doe",
  "docket_number": "23-145",
  "docket_numbers": "23-145",
  "date_filed": "2026-04-15",
  "case_dates": "2026-04-15",
  "judge": "Roberts",
  "judges": "Roberts",
  "neutral_citation": "601 U.S. ___",
  "citations": "601 U.S. ___",
  "download_url": "https://www.supremecourt.gov/opinions/25pdf/23-145_8njq.pdf",
  "download_urls": "https://www.supremecourt.gov/opinions/25pdf/23-145_8njq.pdf",
  "precedential_statuses": "Published",
  "document_storage_key": "scotus-0"
}

When downloadDocuments is true, the file at download_url is saved to the default key-value store under the key recorded in document_storage_key (for example, scotus-0).
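
To connect dataset rows with their stored files, a consumer might index the records by storage key. The sketch below is one way to do that, assuming records shaped like the sample above; the helper name `documents_by_key` is hypothetical, and rows without a `document_storage_key` (for example when downloadDocuments is false) are assumed to simply lack the field or carry an empty value.

```python
def documents_by_key(records):
    """Map document_storage_key -> download_url for records that have a stored file."""
    stored = {}
    for record in records:
        key = record.get("document_storage_key")
        if key:  # skip rows scraped without a downloaded document
            stored[key] = record.get("download_url")
    return stored


if __name__ == "__main__":
    sample = [
        {
            "court_id": "scotus",
            "download_url": "https://www.supremecourt.gov/opinions/25pdf/23-145_8njq.pdf",
            "document_storage_key": "scotus-0",
        },
        {"court_id": "ca1", "download_url": "https://example.org/opinion.pdf"},
    ]
    print(documents_by_key(sample))
```

Each key in the result can then be used to look up the binary in the run's default key-value store.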

Why a proxy matters

Federal appellate courts (ca1-ca11, scotus) usually tolerate repeated hits from a single IP. Many state court portals (CA, NY, TX, FL and others) will throttle or block after a handful of requests from the same IP within a short window. Routing through Apify's residential proxy pool keeps runs reliable across the long tail of state courts. Datacenter proxies work for federal courts but get blocked by the strictest state portals.
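
A caller who wants to reserve residential proxies for runs that need them could choose the proxy settings from the court list. The sketch below is an assumption-laden illustration: `choose_proxy_configuration` is a hypothetical helper, the federal id set contains only the ids named above (scotus, ca1 to ca11), and the `apifyProxyGroups` key follows Apify's standard proxy input format, which this Actor is assumed to accept.

```python
# Court ids the text above describes as tolerant of repeated hits from one IP.
FEDERAL_IDS = {"scotus"} | {f"ca{n}" for n in range(1, 12)}


def choose_proxy_configuration(court_ids):
    """Use residential proxies unless every requested court is a listed federal court."""
    court_ids = list(court_ids)
    if court_ids and all(court_id in FEDERAL_IDS for court_id in court_ids):
        # No group specified: Apify Proxy falls back to its datacenter pool.
        return {"useApifyProxy": True}
    # Any state court (or an empty / unknown list) gets the residential pool.
    return {"useApifyProxy": True, "apifyProxyGroups": ["RESIDENTIAL"]}


if __name__ == "__main__":
    print(choose_proxy_configuration(["scotus", "ca9"]))
    print(choose_proxy_configuration(["scotus", "ny"]))
```

Defaulting to the residential group for anything not known to be federal errs on the side of reliability, which matches the recommendation above.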

Limitations

  • Court sites change frequently. Individual court scrapers break regularly when courts redesign their pages. Expect some courts to fail on any given run - failures are logged and the run continues.
  • Most court sites only expose the most recent batch (typically the last 10-100 opinions). This Actor does not perform historical backfill; use CourtListener for archival data.
  • Some courts (notably PACER and a handful of state portals) require authentication or JavaScript rendering and are not supported by this Actor configuration.
  • Georgia state court is not currently supported.