US Court Records Scraper - Party, Attorney, RECAP Docs

Pricing

$5.00 / 1,000 case records

Federal and state court records scraper via CourtListener. Party monitoring, attorney tracking, litigation portfolio analysis, recent filings alerts, RECAP document lookup. For law firms, debt collectors, insurance due diligence, journalists, M&A intel.

Developer

Seibs.co

Maintained by Community

Actor stats

Bookmarked: 0
Total users: 2
Monthly active users: 1
Last modified: 17 days ago
US Court Records Intel

TL;DR for law firms, journalists, debt collectors, and litigation analytics teams: Pulls US federal and state court dockets, case metadata, party lists, attorney names, document indexes, and document text from CourtListener and PACER-derived sources (RECAP), normalized across jurisdictions. Compared with the raw CourtListener API or PACER, you get one schema across federal and state courts, attorney and party name resolution, and document text in a single dataset. The free Apify plan's $5 platform credit covers exploratory case-search runs. Bring your own CourtListener API key for higher rate limits, and upgrade to Apify Starter ($49/mo) for production volume.

Run it in 30 seconds

# Via the Apify Python API client (pip install apify-client)
from apify_client import ApifyClient

client = ApifyClient("<YOUR_APIFY_TOKEN>")

# Start the actor and wait for the run to finish
run = client.actor("seibs.co/court-records-intel").call(run_input={
    "mode": "case_search",
    "courts": ["scotus", "ca9"],
    "date_from": "2026-04-01",
    "max_results": 50,
})

# Stream the resulting records from the run's default dataset
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

Or via curl:

curl -X POST "https://api.apify.com/v2/acts/seibs.co~court-records-intel/run-sync-get-dataset-items?token=<YOUR_APIFY_TOKEN>" \
-H "Content-Type: application/json" \
-d '{"mode": "case_search", "courts": ["scotus", "ca9"], "date_from": "2026-04-01", "max_results": 50}'

Or click "Try for free" on this page if you prefer the no-code UI.

What you get

Each run produces:

  • A clean dataset, filterable in the Apify console and downloadable as CSV or JSON
  • An OUTPUT.html dashboard preview of your top records
  • A sample-output preview at ./.actor/sample-output.json

Additional CSV artifacts shipped with this actor:

  • cases.csv (normalized docket metadata)
  • parties.csv (party-to-case mapping for conflict checks)
  • documents.csv (document index with text inclusion flags)

What does US Court Records Intel do?

It hits the CourtListener REST API and the RECAP archive to return actionable case intelligence rather than raw search dumps - per-party rollups, attorney case-load by year, corporate-entity litigation portfolios with settlement-ratio estimates, recent-filings alerts since a date, and metadata + optional plain text for documents already archived in RECAP.

AI / RAG / Agent

Federal and state court records prepared for legal AI agents and paralegal LLMs. RECAP document text is returned in clean plain-text ready to embed, and party / attorney / portfolio rollups give an agent the structured context it needs to answer "what has this entity been sued for" without scraping PACER. Compatible with LangChain, LlamaIndex, Pinecone, Weaviate, Chroma, and MCP-aware agent runtimes (Claude Desktop, GPT, custom).

from apify_client import ApifyClient
from langchain_core.documents import Document
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_community.vectorstores import Chroma
from langchain_openai import OpenAIEmbeddings

client = ApifyClient("<YOUR_APIFY_TOKEN>")

# Input and output field names follow the Input and Output sections of this README
run = client.actor("seibs.co/court-records-intel").call(run_input={
    "mode": "litigation_portfolio",
    "party_names": ["Acme Corp"],
    "include_documents": True,
    "include_document_text": True,
})

splitter = RecursiveCharacterTextSplitter(chunk_size=1500, chunk_overlap=150)

# One LangChain Document per RECAP document that has extracted text
raw = [
    Document(
        page_content=doc["text"],
        metadata={
            "case_name": item.get("case_name"),
            "court": item.get("court"),
            "docket_number": item.get("docket_number"),
            "filed": item.get("filing_date"),
            "judge": item.get("assigned_to_judge"),
            "doc_id": doc.get("doc_id"),
        },
    )
    for item in client.dataset(run["defaultDatasetId"]).iterate_items()
    # Rollup records carry null recap_documents, so fall back to an empty list
    for doc in (item.get("recap_documents") or [])
    if doc.get("text")
]

# Chunk, embed, and store in a local Chroma collection
Chroma.from_documents(
    splitter.split_documents(raw),
    OpenAIEmbeddings(),
    collection_name="litigation-rag",
)

Features

  • Party monitoring - feed in a person or company; get every case they appear in with role, status, jurisdiction, judge, filing + termination dates.
  • Litigation portfolio analysis - corporate-entity rollup: active vs closed counts, settlement-ratio estimate, average case duration, top jurisdictions, top judges, top opposing counsel.
  • Attorney tracking - case load by year, derived practice areas, frequency of opposing counsel.
  • Recent filings monitor - new dockets in specified courts / by specified parties / matching keywords since a date.
  • RECAP document lookup - URLs + metadata + optional extracted plain text for documents already in the free RECAP archive.
  • Patent-litigation crossover (schema-only stub) - reserved field for a future USPTO assignee cross-reference.

Use cases

  • Law firms - competitive intel on opposing firms, conflict-check pre-screen, attorney book-of-business audits.
  • Debt collection - find every active case where a target is a defendant.
  • Insurance and M&A due diligence - litigation-portfolio rollups for a target entity in one shot.
  • Litigation funders - filter by nature of suit + entity to surface fundable cases.
  • Investigative journalists - track filing patterns, cluster by judge / firm / party.
  • Competitive intel - monitor patent-litigation hot courts (E.D. Tex.) or securities-litigation hot courts (S.D.N.Y.).

FAQ

Q: Is this legal? A: Yes. Federal court dockets and documents are public records. The actor pulls primarily from RECAP (a free, donation-funded mirror of PACER documents) and CourtListener's open API. No PACER credentials are required and no PACER fees are incurred for documents already in RECAP.

Q: What's RECAP? A: RECAP is a free public archive of federal court documents hosted by the Free Law Project. When PACER users install the RECAP browser extension, every document they pay PACER to download is also uploaded to RECAP for free public access. RECAP now mirrors millions of dockets and tens of millions of documents from all 94 federal districts, 13 circuit courts, the bankruptcy courts, and SCOTUS. This actor reads RECAP via CourtListener's API so you get PACER coverage without PACER bills.

Q: Why might a run fail? A: The three most common failure modes are (1) CourtListener API rate-limiting on very wide party / keyword searches (the actor handles backoff), (2) a target document that exists in PACER but has not yet been mirrored to RECAP - it returns a "not in RECAP" sentinel rather than crashing the run, and (3) ambiguous party names that match dozens of unrelated entities - tighten by court, date window, or nature-of-suit code.
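The second failure mode can be handled downstream by checking the `available` and `reason` fields that appear in the sample output. The following is a minimal sketch, assuming that unavailable items are emitted as records with `available` set to false and a short `reason`; the exact sentinel shape is an assumption, and the second case number is made up for illustration.

```python
from typing import Iterable


def split_by_availability(records: Iterable[dict]) -> tuple[list[dict], list[dict]]:
    """Separate usable records from 'not in RECAP' style sentinels."""
    usable, missing = [], []
    for record in records:
        # Assumption: sentinels carry available=False; records that lack
        # the field entirely are treated as usable.
        if record.get("available", True):
            usable.append(record)
        else:
            missing.append(record)
    return usable, missing


# Hypothetical records for illustration only
sample = [
    {"case_number": "3:26-cv-01882", "available": True, "reason": None},
    {"case_number": "1:25-cv-00001", "available": False, "reason": "not in RECAP"},
]
usable, missing = split_by_availability(sample)
```

Keeping the sentinels in a separate list makes it easy to retry those lookups on a later run, once RECAP may have caught up.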

Q: How fresh is the data? A: RECAP coverage of new filings depends on PACER users with the RECAP extension - hot dockets (high-profile cases, securities litigation, IP cases) are mirrored within hours, while obscure cases may take days or weeks. Docket metadata via CourtListener's PACER-RSS integration is typically same-day for participating courts.

Q: Can I schedule this daily or weekly? A: Yes. Daily cron is appropriate for recent_filings and attorney_tracking (new dockets land daily). Weekly is fine for litigation_portfolio and party_search (rollup reports). Apify Schedules + dedupe on docket_id gives clean deltas.
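The dedupe step mentioned above could look roughly like this. It is a sketch, not part of the actor: `fresh_records` is a hypothetical helper, and the key name is a parameter because the answer above refers to `docket_id` while the sample output below exposes `case_id`.

```python
def fresh_records(records, seen_ids, key="docket_id"):
    """Return records whose ID has not been seen before; update seen_ids in place."""
    fresh = []
    for record in records:
        record_id = record.get(key)
        if record_id is not None and record_id in seen_ids:
            continue  # already delivered by an earlier run (or earlier in this one)
        if record_id is not None:
            seen_ids.add(record_id)
        # Records without an ID cannot be deduplicated, so they pass through
        fresh.append(record)
    return fresh


seen = {1}  # IDs delivered by previous scheduled runs
todays_run = [{"docket_id": 1}, {"docket_id": 2}, {"docket_id": 2}]
delta = fresh_records(todays_run, seen)
```

In practice `seen` would be persisted between runs, for example in a key-value store or a database table.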

Q: Does it integrate with my CRM or data tooling? A: Yes. Apify webhooks POST every run to HubSpot, Salesforce, Pipedrive, Apollo, Slack, or any HTTP endpoint. Zapier, Make.com, and n8n templates available. Dataset exports as JSON / CSV / Excel / XML for litigation-management systems (LegalTracker, Mitratech, Tymetrix), conflict-check workflows, or BI tools.

Q: How does pricing work? A: PAY_PER_EVENT. You pay per case / docket record emitted, with surcharges on per-document document_lookup events and per-attorney / per-portfolio rollups. You only pay for what the actor actually emits.

Related actors

  • sec-edgar-intel - cross-reference securities-litigation defendants against EDGAR filings (10-K risk factors, 8-K disclosures, restatements) for full due-diligence context.
  • uspto-patent-intel - pull the patent portfolios of parties in E.D. Tex. / D. Del. / N.D. Cal. patent-litigation cases to map the asserted IP.
  • b2b-sales-triggers - convert new lawsuits, judgments, and litigation-portfolio events into outbound sales triggers for legal-tech and insurance teams.

Integrations

  • Zapier - push to HubSpot/Salesforce/Pipedrive/Apollo
  • Make.com - workflow automation
  • n8n - self-hosted automation
  • Apify webhooks - POST to your endpoint
  • API + dataset export (JSON/CSV/Excel/XML)
  • MCP / AI agents - call from Claude/GPT/LangChain

Modes

| Mode | Required inputs | Emits |
| --- | --- | --- |
| party_search | party_names | case records + per-party rollup |
| case_search | case_numbers or keywords or courts | case records |
| attorney_tracking | attorney_names | case records + attorney rollup |
| litigation_portfolio | party_names (corporate) | case records + portfolio rollup |
| recent_filings | optional courts, party_names, keywords | case records since date_from |
| document_lookup | case_numbers | one record per RECAP document |
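The mode table can be turned into a quick client-side check before starting a run. This is a sketch under the assumption that the table is complete; `validate_input` is a hypothetical helper, and the actor's own validation (see .actor/INPUT_SCHEMA.json) remains authoritative.

```python
# Mode -> input fields of which at least one must be present (from the table above)
REQUIRED_ANY = {
    "party_search": ["party_names"],
    "case_search": ["case_numbers", "keywords", "courts"],
    "attorney_tracking": ["attorney_names"],
    "litigation_portfolio": ["party_names"],
    "recent_filings": [],  # every filter is optional in this mode
    "document_lookup": ["case_numbers"],
}


def validate_input(run_input: dict) -> list[str]:
    """Return a list of problems; an empty list means the input looks valid."""
    mode = run_input.get("mode")
    if mode not in REQUIRED_ANY:
        return [f"unknown mode: {mode!r}"]
    candidates = REQUIRED_ANY[mode]
    if candidates and not any(run_input.get(name) for name in candidates):
        return [f"{mode} needs at least one of: {', '.join(candidates)}"]
    return []


print(validate_input({"mode": "case_search", "courts": ["scotus", "ca9"]}))
print(validate_input({"mode": "party_search"}))
```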

Input

See .actor/INPUT_SCHEMA.json. Sample - litigation portfolio for a corporate entity:

{
  "mode": "litigation_portfolio",
  "courtlistener_api_key": "YOUR_KEY",
  "party_names": ["Apple Inc."],
  "date_from": "2018-01-01",
  "include_documents": false,
  "max_results": 500
}

A free CourtListener API key raises the rate limit from ~60 req/min to ~5000/hr. Sign up at https://www.courtlistener.com/help/api/rest/.

Output

Sample output: ./.actor/sample-output.json - copy-paste-ready preview of real-looking records.

First record inline:

{
  "record_type": "case",
  "mode": "case_lookup",
  "source": "courtlistener",
  "scraped_at": "2026-05-13T22:01:14Z",
  "available": true,
  "reason": null,
  "case_id": 4488210,
  "case_name": "Aurora Signal Holdings, LLC v. Helion Compute, Inc.",
  "case_name_short": "Aurora Signal v. Helion Compute",
  "court": "United States District Court for the Northern District of California",
  "court_short_name": "N.D. Cal.",
  "jurisdiction": "federal",
  "case_number": "3:26-cv-01882",
  "docket_number": "3:26-cv-01882-VC",
  "nature_of_suit": "830 Patent",
  "cause": "35:271 Patent Infringement",
  "filing_date": "2026-03-18",
  "terminated_date": null,
  "status": "active",
  "assigned_to_judge": "Hon. Vince Chhabria",
  "referred_to_judge": "Hon. Sallie Kim (Magistrate)",
  "parties": [
    {
      "name": "Aurora Signal Holdings, LLC",
      "role": "plaintiff",
      "attorneys": [
        "Jessica Vega-Roth",
        "Marcus L. Tan"
      ]
    },
    {
      "name": "Helion Compute, Inc.",
      "role": "defendant",
      "attorneys": [
        "Priya Sundararajan",
        "Connor Whelan"
      ]
    }
  ],
  "attorneys": [
    {
      "name": "Jessica Vega-Roth",
      "firm": "Vega-Roth IP PLLC",
      "role": "for plaintiff"
    },
    {
      "name": "Marcus L. Tan",
      "firm": "Vega-Roth IP PLLC",
      "role": "for plaintiff"
    },
    {
      "name": "Priya Sundararajan",
      "firm": "Wilson Sonsini Goodrich & Rosati",
      "role": "for defendant"
    },
    {
      "name": "Connor Whelan",
      "firm": "Wilson Sonsini Goodrich & Rosati",
      "role": "for defendant"
    }
  ],
  "recent_docket_entries": [
    {
      "entry_number": 18,
      "date_filed": "2026-05-09",
      "description": "ORDER granting in part and denying in part defendant's motion to stay pending IPR. Signed by Judge Chhabria."
    },
    {
      "entry_number": 17,
      "date_filed": "2026-04-28",
      "description": "REPLY in support of motion to stay (Helion Compute, Inc.)"
    },
    {
      "entry_number": 16,
      "date_filed": "2026-04-21",
      "description": "OPPOSITION to motion to stay filed by Aurora Signal Holdings"
    },
    {
      "entry_number": 15,
      "date_filed": "2026-04-07",
      "description": "MOTION to stay pending Inter Partes Review (Helion Compute, Inc.)"
    }
  ],
  "tags": [
    "patent",
    "ipr",
    "stay_motion"
  ],
  "recap_documents": [
    {
      "doc_id": 31182144,
      "description": "Order on motion to stay",
      "url": "https://www.courtlistener.com/recap/3-26-cv-01882/order-stay.pdf",
      "page_count": 14,
      "filed_date": "2026-05-09",
      "text": null
    }
  ],
  "party_name": null,
  "total_cases": null,
  "active_cases": null,
  "closed_cases": null,
  "cases_as_plaintiff_count": null,
  "cases_as_defendant_count": null,
  "top_courts": null,
  "sample_cases": null,
  "attorney_name": null,
  "firm": null,
  "case_count_year": null,
  "top_practice_areas": null,
  "opposing_attorneys": null,
  "entity_name": null,
  "settlement_ratio_estimate": null,
  "avg_case_duration_days": null,
  "jurisdictions": null,
  "top_judges": null,
  "top_opposing_counsel": null,
  "patent_crossover_flag": null
}

Sample case record:

{
  "record_type": "case",
  "mode": "party_search",
  "source": "courtlistener",
  "case_id": 64902131,
  "case_name": "Apple Inc. v. Samsung Electronics Co., Ltd.",
  "court": "United States District Court for the Northern District of California",
  "court_short_name": "cand",
  "jurisdiction": "federal",
  "case_number": "5:11-cv-01846",
  "nature_of_suit": "830 Patent",
  "filing_date": "2011-04-15",
  "terminated_date": "2018-12-27",
  "status": "settled",
  "assigned_to_judge": "Lucy H. Koh",
  "parties": [
    {"name": "Apple Inc.", "role": "plaintiff", "attorneys": ["..."]},
    {"name": "Samsung Electronics Co., Ltd.", "role": "defendant", "attorneys": ["..."]}
  ],
  "tags": ["patent"],
  "available": true,
  "scraped_at": "2026-05-14T12:00:00Z"
}
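For conflict checks it is often handier to have one row per party than one nested record per case. The sketch below flattens a case record into such rows using only the fields shown in the samples above; the output column names are illustrative and are not guaranteed to match the shipped parties.csv.

```python
def party_rows(case: dict) -> list[dict]:
    """Flatten one case record into one row per party."""
    rows = []
    for party in case.get("parties") or []:  # rollup records may carry null here
        rows.append({
            "case_number": case.get("case_number"),
            "case_name": case.get("case_name"),
            "party_name": party.get("name"),
            "role": party.get("role"),
            "attorneys": "; ".join(party.get("attorneys") or []),
        })
    return rows


case = {
    "case_name": "Apple Inc. v. Samsung Electronics Co., Ltd.",
    "case_number": "5:11-cv-01846",
    "parties": [
        {"name": "Apple Inc.", "role": "plaintiff", "attorneys": ["..."]},
        {"name": "Samsung Electronics Co., Ltd.", "role": "defendant", "attorneys": ["..."]},
    ],
}
rows = party_rows(case)
```

The resulting list of dicts can be written out with `csv.DictWriter` or loaded into a DataFrame.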

Pricing

Pay-per-event:

| Event | Price |
| --- | --- |
| case_record | $0.005 |
| document_record | $0.003 |
| document_text_charge | $0.010 |
| intelligence_record | $0.015 |

A typical party_search for one company that returns 50 cases costs 50 * $0.005 + 1 * $0.015 = $0.265.
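The same arithmetic can be scripted to budget a run in advance. This sketch hard-codes the event prices from the table above and assumes you can estimate the number of events per type; `estimate_cost` is a hypothetical helper, not an Apify API.

```python
from decimal import Decimal

# Event prices in USD, copied from the pricing table above
PRICES = {
    "case_record": Decimal("0.005"),
    "document_record": Decimal("0.003"),
    "document_text_charge": Decimal("0.010"),
    "intelligence_record": Decimal("0.015"),
}


def estimate_cost(event_counts: dict[str, int]) -> Decimal:
    """Sum price * count over all expected events."""
    return sum(
        (PRICES[event] * count for event, count in event_counts.items()),
        Decimal("0"),
    )


# The party_search example above: 50 case records plus 1 rollup record
print(estimate_cost({"case_record": 50, "intelligence_record": 1}))  # 0.265
```

`Decimal` is used instead of floats so that fractions of a cent add up exactly.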

FAQ

Q: Do I need a CourtListener API key? A: Optional but strongly recommended. Anonymous use is rate-limited to ~60 req/min; with a free key you get ~5000/hr.

Q: Does this hit PACER directly? A: No - PACER is paywalled and out of scope. We use only what RECAP has already archived. Coverage is excellent for federal district / appellate, thinner for state and bankruptcy.

Q: How accurate is settlement_ratio_estimate? A: It's a heuristic - we infer settlement from termination text ("stipulation of dismissal", "settle"), trial verdict from "judgment" / "verdict", dismissal from "dismiss". Directionally useful, not authoritative.
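To make the heuristic concrete, here is an illustrative keyword classifier built from the phrases quoted above. It is a sketch of the idea, not the actor's implementation: the check order, the "unknown" bucket, and the choice to divide by classified cases only are all assumptions.

```python
from typing import Optional


def classify_outcome(termination_text: str) -> str:
    """Map free-text termination language to a coarse outcome label."""
    text = termination_text.lower()
    # Settlement is checked first because "stipulation of dismissal"
    # also contains "dismiss".
    if "stipulation of dismissal" in text or "settle" in text:
        return "settled"
    if "judgment" in text or "verdict" in text:
        return "verdict"
    if "dismiss" in text:
        return "dismissed"
    return "unknown"


def settlement_ratio(termination_texts: list[str]) -> Optional[float]:
    """Share of classified cases that look settled; None if nothing classifies."""
    outcomes = [classify_outcome(t) for t in termination_texts]
    known = [o for o in outcomes if o != "unknown"]
    if not known:
        return None
    return known.count("settled") / len(known)


ratio = settlement_ratio(["Case settled", "Jury verdict for plaintiff"])
```

Simple substring matching misreads edge cases (a "judgment dismissing the action" counts as a verdict here), which is why the estimate is directional only.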

Q: Can I get document text? A: Set include_document_text=true. Text is only available where CourtListener has already OCRed / extracted the PDF; we do not run OCR ourselves.

Save your input as an Apify Task

Apify Tasks let you save a configured input once and re-run it with a single click - no need to re-type party names, courts, date ranges, or other filters every time. Tasks are the foundation for everything that comes next: schedules, monitor mode, and webhook routing all attach to a saved Task, not to the raw actor.

Steps to save your current input as a Task:

  1. On this actor's Apify Store page, click Run with your input fully configured.
  2. Click the Save as task button at the top of the run page.
  3. Name the task something memorable (e.g. Federal lawsuits filed against Acme Corp - daily).
  4. Reload the task page and click Start anytime to re-run with the same inputs.

Tasks unlock the next two features below: scheduling and monitor mode.

Run this weekly with Apify Schedules

Apify Schedules cron-run any saved Task automatically. Pair this with the saved Task above and you get hands-off recurring runs with no manual clicks, no missed weeks, and a steady stream of fresh data into your CRM or warehouse.

Steps to schedule a Task:

  1. Save your input as a Task (see above).
  2. Go to https://console.apify.com/schedules and click Create new schedule.
  3. Pick your Task and set the cron expression. Common patterns:
    • Daily at 9am UTC: 0 9 * * *
    • Weekly on Mondays at 9am UTC: 0 9 * * 1
    • Monthly on the 1st at 9am UTC: 0 9 1 * *
  4. Save. Apify will run your Task on that schedule automatically, push the dataset to whatever integrations you have wired up, and fire run-completion webhooks for downstream automation.
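To sanity-check a cron expression before saving it, a tiny matcher is enough for the patterns listed above. This sketch handles only plain numbers and `*` in the five standard fields (no ranges, lists, or steps) and is not how Apify evaluates schedules; `cron_matches` is a hypothetical helper.

```python
from datetime import datetime, timezone


def cron_matches(expr: str, when: datetime) -> bool:
    """Check a 5-field cron expression that uses only numbers and '*'."""
    fields = expr.split()  # minute, hour, day of month, month, day of week
    if len(fields) != 5:
        raise ValueError("expected 5 cron fields")
    # cron counts weekdays from Sunday=0, Python from Monday=0
    cron_weekday = (when.weekday() + 1) % 7
    actual = [when.minute, when.hour, when.day, when.month, cron_weekday]
    return all(f == "*" or int(f) == value for f, value in zip(fields, actual))


monday_first = datetime(2026, 6, 1, 9, 0, tzinfo=timezone.utc)  # a Monday
tuesday = datetime(2026, 6, 2, 9, 0, tzinfo=timezone.utc)

print(cron_matches("0 9 * * *", tuesday))       # daily pattern
print(cron_matches("0 9 * * 1", monday_first))  # weekly pattern
print(cron_matches("0 9 1 * *", monday_first))  # monthly pattern
```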

Run daily to monitor for new filings, judgments, and party mentions across the dockets you track.

Monitor mode (v2, beta)

Monitor mode is the v2 evolution of this actor and is currently in BETA. It turns a recurring schedule into a true change-feed instead of a firehose of duplicate records.

How it works:

  • When this actor runs under an Apify Schedule, monitor mode is enabled automatically.
  • Instead of emitting ALL records every run, it emits ONLY records that are NEW or CHANGED since the last scheduled run.
  • A digest record summarizes the delta (X new, Y changed, Z removed) at the top of every run.
  • Optional: provide a Slack or email webhook URL in the monitor_webhook_url input field and the digest fires there too, so your team gets the delta in their inbox or channel without polling the dataset.
  • Cost: a single scheduled_delta_run event ($0.05) per scheduled run, plus standard PPE on emitted delta records only. Predictable monthly cost, no surprise bills from re-charging for unchanged records.
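The change-feed idea can be sketched as a comparison of two runs keyed by record ID. This is an illustration under stated assumptions, not the actor's implementation: records are keyed by `case_id`, `scraped_at` is ignored because it changes on every run, and emitted delta records are assumed to be billed at the `case_record` price from the Pricing section.

```python
from decimal import Decimal

VOLATILE_FIELDS = {"scraped_at"}  # changes every run, so it must not count as a change


def _stable(record: dict) -> dict:
    return {k: v for k, v in record.items() if k not in VOLATILE_FIELDS}


def compute_delta(previous: dict, current: dict) -> dict:
    """previous/current map a record ID to that run's record."""
    new = [rid for rid in current if rid not in previous]
    removed = [rid for rid in previous if rid not in current]
    changed = [
        rid for rid in current
        if rid in previous and _stable(current[rid]) != _stable(previous[rid])
    ]
    return {"new": new, "changed": changed, "removed": removed}


def scheduled_run_cost(delta: dict) -> Decimal:
    """$0.05 per scheduled run plus $0.005 per emitted (new or changed) record."""
    emitted = len(delta["new"]) + len(delta["changed"])
    return Decimal("0.05") + Decimal("0.005") * emitted


last_run = {1: {"status": "active", "scraped_at": "a"}, 2: {"status": "active", "scraped_at": "a"}}
this_run = {2: {"status": "settled", "scraped_at": "b"}, 3: {"status": "active", "scraped_at": "b"}}
delta = compute_delta(last_run, this_run)
cost = scheduled_run_cost(delta)
```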

Monitor mode is rolling out first to three actors: hotel-motel-lead-finder, google-maps-reviews-pro, and mcp-accounting-firm-leads. This actor is not one of them yet; full portfolio coverage is planned by the end of June.

Support

Open an issue on the actor's GitHub or email via the Apify Store contact link. Include the run ID and input config.

Changelog

See ./CHANGELOG.md.

Found this useful?

If this actor saved you time or money, please consider leaving a quick review on the Apify Store. Reviews help other buyers find work that solves their problem and let me prioritize the features paying customers actually use. Leave a review: https://apify.com/seibs.co/court-records-intel#reviews