# CourtListener Scraper: Opinions, Dockets & Full Text (`autofacts/courtlistener-scraper`) Actor

Scrape CourtListener US court opinions with full text, dockets, oral arguments, judges, citations, and RAG-ready chunks. Includes a private-build token pool and optional per-run token override.

- **URL**: https://apify.com/autofacts/courtlistener-scraper.md
- **Developed by:** [Richard Feng](https://apify.com/autofacts) (community)
- **Categories:** Business, AI, Education
- **Stats:** 2 total users, 1 monthly users, 0.0% runs succeeded, 0 bookmarks
- **User rating**: No ratings yet

## Pricing

From $5.00 / 1,000 opinions with full text

This Actor is paid per event. You are not charged for Apify platform usage; you pay only a fixed price for specific events.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action that can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## CourtListener Scraper: Opinions, Dockets & Full Text

Scrape US case law from [CourtListener](https://www.courtlistener.com) (Free Law Project): court **opinions with complete full text** — not just the search snippets every other scraper returns — plus dockets, oral arguments, judges and citation lookups. Built to survive CourtListener's 2026 rate-limit changes.

### Why this one

- **Full opinion text, not snippets.** Competing Actors return the ~300-character search snippet. This Actor follows each result into the database endpoint and returns the complete opinion (`html_with_citations`), cleaned and chunked for RAG.
- **Rate-limit aware.** In May 2026 CourtListener cut the free tier to 5/min, 50/hr, 125/day. This Actor paces itself to your tier and handles throttling responses in both formats CourtListener uses (DRF text and the citation-lookup `wait_until` JSON), backing off precisely instead of failing.
- **Token-aware private build.** Private builds can embed a CourtListener token in code, and `apiToken` remains available as a per-run override.
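
The back-off behavior described above can be pictured with a minimal Python sketch. It is not the Actor's actual code: `seconds_to_wait` is a hypothetical helper, the text format follows Django REST Framework's default throttle message ("Expected available in N seconds."), and the `wait_until` value is assumed to be an ISO-8601 timestamp.

```python
import json
import re
from datetime import datetime, timezone

# DRF's default throttle detail ends with "Expected available in N second(s)."
_DRF_WAIT = re.compile(r"Expected available in (\d+) seconds?")


def seconds_to_wait(body: str, now: datetime, default: float = 60.0) -> float:
    """Return how long to sleep after a throttled (HTTP 429) response body.

    `now` must be timezone-aware. `default` is used when neither format is recognized.
    """
    try:
        payload = json.loads(body)
    except ValueError:
        payload = None  # not JSON: treat the body as plain DRF text

    if isinstance(payload, dict):
        wait_until = payload.get("wait_until")
        if isinstance(wait_until, str):
            # Assumption: `wait_until` is an ISO-8601 timestamp.
            try:
                deadline = datetime.fromisoformat(wait_until.replace("Z", "+00:00"))
            except ValueError:
                return default
            if deadline.tzinfo is None:
                deadline = deadline.replace(tzinfo=timezone.utc)
            return max(0.0, (deadline - now).total_seconds())
        body = str(payload.get("detail", ""))

    match = _DRF_WAIT.search(body)
    return float(match.group(1)) if match else default
```

A caller would sleep for the returned number of seconds and then retry the same request, instead of failing the run.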

### Features

| Feature | Description |
|---------|-------------|
| ⚖️ Opinions | Case-law search with full opinion text + RAG chunks |
| 📁 Dockets / RECAP | Federal dockets and RECAP documents with parties, judges, nature of suit |
| 🎧 Oral arguments | Argument audio metadata with download URLs |
| 👤 Judges | Judge profiles (the "people" database) |
| 🔎 Citation lookup | Resolve citation strings (e.g. `576 U.S. 644`) to cases |

### Quick Start

```json
{
    "apiToken": "OPTIONAL_COURTLISTENER_TOKEN_OVERRIDE",
    "searchType": "opinions",
    "query": "qualified immunity",
    "court": "scotus",
    "includeFullText": true,
    "maxItems": 50
}
````

### Input

| Field | Type | Description |
|-------|------|-------------|
| `apiToken` | string (secret) | Optional free token override from [courtlistener.com/profile/apikeys](https://www.courtlistener.com/profile/apikeys/). Private builds can embed fallback tokens in code |
| `searchType` | string | `opinions`, `dockets`, `recap_docs`, `oral_arguments`, `judges`, `citation` |
| `query` | string | Full-text query with boolean operators (AND/OR/NOT) and field prefixes |
| `citations` | array | Citation strings to resolve (when `searchType` is `citation`) |
| `court` | string | Court ID, e.g. `scotus`, `ca9`, `nyed` |
| `dateFrom` / `dateTo` | string | Filing-date range (YYYY-MM-DD) |
| `orderBy` | string | `score desc`, `dateFiled desc`, `citeCount desc`… |
| `includeFullText` | boolean | Fetch complete opinion text + chunks (opinions only; default true) |
| `chunking` | string | `paragraph` (~2000 chars) or `none` |
| `requestsPerMinute` | integer | Throttle to your tier — keep at 4 for the free tier, raise it with a membership |
| `maxItems` | integer | Max items to save (default 100) |

### Output

```json
{
    "itemType": "legal",
    "searchType": "opinions",
    "id": "10380001",
    "title": "Climate United Fund v. Citibank, N.A.",
    "court": "Court of Appeals for the D.C. Circuit",
    "date": "2025-04-16",
    "url": "https://www.courtlistener.com/opinion/10380001/...",
    "citations": [],
    "citeCount": 0,
    "fullText": "...",
    "chunks": [{ "text": "...", "order": 0 }],
    "meta": { "courtId": "cadc", "clusterId": 10380001 }
}
```
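
To make the `chunks` field concrete, here is one way `paragraph` chunking could work. This is a sketch under assumptions, not the Actor's real splitter: it assumes paragraphs are separated by blank lines and packs them greedily up to roughly 2000 characters. `chunk_paragraphs` is a hypothetical name.

```python
def chunk_paragraphs(text: str, target_chars: int = 2000) -> list[dict]:
    """Greedily pack blank-line-separated paragraphs into chunks of about target_chars."""
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    chunks: list[str] = []
    current = ""
    for para in paragraphs:
        candidate = f"{current}\n\n{para}" if current else para
        if current and len(candidate) > target_chars:
            # Adding this paragraph would overshoot: close the chunk and start a new one.
            chunks.append(current)
            current = para
        else:
            current = candidate
    if current:
        chunks.append(current)
    # Same shape as the `chunks` entries in the output example: text plus position.
    return [{"text": chunk, "order": i} for i, chunk in enumerate(chunks)]
```

Note that a single paragraph longer than `target_chars` is kept whole in this sketch, so chunk sizes are approximate.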

### Recipes

#### 1. Build a case-law RAG corpus

Pull a court's opinions with full text, chunked for embeddings:

```json
{
    "apiToken": "OPTIONAL_TOKEN_OVERRIDE",
    "searchType": "opinions",
    "query": "first amendment retaliation",
    "court": "ca9",
    "includeFullText": true,
    "chunking": "paragraph",
    "maxItems": 200
}
```
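
Once the run finishes, the dataset items can be flattened into one record per chunk for an embedding pipeline. The sketch below is illustrative: `to_embedding_records` and `fetch_items` are hypothetical helpers, and the record shape (`id`, `text`, `metadata`) is an assumption to adapt to your vector store. The client calls mirror the Python example in the API section.

```python
def to_embedding_records(items):
    """Flatten Actor dataset items into one record per chunk."""
    records = []
    for item in items:
        for chunk in item.get("chunks") or []:
            records.append({
                # Assumed ID scheme: opinion id plus chunk position.
                "id": f'{item["id"]}-{chunk["order"]}',
                "text": chunk["text"],
                "metadata": {
                    "title": item.get("title"),
                    "court": item.get("court"),
                    "date": item.get("date"),
                    "url": item.get("url"),
                },
            })
    return records


def fetch_items(apify_token, run_input):
    """Run the Actor and return its dataset items (needs `pip install apify-client`)."""
    # Imported lazily so to_embedding_records works without the client installed.
    from apify_client import ApifyClient

    client = ApifyClient(apify_token)
    run = client.actor("autofacts/courtlistener-scraper").call(run_input=run_input)
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())
```

Calling `to_embedding_records(fetch_items("<YOUR_API_TOKEN>", run_input))` with the input above would yield records ready to embed.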

#### 2. Resolve a brief's citations

Turn a list of citations into linked cases:

```json
{
    "apiToken": "OPTIONAL_TOKEN_OVERRIDE",
    "searchType": "citation",
    "citations": ["576 U.S. 644", "410 U.S. 113", "347 U.S. 483"]
}
```
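
The input schema notes that `citations` accepts up to 250 entries. For a longer brief, one option is to split the list into several runs. This sketch assumes the limit applies per run; `batch_citations` and `citation_inputs` are hypothetical helpers.

```python
def batch_citations(citations, batch_size=250):
    """Deduplicate citations (keeping first-seen order) and split them into batches."""
    cleaned = (c.strip() for c in citations)
    unique = list(dict.fromkeys(c for c in cleaned if c))
    return [unique[i:i + batch_size] for i in range(0, len(unique), batch_size)]


def citation_inputs(citations):
    """Build one Actor input object per batch of citations."""
    return [
        {"searchType": "citation", "citations": batch}
        for batch in batch_citations(citations)
    ]
```

Deduplicating first avoids paying for the same citation twice, since each saved item is charged as one event.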

#### 3. Track a court's docket activity

```json
{
    "apiToken": "OPTIONAL_TOKEN_OVERRIDE",
    "searchType": "dockets",
    "query": "antitrust",
    "court": "nysd",
    "dateFrom": "2026-01-01",
    "maxItems": 100
}
```

### Pricing

Pay-per-event: **$0.005 per opinion** (with full text), **$0.002 per** docket / oral argument / judge / citation. A 1,000-opinion corpus costs **$5.00**.
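
As a quick check of the arithmetic, a small estimator using only the prices listed above. `estimate_cost` is a hypothetical helper; `recap_docs` is left out because this README does not state a price for it.

```python
from decimal import Decimal

# Per-event prices in USD, as listed in the Pricing section above.
PRICE_PER_EVENT = {
    "opinions": Decimal("0.005"),
    "dockets": Decimal("0.002"),
    "oral_arguments": Decimal("0.002"),
    "judges": Decimal("0.002"),
    "citation": Decimal("0.002"),
}


def estimate_cost(search_type: str, items: int) -> Decimal:
    """Estimate the charge in USD for `items` saved results of one search type."""
    if search_type not in PRICE_PER_EVENT:
        raise ValueError(f"No listed price for search type: {search_type!r}")
    return PRICE_PER_EVENT[search_type] * items
```

For example, `estimate_cost("opinions", 1000)` gives 5 dollars, matching the figure above.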

### FAQ

**Do I need a token?**

Yes. CourtListener requires a token. Private builds can embed a fallback token in code; `apiToken` lets callers use their own token and rate budget. Tokens are free at [courtlistener.com/profile/apikeys](https://www.courtlistener.com/profile/apikeys/). Researchers can request an EDU membership for higher limits.

**How fast can it go?**

As fast as your tier allows. The free tier has allowed 5 requests/minute since May 2026. Set `requestsPerMinute` to 4 and the Actor paces itself and backs off on throttling. With a membership, raise it.
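
For a rough sense of what a run costs in requests, here is a back-of-the-envelope planner. It is a sketch with loud assumptions: the search page size of 20 is a guess, not a documented value, and `plan_run` is a hypothetical helper. The free-tier limits and the "one extra request per opinion" rule come from this README.

```python
import math

# Free-tier limits quoted in this README: 5/min, 50/hr, 125/day.
FREE_TIER_LIMITS = {"minute": 5, "hour": 50, "day": 125}


def estimate_requests(max_items, include_full_text=True, page_size=20):
    """Search pages plus one detail request per item when full text is on."""
    search_requests = math.ceil(max_items / page_size)  # page_size is an assumption
    detail_requests = max_items if include_full_text else 0
    return search_requests + detail_requests


def plan_run(max_items, requests_per_minute=4, include_full_text=True, page_size=20):
    """Summarize request count, pacing time, and free-tier fit for a run."""
    total = estimate_requests(max_items, include_full_text, page_size)
    return {
        "requests": total,
        "minutes_at_pace": total / requests_per_minute,
        "within_hourly_limit": total <= FREE_TIER_LIMITS["hour"],
        "within_daily_limit": total <= FREE_TIER_LIMITS["day"],
    }
```

Under these assumptions, a 50-opinion full-text run needs 53 requests, which is slightly over the hourly free-tier limit, so some waiting on throttling should be expected.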

**Why is full text better than what other scrapers return?**

The search API only returns a short snippet. This Actor fetches the opinion record itself, so you get the entire decision — essential for RAG, citation analysis, or fine-tuning.

### Legal & Compliance

CourtListener is operated by the non-profit Free Law Project and serves public-domain US court records. This Actor respects the rate limits attached to the token it uses and accesses only public data. Keep private builds private while fallback tokens are embedded in code. Please review CourtListener's [terms](https://www.courtlistener.com/terms/).

# Actor input Schema

## `apiToken` (type: `string`):

Optional override for the private build's embedded token. Provide your own free token (https://www.courtlistener.com/profile/apikeys/) to use your own rate budget. The free tier is 5/min, 50/hr, 125/day; membership raises it.

## `searchType` (type: `string`):

What to retrieve.

## `query` (type: `string`):

Full-text query with boolean operators (AND/OR/NOT) and field prefixes. Required unless using a court filter or citation lookup.

## `citations` (type: `array`):

Citation strings to resolve, e.g. `576 U.S. 644`. Used when searchType is `citation` (up to 250).

## `court` (type: `string`):

CourtListener court ID, e.g. `scotus`, `ca9`, `nyed`.

## `dateFrom` (type: `string`):

Earliest filing date (YYYY-MM-DD).

## `dateTo` (type: `string`):

Latest filing date (YYYY-MM-DD).

## `orderBy` (type: `string`):

Sort order: `score desc` (relevance), `dateFiled desc`, `dateFiled asc`, `citeCount desc`.

## `includeFullText` (type: `boolean`):

For opinions, fetch the complete opinion text and RAG chunks from the database endpoint (one extra request per opinion). Turn off for faster metadata-only runs.

## `chunking` (type: `string`):

How to chunk full text: `paragraph` (~2000 chars) or `none`.

## `requestsPerMinute` (type: `integer`):

Throttle to stay within your tier. Keep at 4 for the free tier (5/min limit); raise it if you have a membership.

## `maxItems` (type: `integer`):

Maximum number of items to save. Each saved item is charged as one event.

## Actor input object example

```json
{
  "searchType": "opinions",
  "query": "climate change",
  "orderBy": "score desc",
  "includeFullText": true,
  "chunking": "paragraph",
  "requestsPerMinute": 4,
  "maxItems": 100
}
```
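
Before starting a paid run, it can help to check an input object locally against the constraints in the schema (enum values, date pattern, numeric ranges). The validator below is a sketch built from the OpenAPI specification further down; `validate_input` is a hypothetical helper, and the platform performs its own validation regardless.

```python
import re

SEARCH_TYPES = {"opinions", "dockets", "recap_docs", "oral_arguments", "judges", "citation"}
CHUNKING_MODES = {"paragraph", "none"}
DATE_PATTERN = r"\d{4}-\d{2}-\d{2}"  # YYYY-MM-DD, as in the schema


def _is_int(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool)


def validate_input(run_input):
    """Return a list of problems; an empty list means these checks passed."""
    problems = []
    search_type = run_input.get("searchType", "opinions")
    if search_type not in SEARCH_TYPES:
        problems.append(f"searchType must be one of {sorted(SEARCH_TYPES)}")
    if run_input.get("chunking", "paragraph") not in CHUNKING_MODES:
        problems.append("chunking must be 'paragraph' or 'none'")
    for key in ("dateFrom", "dateTo"):
        value = run_input.get(key)
        if value is not None and not (
            isinstance(value, str) and re.fullmatch(DATE_PATTERN, value)
        ):
            problems.append(f"{key} must be formatted as YYYY-MM-DD")
    rpm = run_input.get("requestsPerMinute", 4)
    if not _is_int(rpm) or not 1 <= rpm <= 60:
        problems.append("requestsPerMinute must be an integer from 1 to 60")
    max_items = run_input.get("maxItems", 100)
    if not _is_int(max_items) or max_items < 1:
        problems.append("maxItems must be an integer of at least 1")
    if len(run_input.get("citations") or []) > 250:
        problems.append("citations accepts up to 250 entries")
    return problems
```

The example input above passes with an empty list; a typo such as `"searchType": "cases"` is reported before any event is charged.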

# Actor output Schema

## `overview` (type: `string`):

No description

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "query": "climate change"
};

// Run the Actor and wait for it to finish
const run = await client.actor("autofacts/courtlistener-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = { "query": "climate change" }

# Run the Actor and wait for it to finish
run = client.actor("autofacts/courtlistener-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "query": "climate change"
}' |
apify call autofacts/courtlistener-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=autofacts/courtlistener-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "CourtListener Scraper: Opinions, Dockets & Full Text",
        "description": "Scrape CourtListener US court opinions with full text, dockets, oral arguments, judges, citations, and RAG-ready chunks. Includes a private-build token pool and optional per-run token override.",
        "version": "1.0",
        "x-build-id": "RflNPrRcCeKE23ZKg"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/autofacts~courtlistener-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-autofacts-courtlistener-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/autofacts~courtlistener-scraper/runs": {
            "post": {
                "operationId": "runs-sync-autofacts-courtlistener-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/autofacts~courtlistener-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-autofacts-courtlistener-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "properties": {
                    "apiToken": {
                        "title": "CourtListener API token",
                        "type": "string",
                        "description": "Optional override for the private build's embedded token. Provide your own free token (https://www.courtlistener.com/profile/apikeys/) to use your own rate budget. The free tier is 5/min, 50/hr, 125/day; membership raises it."
                    },
                    "searchType": {
                        "title": "Search type",
                        "enum": [
                            "opinions",
                            "dockets",
                            "recap_docs",
                            "oral_arguments",
                            "judges",
                            "citation"
                        ],
                        "type": "string",
                        "description": "What to retrieve.",
                        "default": "opinions"
                    },
                    "query": {
                        "title": "Search query",
                        "type": "string",
                        "description": "Full-text query with boolean operators (AND/OR/NOT) and field prefixes. Required unless using a court filter or citation lookup."
                    },
                    "citations": {
                        "title": "Citations (citation lookup)",
                        "type": "array",
                        "description": "Citation strings to resolve, e.g. `576 U.S. 644`. Used when searchType is `citation` (up to 250).",
                        "items": {
                            "type": "string"
                        }
                    },
                    "court": {
                        "title": "Court",
                        "type": "string",
                        "description": "CourtListener court ID, e.g. `scotus`, `ca9`, `nyed`."
                    },
                    "dateFrom": {
                        "title": "Filed after",
                        "pattern": "^\\d{4}-\\d{2}-\\d{2}$",
                        "type": "string",
                        "description": "Earliest filing date (YYYY-MM-DD)."
                    },
                    "dateTo": {
                        "title": "Filed before",
                        "pattern": "^\\d{4}-\\d{2}-\\d{2}$",
                        "type": "string",
                        "description": "Latest filing date (YYYY-MM-DD)."
                    },
                    "orderBy": {
                        "title": "Order by",
                        "type": "string",
                        "description": "Sort order: `score desc` (relevance), `dateFiled desc`, `dateFiled asc`, `citeCount desc`.",
                        "default": "score desc"
                    },
                    "includeFullText": {
                        "title": "Include full text (opinions)",
                        "type": "boolean",
                        "description": "For opinions, fetch the complete opinion text and RAG chunks from the database endpoint (one extra request per opinion). Turn off for faster metadata-only runs.",
                        "default": true
                    },
                    "chunking": {
                        "title": "Chunking",
                        "enum": [
                            "paragraph",
                            "none"
                        ],
                        "type": "string",
                        "description": "How to chunk full text: `paragraph` (~2000 chars) or `none`.",
                        "default": "paragraph"
                    },
                    "requestsPerMinute": {
                        "title": "Requests per minute",
                        "minimum": 1,
                        "maximum": 60,
                        "type": "integer",
                        "description": "Throttle to stay within your tier. Keep at 4 for the free tier (5/min limit); raise it if you have a membership.",
                        "default": 4
                    },
                    "maxItems": {
                        "title": "Max items",
                        "minimum": 1,
                        "type": "integer",
                        "description": "Maximum number of items to save. Each saved item is charged as one event.",
                        "default": 100
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
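
If you prefer not to install a client library, the `run-sync-get-dataset-items` endpoint from the specification above can be called with the Python standard library alone. This is a minimal sketch: `build_request` and `run_sync` are hypothetical helpers, and the 300-second timeout is an assumed default rather than a documented value.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://api.apify.com/v2"
ACTOR_PATH = "/acts/autofacts~courtlistener-scraper/run-sync-get-dataset-items"


def build_request(apify_token, run_input):
    """Build the POST request: token as a query parameter, input as the JSON body."""
    query = urllib.parse.urlencode({"token": apify_token})
    return urllib.request.Request(
        f"{BASE_URL}{ACTOR_PATH}?{query}",
        data=json.dumps(run_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_sync(apify_token, run_input, timeout=300):
    """Run the Actor, wait for it to finish, and return the parsed dataset items."""
    request = build_request(apify_token, run_input)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Because the Actor paces itself to a few requests per minute, larger full-text runs may outlast a synchronous call; for those, the asynchronous `/runs` endpoint is likely the better fit.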
