# DOJ Press Releases Scraper | US Justice Department (`parseforge/doj-press-releases-scraper`) Actor

Extract US Department of Justice press releases with title, date, topic, components, and full body text. Filter by search term and sort by publication date. Useful for compliance teams, legal researchers, and journalists tracking federal investigations and prosecutions.

- **URL**: https://apify.com/parseforge/doj-press-releases-scraper.md
- **Developed by:** [ParseForge](https://apify.com/parseforge) (community)
- **Categories:** News, Other
- **Stats:** 2 total users, 1 monthly user, 100.0% runs succeeded
- **User rating**: No ratings yet

## Pricing

from $19.00 / 1,000 results

This Actor uses pay-per-event pricing. You are not charged for Apify platform usage; you pay only a fixed price for specific events.
Because this Actor supports Apify Store discounts, the price decreases on higher subscription plans.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event
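
As a rough illustration of the per-result pricing above, the sketch below estimates what a run might cost at the listed starting price of $19.00 per 1,000 results. The function name and the `discount` parameter are hypothetical and not part of the Actor or of Apify; actual charges depend on the events the Actor bills and on your subscription plan.

```python
from decimal import Decimal

# Listed starting price from the Pricing section: $19.00 per 1,000 results.
PRICE_PER_1000_USD = Decimal("19.00")


def estimate_cost_usd(results: int, discount: Decimal = Decimal("0")) -> Decimal:
    """Estimate the cost in USD of `results` items.

    `discount` is a hypothetical fraction (0 to 1) standing in for a
    plan-based Apify Store discount; the real rates are not stated here.
    """
    if results < 0:
        raise ValueError("results must be non-negative")
    if not (Decimal("0") <= discount <= Decimal("1")):
        raise ValueError("discount must be between 0 and 1")
    base = PRICE_PER_1000_USD * results / Decimal(1000)
    # Round to whole cents for display.
    return (base * (Decimal("1") - discount)).quantize(Decimal("0.01"))


print(estimate_cost_usd(200))  # 3.80
```

Treat the result as an upper-bound planning figure rather than a quote.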

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with a capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see the Apify documentation as a [Markdown index](https://docs.apify.com/llms.txt) or as [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

![ParseForge Banner](https://github.com/ParseForge/apify-assets/blob/ad35ccc13ddd068b9d6cba33f323962e39aed5b2/banner.jpg?raw=true)

## ⚖️ DOJ Press Releases Scraper

> 🚀 **Export every Department of Justice press release in seconds.** Indictments, charges, settlements, and official statements from the US DOJ.

> 🕒 **Last updated:** 2026-05-25 · **📊 9 fields** per record · **265,000+ releases** · **All DOJ components**

The US Department of Justice publishes press releases from every component, US Attorney's Office, and headquarters division. This scraper pulls the full archive (over 265,000 releases) via the official justice.gov API and returns them as structured records ready for analysis or alerting.

Each record returns the title, release number, publication date, issuing component (e.g. Office of the Attorney General, US Attorney's Office, FBI), topic, teaser, and full body text with HTML stripped. Results can be sorted newest or oldest first and filtered by search term.

| 🎯 Target Audience | 💡 Primary Use Cases |
|---|---|
| Journalists, compliance, law firms, researchers, OSINT analysts | Monitoring enforcement actions, tracking US Attorney activity, training legal LLMs, alert systems on new charges |

### 📋 What the DOJ Press Releases Scraper does

- Pulls press releases from the official justice.gov public API
- Filters by a search term matched against release titles
- Sorts newest or oldest first
- Returns full plain-text body with HTML stripped
- Paginates automatically to your requested maxItems

> 💡 **Why it matters:** New DOJ charges drop with no public notice. Catching them on publication day is the difference between writing the story and reading it.

### 🎬 Full Demo (_🚧 Coming soon_)

### ⚙️ Input

<table>
<tr><th>Field</th><th>Type</th><th>Description</th></tr>
<tr><td>searchTerm</td><td>string</td><td>Free-text search across release titles. Leave empty to return all releases</td></tr>
<tr><td>maxItems</td><td>integer</td><td>Maximum number of records to return (free users: 10; paid users: up to 1,000,000)</td></tr>
<tr><td>sortDirection</td><td>string</td><td>Sort by publication date: DESC (newest first, default) or ASC (oldest first)</td></tr>
</table>

Example A. Newest 50 DOJ press releases:
```json
{ "maxItems": 50 }
````

Example B. Search for fentanyl-related releases, oldest first:

```json
{ "searchTerm": "fentanyl", "sortDirection": "ASC", "maxItems": 200 }
```
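
The inputs above can also be assembled and checked in code before a run is started. The following is a minimal sketch that mirrors the limits listed in the input schema (`maxItems` between 1 and 1,000,000, `sortDirection` either `DESC` or `ASC`). `build_input` is a hypothetical helper, not something the Actor or the Apify client provides.

```python
from typing import Optional

ALLOWED_SORT = ("DESC", "ASC")   # enum from the input schema
MAX_ITEMS_LIMIT = 1_000_000      # documented upper bound for maxItems


def build_input(search_term: Optional[str] = None,
                max_items: int = 10,
                sort_direction: str = "DESC") -> dict:
    """Build an Actor input dict (hypothetical helper for illustration)."""
    if not 1 <= max_items <= MAX_ITEMS_LIMIT:
        raise ValueError(f"maxItems must be between 1 and {MAX_ITEMS_LIMIT}")
    if sort_direction not in ALLOWED_SORT:
        raise ValueError(f"sortDirection must be one of {ALLOWED_SORT}")
    actor_input = {"maxItems": max_items, "sortDirection": sort_direction}
    # An empty search term means "all releases", so the key is omitted.
    if search_term:
        actor_input["searchTerm"] = search_term
    return actor_input


# Equivalent to Example B above.
print(build_input("fentanyl", max_items=200, sort_direction="ASC"))
```

Validating locally catches a mistyped sort direction or an out-of-range `maxItems` before a paid run is started.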

> ⚠️ **Good to Know:** The DOJ API returns the same content as justice.gov. Older archived releases redirect to their archive URLs.

### 📊 Output

| Field | Type | Description |
|---|---|---|
| 📌 title | string | Release title |
| 🔗 url | string | Official justice.gov URL |
| 🆔 number | string | Press release number e.g. 25-0123 |
| 📅 publicationDate | string | Publication date YYYY-MM-DD |
| 🏛️ components | string | Issuing DOJ component(s) |
| 📑 topic | string | Topic tags |
| 📝 teaser | string | Short summary |
| 📄 body | string | Full body text, HTML stripped |
| 🕒 scrapedAt | string | ISO timestamp |

Sample record:

```json
{
  "title": "U.S. Attorney's Office Announces Indictment",
  "url": "https://www.justice.gov/usao-edpa/pr/...",
  "publicationDate": "2026-05-22",
  "components": "U.S. Attorney's Office, Eastern District of Pennsylvania",
  "teaser": "PHILADELPHIA - Acting Attorney General Todd Blanche..."
}
```
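
For downstream analysis it can help to load each record into a typed structure. The sketch below is one possible approach using only the standard library: the `PressRelease` dataclass and `parse_release` function are hypothetical names, the field list follows the output table above, and missing keys fall back to empty values because the sample record does not include every field.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class PressRelease:
    title: str
    url: str
    publication_date: Optional[date]
    number: str = ""
    components: str = ""
    topic: str = ""
    teaser: str = ""
    body: str = ""


def parse_release(item: dict) -> PressRelease:
    """Convert one dataset item into a PressRelease, tolerating missing keys."""
    raw_date = item.get("publicationDate")
    return PressRelease(
        title=item.get("title", ""),
        url=item.get("url", ""),
        # publicationDate is documented as YYYY-MM-DD.
        publication_date=date.fromisoformat(raw_date) if raw_date else None,
        number=item.get("number", ""),
        components=item.get("components", ""),
        topic=item.get("topic", ""),
        teaser=item.get("teaser", ""),
        body=item.get("body", ""),
    )


sample = {
    "title": "U.S. Attorney's Office Announces Indictment",
    "publicationDate": "2026-05-22",
    "components": "U.S. Attorney's Office, Eastern District of Pennsylvania",
}
release = parse_release(sample)
print(release.publication_date.year, release.components)
```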

### ✨ Why choose this Actor

- 🇺🇸 Official justice.gov API source
- 📦 Export CSV, Excel, JSON, XML
- 🔄 Schedule daily for new release alerts
- 🎯 Full body text included

### 📈 How it compares to alternatives

| Approach | Result |
|---|---|
| Manual justice.gov browsing | Slow, no bulk export |
| RSS readers | Limited history |
| Building your own API client | Days of work |
| This Actor | One click structured export |

### 🚀 How to use

1. [Create a free Apify account with $5 credit](https://console.apify.com/sign-up?fpr=vmoqkp)
2. Open this Actor
3. Set filters
4. Click Start
5. Download CSV, Excel, JSON, or XML

### 💼 Business use cases

- **Compliance**: track DOJ enforcement actions in your sector.
- **Legal research**: build searchable archives of charges.
- **Media monitoring**: alert on press releases mentioning specific entities.
- **LLM training**: clean corpus of legal-press text.
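
Media monitoring of this kind can be approximated client-side by checking each record against a watchlist. The sketch below searches the `title`, `teaser`, and `body` fields, case-insensitively and on word boundaries. `match_watchlist` and the entity name "Example Corp" are illustrative assumptions, not part of the Actor.

```python
import re
from typing import Iterable, List


def match_watchlist(item: dict, watchlist: Iterable[str]) -> List[str]:
    """Return the watchlist terms that appear in the record's text fields."""
    text = " ".join(str(item.get(f) or "") for f in ("title", "teaser", "body"))
    hits = []
    for term in watchlist:
        # Word-boundary match so "Acme" does not fire on "Acmeville".
        # Terms are assumed to start and end with a letter or digit.
        if re.search(rf"\b{re.escape(term)}\b", text, flags=re.IGNORECASE):
            hits.append(term)
    return hits


record = {
    "title": "Example Corp Agrees to Settlement",
    "body": "The company, based in Acmeville, will pay a civil penalty.",
}
print(match_watchlist(record, ["example corp", "Acme"]))  # ['example corp']
```

Because `searchTerm` is described as a title search, matching against the body on your side can catch entities that are only named in the full text.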

### 🔌 Automating DOJ Press Releases Scraper

Connect to Make, Zapier, Slack, Airbyte, GitHub Actions, Google Drive, or any tool via the Apify API and webhooks.

### 🌟 Beyond business use cases

- **Research**: longitudinal study of DOJ enforcement priorities.
- **Personal**: track topics that affect your community.
- **Non-profit**: monitor civil rights or environmental enforcement.
- **Experimentation**: train classifiers on legal text.
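
As a starting point for the longitudinal use case, releases can be grouped by publication year. The sketch below is a minimal example that relies only on the documented `publicationDate` format (YYYY-MM-DD); `releases_per_year` is a hypothetical helper, and records with a missing or malformed date are skipped.

```python
from collections import Counter
from typing import Dict, Iterable


def releases_per_year(items: Iterable[dict]) -> Dict[int, int]:
    """Count records per publication year, ordered by year."""
    counts: Counter = Counter()
    for item in items:
        raw = item.get("publicationDate") or ""
        # The year is the first four characters of a YYYY-MM-DD string.
        if len(raw) >= 4 and raw[:4].isdigit():
            counts[int(raw[:4])] += 1
    return dict(sorted(counts.items()))


records = [
    {"publicationDate": "2025-11-03"},
    {"publicationDate": "2026-05-22"},
    {"publicationDate": "2026-01-15"},
    {"title": "record without a date"},
]
print(releases_per_year(records))  # {2025: 1, 2026: 2}
```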

### 🤖 Ask an AI assistant about this scraper

Paste this README into ChatGPT, Claude, Perplexity, or Copilot.

### ❓ Frequently Asked Questions

- **Q: 📅 How current?** A: Same-day publication on justice.gov.
- **Q: 📦 Export formats?** A: CSV, Excel, JSON, XML.
- **Q: 🔍 Full-text search?** A: Title search via searchTerm.
- **Q: 📅 How far back?** A: Full DOJ archive.
- **Q: 🏛️ Filter by US Attorney's Office?** A: Use searchTerm with the office name.
- **Q: 💰 Cost?** A: Free=10 items, paid up to 1M.
- **Q: ⏰ Scheduling?** A: Yes, daily/hourly via Apify.
- **Q: 🔒 Official?** A: Yes, justice.gov API.
- **Q: 📄 Body text?** A: Plain text, HTML stripped.
- **Q: 🆔 What's the number field?** A: DOJ-assigned press release ID.

### 🔌 Integrate with any app

Make, Zapier, n8n, Slack, Airbyte, GitHub Actions, Google Drive, Google Sheets, BigQuery, S3, webhooks, REST API.

### 🔗 Recommended Actors

| Actor | Description |
|---|---|
| Federal Register Scraper | Daily federal rules, notices, proposed rules |
| Congress.gov Bills Scraper | US legislative bills |
| GAO Reports Scraper | Government Accountability Office reports |
| CourtListener RSS Scraper | Court opinions and dockets |
| FARA Foreign Agents Scraper | Foreign Agents Registration Act filings |

> 💡 **Pro Tip:** browse the complete [ParseForge collection](https://apify.com/parseforge).

**🆘 Need Help?** [Open our contact form](https://tally.so/r/BzdKgA)

> **⚠️ Disclaimer:** Independent tool, not affiliated with the US Department of Justice. Only publicly available data is collected.

# Actor input Schema

## `searchTerm` (type: `string`):

Free-text search inside DOJ press releases. Leave empty for all.

## `maxItems` (type: `integer`):

Free users: Limited to 10 items (preview). Paid users: Optional, max 1,000,000.

## `sortDirection` (type: `string`):

Sort by publication date.

## Actor input object example

```json
{
  "maxItems": 10,
  "sortDirection": "DESC"
}
```

# Actor output Schema

## `results` (type: `string`):

No description

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "maxItems": 10
};

// Run the Actor and wait for it to finish
const run = await client.actor("parseforge/doj-press-releases-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = { "maxItems": 10 }

# Run the Actor and wait for it to finish
run = client.actor("parseforge/doj-press-releases-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "maxItems": 10
}' |
apify call parseforge/doj-press-releases-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=parseforge/doj-press-releases-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "DOJ Press Releases Scraper | US Justice Department",
        "description": "Extract US Department of Justice press releases with title, date, topic, components, and full body text. Filter by topic and date range. Useful for compliance teams, legal researchers, and journalists tracking federal investigations and prosecutions.",
        "version": "0.1",
        "x-build-id": "jvgzpcntycAqQEEIP"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/parseforge~doj-press-releases-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-parseforge-doj-press-releases-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/parseforge~doj-press-releases-scraper/runs": {
            "post": {
                "operationId": "runs-sync-parseforge-doj-press-releases-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/parseforge~doj-press-releases-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-parseforge-doj-press-releases-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "properties": {
                    "searchTerm": {
                        "title": "Search Term",
                        "type": "string",
                        "description": "Free-text search inside DOJ press releases. Leave empty for all."
                    },
                    "maxItems": {
                        "title": "Max Items",
                        "minimum": 1,
                        "maximum": 1000000,
                        "type": "integer",
                        "description": "Free users: Limited to 10 items (preview). Paid users: Optional, max 1,000,000."
                    },
                    "sortDirection": {
                        "title": "Sort Direction",
                        "enum": [
                            "DESC",
                            "ASC"
                        ],
                        "type": "string",
                        "description": "Sort by publication date.",
                        "default": "DESC"
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
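
For projects that do not use a client library, the `run-sync-get-dataset-items` path from the specification above can be called directly. The sketch below uses only the Python standard library and passes the token as the `token` query parameter, as the specification describes. The helper names are hypothetical, and the code assumes the response body is a JSON array of dataset items, which the specification itself does not spell out.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.apify.com/v2"
ACTOR_PATH = "parseforge~doj-press-releases-scraper"


def build_run_sync_url(token: str) -> str:
    """Build the run-sync-get-dataset-items URL with the token as a query parameter."""
    query = urllib.parse.urlencode({"token": token})
    return f"{API_BASE}/acts/{ACTOR_PATH}/run-sync-get-dataset-items?{query}"


def run_and_fetch(token: str, actor_input: dict, timeout: float = 300.0) -> list:
    """POST the input, wait for the run to finish, and return the dataset items.

    Assumes a JSON array response; urllib raises HTTPError on 4xx/5xx statuses.
    """
    request = urllib.request.Request(
        build_run_sync_url(token),
        data=json.dumps(actor_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

A call such as `run_and_fetch("<YOUR_API_TOKEN>", {"maxItems": 10})` would start a run and block until it completes, so longer jobs may be better served by the asynchronous `/runs` path.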
