# US Federal Register Scraper (`crawlerbros/federal-register-scraper`) Actor

Scrape US Federal Register documents, daily-published rules, proposed rules, notices, executive orders, and presidential documents. Search by term, filter by date / agency / type, fetch by document number. HTTP-only via the public federalregister.gov API.

- **URL**: https://apify.com/crawlerbros/federal-register-scraper.md
- **Developed by:** [Crawler Bros](https://apify.com/crawlerbros) (community)
- **Categories:** Automation, Developer tools, Other
- **Stats:** 2 total users, 1 monthly user, 100.0% runs succeeded, 21 bookmarks
- **User rating**: 5.00 out of 5 stars

## Pricing

from $1.00 / 1,000 results

This Actor is paid per event and usage. You are charged both the fixed price for specific events and for Apify platform usage.
Since this Actor supports Apify Store discounts, the higher your subscription plan, the lower the price.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## US Federal Register Scraper

Scrape the US Federal Register — the official daily journal of US federal regulatory documents (rules, proposed rules, notices, executive orders, presidential documents). Search by term, filter by agency / date / type, fetch by document number. Pulls full bibliographic info, CFR cross-references, comment URLs, and optionally the full body text. HTTP-only via the public `federalregister.gov/api/v1` API. No auth, no proxy.

### What this Actor does

- **Three modes:** `search` (full-text + filters), `byDocumentNumbers` (lookup specific docs), and `byAgency` (documents from a single agency, set with `agencySlug`)
- **Filters:** publication date range, agency, document type (RULE / PRORULE / NOTICE / PRESDOCU), presidential document subtype
- **Sorts:** newest, oldest, relevance
- **Optional full text** — fetch and strip the body HTML to plain text
- **Empty fields are omitted** — no nulls in output

### Output per document

- `documentNumber`, `title`, `abstract`, `excerpts` (search-mode highlight)
- `type` (`Rule`/`Proposed Rule`/`Notice`/`Presidential Document`), `typeCode` (`RULE`/`PRORULE`/`NOTICE`/`PRESDOCU`)
- `publicationDate`, `effectiveDate`, `commentsCloseOn`, `signingDate`
- `citation` (e.g. `89 FR 12345`)
- `startPage`, `endPage`
- `agencies[]` — readable agency names
- `cfrReferences[]` — `[{title, part}]` references into the Code of Federal Regulations
- `docketIds[]` — Regulations.gov docket IDs
- `executiveOrderNumber` (when applicable)
- `presidentialDocumentNumber` (when applicable)
- `action`, `dates`, `explanation` — preamble fields (detail mode)
- `htmlUrl`, `pdfUrl`, `publicInspectionPdfUrl`, `fullTextXmlUrl`, `bodyHtmlUrl`, `commentUrl`
- `bodyText` — when `fetchFullText=true`
- `recordType: "document"`, `scrapedAt`

### Input

| Field | Type | Default | Description |
|---|---|---|---|
| `mode` | string | `search` | `search` / `byDocumentNumbers` / `byAgency` |
| `searchTerm` | string | `climate` | Free-text query (mode=search) |
| `documentNumbers` | array | – | Federal Register document numbers (mode=byDocumentNumbers) |
| `documentTypes` | array | `[]` | `RULE`/`PRORULE`/`NOTICE`/`PRESDOCU` |
| `agencies` | array | `[]` | Agency slugs (e.g. `environmental-protection-agency`) |
| `agencySlug` | string | – | Single agency slug (mode=byAgency) |
| `publicationDateGte` | string | – | YYYY-MM-DD lower bound |
| `publicationDateLte` | string | – | YYYY-MM-DD upper bound |
| `presidentialDocumentTypes` | array | `[]` | `executive_order`/`proclamation`/etc. |
| `sortBy` | string | `newest` | `newest`/`oldest`/`relevance` |
| `fetchFullText` | bool | `false` | Fetch + strip body HTML |
| `maxItems` | int | `50` | Hard cap (1–5000) |

#### Example: all final rules from EPA in 2024

```json
{
  "mode": "search",
  "searchTerm": "",
  "agencies": ["environmental-protection-agency"],
  "documentTypes": ["RULE"],
  "publicationDateGte": "2024-01-01",
  "publicationDateLte": "2024-12-31",
  "maxItems": 200
}
````

#### Example: executive orders since 2024 (up to the default `maxItems` of 50)

```json
{
  "mode": "search",
  "documentTypes": ["PRESDOCU"],
  "presidentialDocumentTypes": ["executive_order"],
  "publicationDateGte": "2024-01-01",
  "sortBy": "newest"
}
```

#### Example: lookup specific documents

```json
{
  "mode": "byDocumentNumbers",
  "documentNumbers": ["2024-08901", "2024-12345"]
}
```

#### Example: search "climate disclosure" with body text included

```json
{
  "mode": "search",
  "searchTerm": "climate disclosure",
  "fetchFullText": true,
  "maxItems": 25
}
```

### Use cases

- **Regulatory monitoring** — track every new rule from agencies you care about
- **Compliance** — surface comment-period deadlines and effective dates
- **Policy analysis** — bulk-export rules / EOs by topic for analysis
- **Legal research** — find every rule that cites a specific CFR title and part
- **Lobbying / GR** — aggregate all proposed rules in your industry vertical
- **Journalism** — daily digest of federal regulatory output
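
As a rough illustration of the compliance use case above, the sketch below picks out documents whose comment period closes soon. It only relies on the `commentsCloseOn`, `documentNumber`, and `title` output fields; the helper name `upcoming_comment_deadlines`, the 14-day window, the sample records, and the assumption that `commentsCloseOn` starts with a `YYYY-MM-DD` date are all illustrative, not taken from the Actor.

```python
from datetime import date, timedelta


def upcoming_comment_deadlines(items, today, within_days=14):
    """Return (deadline, documentNumber, title) tuples closing within the window."""
    cutoff = today + timedelta(days=within_days)
    upcoming = []
    for item in items:
        # Empty fields are omitted from the output, so use .get() instead of indexing.
        raw = item.get("commentsCloseOn")
        if not raw:
            continue
        # Assumption: the value begins with an ISO date (YYYY-MM-DD).
        deadline = date.fromisoformat(raw[:10])
        if today <= deadline <= cutoff:
            upcoming.append((deadline, item.get("documentNumber"), item.get("title")))
    # Sort on the date only, so missing numbers or titles never get compared.
    return sorted(upcoming, key=lambda entry: entry[0])


# Hypothetical records shaped like the Actor's dataset items.
sample = [
    {"documentNumber": "EXAMPLE-1", "title": "Example proposed rule", "commentsCloseOn": "2024-03-10"},
    {"documentNumber": "EXAMPLE-2", "title": "Example notice"},
    {"documentNumber": "EXAMPLE-3", "title": "Example later rule", "commentsCloseOn": "2024-06-01"},
]

for deadline, number, title in upcoming_comment_deadlines(sample, date(2024, 3, 1)):
    print(deadline.isoformat(), number, title)
```

In practice `items` would come from the run's dataset, for example via `client.dataset(...).iterate_items()` as in the Python example in the API section.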

### FAQ

**Do I need a federalregister.gov account?**  No. The Federal Register API is fully public and requires no authentication.

**Is there a rate limit?**  The API is generous, and the Actor inserts small delays between requests to stay polite (about 3 requests per second).

**What's the difference between RULE and PRORULE?**  `RULE` = final rule (already in effect or about to be). `PRORULE` = proposed rule (open for public comment).

**What's a "presidential document"?**  Executive Orders, Proclamations, Memoranda, Notices, and other presidential issuances. Filter further with `presidentialDocumentTypes`.

**How do I find an agency slug?**  Browse [federalregister.gov/agencies](https://www.federalregister.gov/agencies). The slug is the URL segment (e.g. `environmental-protection-agency`).
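
If you already have an agency page URL, a small helper can pull the slug out for you. This is a minimal sketch; the function name `agency_slug_from_url` is hypothetical, and it assumes the slug is the path segment directly after `/agencies/`, as described above.

```python
from urllib.parse import urlparse


def agency_slug_from_url(url):
    """Return the path segment that follows /agencies/, or None if there is none."""
    segments = [segment for segment in urlparse(url).path.split("/") if segment]
    if "agencies" in segments:
        index = segments.index("agencies")
        if index + 1 < len(segments):
            return segments[index + 1]
    return None


print(agency_slug_from_url(
    "https://www.federalregister.gov/agencies/environmental-protection-agency"
))
```

The returned string can be passed in the `agencies` array or as `agencySlug`.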

**What's `cfrReferences`?**  Pointers from this document into the Code of Federal Regulations. Each reference has a `title` (e.g. `40` for Environment) and a `part` (e.g. `52`).
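
A short sketch of how `cfrReferences` might be used to filter results after a run. The helper name `cites_cfr` and the sample records are hypothetical. Because this README does not say whether `title` and `part` arrive as numbers or strings, the comparison converts both sides to strings.

```python
def cites_cfr(item, title, part=None):
    """True if the record references the given CFR title (and part, when given)."""
    # cfrReferences is omitted when empty, so default to an empty list.
    for ref in item.get("cfrReferences", []):
        if str(ref.get("title")) != str(title):
            continue
        if part is None or str(ref.get("part")) == str(part):
            return True
    return False


# Hypothetical records shaped like the Actor's dataset items.
records = [
    {"documentNumber": "EXAMPLE-1", "cfrReferences": [{"title": 40, "part": 52}]},
    {"documentNumber": "EXAMPLE-2", "cfrReferences": [{"title": 21, "part": 101}]},
    {"documentNumber": "EXAMPLE-3"},
]

matches = [record["documentNumber"] for record in records if cites_cfr(record, 40, 52)]
print(matches)
```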

**Why does `bodyText` cost extra?**  Each document needs a second HTTP request to fetch its body HTML, which is then stripped to plain text. For 100 documents that's 100 extra requests. The option is off by default; opt in with `fetchFullText: true`.
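
The arithmetic can be sketched as follows. Both function names are hypothetical, and the time figure is only a rough lower bound that assumes the approximate 3 requests per second pacing mentioned above is the bottleneck.

```python
def extra_full_text_requests(document_count, fetch_full_text):
    """One additional request per document, but only when full text is requested."""
    return document_count if fetch_full_text else 0


def min_extra_seconds(extra_requests, requests_per_second=3.0):
    """Rough lower bound on added run time at the assumed request pacing."""
    return extra_requests / requests_per_second


extra = extra_full_text_requests(100, fetch_full_text=True)
print(extra)
print(f"{min_extra_seconds(extra):.1f} s")
```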

**How fresh is the data?**  Same-day. The Federal Register is published every federal business day at 8:45 AM ET; the API reflects new documents within minutes.

**Can I search the body of documents?**  Yes — the `searchTerm` parameter searches title, abstract, AND full body text by default.

# Actor input Schema

## `mode` (type: `string`):

What to fetch.

## `searchTerm` (type: `string`):

Free-text search term (mode=search). Searches across title, abstract, and body.

## `documentNumbers` (type: `array`):

Federal Register document numbers (e.g. `2024-08901`).

## `documentTypes` (type: `array`):

Filter by document type. Leave empty for all.

## `agencies` (type: `array`):

Filter by agency slug (e.g. `environmental-protection-agency`, `treasury-department`). See https://www.federalregister.gov/agencies for the full list.

## `agencySlug` (type: `string`):

Single agency slug for mode=byAgency. Validated against /api/v1/agencies/<slug>. Examples: `environmental-protection-agency`, `treasury-department`, `food-and-drug-administration`.

## `publicationDateGte` (type: `string`):

Drop documents published before this date.

## `publicationDateLte` (type: `string`):

Drop documents published after this date.

## `presidentialDocumentTypes` (type: `array`):

Optional: filter presidential documents to specific subtypes (e.g. `executive_order`, `proclamation`).

## `sortBy` (type: `string`):

Order of returned documents.

## `fetchFullText` (type: `boolean`):

When true, the Actor performs a second request per document to fetch the full body HTML (`body_html_url`) and emits it as `bodyText` (HTML stripped to plain text).
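
To give a feel for what "HTML stripped to plain text" can mean, here is a minimal sketch built on Python's standard `html.parser`. It is not the Actor's actual implementation, which this document does not show; the class and function names, the set of block-level tags, and the whitespace handling are all assumptions.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text, skipping script/style content."""

    SKIPPED = {"script", "style"}
    BLOCK = {"p", "div", "br", "li", "tr", "h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1
        elif tag in self.BLOCK:
            self._chunks.append(" ")  # keep words in adjacent blocks apart

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth:
            self._skip_depth -= 1
        elif tag in self.BLOCK:
            self._chunks.append(" ")

    def handle_data(self, data):
        if not self._skip_depth:
            self._chunks.append(data)

    def text(self):
        # Collapse all runs of whitespace into single spaces.
        return " ".join("".join(self._chunks).split())


def html_to_text(html):
    parser = TextExtractor()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    return parser.text()


print(html_to_text("<h1>Summary</h1><p>Air &amp; water <b>rules</b>.</p>"))
```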

## `maxItems` (type: `integer`):

Hard cap on emitted records.

## Actor input object example

```json
{
  "mode": "search",
  "searchTerm": "climate",
  "documentNumbers": [],
  "documentTypes": [],
  "agencies": [],
  "presidentialDocumentTypes": [],
  "sortBy": "newest",
  "fetchFullText": false,
  "maxItems": 50
}
```
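
Before starting a run, it can help to check an input object locally against the constraints documented here: the `mode`, `documentTypes`, and `sortBy` enums, the `maxItems` range of 1 to 5000, and `YYYY-MM-DD` dates. The sketch below is a client-side convenience only. The function name `validate_input` is hypothetical, the check that the lower date bound does not exceed the upper one is an added assumption, and the Actor may apply rules beyond these.

```python
import re
from datetime import date

MODES = {"search", "byDocumentNumbers", "byAgency"}
DOCUMENT_TYPES = {"RULE", "PRORULE", "NOTICE", "PRESDOCU"}
SORTS = {"newest", "oldest", "relevance"}
DATE_FIELDS = ("publicationDateGte", "publicationDateLte")


def validate_input(actor_input):
    """Raise ValueError on the first violated constraint; return the input unchanged."""
    if actor_input.get("mode") not in MODES:
        raise ValueError("mode must be one of: " + ", ".join(sorted(MODES)))
    unknown = set(actor_input.get("documentTypes", [])) - DOCUMENT_TYPES
    if unknown:
        raise ValueError("unknown documentTypes: " + ", ".join(sorted(unknown)))
    if actor_input.get("sortBy", "newest") not in SORTS:
        raise ValueError("sortBy must be one of: " + ", ".join(sorted(SORTS)))
    max_items = actor_input.get("maxItems", 50)
    is_int = isinstance(max_items, int) and not isinstance(max_items, bool)
    if not is_int or not 1 <= max_items <= 5000:
        raise ValueError("maxItems must be an integer from 1 to 5000")
    parsed = {}
    for field in DATE_FIELDS:
        value = actor_input.get(field)
        if value is None:
            continue
        if not re.fullmatch(r"\d{4}-\d{2}-\d{2}", value):
            raise ValueError(f"{field} must use YYYY-MM-DD")
        parsed[field] = date.fromisoformat(value)  # also rejects impossible dates
    if len(parsed) == 2 and parsed["publicationDateGte"] > parsed["publicationDateLte"]:
        raise ValueError("publicationDateGte is after publicationDateLte")
    return actor_input


checked = validate_input({
    "mode": "search",
    "searchTerm": "",
    "agencies": ["environmental-protection-agency"],
    "documentTypes": ["RULE"],
    "publicationDateGte": "2024-01-01",
    "publicationDateLte": "2024-12-31",
    "maxItems": 200,
})
print(checked["maxItems"])
```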

# Actor output Schema

## `documents` (type: `string`):

Dataset containing all scraped Federal Register documents.

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "mode": "search",
    "searchTerm": "climate",
    "documentNumbers": [],
    "documentTypes": [],
    "agencies": [],
    "presidentialDocumentTypes": [],
    "sortBy": "newest",
    "fetchFullText": false,
    "maxItems": 50
};

// Run the Actor and wait for it to finish
const run = await client.actor("crawlerbros/federal-register-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "mode": "search",
    "searchTerm": "climate",
    "documentNumbers": [],
    "documentTypes": [],
    "agencies": [],
    "presidentialDocumentTypes": [],
    "sortBy": "newest",
    "fetchFullText": False,
    "maxItems": 50,
}

# Run the Actor and wait for it to finish
run = client.actor("crawlerbros/federal-register-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "mode": "search",
  "searchTerm": "climate",
  "documentNumbers": [],
  "documentTypes": [],
  "agencies": [],
  "presidentialDocumentTypes": [],
  "sortBy": "newest",
  "fetchFullText": false,
  "maxItems": 50
}' |
apify call crawlerbros/federal-register-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=crawlerbros/federal-register-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "US Federal Register Scraper",
        "description": "Scrape US Federal Register documents, daily-published rules, proposed rules, notices, executive orders, and presidential documents. Search by term, filter by date / agency / type, fetch by document number. HTTP-only via the public federalregister.gov API.",
        "version": "1.0",
        "x-build-id": "PSgeUATfPh2SSBIRS"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/crawlerbros~federal-register-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-crawlerbros-federal-register-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/crawlerbros~federal-register-scraper/runs": {
            "post": {
                "operationId": "runs-sync-crawlerbros-federal-register-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/crawlerbros~federal-register-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-crawlerbros-federal-register-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "required": [
                    "mode"
                ],
                "properties": {
                    "mode": {
                        "title": "Mode",
                        "enum": [
                            "search",
                            "byDocumentNumbers",
                            "byAgency"
                        ],
                        "type": "string",
                        "description": "What to fetch.",
                        "default": "search"
                    },
                    "searchTerm": {
                        "title": "Search term",
                        "type": "string",
                        "description": "Free-text search term (mode=search). Searches across title, abstract, and body.",
                        "default": "climate"
                    },
                    "documentNumbers": {
                        "title": "Document numbers (mode=byDocumentNumbers)",
                        "type": "array",
                        "description": "Federal Register document numbers (e.g. `2024-08901`).",
                        "default": [],
                        "items": {
                            "type": "string"
                        }
                    },
                    "documentTypes": {
                        "title": "Document types",
                        "type": "array",
                        "description": "Filter by document type. Leave empty for all.",
                        "items": {
                            "type": "string",
                            "enum": [
                                "RULE",
                                "PRORULE",
                                "NOTICE",
                                "PRESDOCU"
                            ],
                            "enumTitles": [
                                "Final rule",
                                "Proposed rule",
                                "Notice",
                                "Presidential document"
                            ]
                        },
                        "default": []
                    },
                    "agencies": {
                        "title": "Agency slugs",
                        "type": "array",
                        "description": "Filter by agency slug (e.g. `environmental-protection-agency`, `treasury-department`). See https://www.federalregister.gov/agencies for the full list.",
                        "default": [],
                        "items": {
                            "type": "string"
                        }
                    },
                    "agencySlug": {
                        "title": "Agency slug (mode=byAgency)",
                        "type": "string",
                        "description": "Single agency slug for mode=byAgency. Validated against /api/v1/agencies/<slug>. Examples: `environmental-protection-agency`, `treasury-department`, `food-and-drug-administration`."
                    },
                    "publicationDateGte": {
                        "title": "Publication date from (YYYY-MM-DD)",
                        "type": "string",
                        "description": "Drop documents published before this date."
                    },
                    "publicationDateLte": {
                        "title": "Publication date to (YYYY-MM-DD)",
                        "type": "string",
                        "description": "Drop documents published after this date."
                    },
                    "presidentialDocumentTypes": {
                        "title": "Presidential document types",
                        "type": "array",
                        "description": "Optional: filter presidential documents to specific subtypes (e.g. `executive_order`, `proclamation`).",
                        "default": [],
                        "items": {
                            "type": "string"
                        }
                    },
                    "sortBy": {
                        "title": "Sort by",
                        "enum": [
                            "newest",
                            "oldest",
                            "relevance"
                        ],
                        "type": "string",
                        "description": "Order of returned documents.",
                        "default": "newest"
                    },
                    "fetchFullText": {
                        "title": "Fetch full text body",
                        "type": "boolean",
                        "description": "When true, the actor performs a second request per document to fetch the full body HTML (`body_html_url`) and emits it as `bodyText` (HTML stripped to plain text).",
                        "default": false
                    },
                    "maxItems": {
                        "title": "Max items",
                        "minimum": 1,
                        "maximum": 5000,
                        "type": "integer",
                        "description": "Hard cap on emitted records.",
                        "default": 50
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
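
For projects that call the REST API directly instead of using a client library, the `run-sync-get-dataset-items` operation from the specification above can be assembled with Python's standard library alone. This is a sketch under stated assumptions: the helper name `build_run_request` is hypothetical, the code only constructs the request, and the commented-out send step assumes the response body is JSON, which the specification does not spell out.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://api.apify.com/v2"
RUN_PATH = "/acts/crawlerbros~federal-register-scraper/run-sync-get-dataset-items"


def build_run_request(token, actor_input):
    """Build a POST request with the token in the query string and a JSON body."""
    query = urllib.parse.urlencode({"token": token})
    return urllib.request.Request(
        url=f"{BASE_URL}{RUN_PATH}?{query}",
        data=json.dumps(actor_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_run_request(
    "<YOUR_API_TOKEN>",
    {"mode": "search", "searchTerm": "climate", "maxItems": 5},
)
print(request.get_method(), request.full_url)

# Sending the request needs a real token and network access:
# with urllib.request.urlopen(request) as response:
#     items = json.load(response)
```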
