GovInfo Publications Scraper
Pricing
from $19.00 / 1,000 results
GovInfo Publications Scraper
Reach into govinfo.gov for official US government publications spanning bills, the Congressional Record, the Federal Register, and committee reports. Each item returns the package id, title, collection, congress number, document class, publish date, and branch. Built for policy research.
Pricing
from $19.00 / 1,000 results
Rating
0.0
(0)
Developer
ParseForge
Maintained by CommunityActor stats
0
Bookmarked
2
Total users
1
Monthly active users
a day ago
Last modified
Categories
Share

๐๏ธ GovInfo Publications Scraper
๐ Pull official US government publications in seconds. Bills, Federal Register, Congressional Record, committee reports, and the US Code, straight from the public govinfo.gov collections API.
๐ Last updated 2026-06-05 ยท ๐ 11 fields per record ยท 15+ federal collections ยท Bills, hearings, reports, regulations ยท Updated daily by GPO
The GovInfo Publications Scraper turns the official Government Publishing Office (GPO) api.govinfo.gov/collections endpoint into a structured dataset of federal publications. Pick a collection (Bills, Federal Register, Congressional Record, US Code, etc.), set a date range, and the actor walks the paginated response and flattens each package into one row.
GovInfo is the authoritative source for official US federal documents, maintained by the GPO and trusted by lawyers, journalists, and researchers.
| ๐ฏ Target Audience | ๐ก Primary Use Cases |
|---|---|
| โ๏ธ Legal researchers | Track new bills and statutes by Congress |
| ๐ฐ Investigative journalists | Monitor the Federal Register for new rules |
| ๐ข Policy analysts | Build longitudinal datasets of congressional activity |
| ๐ค ML engineers | Train models on official legislative text |
| ๐๏ธ Civic tech builders | Power public dashboards of congressional output |
๐ What the GovInfo Publications Scraper does
- Calls
api.govinfo.gov/collections/{collection}/{startDate}/{endDate}with your filters. - Walks the paginated response across
pageSizeandoffset. - Flattens each package into one row with stable columns.
- Builds the canonical PDF URL for every package so you can pipe to a downloader.
- Pushes any upstream error as a single record rather than crashing the run.
๐ก Why it matters GovInfo holds the authoritative copy of every US federal publication, but its API requires you to chunk by date range and walk offsets. This actor handles pagination so you can focus on the data.
๐ฌ Full Demo
๐ง Coming soon.
โ๏ธ Input
| Field | Type | Required | Description |
|---|---|---|---|
collection | enum | No | GovInfo collection. Default BILLS. |
startDate | string | No | ISO 8601 lower bound. Default 2025-01-01T00:00:00Z. |
endDate | string | No | ISO 8601 upper bound. Empty means now. |
congress | integer | No | Optional Congress number (e.g. 118). |
docClass | string | No | Optional document class slug (e.g. hr, s). |
maxItems | integer | No | Free plan caps at 10. Paid up to 1,000,000. |
Example 1, recent House bills:
{"collection": "BILLS","startDate": "2025-01-01T00:00:00Z","docClass": "hr","maxItems": 50}
Example 2, latest Federal Register entries:
{"collection": "FR","startDate": "2026-05-01T00:00:00Z","maxItems": 100}
โ ๏ธ Good to Know The operator running this actor needs a free
data.govAPI key set as theGOVINFO_API_KEYenvironment variable. Get one in 30 seconds at api.data.gov/signup.
๐ Output
Each record is a flat object. error is always last.
| Field | Type | Description |
|---|---|---|
๐ packageId | string | Unique GovInfo package identifier. |
๐ title | string | Publication title. |
๐ collection | string | Collection code (BILLS, FR, CREC, etc.). |
๐๏ธ congress | integer | Congress number when applicable. |
๐๏ธ docClass | string | Document class (hr, s, hres, sres, etc.). |
๐
publishDate | string | Date issued. |
๐ lastModified | string | Last modified timestamp from GPO. |
๐ pdfUrl | string | Direct link to the official PDF on govinfo.gov. |
๐ข granuleCount | integer | Number of sub-documents inside the package. |
๐ข branch | string | Government branch when present. |
๐ scrapedAt | string | When this row was fetched. |
โ error | string | Set if the upstream response was an error. |
โจ Why choose this Actor
| ๐ | Free official GPO data, no scraping needed. | | ๐ | 15+ federal collections covered out of the box. | | ๐ | Auto-builds the direct PDF URL for every package. | | ๐ | Pagination handled, error responses surfaced cleanly. | | ๐ | Plain HTTP, no browsers, fast and resilient. |
๐ How it compares to alternatives
| Approach | Setup | Pagination | PDF URL builder |
|---|---|---|---|
| Roll your own fetch | 30 min plus | manual | manual |
| Python govinfo client | install plus script | partial | partial |
| This Actor | 5 sec, no install | โ | โ |
๐ How to use
- Click Try for free.
- Pick a
collectionand astartDate. - Click Start.
- Open the dataset when the run finishes.
๐ผ Business use cases
โ๏ธ Legal monitoring Watch the Federal Register for new agency rules affecting your industry.
๐ Policy research Pull all bills introduced in a Congress and feed pandas for descriptive stats.
๐ฐ Newsroom backbone Reporters can verify the exact text of a public law in seconds.
๐ค ML training Build a corpus of official legislative text for legal LLMs.
๐ Automating GovInfo Publications Scraper
- Make and Zapier trigger on schedule and push results to Airtable, Google Sheets, or Slack.
- Cron native Apify scheduler runs the actor every morning.
- Webhooks receive a POST the moment a run completes.
- Warehouses pipe directly to BigQuery, Snowflake, or Postgres.
๐ Beyond business use cases
๐ Education Teach a civics class with real congressional data.
๐งช Personal research Track every bill your representative introduces.
๐ค Non-profit and open data Power public dashboards of legislative activity.
๐งฐ Tinkering Spin up a quick prototype on top of authoritative federal data.
๐ค Ask an AI assistant about this scraper
Paste this README into ChatGPT, Claude, or any assistant and ask it to map your workflow to the actor's inputs. The schema, examples, and field list above are everything an LLM needs.
โ Frequently Asked Questions
โ Do I need an API key The operator running the actor sets one server-side. End users do nothing.
โ Which collections are supported All 15+ public collections in the dropdown.
โ Is there a rate limit Yes, the data.gov tier allows 1,000 requests per hour per key.
โ Can I get full document text Yes, follow the pdfUrl to the official PDF on govinfo.gov.
โ How fresh is the data GPO publishes new packages daily.
โ Can I schedule runs Yes, use Apify's native scheduler.
โ Is this scraping or API Pure API. Endpoint is official and public.
โ Will the schema change Stable core fields. New collections are additive.
๐ Integrate with any app
Apify ships native integrations with Make, Zapier, Slack, Discord, Google Drive, Google Sheets, Gmail, Airbyte, Keboola, Telegram, GitHub, and any REST endpoint or webhook.
๐ Recommended Actors
| Actor | What it does |
|---|---|
| ParseForge OurAirports Scraper | Global airport database. |
| ParseForge Alpha Vantage Scraper | Public market data. |
| ParseForge NBA Stats Scraper | Player and team stats. |
| ParseForge CurseForge Mods Scraper | Public mod metadata. |
๐ก Pro Tip browse the complete ParseForge collection for 900+ production-grade scrapers.
Disclaimer This actor uses only publicly available data. ParseForge is not affiliated with, endorsed by, or sponsored by any of the third-party services referenced. Users are responsible for complying with the target site's terms of service and applicable law. Create a free account w/ $5 credit.