GovInfo Publications Scraper

Pricing

from $19.00 / 1,000 results

Reach into govinfo.gov for official US government publications spanning bills, the Congressional Record, the Federal Register, and committee reports. Each item returns the package id, title, collection, congress number, document class, publish date, and branch. Built for policy research.

Developer

ParseForge

Maintained by Community

๐Ÿ›๏ธ GovInfo Publications Scraper

๐Ÿš€ Pull official US government publications in seconds. Bills, Federal Register, Congressional Record, committee reports, and the US Code, straight from the public govinfo.gov collections API.

๐Ÿ•’ Last updated 2026-06-05 ยท ๐Ÿ“Š 11 fields per record ยท 15+ federal collections ยท Bills, hearings, reports, regulations ยท Updated daily by GPO

The GovInfo Publications Scraper turns the official Government Publishing Office (GPO) api.govinfo.gov/collections endpoint into a structured dataset of federal publications. Pick a collection (Bills, Federal Register, Congressional Record, US Code, etc.), set a date range, and the actor walks the paginated response and flattens each package into one row.

GovInfo is the authoritative source for official US federal documents, maintained by the GPO and trusted by lawyers, journalists, and researchers.

| 🎯 Target Audience | 💡 Primary Use Cases |
|---|---|
| ⚖️ Legal researchers | Track new bills and statutes by Congress |
| 📰 Investigative journalists | Monitor the Federal Register for new rules |
| 🏢 Policy analysts | Build longitudinal datasets of congressional activity |
| 🤖 ML engineers | Train models on official legislative text |
| 🏛️ Civic tech builders | Power public dashboards of congressional output |

📋 What the GovInfo Publications Scraper does

  • Calls api.govinfo.gov/collections/{collection}/{startDate}/{endDate} with your filters.
  • Walks the paginated response across pageSize and offset.
  • Flattens each package into one row with stable columns.
  • Builds the canonical PDF URL for every package so you can pipe to a downloader.
  • Pushes any upstream error as a single record rather than crashing the run.
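
The request and PDF-link steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the actor's source: the helper names `build_collection_url` and `build_pdf_url` are hypothetical, the `pageSize` and `offset` query parameters follow the description above, `api_key` follows the api.data.gov key convention, and the PDF path is the pattern govinfo.gov is assumed to use for package downloads. Check the GovInfo API documentation before relying on any of these, since the pagination parameter in particular may differ.

```python
from urllib.parse import quote, urlencode

API_ROOT = "https://api.govinfo.gov"  # host named in this README
CONTENT_ROOT = "https://www.govinfo.gov/content/pkg"  # assumed download root


def build_collection_url(collection, start_date, end_date=None,
                         page_size=100, offset=0, api_key="DEMO_KEY"):
    """Build one page URL for /collections/{collection}/{startDate}/{endDate}."""
    # Timestamps contain ':' so each path segment is percent-encoded.
    parts = [quote(collection, safe=""), quote(start_date, safe="")]
    if end_date:
        parts.append(quote(end_date, safe=""))
    # Query parameter names are assumptions taken from the list above.
    query = urlencode({"pageSize": page_size, "offset": offset, "api_key": api_key})
    return f"{API_ROOT}/collections/{'/'.join(parts)}?{query}"


def build_pdf_url(package_id):
    """Assumed pattern: /content/pkg/{packageId}/pdf/{packageId}.pdf."""
    pid = quote(package_id, safe="")
    return f"{CONTENT_ROOT}/{pid}/pdf/{pid}.pdf"


print(build_collection_url("BILLS", "2025-01-01T00:00:00Z", page_size=50))
print(build_pdf_url("BILLS-115hr1625enr"))
```

Keeping URL construction in pure functions makes it easy to check the encoded output before any request is sent.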

💡 Why it matters: GovInfo holds the authoritative copy of every US federal publication, but its API requires you to chunk by date range and walk offsets. This actor handles pagination so you can focus on the data.
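
To make "chunk by date range and walk offsets" concrete, here is a rough sketch of both loops. It is not the actor's implementation: `date_chunks`, `walk_offsets`, and the `fetch_page(offset, page_size)` callback are hypothetical names, and the 30-day chunk size is an arbitrary choice. The fetcher is injected so the logic can be exercised without network access.

```python
from datetime import datetime, timedelta, timezone


def date_chunks(start, end, days=30):
    """Yield (chunk_start, chunk_end) pairs that cover [start, end) in order."""
    step = timedelta(days=days)
    cursor = start
    while cursor < end:
        nxt = min(cursor + step, end)
        yield cursor, nxt
        cursor = nxt


def walk_offsets(fetch_page, page_size=100, max_items=None):
    """Call fetch_page(offset, page_size) until a short page or max_items."""
    offset, seen = 0, 0
    while max_items is None or seen < max_items:
        page = fetch_page(offset, page_size)
        for item in page:
            yield item
            seen += 1
            if max_items is not None and seen >= max_items:
                return
        if len(page) < page_size:  # a short page means nothing is left
            return
        offset += page_size


# Stand-in for an HTTP call: slices an in-memory list of made-up ids.
FAKE_PACKAGES = [f"PKG-{i}" for i in range(7)]


def fake_fetch(offset, page_size):
    return FAKE_PACKAGES[offset:offset + page_size]


start = datetime(2025, 1, 1, tzinfo=timezone.utc)
end = datetime(2025, 3, 1, tzinfo=timezone.utc)
chunks = list(date_chunks(start, end, days=30))
items = list(walk_offsets(fake_fetch, page_size=3))
print(len(chunks), len(items))
```

In a real client, `fetch_page` would wrap one HTTP request per page for a single date chunk.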

🎬 Full Demo

🚧 Coming soon.

⚙️ Input

| Field | Type | Required | Description |
|---|---|---|---|
| collection | enum | No | GovInfo collection. Default BILLS. |
| startDate | string | No | ISO 8601 lower bound. Default 2025-01-01T00:00:00Z. |
| endDate | string | No | ISO 8601 upper bound. Empty means now. |
| congress | integer | No | Optional Congress number (e.g. 118). |
| docClass | string | No | Optional document class slug (e.g. hr, s). |
| maxItems | integer | No | Free plan caps at 10. Paid up to 1,000,000. |

Example 1, recent House bills:

{
  "collection": "BILLS",
  "startDate": "2025-01-01T00:00:00Z",
  "docClass": "hr",
  "maxItems": 50
}

Example 2, latest Federal Register entries:

{
  "collection": "FR",
  "startDate": "2026-05-01T00:00:00Z",
  "maxItems": 100
}
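
If you assemble inputs in code, a small helper can apply the documented defaults and catch obvious mistakes before a run starts. The sketch below is an assumption-laden illustration: `build_run_input` is a hypothetical name, the defaults and the 1,000,000 ceiling are copied from the input table, and the actor's own validation may be stricter or looser.

```python
from datetime import datetime

# Defaults as listed in the input table.
DEFAULTS = {"collection": "BILLS", "startDate": "2025-01-01T00:00:00Z"}
ALLOWED_KEYS = {"collection", "startDate", "endDate",
                "congress", "docClass", "maxItems"}


def build_run_input(**overrides):
    """Merge overrides onto the defaults and sanity-check the result."""
    unknown = set(overrides) - ALLOWED_KEYS
    if unknown:
        raise ValueError(f"unknown input fields: {sorted(unknown)}")
    run_input = {**DEFAULTS, **overrides}
    for key in ("startDate", "endDate"):
        value = run_input.get(key)
        if value:
            # Older Pythons reject a trailing 'Z', so normalise it first.
            # Raises ValueError when the string is not ISO 8601.
            datetime.fromisoformat(value.replace("Z", "+00:00"))
    max_items = run_input.get("maxItems")
    if max_items is not None and not 1 <= max_items <= 1_000_000:
        raise ValueError("maxItems must be between 1 and 1,000,000")
    return run_input


print(build_run_input(docClass="hr", maxItems=50))
```

Called with `docClass="hr"` and `maxItems=50`, the helper reproduces Example 1.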

⚠️ Good to Know: The operator running this actor needs a free data.gov API key set as the GOVINFO_API_KEY environment variable. Get one in 30 seconds at api.data.gov/signup.

📊 Output

Each record is a flat object. error is always last.

| Field | Type | Description |
|---|---|---|
| 🆔 packageId | string | Unique GovInfo package identifier. |
| 📄 title | string | Publication title. |
| 📚 collection | string | Collection code (BILLS, FR, CREC, etc.). |
| 🏛️ congress | integer | Congress number when applicable. |
| 🗂️ docClass | string | Document class (hr, s, hres, sres, etc.). |
| 📅 publishDate | string | Date issued. |
| 🕒 lastModified | string | Last modified timestamp from GPO. |
| 📎 pdfUrl | string | Direct link to the official PDF on govinfo.gov. |
| 🔢 granuleCount | integer | Number of sub-documents inside the package. |
| 🏢 branch | string | Government branch when present. |
| 🕒 scrapedAt | string | When this row was fetched. |
| ❌ error | string | Set if the upstream response was an error. |

✨ Why choose this Actor

  • 🆓 Free official GPO data, no scraping needed.
  • 📚 15+ federal collections covered out of the box.
  • 📎 Auto-builds the direct PDF URL for every package.
  • 🛟 Pagination handled, error responses surfaced cleanly.
  • 🔌 Plain HTTP, no browsers, fast and resilient.

📈 How it compares to alternatives

| Approach | Setup | Pagination | PDF URL builder |
|---|---|---|---|
| Roll your own fetch | 30 min plus | manual | manual |
| Python govinfo client | install plus script | partial | partial |
| This Actor | 5 sec, no install | ✅ | ✅ |

🚀 How to use

  1. Click Try for free.
  2. Pick a collection and a startDate.
  3. Click Start.
  4. Open the dataset when the run finishes.

💼 Business use cases

⚖️ Legal monitoring: Watch the Federal Register for new agency rules affecting your industry.

📊 Policy research: Pull all bills introduced in a Congress and load them into pandas for descriptive stats.

📰 Newsroom backbone: Reporters can verify the exact text of a public law in seconds.

🤖 ML training: Build a corpus of official legislative text for legal LLMs.
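
As a small illustration of the policy-research case, the sketch below counts rows per Congress and document class using only the standard library. A pandas `groupby` would give the same numbers on larger datasets. `count_by` is a hypothetical helper, and the rows are invented placeholders shaped like the output described in this README.

```python
from collections import Counter

# Invented rows shaped like the actor's output (only the fields used here).
rows = [
    {"packageId": "EXAMPLE-1", "congress": 118, "docClass": "hr"},
    {"packageId": "EXAMPLE-2", "congress": 118, "docClass": "s"},
    {"packageId": "EXAMPLE-3", "congress": 118, "docClass": "hr"},
    {"error": "upstream returned an error"},
]


def count_by(rows, *fields):
    """Count non-error rows grouped by the given field names."""
    return Counter(
        tuple(row.get(field) for field in fields)
        for row in rows
        if not row.get("error")
    )


counts = count_by(rows, "congress", "docClass")
for (congress, doc_class), n in sorted(counts.items()):
    print(congress, doc_class, n)
```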

🔌 Automating GovInfo Publications Scraper

  • Make and Zapier: trigger on a schedule and push results to Airtable, Google Sheets, or Slack.
  • Cron: the native Apify scheduler runs the actor every morning.
  • Webhooks: receive a POST the moment a run completes.
  • Warehouses: pipe directly to BigQuery, Snowflake, or Postgres.
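
For the webhook route, the receiving service mainly needs to pull the dataset id out of the POST body. The sketch below assumes the default Apify webhook payload, with an `eventType` string and a `resource` object carrying `defaultDatasetId`. Treat those field names as assumptions and confirm them against the Apify webhook documentation, especially if you use a custom payload template. `dataset_id_from_webhook` is a hypothetical helper and the sample body is invented.

```python
import json


def dataset_id_from_webhook(body):
    """Return the dataset id for a succeeded run, or None otherwise."""
    payload = json.loads(body)
    # Assumed event name for a successfully finished run.
    if payload.get("eventType") != "ACTOR.RUN.SUCCEEDED":
        return None
    resource = payload.get("resource") or {}
    return resource.get("defaultDatasetId")


# Invented sample body for illustration only.
sample_body = json.dumps({
    "eventType": "ACTOR.RUN.SUCCEEDED",
    "resource": {"defaultDatasetId": "example-dataset-id"},
})
print(dataset_id_from_webhook(sample_body))
```

A handler would then fetch the dataset items for that id and forward them to the warehouse or sheet of choice.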

🌟 Beyond business use cases

🎓 Education: Teach a civics class with real congressional data.

🧪 Personal research: Track every bill your representative introduces.

🤝 Non-profit and open data: Power public dashboards of legislative activity.

🧰 Tinkering: Spin up a quick prototype on top of authoritative federal data.

🤖 Ask an AI assistant about this scraper

Paste this README into ChatGPT, Claude, or any assistant and ask it to map your workflow to the actor's inputs. The schema, examples, and field list above are everything an LLM needs.

โ“ Frequently Asked Questions

โ“ Do I need an API key The operator running the actor sets one server-side. End users do nothing.

โ“ Which collections are supported All 15+ public collections in the dropdown.

โ“ Is there a rate limit Yes, the data.gov tier allows 1,000 requests per hour per key.

โ“ Can I get full document text Yes, follow the pdfUrl to the official PDF on govinfo.gov.

โ“ How fresh is the data GPO publishes new packages daily.

โ“ Can I schedule runs Yes, use Apify's native scheduler.

โ“ Is this scraping or API Pure API. Endpoint is official and public.

โ“ Will the schema change Stable core fields. New collections are additive.

🔌 Integrate with any app

Apify ships native integrations with Make, Zapier, Slack, Discord, Google Drive, Google Sheets, Gmail, Airbyte, Keboola, Telegram, GitHub, and any REST endpoint or webhook.

| Actor | What it does |
|---|---|
| ParseForge OurAirports Scraper | Global airport database. |
| ParseForge Alpha Vantage Scraper | Public market data. |
| ParseForge NBA Stats Scraper | Player and team stats. |
| ParseForge CurseForge Mods Scraper | Public mod metadata. |

💡 Pro Tip: browse the complete ParseForge collection for 900+ production-grade scrapers.


Disclaimer: This actor uses only publicly available data. ParseForge is not affiliated with, endorsed by, or sponsored by any of the third-party services referenced. Users are responsible for complying with the target site's terms of service and applicable law.

Create a free account with $5 credit.