USA Data.gov U.S. Government's Open Data Scraper
Pricing
Pay per event
Stop wasting hours digging through thousands of government datasets. Our Data.gov scraper automatically gathers complete dataset details from the U.S. government's open data portal in minutes. Ideal for researchers, analysts, journalists, and teams needing reliable data without manual effort.
Developer

ParseForge
🏛️ USA Data.gov U.S. Government's Open Data Scraper
🚀 Automate your government data research with our Data.gov scraper. It collects detailed dataset information from the U.S. government data catalog, including dataset metadata, organizations, publishers, topics, formats, resource links, and access information. It suits researchers, data analysts, and organizations that need accurate, up-to-date government data without manual work.
Target Audience: Researchers, data analysts, government contractors, policy analysts, journalists, and organizations needing government data
Primary Use Cases: Government data research, policy analysis, market research, competitive intelligence, and data-driven decision making
What Does Data.gov Scraper Do?
This tool collects comprehensive dataset information from Data.gov, the U.S. government's open data portal. It delivers:
- Complete Dataset Metadata - Titles, descriptions, organization details, publisher information
- Organization Information - Organization names, types, missions, contact details, and URLs
- Publisher Data - Publisher names, URLs, and contact information
- Topic Classifications - Topics, topic categories, and tags for easy categorization
- Resource Formats - Available data formats (CSV, JSON, XML, PDF, etc.) with download links
- Access & Use Information - Public status, licenses, and usage rights
- Download Links - Direct links to all available resources and datasets
- Metadata Details - Creation dates, update dates, metadata sources, and references
- Contact Information - Dataset maintainer names and email addresses
- And much more - Comprehensive government data intelligence in one scrape
Business Value: Access thousands of government datasets efficiently, track data updates automatically, and build comprehensive government data databases that save weeks of manual research and monitoring.
Input
To start Data.gov web scraping, simply fill in the input form. You can scrape Data.gov using two different methods (choose one):
Method 1: Direct URL Scraping 🔗
- startUrl - Use a direct Data.gov catalog URL (e.g., https://catalog.data.gov/dataset?q=climate)
  - Required if search filters are not provided
  - Cannot be used together with search filters
- maxItems - Set the maximum number of datasets to collect (up to 1,000,000). Free users: Required, max 50. Paid users: Optional, max 1,000,000. Leave empty for unlimited (paid users only). Default: 10
Method 2: Search Filters (Recommended) 🎯
- searchQuery - Enter a search term for datasets (e.g., "climate", "healthcare", "transportation")
  - Required if startUrl is not provided
- maxItems - Set the maximum number of datasets to collect (up to 1,000,000). Free users: Required, max 50. Paid users: Optional, max 1,000,000. Leave empty for unlimited (paid users only). Default: 10
Advanced Filtering Options:
- topics - Filter by topic groups. Select one or more topics from the dropdown (e.g., "Climate", "Energy", "Local Government")
- topicCategories - Filter by topic categories. Select one or more categories (e.g., "Arctic", "Water", "Transportation")
- datasetType - Filter by dataset type (e.g., "Dataset", "Collection")
- tags - Filter by tags. Enter multiple tag names (e.g., "earth science", "noaa", "oceans")
- formats - Filter by resource format. Select one or more formats (e.g., "CSV", "JSON", "PDF", "XML")
- organizationType - Filter by organization type (e.g., "Federal Government", "State", "City")
- organization - Filter by specific organization. Select from the dropdown (e.g., "noaa-gov", "usgs-gov")
- publisher - Filter by publisher. Select from the dropdown
- bureau - Filter by bureau code. Select from the dropdown
- location - Filter by geographic location (e.g., "California", "New York")
- sort - Sort results by relevance, views, or date
⚠️ Important Input Rules:
- Choose One Method: You must use either direct URL scraping OR search filters, not both
- Required Fields:
  - Either startUrl OR searchQuery must be provided
  - Free users can only use the prefill values provided in the input form. To use custom input parameters, please upgrade to a paid plan.
- Mutual Exclusivity:
  - If using startUrl, you cannot use searchQuery or any search filters
  - If using searchQuery, you cannot use startUrl
- Filter Combinations: You can combine multiple filters (topics, formats, organization, etc.) for precise results
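The rules above can be checked before a run is started. The following is a minimal sketch, not part of the Actor: `validateInput` and the `isPaidUser` flag are hypothetical names, and treating `sort` as a search filter is an assumption based on its place in the Advanced Filtering Options list.

```javascript
// Hypothetical helper: checks an input object against the input rules in this README.
// Field names come from the README; the function itself is illustrative only.
const FILTER_FIELDS = [
  'topics', 'topicCategories', 'datasetType', 'tags', 'formats',
  'organizationType', 'organization', 'publisher', 'bureau', 'location', 'sort',
];

function validateInput(input, { isPaidUser = false } = {}) {
  const errors = [];
  const hasUrl = Boolean(input.startUrl);
  const hasQuery = Boolean(input.searchQuery);
  const usedFilters = FILTER_FIELDS.filter((field) => input[field] !== undefined);

  // Exactly one of the two methods must be chosen.
  if (!hasUrl && !hasQuery) errors.push('Provide either startUrl or searchQuery.');
  if (hasUrl && hasQuery) errors.push('Use startUrl or searchQuery, not both.');
  if (hasUrl && usedFilters.length > 0) {
    errors.push(`startUrl cannot be combined with search filters: ${usedFilters.join(', ')}.`);
  }

  // maxItems limits differ between free and paid users.
  if (!isPaidUser) {
    if (input.maxItems === undefined) errors.push('Free users must set maxItems.');
    else if (input.maxItems > 50) errors.push('Free users are limited to maxItems <= 50.');
  } else if (input.maxItems !== undefined && input.maxItems > 1000000) {
    errors.push('maxItems cannot exceed 1,000,000.');
  }
  return errors;
}

// An empty array means the input passed every check.
console.log(validateInput({ searchQuery: 'climate', formats: ['CSV'], maxItems: 50 }));
console.log(validateInput({
  startUrl: 'https://catalog.data.gov/dataset?q=climate',
  formats: ['CSV'],
  maxItems: 100,
}));
```

Returning a list of messages rather than throwing on the first problem lets a caller show every issue at once.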
Here's what the input configuration looks like in JSON:
Example 1: Search with Filters (Recommended)
{
  "searchQuery": "climate",
  "topics": ["climate5434"],
  "formats": ["CSV", "JSON"],
  "organizationType": "Federal Government",
  "maxItems": 50
}
Example 2: Direct URL
{
  "startUrl": "https://catalog.data.gov/dataset?q=climate&groups=climate5434&res_format=CSV",
  "maxItems": 100
}
Example 3: Advanced Multi-Filter Search
{
  "searchQuery": "transportation",
  "topics": ["local", "energy9485"],
  "topicCategories": ["Transportation", "Energy Infrastructure"],
  "tags": ["u.s. department of commerce", "air quality"],
  "formats": ["CSV", "JSON", "PDF"],
  "organizationType": "Federal Government",
  "organization": "noaa-gov",
  "maxItems": 200
}
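If you prefer the direct URL method, a catalog URL like the one in Example 2 can be assembled in code. This sketch uses only query parameters that appear in this README's own URLs (`q`, `groups`, `res_format`, `publisher`). `buildCatalogUrl` is a hypothetical helper, and repeating a parameter for multiple values is an assumption about how the catalog handles them.

```javascript
// Sketch: build a Data.gov catalog URL to pass as startUrl.
// Only parameters seen in this README's example URLs are used; others would be guesses.
function buildCatalogUrl({ query, topics = [], formats = [], publisher } = {}) {
  const url = new URL('https://catalog.data.gov/dataset');
  if (query) url.searchParams.set('q', query);
  // Assumption: multiple values are expressed by repeating the parameter.
  for (const topic of topics) url.searchParams.append('groups', topic);
  for (const format of formats) url.searchParams.append('res_format', format);
  if (publisher) url.searchParams.set('publisher', publisher);
  return url.toString();
}

const startUrl = buildCatalogUrl({
  query: 'climate',
  topics: ['climate5434'],
  formats: ['CSV'],
});
console.log(startUrl);
// https://catalog.data.gov/dataset?q=climate&groups=climate5434&res_format=CSV
```

Using `URL` and `searchParams` rather than string concatenation means spaces and special characters in search terms are encoded automatically.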
Pro Tips:
- 🎯 Use search filters for flexibility - Combine multiple filters to find exactly what you need
- 📊 Start broad, then narrow - Begin with a search query, then add filters to refine results
- 🔍 Use topic filters - Topics help categorize datasets by subject area
- 📁 Filter by format - Get only the data formats you can work with (CSV, JSON, etc.)
- 🏛️ Filter by organization - Focus on specific government agencies or departments
- ⚡ Use direct URLs - If you've already found a specific catalog page, paste the URL directly
Output
After the Actor finishes its run, you'll get a dataset with the output. The length of the dataset depends on the number of results you've set. You can download those results as an Excel, HTML, XML, JSON, or CSV document.
Here's an example of scraped Data.gov data you'll get if you decide to scrape dataset information:
{
  "organizationImage": "https://raw.githubusercontent.com/GSA/logo/refs/heads/master/state_IA.png",
  "title": "Iowa School Performance Profiles",
  "datasetUrl": "https://catalog.data.gov/dataset/iowa-school-performance-profiles",
  "organizationName": "State of Iowa",
  "organizationUrl": "https://catalog.data.gov/organization/about/state-of-iowa",
  "organizationType": "State",
  "publisher": "data.iowa.gov",
  "publisherUrl": "https://catalog.data.gov/dataset?publisher=data.iowa.gov",
  "contact": {"name": "Bryan Bauer", "email": "no-reply@data.iowa.gov"},
  "organizationMission": "State of Iowa",
  "topics": null,
  "availableFormats": ["HTML"],
  "tags": ["school-report-cards", "student-performance"],
  "metadataUpdated": "September 1, 2023",
  "metadataCreated": "January 20, 2023",
  "description": "The Iowa School Performance Profiles is an online tool showing how public schools performed on required measures...",
  "downloadsAndResources": [
    {
      "format": "HTML",
      "url": "https://www.iaschoolperformance.gov/ECP/Home/Index",
      "description": null
    }
  ],
  "accessAndUseInfo": [
    {
      "type": "publicStatus",
      "label": "Public Status",
      "value": "Public: This dataset is intended for public access and use.",
      "url": "https://resources.data.gov/schemas/dcat-us/v1.1/#accessLevel"
    },
    {
      "type": "license",
      "label": "License",
      "value": "License: No license information was provided.",
      "url": null
    }
  ],
  "references": null,
  "metadataSource": [
    {
      "format": "Data.json",
      "heading": "Data.json Metadata",
      "downloadUrl": "https://catalog.data.gov/harvest/object/f8ba9a15-9137-4b9f-bbf8-fd59d89bc825",
      "harvestedFrom": "Iowa metadata"
    }
  ],
  "dates": [
    {"label": "Metadata Created Date", "value": "January 20, 2023"},
    {"label": "Metadata Updated Date", "value": "September 1, 2023"}
  ],
  "additionalMetadata": {
    "Resource Type": "Dataset",
    "Publisher": "data.iowa.gov",
    "Maintainer": "Bryan Bauer",
    "Identifier": "https://data.iowa.gov/api/views/qu5a-5eu4",
    "Data First Published": "2022-11-01",
    "Data Last Modified": "2023-08-30",
    "Category": "Primary & Secondary Ed",
    "Public Access Level": "public"
  },
  "scrapedTimestamp": "2025-12-05T15:37:25.646Z"
}
What You Get:
- 🏛️ Complete Organization Details - Organization names, types, missions, images, and URLs
- 📊 Comprehensive Dataset Info - Titles, descriptions, metadata dates, and identifiers
- 📁 Resource Links - Direct download links for all available formats (CSV, JSON, PDF, etc.)
- 📞 Contact Information - Maintainer names and email addresses for support
- 🏷️ Categorization - Topics, tags, and categories for easy organization
- 📋 Access & Use Info - Public status, licenses, and usage rights
- 🔗 References & Sources - Metadata sources, harvest information, and related links
- 📅 Date Tracking - Creation dates, update dates, and modification timestamps
- 📦 Additional Metadata - Complete technical metadata for advanced analysis
Download Options: CSV, Excel, or JSON formats for easy analysis in your business tools
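Once you have the results as JSON, a few lines of code can pull out just the resource links you care about. This is a rough sketch that assumes records shaped like the sample output above (`title`, `downloadsAndResources`, `format`, `url`). `listDownloads` is a hypothetical helper name.

```javascript
// Sketch: collect download links of the wanted formats from scraped records.
// Field names follow the sample output in this README.
function listDownloads(records, wantedFormats) {
  const wanted = new Set(wantedFormats.map((format) => format.toUpperCase()));
  const links = [];
  for (const record of records) {
    // Assumption: downloadsAndResources may be null, like topics and references in the sample.
    for (const resource of record.downloadsAndResources ?? []) {
      if (resource.url && wanted.has(String(resource.format).toUpperCase())) {
        links.push({ title: record.title, format: resource.format, url: resource.url });
      }
    }
  }
  return links;
}

const sample = [{
  title: 'Iowa School Performance Profiles',
  downloadsAndResources: [
    { format: 'HTML', url: 'https://www.iaschoolperformance.gov/ECP/Home/Index', description: null },
  ],
}];

console.log(listDownloads(sample, ['html']));
```

Format names are compared case-insensitively so that "csv" and "CSV" are treated as the same format.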
Why Choose the Data.gov Scraper?
- 🎯 Comprehensive Data Collection: Get all available dataset information in one scrape - metadata, resources, organizations, and more
- 🔍 Advanced Filtering: Filter by topics, formats, organizations, publishers, and more for precise results
- 📊 Multiple Input Methods: Use direct URLs or search filters - whichever works best for your workflow
- 🏛️ Organization Intelligence: Get complete organization details including missions, contact info, and URLs
- 📁 Format Filtering: Find datasets in the formats you need (CSV, JSON, PDF, XML, etc.)
- 🔗 Direct Download Links: Access direct links to all available resources and datasets
- 📋 Complete Metadata: Get creation dates, update dates, metadata sources, and technical details
- 🚫 No Duplicates: Automatically skips datasets already in your collection
- ⚡ User-Friendly: No coding needed, just input search terms or URLs and go
- 🔄 Parallel Processing: Processes multiple datasets simultaneously for faster results
Time Savings: Save 10-20 hours per week compared to manual government data research
Cost Efficiency: Fraction of the cost of hiring a research assistant or using expensive data services
How to Use
- Sign Up: Create a free account with $5 credit (takes 2 minutes)
- Find the Scraper: Visit the Data.gov Scraper page
- Set Input:
- Option A (Recommended): Enter a search query and select filters
- Option B: Add your direct Data.gov catalog URL
- Set max items (optional)
- Run It: Click "Start" and let it collect your data
- Download Data: Get your results in the "Dataset" tab as CSV, Excel, or JSON
Total Time: 3 minutes setup, 10-30 minutes for data collection
No Technical Skills Required: Everything is point and click
Business Use Cases
📊 Researchers & Academics:
- Collect government datasets for research projects
- Track updates to datasets over time
- Build comprehensive government data databases
- Analyze policy impacts using government data
🏛️ Government Contractors:
- Monitor new government data releases
- Track data updates from specific agencies
- Identify data sources for proposals
- Stay informed about government data initiatives
📰 Journalists & Media:
- Find government data for investigative reporting
- Track data releases from specific agencies
- Monitor updates to important datasets
- Build data driven stories with government sources
💼 Policy Analysts:
- Analyze policy data across multiple agencies
- Track policy implementation through data
- Compare datasets across different time periods
- Build policy impact assessments
📈 Data Analysts:
- Build comprehensive government data catalogs
- Create automated data monitoring systems
- Integrate government data into business intelligence tools
- Support data driven decision making
🔬 Market Researchers:
- Access government economic and market data
- Track industry trends through government datasets
- Analyze market conditions using official data
- Support business planning with government statistics
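Several of the use cases above involve tracking dataset updates over time. One way to approach that, sketched here under assumptions, is to keep the output of a previous run and compare it with a newer one. Records are matched on `datasetUrl` and compared on the `metadataUpdated` string, both taken from the sample output. `diffRuns` is an illustrative name, not something the Actor provides.

```javascript
// Sketch: compare two runs' outputs to spot new and updated datasets.
function diffRuns(previous, current) {
  // Index the earlier run by dataset URL for quick lookups.
  const before = new Map(previous.map((record) => [record.datasetUrl, record.metadataUpdated]));
  const added = [];
  const updated = [];
  for (const record of current) {
    if (!before.has(record.datasetUrl)) {
      added.push(record.datasetUrl);
    } else if (before.get(record.datasetUrl) !== record.metadataUpdated) {
      updated.push(record.datasetUrl);
    }
  }
  return { added, updated };
}

const lastWeek = [
  { datasetUrl: 'https://catalog.data.gov/dataset/a', metadataUpdated: 'January 20, 2023' },
];
const today = [
  { datasetUrl: 'https://catalog.data.gov/dataset/a', metadataUpdated: 'September 1, 2023' },
  { datasetUrl: 'https://catalog.data.gov/dataset/b', metadataUpdated: 'September 1, 2023' },
];

console.log(diffRuns(lastWeek, today));
```

Comparing the date strings for equality avoids parsing the "September 1, 2023" format, at the cost of not knowing which of two dates is newer.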
Using Data.gov Scraper with the Apify API
For advanced users who want to automate this process, you can control the scraper programmatically with the Apify API. This allows you to schedule regular data collection and integrate with your existing business tools.
Example API Usage:
// Node.js example
const { ApifyClient } = require('apify-client');

const client = new ApifyClient({ token: 'YOUR_API_TOKEN' });

// await is only valid inside an async function in CommonJS scripts
(async () => {
  // Run with search filters
  await client.actor('YOUR_ACTOR_ID').call({
    searchQuery: "climate",
    topics: ["climate5434"],
    formats: ["CSV", "JSON"],
    organizationType: "Federal Government",
    maxItems: 50,
  });

  // Run with direct URL
  await client.actor('YOUR_ACTOR_ID').call({
    startUrl: "https://catalog.data.gov/dataset?q=climate&groups=climate5434",
    maxItems: 100,
  });
})();
- Node.js: Install the apify-client NPM package
- Python: Use the apify-client PyPI package
- See the Apify API reference for full details
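Starting a run is usually only half of the job; you will typically want the scraped items as well. The sketch below starts a run, waits for it to finish, and reads the items from the run's default dataset. `runAndFetch` is a hypothetical wrapper, and the client is passed in as a parameter so the function can be tried with a stub before using a real token.

```javascript
// Sketch: run the Actor and return the scraped items.
// `client` is expected to be an ApifyClient instance from the apify-client package.
async function runAndFetch(client, actorId, input) {
  // call() resolves once the run has finished.
  const run = await client.actor(actorId).call(input);
  if (run.status !== 'SUCCEEDED') {
    throw new Error(`Run ${run.id} ended with status ${run.status}`);
  }
  // Results are stored in the run's default dataset.
  const { items } = await client.dataset(run.defaultDatasetId).listItems();
  return items;
}

// Usage with the real client (requires `npm install apify-client` and an API token):
// const { ApifyClient } = require('apify-client');
// const client = new ApifyClient({ token: process.env.APIFY_TOKEN });
// runAndFetch(client, 'YOUR_ACTOR_ID', { searchQuery: 'climate', maxItems: 50 })
//   .then((items) => console.log(`Fetched ${items.length} datasets`));
```

Checking `run.status` before reading the dataset keeps a failed or aborted run from being mistaken for an empty result set.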
Frequently Asked Questions
Q: How accurate is the data? A: The scraper collects data directly from Data.gov's official website at run time, so results reflect the catalog as it was when the run executed.
Q: Can I filter by multiple topics or formats? A: Yes! You can select multiple topics, topic categories, tags, and formats to get datasets that match any of your selected criteria.
Q: What's the difference between using startUrl and search filters?
A: startUrl lets you use a direct Data.gov catalog URL you've already found, while search filters let you build a search from scratch. Both methods work great - choose the one that fits your workflow.
Q: How do I find organization slugs for the organization filter?
A: Visit Data.gov's organizations page to browse all available organizations. The organization slug is in the URL (e.g., noaa-gov from https://catalog.data.gov/organization/noaa-gov).
Q: Can I schedule regular runs? A: Yes! Use the Apify API to schedule daily, weekly, or monthly runs automatically. Perfect for ongoing government data monitoring and research.
Q: What if I need help? A: Our support team is available 24/7. Contact us through the Apify platform.
Q: Is my data secure? A: Absolutely. All data is encrypted in transit and at rest. We never share your data with third parties.
Q: How many datasets can I scrape? A: Free users can scrape up to 50 datasets per run. Paid users can scrape up to 1,000,000 datasets or leave maxItems empty for unlimited scraping.
Integrate Data.gov Scraper with any app and automate your workflow
Last but not least, Data.gov Scraper can be connected with almost any cloud service or web app thanks to integrations on the Apify platform.
Alternatively, you can use webhooks to carry out an action whenever an event occurs, e.g. get a notification whenever Data.gov Scraper successfully finishes a run.
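As a rough illustration of the webhook idea, the function below decides whether an incoming payload should trigger a notification. It assumes the payload has an `eventType` string such as `ACTOR.RUN.SUCCEEDED` and a `resource` object describing the run. Verify both against your own webhook configuration, since payload templates can be customized. `handleWebhook` is a hypothetical name.

```javascript
// Sketch: turn a webhook payload into a notification decision.
// Assumption: the payload carries `eventType` and a `resource` object for the run.
function handleWebhook(payload) {
  if (payload.eventType !== 'ACTOR.RUN.SUCCEEDED') {
    return { notify: false, reason: `ignored event ${payload.eventType}` };
  }
  const run = payload.resource ?? {};
  return {
    notify: true,
    message: `Data.gov Scraper run ${run.id} finished; dataset ${run.defaultDatasetId} is ready.`,
  };
}

console.log(handleWebhook({
  eventType: 'ACTOR.RUN.SUCCEEDED',
  resource: { id: 'RUN_ID', defaultDatasetId: 'DATASET_ID' },
}));
```

Keeping the decision logic in a plain function makes it easy to plug into whichever HTTP framework receives the webhook request.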
🔗 Recommended Actors
Looking for more data collection tools? Check out these related actors:
| Actor | Description | Link |
|---|---|---|
| GSA eLibrary Scraper | Collects government publications and documents from GSA eLibrary | https://apify.com/parseforge/gsa-elibrary-scraper |
| FINRA BrokerCheck Scraper | Extracts financial broker and advisor information from FINRA | https://apify.com/parseforge/finra-brokercheck-scraper |
| FAA Aircraft Registry (N-Number) Scraper | Collects aircraft registration and ownership data from FAA | https://apify.com/parseforge/faa-aircraft-registry-scraper |
| Greatschools Scraper | Extracts school information and ratings from GreatSchools.org | https://apify.com/parseforge/greatschools-scraper |
| PR Newswire Scraper | Collects press releases and news content from PR Newswire | https://apify.com/parseforge/pr-newswire-scraper |
Pro Tip: 💡 Browse our complete collection of data collection actors to find the perfect tool for your business needs.
Need Help? Our support team is here to help you get the most out of this tool.
⚠️ Disclaimer: This Actor is an independent tool and is not affiliated with, endorsed by, or sponsored by Data.gov, the U.S. General Services Administration (GSA), or any of its subsidiaries. All trademarks mentioned are the property of their respective owners.