Pricing

Pay per event

USA HealthData.gov HHS Open Data Scraper

Collect health data catalog information from HealthData.gov. Filter by category, tags, view type, authority, and search terms to find exactly what you need. Perfect for researchers, data analysts, and healthcare professionals who need to discover and access public health datasets efficiently.

Rating: 0.0 (0)

Developer: ParseForge

Last modified: 6 days ago

📊 USA HealthData.gov HHS Open Data Scraper

🚀 Collect health datasets, stories, charts, and maps from the U.S. HHS Open Data Catalog in seconds. Filter by category, tags, view type, and authority. No coding, no API keys required.

The HealthData.gov Scraper automates the discovery and collection of health datasets from the U.S. Department of Health and Human Services Open Data Catalog. Each record includes the dataset name, unique ID, description, publisher, contact information, categories, tags, view and download counts, license details, file download links, and timestamps. You can filter by keyword, category (CDC, FDA, CMS, NIH, HHS), tags, view type (datasets, stories, charts, maps, files), authority (official or community), and sort order. Free users can collect up to 10 items per run, while paid users can retrieve up to 1,000,000 records.

Whether you are a healthcare researcher tracking new CDC datasets, a data scientist building a catalog of open health data, or a policy analyst monitoring HHS publications, this tool eliminates the manual browsing that HealthData.gov requires. Results export to JSON, CSV, or Excel, making it easy to load records into your database, BI tool, or analysis pipeline. Schedule recurring runs to automatically detect new datasets as they are published. The scraper handles pagination, normalizes metadata fields, and processes multiple content types including datasets, stories, charts, maps, and downloadable files.
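If you schedule recurring runs and export each one as JSON, spotting newly published datasets comes down to comparing `datasetId` values between two exports. The snippet below is a minimal sketch of that comparison, not part of the Actor itself. It assumes each export is a list of record objects keyed by `datasetId`, and the sample records are illustrative placeholders rather than real catalog entries.

```python
import json
from pathlib import Path


def load_export(path):
    """Load a JSON export (assumed to be a list of record objects) from disk."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def find_new_records(previous, current):
    """Return records in `current` whose datasetId does not appear in `previous`."""
    seen_ids = {rec.get("datasetId") for rec in previous}
    return [rec for rec in current if rec.get("datasetId") not in seen_ids]


# Illustrative placeholder records, not real catalog entries.
last_run = [{"datasetId": "aaaa-1111", "datasetName": "Example A"}]
this_run = [
    {"datasetId": "aaaa-1111", "datasetName": "Example A"},
    {"datasetId": "bbbb-2222", "datasetName": "Example B"},
]

new_records = find_new_records(last_run, this_run)
print([rec["datasetName"] for rec in new_records])  # ['Example B']
```

In practice you would replace the two sample lists with `load_export(...)` calls pointing at the files you downloaded from consecutive runs.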

Target Audience	Use Cases
Healthcare Researchers	Discover and catalog open health datasets for analysis
Data Scientists	Build metadata indexes of available HHS data sources
Policy Analysts	Monitor new publications from CDC, FDA, CMS, and NIH
Public Health Teams	Track epidemiological datasets and surveillance data
Journalists	Find health data for investigative reporting
Academic Institutions	Locate research datasets for grant-funded projects

📋 What the HealthData.gov Scraper does

📝 Dataset names and IDs - capture the title, unique identifier, and description for every item in the HHS catalog
🔗 Direct URLs - collect working links to each dataset page for quick access and verification
📊 Engagement metrics - pull view counts and download counts to identify the most popular datasets
👤 Publisher and contact info - identify which health authority published the data and how to reach them
🏷️ Categories and tags - classify items by health topic, authority (CDC, FDA, CMS), and custom tags
📁 File downloads - extract download links with format and size information for each available file

The scraper connects to the HealthData.gov catalog API and iterates through results using your specified filters. It processes datasets, stories, charts, maps, files, and calendars. Each record is normalized with consistent field names and pushed to an Apify dataset in real time. The tool supports both URL-based browsing (paste a HealthData.gov browse URL) and filter-based searching (set keywords and categories directly).
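To give a feel for what "iterating through results" means, here is a generic offset-based pagination loop. It is a sketch under stated assumptions and not the Actor's actual implementation: `fetch_page` is a hypothetical callable standing in for whatever request the scraper makes, and the in-memory `fake_catalog` replaces the real API so the example runs offline.

```python
def iterate_catalog(fetch_page, page_size=100, max_items=None):
    """Yield records page by page until the source is empty or max_items is reached.

    `fetch_page(offset, limit)` is a hypothetical callable that returns a list
    of records; it is an assumption made for this illustration.
    """
    offset = 0
    yielded = 0
    while max_items is None or yielded < max_items:
        page = fetch_page(offset, page_size)
        if not page:
            break  # no more results
        for record in page:
            if max_items is not None and yielded >= max_items:
                break
            yield record
            yielded += 1
        offset += len(page)


# In-memory stand-in for the catalog (placeholder IDs, not real datasets).
fake_catalog = [{"datasetId": f"id-{i}"} for i in range(25)]


def fake_fetch(offset, limit):
    """Return one slice of the fake catalog, mimicking a paged response."""
    return fake_catalog[offset:offset + limit]


first_ten = list(iterate_catalog(fake_fetch, page_size=4, max_items=10))
everything = list(iterate_catalog(fake_fetch, page_size=4))
print(len(first_ten), len(everything))  # 10 25
```

The `max_items` cap mirrors the idea of a per-run item limit, such as the 10-item limit for free users.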

💡 Why it matters: HealthData.gov hosts thousands of datasets from dozens of health agencies. Manually browsing and cataloging this content is time-consuming. This scraper gives you structured metadata for the entire catalog in minutes.

📊 Data fields

Each record includes: additionalAccessPoints, attribution, averageRating, blobFileSize, blobFilename, blobId, blobMimeType, category, contactEmail, createdAt, dataUpdatedAt, datasetId, datasetName, datasetType, datasetUrl, description, displayType, domain, downloadCount, isLocked, lastUpdated, metadata, metadataUpdatedAt, numberOfComments, owner, pageViews, provenance, publicationDate, publisher, resourceName, rowsUpdatedAt, scrapedTimestamp, tableId, tags, totalTimesRated, viewCount, viewType. All 37 field names come from a real production run, so what you see here is what lands in your dataset.
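Once the records are exported, fields such as `downloadCount` make it easy to rank datasets by popularity. The following is a minimal sketch that assumes the count may arrive as an integer, a numeric string, or be missing entirely. The real value types are not documented here, so treat that handling as a precaution. The sample records are illustrative placeholders.

```python
def top_by_downloads(records, n=3):
    """Return the n records with the highest downloadCount.

    Missing or non-numeric counts are treated as 0 (an assumption, since the
    exact value types in the export are not documented).
    """
    def downloads(rec):
        try:
            return int(rec.get("downloadCount"))
        except (TypeError, ValueError):
            return 0

    return sorted(records, key=downloads, reverse=True)[:n]


# Illustrative placeholder records, not real catalog entries.
records = [
    {"datasetName": "Example A", "downloadCount": 120},
    {"datasetName": "Example B", "downloadCount": None},
    {"datasetName": "Example C", "downloadCount": "45"},
    {"datasetName": "Example D", "downloadCount": 300},
]

top_two = top_by_downloads(records, n=2)
print([rec["datasetName"] for rec in top_two])  # ['Example D', 'Example A']
```

The same pattern works for `viewCount` or `pageViews` by swapping the field name.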

⚠️ Good to Know: Free users are automatically limited to 10 items per run. Use either startUrl or the search filters (q, category, tags), not both in the same run. The limitTo field restricts results to specific content types, such as datasets or charts.
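If you assemble the run input programmatically, a small helper can enforce the either-or rule before the run starts. This is a sketch, not the Actor's official input schema. The keys `startUrl`, `q`, `category`, `tags`, and `limitTo` come from the note above, while `maxItems`, the list-shaped values for `tags` and `limitTo`, and the example browse URL are assumptions made for illustration.

```python
def build_run_input(start_url=None, q=None, category=None, tags=None,
                    limit_to=None, max_items=10):
    """Build an input dict, rejecting startUrl combined with search filters."""
    filters = {"q": q, "category": category, "tags": tags}
    used_filters = {key: value for key, value in filters.items() if value}

    if start_url and used_filters:
        raise ValueError(
            "Use either startUrl or the search filters (q, category, tags), not both."
        )

    run_input = {"startUrl": start_url} if start_url else dict(used_filters)
    if limit_to:
        run_input["limitTo"] = limit_to  # value shape (list of strings) is assumed
    run_input["maxItems"] = max_items  # hypothetical key name for the item cap
    return run_input


filter_input = build_run_input(q="vaccination", category="CDC", limit_to=["datasets"])
print(filter_input)

try:
    # Placeholder browse URL, used only to trigger the validation error.
    build_run_input(start_url="https://healthdata.gov/browse", q="vaccination")
except ValueError as err:
    print(f"Rejected: {err}")
```

Validating up front avoids spending a run on an input combination the Actor is documented not to support.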

🚀 How to use

1. Sign up - Create a free Apify account with $5 credit
2. Find the Actor - Search for "HealthData.gov Scraper" in the Apify Store
3. Configure your search - Set keywords, category, content type, and max items
4. Start the run - Click "Start" and watch results appear in real time
5. Export your data - Download as JSON, CSV, or Excel from the dataset tab

🕒 Typical run time: 30 seconds to 2 minutes for up to 50 items. Larger runs with 500+ items may take 5 to 15 minutes.

🔗 Recommended Actors

Actor	Description
GSA eLibrary Scraper	Collect government contractor and vendor data from the GSA eLibrary
USAspending Scraper	Extract federal spending data and contract information
PR Newswire Scraper	Collect press releases and news articles from PR Newswire
FINRA BrokerCheck Scraper	Search broker and firm registration data from the FINRA registry
FAA Aircraft Registry Scraper	Look up aircraft registration records by N-number from the FAA

💡 Pro Tip: Combine the HealthData.gov Scraper with the USAspending Scraper to cross-reference health datasets with federal health spending records.

Disclaimer: This Actor is an independent tool and is not affiliated with, endorsed by, or sponsored by the U.S. Department of Health and Human Services, HealthData.gov, CDC, FDA, CMS, or NIH. All trademarks mentioned are the property of their respective owners.

🆘 Need Help?

If you hit a bug, have questions about setup, or need a scraper we haven't built yet, open our contact form or write to parseforge@protonmail.com. We also take on paid custom data projects.

For faster answers, join our Discord. It's the best place to get support and suggest new actors.

Data.gov Dataset Search

ryanclinton/datagov-dataset-search

Search and extract metadata from 300,000+ datasets in the official United States government open data catalog at [Data.gov](https://catalog.data.gov/).

Ryan Clinton

Data.gov Catalog Scraper

crawlergang/data-gov-catalog-scraper

Scrape the Data.gov catalog (catalog.data.gov). Search 300,000+ open government datasets by keyword, organization, and format. Fetch dataset details or list organizations. No API key required.

Crawler Gang

5.0

Data.gov Catalog Scraper

crawlerbros/data-gov-catalog-scraper

Scrape the Data.gov catalog (catalog.data.gov). Search 300,000+ open government datasets by keyword, organization, and format. Fetch dataset details or list organizations. No API key required.

Crawler Bros

Data.gov API - US Open Government Datasets

alizarin_refrigerator-owner/data-gov-api---us-open-government-datasets

Access the Data.gov catalog of 300,000+ US government datasets. Search datasets by topic, agency, format, and keywords. Discover open data from federal, state, and local governments.

The Howlers

Public Health Intelligence MCP Server

martc03/public-health-mcp

MCP server for public health data. Gives AI assistants access to CDC datasets and WHO Global Health Observatory indicators for epidemiological research.

CoDee

Data Gov Catalog Scraper

fortuitous_pirate/data-gov-catalog-scraper

Search, filter, and download metadata for 300,000+ federal open datasets from Data.gov. Filter by agency (NASA, EPA, NOAA), format (CSV, JSON, API), tags, and topics. Returns dataset details, resource links, and organization info.

Fortuitous Pirate

Data.gov.uk Scraper

parseforge/data-gov-uk-scraper

Collect UK government open data effortlessly. Extract datasets, publishers, formats, topics, licenses, and download links from data.gov.uk — the official UK open data portal. Perfect for researchers, policy analysts, and developers building data catalogs.

ParseForge

5.0

Open Data Portal Harvester

datapilot/open-data-portal-harvester

Search and collect datasets from leading government open data portals with a single keyword. This Actor searches Data.gov (US) and CKAN-powered portals like Data.gov.uk to discover publicly available datasets, metadata, download links, licensing information, organizations, and update dates.