Hospital Compare Medicare Scraper
Pricing
Pay per event
Extract CMS Care Compare hospital locations, ownership, ratings, and Medicare quality fields from the public CMS Provider Data API.
Developer
Stas Persiianenko
Extract CMS Care Compare / Medicare Hospital Compare data from the public CMS Provider Data API. Get hospital names, CCNs, addresses, ownership, emergency services, overall ratings, quality fields, and source metadata in a clean Apify dataset.
What does Hospital Compare Medicare Scraper do?
Hospital Compare Medicare Scraper downloads structured hospital/provider records from CMS Provider Data. It is built for repeatable healthcare data workflows where teams need current public Medicare hospital data without manually downloading CSV files.
The default run uses the CMS Hospital General Information dataset (xubh-q36u). Advanced users can provide any compatible CMS Provider Data dataset ID and still receive normalized fields plus the full CMS row in cmsRow.
Who is it for?
- Healthcare analytics teams building hospital benchmarks
- Payers and provider-network teams checking facility coverage
- Consultants preparing CMS quality and compliance reports
- Hospital strategy teams tracking ratings and ownership data
- SEO and reputation teams creating local hospital directories
- Compliance teams that need repeatable public-source evidence
Why use this scraper?
CMS data is public, but exporting it by hand is repetitive. This actor turns the CMS Provider Data API into a reusable Apify workflow with filters, normalized output, pay-per-event charging, API access, scheduling, webhooks, and integrations.
Data source
The actor uses CMS Provider Data API endpoints under:
https://data.cms.gov/provider-data/api/1
Default dataset:
xubh-q36u: Hospital General Information
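For readers who want to query CMS directly, here is a minimal sketch of how paged query URLs for this endpoint could be built. The `limit` and `offset` query parameters and the helper name `page_urls` are assumptions made for illustration; they are not a confirmed description of how the actor pages through results.

```python
from urllib.parse import urlencode

BASE_URL = "https://data.cms.gov/provider-data/api/1"


def page_urls(dataset_id: str, max_items: int, page_size: int) -> list[str]:
    """Build one query URL per page until max_items rows are covered.

    Assumption: the datastore endpoint accepts `limit` and `offset`.
    """
    if max_items < 0 or page_size <= 0:
        raise ValueError("max_items must be >= 0 and page_size must be > 0")
    urls = []
    for offset in range(0, max_items, page_size):
        # The last page may be shorter than page_size.
        limit = min(page_size, max_items - offset)
        query = urlencode({"limit": limit, "offset": offset})
        urls.append(f"{BASE_URL}/datastore/query/{dataset_id}/0?{query}")
    return urls
```

For example, `page_urls("xubh-q36u", 250, 100)` yields three URLs, the last one requesting the remaining 50 rows.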
Example use cases
- Export all hospitals in a state for market sizing
- Monitor overall hospital ratings for local competitors
- Build a provider directory with CCNs and contact details
- Join CMS hospital data with internal claims or network data
- Pull raw CMS rows for a downstream BI warehouse
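The join use case above depends on matching CCNs exactly, and spreadsheets often strip their leading zeros. The following is a rough sketch of a left join on CCN, assuming six-character CCNs like the sample `010001`; the `internal_key` column name and both helper names are hypothetical.

```python
def normalize_ccn(value) -> str:
    """Return a CCN as a 6-character string, restoring lost leading zeros."""
    return str(value).strip().zfill(6)


def join_on_ccn(cms_records, internal_rows, internal_key="ccn"):
    """Left-join actor output to internal rows by CCN.

    `internal_key` is a hypothetical column name in your own data.
    """
    internal_by_ccn = {normalize_ccn(row[internal_key]): row for row in internal_rows}
    joined = []
    for record in cms_records:
        match = internal_by_ccn.get(normalize_ccn(record["providerCcn"]), {})
        # CMS fields win when both sides share a key.
        joined.append({**match, **record})
    return joined
```

Keeping the CCN as a string on both sides avoids silent mismatches such as `10001` versus `010001`.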
How much does it cost to scrape CMS hospital compare data?
Pricing is pay-per-event. A small start fee is charged per run and a per-record fee is charged for each hospital/provider record saved. The default input is set to 100 records so first runs stay inexpensive.
Use maxItems to control cost. For large nationwide exports, start with a state filter or a few hundred records, verify the output, then increase the limit.
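As a back-of-the-envelope check before a large export, the pricing model described above can be sketched as a start fee plus a per-record fee, capped by `maxItems`. The fee values passed in are placeholders, not the actor's actual prices; take the real numbers from the pricing page.

```python
def estimate_run_cost(available_records: int, max_items: int,
                      start_fee: float, per_record_fee: float) -> float:
    """Estimate one run's charge: start fee plus a fee per saved record.

    The fee arguments are placeholders; use the actor's published prices.
    """
    # The run saves at most max_items records, even if more match the filters.
    saved = min(available_records, max_items)
    return start_fee + saved * per_record_fee
```

Comparing the estimate for `max_items=100` against a nationwide limit shows how much of the cost comes from the per-record fee.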
Input options
| Field | Description |
|---|---|
| datasetPreset | Choose hospital_general or raw_dataset |
| datasetId | CMS Provider Data dataset ID when using raw_dataset |
| maxItems | Maximum records saved to the dataset |
| state | Optional two-letter state filter |
| facilityNameContains | Optional hospital name substring filter |
| facilityIds | Optional list of CMS CCNs / facility IDs |
| pageSize | CMS API page size |
| includeRawRecord | Include the full CMS row in rawRecord |
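When inputs are assembled in code, a small helper can catch obvious mistakes before a run is started. This is only a sketch: the field names come from the table above, while the helper name `build_run_input` and the choice to upper-case the state code are assumptions.

```python
def build_run_input(state=None, facility_ids=None, max_items=100,
                    dataset_preset="hospital_general", dataset_id=None):
    """Assemble an actor input dict and catch obvious mistakes early."""
    if dataset_preset not in ("hospital_general", "raw_dataset"):
        raise ValueError("datasetPreset must be hospital_general or raw_dataset")
    if dataset_preset == "raw_dataset" and not dataset_id:
        raise ValueError("raw_dataset requires a datasetId")
    run_input = {"datasetPreset": dataset_preset, "maxItems": max_items}
    if dataset_id:
        run_input["datasetId"] = dataset_id
    if state:
        state = state.strip().upper()
        if len(state) != 2 or not state.isalpha():
            raise ValueError("state must be a two-letter code, e.g. CA")
        run_input["state"] = state
    if facility_ids:
        # Keep CCNs as strings so leading zeros survive.
        run_input["facilityIds"] = [str(ccn) for ccn in facility_ids]
    return run_input
```

The returned dict can be passed as `run_input` to the Python client call shown under API usage.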
Output fields
| Field | Description |
|---|---|
| datasetId | CMS dataset identifier |
| datasetPreset | Selected preset |
| sourceDatasetUrl | CMS API URL used as source |
| providerCcn | CMS certification number / facility ID |
| facilityName | Hospital or facility name |
| address, city, state, zipCode, county | Location fields |
| phone | Telephone number |
| hospitalType | CMS hospital type |
| hospitalOwnership | Ownership category |
| emergencyServices | Emergency services flag |
| overallRating | Overall hospital rating when present |
| measureId, measureName, score | Measure fields for compatible datasets |
| comparedToNational | National comparison where present |
| footnote | CMS footnote text/code where present |
| reportingPeriod | Reporting period/date field where present |
| cmsRow | Complete source CMS row |
| scrapedAt | ISO timestamp of extraction |
Sample output
{
  "datasetId": "xubh-q36u",
  "datasetPreset": "hospital_general",
  "providerCcn": "010001",
  "facilityName": "SOUTHEAST HEALTH MEDICAL CENTER",
  "city": "DOTHAN",
  "state": "AL",
  "hospitalType": "Acute Care Hospitals",
  "overallRating": "4",
  "sourceDatasetUrl": "https://data.cms.gov/provider-data/api/1/datastore/query/xubh-q36u/0",
  "scrapedAt": "2026-06-29T00:00:00.000Z"
}
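Records shaped like the sample above are easy to summarize once downloaded. The sketch below counts records per value of any output field, for example `state` or `hospitalOwnership`; the helper name and the `(missing)` label are illustrative choices.

```python
from collections import Counter


def count_by_field(records, field):
    """Count records per value of `field`, grouping blanks as '(missing)'."""
    return Counter((record.get(field) or "(missing)") for record in records)
```

Calling `count_by_field(records, "overallRating")` also gives a quick view of how many facilities have no rating.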
How to run
- Open the actor on Apify.
- Keep the default hospital_general preset or enter a CMS dataset ID.
- Set maxItems and optional filters.
- Click Start.
- Download the dataset as JSON, CSV, Excel, XML, or RSS.
Tips for best results
- Use state for state-level exports.
- Use facilityIds when you already know the CMS CCNs you need.
- Keep includeRawRecord off unless you need a duplicate raw object.
- For raw CMS datasets, inspect the returned cmsRow to see dataset-specific columns.
Raw CMS dataset mode
Choose raw_dataset and provide a CMS Provider Data dataset ID such as xubh-q36u. The actor still normalizes common hospital fields when matching columns exist and preserves the full row in cmsRow.
Scheduling
Use Apify schedules to refresh hospital data monthly or after CMS data refreshes. Scheduled runs can send datasets to a webhook, cloud storage, or an integration platform.
Integrations
- Send output to Google Sheets for analyst review
- Push JSON to a data warehouse with Apify webhooks
- Use Make or Zapier to notify a team when ratings change
- Feed records into a provider directory or enrichment pipeline
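For the rating-change notification idea, one possible approach is to compare the records from two runs before deciding whether to alert anyone. This sketch assumes both snapshots use the output fields `providerCcn`, `facilityName`, and `overallRating`; the function name and the `from`/`to` keys are hypothetical.

```python
def rating_changes(previous, current):
    """Compare two runs' records and list hospitals whose rating changed.

    Records are matched on providerCcn; hospitals present in only one
    snapshot are ignored.
    """
    old_ratings = {r["providerCcn"]: r.get("overallRating") for r in previous}
    changes = []
    for record in current:
        ccn = record["providerCcn"]
        if ccn not in old_ratings:
            continue
        new_rating = record.get("overallRating")
        if new_rating != old_ratings[ccn]:
            changes.append({
                "providerCcn": ccn,
                "facilityName": record.get("facilityName"),
                "from": old_ratings[ccn],
                "to": new_rating,
            })
    return changes
```

An empty result means no matched hospital changed rating, so a downstream Make or Zapier step could be skipped.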
API usage
Node.js
import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: process.env.APIFY_TOKEN });

const run = await client.actor('automation-lab/hospital-compare-medicare-scraper').call({
    datasetPreset: 'hospital_general',
    state: 'CA',
    maxItems: 100,
});

console.log(run.defaultDatasetId);
Python
from apify_client import ApifyClient
import os

client = ApifyClient(os.environ['APIFY_TOKEN'])

run = client.actor('automation-lab/hospital-compare-medicare-scraper').call(
    run_input={
        'datasetPreset': 'hospital_general',
        'state': 'CA',
        'maxItems': 100,
    }
)

print(run['defaultDatasetId'])
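The Python example above prints only the dataset ID. A likely next step is reading the saved records; the sketch below does that with the client's `dataset(...).iterate_items()` call. The helper name `fetch_run_items` is hypothetical, and loading everything into a list assumes the result is small enough to fit in memory.

```python
def fetch_run_items(client, run):
    """Return all records saved by a finished run as a list of dicts.

    `client` is an apify_client.ApifyClient and `run` is the dict returned
    by `.call()`, as in the example above.
    """
    dataset = client.dataset(run["defaultDatasetId"])
    # iterate_items() pages through the dataset lazily; list() collects it all.
    return list(dataset.iterate_items())
```

For large exports, iterating over `dataset.iterate_items()` directly avoids holding every record at once.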
cURL
curl -X POST "https://api.apify.com/v2/acts/automation-lab~hospital-compare-medicare-scraper/runs?token=$APIFY_TOKEN" \
  -H 'Content-Type: application/json' \
  -d '{"datasetPreset":"hospital_general","state":"CA","maxItems":100}'
MCP access
Use the Apify MCP server with Claude Code or Claude Desktop:
https://mcp.apify.com/?tools=automation-lab/hospital-compare-medicare-scraper
Claude Code setup:
claude mcp add apify-hospital-compare "https://mcp.apify.com/?tools=automation-lab/hospital-compare-medicare-scraper"
Claude Desktop JSON config:
{
  "mcpServers": {
    "apify-hospital-compare": {
      "url": "https://mcp.apify.com/?tools=automation-lab/hospital-compare-medicare-scraper"
    }
  }
}
Example prompts:
- "Run the Hospital Compare Medicare Scraper for California hospitals and summarize ownership types."
- "Extract 250 CMS hospital records and identify facilities with missing ratings."
- "Get hospital records for these CCNs and prepare a CSV export."
Data freshness
The actor fetches directly from CMS Provider Data at run time. Freshness depends on the CMS dataset itself, not a cached copy maintained by the actor.
Limitations
- It extracts public CMS Provider Data only.
- It does not access private Medicare systems or provider portals.
- Raw dataset mode normalizes common columns, but dataset-specific fields remain in cmsRow.
Legality
CMS Provider Data is public government data. You are responsible for using the data in accordance with applicable laws, CMS terms, and your organization's compliance requirements.
FAQ
Can I use this for any CMS Provider Data hospital dataset?
Yes. Use raw_dataset with a CMS Provider Data dataset ID. Common columns are normalized and every original column remains in cmsRow.
Does this scrape private Medicare systems?
No. It extracts public CMS Provider Data only.
Troubleshooting
Why do some rating fields look empty?
CMS may publish blanks, footnotes, or dataset-specific column names. Check cmsRow for the original source fields.
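When ratings arrive as strings, blanks, or footnote text, downstream code usually needs a numeric value or an explicit "no rating". The helpers below are a rough sketch: one converts `overallRating` to an integer or `None`, the other lists `cmsRow` keys that mention "rating" so the source columns can be inspected. Both names are hypothetical, and the actual `cmsRow` column names depend on the CMS dataset.

```python
def parse_overall_rating(record):
    """Return overallRating as an int, or None when it is blank or non-numeric."""
    value = str(record.get("overallRating") or "").strip()
    return int(value) if value.isdecimal() else None


def rating_columns(record):
    """List cmsRow keys that mention 'rating', to inspect the source fields."""
    cms_row = record.get("cmsRow") or {}
    return sorted(key for key in cms_row if "rating" in key.lower())
```

Treating anything non-numeric as `None` keeps footnote text from being mistaken for a score.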
Why did a raw dataset ID fail?
The actor expects CMS Provider Data dataset IDs supported by the /provider-data/api/1/datastore/query/{datasetId}/0 endpoint. Verify the dataset ID on data.cms.gov.
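Before starting a run, a cheap local check can catch pasted titles or URLs used in place of an ID. This is a heuristic sketch that assumes IDs share the shape of the example `xubh-q36u` (four lowercase alphanumerics, a hyphen, four more); that pattern is an assumption, and passing the check does not prove the dataset exists on data.cms.gov.

```python
import re

# Assumption: IDs look like the example "xubh-q36u" (4 + 4 lowercase alphanumerics).
DATASET_ID_PATTERN = re.compile(r"[a-z0-9]{4}-[a-z0-9]{4}")


def looks_like_dataset_id(dataset_id: str) -> bool:
    """Cheap local shape check; it cannot confirm the dataset exists."""
    return DATASET_ID_PATTERN.fullmatch(dataset_id.strip()) is not None
```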
Related scrapers
- https://apify.com/automation-lab/website-contact-finder
- https://apify.com/automation-lab/google-maps-lead-finder
- https://apify.com/automation-lab/business-data-enrichment
Support
If you need another CMS dataset preset or a different healthcare output shape, open an issue from the Apify actor page with an example dataset and desired fields.