CMS Hospital Price Transparency Scraper


Extract hospital standard charges from CMS-mandated machine-readable files (MRF). Parses CMS v1/v2/v3 JSON schemas into rows by billing code (CPT, HCPCS, MS-DRG, NDC) and payer/plan. Fetches hospital identity from the CMS enrollment dataset. Filter by state, CCN, code type, billing code, or payer.

Pricing

Pay per event


Developer

BowTiedRaccoon

Maintained by Community

Scrapes hospital standard charges from machine-readable files mandated by the CMS Hospital Price Transparency rule. Parses CMS JSON schemas (v1, v2, v3) into structured rows by billing code — CPT, HCPCS, MS-DRG, Revenue Code, NDC — payer, and plan. Also retrieves hospital identity data for ~6,000 US hospitals from the CMS enrollment dataset.


CMS Hospital Price Transparency Scraper Features

  • Parses CMS JSON MRF files (v1/v2 flat schema and v3 nested schema) from any user-supplied URL
  • Auto-detects schema version — no configuration needed
  • Returns negotiated rates by payer and plan: dollar amounts, percentages, and algorithm descriptions
  • Returns gross charges, cash/self-pay prices, and de-identified min/max rates
  • Fetches hospital identity records (CCN, NPI, address) from the CMS enrollment dataset
  • Filters by billing code type (CPT, HCPCS, MS-DRG, APR-DRG, RC, NDC, Internal), specific billing code, payer name substring, or state
  • Three modes: mrf_parse for a single file URL, hospital_list for enrollment data, discover_and_parse for the combined pipeline
  • No proxy required: CMS data APIs and GitHub-hosted example files are publicly accessible

Who Uses CMS Hospital Price Transparency Data?

  • Healthcare price-comparison startups — Build comparison tools on top of actual negotiated rates across hospitals and payers
  • Employers and self-funded health plans — Compare in-network rates to negotiate better contracts or choose preferred networks
  • Benefits consultants and brokers — Analyze payer/plan rate variation for clients by procedure code
  • Journalists and researchers — Track compliance, investigate pricing disparities, publish hospital cost analyses
  • Healthcare data vendors — Supplement CMS enrollment records with charge data to build comprehensive hospital intelligence datasets
  • Academic institutions — Study pricing patterns across regions, facility types, and payer mixes

How CMS Hospital Price Transparency Scraper Works

  1. Pick a mode. mrf_parse takes a single MRF URL and parses it. hospital_list pages through the CMS enrollment dataset and returns hospital identity records. discover_and_parse does both.
  2. For MRF parsing, the scraper fetches the JSON file and auto-detects whether it uses the v3 nested schema (with standard_charges[] and payers_information[]) or the v1/v2 flat schema. It handles both without any configuration.
  3. Set optional filters (billing code type, specific code, payer name substring). The scraper applies them during parsing, so only matching rows reach the output.
  4. Results land in the Apify dataset as structured JSON. One row per payer/plan combination per billing code per service setting, which is as granular as the CMS standard requires.
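
Steps 2 and 4 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the actor's actual code: the function names `detect_schema` and `flatten_nested` are hypothetical, and the key names `standard_charge_information`, `code_information`, `gross_charge`, `discounted_cash`, and `standard_charge_dollar` are assumed here for illustration. Only `standard_charges` and `payers_information` are named above. The sample values are made up.

```python
from typing import Any, Iterator


def detect_schema(doc: dict[str, Any]) -> str:
    """Return "nested" if any item carries a standard_charges[] list, else "flat"."""
    items = doc.get("standard_charge_information", [])  # assumed top-level key
    if any(isinstance(item.get("standard_charges"), list) for item in items):
        return "nested"
    return "flat"


def flatten_nested(doc: dict[str, Any]) -> Iterator[dict[str, Any]]:
    """Yield one row per billing code, per service setting, per payer/plan.

    Charges without payers_information yield no rows in this sketch.
    """
    for item in doc.get("standard_charge_information", []):
        for code in item.get("code_information", []):
            for charge in item.get("standard_charges", []):
                for payer in charge.get("payers_information", []):
                    yield {
                        "billing_code": code.get("code", ""),
                        "billing_code_type": code.get("type", ""),
                        "description": item.get("description", ""),
                        "setting": charge.get("setting", ""),
                        "payer_name": payer.get("payer_name", ""),
                        "plan_name": payer.get("plan_name", ""),
                        "standard_charge_gross": charge.get("gross_charge"),
                        "standard_charge_discounted_cash": charge.get("discounted_cash"),
                        "standard_charge_negotiated_dollar": payer.get("standard_charge_dollar"),
                        "record_type": "charge_row",
                    }


# Made-up sample: one code, one setting, two payers.
sample = {
    "hospital_name": "EXAMPLE REGIONAL MEDICAL CENTER",
    "standard_charge_information": [
        {
            "description": "MRI Brain without contrast",
            "code_information": [{"code": "70551", "type": "CPT"}],
            "standard_charges": [
                {
                    "setting": "outpatient",
                    "gross_charge": 4200,
                    "discounted_cash": 1890,
                    "payers_information": [
                        {"payer_name": "Aetna", "plan_name": "Aetna PPO Standard",
                         "standard_charge_dollar": 1240},
                        {"payer_name": "Cigna", "plan_name": "Cigna Open Access",
                         "standard_charge_dollar": 1310},
                    ],
                }
            ],
        }
    ],
}

rows = list(flatten_nested(sample))
```

With this sample, `rows` holds two entries, one per payer/plan, both sharing the same billing code, setting, gross charge, and cash price.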

Input

{
  "mode": "mrf_parse",
  "mrfUrl": "https://example-hospital.com/standard-charges.json",
  "billingCodeType": "CPT",
  "maxItems": 1000,
  "sp_intended_usage": "Rate comparison for employer health plan negotiation",
  "sp_improvement_suggestions": "None"
}

| Field | Type | Default | Description |
|---|---|---|---|
| mode | string | mrf_parse | mrf_parse parses a single MRF URL. hospital_list returns CMS enrollment records. discover_and_parse combines both. |
| mrfUrl | string | | MRF JSON URL to fetch and parse. Required for mrf_parse mode. |
| stateFilter | string | | Two-letter state code (e.g. CA, TX). Filters hospital list results. |
| hospitalCcn | string | | CMS Certification Number to filter to a single hospital. |
| billingCodeType | string | | Filter by code system: CPT, HCPCS, MS-DRG, APR-DRG, RC, NDC, or Internal. Leave blank for all. |
| billingCode | string | | Specific billing code to filter (e.g. 70551). Leave blank for all. |
| payerFilter | string | | Case-insensitive payer name substring filter. |
| maxItems | integer | 15 | Maximum records to return. 0 = unlimited. |
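
The filter semantics described in the table can be sketched roughly as below. This is an illustration only: `apply_filters` is a hypothetical helper, not part of the actor, and exact (case-sensitive) matching on code type and billing code is an assumption. The case-insensitive payer substring match and the "0 = unlimited" rule come from the field descriptions.

```python
from itertools import islice
from typing import Any, Iterable, Iterator


def apply_filters(
    rows: Iterable[dict[str, Any]],
    billing_code_type: str = "",
    billing_code: str = "",
    payer_filter: str = "",
    max_items: int = 15,
) -> Iterator[dict[str, Any]]:
    """Lazily yield rows that pass every non-blank filter, capped at max_items."""

    def matches(row: dict[str, Any]) -> bool:
        # Blank filters are skipped ("Leave blank for all").
        if billing_code_type and row.get("billing_code_type") != billing_code_type:
            return False
        if billing_code and row.get("billing_code") != billing_code:
            return False
        # Payer filter is a case-insensitive substring match.
        if payer_filter and payer_filter.lower() not in row.get("payer_name", "").lower():
            return False
        return True

    matched = (row for row in rows if matches(row))
    # max_items == 0 means unlimited.
    return islice(matched, max_items) if max_items > 0 else matched


# Made-up rows for illustration.
example_rows = [
    {"billing_code": "70551", "billing_code_type": "CPT", "payer_name": "Aetna"},
    {"billing_code": "70551", "billing_code_type": "CPT", "payer_name": "Cigna"},
    {"billing_code": "470", "billing_code_type": "MS-DRG", "payer_name": "Aetna"},
]

aetna_cpt = list(apply_filters(example_rows, billing_code_type="CPT", payer_filter="aet"))
```

Filtering while iterating, rather than after collecting every row, mirrors the point that only matching rows reach the output.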

Hospital List Mode Input

{
  "mode": "hospital_list",
  "stateFilter": "TX",
  "maxItems": 500,
  "sp_intended_usage": "Building a hospital database for Texas",
  "sp_improvement_suggestions": "None"
}

CMS Hospital Price Transparency Scraper Output Fields

MRF Parse Mode

Returns one row per payer/plan/billing code combination.

{
  "hospital_name": "EXAMPLE REGIONAL MEDICAL CENTER",
  "hospital_ccn": "",
  "hospital_npi": "",
  "hospital_address": "",
  "hospital_city": "",
  "hospital_state": "",
  "hospital_zip": "",
  "mrf_url": "https://example-hospital.com/standard-charges.json",
  "mrf_version": "3.0.0",
  "mrf_last_updated": "2025-01-15",
  "billing_code": "70551",
  "billing_code_type": "CPT",
  "description": "MRI Brain without contrast",
  "payer_name": "Aetna",
  "plan_name": "Aetna PPO Standard",
  "setting": "outpatient",
  "methodology": "fee schedule",
  "standard_charge_gross": 4200,
  "standard_charge_discounted_cash": 1890,
  "standard_charge_negotiated_dollar": 1240,
  "standard_charge_negotiated_percentage": null,
  "standard_charge_negotiated_algorithm": "",
  "standard_charge_min": 980,
  "standard_charge_max": 1600,
  "estimated_amount": 1240,
  "additional_payer_notes": "",
  "record_type": "charge_row"
}

| Field | Type | Description |
|---|---|---|
| hospital_name | string | Hospital name from MRF header |
| hospital_ccn | string | CMS Certification Number (populated in hospital_list mode) |
| hospital_npi | string | National Provider Identifier |
| hospital_address | string | Street address |
| hospital_city | string | City |
| hospital_state | string | State abbreviation |
| hospital_zip | string | ZIP code |
| mrf_url | string | Source MRF file URL |
| mrf_version | string | CMS schema version (e.g. 3.0.0) |
| mrf_last_updated | string | Date the MRF was last updated |
| billing_code | string | Billing code (e.g. 70551) |
| billing_code_type | string | Code system: CPT, HCPCS, MS-DRG, RC, NDC, Internal |
| description | string | Service or item description |
| payer_name | string | Payer name |
| plan_name | string | Plan name |
| setting | string | Service setting: inpatient, outpatient, or both |
| methodology | string | Rate methodology (fee schedule, percent of total billed charges, etc.) |
| standard_charge_gross | number | Gross / chargemaster price |
| standard_charge_discounted_cash | number | Cash / self-pay discount price |
| standard_charge_negotiated_dollar | number | Negotiated dollar amount |
| standard_charge_negotiated_percentage | number | Negotiated percentage of gross charge |
| standard_charge_negotiated_algorithm | string | Algorithm description when rate is formula-based |
| standard_charge_min | number | De-identified minimum negotiated charge |
| standard_charge_max | number | De-identified maximum negotiated charge |
| estimated_amount | number | Estimated allowed amount |
| additional_payer_notes | string | Additional payer or plan notes |
| record_type | string | charge_row for MRF data, hospital_info for enrollment data |
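
As an example of working with these fields, the sketch below summarizes negotiated dollar rates per payer for one billing code. It is illustrative only: `negotiated_rate_summary` is a hypothetical helper, and the rows are made-up values shaped like the output above.

```python
from collections import defaultdict
from statistics import median
from typing import Any, Iterable


def negotiated_rate_summary(
    rows: Iterable[dict[str, Any]], billing_code: str
) -> dict[str, dict[str, float]]:
    """Group negotiated dollar rates by payer for a single billing code."""
    by_payer: dict[str, list[float]] = defaultdict(list)
    for row in rows:
        rate = row.get("standard_charge_negotiated_dollar")
        # Skip hospital_info rows, other codes, and percentage/algorithm-only rates.
        if (
            row.get("record_type") == "charge_row"
            and row.get("billing_code") == billing_code
            and rate is not None
        ):
            by_payer[row.get("payer_name", "")].append(rate)
    return {
        payer: {"min": min(rates), "median": median(rates), "max": max(rates)}
        for payer, rates in by_payer.items()
    }


# Made-up dataset items for illustration.
dataset_items = [
    {"record_type": "charge_row", "billing_code": "70551", "payer_name": "Aetna",
     "standard_charge_negotiated_dollar": 1240},
    {"record_type": "charge_row", "billing_code": "70551", "payer_name": "Aetna",
     "standard_charge_negotiated_dollar": 1500},
    {"record_type": "charge_row", "billing_code": "70551", "payer_name": "Cigna",
     "standard_charge_negotiated_dollar": 1100},
    {"record_type": "charge_row", "billing_code": "70551", "payer_name": "Cigna",
     "standard_charge_negotiated_dollar": None},
]

summary = negotiated_rate_summary(dataset_items, "70551")
```

Rows whose negotiated rate is expressed only as a percentage or an algorithm have a null dollar amount, so the sketch leaves them out rather than treating them as zero.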

Hospital List Mode

Returns one row per hospital from the CMS enrollment dataset.

{
  "hospital_name": "MEMORIAL HOSPITAL OF LARAMIE COUNTY",
  "hospital_ccn": "530012",
  "hospital_npi": "1568469223",
  "hospital_address": "214 E 23RD ST",
  "hospital_city": "CHEYENNE",
  "hospital_state": "WY",
  "hospital_zip": "82001",
  "mrf_url": "",
  "mrf_version": "",
  "mrf_last_updated": "",
  "billing_code": "",
  "billing_code_type": "",
  "description": "",
  "payer_name": "",
  "plan_name": "",
  "setting": "",
  "methodology": "",
  "standard_charge_gross": null,
  "standard_charge_discounted_cash": null,
  "standard_charge_negotiated_dollar": null,
  "standard_charge_negotiated_percentage": null,
  "standard_charge_negotiated_algorithm": "",
  "standard_charge_min": null,
  "standard_charge_max": null,
  "estimated_amount": null,
  "additional_payer_notes": "",
  "record_type": "hospital_info"
}
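
Because both modes share one schema, a consumer will typically separate the two record types first. A minimal sketch, assuming the dataset items have already been loaded as Python dicts (`split_by_record_type` is a hypothetical helper, and the items are abbreviated made-up examples):

```python
from typing import Any, Iterable


def split_by_record_type(
    items: Iterable[dict[str, Any]],
) -> tuple[list[dict[str, Any]], dict[str, dict[str, Any]]]:
    """Return (charge rows, hospitals keyed by CCN) from mixed dataset items."""
    charges: list[dict[str, Any]] = []
    hospitals_by_ccn: dict[str, dict[str, Any]] = {}
    for item in items:
        record_type = item.get("record_type")
        if record_type == "charge_row":
            charges.append(item)
        elif record_type == "hospital_info":
            hospitals_by_ccn[item.get("hospital_ccn", "")] = item
    return charges, hospitals_by_ccn


# Abbreviated, made-up items for illustration.
mixed_items = [
    {"record_type": "hospital_info", "hospital_ccn": "530012",
     "hospital_name": "MEMORIAL HOSPITAL OF LARAMIE COUNTY", "hospital_state": "WY"},
    {"record_type": "charge_row", "billing_code": "70551", "payer_name": "Aetna",
     "standard_charge_negotiated_dollar": 1240},
]

charge_rows, hospitals = split_by_record_type(mixed_items)
```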

🔍 FAQ

How do I scrape hospital prices from CMS machine-readable files?

CMS Hospital Price Transparency Scraper handles this in mrf_parse mode. Supply the MRF URL in the mrfUrl field, set optional filters, and run. The scraper fetches the file, auto-detects whether it uses CMS v1/v2 or v3 schema, and outputs one structured row per payer/plan/billing code combination.

Where do I find hospital MRF URLs?

CMS does not publish a single comprehensive directory of MRF URLs. Individual hospitals publish their own files on their websites, typically in a "price transparency" or "standard charges" section. The CMS Hospital Price Transparency enforcement dataset tracks compliance but does not always include direct file links. Third-party aggregators such as Turquoise Health and DoltHub's hospital-price-transparency project maintain compiled URL lists.

What billing code types does this scraper support?

CMS Hospital Price Transparency Scraper supports all billing code types defined in the CMS standard: CPT, HCPCS, MS-DRG, APR-DRG, Revenue Code (RC), NDC (drug codes), and Internal. Filter using the billingCodeType input field.

How much does this scraper cost to run?

CMS Hospital Price Transparency Scraper uses pay-per-event pricing: $0.10 per run start plus $0.001 per record. A run parsing 10,000 charge rows from a single MRF file costs about $10.10. Use maxItems to control scope on large hospital chargemasters.
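
The arithmetic can be checked with a short sketch. The two fees are the ones quoted above and may change; `estimate_cost` is a hypothetical helper, and `Decimal` is used only to avoid floating-point rounding in currency math.

```python
from decimal import Decimal

RUN_START_FEE = Decimal("0.10")    # per run start, as quoted above
PER_RECORD_FEE = Decimal("0.001")  # per record, as quoted above


def estimate_cost(records: int) -> Decimal:
    """Estimated cost in USD of one run that returns `records` rows."""
    if records < 0:
        raise ValueError("records must be non-negative")
    return RUN_START_FEE + PER_RECORD_FEE * records


cost_10k = estimate_cost(10_000)    # the 10,000-row example above
cost_100k = estimate_cost(100_000)
```

Since cost scales linearly with the record count, setting maxItems puts a ceiling on the per-run cost.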

Does this scraper need proxies?

No. CMS data APIs and GitHub-hosted example files are public and don't require proxies. The proxyConfiguration input field is available if your target MRF is hosted somewhere that rate-limits, but for standard CMS data sources it isn't necessary.

Can I filter to a single hospital?

Yes. Use hospitalCcn with the hospital's CMS Certification Number to filter hospital_list mode. For MRF parsing, supply the specific hospital's MRF URL in mrfUrl.


Need More Features?

Need support for CSV MRF format, streaming parsing for 1GB+ files, or a different data source? File an issue or get in touch.

Why Use CMS Hospital Price Transparency Scraper?

  • CMS schema coverage — Handles CMS v1, v2, and v3 JSON schemas with auto-detection. Most hospital-built parsers handle only one version.
  • Dual-mode output — Returns charge rows from MRF files and hospital identity data from the CMS enrollment API in the same output schema, so you can join the two datasets without additional ETL.
  • Affordable at scale — At $0.001 per record, parsing 100,000 charge rows costs about $100 ($100.10 including the $0.10 run-start fee), which is less than most healthcare data subscriptions charge per hospital.