Open Payments Scraper
Pricing
Pay per event
Extract official CMS Open Payments records, dataset metadata, and CSV URLs by year, payment type, NPI, company, and state.
Developer
Stas Persiianenko
Maintained by Community
Extract CMS Open Payments payment records, dataset metadata, and CSV download URLs from the official public CMS Open Payments APIs.
Open Payments Scraper helps compliance teams, healthcare analysts, journalists, pharma/medtech researchers, and sales operations teams turn the large CMS Open Payments catalog into clean Apify datasets. Use it to pull payments by reporting year, payment type, physician NPI, recipient name, company name, and state, or to discover official CSV files for bulk processing.
What does Open Payments Scraper do?
Open Payments Scraper connects to the public CMS Open Payments metadata and datastore APIs.
It can:
- Extract detailed general, research, and ownership payment records.
- Select CMS reporting years such as 2024, 2023, or 2022.
- Filter output by recipient NPI, recipient name, company name, and state.
- Return dataset metadata only for lightweight discovery.
- Return official CMS CSV download URLs for bulk ETL workflows.
- Preserve the raw CMS API record for auditability.
Who is it for?
Open Payments Scraper is useful for teams that need repeatable access to CMS payment transparency data.
- Compliance teams monitoring physician-industry relationships.
- Healthcare analytics teams building payment dashboards.
- Pharma and medtech commercial teams researching customer segments.
- Journalists investigating healthcare payments.
- Academic researchers analyzing Open Payments trends.
- Data engineers who need stable CSV URLs and dataset identifiers.
Why use this actor?
The CMS website and bulk files are powerful but large. This actor gives you a controlled Apify interface with limits, filters, and normalized output.
Instead of manually navigating CMS pages, downloading huge CSV files, and writing one-off scripts, you can run a saved Apify task and export the results as JSON, CSV, Excel, or through the API.
What data can you extract?
| Field | Description |
|---|---|
| `recordType` | `payment`, `dataset_metadata`, or `csv_url` |
| `datasetIdentifier` | CMS dataset UUID |
| `datasetTitle` | CMS dataset title |
| `datasetYear` | Reporting year when available |
| `datasetType` | General, research, ownership, summary, or profile |
| `recipientName` | Physician, entity, or hospital name |
| `recipientNpi` | Covered recipient or investigator NPI |
| `companyName` | Manufacturer, GPO, or submitting organization |
| `paymentAmountUsd` | Payment amount in USD |
| `paymentDate` | CMS payment date |
| `paymentNature` | Nature of payment or transfer of value |
| `csvDownloadUrl` | Official CMS CSV download URL |
| `rawRecord` | Full original CMS API row |
How much does it cost to scrape CMS Open Payments data?
This actor uses pay-per-event pricing.
- Start event: $0.005 per run.
- Per result item: Free $0.000032381, Starter/BRONZE $0.000028158, Scale/SILVER $0.000021963, Business/GOLD $0.000016895, Platinum $0.000011263, Diamond $0.00001.
- Each saved payment, metadata, or CSV URL row is charged as one result item.
- Use `metadataOnly` or `csvUrlsOnly` for cheap discovery runs before bulk extraction.
- Keep `maxItems` low while testing filters, then raise it for production runs.
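The pricing rules above reduce to simple arithmetic: one start event plus a per-item charge. The sketch below estimates the cost of a run from the prices listed in this section. The function name `estimate_run_cost` and the lowercase tier keys are illustrative assumptions, not part of the actor.

```python
# Per-result prices in USD, copied from the pricing list above.
# The lowercase tier keys are labels chosen for this sketch.
PRICE_PER_RESULT_USD = {
    "free": 0.000032381,
    "starter": 0.000028158,
    "scale": 0.000021963,
    "business": 0.000016895,
    "platinum": 0.000011263,
    "diamond": 0.00001,
}
START_EVENT_USD = 0.005  # charged once per run


def estimate_run_cost(result_count: int, tier: str = "free") -> float:
    """Estimate the cost of one run: start event plus per-item charges."""
    if result_count < 0:
        raise ValueError("result_count must be non-negative")
    return START_EVENT_USD + result_count * PRICE_PER_RESULT_USD[tier]


if __name__ == "__main__":
    for count in (100, 10_000, 1_000_000):
        print(f"{count:>9,} items, free tier: ${estimate_run_cost(count):.4f}")
```

Because every saved row counts as one result item, a metadata-only run that returns a handful of rows costs little more than the start event.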
Quick start
- Open the actor on Apify.
- Choose one or more reporting years.
- Choose dataset types such as `general`, `research`, or `ownership`.
- Optionally enter an NPI, recipient name, company name, or state.
- Set `maxItems`.
- Run the actor.
- Export the dataset or consume it through the Apify API.
Input options
years
Array of CMS reporting years. Example: [2024].
datasetTypes
Choose one or more dataset families:
- `general`
- `research`
- `ownership`
- `summary`
- `profile`
- `all`
datasetIdentifiers
Optional CMS dataset UUIDs. When you supply exact dataset identifiers, they override the year and type selection.
recipientNpi
Filter by covered recipient or principal investigator NPI.
recipientName
Case-insensitive filter for physician, entity, or teaching hospital names.
companyName
Case-insensitive filter for manufacturer, GPO, or reporting entity names.
state
Two-letter state code such as CA, NY, or TX.
metadataOnly
Return one row per matching CMS dataset with title, identifier, modified date, source URL, and download links.
csvUrlsOnly
Return one row per matching dataset focused on official CSV download URLs.
maxItems
Maximum number of output rows to save.
Example input
```json
{
    "years": [2024],
    "datasetTypes": ["general"],
    "state": "CA",
    "maxItems": 100
}
```
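When inputs are generated by a script, it can help to validate them before starting a paid run. The following is a minimal sketch of such a check. The helper `build_input` is hypothetical, and it assumes that `recipientNpi` is passed as a 10-digit string, which the input schema described here does not confirm.

```python
from typing import Any, Dict, Iterable, Optional

# Dataset families listed under "datasetTypes" in this README.
ALLOWED_DATASET_TYPES = {"general", "research", "ownership", "summary", "profile", "all"}


def build_input(
    years: Iterable[int],
    dataset_types: Iterable[str] = ("general",),
    state: Optional[str] = None,
    recipient_npi: Optional[str] = None,
    max_items: int = 100,
) -> Dict[str, Any]:
    """Build an actor input dict, rejecting obviously malformed values."""
    dataset_types = list(dataset_types)
    unknown = set(dataset_types) - ALLOWED_DATASET_TYPES
    if unknown:
        raise ValueError(f"Unknown dataset types: {sorted(unknown)}")
    if max_items < 1:
        raise ValueError("max_items must be at least 1")

    actor_input: Dict[str, Any] = {
        "years": list(years),
        "datasetTypes": dataset_types,
        "maxItems": max_items,
    }
    if state is not None:
        if len(state) != 2 or not state.isalpha():
            raise ValueError("state must be a two-letter code such as CA")
        actor_input["state"] = state.upper()
    if recipient_npi is not None:
        # Assumption: the actor accepts the NPI as a string of 10 digits.
        if not (recipient_npi.isdigit() and len(recipient_npi) == 10):
            raise ValueError("recipient_npi must be a 10-digit string")
        actor_input["recipientNpi"] = recipient_npi
    return actor_input


if __name__ == "__main__":
    print(build_input([2024], state="ca"))
```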
Output example
```json
{
    "recordType": "payment",
    "datasetIdentifier": "e6b17c6a-2534-4207-a4a1-6746a14911ff",
    "datasetTitle": "2024 General Payment Data",
    "datasetYear": "2024",
    "datasetType": "general",
    "recipientName": "Jane Smith",
    "recipientNpi": "1234567890",
    "recipientState": "CA",
    "companyName": "Example Medical Inc.",
    "paymentAmountUsd": 25.0,
    "paymentDate": "2024-05-10",
    "paymentNature": "Food and Beverage",
    "csvDownloadUrl": "https://download.cms.gov/openpayments/...csv"
}
```
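Records shaped like the output example can be summarized with a few lines of code. This sketch totals `paymentAmountUsd` per `companyName`, skipping rows that are not payments. The function `total_by_company` and the sample rows are illustrative and not real CMS data.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable


def total_by_company(items: Iterable[Dict[str, Any]]) -> Dict[str, float]:
    """Sum payment amounts per company, ignoring metadata and CSV URL rows."""
    totals: Dict[str, float] = defaultdict(float)
    for item in items:
        if item.get("recordType") != "payment":
            continue
        amount = item.get("paymentAmountUsd")
        if amount is None:
            continue  # some rows may lack a normalized amount
        company = item.get("companyName") or "(unknown)"
        totals[company] += float(amount)
    return dict(totals)


if __name__ == "__main__":
    # Made-up rows that follow the field names in the output example.
    sample = [
        {"recordType": "payment", "companyName": "Example Medical Inc.", "paymentAmountUsd": 25.0},
        {"recordType": "payment", "companyName": "Example Medical Inc.", "paymentAmountUsd": 100.5},
        {"recordType": "csv_url", "csvDownloadUrl": "https://example.invalid/file.csv"},
    ]
    print(total_by_company(sample))
```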
Tips for better results
- Start with `metadataOnly: true` to see available datasets.
- Use `csvUrlsOnly: true` when you want CMS bulk files instead of paginated rows.
- Use exact NPI values for targeted provider checks.
- Use state filters for regional compliance reviews.
- Increase `maxItems` only after confirming your filters.
Integrations
Open Payments Scraper works with common Apify workflows:
- Export to Google Sheets for compliance review.
- Send dataset items to Make or Zapier.
- Load JSONL output into Snowflake, BigQuery, or Postgres.
- Schedule recurring monitoring tasks.
- Combine with lead enrichment or healthcare-provider datasets.
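For the warehouse route mentioned above, dataset items usually need to be serialized as JSON Lines first. The sketch below shows one way to do that, optionally dropping the large `rawRecord` field to keep rows compact. The helper `to_jsonl` is hypothetical, and the actual load step depends on your warehouse tooling.

```python
import json
from typing import Any, Dict, Iterable


def to_jsonl(items: Iterable[Dict[str, Any]], drop_raw_record: bool = True) -> str:
    """Serialize items as JSON Lines: one JSON object per line."""
    lines = []
    for item in items:
        row = dict(item)  # copy so the caller's item is not modified
        if drop_raw_record:
            row.pop("rawRecord", None)
        lines.append(json.dumps(row, ensure_ascii=False))
    return "\n".join(lines) + ("\n" if lines else "")


if __name__ == "__main__":
    sample = [
        {"recordType": "payment", "paymentAmountUsd": 25.0, "rawRecord": {"col": "value"}},
        {"recordType": "csv_url", "csvDownloadUrl": "https://example.invalid/file.csv"},
    ]
    print(to_jsonl(sample), end="")
```

Keep `rawRecord` (pass `drop_raw_record=False`) when auditors need the unnormalized CMS columns alongside the top-level fields.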
API usage
Node.js
```js
import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: process.env.APIFY_TOKEN });
const run = await client.actor('automation-lab/open-payments-scraper').call({
    years: [2024],
    datasetTypes: ['general'],
    maxItems: 100,
});
console.log(run.defaultDatasetId);
```
Python
```python
from apify_client import ApifyClient

client = ApifyClient('MY-APIFY-TOKEN')
run = client.actor('automation-lab/open-payments-scraper').call(run_input={
    'years': [2024],
    'datasetTypes': ['general'],
    'maxItems': 100,
})
print(run['defaultDatasetId'])
```
cURL
```bash
curl -X POST "https://api.apify.com/v2/acts/automation-lab~open-payments-scraper/runs?token=$APIFY_TOKEN" \
  -H 'Content-Type: application/json' \
  -d '{"years":[2024],"datasetTypes":["general"],"maxItems":100}'
```
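The API examples in this section stop at the run's default dataset ID. A small follow-up step reads the saved rows from that dataset. The sketch below wraps both steps in a hypothetical helper, `collect_items`, which accepts any object exposing the `actor(...).call(run_input=...)` and `dataset(...).iterate_items()` methods of the Apify Python client. It loads all rows into memory, so it suits small `maxItems` values.

```python
from typing import Any, Dict, List

ACTOR_ID = "automation-lab/open-payments-scraper"


def collect_items(client: Any, run_input: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Run the actor, wait for it to finish, and return all dataset items."""
    run = client.actor(ACTOR_ID).call(run_input=run_input)
    if run is None:
        raise RuntimeError("Actor run did not return a run object")
    dataset = client.dataset(run["defaultDatasetId"])
    return list(dataset.iterate_items())
```

With the real client, this would be called as `collect_items(ApifyClient(token), {"years": [2024], "datasetTypes": ["general"], "maxItems": 100})`.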
MCP usage
Use this actor from MCP-compatible clients through Apify MCP.
MCP URL:
https://mcp.apify.com/?tools=automation-lab/open-payments-scraper
Claude Code setup:
```bash
claude mcp add apify-open-payments "https://mcp.apify.com/?tools=automation-lab/open-payments-scraper"
```
MCP JSON configuration:
```json
{
    "mcpServers": {
        "apify-open-payments": {
            "url": "https://mcp.apify.com/?tools=automation-lab/open-payments-scraper"
        }
    }
}
```
Example prompts:
- "Run Open Payments Scraper for 2024 general payments in California and summarize the largest payments."
- "Find CMS Open Payments CSV URLs for 2024 research and ownership payment datasets."
- "Extract 100 Open Payments records for a specific physician NPI."
Data freshness
The actor reads the live CMS Open Payments metadata API. CMS controls dataset publication schedules, modified dates, and CSV file paths.
Legality
CMS Open Payments data is public United States government transparency data. You are responsible for using it lawfully and for complying with your organization's privacy, compliance, and data-retention policies.
FAQ
Why did I get fewer rows than maxItems?
Your filters may be narrow, or the selected dataset may have fewer matching rows within the bounded scan window. Try metadata discovery first, broaden filters, or use the official CSV URL for bulk processing.
Why is rawRecord large?
CMS records contain many columns. The actor keeps the raw row so analysts can audit values that are not normalized into top-level fields.
Should I use CSV mode or payment mode?
Use payment mode for API-ready samples and filtered exports. Use CSV URL mode for warehouse-scale ETL.
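For CSV URL mode, the bulk files are large enough that streaming them row by row is usually safer than loading them whole. The sketch below filters rows from any text stream by state. The helper `filter_csv_rows` is hypothetical, and the default column name `Recipient_State` is an assumption about the CMS file layout, so check the header of the file you download.

```python
import csv
import io
from typing import Dict, Iterator, TextIO


def filter_csv_rows(
    stream: TextIO,
    state: str,
    state_column: str = "Recipient_State",  # assumed CMS column name
) -> Iterator[Dict[str, str]]:
    """Yield CSV rows whose state column matches, reading one row at a time."""
    wanted = state.strip().upper()
    for row in csv.DictReader(stream):
        if (row.get(state_column) or "").strip().upper() == wanted:
            yield row


if __name__ == "__main__":
    # Tiny in-memory stand-in for a downloaded CMS file.
    sample = io.StringIO(
        "Recipient_State,Total_Amount_of_Payment_USDollars\n"
        "CA,25.00\n"
        "NY,10.00\n"
    )
    for row in filter_csv_rows(sample, "CA"):
        print(row)
```

The same function works on a file opened with `open(path, newline="")`, which keeps memory use flat regardless of file size.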
Related scrapers
Other Automation Lab actors can complement this data:
- https://apify.com/automation-lab/website-contact-finder
- https://apify.com/automation-lab/google-maps-lead-finder
- https://apify.com/automation-lab/company-website-scraper
Support
If a CMS field changes or a dataset identifier stops working, open an issue on the actor page with the run ID and input used.