NPI Registry Scraper | 7M+ US Healthcare Providers (CMS)
Pricing
from $1.00 / 1,000 results
Scrape the official US CMS NPPES NPI Registry (7M+ healthcare providers). Per-provider: NPI number, name + credential, taxonomy (specialty + license), addresses, phone, identifiers, organization data. Free public CMS API — pharma sales, healthcare staffing, insurance, credentialing.
Developer
Haketa
NPI Registry Scraper — 7M+ US Healthcare Providers from the Official CMS NPPES API
The most complete NPPES NPI Registry data extraction actor on Apify. Pull every licensed US healthcare provider (physicians, NPs, PAs, RNs, pharmacists, dentists, therapists, hospitals, clinics, group practices, pharmacies) straight from the official Centers for Medicare & Medicaid Services (CMS) National Plan and Provider Enumeration System at npiregistry.cms.hhs.gov. Search by NPI, name, specialty (taxonomy), city, state, or ZIP. 7,000,000+ providers indexed, no auth, no hard rate limit for reasonable use, no scraping of private sites: 100% official public CMS data.
What This Actor Does
The NPI Registry Scraper is a production-grade Apify Actor that queries the official US CMS NPPES API (https://npiregistry.cms.hhs.gov/api/?version=2.1) — the federal source-of-truth for every healthcare provider authorized to bill Medicare, Medicaid, or any HIPAA-covered payer in the United States.
Every clinician, allied-health professional, and healthcare organization operating in the US is required by HIPAA to obtain a National Provider Identifier (NPI) before submitting electronic transactions. The result: a freely-queryable federal registry of 7,000,000+ providers, refreshed continuously as new providers enumerate and existing providers update their licensure, address, or specialty. This actor turns that API into a structured, paginated, ready-for-CRM/data-warehouse dataset.
In a single run the actor can return tens, thousands, or hundreds of thousands of fully-normalized provider records — each one covering:
- Individual providers (NPI-1) — every physician (MD, DO), nurse practitioner (NP, CRNP, APRN), physician assistant (PA), registered nurse (RN), dentist (DDS, DMD), pharmacist (RPh, PharmD), psychologist, social worker (LCSW, LMSW), physical therapist (PT, DPT), occupational therapist (OT), chiropractor (DC), optometrist (OD), podiatrist (DPM), audiologist, dietitian, midwife, behavior analyst — and 230+ other taxonomy classes.
- Organizations (NPI-2) — every hospital, ambulatory surgery center, group practice, clinic, federally qualified health center (FQHC), skilled nursing facility (SNF), home health agency, hospice, DME supplier, independent lab, retail pharmacy, mail-order pharmacy, specialty pharmacy, telehealth platform, and ACO.
- Sub-parts & locations — multi-site organizations enumerate practice locations separately, all queryable.
- Identifiers — state Medicaid IDs, state license numbers, legacy UPINs and other linked credentials.
- FHIR Direct endpoints — the federally-published secure-messaging addresses that enable interoperable health-record exchange.
Each record includes the 10-digit NPI, full provider/organization name, professional credential, primary taxonomy (specialty), supporting taxonomies, all licensure numbers + state, status (Active / Deactivated), enumeration date, last-update date, mailing address with phone, location/practice address with phone, identifiers array, other-names array (DBA, former names, AKAs), endpoints array, and a canonical sourceUrl linking back to the official CMS provider-view page.
Why scrape NPPES yourself when this exists?
The NPPES API is technically "public," but engineering teams who try to roll their own pipeline keep hitting the same wall of friction:
- The API caps each response at 200 records: anything beyond requires manual offset pagination with no `next_page` token.
- The response JSON is deeply nested (`basic`, `taxonomies[]`, `addresses[]`, `identifiers[]`, `other_names[]`, `endpoints[]`, `practiceLocations[]`) and shape-shifts between NPI-1 individual and NPI-2 organization records.
- Address handling is messy: providers can have multiple addresses with different purposes (`LOCATION`, `MAILING`, `PRIMARY`, `SECONDARY`), and you have to pick the right one for your use case (telephone outreach vs. mailing).
- Taxonomy codes use the NUCC Healthcare Provider Taxonomy (10-character alphanumeric codes mapped to a 230+-entry classification tree), which is useless without the human-readable `desc` field correctly joined.
- The CMS server returns HTTP 200 with an error object instead of HTTP 4xx/5xx, so naive clients silently swallow failures.
- A single typo in a state code (`Ny` vs `NY`) returns zero results with no warning.
- Searching specialties like "Family Medicine" requires knowing that the label is "Family Medicine", not "Family Practice" (the deprecated label).
- Building a multi-state, multi-city, multi-specialty sweep takes dozens of orchestrated queries with deduplication, which is a lot of pipeline code you don't want to write or maintain.
- There is no bulk-download endpoint: every record must be paged through the search API.
- The CMS API has no SDK in your favorite language, so you're on your own with raw HTTP.
This actor solves all of that: it builds the full task matrix from your filters, paginates through every page of every query in parallel-friendly batches, retries failed calls with an increasing backoff delay, deduplicates by NPI, normalizes the nested JSON into a flat row, and pushes ready-to-consume records to the Apify dataset. Zero glue code.
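The "HTTP 200 with an error object" pitfall is the easiest one to miss, so here is a minimal sketch of how a client might guard against it. The `Errors` and `results` key names and the `description` field are assumptions about the response body, not something this page documents; `NppesApiError` and `extract_results` are hypothetical names.

```python
from typing import Any


class NppesApiError(Exception):
    """Raised when the body reports an error even though the HTTP status was 200."""


def extract_results(payload: dict[str, Any]) -> list[dict[str, Any]]:
    # Assumption: errors arrive under an "Errors" key as a list of objects
    # with a "description" field. Adjust if the real body differs.
    errors = payload.get("Errors")
    if errors:
        details = "; ".join(
            str(e.get("description", e)) if isinstance(e, dict) else str(e)
            for e in errors
        )
        raise NppesApiError(details)
    # Assumption: matches arrive under "results"; a missing key means no rows.
    return payload.get("results", [])
```

Checking the body before reading rows turns a silent empty page into a visible failure.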
Quick Start
One-Click Run
- Click "Try for free" on the Apify Store page for this actor.
- Enter at least one filter, for example `lastName = "Smith"` plus `states = ["NY"]`, or `taxonomyDescription = "Cardiology"` plus `cities = ["Los Angeles"]`. Alternatively, paste a list of 10-digit NPIs into the NPI Numbers field for direct lookup.
- Hit Start. The actor paginates through every matching provider, 200 per page.
- Export the dataset as JSON, CSV, Excel (XLSX), HTML, XML, RSS, or JSON Lines from the Apify dataset view, or pull it via the REST API.
API Run (Python)
```python
from apify_client import ApifyClient

client = ApifyClient("YOUR_APIFY_TOKEN")

# Start the actor and wait for the run to finish.
run = client.actor("haketa/nppes-npi-registry-scraper").call(
    run_input={
        "taxonomyDescription": "Cardiology",
        "states": ["NY", "NJ", "CT"],
        "enumerationType": "NPI-1",
        "maxRecords": 2000,
    }
)

# Stream the resulting records from the run's default dataset.
for provider in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(
        provider["npi"],
        provider["fullName"],
        provider["credential"],
        provider["primaryTaxonomyDesc"],
        provider["city"],
        provider["state"],
        provider["phone"],
    )
```
API Run (Node.js / TypeScript)
```javascript
import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: 'YOUR_APIFY_TOKEN' });

const run = await client.actor('haketa/nppes-npi-registry-scraper').call({
    npis: ['1982445060', '1932487765'],
    taxonomyDescription: 'Pharmacy',
    states: ['CA'],
    cities: ['Los Angeles', 'San Francisco'],
    enumerationType: 'NPI-2',
    maxRecords: 5000,
});

const { items } = await client.dataset(run.defaultDatasetId).listItems();
console.log(`Pulled ${items.length} California pharmacy organizations + 2 direct NPI lookups`);
```
API Run (cURL)
```bash
curl -X POST "https://api.apify.com/v2/acts/haketa~nppes-npi-registry-scraper/runs?token=YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "lastName": "Patel",
    "states": ["TX", "FL"],
    "enumerationType": "NPI-1",
    "maxRecords": 1000
  }'
```
API Run (raw NPI lookup, fastest mode)
```bash
curl -X POST "https://api.apify.com/v2/acts/haketa~nppes-npi-registry-scraper/runs?token=YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"npis": ["1982445060", "1932487765", "1689675476"]}'
```
How It Works
The actor talks to one upstream endpoint — the official CMS NPPES v2.1 JSON API:
| Endpoint | Method | Purpose |
|---|---|---|
| `https://npiregistry.cms.hhs.gov/api/?version=2.1&number={NPI}` | GET | Direct 10-digit NPI lookup |
| `https://npiregistry.cms.hhs.gov/api/?version=2.1&first_name=...&last_name=...&state=...` | GET | Individual provider search |
| `https://npiregistry.cms.hhs.gov/api/?version=2.1&organization_name=...&state=...` | GET | Organization (NPI-2) search |
| `https://npiregistry.cms.hhs.gov/api/?version=2.1&taxonomy_description=Cardiology&state=NY` | GET | Specialty search |
| `https://npiregistry.cms.hhs.gov/api/?version=2.1&postal_code=10001` | GET | ZIP-code search |
| `https://npiregistry.cms.hhs.gov/provider-view/{NPI}` | GET (web) | Canonical provider-view page (returned as `sourceUrl`) |
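As a rough illustration of how the query URLs in the table are put together, the sketch below builds one from keyword filters. `build_query_url` is a hypothetical helper, not part of the actor; the parameter names are the ones shown in the table.

```python
from urllib.parse import urlencode

BASE_URL = "https://npiregistry.cms.hhs.gov/api/"


def build_query_url(**filters: str) -> str:
    """Return a v2.1 query URL, skipping filters that are empty."""
    params = {"version": "2.1"}
    params.update({key: value for key, value in filters.items() if value})
    # urlencode percent-escapes values, so names with spaces stay valid.
    return f"{BASE_URL}?{urlencode(params)}"


print(build_query_url(taxonomy_description="Cardiology", state="NY"))
```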
Engineering details
- HTTP-only via `got-scraping`: no Puppeteer, no Playwright, no Chromium overhead. The CMS API serves JSON directly; there is nothing to render.
- No authentication required: NPPES is a federally-mandated public registry. No API key, no OAuth, no IP whitelisting.
- No proxy required: CMS does not rate-limit reasonable consumers. The actor exposes an optional proxy field for organizations with strict egress policies, but most runs leave it off.
- Pagination via `skip` offset: the API returns up to 200 records per call; the actor advances `skip` by 200 until the page returns fewer than 200 rows (the last page). Verified to deep-paginate past 10,000+ offset on broad queries.
- Task fan-out: when you supply multiple cities × states × NPIs, the actor builds a Cartesian-product task list and runs each combination as a discrete query.
- 3-attempt retry with increasing backoff: `fetchJson()` retries each query up to 3 times, waiting `2000 ms × attempt` plus random jitter between attempts, before giving up on that one URL.
- NPI-level deduplication: a `Set` of seen NPIs prevents the same provider from being saved twice when overlapping queries (e.g. `lastName=Smith + state=NY` and `lastName=Smith + state=NJ`) happen to surface the same record.
- Defensive nested-JSON parsing: the normalizer copes with missing `basic`, missing `taxonomies`, missing `addresses`, and the divergent shapes of NPI-1 vs. NPI-2 records.
- Address triage: MAILING and LOCATION addresses are extracted separately; `phone`, `city`, `state`, `postalCode`, and `country` at the top level prefer the LOCATION (practice) address with fallback to MAILING.
- Status normalization: CMS returns `"A"` or `"D"` codes; the actor maps these to human-readable `Active` / `Deactivated`.
- Fail-fast on empty results: if a run produces zero records the actor calls `Actor.fail()` so scheduled pipelines visibly alert rather than silently writing empty datasets.
- Polite request pacing: configurable `requestDelay` (default 800 ms) with random jitter between page fetches.
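The pagination, retry, and deduplication steps in the list above can be sketched in Python as follows. This is an illustration under stated assumptions, not the actor's source (which is described as using `got-scraping` and `fetchJson()`): `fetch_page` is a hypothetical callable that returns one page of raw records for a given `skip`, and the raw NPI is assumed to live under a `number` key.

```python
import random
import time
from typing import Callable, Iterator, Optional

PAGE_SIZE = 200  # per-call cap described above


def fetch_with_retry(fetch_page: Callable[[int], list], skip: int,
                     attempts: int = 3, base_delay: float = 2.0,
                     sleep: Callable[[float], None] = time.sleep) -> list:
    """Call fetch_page(skip), retrying with a delay that grows per attempt."""
    for attempt in range(1, attempts + 1):
        try:
            return fetch_page(skip)
        except Exception:
            if attempt == attempts:
                raise
            # base_delay * attempt mirrors "2000 ms x attempt"; jitter is added on top.
            sleep(base_delay * attempt + random.uniform(0, 0.5))
    raise ValueError("attempts must be at least 1")


def iter_unique_providers(fetch_page: Callable[[int], list],
                          seen: Optional[set] = None,
                          sleep: Callable[[float], None] = time.sleep) -> Iterator[dict]:
    """Yield each provider once, advancing skip until a short page appears."""
    seen = set() if seen is None else seen
    skip = 0
    while True:
        page = fetch_with_retry(fetch_page, skip, sleep=sleep)
        for record in page:
            npi = record.get("number")  # assumed key for the raw NPI
            if npi not in seen:
                seen.add(npi)
                yield record
        if len(page) < PAGE_SIZE:  # fewer than 200 rows marks the last page
            break
        skip += PAGE_SIZE
```

Passing the same `seen` set to several calls would extend the deduplication across overlapping queries, such as the same last name searched in two states.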
What version=2.1 returns out of the box
The CMS v2.1 API returns rich nested objects per provider:
- `basic`: names, credential, gender, sole-proprietor flag, enumeration date, last-update date, status
- `taxonomies[]`: every taxonomy the provider claims, each with code, description, license number, license state, primary flag, taxonomy group
- `addresses[]`: multiple addresses with `address_purpose` (LOCATION / MAILING / PRIMARY / SECONDARY), telephone, fax, country
- `identifiers[]`: state Medicaid IDs, state license IDs, other-issuer credentials
- `other_names[]`: former names, AKAs, DBAs
- `endpoints[]`: FHIR Direct messaging endpoints for interoperable exchange
- `practiceLocations[]`: multi-site organizations' physical practice locations
The actor preserves all of this — see the Output Schema below.
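To make the mapping from nested response to flat row concrete, here is a small sketch of the address triage and status normalization described earlier. The raw field names (`number`, `telephone_number`, `postal_code`, `primary`) are assumptions about the CMS response; only `address_purpose` and `desc` are named on this page, and `flatten` is a hypothetical helper that covers just a few output fields.

```python
from typing import Any, Optional

STATUS_LABELS = {"A": "Active", "D": "Deactivated"}


def pick_address(addresses: list, purpose: str) -> Optional[dict]:
    """Return the first address with the given address_purpose, if any."""
    for address in addresses:
        if address.get("address_purpose") == purpose:
            return address
    return None


def flatten(record: dict[str, Any]) -> dict[str, Any]:
    # Every nested section may be missing, so fall back to empty containers.
    basic = record.get("basic") or {}
    addresses = record.get("addresses") or []
    taxonomies = record.get("taxonomies") or []
    # Prefer the LOCATION (practice) address, then MAILING.
    preferred = (pick_address(addresses, "LOCATION")
                 or pick_address(addresses, "MAILING") or {})
    primary = next((t for t in taxonomies if t.get("primary")), {})
    raw_status = basic.get("status")
    return {
        "npi": str(record.get("number", "")),
        "status": STATUS_LABELS.get(raw_status, raw_status),
        "primaryTaxonomyCode": primary.get("code"),
        "primaryTaxonomyDesc": primary.get("desc"),
        "phone": preferred.get("telephone_number"),
        "city": preferred.get("city"),
        "state": preferred.get("state"),
        "postalCode": preferred.get("postal_code"),
    }
```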
Input Parameters
```json
{
  "npis": [],
  "firstName": "",
  "lastName": "",
  "organizationName": "",
  "taxonomyDescription": "",
  "cities": [],
  "states": [],
  "postalCode": "",
  "enumerationType": "any",
  "addressPurpose": "any",
  "maxRecords": 500,
  "requestDelay": 800
}
```
Parameter reference
| Parameter | Type | Default | Description |
|---|---|---|---|
| `npis` | `array<string>` | `[]` | 10-digit NPI numbers for direct lookup. Fastest mode: bypasses search and returns the exact provider. Example: `["1982445060", "1932487765"]`. Each NPI becomes its own task. |
| `firstName` | `string` | `""` | Individual provider first name. Partial match, case-insensitive. Example: `"Sarah"`. |
| `lastName` | `string` | `""` | Individual provider last name. Partial match. Example: `"Patel"`. Use alone for broad search, combine with state for narrower results. |
| `organizationName` | `string` | `""` | Organization (NPI-2) legal-business-name search. Example: `"Mount Sinai"`, `"Kaiser Permanente"`, `"CVS Pharmacy"`. |
| `taxonomyDescription` | `string` | `""` | Provider taxonomy text (NUCC specialty descriptors). Examples: `"Family Medicine"`, `"Cardiology"`, `"Pharmacy"`, `"Dentist"`, `"Nurse Practitioner"`, `"Pediatrics"`, `"Internal Medicine"`. Partial match supported. |
| `cities` | `array<string>` | `[]` | City-name filter. Example: `["New York", "Brooklyn", "Queens"]`. Each city becomes a separate query when combined with other filters. |
| `states` | `array<string>` | `[]` | 2-letter US state codes. Example: `["NY", "CA", "TX", "FL"]`. Each state becomes a separate query when combined with other filters. |
| `postalCode` | `string` | `""` | ZIP-code filter. Partial match supported: `"100"` matches all 100xx ZIPs in Manhattan. |
| `enumerationType` | `enum` | `"any"` | `"NPI-1"` = individual providers (physicians, NPs, RNs, etc.). `"NPI-2"` = organizations (hospitals, clinics, pharmacies). `"any"` returns both. |
| `addressPurpose` | `enum` | `"any"` | `"LOCATION"` = practice address. `"MAILING"` = correspondence address. `"PRIMARY"` / `"SECONDARY"` = practice-location designations. `"any"` returns providers regardless of address-purpose match. |
| `maxRecords` | `integer` | `500` | Hard cap across all queries. `0` = unlimited (pages until every query exhausts). |
| `requestDelay` | `integer` | `800` | Delay in milliseconds between page fetches. CMS rate-limiting is soft; 500–2000 ms is courteous. |
| `proxyConfiguration` | `object` | none | Optional Apify proxy. Not required: CMS NPPES is a public unrestricted API. |
Tip: Always supply at least one of `npis`, `firstName`, `lastName`, `organizationName`, `taxonomyDescription`, `cities`, `states`, or `postalCode`. The actor fails fast (`Actor.fail`) with no tasks built if none are provided.
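One plausible reading of how the input turns into a task list is sketched below: each NPI becomes its own task, and the remaining filters are repeated for every city and state combination. `build_tasks` and the task dictionary shape are hypothetical; the actor's real task builder may differ.

```python
from itertools import product

SEARCH_KEYS = ("firstName", "lastName", "organizationName",
               "taxonomyDescription", "postalCode")


def build_tasks(run_input: dict) -> list:
    """Expand the actor input into one task per NPI and per city/state pair."""
    tasks = [{"npi": npi} for npi in run_input.get("npis") or []]
    shared = {key: run_input[key] for key in SEARCH_KEYS if run_input.get(key)}
    cities = run_input.get("cities") or [None]
    states = run_input.get("states") or [None]
    has_search_filter = bool(shared) or cities != [None] or states != [None]
    if has_search_filter:
        # Cartesian product: every city is paired with every state.
        for city, state in product(cities, states):
            task = dict(shared)
            if city:
                task["city"] = city
            if state:
                task["state"] = state
            tasks.append(task)
    if not tasks:
        # Mirrors the fail-fast rule from the tip above.
        raise ValueError("Supply at least one filter or NPI")
    return tasks
```

With `lastName = "Smith"` and `states = ["NY", "NJ"]` this yields two tasks, one per state, which is where the NPI-level deduplication becomes useful.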
Output Schema
Every record — individual or organization — uses the same flat JSON shape so downstream consumers ingest the entire dataset without per-type branching.
Identity & status
| Field | Type | Description |
|---|---|---|
| `npi` | string | 10-digit National Provider Identifier (federal primary key) |
| `enumerationType` | string | `"NPI-1"` (individual) or `"NPI-2"` (organization) |
| `firstName` | string \| null | Individual first name (null for orgs) |
| `middleName` | string \| null | Individual middle name (null for orgs) |
| `lastName` | string \| null | Individual last name (null for orgs) |
| `fullName` | string | Joined full name (individual) or organization legal name (org) |
| `credential` | string \| null | Professional suffix: MD, DO, NP, PA, RN, DDS, DMD, PharmD, LCSW, PT, etc. |
| `organizationName` | string \| null | Legal business name or parent organization name |
| `gender` | string \| null | M / F (individuals only) |
| `status` | string | `Active` (CMS `A`) or `Deactivated` (CMS `D`) |
| `soleProprietor` | string \| null | YES / NO / NA |
Dates
| Field | Type | Description |
|---|---|---|
| `enumerationDate` | string | Date the NPI was first assigned (YYYY-MM-DD) |
| `certificationDate` | string \| null | Most-recent certification date |
| `lastUpdated` | string | Most-recent record update (YYYY-MM-DD) |
| `scrapedAt` | string | ISO-8601 timestamp of this scrape |
Specialty & licensure
| Field | Type | Description |
|---|---|---|
| `primaryTaxonomyCode` | string | 10-character NUCC taxonomy code (e.g. `207RC0000X` = Cardiovascular Disease) |
| `primaryTaxonomyDesc` | string | Human-readable specialty (e.g. "Cardiovascular Disease") |
| `primaryLicense` | string \| null | State license number for primary specialty |
| `primaryLicenseState` | string \| null | 2-letter state of primary license |
| `taxonomies` | array | Full list of taxonomies (code, desc, license, state, primary flag, taxonomy group) |
Addresses & contact
| Field | Type | Description |
|---|---|---|
| `mailingAddress` | object | Full mailing address (address1, address2, city, state, postalCode, country, phone, fax) |
| `locationAddress` | object | Full practice/location address (same shape) |
| `phone` | string \| null | Location address phone, fallback to mailing |
| `mailingPhone` | string \| null | Mailing-address phone |
| `city` | string | Top-level city (location → mailing fallback) |
| `state` | string | Top-level state |
| `postalCode` | string | Top-level ZIP |
| `country` | string | Top-level country |
| `practiceLocations` | array \| null | Sub-practice locations for multi-site organizations |
Identifiers & extras
| Field | Type | Description |
|---|---|---|
| `identifiers` | array \| null | State Medicaid IDs, state license IDs, other linked credentials |
| `otherNames` | array \| null | Former names, AKAs, DBAs |
| `endpoints` | array \| null | FHIR Direct messaging endpoints |
| `searchQuery` | string | The exact query filter combination that returned this row (for provenance) |
| `sourceUrl` | string | Canonical CMS provider-view URL (`https://npiregistry.cms.hhs.gov/provider-view/{NPI}`) |
Example: Individual provider (NPI-1)
```json
{
  "npi": "1982445060",
  "enumerationType": "NPI-1",
  "firstName": "Rachel",
  "middleName": "L",
  "lastName": "Brown",
  "fullName": "Rachel L Brown",
  "credential": "LMSW",
  "organizationName": null,
  "gender": "F",
  "status": "Active",
  "soleProprietor": "NO",
  "enumerationDate": "2014-03-21",
  "certificationDate": "2024-08-12",
  "lastUpdated": "2025-11-04",
  "primaryTaxonomyCode": "104100000X",
  "primaryTaxonomyDesc": "Social Worker",
  "primaryLicense": "117691",
  "primaryLicenseState": "NY",
  "taxonomies": [
    { "code": "104100000X", "desc": "Social Worker", "license": "117691", "state": "NY", "primary": true, "taxonomyGroup": null }
  ],
  "mailingAddress": {
    "address1": "123 W 42ND ST",
    "address2": "SUITE 600",
    "city": "NEW YORK",
    "state": "NY",
    "postalCode": "100360000",
    "country": "US",
    "phone": "212-555-0144",
    "fax": null,
    "purpose": "MAILING"
  },
  "locationAddress": {
    "address1": "456 BROADWAY",
    "address2": null,
    "city": "NEW YORK",
    "state": "NY",
    "postalCode": "100130000",
    "country": "US",
    "phone": "212-555-0188",
    "fax": null,
    "purpose": "LOCATION"
  },
  "phone": "212-555-0188",
  "mailingPhone": "212-555-0144",
  "city": "NEW YORK",
  "state": "NY",
  "postalCode": "100130000",
  "country": "US",
  "practiceLocations": null,
  "identifiers": [
    { "code": "05", "desc": "MEDICAID", "identifier": "99999123", "issuer": null, "state": "NY" }
  ],
  "otherNames": null,
  "endpoints": null,
  "searchQuery": "last_name=Brown&state=NY",
  "sourceUrl": "https://npiregistry.cms.hhs.gov/provider-view/1982445060",
  "scrapedAt": "2026-05-18T09:14:22.401Z"
}
```
Example: Organization (NPI-2)
```json
{
  "npi": "1689675476",
  "enumerationType": "NPI-2",
  "firstName": null,
  "middleName": null,
  "lastName": null,
  "fullName": "MOUNT SINAI WEST PHARMACY",
  "credential": null,
  "organizationName": "MOUNT SINAI WEST PHARMACY",
  "gender": null,
  "status": "Active",
  "soleProprietor": null,
  "enumerationDate": "2009-07-15",
  "certificationDate": "2023-06-30",
  "lastUpdated": "2025-09-18",
  "primaryTaxonomyCode": "3336C0003X",
  "primaryTaxonomyDesc": "Community/Retail Pharmacy",
  "primaryLicense": "030099",
  "primaryLicenseState": "NY",
  "taxonomies": [
    { "code": "3336C0003X", "desc": "Community/Retail Pharmacy", "license": "030099", "state": "NY", "primary": true, "taxonomyGroup": null },
    { "code": "3336I0012X", "desc": "Institutional Pharmacy", "license": "030099", "state": "NY", "primary": false, "taxonomyGroup": null }
  ],
  "mailingAddress": {
    "address1": "1000 10TH AVE",
    "address2": null,
    "city": "NEW YORK",
    "state": "NY",
    "postalCode": "100193303",
    "country": "US",
    "phone": "212-555-0220",
    "fax": "212-555-0221",
    "purpose": "MAILING"
  },
  "locationAddress": {
    "address1": "1000 10TH AVE",
    "address2": "GROUND FLOOR",
    "city": "NEW YORK",
    "state": "NY",
    "postalCode": "100193303",
    "country": "US",
    "phone": "212-555-0220",
    "fax": "212-555-0221",
    "purpose": "LOCATION"
  },
  "phone": "212-555-0220",
  "mailingPhone": "212-555-0220",
  "city": "NEW YORK",
  "state": "NY",
  "postalCode": "100193303",
  "country": "US",
  "practiceLocations": null,
  "identifiers": [
    { "code": "05", "desc": "MEDICAID", "identifier": "01234567", "issuer": null, "state": "NY" }
  ],
  "otherNames": [
    { "code": "5", "credential": null, "type": "Former Name", "organization_name": "ROOSEVELT HOSPITAL PHARMACY" }
  ],
  "endpoints": null,
  "searchQuery": "organization_name=Mount Sinai&state=NY",
  "sourceUrl": "https://npiregistry.cms.hhs.gov/provider-view/1689675476",
  "scrapedAt": "2026-05-18T09:14:22.401Z"
}
```
Reference Tables
Provider Type (enumerationType)
| Code | Meaning |
|---|---|
| `NPI-1` | Individual provider: physician, NP, PA, RN, pharmacist, dentist, therapist, etc. |
| `NPI-2` | Organization: hospital, clinic, group practice, pharmacy, ASC, FQHC, SNF, DME supplier, lab, ACO |
Provider Status
| Status | Meaning |
|---|---|
| `Active` | NPI is current and may be used on claims (CMS code `A`) |
| `Deactivated` | NPI is deactivated (death, dissolution, voluntary deactivation); CMS code `D` |
Common Taxonomy / Specialty Examples
NPPES uses the NUCC Healthcare Provider Taxonomy (230+ codes). A sample relevant to most users:
| Taxonomy Code | Specialty Description |
|---|---|
| `207R00000X` | Internal Medicine |
| `207RC0000X` | Cardiovascular Disease |
| `207Q00000X` | Family Medicine |
| `207V00000X` | Obstetrics & Gynecology |
| `2084N0400X` | Neurology |
| `208000000X` | Pediatrics |
| `2085R0202X` | Diagnostic Radiology |
| `2086S0102X` | Surgical Critical Care |
| `367500000X` | Certified Registered Nurse Anesthetist (CRNA) |
| `363L00000X` | Nurse Practitioner |
| `363A00000X` | Physician Assistant |
| `225100000X` | Physical Therapist |
| `122300000X` | Dentist |
| `1223G0001X` | General Practice Dentistry |
| `183500000X` | Pharmacist |
| `3336C0003X` | Community/Retail Pharmacy |
| `3336M0002X` | Mail Order Pharmacy |
| `3336S0011X` | Specialty Pharmacy |
| `282N00000X` | General Acute Care Hospital |
| `261QF0400X` | Federally Qualified Health Center (FQHC) |
| `314000000X` | Skilled Nursing Facility |
| `251E00000X` | Home Health Agency |
| `103T00000X` | Psychologist |
| `1041C0700X` | Clinical Social Worker (LCSW) |
Tip: You don't need taxonomy codes. Searching by `taxonomyDescription: "Cardiology"` or `taxonomyDescription: "Family Medicine"` works on the human-readable text.
Address Purpose (addressPurpose)
| Purpose | Meaning |
|---|---|
| `LOCATION` | The provider's actual practice address; best for phone outreach |
| `MAILING` | Correspondence address (PO Box common); best for postal mail |
| `PRIMARY` | NPI-2 organizations' primary practice location |
| `SECONDARY` | Additional practice locations for multi-site orgs |
Use Cases
The NPPES registry is the single richest source of US healthcare-provider information in existence — and because it's a federal public registry, it powers commercial workflows across the entire healthcare ecosystem.
Pharmaceutical & Medical-Device Sales Targeting
Pharma and med-device reps use NPI data as the foundation of every territory and account plan:
- Build prescriber target lists by specialty and geography — every cardiologist, oncologist, endocrinologist, or rheumatologist in a 50-mile radius of a launch hub
- Identify KOLs and high-volume prescribers when joined with the CMS Open Payments and Medicare Part D Prescriber datasets
- Map sales territories by physician density, specialty mix, and group-practice affiliation
- Find newly-enumerated specialists by filtering on `enumerationDate`: early outreach to new-to-practice physicians can put you 6–12 months ahead of competitors
- Power CME and speaker-bureau outreach with up-to-date credentials, practice addresses, and phone numbers
- Cross-reference state license numbers to verify a provider is actively licensed in your launch states
Healthcare Staffing, Locum Tenens & Travel Nursing
Locum, travel-nursing, and per-diem platforms use NPPES as their cross-state credentialing source-of-truth:
- Verify candidate NPIs before placement to confirm identity and active status
- Find multi-state licensed clinicians: pull all NPI-1 records whose `taxonomies[]` array contains licenses in multiple states, a strong signal for telehealth-ready or travel-ready providers
- Source physicians, NPs, PAs, RNs, CRNAs, PTs, OTs, SLPs by specialty + city + state + license status
- Detect expired or deactivated NPIs in the candidate pipeline before submitting them to a client facility
- Build credentialing packets faster — many credentialing-verification orgs (CVOs) start their workflow from NPPES
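The multi-state check mentioned above can be run directly on the actor's output. The sketch below is a minimal example that assumes each `taxonomies[]` entry carries `state` and `license` keys, as in the example records; `license_states` and `multi_state_providers` are hypothetical helpers.

```python
from typing import Iterable


def license_states(record: dict) -> set:
    """Collect the distinct states in which the record lists a license."""
    return {
        taxonomy["state"]
        for taxonomy in record.get("taxonomies") or []
        if taxonomy.get("state") and taxonomy.get("license")
    }


def multi_state_providers(records: Iterable[dict], minimum: int = 2) -> list:
    """Keep individual providers licensed in at least `minimum` states."""
    return [
        record for record in records
        if record.get("enumerationType") == "NPI-1"
        and len(license_states(record)) >= minimum
    ]
```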
Insurance & Payer Networks
Health insurers, PBMs, ACOs, and provider-directory aggregators rely on NPPES to maintain accurate networks:
- Audit network adequacy under federal Network Adequacy / No Surprises Act requirements — count active providers per specialty per ZIP
- Refresh provider directories continuously — federal "Consumer-Friendly Provider Directory" rules require monthly verification
- Identify out-of-network providers for steerage messaging
- Reconcile claims when the billing NPI doesn't match the rendering NPI
- Detect orphaned or duplicate provider records in payer master data using NPI as the federal primary key
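For the network-adequacy count (active providers per specialty per ZIP), a minimal sketch over the actor's flat output might look like this. It assumes `postalCode` may hold a 9-digit ZIP+4 string, as in the example records, and `active_counts` is a hypothetical helper.

```python
from collections import Counter
from typing import Iterable


def active_counts(records: Iterable[dict]) -> Counter:
    """Count Active providers per (5-digit ZIP, primary specialty)."""
    counts: Counter = Counter()
    for record in records:
        if record.get("status") != "Active":
            continue
        # Keep only the 5-digit prefix so ZIP+4 values group together.
        zip5 = (record.get("postalCode") or "")[:5]
        specialty = record.get("primaryTaxonomyDesc") or "Unknown"
        counts[(zip5, specialty)] += 1
    return counts
```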
Compliance, Credentialing & Claims Verification
Revenue-cycle, compliance, and credentialing teams use NPPES for automated provider verification:
- Verify provider NPIs at claim submission to prevent denied claims
- Cross-check against OIG LEIE and SAM.gov exclusion lists starting from a verified NPI roster
- Monitor employed-physician status changes: `status: Deactivated` flags death, license loss, or voluntary withdrawal
- Confirm taxonomy alignment: billing taxonomy on the claim must match the provider's enumerated taxonomy
- Maintain audit-ready logs with timestamped `scrapedAt` and provenance via `sourceUrl` for every verification event
- Replace expensive third-party credentialing APIs that charge per-lookup fees
Healthcare M&A and Investor Due Diligence
PE firms, strategic acquirers, and healthcare investment banks use NPPES to underwrite practice acquisitions:
- Inventory a target practice's provider roster — every NPI-1 affiliated with the NPI-2 organization
- Quantify multi-state license footprint of physician partners — critical for telehealth-rollup theses
- Assess specialty mix of a group practice or ASC at acquisition
- Detect "ghost providers" — billed-from NPIs not actually practicing at the location
- Benchmark provider density when comparing acquisition targets in adjacent geographies
- Track post-close provider attrition by re-scraping monthly
Public Health, Policy & Workforce Research
Academic researchers, state health departments, RWJF / Kaiser Family Foundation analysts, and policy think tanks use NPPES as a workforce dataset:
- Map provider supply per capita per specialty across rural vs. urban regions
- Identify pharmacy deserts, primary-care deserts, mental-health-provider deserts by ZIP code or census tract
- Track workforce trends — psychiatric NP growth, primary-care MD decline, retail-pharmacy consolidation
- Power Certificate of Need (CON) applications with current empirical provider distribution
- Inform graduate medical education (GME) planning by tracking residency-completing physician retention
- Quantify the multi-state telehealth-licensure landscape post-pandemic
Telehealth & Digital-Health Startup Launch
Telehealth platforms and digital-health startups use NPPES to scout licensed providers state-by-state:
- Find physicians licensed in every state your platform launches in: `taxonomies[]` contains a license per state
- Recruit clinicians by specialty + geography for asynchronous-care, virtual-first, or chronic-care platforms
- Audit network breadth when applying for HHS waivers, state Medicaid contracts, or commercial-payer onboarding
- Verify provider identity at clinician onboarding (KYC for clinical credentials)
- Maintain a credentialing data lake updated daily so onboarding never blocks on stale verifications
B2B Healthcare SaaS Marketing & Demand Generation
EHR vendors, RCM platforms, telehealth tooling, AI medical scribes, and clinical-decision-support startups use NPPES to drive demand:
- Segment outreach by practice type — independent group, hospital-employed, FQHC, ASC, urgent care
- Personalize messaging by specialty: a dental-practice SaaS can message providers under `1223G0001X` (General Practice Dentistry) differently than those under the generic `122300000X` (Dentist) code
- Build account-based-marketing (ABM) lists of high-priority organizations (NPI-2) with their affiliated providers (NPI-1)
- Enrich inbound leads by NPI lookup at form submission — auto-populate practice address, specialty, and credentials
- Power conference & event lists by city, specialty, and active-status filter
Government, Medicare/Medicaid Operations & Policy
Federal/state agency contractors and Medicare/Medicaid intermediaries use NPPES daily:
- Track Medicare/Medicaid enrolled provider populations when joined with PECOS data
- Monitor workforce-shortage area redesignation with current empirical provider counts
- Build equity dashboards — provider density by race-of-population, income tier, urbanicity
- Audit Medicaid managed-care organization provider networks for state compliance reviews
- Power state telehealth-licensure registries and interstate-compact tracking
Journalism & Investigative Reporting
Newsrooms (ProPublica, Kaiser Health News, STAT, local investigative units) use NPPES as a starting graph for healthcare investigations:
- Detect sham clinics and pill-mill patterns — high-volume NPI-2 organizations with thin provider rosters
- Map opioid-prescribing networks when joined with Medicare Part D data
- Track physician sanctioning patterns by cross-referencing state medical boards
- Investigate provider directory accuracy — federal "ghost network" reporting requires NPPES as the truth source
- Identify cross-state provider movement post-license revocation in one state
Real-Estate, Site-Selection & Retail Healthcare Strategy
Commercial-real-estate firms, retail-clinic operators (CVS MinuteClinic, Walgreens, One Medical), and urgent-care chains use NPPES for site selection:
- Identify primary-care shortage ZIPs as urgent-care site candidates
- Benchmark competition density before signing a lease
- Inform DTC retail-pharmacy expansion with current pharmacy distribution
- Support medical-office-building (MOB) investment theses with on-site provider density
Sample Queries & Recipes
Recipe 1 — Every cardiologist in the NY tri-state area
```json
{
  "taxonomyDescription": "Cardiovascular Disease",
  "states": ["NY", "NJ", "CT"],
  "enumerationType": "NPI-1",
  "maxRecords": 0
}
```
Pulls every active and historical cardiology specialist enumerated in NY, NJ, or CT — perfect for a pharma cardiac-drug launch list or a private-equity cardiology rollup target list.
Recipe 2 — All retail pharmacies in California
```json
{
  "taxonomyDescription": "Community/Retail Pharmacy",
  "states": ["CA"],
  "enumerationType": "NPI-2",
  "maxRecords": 0
}
```
Returns every NPI-2 community/retail pharmacy in California — independents plus chain locations — with phone numbers and mailing addresses.
Recipe 3 — Direct lookup batch for verification
```json
{
  "npis": [
    "1982445060",
    "1932487765",
    "1689675476",
    "1801880335",
    "1023045678"
  ]
}
```
Fastest mode: one request per NPI, returns the full record for each. Ideal for nightly compliance verification of an employed-provider roster.
Recipe 4 — All NPs and PAs in greater Houston
```json
{
  "taxonomyDescription": "Nurse Practitioner",
  "cities": ["Houston", "Katy", "The Woodlands", "Sugar Land"],
  "states": ["TX"],
  "enumerationType": "NPI-1",
  "maxRecords": 5000
}
```
Run this once for NPs, then again with taxonomyDescription: "Physician Assistant" to assemble a full mid-level provider list.
Recipe 5 — Multi-specialty pull for telehealth launch in 5 states
```json
{
  "taxonomyDescription": "Family Medicine",
  "states": ["FL", "TX", "GA", "NC", "VA"],
  "enumerationType": "NPI-1",
  "addressPurpose": "LOCATION",
  "maxRecords": 0
}
```
Use this as the cornerstone of a multi-state telehealth provider acquisition pipeline.
Recipe 6 — Single ZIP audit (network adequacy)
```json
{
  "postalCode": "10001",
  "enumerationType": "any",
  "maxRecords": 0
}
```
Returns every individual and organization NPI registered to ZIP 10001 — Manhattan Chelsea — for a payer's network-adequacy audit.
Recipe 7 — All Mount Sinai-affiliated organizations
```json
{
  "organizationName": "Mount Sinai",
  "states": ["NY"],
  "enumerationType": "NPI-2",
  "maxRecords": 0
}
```
Use the practiceLocations[] and identifiers[] fields downstream to roll up affiliated entities.
Recipe 8 — Sample 5 records before committing
{"lastName": "Smith","states": ["NY"],"maxRecords": 5}
Test mode — confirm the shape of returned records (the smoke-test query used during development) before launching a multi-thousand-record sweep.
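Any of the recipe inputs above can also be submitted programmatically. The following is a minimal sketch that posts an input to Apify's synchronous run endpoint using only the standard library. The actor ID shown is a hypothetical placeholder (copy the real one from the actor's API tab), and the synchronous endpoint is probably only suitable for short runs such as the 5-record sample.

```python
import json
import urllib.request

APIFY_BASE = "https://api.apify.com/v2"
# Hypothetical placeholder: replace with the actor ID shown in the Apify Console.
ACTOR_ID = "your-username~npi-registry-scraper"


def build_run_request(actor_id: str, token: str, actor_input: dict) -> urllib.request.Request:
    """Build a POST request that runs the actor and returns its dataset items."""
    url = f"{APIFY_BASE}/acts/{actor_id}/run-sync-get-dataset-items"
    return urllib.request.Request(
        url,
        data=json.dumps(actor_input).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def run_recipe(actor_id: str, token: str, actor_input: dict, timeout: float = 300.0) -> list:
    """Send the request and decode the JSON array of records (network call)."""
    request = build_run_request(actor_id, token, actor_input)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Usage (requires a real token and actor ID):
# records = run_recipe(ACTOR_ID, "<APIFY_TOKEN>", {"lastName": "Smith", "states": ["NY"], "maxRecords": 5})
```

For multi-thousand-record sweeps, starting the run asynchronously and reading the dataset afterwards is likely the safer pattern.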
Integration Examples
Google Sheets (via Apify integration)
- Schedule the actor daily at 06:00 with your search filters
- Add the Export to Google Sheets integration on the schedule
- A fresh NPI dataset arrives in your sheet every morning — ready for CRM sync, vlookups, and pivot tables
Make.com / Zapier / n8n
Use the Apify connector on any major automation platform. Trigger downstream workflows on:
- New NPIs (today's run minus yesterday's) — feed into Salesforce as Lead records
- status changes (Active→Deactivated) — open a compliance review case
- Address changes (provider relocations) — sync to HubSpot Account fields
- New endpoints[] entries — push to your FHIR interoperability dashboard
Power BI / Tableau / Looker / Hex
Connect Apify's REST dataset endpoint as a data source. Refresh on the Apify run schedule. Visualizations to start with:
- Active providers by specialty by state
- New NPI enumerations per month
- Provider density per 100K population by county
- Network-adequacy heat maps for payer regions
- Multi-state licensure overlap analysis
Postgres / Snowflake / BigQuery / Redshift
Use the Apify webhook integration to POST the run results to your warehouse ingestion endpoint after each scheduled run. Drop into a providers_raw staging table, then dbt-model the nested arrays (taxonomies[], identifiers[], endpoints[]) into proper relational tables keyed on npi.
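If you prefer to normalize before loading rather than in dbt, the nested arrays can be exploded into child rows keyed on npi. This is a sketch under the assumption that each array item is a flat JSON object; the `position` column and the `explode` helper name are illustrative additions.

```python
def explode(records: list[dict], array_field: str) -> list[dict]:
    """Turn one nested array per provider into one row per array item.

    Each output row carries the parent `npi` plus the item's own keys,
    so it can be loaded into a child table such as provider_taxonomies.
    """
    rows = []
    for record in records:
        for index, item in enumerate(record.get(array_field) or []):
            # Parent key and ordinal are written last so they cannot be overwritten.
            rows.append({**item, "npi": record["npi"], "position": index})
    return rows
```

Running it once per array field (taxonomies, identifiers, endpoints) yields three child tables that all join back to the provider row on npi.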
Salesforce / HubSpot / Pipedrive CRM Enrichment
Trigger an Apify run nightly, then upsert against Account records keyed on npi. Use the dataset's lastUpdated field as the change-detection signal. Specialty changes can trigger automatic re-assignment to the correct sales rep.
NPI Enrichment Microservice
Wrap the actor in a Make.com / n8n flow with npis as a webhook-triggered input. Any inbound lead with an NPI gets enriched in <5 seconds with full credentials, specialty, license, and address.
Provider Directory Refresh Pipeline
Schedule the actor weekly with states filtered to your payer footprint. Diff today's pull against last week's lastUpdated and status fields to detect:
- New providers joining the market
- Providers who moved or changed phone numbers
- Providers whose NPIs deactivated (death, dissolution, voluntary)
- New taxonomies added (provider specializing further)
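The four change types above can be detected with a straightforward snapshot diff. The sketch below compares two lists of records keyed on npi. It uses the status, phone, locationAddress, and taxonomies[] fields described in this README, and it assumes each taxonomy item carries a `code` key, which you should adjust to the actual field name in your dataset.

```python
def diff_snapshots(previous: list[dict], current: list[dict]) -> dict:
    """Classify changes between last week's pull and today's pull."""
    old_by_npi = {record["npi"]: record for record in previous}
    changes = {"new": [], "moved_or_rephoned": [], "deactivated": [], "taxonomy_added": []}
    for record in current:
        npi = record["npi"]
        old = old_by_npi.get(npi)
        if old is None:
            changes["new"].append(npi)
            continue
        # Status flipped to Deactivated since the previous pull.
        if old.get("status") != "Deactivated" and record.get("status") == "Deactivated":
            changes["deactivated"].append(npi)
        # Practice address or phone differs from the previous pull.
        if (old.get("locationAddress"), old.get("phone")) != (
            record.get("locationAddress"),
            record.get("phone"),
        ):
            changes["moved_or_rephoned"].append(npi)
        # `code` is an assumed key name for the taxonomy identifier.
        old_codes = {t.get("code") for t in old.get("taxonomies") or []}
        new_codes = {t.get("code") for t in record.get("taxonomies") or []}
        if new_codes - old_codes:
            changes["taxonomy_added"].append(npi)
    return changes
```

Comparing lastUpdated first and only diffing records where it changed would reduce work on large pulls.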
Major US Healthcare Markets at a Glance
NPPES covers every US state, territory, and outlying area. Below are the highest-volume metro areas a typical user filters by.
| Metro Area | State | Healthcare Significance |
|---|---|---|
| New York | NY | NYC academic medical centers (Mount Sinai, NYU Langone, NewYork-Presbyterian, Northwell, Memorial Sloan Kettering) |
| Los Angeles | CA | Cedars-Sinai, UCLA, USC Keck, Kaiser Permanente regional headquarters |
| Chicago | IL | Northwestern, Rush, University of Chicago, Advocate Health |
| Houston | TX | Texas Medical Center — world's largest medical complex (54 institutions, MD Anderson, Houston Methodist) |
| Dallas–Fort Worth | TX | Baylor Scott & White, UT Southwestern, HCA networks |
| Philadelphia | PA | Penn Medicine, Jefferson, CHOP, Independence Blue Cross |
| Boston | MA | Mass General Brigham, Beth Israel Lahey, Tufts, Boston Children's |
| Atlanta | GA | Emory, Piedmont, CDC headquarters, telehealth & digital-health hub |
| Washington DC / Baltimore | DC / MD | Johns Hopkins, Medicare/CMS headquarters in Woodlawn MD |
| Miami | FL | UMiami, Jackson Health, Baptist Health South Florida |
| Phoenix | AZ | Mayo Clinic Arizona, Banner Health, HonorHealth |
| Seattle | WA | UW Medicine, Providence, Virginia Mason, Fred Hutch |
| San Francisco Bay Area | CA | UCSF, Stanford, Kaiser Permanente, telehealth-startup density |
| Minneapolis | MN | Mayo Clinic Rochester (1-hr south), Allina, Fairview, HealthPartners |
| Detroit | MI | Henry Ford, DMC, Beaumont, Michigan Medicine (Ann Arbor) |
| Cleveland | OH | Cleveland Clinic, UH, MetroHealth |
| St. Louis | MO | BJC, SSM, Mercy, Washington University |
| Pittsburgh | PA | UPMC, Allegheny Health Network |
| Tampa | FL | Moffitt, AdventHealth, BayCare |
| Charlotte | NC | Atrium Health, Novant Health |
| San Diego | CA | Scripps, Sharp, UCSD Health |
| Denver | CO | UCHealth, HealthONE, Centura, SCL |
| Portland | OR | OHSU, Providence, Legacy, Kaiser Permanente |
| Nashville | TN | HCA headquarters, Vanderbilt, Ascension Saint Thomas |
| Indianapolis | IN | IU Health, Community Health Network, Ascension St. Vincent |
| San Antonio | TX | Methodist, Baptist, University Health, BAMC (military) |
NPPES also covers Puerto Rico (PR), US Virgin Islands (VI), Guam (GU), American Samoa (AS), and Northern Mariana Islands (MP).
Cost & Performance
| Metric | Value |
|---|---|
| Engine | HTTP-only (got-scraping) — no browser |
| Data source | Official CMS NPPES JSON API (npiregistry.cms.hhs.gov/api/?version=2.1) |
| Page size | 200 records per HTTP call |
| Pagination depth | Verified past skip=10000 offset; no hard ceiling encountered |
| Runtime (single filter, 500 records) | 5–20 seconds |
| Runtime (statewide specialty sweep, 5,000 records) | 1–3 minutes |
| Runtime (multi-state sweep, 50,000+ records) | 15–45 minutes |
| Cost per run | Fractions of a Compute Unit — pay-per-event scales with record count |
| Pricing model | Pay-per-event (transparent per-record pricing) |
| Data freshness | Live at run time — every request hits the CMS API directly |
| Auth required | None |
| Proxy required | None — CMS API is public and unrestricted |
| Concurrency | Safe to run multiple parallel filtered configurations |
| Memory footprint | 256 MB sufficient for most runs; 1024 MB max recommended for >100K records |
| Retry policy | 3 attempts with exponential backoff + jitter |
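For readers who want to see what the page-size and retry rows mean in practice, here is a rough standard-library sketch of limit/skip pagination with three attempts and exponential backoff plus jitter. It is not the actor's source code (the actor uses got-scraping), and the base delay and timeout values are assumptions.

```python
import json
import random
import time
import urllib.parse
import urllib.request

NPPES_URL = "https://npiregistry.cms.hhs.gov/api/"
PAGE_SIZE = 200


def page_url(filters: dict, skip: int) -> str:
    """Build one NPPES v2.1 query URL for the page starting at offset `skip`."""
    params = {"version": "2.1", **filters, "limit": PAGE_SIZE, "skip": skip}
    return NPPES_URL + "?" + urllib.parse.urlencode(params)


def backoff_delay(attempt: int, base: float = 1.0, rng=random.random) -> float:
    """Exponential backoff (base * 2^attempt) plus up to `base` seconds of jitter."""
    return base * (2 ** attempt) + rng() * base


def fetch_json(url: str, attempts: int = 3) -> dict:
    """GET a URL, retrying network-level failures with backoff."""
    for attempt in range(attempts):
        try:
            with urllib.request.urlopen(url, timeout=30) as response:
                return json.load(response)
        except OSError:  # URLError and HTTPError are OSError subclasses
            if attempt == attempts - 1:
                raise
            time.sleep(backoff_delay(attempt))


def iter_results(filters: dict, max_records: int):
    """Yield up to `max_records` raw NPPES results, one page at a time."""
    skip, yielded = 0, 0
    while yielded < max_records:
        page = fetch_json(page_url(filters, skip)).get("results", [])
        for result in page:
            yield result
            yielded += 1
            if yielded >= max_records:
                return
        if len(page) < PAGE_SIZE:  # a short page means the query is exhausted
            return
        skip += PAGE_SIZE
```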
Compliance, Privacy & Legal Notes
This actor calls the official US federal CMS NPPES API. There is no scraping of any private website, no bypassing of any access control, and no violation of any Terms of Service. NPPES is a federally-mandated public registry created under HIPAA Administrative Simplification (45 CFR Part 162) specifically to be freely queryable by the public.
- 100% public data — every field returned is published by CMS at npiregistry.cms.hhs.gov under federal regulation
- No PHI (Protected Health Information) — NPPES contains zero patient data. It is a provider-identification registry, not a clinical database
- No SSNs, no DOBs, no salary data, no patient names — only provider-identification information
- Addresses are business/practice addresses as reported by the provider to CMS — not personal residence (in most cases)
- Phone numbers are practice phone numbers as reported by the provider to CMS
- HIPAA does not apply — provider-identification data is explicitly carved out of HIPAA protected-health-information definitions
- No emails — NPPES does not publish provider email addresses (only FHIR Direct endpoints, which are designed for secure interoperable exchange)
- No login required, no API key, no rate-limit waiver — the CMS API is intentionally open
- GDPR/CCPA/CAN-SPAM/TCPA compliance is the responsibility of the data consumer based on their downstream use case
- No robots.txt violation — the API has no robots.txt to respect; it is an API endpoint, not a website
Important: NPPES data may not be used for unlawful purposes including identity fraud, harassment, or stalking. Commercial use (sales, marketing, recruiting, research) is explicitly permitted by federal regulation. Patient-facing outreach should always respect HIPAA Marketing Rule restrictions, TCPA call/text consent requirements, and CAN-SPAM email opt-out rules independently of NPPES sourcing.
Frequently Asked Questions
How fresh is the data?
Live at run time. Every page is fetched from the CMS NPPES API at the moment of the run. NPPES itself is updated continuously by CMS as providers enumerate new NPIs, update their addresses, add taxonomies, change license state, or deactivate.
How many records are in NPPES?
Over 7 million. As of mid-2026, the registry holds more than 7 million active and historical NPIs across both NPI-1 (individual) and NPI-2 (organization) types. The exact total changes daily as new NPIs are enumerated.
Does this scraper require login or API keys?
No. NPPES is a federally-mandated public API. No authentication, no key registration, no IP allowlisting. The only credentials you need are your Apify token to run the actor itself.
Is there a hard limit on how many records I can pull?
The CMS API caps each response at 200 records but does not cap pagination depth. We have verified pagination past skip=10000 on broad queries. For very broad queries (e.g. "every NPI in Texas"), split into multiple narrower queries (e.g. by city or ZIP) to keep individual task pagination manageable.
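Splitting a broad query amounts to a Cartesian fan-out over the list-valued filters. The sketch below shows one plausible way to expand states and cities into single-valued tasks. How the actor itself schedules tasks is not documented here, so treat the `fan_out` helper and its singular `state` / `city` keys as assumptions.

```python
from itertools import product


def fan_out(actor_input: dict) -> list[dict]:
    """Expand list-valued `states` and `cities` into one task per combination."""
    states = actor_input.get("states") or [None]
    cities = actor_input.get("cities") or [None]
    shared = {k: v for k, v in actor_input.items() if k not in ("states", "cities")}
    tasks = []
    for state, city in product(states, cities):
        task = dict(shared)
        if state:
            task["state"] = state
        if city:
            task["city"] = city
        tasks.append(task)
    return tasks
```

With the Recipe 4 input (one state, four cities) this produces four tasks, each of which paginates independently.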
Do I get phone numbers?
Yes. NPPES providers report a telephone_number per address. The actor exposes both phone (location/practice address phone) and mailingPhone (mailing-address phone). Coverage is very high for NPI-2 organizations and high but not universal for NPI-1 individuals.
Do I get email addresses?
No. CMS does not collect or publish provider email addresses in NPPES. The actor exposes FHIR Direct messaging endpoints (when present) via the endpoints[] field — these are secure-interoperability endpoints, not general-purpose email.
Can I search by specialty?
Yes. Use the taxonomyDescription field with the human-readable specialty text ("Cardiology", "Family Medicine", "Pharmacy", "Dentist", "Nurse Practitioner", etc.). The actor passes it directly to the NPPES taxonomy_description query parameter, which performs partial-match search.
Can I search by state license number?
The CMS API does not expose license-number search directly. Pull a broader set (e.g. all dentists in a state) and filter downstream on the taxonomies[].license field.
Can I look up multiple NPIs in one run?
Yes. Put them in the npis array. Each NPI runs as its own task — direct lookup mode is the fastest path and works for any quantity (verified to thousands per run).
Does it work for all 50 states + DC?
Yes — plus US territories (Puerto Rico PR, US Virgin Islands VI, Guam GU, American Samoa AS, Northern Mariana Islands MP).
Does it work for non-US providers?
No. NPPES enumerates only US providers (plus territories) under HIPAA. For non-US healthcare directories see the related actors below (WhatClinic, Bookimed for global clinic directories).
Are deactivated NPIs included?
Yes — the status field distinguishes Active from Deactivated. Filter downstream as needed. Deactivated NPIs are useful for historical research, M&A due diligence, and detecting fraud signals.
Can I get sub-organization or practice-location data?
Yes. The practiceLocations[] field returns multi-site organizations' physical practice locations. Many large health systems enumerate one NPI-2 per location separately — searching by organization name returns all of them.
What's the difference between LOCATION and MAILING address?
- LOCATION is where the provider physically practices — best for phone outreach and proximity analysis.
- MAILING is where correspondence is sent — often a PO Box, HQ, or billing office. Best for postal-mail outreach.
The actor exposes both as nested objects (locationAddress, mailingAddress) and surfaces the LOCATION values at the top level (with MAILING fallback).
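The fallback can be pictured as follows. This sketch works on a raw NPPES result, whose addresses[] items carry address_purpose, city, state, postal_code, and telephone_number. The output keys other than locationAddress, mailingAddress, phone, and mailingPhone are assumed names, and applying the fallback to phone as well is also an assumption.

```python
def split_addresses(raw_record: dict) -> tuple:
    """Return the first LOCATION and first MAILING address (either may be None)."""
    location, mailing = None, None
    for address in raw_record.get("addresses", []):
        purpose = address.get("address_purpose")
        if purpose == "LOCATION" and location is None:
            location = address
        elif purpose == "MAILING" and mailing is None:
            mailing = address
    return location, mailing


def flatten_contact(raw_record: dict) -> dict:
    """Surface LOCATION values at the top level, falling back to MAILING."""
    location, mailing = split_addresses(raw_record)
    primary = location or mailing or {}
    return {
        "locationAddress": location,
        "mailingAddress": mailing,
        "city": primary.get("city"),
        "state": primary.get("state"),
        "postalCode": primary.get("postal_code"),
        "phone": primary.get("telephone_number"),
        "mailingPhone": (mailing or {}).get("telephone_number"),
    }
```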
Can I run this on the Apify Free Plan?
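Yes. The actor has full functionality on the free tier. A test run of 100–500 records costs pennies. For larger sweeps, estimate the cost from the listed per-result price (from $1.00 per 1,000 results) and compare it against your remaining free-tier credit before launching.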
Can I schedule the actor?
Yes. Apify's built-in Scheduler lets you trigger the actor on any cron expression. Combined with the Google Sheets / webhook / warehouse-load integrations, you can stand up a fully-automated NPI refresh pipeline in under 30 minutes.
Can I export to CSV, Excel, or JSON Lines?
Yes. The Apify dataset view exports to JSON, CSV, Excel (XLSX), HTML, XML, RSS, and JSON Lines — directly from the UI or via the dataset API.
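For exports via the dataset API, the format is chosen with a query parameter on the dataset items endpoint. A small sketch of building that URL follows; the `dataset_export_url` helper name is illustrative, and the dataset ID comes from the finished run.

```python
from urllib.parse import urlencode

# Format values accepted by the dataset items endpoint.
EXPORT_FORMATS = {"json", "jsonl", "csv", "xlsx", "html", "xml", "rss"}


def dataset_export_url(dataset_id: str, fmt: str = "csv", token: str = "") -> str:
    """Build the download URL for a run's dataset in the requested format."""
    if fmt not in EXPORT_FORMATS:
        raise ValueError(f"unsupported export format: {fmt!r}")
    params = {"format": fmt}
    if token:
        # Only needed when the dataset is not publicly readable.
        params["token"] = token
    return f"https://api.apify.com/v2/datasets/{dataset_id}/items?{urlencode(params)}"
```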
What happens if my query returns zero results?
The actor calls Actor.fail() with a descriptive message. Common causes: misspelled state code (use 2-letter uppercase NY not Ny), misspelled specialty ("Family Practice" is deprecated — use "Family Medicine"), or impossible filter combinations (e.g. enumerationType: "NPI-1" + organizationName).
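Two of these causes can be caught before a run is launched. The following pre-flight check is a sketch that covers only the state-code format and the NPI-1 plus organizationName conflict; the `validate_input` helper is hypothetical and does not reproduce the actor's own validation.

```python
import re


def validate_input(actor_input: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means no issue found."""
    problems = []
    for state in actor_input.get("states", []):
        # State codes must be exactly two uppercase letters, e.g. NY.
        if not re.fullmatch(r"[A-Z]{2}", state):
            problems.append(f"state code {state!r} is not 2-letter uppercase")
    if actor_input.get("enumerationType") == "NPI-1" and actor_input.get("organizationName"):
        problems.append("organizationName cannot match NPI-1 (individual) records")
    return problems
```

Specialty spelling cannot be checked this way without a taxonomy list, so a 5-record sample run remains the practical test for that case.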
How do I cross-reference NPI to state-board license data?
Each NPI's taxonomies[] includes license and state per taxonomy. Join against state-board license actors (Texas TSBP, California DCA, Ohio eLicense, Illinois IDFPR, Virginia DPOR, Colorado DORA, Minnesota DLI — see Related Actors) on the license number + state pair.
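A join on the license number and state pair might look like the sketch below. The provider side uses taxonomies[].license and taxonomies[].state as described above; the board-side field names `licenseNumber` and `state` are hypothetical and will differ per state-board dataset.

```python
def license_key(license_number: str, state: str) -> tuple:
    """Normalize a (license, state) pair so formatting differences do not block a match."""
    return (license_number.strip().upper(), state.strip().upper())


def join_to_board(providers: list[dict], board_rows: list[dict]) -> list[tuple]:
    """Return (npi, board_row) pairs where a taxonomy license matches a board record."""
    board = {license_key(row["licenseNumber"], row["state"]): row for row in board_rows}
    matches = []
    for provider in providers:
        for taxonomy in provider.get("taxonomies") or []:
            license_number, state = taxonomy.get("license"), taxonomy.get("state")
            if not license_number or not state:
                continue  # many taxonomies carry no license
            row = board.get(license_key(license_number, state))
            if row is not None:
                matches.append((provider["npi"], row))
    return matches
```

State boards often format license numbers differently (prefixes, leading zeros), so additional normalization per board is likely needed.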
Does the actor deduplicate?
Yes. A Set of seen NPIs prevents the same provider from being saved twice across overlapping query combinations.
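The same idea is easy to apply downstream, for example when merging the datasets of several separate runs. A minimal sketch (the `dedupe_by_npi` name is illustrative):

```python
def dedupe_by_npi(records):
    """Yield each provider once, keeping the first record seen for every NPI."""
    seen = set()
    for record in records:
        npi = record["npi"]
        if npi in seen:
            continue
        seen.add(npi)
        yield record
```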
How do I report a bug or request a feature?
Open an issue on the Apify Store actor page or contact the developer directly through the Apify Console.
Related Apify Actors by Haketa
If you build healthcare, compliance, lead-generation, or government-data pipelines, these related actors compose cleanly with the NPI Registry Scraper:
Healthcare directories & marketplaces
- WhatClinic.com Clinic Scraper — global clinic directory (US + 100+ countries) for cosmetic, dental, fertility, and elective specialties
- Bookimed.com Medical Tourism Scraper — international medical-tourism clinic listings, prices, and patient reviews
- ProductHunt Launches & Makers Scraper — daily startup launches, makers, votes & reviews — VC/founder/recruiter intel
US state professional licensing peers (cross-reference NPI license numbers)
- Texas Pharmacy License Scraper — TSBP — every pharmacist, pharmacy, intern, technician in Texas
- Ohio eLicense Scraper — every Ohio professional license (medical, nursing, pharmacy, etc.)
- Illinois IDFPR License Scraper — 1.2M+ Illinois professional licenses
- California DCA Professional License Scraper — California Department of Consumer Affairs healthcare boards
- Virginia DPOR Professional License Scraper — Virginia regulated occupations
- Colorado Professional License Scraper — Colorado DORA licensed professionals
- Minnesota DLI Professional License Scraper — Minnesota Department of Labor & Industry licensing
- Washington L&I Contractor License Scraper
- North Carolina Licensing Board for General Contractors Scraper
- Arizona ROC Contractor License Scraper
Federal disclosure & business registries
- SAM.gov Federal Contractor Entity Scraper — federal contractor entity registry
- TTB Alcohol Permittee Scraper — federal alcohol-industry permits
- BBB Business Scraper — Better Business Bureau profiles, ratings, complaints
- H1B Visa Database Scraper — US federal H1B visa disclosure data
Comparison vs. Alternatives
| Approach | Setup time | Data freshness | Cost per 10K records | Schema normalization | Multi-filter support |
|---|---|---|---|---|---|
| This actor | < 1 minute | Live at run time | From about $10 (at the listed $1.00 / 1,000 results) | Built-in flat schema | Cartesian fan-out |
| Manual NPPES Web UI queries | Per-query click-through | Live | Free but unscalable | None | One filter at a time |
| Roll-your-own NPPES API client | 4–12 hours dev | Live | Free + infra | DIY | DIY |
| Definitive Healthcare subscription | Days | Variable | $$$$ (enterprise contract) | Bundled | Bundled |
| IQVIA / Symphony Health datasets | Weeks | Refresh schedule | $$$$ (enterprise contract) | Bundled | Bundled |
| Health-specific data brokers | Days | Variable | $ per record | Bundled | Bundled |
| FOIA records request to CMS | Weeks | Stale | Free | None | None |
The advantage of going direct to NPPES via this actor: live data, no contract, no minimums, and full transparency about exactly which CMS endpoint sourced every field.
Why Pay-Per-Event Pricing?
Most healthcare data products either charge a flat enterprise subscription (you pay even if you don't query) or per-API-call (unpredictable spend that punishes deep pagination). This actor uses pay-per-event pricing:
- You only pay when the actor runs
- Charges scale linearly with the number of records actually delivered
- Transparent line-item billing inside Apify
- No monthly minimums, no annual commits
- Free to evaluate — sample with maxRecords: 5 for pennies
- No surprise overage charges on broad statewide sweeps
Changelog
| Version | Date | Notes |
|---|---|---|
| 1.0.0 | 2026-05-18 | Initial public release — direct CMS NPPES v2.1 API, NPI / name / taxonomy / location / org-name search, NPI-1 + NPI-2 support, full nested-JSON normalization, retry + dedupe, pay-per-event pricing |
Keywords
NPPES NPI scraper · NPI Registry API · CMS NPI database · US healthcare provider scraper · NPI lookup bulk · physician database scraper · NPI taxonomy specialty · healthcare provider directory API · Medicare provider lookup · NPI bulk download · doctor specialty database US · pharma sales targeting database · healthcare credentialing automation · physician contact info scraper · medical license verification API · CMS NPPES API client · National Provider Identifier extractor · NPI-1 individual provider data · NPI-2 organization NPI scraper · US physician directory · US nurse practitioner database · US pharmacist NPI lookup · US dentist NPI database · US hospital NPI registry · US clinic NPI registry · US group practice NPI · retail pharmacy NPI scraper · mail-order pharmacy NPI lookup · specialty pharmacy directory · physician taxonomy code NUCC · NPI license number cross-reference · network adequacy audit data · provider directory refresh API · telehealth provider scouting · multi-state physician licensing · locum tenens credentialing source · healthcare M&A due diligence dataset · medical-device territory mapping · prescriber targeting database · KOL identification healthcare · pharma rep territory planning · Medicare Part D enrichment · FHIR Direct endpoint directory · healthcare lead generation B2B · payer network audit · clinical staffing source · CRM enrichment healthcare · Salesforce HubSpot NPI enrichment · public health workforce data · pharmacy desert mapping · primary care shortage area · health-policy research dataset · Apify healthcare actor · npiregistry.cms.hhs.gov API · NPI bulk export CSV JSON · provider-view scraper
Support
- Bug reports: Use the Issues tab on the Apify Store page
- Feature requests: Same place — please describe your healthcare workflow and the field or filter you need
- Direct contact: Through the Apify developer profile
If this actor saves your team hours of NPI verification, credentialing automation, or sales-targeting work, a 5-star rating on the Apify Store helps other healthcare, compliance, and life-sciences teams discover it. Thank you.