NPI Registry Scraper | 7M+ US Healthcare Providers (CMS)

Pricing

from $1.00 / 1,000 results

Scrape the official US CMS NPPES NPI Registry (7M+ healthcare providers). Per-provider: NPI number, name + credential, taxonomy (specialty + license), addresses, phone, identifiers, organization data. Free public CMS API — pharma sales, healthcare staffing, insurance, credentialing.

Developer

Haketa

Maintained by Community

NPI Registry Scraper — 7M+ US Healthcare Providers from the Official CMS NPPES API

The most complete NPPES NPI Registry data extraction actor on Apify. Pull every licensed US healthcare provider (physicians, NPs, PAs, RNs, pharmacists, dentists, therapists, hospitals, clinics, group practices, pharmacies) straight from the official Centers for Medicare & Medicaid Services (CMS) National Plan and Provider Enumeration System at npiregistry.cms.hhs.gov. Search by NPI, name, specialty (taxonomy), city, state, or ZIP. 7,000,000+ providers indexed, no auth, no hard rate limit for reasonable use, no scraping of private sites: 100% official public CMS data.

What This Actor Does

The NPI Registry Scraper is a production-grade Apify Actor that queries the official US CMS NPPES API (https://npiregistry.cms.hhs.gov/api/?version=2.1) — the federal source-of-truth for every healthcare provider authorized to bill Medicare, Medicaid, or any HIPAA-covered payer in the United States.

Every clinician, allied-health professional, and healthcare organization operating in the US is required by HIPAA to obtain a National Provider Identifier (NPI) before submitting electronic transactions. The result: a freely-queryable federal registry of 7,000,000+ providers, refreshed continuously as new providers enumerate and existing providers update their licensure, address, or specialty. This actor turns that API into a structured, paginated, ready-for-CRM/data-warehouse dataset.

In a single run the actor can return tens, thousands, or hundreds of thousands of fully-normalized provider records — each one covering:

  • Individual providers (NPI-1) — every physician (MD, DO), nurse practitioner (NP, CRNP, APRN), physician assistant (PA), registered nurse (RN), dentist (DDS, DMD), pharmacist (RPh, PharmD), psychologist, social worker (LCSW, LMSW), physical therapist (PT, DPT), occupational therapist (OT), chiropractor (DC), optometrist (OD), podiatrist (DPM), audiologist, dietitian, midwife, behavior analyst — and 230+ other taxonomy classes.
  • Organizations (NPI-2) — every hospital, ambulatory surgery center, group practice, clinic, federally qualified health center (FQHC), skilled nursing facility (SNF), home health agency, hospice, DME supplier, independent lab, retail pharmacy, mail-order pharmacy, specialty pharmacy, telehealth platform, and ACO.
  • Sub-parts & locations — multi-site organizations enumerate practice locations separately, all queryable.
  • Identifiers — state Medicaid IDs, state license numbers, legacy UPINs and other linked credentials.
  • FHIR Direct endpoints — the federally-published secure-messaging addresses that enable interoperable health-record exchange.

Each record includes the 10-digit NPI, full provider/organization name, professional credential, primary taxonomy (specialty), supporting taxonomies, all licensure numbers + state, status (Active / Deactivated), enumeration date, last-update date, mailing address with phone, location/practice address with phone, identifiers array, other-names array (DBA, former names, AKAs), endpoints array, and a canonical sourceUrl linking back to the official CMS provider-view page.

Why scrape NPPES yourself when this exists?

The NPPES API is technically "public," but engineering teams who try to roll their own pipeline keep hitting the same wall of friction:

  • The API caps each response at 200 records — anything beyond requires manual offset pagination with no next_page token.
  • The response JSON is deeply nested (basic, taxonomies[], addresses[], identifiers[], other_names[], endpoints[], practiceLocations[]) and shape-shifts between NPI-1 individual and NPI-2 organization records.
  • Address handling is messy — providers can have multiple addresses with different purposes (LOCATION, MAILING, PRIMARY, SECONDARY) and you have to pick the right one for your use case (telephone outreach vs. mailing).
  • Taxonomy codes use the NUCC Healthcare Provider Taxonomy (10-character alphanumeric codes mapped to a 230+-entry classification tree), which is useless without the human-readable desc field correctly joined.
  • The CMS server returns HTTP 200 with an error object instead of HTTP 4xx/5xx — naive clients silently swallow failures.
  • A single typo in a state code (Ny vs NY) returns zero results with no warning.
  • Searching specialties like "Family Medicine" requires knowing it's spelled "Family Medicine" not "Family Practice" (which is the deprecated label).
  • Building a multi-state, multi-city, multi-specialty sweep is dozens of orchestrated queries with deduplication — a lot of pipeline code you don't want to write or maintain.
  • The API itself has no bulk-download endpoint, so every record you pull through it must be paged through the search API.
  • And the CMS API has no SDK in your favorite language — you're on your own with raw HTTP.
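
The HTTP-200-with-error-object behavior in the list above is the easiest item to miss, so a small guard helps. The following is a minimal sketch, not the actor's own code: it assumes a failed response carries a top-level `Errors` list whose entries have a `description`, and a successful one carries `results` (verify both key names against a live response). `NppesError` and `extract_results` are hypothetical names.

```python
from typing import Any


class NppesError(RuntimeError):
    """Raised when a response body carries an error object despite HTTP 200."""


def extract_results(payload: dict[str, Any]) -> list[dict[str, Any]]:
    # Assumed failure shape: {"Errors": [{"description": "..."}, ...]}
    # Assumed success shape: {"result_count": N, "results": [...]}
    errors = payload.get("Errors")
    if errors:
        messages = "; ".join(
            str(e.get("description", e)) if isinstance(e, dict) else str(e)
            for e in errors
        )
        raise NppesError(messages)
    return payload.get("results", [])
```

Calling `extract_results(response_json)` right after decoding the body turns a silent failure into an exception the caller has to handle.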

This actor solves all of that: it builds the full task matrix from your filters, paginates through every page of every query in parallel-friendly batches, retries failed calls with an increasing backoff delay, deduplicates by NPI, normalizes the nested JSON into a flat row, and pushes ready-to-consume records to the Apify dataset. Zero glue code.
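
To make the "task matrix" idea concrete, here is a hedged sketch of how a city × state fan-out could be built. It is an illustration under assumptions, not the actor's source: `build_tasks` is a hypothetical helper, and the `city` / `state` keys are assumed to be the query parameter names.

```python
from itertools import product
from typing import Optional


def build_tasks(base: dict[str, str], cities: list[str],
                states: list[str]) -> list[dict[str, str]]:
    # An empty filter list contributes one "no filter" slot (None),
    # so the product never collapses to zero tasks.
    city_slots: list[Optional[str]] = list(cities) or [None]
    state_slots: list[Optional[str]] = list(states) or [None]
    tasks = []
    for city, state in product(city_slots, state_slots):
        task = dict(base)  # copy the shared filters for each combination
        if city is not None:
            task["city"] = city
        if state is not None:
            task["state"] = state
        tasks.append(task)
    return tasks
```

Two cities and three states yield six discrete queries, which is why deduplication by NPI matters once the results are merged.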


Quick Start

One-Click Run

  1. Click "Try for free" on the Apify Store page for this actor.
  2. Enter at least one filter — for example, lastName = "Smith" plus states = ["NY"], or taxonomyDescription = "Cardiology" plus cities = ["Los Angeles"]. Alternatively paste a list of 10-digit NPIs into the NPI Numbers field for direct lookup.
  3. Hit Start — the actor paginates through every matching provider, 200 per page.
  4. Export the dataset as JSON, CSV, Excel (XLSX), HTML, XML, RSS, or JSON Lines from the Apify dataset view, or pull it via the REST API.

API Run (Python)

from apify_client import ApifyClient

client = ApifyClient("YOUR_APIFY_TOKEN")

# Start the actor and block until the run finishes
run = client.actor("haketa/nppes-npi-registry-scraper").call(run_input={
    "taxonomyDescription": "Cardiology",
    "states": ["NY", "NJ", "CT"],
    "enumerationType": "NPI-1",
    "maxRecords": 2000,
})

# Stream every record from the run's default dataset
for provider in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(
        provider["npi"],
        provider["fullName"],
        provider["credential"],
        provider["primaryTaxonomyDesc"],
        provider["city"], provider["state"],
        provider["phone"],
    )

API Run (Node.js / TypeScript)

import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: 'YOUR_APIFY_TOKEN' });

// Top-level await requires an ES module context
const run = await client.actor('haketa/nppes-npi-registry-scraper').call({
  npis: ['1982445060', '1932487765'],
  taxonomyDescription: 'Pharmacy',
  states: ['CA'],
  cities: ['Los Angeles', 'San Francisco'],
  enumerationType: 'NPI-2',
  maxRecords: 5000
});

const { items } = await client.dataset(run.defaultDatasetId).listItems();
console.log(`Pulled ${items.length} California pharmacy organizations + 2 direct NPI lookups`);

API Run (cURL)

curl -X POST "https://api.apify.com/v2/acts/haketa~nppes-npi-registry-scraper/runs?token=YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "lastName": "Patel",
    "states": ["TX", "FL"],
    "enumerationType": "NPI-1",
    "maxRecords": 1000
  }'

API Run (raw NPI lookup, fastest mode)

curl -X POST "https://api.apify.com/v2/acts/haketa~nppes-npi-registry-scraper/runs?token=YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"npis":["1982445060","1932487765","1689675476"]}'

How It Works

The actor talks to one upstream endpoint — the official CMS NPPES v2.1 JSON API:

| Endpoint | Method | Purpose |
| --- | --- | --- |
| https://npiregistry.cms.hhs.gov/api/?version=2.1&number={NPI} | GET | Direct 10-digit NPI lookup |
| https://npiregistry.cms.hhs.gov/api/?version=2.1&first_name=...&last_name=...&state=... | GET | Individual provider search |
| https://npiregistry.cms.hhs.gov/api/?version=2.1&organization_name=...&state=... | GET | Organization (NPI-2) search |
| https://npiregistry.cms.hhs.gov/api/?version=2.1&taxonomy_description=Cardiology&state=NY | GET | Specialty search |
| https://npiregistry.cms.hhs.gov/api/?version=2.1&postal_code=10001 | GET | ZIP-code search |
| https://npiregistry.cms.hhs.gov/provider-view/{NPI} | GET (web) | Canonical provider-view page (returned as sourceUrl) |
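
The query URLs in the table can be assembled from a filter dictionary. This is a small sketch using only the standard library; `build_query_url` is a hypothetical helper, and `limit` is assumed to be the API's page-size parameter alongside the `skip` offset.

```python
from urllib.parse import urlencode

BASE_URL = "https://npiregistry.cms.hhs.gov/api/"


def build_query_url(filters: dict[str, str], limit: int = 200, skip: int = 0) -> str:
    params = {"version": "2.1"}
    # Drop empty filters so they are not sent as blank parameters
    params.update({key: value for key, value in filters.items() if value})
    params["limit"] = str(limit)  # assumed page-size parameter name
    params["skip"] = str(skip)    # offset of the first record to return
    return f"{BASE_URL}?{urlencode(params)}"


print(build_query_url({"taxonomy_description": "Cardiology", "state": "NY"}))
```

`urlencode` also takes care of escaping, so a city such as "New York" is sent as `New+York`.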

Engineering details

  • HTTP-only via got-scraping — no Puppeteer, no Playwright, no Chromium overhead. The CMS API serves JSON directly; there is nothing to render.
  • No authentication required — NPPES is a federally-mandated public registry. No API key, no OAuth, no IP whitelisting.
  • No proxy required — CMS does not rate-limit reasonable consumers. The actor exposes an optional proxy field for organizations with strict egress policies, but most runs leave it off.
  • Pagination via skip offset — the API returns up to 200 records per call; the actor advances skip by 200 until the page returns fewer than 200 rows (the last page). Verified to deep-paginate past 10,000+ offset on broad queries.
  • Task fan-out — when you supply multiple cities × states × NPIs, the actor builds a Cartesian-product task list and runs each combination as a discrete query.
  • 3-attempt retry with backoff: fetchJson() retries each query up to 3 times, waiting 2000 ms × attempt plus random jitter between tries (so the delay grows linearly with the attempt number), before giving up on that one URL.
  • NPI-level deduplication — a Set of seen NPIs prevents the same provider from being saved twice when overlapping queries (e.g. lastName=Smith + state=NY and lastName=Smith + state=NJ) happen to surface the same record.
  • Defensive nested-JSON parsing — the normalizer copes with missing basic, missing taxonomies, missing addresses, and the divergent shapes of NPI-1 vs. NPI-2 records.
  • Address triage — MAILING and LOCATION addresses are extracted separately; phone, city, state, postalCode, and country at the top level prefer the LOCATION (practice) address with fallback to MAILING.
  • Status normalization — CMS returns "A" or "D" codes; the actor maps these to human-readable Active / Deactivated.
  • Fail-fast on empty results — if a run produces zero records the actor calls Actor.fail() so scheduled pipelines visibly alert rather than silently writing empty datasets.
  • Polite request pacing — configurable requestDelay (default 800 ms) with random jitter between page fetches.
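
The pagination, retry, and deduplication bullets above fit together roughly as follows. This is a sketch under stated assumptions rather than the actor's implementation: `fetch` is a caller-supplied function that returns one page of raw records for a given `skip`, the raw NPI key is assumed to be `number`, and the 0 to 0.5 s jitter range is a guess.

```python
import random
import time
from typing import Callable, Iterator

PAGE_SIZE = 200  # maximum records per API response


def fetch_with_retry(fetch: Callable[[int], list[dict]], skip: int,
                     attempts: int = 3, base_delay: float = 2.0,
                     sleep: Callable[[float], None] = time.sleep) -> list[dict]:
    for attempt in range(1, attempts + 1):
        try:
            return fetch(skip)
        except Exception:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            # 2000 ms x attempt plus random jitter (jitter range assumed)
            sleep(base_delay * attempt + random.uniform(0, 0.5))
    raise RuntimeError("attempts must be at least 1")


def paginate(fetch: Callable[[int], list[dict]], seen: set[str],
             sleep: Callable[[float], None] = time.sleep) -> Iterator[dict]:
    skip = 0
    while True:
        page = fetch_with_retry(fetch, skip, sleep=sleep)
        for record in page:
            npi = str(record.get("number"))  # assumed raw key for the NPI
            if npi not in seen:              # NPI-level deduplication
                seen.add(npi)
                yield record
        if len(page) < PAGE_SIZE:            # a short page is the last page
            return
        skip += PAGE_SIZE
```

Passing the same `seen` set to `paginate` for every task is what keeps overlapping queries from saving a provider twice.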

What version=2.1 returns out of the box

The CMS v2.1 API returns rich nested objects per provider:

  • basic — names, credential, gender, sole-proprietor flag, enumeration date, last-update date, status
  • taxonomies[] — every taxonomy the provider claims, each with code, description, license number, license state, primary flag, taxonomy group
  • addresses[] — multiple addresses with address_purpose (LOCATION / MAILING / PRIMARY / SECONDARY), telephone, fax, country
  • identifiers[] — state Medicaid IDs, state license IDs, other-issuer credentials
  • other_names[] — former names, AKAs, DBAs
  • endpoints[] — FHIR Direct messaging endpoints for interoperable exchange
  • practiceLocations[] — multi-site organizations' physical practice locations

The actor preserves all of this — see the Output Schema below.
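
As an illustration of how such nested objects might be flattened, here is a partial sketch of a normalizer covering status mapping, primary-taxonomy selection, and location-over-mailing address triage. The raw key names `number` and `telephone_number` are assumptions, `normalize` is a hypothetical function, and only a handful of the output fields are shown.

```python
from typing import Any

STATUS_LABELS = {"A": "Active", "D": "Deactivated"}


def pick_address(addresses: list[dict], purpose: str) -> dict:
    # Return the first address with the given purpose, or an empty dict
    return next((a for a in addresses if a.get("address_purpose") == purpose), {})


def normalize(raw: dict[str, Any]) -> dict[str, Any]:
    # Every nested section may be missing, so default each one defensively
    basic = raw.get("basic") or {}
    taxonomies = raw.get("taxonomies") or []
    addresses = raw.get("addresses") or []
    location = pick_address(addresses, "LOCATION")
    mailing = pick_address(addresses, "MAILING")
    primary = next((t for t in taxonomies if t.get("primary")), {})

    def prefer_location(key: str):
        # Practice address first, mailing address as the fallback
        return location.get(key) or mailing.get(key)

    status = basic.get("status")
    return {
        "npi": str(raw.get("number", "")),
        "status": STATUS_LABELS.get(status, status),
        "primaryTaxonomyCode": primary.get("code"),
        "primaryTaxonomyDesc": primary.get("desc"),
        "phone": prefer_location("telephone_number"),
        "city": prefer_location("city"),
        "state": prefer_location("state"),
    }
```

Because every lookup goes through `.get` with a default, a record with no `basic` or no `addresses` still produces a row, just with null fields.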


Input Parameters

{
  "npis": [],
  "firstName": "",
  "lastName": "",
  "organizationName": "",
  "taxonomyDescription": "",
  "cities": [],
  "states": [],
  "postalCode": "",
  "enumerationType": "any",
  "addressPurpose": "any",
  "maxRecords": 500,
  "requestDelay": 800
}

Parameter reference

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| npis | array<string> | [] | 10-digit NPI numbers for direct lookup. Fastest mode: bypasses search and returns the exact provider. Example: ["1982445060", "1932487765"]. Each NPI becomes its own task. |
| firstName | string | "" | Individual provider first name. Partial match, case-insensitive. Example: "Sarah". |
| lastName | string | "" | Individual provider last name. Partial match. Example: "Patel". Use alone for broad search, combine with state for narrower results. |
| organizationName | string | "" | Organization (NPI-2) legal-business-name search. Example: "Mount Sinai", "Kaiser Permanente", "CVS Pharmacy". |
| taxonomyDescription | string | "" | Provider taxonomy text (NUCC specialty descriptors). Examples: "Family Medicine", "Cardiology", "Pharmacy", "Dentist", "Nurse Practitioner", "Pediatrics", "Internal Medicine". Partial match supported. |
| cities | array<string> | [] | City-name filter. Example: ["New York", "Brooklyn", "Queens"]. Each city becomes a separate query when combined with other filters. |
| states | array<string> | [] | 2-letter US state codes. Example: ["NY", "CA", "TX", "FL"]. Each state becomes a separate query when combined with other filters. |
| postalCode | string | "" | ZIP-code filter. Partial match supported: "100" matches all 100xx ZIPs in Manhattan. |
| enumerationType | enum | "any" | "NPI-1" = individual providers (physicians, NPs, RNs, etc.). "NPI-2" = organizations (hospitals, clinics, pharmacies). "any" returns both. |
| addressPurpose | enum | "any" | "LOCATION" = practice address. "MAILING" = correspondence address. "PRIMARY" / "SECONDARY" = practice-location designations. "any" returns providers regardless of address-purpose match. |
| maxRecords | integer | 500 | Hard cap across all queries. 0 = unlimited (pages until every query exhausts). |
| requestDelay | integer | 800 | Delay in milliseconds between page fetches. CMS rate-limiting is soft; 500–2000 ms is courteous. |
| proxyConfiguration | object | none | Optional Apify proxy. Not required: CMS NPPES is a public unrestricted API. |

Tip: Always supply at least one of npis, firstName, lastName, organizationName, taxonomyDescription, cities, states, or postalCode. The actor fails fast (Actor.fail) with no tasks built if none are provided.
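
If you build run inputs programmatically, a client-side pre-flight check can catch these problems before a run is started. The sketch below is a hypothetical helper (`validate_input` is not part of the actor): it enforces the at-least-one-filter rule, checks that NPIs are 10 digits, and upper-cases state codes as a precaution against the case-sensitivity pitfall described earlier.

```python
import re

FILTER_KEYS = ("npis", "firstName", "lastName", "organizationName",
               "taxonomyDescription", "cities", "states", "postalCode")


def validate_input(run_input: dict) -> dict:
    # At least one search filter must be non-empty
    if not any(run_input.get(key) for key in FILTER_KEYS):
        raise ValueError("Supply at least one search filter")
    # NPIs are exactly 10 digits
    bad = [n for n in run_input.get("npis", [])
           if not re.fullmatch(r"[0-9]{10}", str(n))]
    if bad:
        raise ValueError(f"Not 10-digit NPIs: {bad}")
    cleaned = dict(run_input)
    # Normalize state codes so "Ny" is sent as "NY"
    cleaned["states"] = [s.strip().upper() for s in run_input.get("states", [])]
    return cleaned
```

For example, `validate_input({"lastName": "Smith", "states": ["Ny"]})` returns the same input with `states` set to `["NY"]`.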


Output Schema

Every record — individual or organization — uses the same flat JSON shape so downstream consumers ingest the entire dataset without per-type branching.

Identity & status

| Field | Type | Description |
| --- | --- | --- |
| npi | string | 10-digit National Provider Identifier (federal primary key) |
| enumerationType | string | "NPI-1" (individual) or "NPI-2" (organization) |
| firstName | string or null | Individual first name (null for orgs) |
| middleName | string or null | Individual middle name (null for orgs) |
| lastName | string or null | Individual last name (null for orgs) |
| fullName | string | Joined full name (individual) or organization legal name (org) |
| credential | string or null | Professional suffix: MD, DO, NP, PA, RN, DDS, DMD, PharmD, LCSW, PT, etc. |
| organizationName | string or null | Legal business name or parent organization name |
| gender | string or null | M / F (individuals only) |
| status | string | Active (CMS A) or Deactivated (CMS D) |
| soleProprietor | string or null | YES / NO / NA |

Dates

| Field | Type | Description |
| --- | --- | --- |
| enumerationDate | string | Date the NPI was first assigned (YYYY-MM-DD) |
| certificationDate | string or null | Most-recent certification date |
| lastUpdated | string | Most-recent record update (YYYY-MM-DD) |
| scrapedAt | string | ISO-8601 timestamp of this scrape |

Specialty & licensure

| Field | Type | Description |
| --- | --- | --- |
| primaryTaxonomyCode | string | 10-character NUCC taxonomy code (e.g. 207RC0000X = Cardiovascular Disease) |
| primaryTaxonomyDesc | string | Human-readable specialty (e.g. "Cardiovascular Disease") |
| primaryLicense | string or null | State license number for primary specialty |
| primaryLicenseState | string or null | 2-letter state of primary license |
| taxonomies | array | Full list of taxonomies (code, desc, license, state, primary flag, taxonomy group) |

Addresses & contact

| Field | Type | Description |
| --- | --- | --- |
| mailingAddress | object | Full mailing address (address1, address2, city, state, postalCode, country, phone, fax) |
| locationAddress | object | Full practice/location address (same shape) |
| phone | string or null | Location address phone, fallback to mailing |
| mailingPhone | string or null | Mailing-address phone |
| city | string | Top-level city (location → mailing fallback) |
| state | string | Top-level state |
| postalCode | string | Top-level ZIP |
| country | string | Top-level country |
| practiceLocations | array or null | Sub-practice locations for multi-site organizations |

Identifiers & extras

| Field | Type | Description |
| --- | --- | --- |
| identifiers | array or null | State Medicaid IDs, state license IDs, other linked credentials |
| otherNames | array or null | Former names, AKAs, DBAs |
| endpoints | array or null | FHIR Direct messaging endpoints |
| searchQuery | string | The exact query filter combination that returned this row (for provenance) |
| sourceUrl | string | Canonical CMS provider-view URL (https://npiregistry.cms.hhs.gov/provider-view/{NPI}) |

Example: Individual provider (NPI-1)

{
  "npi": "1982445060",
  "enumerationType": "NPI-1",
  "firstName": "Rachel",
  "middleName": "L",
  "lastName": "Brown",
  "fullName": "Rachel L Brown",
  "credential": "LMSW",
  "organizationName": null,
  "gender": "F",
  "status": "Active",
  "soleProprietor": "NO",
  "enumerationDate": "2014-03-21",
  "certificationDate": "2024-08-12",
  "lastUpdated": "2025-11-04",
  "primaryTaxonomyCode": "104100000X",
  "primaryTaxonomyDesc": "Social Worker",
  "primaryLicense": "117691",
  "primaryLicenseState": "NY",
  "taxonomies": [
    { "code": "104100000X", "desc": "Social Worker", "license": "117691", "state": "NY", "primary": true, "taxonomyGroup": null }
  ],
  "mailingAddress": {
    "address1": "123 W 42ND ST",
    "address2": "SUITE 600",
    "city": "NEW YORK",
    "state": "NY",
    "postalCode": "100360000",
    "country": "US",
    "phone": "212-555-0144",
    "fax": null,
    "purpose": "MAILING"
  },
  "locationAddress": {
    "address1": "456 BROADWAY",
    "address2": null,
    "city": "NEW YORK",
    "state": "NY",
    "postalCode": "100130000",
    "country": "US",
    "phone": "212-555-0188",
    "fax": null,
    "purpose": "LOCATION"
  },
  "phone": "212-555-0188",
  "mailingPhone": "212-555-0144",
  "city": "NEW YORK",
  "state": "NY",
  "postalCode": "100130000",
  "country": "US",
  "practiceLocations": null,
  "identifiers": [
    { "code": "05", "desc": "MEDICAID", "identifier": "99999123", "issuer": null, "state": "NY" }
  ],
  "otherNames": null,
  "endpoints": null,
  "searchQuery": "last_name=Brown&state=NY",
  "sourceUrl": "https://npiregistry.cms.hhs.gov/provider-view/1982445060",
  "scrapedAt": "2026-05-18T09:14:22.401Z"
}

Example: Organization (NPI-2)

{
  "npi": "1689675476",
  "enumerationType": "NPI-2",
  "firstName": null,
  "middleName": null,
  "lastName": null,
  "fullName": "MOUNT SINAI WEST PHARMACY",
  "credential": null,
  "organizationName": "MOUNT SINAI WEST PHARMACY",
  "gender": null,
  "status": "Active",
  "soleProprietor": null,
  "enumerationDate": "2009-07-15",
  "certificationDate": "2023-06-30",
  "lastUpdated": "2025-09-18",
  "primaryTaxonomyCode": "3336C0003X",
  "primaryTaxonomyDesc": "Community/Retail Pharmacy",
  "primaryLicense": "030099",
  "primaryLicenseState": "NY",
  "taxonomies": [
    { "code": "3336C0003X", "desc": "Community/Retail Pharmacy", "license": "030099", "state": "NY", "primary": true, "taxonomyGroup": null },
    { "code": "3336I0012X", "desc": "Institutional Pharmacy", "license": "030099", "state": "NY", "primary": false, "taxonomyGroup": null }
  ],
  "mailingAddress": {
    "address1": "1000 10TH AVE",
    "address2": null,
    "city": "NEW YORK",
    "state": "NY",
    "postalCode": "100193303",
    "country": "US",
    "phone": "212-555-0220",
    "fax": "212-555-0221",
    "purpose": "MAILING"
  },
  "locationAddress": {
    "address1": "1000 10TH AVE",
    "address2": "GROUND FLOOR",
    "city": "NEW YORK",
    "state": "NY",
    "postalCode": "100193303",
    "country": "US",
    "phone": "212-555-0220",
    "fax": "212-555-0221",
    "purpose": "LOCATION"
  },
  "phone": "212-555-0220",
  "mailingPhone": "212-555-0220",
  "city": "NEW YORK",
  "state": "NY",
  "postalCode": "100193303",
  "country": "US",
  "practiceLocations": null,
  "identifiers": [
    { "code": "05", "desc": "MEDICAID", "identifier": "01234567", "issuer": null, "state": "NY" }
  ],
  "otherNames": [
    { "code": "5", "credential": null, "type": "Former Name", "organization_name": "ROOSEVELT HOSPITAL PHARMACY" }
  ],
  "endpoints": null,
  "searchQuery": "organization_name=Mount Sinai&state=NY",
  "sourceUrl": "https://npiregistry.cms.hhs.gov/provider-view/1689675476",
  "scrapedAt": "2026-05-18T09:14:22.401Z"
}

Reference Tables

Provider Type (enumerationType)

| Code | Meaning |
| --- | --- |
| NPI-1 | Individual provider: physician, NP, PA, RN, pharmacist, dentist, therapist, etc. |
| NPI-2 | Organization: hospital, clinic, group practice, pharmacy, ASC, FQHC, SNF, DME supplier, lab, ACO |

Provider Status

| Status | Meaning |
| --- | --- |
| Active | NPI is current and may be used on claims (CMS code A) |
| Deactivated | NPI is deactivated through death, dissolution, or voluntary deactivation (CMS code D) |

Common Taxonomy / Specialty Examples

NPPES uses the NUCC Healthcare Provider Taxonomy (230+ codes). A sample relevant to most users:

| Taxonomy Code | Specialty Description |
| --- | --- |
| 207R00000X | Internal Medicine |
| 207RC0000X | Cardiovascular Disease |
| 207Q00000X | Family Medicine |
| 207V00000X | Obstetrics & Gynecology |
| 2084N0400X | Neurology |
| 208000000X | Pediatrics |
| 2085R0202X | Diagnostic Radiology |
| 2086S0102X | Surgical Critical Care |
| 367500000X | Certified Registered Nurse Anesthetist (CRNA) |
| 363L00000X | Nurse Practitioner |
| 363A00000X | Physician Assistant |
| 225100000X | Physical Therapist |
| 122300000X | Dentist |
| 1223G0001X | General Practice Dentistry |
| 183500000X | Pharmacist |
| 3336C0003X | Community/Retail Pharmacy |
| 3336M0002X | Mail Order Pharmacy |
| 3336S0011X | Specialty Pharmacy |
| 282N00000X | General Acute Care Hospital |
| 261QF0400X | Federally Qualified Health Center (FQHC) |
| 314000000X | Skilled Nursing Facility |
| 251E00000X | Home Health Agency |
| 103T00000X | Psychologist |
| 1041C0700X | Clinical Social Worker (LCSW) |

Tip: You don't need taxonomy codes — searching by taxonomyDescription: "Cardiology" or taxonomyDescription: "Family Medicine" works on the human-readable text.

Address Purpose (addressPurpose)

| Purpose | Meaning |
| --- | --- |
| LOCATION | The provider's actual practice address; best for phone outreach |
| MAILING | Correspondence address (PO Box common); best for postal mail |
| PRIMARY | NPI-2 organizations' primary practice location |
| SECONDARY | Additional practice locations for multi-site orgs |

Use Cases

The NPPES registry is the single richest source of US healthcare-provider information in existence — and because it's a federal public registry, it powers commercial workflows across the entire healthcare ecosystem.

Pharmaceutical & Medical-Device Sales Targeting

Pharma and med-device reps use NPI data as the foundation of every territory and account plan:

  • Build prescriber target lists by specialty and geography — every cardiologist, oncologist, endocrinologist, or rheumatologist in a 50-mile radius of a launch hub
  • Identify KOLs and high-volume prescribers when joined with the CMS Open Payments and Medicare Part D Prescriber datasets
  • Map sales territories by physician density, specialty mix, and group-practice affiliation
  • Find newly-enumerated specialists by filtering on enumerationDate — early outreach to new-to-practice physicians beats every competitor by 6–12 months
  • Power CME and speaker-bureau outreach with up-to-date credentials, practice addresses, and phone numbers
  • Cross-reference state license numbers to verify a provider is actively licensed in your launch states

Healthcare Staffing, Locum Tenens & Travel Nursing

Locum, travel-nursing, and per-diem platforms use NPPES as their cross-state credentialing source-of-truth:

  • Verify candidate NPIs before placement to confirm identity and active status
  • Find multi-state licensed clinicians — pull all NPI-1 records whose taxonomies[] array contains licenses in multiple states, the gold signal for telehealth-ready or travel-ready providers
  • Source physicians, NPs, PAs, RNs, CRNAs, PTs, OTs, SLPs by specialty + city + state + license status
  • Detect expired or deactivated NPIs in the candidate pipeline before submitting them to a client facility
  • Build credentialing packets faster — many credentialing-verification orgs (CVOs) start their workflow from NPPES
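
The multi-state licensure signal mentioned above can be computed directly from the actor's output records. This is a minimal sketch assuming each `taxonomies[]` entry carries `license` and `state` as in the Output Schema; `licensed_states` and `multi_state_clinicians` are hypothetical helper names.

```python
def licensed_states(record: dict) -> set[str]:
    # Collect the states where a taxonomy entry has a license number attached
    return {
        t["state"]
        for t in record.get("taxonomies") or []
        if t.get("license") and t.get("state")
    }


def multi_state_clinicians(records: list[dict], minimum: int = 2) -> list[dict]:
    # Keep individual providers licensed in at least `minimum` distinct states
    return [
        r for r in records
        if r.get("enumerationType") == "NPI-1"
        and len(licensed_states(r)) >= minimum
    ]
```

A license listed in NPPES is self-reported, so treat the result as a lead list to verify rather than proof of active licensure.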

Insurance & Payer Networks

Health insurers, PBMs, ACOs, and provider-directory aggregators rely on NPPES to maintain accurate networks:

  • Audit network adequacy under federal Network Adequacy / No Surprises Act requirements — count active providers per specialty per ZIP
  • Refresh provider directories continuously — federal "Consumer-Friendly Provider Directory" rules require monthly verification
  • Identify out-of-network providers for steerage messaging
  • Reconcile claims when the billing NPI doesn't match the rendering NPI
  • Detect orphaned or duplicate provider records in payer master data using NPI as the federal primary key
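
For the network-adequacy count (active providers per specialty per ZIP), a short aggregation over the output dataset is enough. The sketch below assumes the `status`, `postalCode`, and `primaryTaxonomyDesc` fields from the Output Schema and truncates ZIP+4 values to five digits; `active_counts_by_zip_and_specialty` is a hypothetical name.

```python
from collections import Counter


def active_counts_by_zip_and_specialty(records: list[dict]) -> Counter:
    counts: Counter = Counter()
    for record in records:
        if record.get("status") != "Active":
            continue  # deactivated NPIs do not count toward adequacy
        zip5 = (record.get("postalCode") or "")[:5]  # ZIP+4 -> 5-digit ZIP
        counts[(zip5, record.get("primaryTaxonomyDesc"))] += 1
    return counts
```

The resulting counter maps `(zip5, specialty)` pairs to provider counts and can be compared against whatever minimums your adequacy standard defines.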

Compliance, Credentialing & Claims Verification

Revenue-cycle, compliance, and credentialing teams use NPPES for automated provider verification:

  • Verify provider NPIs at claim submission to prevent denied claims
  • Cross-check against OIG LEIE and SAM.gov exclusion lists starting from a verified NPI roster
  • Monitor employed-physician status changes (status: Deactivated flags death, license loss, or voluntary withdrawal)
  • Confirm taxonomy alignment — billing taxonomy on the claim must match the provider's enumerated taxonomy
  • Maintain audit-ready logs with timestamped scrapedAt and provenance via sourceUrl for every verification event
  • Replace expensive third-party credentialing APIs that charge per-lookup fees

Healthcare M&A and Investor Due Diligence

PE firms, strategic acquirers, and healthcare investment banks use NPPES to underwrite practice acquisitions:

  • Inventory a target practice's provider roster — every NPI-1 affiliated with the NPI-2 organization
  • Quantify multi-state license footprint of physician partners — critical for telehealth-rollup theses
  • Assess specialty mix of a group practice or ASC at acquisition
  • Detect "ghost providers" — billed-from NPIs not actually practicing at the location
  • Benchmark provider density when comparing acquisition targets in adjacent geographies
  • Track post-close provider attrition by re-scraping monthly

Public Health, Policy & Workforce Research

Academic researchers, state health departments, RWJF / Kaiser Family Foundation analysts, and policy think tanks use NPPES as a workforce dataset:

  • Map provider supply per capita per specialty across rural vs. urban regions
  • Identify pharmacy deserts, primary-care deserts, mental-health-provider deserts by ZIP code or census tract
  • Track workforce trends — psychiatric NP growth, primary-care MD decline, retail-pharmacy consolidation
  • Power Certificate of Need (CON) applications with current empirical provider distribution
  • Inform graduate medical education (GME) planning by tracking residency-completing physician retention
  • Quantify the multi-state telehealth-licensure landscape post-pandemic

Telehealth & Digital-Health Startup Launch

Telehealth platforms and digital-health startups use NPPES to scout licensed providers state-by-state:

  • Find physicians licensed in every state your platform launches in (taxonomies[] contains a license per state)
  • Recruit clinicians by specialty + geography for asynchronous-care, virtual-first, or chronic-care platforms
  • Audit network breadth when applying for HHS waivers, state Medicaid contracts, or commercial-payer onboarding
  • Verify provider identity at clinician onboarding (KYC for clinical credentials)
  • Maintain a credentialing data lake updated daily so onboarding never blocks on stale verifications

B2B Healthcare SaaS Marketing & Demand Generation

EHR vendors, RCM platforms, telehealth tooling, AI medical scribes, and clinical-decision-support startups use NPPES to drive demand:

  • Segment outreach by practice type — independent group, hospital-employed, FQHC, ASC, urgent care
  • Personalize messaging by specialty: a dental-practice SaaS messages 1223G0001X general-practice dentists differently than providers listed under the broader 122300000X Dentist code
  • Build account-based-marketing (ABM) lists of high-priority organizations (NPI-2) with their affiliated providers (NPI-1)
  • Enrich inbound leads by NPI lookup at form submission — auto-populate practice address, specialty, and credentials
  • Power conference & event lists by city, specialty, and active-status filter

Government, Medicare/Medicaid Operations & Policy

Federal/state agency contractors and Medicare/Medicaid intermediaries use NPPES daily:

  • Track Medicare/Medicaid enrolled provider populations when joined with PECOS data
  • Monitor workforce-shortage area redesignation with current empirical provider counts
  • Build equity dashboards — provider density by race-of-population, income tier, urbanicity
  • Audit Medicaid managed-care organization provider networks for state compliance reviews
  • Power state telehealth-licensure registries and interstate-compact tracking

Journalism & Investigative Reporting

Newsrooms (ProPublica, Kaiser Health News, STAT, local investigative units) use NPPES as a starting graph for healthcare investigations:

  • Detect sham clinics and pill-mill patterns — high-volume NPI-2 organizations with thin provider rosters
  • Map opioid-prescribing networks when joined with Medicare Part D data
  • Track physician sanctioning patterns by cross-referencing state medical boards
  • Investigate provider directory accuracy — federal "ghost network" reporting requires NPPES as the truth source
  • Identify cross-state provider movement post-license revocation in one state

Real-Estate, Site-Selection & Retail Healthcare Strategy

Commercial-real-estate firms, retail-clinic operators (CVS MinuteClinic, Walgreens, One Medical), and urgent-care chains use NPPES for site selection:

  • Identify primary-care shortage ZIPs as urgent-care site candidates
  • Benchmark competition density before signing a lease
  • Inform DTC retail-pharmacy expansion with current pharmacy distribution
  • Support medical-office-building (MOB) investment theses with on-site provider density

Sample Queries & Recipes

Recipe 1 — Every cardiologist in the NY tri-state area

{
  "taxonomyDescription": "Cardiovascular Disease",
  "states": ["NY", "NJ", "CT"],
  "enumerationType": "NPI-1",
  "maxRecords": 0
}

Pulls every active and historical cardiology specialist enumerated in NY, NJ, or CT — perfect for a pharma cardiac-drug launch list or a private-equity cardiology rollup target list.

Recipe 2 — All retail pharmacies in California

{
  "taxonomyDescription": "Community/Retail Pharmacy",
  "states": ["CA"],
  "enumerationType": "NPI-2",
  "maxRecords": 0
}

Returns every NPI-2 community/retail pharmacy in California — independents plus chain locations — with phone numbers and mailing addresses.

Recipe 3 — Direct lookup batch for verification

{
  "npis": [
    "1982445060",
    "1932487765",
    "1689675476",
    "1801880335",
    "1023045678"
  ]
}

Fastest mode: one request per NPI, returns the full record for each. Ideal for nightly compliance verification of an employed-provider roster.

Recipe 4 — All NPs and PAs in greater Houston

{
  "taxonomyDescription": "Nurse Practitioner",
  "cities": ["Houston", "Katy", "The Woodlands", "Sugar Land"],
  "states": ["TX"],
  "enumerationType": "NPI-1",
  "maxRecords": 5000
}

Run this once for NPs, then again with taxonomyDescription: "Physician Assistant" to assemble a full mid-level provider list.

Recipe 5 — Multi-specialty pull for telehealth launch in 5 states

{
  "taxonomyDescription": "Family Medicine",
  "states": ["FL", "TX", "GA", "NC", "VA"],
  "enumerationType": "NPI-1",
  "addressPurpose": "LOCATION",
  "maxRecords": 0
}

Use this as the cornerstone of a multi-state telehealth provider acquisition pipeline, re-running it with a different taxonomyDescription for each additional specialty.

Recipe 6 — Single ZIP audit (network adequacy)

{
  "postalCode": "10001",
  "enumerationType": "any",
  "maxRecords": 0
}

Returns every individual and organization NPI registered to ZIP 10001 (Chelsea, Manhattan) for a payer's network-adequacy audit.

Recipe 7 — All Mount Sinai-affiliated organizations

{
  "organizationName": "Mount Sinai",
  "states": ["NY"],
  "enumerationType": "NPI-2",
  "maxRecords": 0
}

Use the practiceLocations[] and identifiers[] fields downstream to roll up affiliated entities.

Recipe 8 — Sample 5 records before committing

{
"lastName": "Smith",
"states": ["NY"],
"maxRecords": 5
}

Test mode — confirm the shape of returned records (the smoke-test query used during development) before launching a multi-thousand-record sweep.


Integration Examples

Google Sheets (via Apify integration)

  1. Schedule the actor daily at 06:00 with your search filters
  2. Add the Export to Google Sheets integration on the schedule
  3. A fresh NPI dataset arrives in your sheet every morning — ready for CRM sync, vlookups, and pivot tables

Make.com / Zapier / n8n

Use the Apify connector on any major automation platform. Trigger downstream workflows on:

  • New NPIs (today's run minus yesterday's) — feed into Salesforce as Lead records
  • Status changes (Active → Deactivated) — open a compliance review case
  • Address changes (provider relocations) — sync to HubSpot Account fields
  • New endpoints[] entries — push to your FHIR interoperability dashboard

Power BI / Tableau / Looker / Hex

Connect Apify's REST dataset endpoint as a data source. Refresh on the Apify run schedule. Visualizations to start with:

  • Active providers by specialty by state
  • New NPI enumerations per month
  • Provider density per 100K population by county
  • Network-adequacy heat maps for payer regions
  • Multi-state licensure overlap analysis

Postgres / Snowflake / BigQuery / Redshift

Use the Apify webhook integration to POST the run results to your warehouse ingestion endpoint after each scheduled run. Drop into a providers_raw staging table, then dbt-model the nested arrays (taxonomies[], identifiers[], endpoints[]) into proper relational tables keyed on npi.
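If you prefer to split records before they reach the warehouse rather than in dbt, the same normalization can be sketched in a few lines. This is an illustration under assumptions: the child-table row shape and the added position column are choices made here, not something the actor emits.

```python
NESTED_ARRAYS = ("taxonomies", "identifiers", "endpoints")


def split_provider(record: dict) -> dict[str, list[dict]]:
    """Split one actor record into a parent row plus child rows keyed on npi."""
    npi = record["npi"]
    parent = {k: v for k, v in record.items() if k not in NESTED_ARRAYS}
    tables: dict[str, list[dict]] = {"providers": [parent]}
    for name in NESTED_ARRAYS:
        children = record.get(name) or []
        # position keeps the original array order available in SQL.
        tables[name] = [
            {**child, "npi": npi, "position": index}
            for index, child in enumerate(children)
        ]
    return tables
```

Each key of the returned dict maps to one staging table, and every child row carries the npi foreign key.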

Salesforce / HubSpot / Pipedrive CRM Enrichment

Trigger an Apify run nightly, then upsert against Account records keyed on npi. Use the dataset's lastUpdated field as the change-detection signal. Specialty changes can trigger automatic re-assignment to the correct sales rep.

NPI Enrichment Microservice

Wrap the actor in a Make.com / n8n flow with npis as a webhook-triggered input. Any inbound lead with an NPI gets enriched in <5 seconds with full credentials, specialty, license, and address.

Provider Directory Refresh Pipeline

Schedule the actor weekly with states filtered to your payer footprint. Diff this week's pull against last week's on the lastUpdated and status fields to detect:

  • New providers joining the market
  • Providers who moved or changed phone numbers
  • Providers whose NPIs were deactivated (death, dissolution, or voluntary deactivation)
  • New taxonomies added (provider specializing further)
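The four checks above can be sketched as a single pass over two snapshots. The field names (npi, status, lastUpdated, phone, locationAddress, taxonomies) are the ones this README mentions; treating a longer taxonomies array as "new taxonomy added" is a deliberately crude assumption, and the function name is hypothetical.

```python
def diff_snapshots(previous: list[dict], current: list[dict]) -> dict[str, list[str]]:
    """Classify NPIs by how they changed between two weekly pulls."""
    old = {record["npi"]: record for record in previous}
    changes: dict[str, list[str]] = {
        "new": [], "moved_or_phone": [], "deactivated": [], "new_taxonomy": [],
    }
    for record in current:
        npi = record["npi"]
        before = old.get(npi)
        if before is None:
            changes["new"].append(npi)
            continue
        unchanged = (
            record.get("lastUpdated") == before.get("lastUpdated")
            and record.get("status") == before.get("status")
        )
        if unchanged:
            continue  # cheap skip: neither change signal moved
        if before.get("status") == "Active" and record.get("status") == "Deactivated":
            changes["deactivated"].append(npi)
        if (record.get("locationAddress"), record.get("phone")) != (
            before.get("locationAddress"), before.get("phone")
        ):
            changes["moved_or_phone"].append(npi)
        if len(record.get("taxonomies") or []) > len(before.get("taxonomies") or []):
            changes["new_taxonomy"].append(npi)
    return changes
```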

Major US Healthcare Markets at a Glance

NPPES covers every US state, territory, and outlying area. Below are the highest-volume metro areas a typical user filters by.

| Metro Area | State | Healthcare Significance |
| --- | --- | --- |
| New York | NY | NYC academic medical centers (Mount Sinai, NYU Langone, NewYork-Presbyterian, Northwell, Memorial Sloan Kettering) |
| Los Angeles | CA | Cedars-Sinai, UCLA, USC Keck, Kaiser Permanente regional headquarters |
| Chicago | IL | Northwestern, Rush, University of Chicago, Advocate Health |
| Houston | TX | Texas Medical Center, the world's largest medical complex (54 institutions, MD Anderson, Houston Methodist) |
| Dallas–Fort Worth | TX | Baylor Scott & White, UT Southwestern, HCA networks |
| Philadelphia | PA | Penn Medicine, Jefferson, CHOP, Independence Blue Cross |
| Boston | MA | Mass General Brigham, Beth Israel Lahey, Tufts, Boston Children's |
| Atlanta | GA | Emory, Piedmont, CDC headquarters, telehealth & digital-health hub |
| Washington DC / Baltimore | DC / MD | Johns Hopkins, Medicare/CMS headquarters in Woodlawn MD |
| Miami | FL | UMiami, Jackson Health, Baptist Health South Florida |
| Phoenix | AZ | Mayo Clinic Arizona, Banner Health, HonorHealth |
| Seattle | WA | UW Medicine, Providence, Virginia Mason, Fred Hutch |
| San Francisco Bay Area | CA | UCSF, Stanford, Kaiser Permanente, telehealth-startup density |
| Minneapolis | MN | Mayo Clinic Rochester (1-hr south), Allina, Fairview, HealthPartners |
| Detroit | MI | Henry Ford, DMC, Beaumont, Michigan Medicine (Ann Arbor) |
| Cleveland | OH | Cleveland Clinic, UH, MetroHealth |
| St. Louis | MO | BJC, SSM, Mercy, Washington University |
| Pittsburgh | PA | UPMC, Allegheny Health Network |
| Tampa | FL | Moffitt, AdventHealth, BayCare |
| Charlotte | NC | Atrium Health, Novant Health |
| San Diego | CA | Scripps, Sharp, UCSD Health |
| Denver | CO | UCHealth, HealthONE, Centura, SCL |
| Portland | OR | OHSU, Providence, Legacy, Kaiser Permanente |
| Nashville | TN | HCA headquarters, Vanderbilt, Ascension Saint Thomas |
| Indianapolis | IN | IU Health, Community Health Network, Ascension St. Vincent |
| San Antonio | TX | Methodist, Baptist, University Health, BAMC (military) |

NPPES also covers Puerto Rico (PR), US Virgin Islands (VI), Guam (GU), American Samoa (AS), and Northern Mariana Islands (MP).


Cost & Performance

| Metric | Value |
| --- | --- |
| Engine | HTTP-only (got-scraping), no browser |
| Data source | Official CMS NPPES JSON API (npiregistry.cms.hhs.gov/api/?version=2.1) |
| Page size | 200 records per HTTP call |
| Pagination depth | Verified past skip=10000 offset; no hard ceiling encountered |
| Runtime (single filter, 500 records) | 5–20 seconds |
| Runtime (statewide specialty sweep, 5,000 records) | 1–3 minutes |
| Runtime (multi-state sweep, 50,000+ records) | 15–45 minutes |
| Cost per run | Fractions of a Compute Unit; pay-per-event scales with record count |
| Pricing model | Pay-per-event (transparent per-record pricing) |
| Data freshness | Live at run time; every request hits the CMS API directly |
| Auth required | None |
| Proxy required | None; the CMS API is public and unrestricted |
| Concurrency | Safe to run multiple parallel filtered configurations |
| Memory footprint | 256 MB sufficient for most runs; 1024 MB max recommended for >100K records |
| Retry policy | 3 attempts with exponential backoff + jitter |
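For readers building their own client, the retry policy in the last row can be approximated as follows. Only "3 attempts, exponential backoff, jitter" comes from the table; the 1-second base, the 30-second cap, and the full-jitter variant are assumptions made for this sketch, which is written in Python rather than the actor's own runtime.

```python
import random
import time


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 30.0) -> float:
    """Full-jitter delay: uniform between 0 and min(cap, base * 2**attempt)."""
    return random.uniform(0, min(cap, base * 2 ** attempt))


def call_with_retry(fn, attempts: int = 3, sleep=time.sleep):
    """Call fn() up to `attempts` times, sleeping with backoff between failures.

    The last failure is re-raised so the caller sees the real error.
    """
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise
            sleep(backoff_delay(attempt))
```

Passing sleep as a parameter keeps the helper testable without real waiting.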

This actor calls the official US federal CMS NPPES API. There is no scraping of any private website, no bypassing of any access control, and no violation of any Terms of Service. NPPES is a federally-mandated public registry created under HIPAA Administrative Simplification (45 CFR Part 162) specifically to be freely queryable by the public.

  • 100% public data — every field returned is published by CMS at npiregistry.cms.hhs.gov under federal regulation
  • No PHI (Protected Health Information) — NPPES contains zero patient data. It is a provider-identification registry, not a clinical database
  • No SSNs, no DOBs, no salary data, no patient names — only provider-identification information
  • Addresses are business/practice addresses as reported by the provider to CMS — not personal residence (in most cases)
  • Phone numbers are practice phone numbers as reported by the provider to CMS
  • HIPAA does not apply — provider-identification data is explicitly carved out of HIPAA protected-health-information definitions
  • No emails — NPPES does not publish provider email addresses (only FHIR Direct endpoints, which are designed for secure interoperable exchange)
  • No login required, no API key, no rate-limit waiver — the CMS API is intentionally open
  • GDPR/CCPA/CAN-SPAM/TCPA compliance is the responsibility of the data consumer based on their downstream use case
  • No robots.txt violation — the API has no robots.txt to respect; it is an API endpoint, not a website

Important: NPPES data may not be used for unlawful purposes including identity fraud, harassment, or stalking. Commercial use (sales, marketing, recruiting, research) is explicitly permitted by federal regulation. Patient-facing outreach should always respect HIPAA Marketing Rule restrictions, TCPA call/text consent requirements, and CAN-SPAM email opt-out rules independently of NPPES sourcing.


Frequently Asked Questions

How fresh is the data?

Live at run time. Every page is fetched from the CMS NPPES API at the moment of the run. NPPES itself is updated continuously by CMS as providers enumerate new NPIs, update their addresses, add taxonomies, change license state, or deactivate.

How many records are in NPPES?

Over 7 million. As of mid-2026 the registry contains more than 7 million active and historical NPIs across both NPI-1 (individual) and NPI-2 (organization) types. The exact total is a moving target because new NPIs are enumerated every day.

Does this scraper require login or API keys?

No. NPPES is a federally-mandated public API. No authentication, no key registration, no IP allowlisting. The only credentials you need are your Apify token to run the actor itself.

Is there a hard limit on how many records I can pull?

The CMS API caps each response at 200 records but does not cap pagination depth. We have verified pagination past skip=10000 on broad queries. For very broad queries (e.g. "every NPI in Texas"), split into multiple narrower queries (e.g. by city or ZIP) to keep individual task pagination manageable.
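To make the page arithmetic concrete, here is a minimal sketch of how a bounded pull breaks into limit/skip pairs at 200 records per page. The helper name and the record_target parameter are hypothetical; an unbounded sweep would instead keep paging until a short or empty page comes back.

```python
from typing import Iterator

PAGE_SIZE = 200  # per-response cap described above


def page_params(record_target: int) -> Iterator[dict]:
    """Yield the limit/skip pairs needed to fetch record_target records."""
    skip = 0
    while skip < record_target:
        limit = min(PAGE_SIZE, record_target - skip)
        yield {"limit": limit, "skip": skip}
        skip += limit


if __name__ == "__main__":
    for params in page_params(450):
        print(params)
```

A 5,000-record pull therefore needs 25 requests, which is why narrowing by city or ZIP keeps each task short.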

Do I get phone numbers?

Yes. NPPES providers report a telephone_number per address. The actor exposes both phone (location/practice address phone) and mailingPhone (mailing-address phone). Coverage is very high for NPI-2 organizations and high but not universal for NPI-1 individuals.

Do I get email addresses?

No. CMS does not collect or publish provider email addresses in NPPES. The actor exposes FHIR Direct messaging endpoints (when present) via the endpoints[] field — these are secure-interoperability endpoints, not general-purpose email.

Can I search by specialty?

Yes. Use the taxonomyDescription field with the human-readable specialty text ("Cardiology", "Family Medicine", "Pharmacy", "Dentist", "Nurse Practitioner", etc.). The actor passes it directly to the NPPES taxonomy_description query parameter, which performs partial-match search.
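As an illustration of that mapping, the request for one page might be assembled like this. The taxonomy_description, version, and skip parameters appear in this README; state and limit are parameter names taken from the CMS API help page and should be treated as assumptions here, as should the helper name.

```python
from urllib.parse import urlencode

API_BASE = "https://npiregistry.cms.hhs.gov/api/"


def build_query_url(taxonomy_description: str, state: str,
                    limit: int = 200, skip: int = 0) -> str:
    """Build one NPPES request URL for a specialty + state page."""
    params = {
        "version": "2.1",
        "taxonomy_description": taxonomy_description,
        "state": state,
        "limit": limit,
        "skip": skip,
    }
    # urlencode escapes spaces and other special characters in the specialty text.
    return f"{API_BASE}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_query_url("Family Medicine", "TX"))
```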

Can I search by state license number?

The CMS API does not expose license-number search directly. Pull a broader set (e.g. all dentists in a state) and filter downstream on the taxonomies[].license field.
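A downstream filter for that workflow can be quite small. The sketch below assumes each taxonomies[] entry carries license and state keys, as described in this README; the case and whitespace normalization is an added assumption, since boards format license numbers inconsistently.

```python
def find_by_license(records: list[dict], license_number: str, state: str) -> list[dict]:
    """Return records with a taxonomy entry matching the license number and state."""
    wanted = license_number.strip().upper()
    matches = []
    for record in records:
        for taxonomy in record.get("taxonomies") or []:
            candidate = (taxonomy.get("license") or "").strip().upper()
            if candidate == wanted and taxonomy.get("state") == state:
                matches.append(record)
                break  # one matching taxonomy is enough for this record
    return matches
```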

Can I look up multiple NPIs in one run?

Yes. Put them in the npis array. Each NPI runs as its own task — direct lookup mode is the fastest path and works for any quantity (verified to thousands per run).

Does it work for all 50 states + DC?

Yes — plus US territories (Puerto Rico PR, US Virgin Islands VI, Guam GU, American Samoa AS, Northern Mariana Islands MP).

Does it work for non-US providers?

No. NPPES enumerates only US providers (plus territories) under HIPAA. For non-US healthcare directories see the related actors below (WhatClinic, Bookimed for global clinic directories).

Are deactivated NPIs included?

Yes — the status field distinguishes Active from Deactivated. Filter downstream as needed. Deactivated NPIs are useful for historical research, M&A due diligence, and detecting fraud signals.

Can I get sub-organization or practice-location data?

Yes. The practiceLocations[] field returns multi-site organizations' physical practice locations. Many large health systems enumerate one NPI-2 per location separately — searching by organization name returns all of them.

What's the difference between LOCATION and MAILING address?

  • LOCATION is where the provider physically practices — best for phone outreach and proximity analysis.
  • MAILING is where correspondence is sent — often a PO Box, HQ, or billing office. Best for postal-mail outreach.

The actor exposes both as nested objects (locationAddress, mailingAddress) and surfaces the LOCATION values at the top level (with MAILING fallback).
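If you need the same fallback in your own pipeline, one possible reading is a per-field fallback, sketched below. Both the per-field behaviour and the sub-field names (address1, city, state, postalCode) are assumptions for illustration; check a sample record for the actual keys.

```python
ADDRESS_FIELDS = ("address1", "city", "state", "postalCode")  # hypothetical keys


def top_level_address(record: dict) -> dict:
    """Prefer LOCATION values; fall back to MAILING when a field is missing or empty."""
    location = record.get("locationAddress") or {}
    mailing = record.get("mailingAddress") or {}
    return {
        field: location.get(field) or mailing.get(field)
        for field in ADDRESS_FIELDS
    }
```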

Can I run this on the Apify Free Plan?

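Yes, with full functionality on the free tier. At the listed starting rate of $1.00 per 1,000 results, a test run of 100–500 records costs roughly $0.10 to $0.50. Charges scale linearly with record count (a 50,000-record statewide sweep is about $50 at that rate), so check larger sweeps against your remaining platform credit before launching them.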

Can I schedule the actor?

Yes. Apify's built-in Scheduler lets you trigger the actor on any cron expression. Combined with the Google Sheets / webhook / warehouse-load integrations, you can stand up a fully-automated NPI refresh pipeline in under 30 minutes.

Can I export to CSV, Excel, or JSON Lines?

Yes. The Apify dataset view exports to JSON, CSV, Excel (XLSX), HTML, XML, RSS, and JSON Lines — directly from the UI or via the dataset API.
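For the API route, the export URL follows Apify's dataset items endpoint with a format query parameter. A small sketch that builds it; the dataset ID below is a placeholder, and the token is best sent as an Authorization: Bearer header rather than placed in the URL.

```python
from urllib.parse import urlencode

# Format identifiers accepted by the dataset items endpoint.
EXPORT_FORMATS = {"json", "jsonl", "csv", "xlsx", "html", "xml", "rss"}


def dataset_items_url(dataset_id: str, fmt: str = "csv") -> str:
    """Return the Apify API URL that exports a dataset in the given format."""
    if fmt not in EXPORT_FORMATS:
        raise ValueError(f"unsupported export format: {fmt!r}")
    query = urlencode({"format": fmt})
    return f"https://api.apify.com/v2/datasets/{dataset_id}/items?{query}"


if __name__ == "__main__":
    print(dataset_items_url("YOUR_DATASET_ID", "jsonl"))
```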

What happens if my query returns zero results?

The actor calls Actor.fail() with a descriptive message. Common causes: misspelled state code (use 2-letter uppercase NY not Ny), misspelled specialty ("Family Practice" is deprecated — use "Family Medicine"), or impossible filter combinations (e.g. enumerationType: "NPI-1" + organizationName).
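These causes are cheap to catch before a run starts. The following pre-flight check is a sketch covering only the three cases named above; it is not the actor's own validation, and the function name is hypothetical.

```python
import re


def input_problems(run_input: dict) -> list[str]:
    """Return human-readable problems likely to produce a zero-result run."""
    problems = []
    for state in run_input.get("states", []):
        if not re.fullmatch(r"[A-Z]{2}", state):
            problems.append(f"state {state!r} is not a 2-letter uppercase code")
    if run_input.get("taxonomyDescription") == "Family Practice":
        problems.append('"Family Practice" is deprecated; use "Family Medicine"')
    if run_input.get("enumerationType") == "NPI-1" and run_input.get("organizationName"):
        problems.append("organizationName cannot match NPI-1 (individual) records")
    return problems
```

An empty list means none of these three problems was found; it does not guarantee that the query returns results.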

How do I cross-reference NPI to state-board license data?

Each NPI's taxonomies[] includes license and state per taxonomy. Join against state-board license actors (Texas TSBP, California DCA, Ohio eLicense, Illinois IDFPR, Virginia DPOR, Colorado DORA, Minnesota DLI — see Related Actors) on the license number + state pair.

Does the actor deduplicate?

Yes. A Set of seen NPIs prevents the same provider from being saved twice across overlapping query combinations.
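The same idea is useful when you merge several runs yourself. A minimal Python equivalent of the seen-set approach, assuming each record has an npi key; the first occurrence of a provider wins.

```python
from typing import Iterable, Iterator


def dedupe_by_npi(batches: Iterable[Iterable[dict]]) -> Iterator[dict]:
    """Yield each provider once across overlapping result batches."""
    seen: set[str] = set()
    for batch in batches:
        for record in batch:
            npi = record["npi"]
            if npi in seen:
                continue  # already emitted by an earlier query
            seen.add(npi)
            yield record
```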

How do I report a bug or request a feature?

Open an issue on the Apify Store actor page or contact the developer directly through the Apify Console.


Related Actors

If you build healthcare, compliance, lead-generation, or government-data pipelines, these related actors compose cleanly with the NPI Registry Scraper:

Healthcare directories & marketplaces

US state professional licensing peers (cross-reference NPI license numbers)

Federal disclosure & business registries


Comparison vs. Alternatives

| Approach | Setup time | Data freshness | Cost per 10K records | Schema normalization | Multi-filter support |
| --- | --- | --- | --- | --- | --- |
| This actor | < 1 minute | Live at run time | Cents | Built-in flat schema | Cartesian fan-out |
| Manual NPPES Web UI queries | Per-query click-through | Live | Free but unscalable | None | One filter at a time |
| Roll-your-own NPPES API client | 4–12 hours dev | Live | Free + infra | DIY | DIY |
| Definitive Healthcare subscription | Days | Variable | $$$$ (enterprise contract) | Bundled | Bundled |
| IQVIA / Symphony Health datasets | Weeks | Refresh schedule | $$$$ (enterprise contract) | Bundled | Bundled |
| Health-specific data brokers | Days | Variable | $ per record | Bundled | Bundled |
| FOIA records request to CMS | Weeks | Stale | Free | None | None |

The advantage of going direct to NPPES via this actor: live data, no contract, no minimums, and full transparency about exactly which CMS endpoint sourced every field.


Why Pay-Per-Event Pricing?

Most healthcare data products either charge a flat enterprise subscription (you pay even if you don't query) or per-API-call (unpredictable spend that punishes deep pagination). This actor uses pay-per-event pricing:

  • You only pay when the actor runs
  • Charges scale linearly with the number of records actually delivered
  • Transparent line-item billing inside Apify
  • No monthly minimums, no annual commits
  • Free to evaluate — sample with maxRecords: 5 for pennies
  • No surprise overage charges on broad statewide sweeps

Changelog

| Version | Date | Notes |
| --- | --- | --- |
| 1.0.0 | 2026-05-18 | Initial public release: direct CMS NPPES v2.1 API, NPI / name / taxonomy / location / org-name search, NPI-1 + NPI-2 support, full nested-JSON normalization, retry + dedupe, pay-per-event pricing |

Keywords

NPPES NPI scraper · NPI Registry API · CMS NPI database · US healthcare provider scraper · NPI lookup bulk · physician database scraper · NPI taxonomy specialty · healthcare provider directory API · Medicare provider lookup · NPI bulk download · doctor specialty database US · pharma sales targeting database · healthcare credentialing automation · physician contact info scraper · medical license verification API · CMS NPPES API client · National Provider Identifier extractor · NPI-1 individual provider data · NPI-2 organization NPI scraper · US physician directory · US nurse practitioner database · US pharmacist NPI lookup · US dentist NPI database · US hospital NPI registry · US clinic NPI registry · US group practice NPI · retail pharmacy NPI scraper · mail-order pharmacy NPI lookup · specialty pharmacy directory · physician taxonomy code NUCC · NPI license number cross-reference · network adequacy audit data · provider directory refresh API · telehealth provider scouting · multi-state physician licensing · locum tenens credentialing source · healthcare M&A due diligence dataset · medical-device territory mapping · prescriber targeting database · KOL identification healthcare · pharma rep territory planning · Medicare Part D enrichment · FHIR Direct endpoint directory · healthcare lead generation B2B · payer network audit · clinical staffing source · CRM enrichment healthcare · Salesforce HubSpot NPI enrichment · public health workforce data · pharmacy desert mapping · primary care shortage area · health-policy research dataset · Apify healthcare actor · npiregistry.cms.hhs.gov API · NPI bulk export CSV JSON · provider-view scraper


Support

  • Bug reports: Use the Issues tab on the Apify Store page
  • Feature requests: Same place — please describe your healthcare workflow and the field or filter you need
  • Direct contact: Through the Apify developer profile

If this actor saves your team hours of NPI verification, credentialing automation, or sales-targeting work, a 5-star rating on the Apify Store helps other healthcare, compliance, and life-sciences teams discover it. Thank you.