Healthcare Provider Scraper
Under maintenance

Pricing

from $2.00 / 1,000 results

Extract healthcare provider data from US NPI Registry. Get doctor profiles, specialties, locations & contact info. Ideal for medical marketing, research & patient referrals.

Developer

Vhub Systems

Maintained by Community

Healthcare Provider Directory Scraper

Professional actor for scraping and searching healthcare provider data from the official US National Provider Identifier (NPI) Registry and other public medical directories.

Overview

This Apify actor provides programmatic access to authoritative healthcare provider data including:

  • Provider Information: Names, credentials (MD, DO, NP, PA, etc.), specialties
  • Location Data: Address, city, state, ZIP code, phone numbers
  • Professional Details: NPI numbers, taxonomy codes, practice information
  • Status Information: Active/inactive status, last updated dates
  • Organization Support: Both individual providers and healthcare organizations

Data Sources

Primary: NPI Registry (npiregistry.cms.hhs.gov)

  • Official Source: Centers for Medicare & Medicaid Services (CMS)
  • Coverage: Healthcare providers and organizations in the United States that have been issued an NPI
  • Data Quality: High - maintained by CMS with strict validation
  • Cost: Free - no API key required
  • Rate Limits: None published; the actor applies its own delay between requests (see Rate Limiting)

Secondary Sources (Future)

  • Healthgrades provider directory
  • Zocdoc listings
  • Vitals provider profiles
  • State medical board directories

Features

  • ✅ Free access to official NPI Registry data
  • ✅ Multiple search modes (name, specialty, location, NPI)
  • ✅ Geographic and specialty filtering
  • ✅ Batch processing with pagination
  • ✅ Real-time data from authoritative sources
  • ✅ Professional credential extraction
  • ✅ Contact information retrieval
  • ✅ Active/inactive status filtering

Installation

Local Development

$ npm install

Apify Platform

Push to Apify directly:

$ apify push

Input Schema

The actor accepts the following input parameters:

Required

  • searchType (string): Type of search to perform
    • byName - Search by provider name
    • bySpecialty - Search by medical specialty
    • byLocation - Search by geographic location
    • byNPI - Search by specific NPI number

Search Criteria (supply the fields that match the chosen searchType)

  • firstName (string): Provider's first name
  • lastName (string): Provider's last name
  • specialty (string): Medical specialty (e.g., "Cardiologist", "Orthopedic Surgeon", "Family Practice")
  • city (string): City name
  • state (string): US state abbreviation (e.g., "CA", "NY", "TX")
  • zipCode (string): ZIP code
  • npi (string): 10-digit National Provider Identifier

Optional - General Parameters

  • organizationType (string): Filter by organization type
    • Individual - Individual providers only
    • Organization - Organizations only
    • Both - Both (default)
  • maxResults (integer): Maximum providers to return (1-1000, default: 100)
  • includeInactiveProviders (boolean): Include deactivated providers (default: false)
  • dataSourcePriority (array): Data sources to query, in priority order
    • Default: ["npiRegistry"]
    • Available: ["npiRegistry", "healthgrades", "zocdoc", "vitals"] (the sources other than npiRegistry are listed under Data Sources as future additions)
  • enrichWithSecondaryData (boolean): Enrich NPI data with secondary sources (default: false)
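
The input rules above can be checked before a run is started. The following is a minimal sketch, not the actor's own validation code: the `REQUIRED_ANY` mapping (which criteria each searchType needs) is an assumption inferred from the usage examples, and `validate_input` is a hypothetical helper name.

```python
from typing import Any

# ASSUMPTION: at least one of these fields must be present for each search type.
# Inferred from the usage examples; the actor's real rules may differ.
REQUIRED_ANY = {
    "byName": ("firstName", "lastName"),
    "bySpecialty": ("specialty",),
    "byLocation": ("city", "state", "zipCode"),
    "byNPI": ("npi",),
}


def validate_input(raw: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of `raw` with the documented defaults applied, or raise ValueError."""
    search_type = raw.get("searchType")
    if search_type not in REQUIRED_ANY:
        raise ValueError(f"searchType must be one of {sorted(REQUIRED_ANY)}")

    fields = REQUIRED_ANY[search_type]
    if not any(raw.get(field) for field in fields):
        raise ValueError(f"{search_type} needs at least one of {fields}")

    # The schema describes npi as a 10-digit string.
    npi = raw.get("npi")
    if npi is not None and not (isinstance(npi, str) and npi.isdigit() and len(npi) == 10):
        raise ValueError("npi must be a 10-digit string")

    # Documented range for maxResults is 1-1000 with a default of 100.
    max_results = raw.get("maxResults", 100)
    if not isinstance(max_results, int) or not 1 <= max_results <= 1000:
        raise ValueError("maxResults must be an integer between 1 and 1000")

    # Documented defaults first, then the caller's values override them.
    return {
        "organizationType": "Both",
        "includeInactiveProviders": False,
        "dataSourcePriority": ["npiRegistry"],
        "enrichWithSecondaryData": False,
        **raw,
        "maxResults": max_results,
    }
```

Calling `validate_input({"searchType": "byNPI", "npi": "1234567890"})` returns the input with `maxResults` set to 100 and the other documented defaults filled in.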

Output Format

Provider Record

Each provider record contains:

{
  "npi": "1234567890",
  "providerName": "John Smith, MD",
  "providerType": "Individual",
  "credentials": "MD",
  "specialty": "Internal Medicine",
  "specialties": [
    {
      "code": "207RC0000X",
      "description": "Physicians & Surgeons, Internal Medicine, Cardiovascular Disease",
      "primary": true
    }
  ],
  "address": "123 Medical Plaza, Suite 100",
  "city": "Boston",
  "state": "MA",
  "zipCode": "02101",
  "phone": "+1-617-555-0000",
  "status": "Active",
  "lastUpdated": "2024-01-15",
  "otherIdentifiers": [
    {
      "type": "01",
      "value": "XX-1234567",
      "state": "MA"
    }
  ],
  "mobileProviderIndicator": false
}
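
Because a record can carry several entries in `specialties`, consumers usually want the one flagged as primary. The sketch below assumes only the field names shown in the sample record; `primary_specialty` is a hypothetical helper, and the fallback to the first entry is a design choice of this example, not documented actor behavior.

```python
from typing import Any, Optional


def primary_specialty(record: dict[str, Any]) -> Optional[dict[str, Any]]:
    """Return the taxonomy entry marked primary, else the first entry, else None."""
    specialties = record.get("specialties") or []
    for entry in specialties:
        if entry.get("primary"):
            return entry
    # Fallback chosen for this sketch: use the first listed specialty.
    return specialties[0] if specialties else None


record = {
    "npi": "1234567890",
    "specialties": [
        {"code": "207R00000X", "description": "Internal Medicine", "primary": False},
        {"code": "207RC0000X", "description": "Cardiovascular Disease", "primary": True},
    ],
}
chosen = primary_specialty(record)
```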

Summary Record

Each run also includes a summary record:

{
  "_type": "summary",
  "totalProviders": 42,
  "searchCriteria": {
    "searchType": "byLocation",
    "city": "Boston",
    "state": "MA",
    "specialty": "Cardiologist"
  },
  "timestamp": "2024-01-15T10:30:00.000Z",
  "dataSourcesUsed": ["npiRegistry"]
}
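
Since the summary record is stored alongside the provider records, downstream code needs to separate the two. A minimal sketch, assuming the summary is identified only by `"_type": "summary"` as in the sample above; `split_items` is a hypothetical helper name.

```python
from typing import Any, Optional


def split_items(
    items: list[dict[str, Any]],
) -> tuple[list[dict[str, Any]], Optional[dict[str, Any]]]:
    """Separate provider records from the run's summary record."""
    providers = [item for item in items if item.get("_type") != "summary"]
    summary = next((item for item in items if item.get("_type") == "summary"), None)
    return providers, summary


items = [
    {"npi": "1234567890", "providerName": "John Smith, MD"},
    {"npi": "1098765432", "providerName": "Jane Doe, DO"},
    {"_type": "summary", "totalProviders": 2, "dataSourcesUsed": ["npiRegistry"]},
]
providers, summary = split_items(items)
```

Here `providers` holds the two provider records and `summary` holds the summary record; if a run produced no summary, `summary` is `None`.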

Usage Examples

Search by Name

{
  "searchType": "byName",
  "firstName": "John",
  "lastName": "Smith",
  "maxResults": 50
}

Search by Specialty in Location

{
  "searchType": "bySpecialty",
  "specialty": "Cardiologist",
  "state": "CA",
  "maxResults": 100
}

Search by Location with Specialty

{
  "searchType": "byLocation",
  "city": "New York",
  "state": "NY",
  "specialty": "Orthopedic Surgeon",
  "maxResults": 200
}

Search by NPI Number

{
  "searchType": "byNPI",
  "npi": "1234567890"
}

Search by ZIP Code with Filters

{
  "searchType": "byLocation",
  "zipCode": "10001",
  "maxResults": 500,
  "includeInactiveProviders": false,
  "organizationType": "Individual"
}

API Documentation

NPI Registry API

  • Endpoint: https://npiregistry.cms.hhs.gov/api
  • Version: 2.1
  • Authentication: None required (public API)
  • Rate Limiting: No published limits, reasonable usage expected
  • Format: JSON

Query Parameters Supported

  • version - API version (2.1)
  • number - NPI number
  • first_name - Provider first name
  • last_name - Provider last name
  • organization_name - Organization name
  • city - City
  • state - State abbreviation
  • postal_code - ZIP code
  • taxonomy_description - Specialty description
  • limit - Results per page (up to 200)
  • skip - Pagination offset
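
To show how these parameters combine into a request, here is a minimal sketch that only builds the URL and sends nothing. It assumes the endpoint is addressed with a trailing slash before the query string; `build_query_url` is a hypothetical helper, not part of the actor.

```python
from typing import Any
from urllib.parse import urlencode

# ASSUMPTION: the documented endpoint with a trailing slash before the query string.
BASE_URL = "https://npiregistry.cms.hhs.gov/api/"


def build_query_url(params: dict[str, Any], limit: int = 200, skip: int = 0) -> str:
    """Build an NPI Registry query URL from the documented parameters."""
    if not 1 <= limit <= 200:
        raise ValueError("limit must be between 1 and 200")
    if skip < 0:
        raise ValueError("skip must not be negative")

    query: dict[str, Any] = {"version": "2.1"}
    # Drop unset criteria so they are not sent as empty parameters.
    query.update({key: value for key, value in params.items() if value not in (None, "")})
    query["limit"] = limit
    query["skip"] = skip
    return f"{BASE_URL}?{urlencode(query)}"


url = build_query_url({"city": "Boston", "state": "MA", "taxonomy_description": None})
```

Values are URL-encoded by `urlencode`, so a city such as "New York" is sent as `New+York`.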

Rate Limiting

This actor implements respectful rate limiting:

  • 500ms delay between API requests
  • Batch fetching with up to 200 results per request
  • Automatic pagination
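
The three points above can be sketched as a single loop. This is an illustration under stated assumptions, not the actor's source: `fetch_page` is a hypothetical callable standing in for the HTTP request, and the `sleep` parameter exists only so the delay can be replaced in tests.

```python
import time
from typing import Any, Callable

PAGE_SIZE = 200       # documented maximum results per request
DELAY_SECONDS = 0.5   # documented delay between requests


def fetch_all(
    fetch_page: Callable[[int, int], list[dict[str, Any]]],
    max_results: int,
    delay: float = DELAY_SECONDS,
    sleep: Callable[[float], None] = time.sleep,
) -> list[dict[str, Any]]:
    """Collect up to max_results records, calling fetch_page(limit, skip) per page."""
    results: list[dict[str, Any]] = []
    skip = 0
    while len(results) < max_results:
        limit = min(PAGE_SIZE, max_results - len(results))
        page = fetch_page(limit, skip)[:limit]
        results.extend(page)
        if len(page) < limit:
            break  # a short page means the source has no more records
        skip += len(page)
        if len(results) < max_results:
            sleep(delay)  # pause only when another request will follow
    return results


# Usage with an in-memory stand-in for the registry (450 fake records).
fake_data = [{"npi": str(1000000000 + i)} for i in range(450)]
pauses: list[float] = []
records = fetch_all(lambda limit, skip: fake_data[skip:skip + limit], 1000, sleep=pauses.append)
```

With 450 available records the loop issues three requests (200, 200, 50) and pauses twice, so the delay adds about one second to the run.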

Error Handling

The actor includes comprehensive error handling for:

  • Missing required parameters
  • Invalid search criteria
  • API rate limiting
  • Network timeouts
  • Malformed responses
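
As one illustration of the "malformed responses" case, the sketch below checks a response body before any records are read. The response shape is an assumption here (a JSON object with a `results` list on success and an `Errors` list on failure), and `RegistryError` and `parse_response` are hypothetical names.

```python
import json
from typing import Any


class RegistryError(Exception):
    """Raised when a registry response cannot be used."""


def parse_response(body: str) -> list[dict[str, Any]]:
    """Return the result list from a response body, or raise RegistryError."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError as exc:
        raise RegistryError(f"malformed JSON: {exc}") from exc

    if not isinstance(payload, dict):
        raise RegistryError("expected a JSON object at the top level")

    # ASSUMPTION: failures are reported under an "Errors" key.
    errors = payload.get("Errors")
    if errors:
        raise RegistryError(f"registry reported errors: {errors}")

    # ASSUMPTION: successful responses carry a "results" list.
    results = payload.get("results")
    if not isinstance(results, list):
        raise RegistryError("response has no 'results' list")
    return results
```

Raising one exception type for every failure mode keeps the calling code simple: it can log the message and decide whether to retry or stop.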

Limitations

  • NPI Registry API has no published rate limits (we implement reasonable delays)
  • Maximum 200 results per API call (actor handles pagination)
  • Some provider details may be incomplete in the NPI Registry
  • Inactive/deactivated providers are filtered by default

Performance

Typical performance metrics:

Query Type                  Avg Results   Avg Time
By NPI                      1             < 1s
By Name (specific)          5-50          1-3s
By Specialty + Location     50-500        5-15s
By Location (large city)    1000+         30+ s

Legal and Compliance

  • Data Source: Official US government database (CMS)
  • Public Data: All data in NPI Registry is publicly available
  • Terms: Use complies with CMS policies
  • Attribution: Data from Centers for Medicare & Medicaid Services

Support

For issues or questions:

  1. Check the input parameters against the schema
  2. Review the error messages in the run log
  3. Verify the NPI Registry is accessible
  4. Consult Apify documentation

Changelog

v1.0.0 (2024-01-15)

  • Initial release
  • NPI Registry integration
  • Multiple search modes
  • Comprehensive provider data extraction
  • Professional credential handling

License

Apache-2.0

Author

V | Healthcare Data Agent


Note: This actor provides access to publicly available healthcare provider data. Always comply with applicable laws and regulations when using this data, including HIPAA and state privacy laws.