PsychologyToday.com Search and Details scraper

Pricing

from $1.00 / 1,000 results

Scrape PsychologyToday by state, city or direct profile URL for therapists, psychiatrists, groups and treatment-rehab. Returns name, credentials, bio, address, phone, image, specialties, expertise, types of therapy, insurance carriers, license, client focus and nearby areas.

Developer

Muhamed Didovic

Maintained by Community

Psychology Today Therapist Scraper

Turn Psychology Today into a structured therapist database. Scrape every therapist on a directory page — name, credentials, full bio, specialties, accepted insurance, license number, fees, location and more — from any /us/therapists/{state}/{city} page, or scrape a single therapist by profile URL.

✨ Why use this scraper?

Psychology Today rate-limits aggressively, has no public API, and renders different sections per profile. This actor handles all of that.

  • 🎯 Three starting points. Paste city directory URLs, paste direct therapist profile URLs, or supply stateCode + citySlug and the actor builds the directory URL.
  • 👥 One row per therapist. Each dataset item merges JSON-LD (clean structured data) with HTML extraction (the parts JSON-LD doesn't cover).
  • 🛡️ Resilient fetch pipeline. Every request races impit (Firefox TLS, HTTP/3 when no proxy) against got-scraping. Each retry rotates the proxy. Built for sites that rate-limit aggressively.
  • 🧾 Listing card preserved. When a therapist is discovered through a directory page, the original LIST-card data is kept under basicInfo on the row.
  • 🔁 Built-in retry on rate limits. 4xx responses rotate the proxy and retry — typical runs come through 22/22 even on heavily blocked IPs.
  • 📦 No browser. Pure HTTP + Cheerio. Cheap, fast, low memory.

📖 Overview

This is a Psychology Today therapist scraper. It is built around the directory page format (/us/therapists/{state}/{city}) and produces one structured row per therapist found on those listing pages.

  • Start from one or many city directory URLs.
  • The actor paginates each URL (?page=2, ?page=3, …) up to a configurable cap.
  • For every therapist card on every page, the actor fetches that therapist's profile page and pushes a single dataset row.
  • Each row contains the JSON-LD identity block, the long bio, every visible HTML section (specialties, expertise, types of therapy, client focus, insurance, fees, license, location, nearby areas) and the original listing card under basicInfo.

It is not a search engine for therapist names across the whole site. The starting point is always a city directory page or a direct therapist profile URL.

🌐 Supported inputs

Starting URLs

Two URL types are accepted in startUrls — mix them in any combination:

| Type | Pattern | Example | Behavior |
| --- | --- | --- | --- |
| City directory | /us/therapists/{state}/{city} | https://www.psychologytoday.com/us/therapists/ny/new-york | Paginates ?page=2, ?page=3, … and fetches every therapist found |
| Direct profile | /us/therapists/{slug}/{numericId} | https://www.psychologytoday.com/us/therapists/i-wang-new-york-ny/1392436 | Scrapes that one therapist; basicInfo is null (no listing-card data available) |

The actor classifies each URL automatically by checking whether the last path segment is numeric.

{
  "startUrls": [
    "https://www.psychologytoday.com/us/therapists/ny/new-york",
    "https://www.psychologytoday.com/us/therapists/ca/los-angeles",
    "https://www.psychologytoday.com/us/therapists/i-wang-new-york-ny/1392436"
  ]
}
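
The classification rule (a numeric last path segment means a profile URL) can be sketched in a few lines. This is an illustration in Python, not the actor's own code; the function name `classify_start_url` and the labels `"profile"` and `"directory"` are assumptions made for the example.

```python
from urllib.parse import urlparse


def classify_start_url(url: str) -> str:
    """Return "profile" if the last path segment is numeric, else "directory".

    Hypothetical helper: it mirrors the rule described above, not the
    actor's actual implementation.
    """
    # Split the path into non-empty segments; the query string is ignored.
    segments = [segment for segment in urlparse(url).path.split("/") if segment]
    if not segments:
        raise ValueError(f"URL has no path segments: {url}")
    return "profile" if segments[-1].isdigit() else "directory"


directory = classify_start_url(
    "https://www.psychologytoday.com/us/therapists/ny/new-york"
)
profile = classify_start_url(
    "https://www.psychologytoday.com/us/therapists/i-wang-new-york-ny/1392436"
)
```

Because only the path is inspected, a directory URL that already carries a query string (for example a filtered URL) is still classified as a directory.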

Filter mode (used when startUrls is empty)

{
  "stateCode": "ca",
  "citySlug": "san-francisco"
}

stateCode is the lowercase two-letter US state code, citySlug is the city as it appears in the URL. This mode only builds city directory URLs — to scrape an individual therapist by ID, pass the profile URL in startUrls.
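
As a rough illustration of how filter mode could map to a directory URL, the sketch below joins the two fields onto the /us/therapists/{state}/{city} pattern. The helper names `build_directory_url` and `resolve_start_urls` are hypothetical, and the validation shown is an assumption rather than documented actor behavior.

```python
from typing import Dict, List

BASE = "https://www.psychologytoday.com/us/therapists"


def build_directory_url(state_code: str, city_slug: str) -> str:
    """Build a city directory URL from stateCode and citySlug (illustrative)."""
    state = state_code.strip().lower()
    city = city_slug.strip().lower()
    # Assumed validation: a two-letter state code and a non-empty slug.
    if len(state) != 2 or not state.isalpha():
        raise ValueError(f"stateCode must be two letters, got {state_code!r}")
    if not city:
        raise ValueError("citySlug must not be empty")
    return f"{BASE}/{state}/{city}"


def resolve_start_urls(actor_input: Dict) -> List[str]:
    """startUrls take priority; filter mode applies only when they are empty."""
    start_urls = actor_input.get("startUrls") or []
    if start_urls:
        return list(start_urls)
    return [build_directory_url(actor_input["stateCode"], actor_input["citySlug"])]


urls = resolve_start_urls({"stateCode": "ca", "citySlug": "san-francisco"})
```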

Unsupported

  • Insurance / specialty / language filters at the actor level — apply those filters on Psychology Today first, then paste the filtered URL into startUrls.
  • International directories outside psychologytoday.com/us/....
  • Search by therapist name across the whole site.

🎯 Use cases

| Team | What they build |
| --- | --- |
| Mental health marketplaces | Coverage maps showing therapist density by city, specialty, and insurance carrier |
| Insurance & benefits | Provider network audits: who actually accepts your plan in a given metro |
| B2B lead gen | Outbound lists for selling SaaS, billing tools, or marketing services to therapy practices |
| Research teams | Datasets for academic studies on therapy access, fee distribution, and demographic coverage |
| Health-tech founders | Competitor and supply-side analysis when sizing a market |
| Marketing agencies | Therapist persona research from real bios, specialties, and client-focus tiles |

⚙️ How it works

  1. You provide one or more URLs — city directory, profile, or both — or a stateCode + citySlug.
  2. For each city directory URL, the actor fetches the LIST page and extracts every .results-row therapist card on it (~20 per page), then walks ?page=2, ?page=3, … until maxItems or maxPages is reached.
  3. For each direct profile URL in startUrls, the actor fetches that one therapist directly with no LIST stage.
  4. Every page fetch races impit against got-scraping in parallel. The first one that returns clean HTML wins. Each retry rotates the proxy.
  5. Each therapist becomes one dataset row combining JSON-LD + HTML extraction. Therapists discovered through a LIST page also keep the original listing-card data under basicInfo; therapists scraped via a direct profile URL get basicInfo: null.
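
The LIST stage in step 2 can be pictured as a loop over page numbers that stops at whichever cap is hit first. The sketch below is a simplified Python model under stated assumptions: `fetch_cards` is a hypothetical callable standing in for the real fetch-and-parse step, and stopping on an empty page is an assumption, not documented behavior.

```python
from typing import Callable, Dict, List
from urllib.parse import parse_qsl, urlencode, urlparse, urlunparse

Card = Dict[str, str]


def page_url(directory_url: str, page: int) -> str:
    """Return the URL for a listing page; page 1 is the bare directory URL."""
    parts = urlparse(directory_url)
    # Keep any existing filters, but replace a stale page parameter.
    query = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True) if k != "page"]
    if page > 1:
        query.append(("page", str(page)))
    return urlunparse(parts._replace(query=urlencode(query)))


def walk_directory(
    directory_url: str,
    fetch_cards: Callable[[str], List[Card]],  # hypothetical fetch + parse step
    max_pages: int = 25,
    max_items: int = 1000,
) -> List[Card]:
    """Collect listing cards until maxPages or maxItems is reached."""
    collected: List[Card] = []
    for page in range(1, max_pages + 1):
        cards = fetch_cards(page_url(directory_url, page))
        if not cards:  # assumption: an empty page means the listing is exhausted
            break
        for card in cards:
            collected.append(card)
            if len(collected) >= max_items:
                return collected
    return collected
```

With roughly 20 cards per page, a `max_items` of 30 would stop partway through the second page, and later pages would never be requested.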

📥 Input configuration

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| startUrls | string[] | | PT therapist directory URLs and/or direct profile URLs. Take priority over filter mode. |
| stateCode | string | | Two-letter state code (lowercase), e.g. ny. Used only when startUrls is empty. |
| citySlug | string | | City slug as it appears in the URL, e.g. new-york. Used with stateCode. |
| maxItems | integer | 1000 | Stop after this many therapist profiles have been pushed. |
| maxPages | integer | 25 | Hard cap on listing pagination per directory URL. |
| detailConcurrency | integer | 6 | How many therapist profile pages to fetch in parallel. |
| maxRequestRetries | integer | 5 | Per-request retry budget. The actor multiplies this internally because each retry rotates the proxy. |
| proxy | object | Apify Residential | Proxy configuration. Residential is strongly recommended. |

Examples

Single city:

{
  "startUrls": ["https://www.psychologytoday.com/us/therapists/ny/new-york"],
  "maxItems": 100
}

Multiple cities + a direct profile URL, in one run:

{
  "startUrls": [
    "https://www.psychologytoday.com/us/therapists/ny/new-york",
    "https://www.psychologytoday.com/us/therapists/ca/san-francisco",
    "https://www.psychologytoday.com/us/therapists/i-wang-new-york-ny/1392436"
  ],
  "maxItems": 500,
  "detailConcurrency": 4,
  "proxy": { "useApifyProxy": true, "apifyProxyGroups": ["RESIDENTIAL"] }
}

Filter mode:

{
  "stateCode": "tx",
  "citySlug": "austin",
  "maxItems": 50
}

Single therapist by URL only:

{
  "startUrls": ["https://www.psychologytoday.com/us/therapists/i-wang-new-york-ny/1392436"],
  "maxItems": 1
}

📤 Output overview

  • One dataset row per therapist.
  • Top-level fields are merged from JSON-LD (canonical structured data) and HTML extraction.
  • Nested objects: license, address, clientFocus, nearbyAreas.
  • The original listing-card data is preserved under basicInfo on every row discovered through a directory page; rows scraped from a direct profile URL have basicInfo: null.
  • Items are written to storage/datasets/default/ and additionally exported as a single data.json at the project root after the run.

📋 Output sample

Trimmed real row (from data.json):

{
  "profileUrl": "https://www.psychologytoday.com/us/therapists/dani-saliani-new-york-ny/914610",
  "name": "Dani Saliani",
  "jobTitle": "Counselor",
  "credentials": "Counselor, LMHC-D, MA",
  "honorificSuffix": "LMHC-D, MA",
  "image": "https://photos.psychologytoday.com/.../320x400.jpeg",
  "description": "You can keep it together, but inside, you feel stuck...",
  "telephone": "(929) 930-5468",
  "verifiedByPsychologyToday": true,
  "acceptsInsurance": true,
  "topSpecialties": ["ADHD", "Anxiety", "Men's Issues"],
  "expertise": ["Body Image", "Career Counseling", "Codependency", "Depression", "..."],
  "typesOfTherapy": ["Compassion Focused", "Eclectic", "Family Systems", "..."],
  "clientFocus": {
    "Age": ["Teen", "Adults"],
    "Communities": ["Gay Allied", "Non-Binary Allied", "Queer Allied", "Racial Justice Allied"]
  },
  "insurance": ["Aetna", "Cigna and Evernorth", "Out of Network"],
  "yearsInPractice": 7,
  "license": { "number": "011817", "issuingState": "New York", "expires": "2026-11-01" },
  "address": { "locality": "New York", "region": "New York", "postalCode": "10010", "country": "US" },
  "primaryLocation": "1133 Broadway, New York, NY 10010",
  "nearbyAreas": {
    "cities": ["Brooklyn, NY", "New York, NY"],
    "counties": ["Kings", "New York"],
    "zips": ["10001", "10003", "10010", "11222"]
  },
  "basicInfo": {
    "name": "Dani Saliani",
    "title": "Counselor, LMHC-D, MA",
    "address": "New York, NY 10010",
    "shortDescription": "You can keep it together, but inside, you feel stuck...",
    "image": "https://photos.psychologytoday.com/.../320x400.jpeg",
    "detailsUrl": "https://www.psychologytoday.com/us/therapists/dani-saliani-new-york-ny/914610",
    "phone": "(929) 930-5468"
  }
}
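
Rows shaped like the sample above are straightforward to query once loaded. As a small, hedged example of the network-audit use case, the sketch below keeps rows whose insurance list names a given carrier; `rows_accepting` is a hypothetical helper, and exact case-insensitive matching on the carrier name is an assumption about how you would want to compare.

```python
from typing import Dict, Iterable, List


def rows_accepting(rows: Iterable[Dict], carrier: str) -> List[Dict]:
    """Return rows whose "insurance" list contains the carrier (case-insensitive)."""
    wanted = carrier.strip().casefold()
    matches = []
    for row in rows:
        # "insurance" may be missing or empty on profiles without that section.
        carriers = row.get("insurance") or []
        if any(str(name).strip().casefold() == wanted for name in carriers):
            matches.append(row)
    return matches


sample_rows = [
    {"name": "Dani Saliani", "insurance": ["Aetna", "Cigna and Evernorth", "Out of Network"]},
    {"name": "Example Therapist", "insurance": []},  # made-up row for illustration
]
aetna_rows = rows_accepting(sample_rows, "aetna")
```

In practice the rows would come from the exported dataset, for example by passing the parsed contents of data.json to `json.load`.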

🗂️ Key output fields

Identity

| Field | Type | Description |
| --- | --- | --- |
| profileUrl | string | Therapist profile URL (the URL fetched). |
| name | string | Full name (JSON-LD, fallback to .profile-title). |
| jobTitle | string | Profession (e.g. Counselor, Psychologist). |
| credentials | string | Title line including post-nominals. |
| honorificSuffix | string | Letters only (e.g. PhD, LMHC-D, MA). |
| image | string | Profile photo URL. |
| telephone | string | From JSON-LD; may be empty even when the listing has one. |
| verifiedByPsychologyToday | boolean | True if the "Verified" badge is on the profile. |

Bio & specialties

| Field | Type | Description |
| --- | --- | --- |
| description | string | Long bio paragraph. |
| topSpecialties | string[] | The 1–3 specialties PT highlights. |
| expertise | string[] | Full list of issues treated. |
| typesOfTherapy | string[] | Modalities (CBT, DBT, ACT, EMDR, …). |
| clientFocus | object | Map of group title → values (Age, Participants, Communities, …). |
| knowsAboutRaw | string | Original comma-separated knowsAbout string from JSON-LD. |

Insurance & practice

| Field | Type | Description |
| --- | --- | --- |
| acceptsInsurance | boolean | Convenience flag from the at-a-glance row. |
| insurance | string[] | Named insurance carriers (may include Out of Network). |
| yearsInPractice | number or null | Parsed from the education section, null if not displayed. |
| license.number | string | State license number. |
| license.issuingState | string | State that issued the license. |
| license.expires | string | Expiration date (YYYY-MM-DD). |

Location

| Field | Type | Description |
| --- | --- | --- |
| primaryLocation | string | Single-line address as displayed on the page. |
| address.locality | string | City. |
| address.region | string | State (full name). |
| address.postalCode | string | ZIP. |
| address.country | string | Country code or name. |
| nearbyAreas.cities | string[] | Cities the therapist also serves. |
| nearbyAreas.counties | string[] | Counties covered. |
| nearbyAreas.zips | string[] | Zip codes covered. |

Listing card (preserved under basicInfo)

| Field | Type | Description |
| --- | --- | --- |
| name, title, address, shortDescription, image, detailsUrl, phone | string | Exactly what was visible on the directory listing card before the DETAIL fetch. address may be Online Only for virtual therapists. |
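
Because rows nest license, address, clientFocus, nearbyAreas and basicInfo, a flat view is often handier for spreadsheets. The following is one possible flattening in Python, not something the actor does for you: `flatten_row` is a hypothetical helper, and the dotted column names and the "; " list separator are arbitrary choices made for this sketch.

```python
from typing import Any, Dict


def flatten_row(row: Dict[str, Any], sep: str = ".", list_sep: str = "; ") -> Dict[str, Any]:
    """Flatten nested objects into dotted keys and join lists into one string."""
    flat: Dict[str, Any] = {}

    def visit(prefix: str, value: Any) -> None:
        if isinstance(value, dict):
            # Recurse into nested objects, e.g. license -> license.number.
            for key, child in value.items():
                visit(f"{prefix}{sep}{key}" if prefix else key, child)
        elif isinstance(value, list):
            flat[prefix] = list_sep.join(str(item) for item in value)
        elif value is None:
            # e.g. basicInfo is null for rows scraped from a direct profile URL.
            flat[prefix] = ""
        else:
            flat[prefix] = value

    visit("", row)
    return flat


flat = flatten_row(
    {
        "name": "Dani Saliani",
        "license": {"number": "011817", "issuingState": "New York"},
        "insurance": ["Aetna", "Out of Network"],
        "basicInfo": None,
    }
)
```

The resulting dictionary can be handed to `csv.DictWriter` once all rows share the same set of columns.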

❓ FAQ

Which Psychology Today URLs are supported? City directory URLs of the form https://www.psychologytoday.com/us/therapists/{state}/{city} and direct therapist profile URLs of the form https://www.psychologytoday.com/us/therapists/{slug}/{numericId}. Directory URLs are paginated and every therapist found on them is fetched; a profile URL scrapes that single therapist.

Can I scrape multiple cities at once? Yes. Pass several URLs in startUrls. Pagination and maxPages apply per directory URL, while maxItems caps the number of therapist rows pushed to the dataset.

Does the actor support filtering by insurance, specialty, or language? Not at the actor level. Apply Psychology Today's own filters on the website first, then paste the resulting URL into startUrls. Pagination on filtered URLs works the same way.

How is basicInfo different from the top-level fields? basicInfo is what the listing card showed before the DETAIL request fired. Top-level fields are the richer DETAIL-page data. Some fields only exist on the listing card (shortDescription); some only on the DETAIL page (full description, license, clientFocus, insurance, etc.); some are duplicated for convenience. When a therapist is scraped via a direct profile URL (no LIST stage), basicInfo is null.

How does the actor handle Psychology Today's rate limiting? Every request races impit (Firefox TLS fingerprint, HTTP/3 when no proxy) against got-scraping in parallel. The first one returning clean HTML wins. Each retry rotates to a fresh proxy IP. On heavily blocked IPs you'll typically see "Recovered on attempt 2 via …" lines — that's the rotation working.
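
The race described in this answer can be modelled as starting several fetches at once and taking the first response that passes a cleanliness check. The sketch below uses Python's asyncio purely as an illustration: the actor itself runs impit and got-scraping, and the names `race_fetch` and `looks_clean`, as well as the simplistic block-page check, are assumptions made for the example.

```python
import asyncio
from typing import Awaitable, Callable, Optional, Sequence

Fetcher = Callable[[str], Awaitable[str]]


def looks_clean(html: str) -> bool:
    """Hypothetical check: treat anything that is not an HTML page as blocked."""
    return "<html" in html.lower()


async def race_fetch(url: str, fetchers: Sequence[Fetcher]) -> Optional[str]:
    """Run all fetchers concurrently; return the first clean HTML, or None."""
    tasks = [asyncio.ensure_future(fetcher(url)) for fetcher in fetchers]
    try:
        for finished in asyncio.as_completed(tasks):
            try:
                html = await finished
            except Exception:
                continue  # one client failing does not end the race
            if looks_clean(html):
                return html
        return None  # every client failed or was blocked: caller should retry
    finally:
        # Cancel whichever fetches are still running once a winner is known.
        for task in tasks:
            task.cancel()


async def fast_blocked(url: str) -> str:
    return "Access denied"  # stand-in for a block page


async def slow_clean(url: str) -> str:
    await asyncio.sleep(0.01)
    return "<html><body>profile</body></html>"


winner = asyncio.run(race_fetch("https://example.com", [fast_blocked, slow_clean]))
```

A `None` result is the point at which a retry loop would switch to a fresh proxy and race again.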

What happens when a profile section is missing? Lists come back as [], optional objects come back as null, optional strings as "". PT profiles are not uniform — not every therapist displays Top Specialties, years in practice, or every client-focus group.
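
Downstream code can lean on these conventions to read optional fields without special-casing each profile. A minimal sketch follows, assuming rows shaped like the output sample; `summarize_row` and the fields it picks are illustrative choices, not part of the actor.

```python
from typing import Any, Dict


def summarize_row(row: Dict[str, Any]) -> Dict[str, Any]:
    """Read optional sections defensively: [] for lists, {} for null objects."""
    license_info = row.get("license") or {}    # optional object may be null
    client_focus = row.get("clientFocus") or {}
    return {
        "name": row.get("name") or "",
        "topSpecialties": row.get("topSpecialties") or [],
        "licenseNumber": license_info.get("number") or "",
        "ages": client_focus.get("Age") or [],
        # yearsInPractice stays None when the profile does not display it.
        "yearsInPractice": row.get("yearsInPractice"),
    }


sparse = summarize_row({"name": "Example Therapist", "topSpecialties": [], "license": None})
```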

Can I run this without an Apify residential proxy? You can run with useApifyProxy: false, but PT rate-limits IPs aggressively and the per-attempt rotation has nothing to rotate to — every retry comes from the same banned IP. Residential proxy is strongly recommended.

What happens when a city has more therapists than maxItems? The actor stops as soon as the dataset reaches maxItems. Subsequent listing pages are not fetched.

📬 Support

  • Issues — open one on the actor's repository / Issues tab.
  • Custom work — need extra fields, different start strategies, or a tailored CSV export? Reach out and we can extend the scraper.

Response time: usually under 24 hours.

🛠️ Additional services

  • Custom field extraction (e.g. raw-HTML capture for downstream NLP, accept-superbill flag, languages spoken).
  • Cron-style monitoring runs on saved start URLs.
  • Tailored exports (CSV with flattened nested objects, Parquet, BigQuery uploads).

🔎 Explore more scrapers

Looking for related directory or marketplace scrapers? The same two-stage LIST → DETAIL pattern with the optional internal-handler engine is reusable; ask for a tailored build.