
Doctolib
Pricing
$9.00/month + usage

Doctolib
Scraping Doctolib is now super easy and cheap! Extract phones, names, contact details, opening hours, images and addresses of medics, doctors, hospitals... Best part: you can even customize what info to extract from Doctolib!
1.0 (1)
5
Total users: 157
Monthly users: 12
Runs succeeded: 88%
Issues response: 22 hours
Last modified: 14 days ago
Output only URL?
Closed
I tried to scrape Doctolib based on the tutorial that you have provided, and unfortunately, no data is collected. If this is a UX issue, please update the tutorial. If this is a scraper issue, please fix that. Let me know, thanks.

Hello, thanks for trying the Doctolib scraper. I am sorry that it did not work out of the box for you... let me help you with that. Can you provide your logs and Actor run input so that I can reproduce this on my side? Thanks
alexalexalexalex
Sorry, I don't understand where to get the logs, so I'll just copy and paste what I have:
alexalexalexalex
{
"hideSearchPages": true,
"pageFunction": "async function pageFunction(context) {\n\n let data = {}\n let userData = context.request.userData\n data.url = context.request.url\n data.label = userData.label\n \n if(userData && userData.label === 'doctor'){ \n data.nom = await context.page.locator('#main-content h1').innerText({timeout:6000})\n data.tarif = await context.innerTextwrapper(context,'#payment_means')\n data.horaire_contact = await context.innerTextwrapper(context,'#openings_and_contact')\n data.description = await context.innerTextwrapper(context,'.dl-profile-bio')\n data.specialite = await context.innerTextwrapper(context,'.dl-profile-header-speciality')\n data.expertise = await context.innerTextwrapper(context,'#skills')\n try{\n data.phones = await context.getPhones(data.horaire_contact)\n }catch(e){\n context.log.info('Phones not found',e); \n }\n try{\n data.image = await context.page.locator('.dl-profile img').first().getAttribute('src',{timeout:2000})\n if(data.image.startsWith('/')){ data.image = 'https:' + data.image}\n }catch(e){\n context.log.info('Image not found',e); \n } \n \n }else{\n context.log.info('we are not on a doctor page: so a search or pagination page.');\n userData.label = 'doctor';\n const elements = context.page.locator('.search-result-card a[href]');\n const links = await elements.evaluateAll(elems => elems.map(elem => elem.getAttribute('href')));\n let extenstion = 'fr'\n if(context.request.url.includes('doctolib.de')){ extenstion = 'de' }\n if(context.request.url.includes('doctolib.it')){ extenstion = 'it' }\n links.forEach(async link => {\n if(link.startsWith('/')){ link = `https://www.doctolib.${extenstion}${link}` }\n await context.enqueueRequest(link, userData , true);\n })\n\n }\n context.log.info(`ending this page now`);\n delete data.label\n return data;\n}\n",
"startUrls": [
{
"url": "https://www.doctolib.de/privatklinik/stuttgart/lipoklinik?pid=practice-624868&phs=true&page=1&insurance_sector=public&highlight%5Bspeciality_ids%5D%5B%5D=1297",
"method": "GET"
},
{
"url": "https://www.doctolib.de/hautarzt/stuttgart/sandra-teufel?pid=practice-228783&phs=true&page=1&index=1&insurance_sector=public",
"method": "GET"
},
{
"url": "https://www.doctolib.de/plastische-und-asthetische-chirurgie/stuttgart/jens-neumann?pid=practice-561047&phs=true&page=1&index=2&insurance_sector=public",
"method": "GET"
},
{
"url": "https://www.doctolib.de/plastische-und-asthetische-chirurgie/stuttgart/mirela-anghel-bota?pid=practice-578733&phs=true&page=1&index=3&insurance_sector=public",
"method": "GET"
}
]
}
alexalexalexalex
Ah found it!
alexalexalexalex
It seems that I cannot upload the log file here.
alexalexalexalex
2025-06-04T09:12:08.857Z ACTOR: Pulling Docker image of build aPSuTUe6Z2JFtF9vM from registry.
2025-06-04T09:12:22.108Z ACTOR: Creating Docker container.
2025-06-04T09:12:22.306Z ACTOR: Starting Docker container.
2025-06-04T09:12:22.511Z Starting X virtual framebuffer using: Xvfb :99 -ac -screen 0 1920x1080x24+32 -nolisten tcp
2025-06-04T09:12:22.512Z Executing main command
2025-06-04T09:12:23.655Z INFO System info {"apifyVersion":"2.3.2","apifyClientVersion":"2.9.3","osType":"Linux","nodeVersion":"v16.20.2"}
2025-06-04T09:12:24.117Z INFO FingerprintInjector: Successfully initialized.
2025-06-04T09:12:24.119Z INFO Starting the crawl.
2025-06-04T09:12:24.171Z INFO PlaywrightCrawler:AutoscaledPool: state {"currentConcurrency":0,"desiredConcurrency":2,"systemStatus":{"isSystemIdle":true,"memInfo":{"isOverloaded":false,"limitRatio":0.2,"actualRatio":null},"eventLoopInfo":{"isOverloaded":false,"limitRatio":0.6,"actualRatio":null},"cpuInfo":{"isOverloaded":false,"limitRatio":0.4,"actualRatio":null},"clientInfo":{"isOverloaded":false,"limitRatio":0.3,"actualRatio":null}}}
2025-06-04T09:12:31.601Z INFO Page opened, url:https://www.doctolib.de/privatklinik/stuttgart/lipoklinik?pid=practice-624868&phs=true&page=1&insurance_sector=public&highlight%5Bspeciality_ids%5D%5B%5D=1297 - label:undefined
2025-06-04T09:12:31.632Z INFO Page opened, url:https://www.doctolib.de/hautarzt/stuttgart/sandra-teufel?pid=practice-228783&phs=true&page=1&index=1&insurance_sector=public - label:undefined
2025-06-04T09:12:31.664Z INFO we are not on a doctor page: so a search or pagination page.
2025-06-04T09:12:31.684Z INFO we are not on a doctor page: so a search or pagination page.
2025-06-04T09:12:31.711Z INFO ending this page now
2025-06-04T09:12:31.781Z INFO ending this page now
2025-06-04T09:12:39.377Z INFO Page opened, url:https://www.doctolib.de/plastische-und-asthetische-chirurgie/stuttgart/mirela-anghel-bota?pid=practice-578733&phs=true&page=1&index=3&insurance_sector=public - label:undefined
2025-06-04T09:12:39.678Z INFO we are not on a doctor page: so a search or pagination page.
2025-06-04T09:12:40.085Z INFO Page opened, url:https://www.doctolib.de/plastische-und-asthetische-chirurgie/stuttgart/jens-neumann?pid=practice-561047&phs=true&page=1&index=2&insurance_sector=public - label:undefined
2025-06-04T09:12:40.166Z INFO ending this page now
2025-06-04T09:12:40.682Z INFO we are not on a doctor page: so a search or pagination page.
2025-06-04T09:12:40.783Z INFO ending this page now
2025-06-04T09:12:40.890Z INFO PlaywrightCrawler: All the requests from request list and/or request queue have been processed, the crawler will shut down.
2025-06-04T09:12:41.625Z INFO PlaywrightCrawler: Final request statistics: {"requestsFinished":4,"requestsFailed":0,"retryHistogram":[4],"requestAvgFailedDurationMillis":null,"requestAvgFinishedDurationMillis":8099,"requestsFinishedPerMinute":14,"requestsFailedPerMinute":0,"requestTotalDurationMillis":32397,"requestsTotal":4,"crawlerRuntimeMillis":17512}
2025-06-04T09:12:41.625Z INFO Crawl finished.
alexalexalexalex
What I do want is a list with all the doctors and all the website domains scraped from the links that I have provided above. Unfortunately, this only works in about 10-15% of the entries. The majority of them don't provide any website link of the individual doctors.

Let me try to help you. I opened your first link: https://www.doctolib.de/privatklinik/stuttgart/lipoklinik?pid=practice-624868&phs=true&page=1&insurance_sector=public&highlight%5Bspeciality_ids%5D%5B%5D=1297
This does look like a proper doctor page. Do you want to extract the website "https://lipoklinik.com/"? Do I understand correctly?

I added a new column to the output in version 0.7 this morning; maybe it will fit your needs. Also, in your input, you are supposed to put search URLs, not doctor URLs. Your log shows "label:undefined" for every page, so each one was treated as a search page and no doctor data was extracted. If you really want to use doctor URLs, you will need to add "label":"doctor" to the userData of each input URL.
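
For illustration, a start URL entry with the label set might look like the sketch below. This assumes that a "userData" object on a "startUrls" entry is passed through to context.request.userData, which is what the pageFunction checks. Only one of your four URLs is shown, and the other input fields are left out.

```json
{
  "startUrls": [
    {
      "url": "https://www.doctolib.de/hautarzt/stuttgart/sandra-teufel?pid=practice-228783&phs=true&page=1&index=1&insurance_sector=public",
      "method": "GET",
      "userData": { "label": "doctor" }
    }
  ]
}
```

With the label present, the pageFunction should take the 'doctor' branch instead of logging "we are not on a doctor page".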


Closing this issue, feel free to reopen if necessary

If you are happy with this Actor, can I ask you to rate it 5 ⭐? I would really appreciate it :)