Doctolib
3 days trial then $9.00/month - No credit card required now
Scraping Doctolib is now super easy and cheap! Extract phone numbers, names, contact details, opening hours, images and addresses of medics, doctors, hospitals... Best part: you can even customize what info to extract from Doctolib!
Hi! I used your actor from Apify to scrape doctors from Doctolib.fr but ran into a problem: with a browser, this search (https://www.doctolib.fr/medecin-generaliste/france?language=16) returns 81 results, but the scraper returns only 20 (the log says 21, but the first string is empty). See the log attached. Could you please suggest what could be the source of the problem? Or is it due to the site's protection from scrapers?
After the fix, the scraper found 30 pages but saved 0 results (Timeout error); see the log attached.
Thanks for your issue here :)
There is one thing you might try: reset the "pageFunction" to its default value. Let me know if this fixes it. I think the cause is that you were updated to version 0.5 of the Actor but it kept your last INPUT. Since I made changes to the pageFunction, it needs to be updated as well. Or, if you prefer, here is the JSON version you can use as the INPUT:
{ "hideSearchPages": true, "maxPagesPerCrawl": 90, "pageFunction": "async function pageFunction(context) {\n\n let data = {}\n let userData = context.request.userData\n data.url = context.request.url\n data.label = userData.label\n \n if(userData && userData.label === 'doctor'){ \n data.nom = await context.page.locator('#main-content h1').innerText({timeout:6000})\n data.tarif = await context.innerTextwrapper(context,'#payment_means')\n data.horaire_contact = await context.innerTextwrapper(context,'#openings_and_contact')\n data.description = await context.innerTextwrapper(context,'.dl-profile-bio')\n data.specialite = await context.innerTextwrapper(context,'.dl-profile-header-speciality')\n data.expertise = await context.innerTextwrapper(context,'#skills')\n try{\n data.phones = await context.getPhones(data.horaire_contact)\n }catch(e){\n context.log.info('Phones not found',e); \n }\n try{\n data.image = await cont... [trimmed]
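For readers hitting the Timeout error mentioned above, here is a minimal sketch of how a pageFunction could read optional fields without failing the whole page. It is not the Actor's actual `context.innerTextwrapper` implementation, which is not shown in this thread; `safeInnerText` and `extractDoctor` are hypothetical names, and the only assumption about the page object is that it exposes Playwright's `locator(selector).innerText({ timeout })`. The selectors are the ones that appear in the INPUT above.

```javascript
// Hypothetical helper (not part of the Actor): returns the inner text of the
// element matching `selector`, or null when the lookup throws, for example
// because the element never appears before the timeout expires.
async function safeInnerText(page, selector, timeoutMs = 6000) {
  try {
    return await page.locator(selector).innerText({ timeout: timeoutMs });
  } catch (err) {
    return null;
  }
}

// Hypothetical extraction step: each field is read independently, so one
// missing block on a profile page does not discard the other fields.
async function extractDoctor(page) {
  return {
    nom: await safeInnerText(page, '#main-content h1'),
    tarif: await safeInnerText(page, '#payment_means'),
    specialite: await safeInnerText(page, '.dl-profile-header-speciality'),
  };
}
```

With this shape, a profile that lacks a `#payment_means` block would yield `tarif: null` rather than a failed request.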
Guessing this worked, so closing the issue. Feel free to reopen if necessary.
Actor Metrics
9 monthly users
4 stars
>99% runs succeeded
Created in Jul 2022
Modified 21 days ago