Doctolib

3 days trial then $19.00/month - No credit card required now

anchor/doctolib

Scraping Doctolib is now easy and cheap! Extract phone numbers, names, contact details, opening hours, images, and addresses of doctors, medical practitioners, and hospitals. Best part: you can customize which information is extracted from Doctolib.

The code example below shows how to run the Actor through the Apify HTTP API. To run the code, you need an Apify account. Replace <YOUR_API_TOKEN> in the code with your API token, which you can find under Settings > Integrations in Apify Console.

# Set API token
API_TOKEN=<YOUR_API_TOKEN>

# Prepare Actor input
cat > input.json <<'EOF'
{
  "startUrls": [
    {
      "url": "https://www.doctolib.fr/infectiologue/75001-paris"
    }
  ],
  "pageFunction": "async function pageFunction(context) {\n\n    let data = {}\n    let userData = context.request.userData\n    data.url = context.request.url\n    data.label = userData.label\n    \n    if(userData && userData.label === 'doctor'){   \n        data.nom = await context.page.locator('#main-content h1').innerText({timeout:6000})\n        data.tarif = await context.innerTextwrapper(context,'#payment_means')\n        data.horaire_contact = await context.innerTextwrapper(context,'#openings_and_contact')\n        data.description = await context.innerTextwrapper(context,'.dl-profile-bio')\n        data.specialite = await context.innerTextwrapper(context,'.dl-profile-header-speciality')\n        data.expertise = await context.innerTextwrapper(context,'#skills')\n        try{\n            data.phones = await context.getPhones(data.horaire_contact)\n        }catch(e){\n            context.log.info('Phones not found',e);     \n        }\n        try{\n            data.image = await context.page.locator('.dl-profile img').first().getAttribute('src',{timeout:2000})\n            if(data.image.startsWith('/')){ data.image = 'https:' + data.image}\n        }catch(e){\n            context.log.info('Image not found',e);     \n        }        \n        \n    }else{\n        context.log.info('we are not on a doctor page: so a search or pagination page.');\n        userData.label = 'doctor';\n        const elements = context.page.locator('.search-result-card a[href]');\n        const links = await elements.evaluateAll(elems => elems.map(elem => elem.getAttribute('href')));\n        let extension = 'fr'\n        if(context.request.url.includes('doctolib.de')){ extension = 'de' }\n        if(context.request.url.includes('doctolib.it')){ extension = 'it' }\n        // Await each enqueue so the function does not return before the links are queued\n        for (let link of links) {\n            if(link.startsWith('/')){ link = `https://www.doctolib.${extension}${link}` }\n            await context.enqueueRequest(link, userData , true);\n        }\n\n    }\n    context.log.info(`ending this page now`);\n    delete data.label\n    return data;\n}\n"
}
EOF

# Run the Actor using an HTTP API
# See the full API reference at https://docs.apify.com/api/v2
curl "https://api.apify.com/v2/acts/anchor~doctolib/runs?token=$API_TOKEN" \
  -X POST \
  -d @input.json \
  -H 'Content-Type: application/json'

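
The request above only starts a run; the scraped items end up in the run's default dataset. The following is a minimal sketch of how the results could be collected afterwards, assuming the run response contains `id`, `status`, and `defaultDatasetId` fields as described in the Apify API reference. The helper names (`json_field`, `wait_for_run`, `dataset_items_url`) are hypothetical and not part of Apify, and the grep/sed parsing is a crude stand-in for a real JSON tool such as jq.

```shell
# Assumes API_TOKEN is already set, as in the example above.
API_BASE="https://api.apify.com/v2"

# Hypothetical helper: read the first string value of a given key from JSON on stdin.
# Crude on purpose (no nesting awareness); prefer jq if it is installed.
json_field() {
  local field="$1"
  grep -o "\"$field\": *\"[^\"]*\"" | head -n 1 | sed 's/.*: *"\([^"]*\)"$/\1/'
}

# Hypothetical helper: build the URL for downloading a dataset's items.
# The second argument is the export format and defaults to json.
dataset_items_url() {
  printf '%s/datasets/%s/items?format=%s' "$API_BASE" "$1" "${2:-json}"
}

# Hypothetical helper: poll a run every 5 seconds until it reaches a terminal status.
wait_for_run() {
  local run_id="$1" status
  while true; do
    status=$(curl -s "$API_BASE/actor-runs/$run_id?token=$API_TOKEN" | json_field status)
    case "$status" in
      SUCCEEDED) return 0 ;;
      FAILED|ABORTED|TIMED-OUT) echo "Run ended with status $status" >&2; return 1 ;;
      "") echo "Could not read the run status" >&2; return 1 ;;
    esac
    sleep 5
  done
}

# Usage (uncomment once API_TOKEN is set and input.json exists):
# run_json=$(curl -s "$API_BASE/acts/anchor~doctolib/runs?token=$API_TOKEN" \
#   -X POST -d @input.json -H 'Content-Type: application/json')
# run_id=$(echo "$run_json" | json_field id)
# dataset_id=$(echo "$run_json" | json_field defaultDatasetId)
# wait_for_run "$run_id" && \
#   curl -s "$(dataset_items_url "$dataset_id")&token=$API_TOKEN" -o results.json
```

Polling keeps the example close to the asynchronous call shown above. For short runs, the API reference also documents a synchronous `run-sync-get-dataset-items` endpoint that returns the items in a single request.
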
Developer
Maintained by Community
Actor metrics
  • 9 monthly users
  • 4 stars
  • 92.0% runs succeeded
  • 7.6 hours response time
  • Created in Jul 2022
  • Modified 5 days ago