Doctolib
3 days trial then $19.00/month - No credit card required now

anchor/doctolib
Scraping Doctolib is now super easy! Get phone numbers, names, contact details, opening hours and addresses of medics, doctors, hospitals... Best part: you can even customize what info to extract from Doctolib! Don't drown in a sea of searches! Watch the scraper do the magic :)

The code examples below show how to run the Actor and get its results. To run the code, you need to have an Apify account. Replace <YOUR_API_TOKEN> in the code with your API token, which you can find under Settings > Integrations in Apify Console.
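Rather than pasting the token into the script, it can be read from the environment. The following is a minimal sketch: the helper name `load_apify_token` and the variable name `APIFY_TOKEN` are assumptions chosen for illustration, not something this page prescribes.

```python
import os


def load_apify_token(env=os.environ, var_name="APIFY_TOKEN"):
    """Return the API token stored in `var_name`, or raise if it is missing.

    `var_name` is a hypothetical choice; use whatever name fits your setup.
    """
    token = env.get(var_name, "").strip()
    # Reject an empty value and the unreplaced placeholder from the example.
    if not token or token == "<YOUR_API_TOKEN>":
        raise RuntimeError(
            f"Set the {var_name} environment variable to your Apify API token"
        )
    return token


# Usage with the client from the example on this page:
#   client = ApifyClient(load_apify_token())
```

This keeps the token out of source control and makes a missing token fail early with a clear message.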

from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "startUrls": [{ "url": "https://www.doctolib.fr/infectiologue/75001-paris" }],
    "pageFunction": """async function pageFunction(context) {
    let data = {}
    let userData = context.request.userData
    data.url = context.request.url
    data.label = userData.label
    // 'product' here refers to a doctor page. The naming is misleading, my apologies:
    // it is a holdover from working mostly with marketplaces.
    if(userData && userData.label === 'product'){
        context.log.info('label product.');
        // data.img = await context.page.locator('[data-qa-id=adview_spotlight_container] img >> nth=0').getAttribute('src')
        data.nom = await context.page.locator('#main-content h1').innerText({timeout:6000})
        try{
            // Optional fields: if one is missing, the fields after it are skipped too.
            data.tarif = await context.page.locator('#payment_means').first().innerText({timeout:3000})
            data.horaire_contact = await context.page.locator('#openings_and_contact').first().innerText({timeout:3000})
            data.description = await context.page.locator('.dl-profile-bio').first().innerText({timeout:3000})
            data.specialite = await context.page.locator('.dl-profile-header-speciality').first().innerText({timeout:3000})
            data.expertise = await context.page.locator('#skills').first().innerText({timeout:3000})
            data.phones = await context.getPhones(data.horaire_contact)
        }catch(e){
            context.log.info('not found',e);
        }

    }else{
        context.log.info('we are not on a doctor page, so a search or pagination page.');
        // we are looking for "doctors" (called "product" here) to be queued, let's write it down
        userData.label = 'product';
        const elements = context.page.locator('.search-result-card a[href]');
        const links = await elements.evaluateAll(elems => elems.map(elem => elem.getAttribute('href')));
        // for...of (rather than forEach with an async callback) so each enqueue is actually awaited
        for (let link of links) {
            if(link.startsWith('/')){ link = 'https://www.doctolib.fr' + link }
            await context.enqueueRequest(link, userData , false);
        }
    }
    context.log.info(`function ended`);
    delete data.label
    return data;
}
""",
    "pseudoUrls": [{ "purl": "https://www.doctolib.fr/[.*]" }],
}

# Run the Actor and wait for it to finish
run = client.actor("anchor/doctolib").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start
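Printing each item is fine for a quick check, but the results are usually easier to work with as a file. The sketch below turns dataset items into CSV text. The field names come from the page function in the example; the helper name `items_to_csv` is hypothetical, and two points are assumptions: that search and pagination pages produce items without a `nom` field (so they are skipped), and that `phones` may arrive as a list.

```python
import csv
import io

# Field names set by the pageFunction in the example above.
FIELDS = [
    "url", "nom", "tarif", "horaire_contact",
    "description", "specialite", "expertise", "phones",
]


def items_to_csv(items, fields=FIELDS):
    """Serialize dataset items to CSV text, one row per doctor page."""
    buffer = io.StringIO()
    # Unknown keys are ignored and missing keys become empty cells.
    writer = csv.DictWriter(
        buffer, fieldnames=fields, extrasaction="ignore", restval=""
    )
    writer.writeheader()
    for item in items:
        row = dict(item)
        # Assumption: items without a name come from search/pagination pages.
        if not row.get("nom"):
            continue
        # Assumption: phones may be a list; join it into a single cell.
        phones = row.get("phones")
        if isinstance(phones, (list, tuple)):
            row["phones"] = "; ".join(str(p) for p in phones)
        writer.writerow(row)
    return buffer.getvalue()


# Usage with the run from the example on this page:
#   items = client.dataset(run["defaultDatasetId"]).iterate_items()
#   with open("doctolib.csv", "w", newline="", encoding="utf-8") as f:
#       f.write(items_to_csv(items))
```

Because the page function stops filling optional fields at the first one it cannot find, some rows may have empty cells after `nom`; `restval=""` keeps the columns aligned in that case.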
Developer
Maintained by Community
Actor metrics
  • 7 monthly users
  • 2 stars
  • 100.0% runs succeeded
  • 21 hours response time
  • Created in Jul 2022
  • Modified 2 months ago