Doctolib

3 days trial then $50.00/month - No credit card required now

anchor/doctolib

Scraping Doctolib is now super easy! By default, you get names, contact details, opening hours and addresses. Best part: you can customize what info to extract from the app!

The code examples below show how to run the Actor and get its results. To run the code, you need an Apify account. Replace <YOUR_API_TOKEN> in the code with your API token, which you can find under Settings > Integrations in Apify Console.
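
If you would rather not paste the token into the script itself, one option is to read it from an environment variable. The snippet below is only a minimal sketch: the helper name load_api_token and the variable name APIFY_TOKEN are assumptions made for this example, not something this Actor requires.

```python
import os


def load_api_token(env_var: str = "APIFY_TOKEN") -> str:
    """Return the API token stored in the given environment variable.

    Both the helper and the default variable name are hypothetical;
    use whatever name fits your own setup.
    """
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(
            f"Environment variable {env_var} is not set; "
            "copy your token from Settings > Integrations in Apify Console."
        )
    return token
```

The returned value can then be passed to the client in place of the "<YOUR_API_TOKEN>" placeholder.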

Python

from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "startUrls": [{ "url": "https://www.doctolib.fr/infectiologue/75001-paris" }],
    "pageFunction": """async function pageFunction(context) {
    let data = {}
    let userData = context.request.userData
    data.url = context.request.url
    data.label = userData.label
    // "product" here refers to a doctor page. The naming is misleading, my apologies.
    // It's here because I usually work with marketplaces.
    if(userData && userData.label === 'product'){
        context.log.info('label product.');
        // data.img = await context.page.locator('[data-qa-id=adview_spotlight_container] img >> nth=0').getAttribute('src')
        data.nom = await context.page.locator('#main-content h1').innerText({timeout:6000})
        try{
            data.tarif = await context.page.locator('#payment_means').first().innerText({timeout:3000})
            data.horaire_contact = await context.page.locator('#openings_and_contact').first().innerText({timeout:3000})
            data.description = await context.page.locator('.dl-profile-bio').first().innerText({timeout:3000})
            data.specialite = await context.page.locator('.dl-profile-header-speciality').first().innerText({timeout:3000})
            data.expertise = await context.page.locator('#skills').first().innerText({timeout:3000})
            data.phones = await context.getPhones(data.horaire_contact)
        }catch(e){
            // Optional sections are missing on some profiles; keep what was collected so far
            context.log.info('not found',e);
        }

    }else{
        context.log.info('we are not on a doctor page, so a search or pagination page.');
        // we are looking for \"doctors\" (called \"product\" here) to be queued, let's write it down
        userData.label = 'product';
        const elements = context.page.locator('.dl-search-result-presentation a[href]');
        const links = await elements.evaluateAll(elems => elems.map(elem => elem.getAttribute('href')));
        // for...of (not forEach) so that every enqueue is awaited before the function returns
        for (let link of links) {
            if(link.startsWith('/')){ link = 'https://www.doctolib.fr' + link }
            await context.enqueueRequest(link, userData , false);
        }
    }
    context.log.info(`function ended`);
    delete data.label
    return data;
}
""",
    "pseudoUrls": [{ "purl": "https://www.doctolib.fr/[.*]" }],
}

# Run the Actor and wait for it to finish
run = client.actor("anchor/doctolib").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)
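
Printing the items is enough for a quick check, but you will often want a file you can open in a spreadsheet. Here is a minimal sketch that turns the items into CSV text. The column names are the fields set by the default pageFunction; the helper name items_to_csv is hypothetical, and the shape of the phones value (a list or a plain string) is an assumption, so both are handled.

```python
import csv
import io

# Fields set by the default pageFunction. Only "url" is always present:
# search and pagination pages return just the URL, and optional sections
# of a doctor page may be missing.
FIELDS = ["url", "nom", "specialite", "tarif", "horaire_contact",
          "description", "expertise", "phones"]


def items_to_csv(items, skip_without_name=True):
    """Serialize dataset items (dicts) to CSV text (hypothetical helper)."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, extrasaction="ignore")
    writer.writeheader()
    for item in items:
        # Items without "nom" come from search pages, not doctor pages.
        if skip_without_name and not item.get("nom"):
            continue
        row = {}
        for field in FIELDS:
            value = item.get(field)
            if isinstance(value, (list, tuple)):
                # Assumption: list-like values (e.g. phones) are joined.
                value = "; ".join(str(v) for v in value)
            row[field] = "" if value is None else value
        writer.writerow(row)
    return buffer.getvalue()
```

With the client code above, the call would look like items_to_csv(client.dataset(run["defaultDatasetId"]).iterate_items()), and the returned text can be written to a .csv file.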
Developer
Maintained by Community
Actor metrics
  • 1 monthly user
  • 100.0% runs succeeded
  • Created in Jul 2022
  • Modified 8 months ago