Doctolib

3 days trial then $50.00/month - No credit card required now

anchor/doctolib

Scraping Doctolib is now super easy! By default, you get names, contact details, opening hours, and addresses. Best of all, you can customize which information to extract!

The code examples below show how to run the Actor and get its results. To run the code, you need an Apify account. Replace <YOUR_API_TOKEN> in the code with your API token, which you can find under Settings > Integrations in Apify Console.
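
If you would rather not paste the token into source code, one option is to read it from the environment. The sketch below is only an illustration: the helper name readApiToken and the variable name APIFY_TOKEN are assumptions, not something this page prescribes.

```javascript
// Hypothetical helper (not part of the Actor or of apify-client):
// reads the API token from an environment-like object and fails early
// when it is missing or still set to the placeholder.
function readApiToken(env = process.env) {
    const token = env.APIFY_TOKEN; // variable name is an assumption
    if (!token || token === '<YOUR_API_TOKEN>') {
        throw new Error('Set the APIFY_TOKEN environment variable to your Apify API token.');
    }
    return token;
}
```

The returned value can then be passed as the token option when creating the client, in place of the hard-coded placeholder.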

Node.js


import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your API token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "startUrls": [
        {
            "url": "https://www.doctolib.fr/infectiologue/75001-paris"
        }
    ],
    "pageFunction": async function pageFunction(context) {
        const data = {};
        // Fall back to an empty object so the label checks below never hit undefined
        const userData = context.request.userData || {};
        data.url = context.request.url;
        data.label = userData.label;
        // 'product' here refers to a doctor page. The naming is misleading, my apologies;
        // it's here because I usually work with marketplaces.
        if (userData.label === 'product') {
            context.log.info('label product.');
            // data.img = await context.page.locator('[data-qa-id=adview_spotlight_container] img >> nth=0').getAttribute('src')
            data.nom = await context.page.locator('#main-content h1').innerText({ timeout: 6000 });
            // Optional fields: the first one that times out skips the rest of this block
            try {
                data.tarif = await context.page.locator('#payment_means').first().innerText({ timeout: 3000 });
                data.horaire_contact = await context.page.locator('#openings_and_contact').first().innerText({ timeout: 3000 });
                data.description = await context.page.locator('.dl-profile-bio').first().innerText({ timeout: 3000 });
                data.specialite = await context.page.locator('.dl-profile-header-speciality').first().innerText({ timeout: 3000 });
                data.expertise = await context.page.locator('#skills').first().innerText({ timeout: 3000 });
                data.phones = await context.getPhones(data.horaire_contact);
            } catch (e) {
                context.log.info('not found', e);
            }
        } else {
            context.log.info('we are not on a doctor page, so this is a search or pagination page.');
            // Queue every doctor page (called "product" here) linked from this page
            userData.label = 'product';
            const elements = context.page.locator('.dl-search-result-presentation a[href]');
            const links = await elements.evaluateAll(elems => elems.map(elem => elem.getAttribute('href')));
            // Use for...of so each enqueue is awaited; forEach does not wait for async callbacks
            for (const link of links) {
                const url = link.startsWith('/') ? 'https://www.doctolib.fr' + link : link;
                await context.enqueueRequest(url, userData, false);
            }
        }
        context.log.info('function ended');
        // The label is only used for routing, so drop it from the output
        delete data.label;
        return data;
    },
    "pseudoUrls": [
        {
            "purl": "https://www.doctolib.fr/[.*]"
        }
    ]
};

(async () => {
    // Run the Actor and wait for it to finish
    const run = await client.actor("anchor/doctolib").call(input);

    // Fetch and print Actor results from the run's dataset (if any)
    console.log('Results from dataset');
    const { items } = await client.dataset(run.defaultDatasetId).listItems();
    items.forEach((item) => {
        console.dir(item);
    });
})();

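
The pageFunction above turns relative result links into absolute ones by prefixing the site origin by hand. As a rough sketch of the same step, the standard URL constructor can resolve relative hrefs and leave absolute ones unchanged; toAbsoluteUrl, uniqueAbsoluteUrls, and BASE_URL are hypothetical names introduced here for illustration and are not part of the Actor.

```javascript
// Assumed base origin, taken from the start URL in the example input.
const BASE_URL = 'https://www.doctolib.fr';

// Hypothetical helper: resolve an href against the base origin.
// Relative paths are joined to the base; absolute URLs pass through.
function toAbsoluteUrl(href, base = BASE_URL) {
    return new URL(href, base).href;
}

// Hypothetical helper: resolve a list of hrefs and drop duplicates,
// keeping the order of first appearance.
function uniqueAbsoluteUrls(hrefs, base = BASE_URL) {
    return [...new Set(hrefs.map((href) => toAbsoluteUrl(href, base)))];
}
```

Inside the pageFunction, the list returned by evaluateAll could be passed through uniqueAbsoluteUrls before enqueuing, which may also avoid queuing the same doctor page twice when a search result links to it more than once.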
Developer
Maintained by Community
Actor metrics
  • 1 monthly user
  • 100.0% runs succeeded
  • Modified 8 months ago
