Playwright Scraper

apify/playwright-scraper

Crawls websites with headless Chromium, Chrome, or Firefox and the Playwright library, running server-side Node.js code that you provide. Supports both recursive crawling and a fixed list of URLs, and can log in to a website.

The code examples below show how to run the Actor and get its results. To run the code, you need an Apify account. Replace <YOUR_API_TOKEN> in the code with your API token, which you can find under Settings > Integrations in Apify Console.
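Instead of pasting the token into the source file, it can be read from the environment. The following is a minimal sketch, not part of the original page: the helper name `getApifyToken` is hypothetical, and `APIFY_TOKEN` is an assumed variable name that you would export yourself before running the script.

```javascript
// Hypothetical helper: resolve the API token from an environment-like object.
// APIFY_TOKEN is an assumed variable name; adjust it to match your own setup.
function getApifyToken(env = process.env) {
    const token = env.APIFY_TOKEN;
    if (typeof token !== 'string' || token.trim() === '') {
        // Fail early with a clear message rather than sending an empty token.
        throw new Error('APIFY_TOKEN is not set; export it before running the script.');
    }
    return token.trim();
}

// Usage (not executed here; requires the apify-client package):
// const client = new ApifyClient({ token: getApifyToken() });
```

Failing before any request is made keeps a missing token from surfacing later as a less obvious authentication error.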

Node.js

import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "startUrls": [
        {
            "url": "https://crawlee.dev"
        }
    ],
    "globs": [
        {
            "glob": "https://crawlee.dev/*/*"
        }
    ],
    "pseudoUrls": [],
    "excludes": [
        {
            "glob": "/**/*.{png,jpg,jpeg,pdf}"
        }
    ],
    "linkSelector": "a",
    "pageFunction": async function pageFunction(context) {
        const { page, request, log } = context;
        const title = await page.title();
        log.info(`URL: ${request.url} TITLE: ${title}`);
        return {
            url: request.url,
            title
        };
    },
    "proxyConfiguration": {
        "useApifyProxy": true
    },
    "initialCookies": [],
    "launcher": "chromium",
    "waitUntil": "networkidle",
    "preNavigationHooks": `// We need to return array of (possibly async) functions here.
        // The functions accept two arguments: the "crawlingContext" object
        // and "gotoOptions".
        [
            async (crawlingContext, gotoOptions) => {
                const { page } = crawlingContext;
                // ...
            },
        ]`,
    "postNavigationHooks": `// We need to return array of (possibly async) functions here.
        // The functions accept a single argument: the "crawlingContext" object.
        [
            async (crawlingContext) => {
                const { page } = crawlingContext;
                // ...
            },
        ]`,
    "customData": {}
};

(async () => {
    // Run the Actor and wait for it to finish
    const run = await client.actor("apify/playwright-scraper").call(input);

    // Fetch and print Actor results from the run's dataset (if any)
    console.log('Results from dataset');
    const { items } = await client.dataset(run.defaultDatasetId).listItems();
    items.forEach((item) => {
        console.dir(item);
    });
})();
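
The input above crawls recursively: it follows every link matching `linkSelector` whose URL matches one of the `globs`. The Actor description also mentions crawling a plain list of URLs. The sketch below shows one way to switch between the two modes. The helper `withCrawlMode` is hypothetical, and it assumes that an empty `linkSelector` makes the scraper ignore page links and load only the start URLs; check the Actor's input documentation before relying on that behavior.

```javascript
// Hypothetical helper: derive a new Actor input for one of the two crawl modes.
// It copies baseInput (for example, the `input` object above) without mutating it.
function withCrawlMode(baseInput, urls, { followLinks = false, glob } = {}) {
    return {
        ...baseInput,
        // Each start URL uses the same { url } shape as in the example above.
        startUrls: urls.map((url) => ({ url })),
        // Assumption: an empty selector means no links are enqueued from pages.
        linkSelector: followLinks ? 'a' : '',
        // Globs only matter when links are followed.
        globs: followLinks && glob ? [{ glob }] : [],
    };
}

// List mode: visit exactly these pages.
// const listInput = withCrawlMode(input, ['https://crawlee.dev', 'https://crawlee.dev/docs']);

// Recursive mode: follow links that match the glob.
// const crawlInput = withCrawlMode(input, ['https://crawlee.dev'], {
//     followLinks: true,
//     glob: 'https://crawlee.dev/*/*',
// });
```

Either result can be passed to `client.actor("apify/playwright-scraper").call(...)` in place of `input`.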
Developer
Maintained by Apify
Actor metrics
  • 59 monthly users
  • 94.5% runs succeeded
  • 5.9 days response time
  • Created in Aug 2022
  • Modified 4 months ago