Playwright Scraper
apify/playwright-scraper
Crawls websites with a headless Chromium, Chrome, or Firefox browser and the Playwright library, using server-side Node.js code that you provide. Supports both recursive crawling and a fixed list of URLs, and can log in to a website.
The code examples below show how to run the Actor and get its results. To run the code, you need an Apify account. Replace <YOUR_API_TOKEN> in the code with your API token, which you can find under Settings > Integrations in Apify Console.
```shell
# Set API token (quoted so the shell does not treat < and > as redirections)
API_TOKEN='<YOUR_API_TOKEN>'

# Prepare Actor input
# The quoted 'EOF' delimiter keeps backticks and ${...} in the JSON literal.
cat > input.json <<'EOF'
{
    "startUrls": [
        {
            "url": "https://crawlee.dev"
        }
    ],
    "globs": [
        {
            "glob": "https://crawlee.dev/*/*"
        }
    ],
    "pseudoUrls": [],
    "excludes": [
        {
            "glob": "/**/*.{png,jpg,jpeg,pdf}"
        }
    ],
    "linkSelector": "a",
    "pageFunction": "async function pageFunction(context) {\n const { page, request, log } = context;\n const title = await page.title();\n log.info(`URL: ${request.url} TITLE: ${title}`);\n return {\n url: request.url,\n title\n };\n}",
    "proxyConfiguration": {
        "useApifyProxy": true
    },
    "initialCookies": [],
    "launcher": "chromium",
    "waitUntil": "networkidle",
    "preNavigationHooks": "// We need to return array of (possibly async) functions here.\n// The functions accept two arguments: the \"crawlingContext\" object\n// and \"gotoOptions\".\n[\n async (crawlingContext, gotoOptions) => {\n const { page } = crawlingContext;\n // ...\n },\n]",
    "postNavigationHooks": "// We need to return array of (possibly async) functions here.\n// The functions accept a single argument: the \"crawlingContext\" object.\n[\n async (crawlingContext) => {\n const { page } = crawlingContext;\n // ...\n },\n]",
    "customData": {}
}
EOF

# Run the Actor using an HTTP API
# See the full API reference at https://docs.apify.com/api/v2
curl "https://api.apify.com/v2/acts/apify~playwright-scraper/runs?token=$API_TOKEN" \
    -X POST \
    -d @input.json \
    -H 'Content-Type: application/json'
```
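The request above only starts a run; the scraped results are stored separately and have to be fetched once the run finishes. The following is a minimal sketch of that follow-up step, not an official client. It assumes the run-start response contains the run ID in `data.id` and the default dataset ID in `data.defaultDatasetId`, and that you copy those values by hand. The function names (`run_url`, `dataset_items_url`, `get_run`, `get_items`) are hypothetical helpers introduced only for this illustration; the endpoints themselves are described in the API reference at https://docs.apify.com/api/v2.

```shell
#!/usr/bin/env bash
# Sketch: helpers for checking a run and downloading its results.
# Assumption: RUN_ID and DATASET_ID are copied from the run-start response
# (fields data.id and data.defaultDatasetId).

API_BASE="https://api.apify.com/v2"

# Build the URL of a run object. waitForFinish asks the API to hold the
# request for up to 60 seconds while the run is still in progress.
run_url() {
    printf '%s/actor-runs/%s?waitForFinish=60' "$API_BASE" "$1"
}

# Build the URL that returns the items of a dataset as JSON.
dataset_items_url() {
    printf '%s/datasets/%s/items?format=json' "$API_BASE" "$1"
}

# Fetch the run object (network call; needs API_TOKEN to be set).
# The token is sent in a header instead of the query string.
get_run() {
    curl -sS -H "Authorization: Bearer ${API_TOKEN:-}" "$(run_url "$1")"
}

# Fetch the scraped items (network call; needs API_TOKEN to be set).
get_items() {
    curl -sS -H "Authorization: Bearer ${API_TOKEN:-}" "$(dataset_items_url "$1")"
}

# Example usage, once the IDs are known:
#   get_run "$RUN_ID"          # repeat until the reported status is SUCCEEDED
#   get_items "$DATASET_ID" > results.json
```

Sending the token in the `Authorization` header keeps it out of shell history and server logs, which is why the sketch uses it instead of the `?token=` query parameter.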
Developer
Maintained by Apify
Actor metrics
- 56 monthly users
- 13 stars
- 99.3% runs succeeded
- 14 days response time
- Created in Aug 2022
- Modified 3 months ago
Categories