Playwright Scraper

apify/playwright-scraper

Crawls websites with a headless Chromium, Chrome, or Firefox browser and the Playwright library, using server-side Node.js code that you provide. Supports both recursive crawling and a fixed list of URLs, and can log in to a website.

The code example below shows how to run the Actor through the Apify API. To run it, you need an Apify account. Replace <YOUR_API_TOKEN> in the code with your API token, which you can find under Settings > Integrations in Apify Console.

# Set API token
API_TOKEN='<YOUR_API_TOKEN>'

# Prepare Actor input
cat > input.json <<'EOF'
{
  "startUrls": [
    {
      "url": "https://crawlee.dev"
    }
  ],
  "globs": [
    {
      "glob": "https://crawlee.dev/*/*"
    }
  ],
  "pseudoUrls": [],
  "excludes": [
    {
      "glob": "/**/*.{png,jpg,jpeg,pdf}"
    }
  ],
  "linkSelector": "a",
  "pageFunction": "async function pageFunction(context) {\n    const { page, request, log } = context;\n    const title = await page.title();\n    log.info(`URL: ${request.url} TITLE: ${title}`);\n    return {\n        url: request.url,\n        title\n    };\n}",
  "proxyConfiguration": {
    "useApifyProxy": true
  },
  "initialCookies": [],
  "launcher": "chromium",
  "waitUntil": "networkidle",
  "preNavigationHooks": "// We need to return array of (possibly async) functions here.\n// The functions accept two arguments: the \"crawlingContext\" object\n// and \"gotoOptions\".\n[\n    async (crawlingContext, gotoOptions) => {\n        const { page } = crawlingContext;\n        // ...\n    },\n]",
  "postNavigationHooks": "// We need to return array of (possibly async) functions here.\n// The functions accept a single argument: the \"crawlingContext\" object.\n[\n    async (crawlingContext) => {\n        const { page } = crawlingContext;\n        // ...\n    },\n]",
  "customData": {}
}
EOF

# Run the Actor
curl "https://api.apify.com/v2/acts/apify~playwright-scraper/runs?token=$API_TOKEN" \
  -X POST \
  -d @input.json \
  -H 'Content-Type: application/json'
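
The request above only starts a run; the scraped items are stored in the run's dataset and have to be fetched separately. The following is a minimal sketch of that second step, assuming the run has already finished and that the Apify API's "last run" dataset endpoint fits your use case. The helper names `last_run_items_url` and `fetch_last_run_items` are hypothetical and exist only for this illustration.

```shell
# Actor ID in the tilde form used by the API URL above
ACTOR_ID='apify~playwright-scraper'

# Hypothetical helper: build the URL for the dataset items of the Actor's most recent run.
# Arguments: $1 = Actor ID, $2 = API token
last_run_items_url() {
  printf 'https://api.apify.com/v2/acts/%s/runs/last/dataset/items?token=%s&format=json' "$1" "$2"
}

# Hypothetical helper: download the items as JSON.
# Needs network access and a valid token; --fail makes curl exit non-zero on HTTP errors.
# Argument: $1 = API token
fetch_last_run_items() {
  curl --fail --silent --show-error "$(last_run_items_url "$ACTOR_ID" "$1")"
}
```

Once the run has finished, `fetch_last_run_items "$API_TOKEN" > results.json` would save the items to a file. With the pageFunction from the input above, each item should carry the `url` and `title` fields that the function returns.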

Developer
Maintained by Apify

Actor stats
  • 453 users
  • 32.1k runs
  • Modified 14 days ago
