Playwright Scraper
apify/playwright-scraper
Crawls websites with a headless Chromium, Chrome, or Firefox browser and the Playwright library, using server-side Node.js code that you provide. Supports both recursive crawling and lists of URLs. Supports logging in to a website.
When running the Playwright Scraper Actor with a startUrl that points to a sitemap XML, I get the following error:
2024-10-19T14:54:41.339Z DEBUG PlaywrightCrawler:AutoscaledPool: scaling up {"oldConcurrency":2,"newConcurrency":3,"systemStatus":{"isSystemIdle":true,"memInfo":{"isOverloaded":false,"limitRatio":0.2,"actualRatio":0},"eventLoopInfo":{"isOverloaded":false,"limitRatio":0.6,"actualRatio":0},"cpuInfo":{"isOverloaded":false,"limitRatio":0.4,"actualRatio":0},"clientInfo":{"isOverloaded":false,"limitRatio":0.3,"actualRatio":0}}}
2024-10-19T14:54:49.152Z WARN PlaywrightCrawler: Reclaiming failed request back to the list or queue. page.evaluate: document.body is null
2024-10-19T14:54:49.153Z @debugger eval code line 226 > eval:1:7
2024-10-19T14:54:49.154Z evaluate@debugger eval code:228:17
2024-10-19T14:54:49.155Z @debugger eval code:1:44
2024-10-19T14:54:49.155Z
2024-10-19T14:54:49.156Z     at CrawlerSetup._requestHandler (/home/myuser/dist/internals/crawler_setup.js:379:35) {"id":"vwv0onJJ2YlCPdo","url":"https://apify.com/sitemap.xml","retryCount":1}
The request seems to fail before the pageFunction itself is reached. For reference, this is the pageFunction that was used:
async function pageFunction(context) {
    const { page, request, log } = context;

    async function pageEvaluate(context) {
        return {
            url: document.URL,
            html: document.body?.innerHTML ?? document.querySelector('urlset')?.innerHTML,
        };
    }

    let data = await page.evaluate(pageEva... [trimmed]
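For illustration only, here is a minimal sketch of one way to read a sitemap without relying on document.body, which is what the error above reports as null. It assumes the raw sitemap XML is already available as a string (for example from a plain HTTP request). The helper name extractSitemapUrls and the sample XML are hypothetical and are not part of Playwright Scraper. It has not been confirmed as a fix for this issue.

```javascript
// Hypothetical helper (not part of Playwright Scraper or Crawlee):
// collects the contents of every <loc> element in a raw sitemap XML string.
function extractSitemapUrls(xml) {
    const urls = [];
    // Match <loc>...</loc>, ignoring whitespace around the URL.
    const locPattern = /<loc>\s*([^<]*?)\s*<\/loc>/gi;
    for (const match of xml.matchAll(locPattern)) {
        // Sitemaps escape "&" as "&amp;"; undo that so the URL is usable.
        urls.push(match[1].replace(/&amp;/g, '&'));
    }
    return urls;
}

// Made-up sample input, used only to show the call.
const sampleXml = `
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/search?a=1&amp;b=2</loc></url>
</urlset>`;

console.log(extractSitemapUrls(sampleXml));
```

The regular expression is a deliberate simplification: it does not handle CDATA sections or entities other than &amp;, so a proper XML parser would be the safer choice for real sitemaps.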
Developer
Maintained by Apify
Created in Aug 2022
Modified 6 months ago