Crawls websites with headless Chromium, Chrome, or Firefox and the Playwright library, using provided server-side Node.js code. Supports both recursive crawling and lists of URLs, as well as logging in to a website.
Dismiss cookie modals (closeCookieModals): Uses the "I don't care about cookies" browser extension. When enabled, the crawler automatically tries to dismiss cookie consent modals. This can be useful when crawling European websites, which often show them.
Maximum scrolling distance in pixels (maxScrollHeightPixels): The crawler will scroll down the page until all content is loaded or the maximum scrolling distance is reached. Setting this to 0 disables scrolling altogether.
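The scrolling rule can be illustrated with a minimal sketch. It is not the crawler's actual implementation: the `ScrollablePage` interface, the `scrollDown` function, and the `stepPixels` default are hypothetical names introduced here for illustration, and the sketch is synchronous for brevity, whereas real browser automation would be asynchronous.

```typescript
// Hypothetical page abstraction for illustration only; this is not Playwright's API.
interface ScrollablePage {
  contentHeight(): number;   // current total document height in pixels
  scrollTo(y: number): void; // scroll to the given vertical offset
}

// Scrolls in steps until the bottom of the content is reached (nothing more
// has loaded) or maxScrollHeightPixels is hit. Returns the final offset.
function scrollDown(
  page: ScrollablePage,
  maxScrollHeightPixels: number,
  stepPixels = 250, // assumed step size, not taken from the source
): number {
  let scrolled = 0;
  // A limit of 0 disables scrolling altogether.
  if (maxScrollHeightPixels <= 0) return scrolled;

  while (scrolled < maxScrollHeightPixels) {
    // Re-read the height on every pass, since lazy loading may have extended the page.
    const height = page.contentHeight();
    if (scrolled >= height) break; // already at the bottom and no new content appeared
    scrolled = Math.min(scrolled + stepPixels, height, maxScrollHeightPixels);
    page.scrollTo(scrolled);
  }
  return scrolled;
}
```

Under these assumptions, a page that never grows beyond 1000 pixels stops the loop at offset 1000 even with a limit of 5000, while a very long page with a limit of 600 stops at offset 600.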
Exclude Glob Patterns (excludes): Glob patterns that match links on the page that you want to exclude from being enqueued.
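As a rough illustration of how exclude patterns filter links before enqueueing, the sketch below converts globs to regular expressions and drops any link that matches. It is a simplified stand-in, not the crawler's real matcher: `globToRegExp` and `filterLinks` are hypothetical helpers, only `**`, `*`, and `?` are supported, the patterns are assumed to be plain strings, and the example URLs are made up.

```typescript
// Simplified glob-to-RegExp conversion (assumption: only `**`, `*`, and `?` are special).
function globToRegExp(glob: string): RegExp {
  let pattern = "";
  for (let i = 0; i < glob.length; i++) {
    const ch = glob.charAt(i);
    if (ch === "*") {
      if (glob.charAt(i + 1) === "*") {
        pattern += ".*"; // `**` matches any characters, including `/`
        i++;             // skip the second `*`
      } else {
        pattern += "[^/]*"; // `*` matches within a single path segment
      }
    } else if (ch === "?") {
      pattern += "[^/]"; // `?` matches exactly one non-slash character
    } else {
      // Escape characters that are special in regular expressions.
      pattern += ch.replace(/[.+^${}()|[\]\\]/g, "\\$&");
    }
  }
  return new RegExp(`^${pattern}$`);
}

// Keeps only the links that match none of the exclude globs.
function filterLinks(links: string[], excludes: string[]): string[] {
  const regexes = excludes.map(globToRegExp);
  return links.filter((link) => !regexes.some((re) => re.test(link)));
}
```

For example, with the excludes `https://example.com/login` and `https://example.com/**/*.png`, the links `https://example.com/login` and `https://example.com/assets/logo.png` would be dropped, while `https://example.com/docs/intro` would still be enqueued. Note that in this sketch a literal `?` in a URL query string acts as a wildcard, which a production matcher may handle differently.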