Crawlee + Playwright + Camoufox

Web scraper example with Crawlee, Playwright and headless Camoufox. Camoufox is a custom stealthy fork of Firefox. Try this template if you're facing anti-scraping challenges.

Language: TypeScript

Tools: Node.js, Crawlee, Playwright

Use cases: Web scraping

src/main.ts

/**
 * This template is a production-ready boilerplate for developing with `PlaywrightCrawler`.
 * Use this to bootstrap your projects using the most up-to-date code.
 * If you're looking for examples or want to learn more, see the README.
 */

// For more information, see https://docs.apify.com/sdk/js
import { Actor } from 'apify';
// For more information, see https://crawlee.dev
import { PlaywrightCrawler } from 'crawlee';
// This is an ESM project, and as such, it requires you to specify extensions in your relative imports.
// Read more about this here: https://nodejs.org/docs/latest-v18.x/api/esm.html#mandatory-file-extensions
// Note that we need to use `.js` even when inside TS files.
import { router } from './routes.js';
import { firefox } from 'playwright';
import { launchOptions as camoufoxLaunchOptions } from 'camoufox-js';

interface Input {
    startUrls: string[];
    maxRequestsPerCrawl: number;
}

// Initialize the Apify SDK
await Actor.init();

// Structure of input is defined in input_schema.json
const {
    startUrls = ['https://crawlee.dev'],
    maxRequestsPerCrawl = 100,
} = await Actor.getInput<Input>() ?? {} as Input;

const proxyConfiguration = await Actor.createProxyConfiguration();

const crawler = new PlaywrightCrawler({
    proxyConfiguration,
    maxRequestsPerCrawl,
    requestHandler: router,
    launchContext: {
        // Camoufox is a fork of Firefox, so it is started through Playwright's `firefox` launcher
        launcher: firefox,
        launchOptions: await camoufoxLaunchOptions({
            headless: true,
            // fonts: ['Times New Roman'] // <- custom Camoufox options
        }),
    },
});

await crawler.run(startUrls);

// Exit successfully
await Actor.exit();
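
The input handling in the listing above uses two layers of fallback: `?? {}` covers the case where no input object was provided at all, and the destructuring defaults cover individual missing fields. A minimal sketch of that pattern as a standalone function might look like the following. `resolveInput` is a hypothetical helper introduced only for illustration; it is not part of the template or the Apify SDK.

```typescript
// Same shape as the `Input` interface in src/main.ts.
interface Input {
    startUrls: string[];
    maxRequestsPerCrawl: number;
}

// Hypothetical helper (not part of the template): applies the same
// defaults as main.ts to a raw input value that may be null.
function resolveInput(raw: Partial<Input> | null): Input {
    // `?? {}` handles a missing input object.
    const source: Partial<Input> = raw ?? {};
    // Destructuring defaults handle fields that are undefined.
    const {
        startUrls = ['https://crawlee.dev'],
        maxRequestsPerCrawl = 100,
    } = source;
    return { startUrls, maxRequestsPerCrawl };
}

// Usage: no input at all, then input with only one field set.
console.log(resolveInput(null));
console.log(resolveInput({ startUrls: ['https://example.com'] }));
```

Destructuring defaults apply only when a field is `undefined`, so a field explicitly set to `null` would pass through unchanged in this sketch.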

PlaywrightCrawler template

This template is a production-ready boilerplate for developing an Actor with PlaywrightCrawler. It comes with Camoufox, a stealthy fork of Firefox, preinstalled. Note that Camoufox might consume more resources than the default Playwright-bundled Chromium or Firefox.
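
The template's src/main.ts passes a `router` imported from `./routes.js` as the crawler's `requestHandler`; the contents of that file are not shown on this page. In Crawlee, such a router is typically created with `createPlaywrightRouter()` and populated through `addDefaultHandler` and `addHandler`. To illustrate the idea of label-based dispatch without depending on Crawlee, here is a self-contained sketch. `SketchRouter`, `SketchRequest`, and `SketchHandler` are hypothetical names, and this is not Crawlee's implementation.

```typescript
// Minimal stand-in for a crawl request; the real context object that
// PlaywrightCrawler passes to handlers carries much more (page, log, ...).
interface SketchRequest {
    url: string;
    label?: string;
}

type SketchHandler = (request: SketchRequest) => Promise<void>;

// Hypothetical mini-router: picks a handler by `request.label` and
// falls back to the default handler when no label matches.
class SketchRouter {
    private readonly handlers = new Map<string, SketchHandler>();
    private defaultHandler?: SketchHandler;

    addHandler(label: string, handler: SketchHandler): void {
        this.handlers.set(label, handler);
    }

    addDefaultHandler(handler: SketchHandler): void {
        this.defaultHandler = handler;
    }

    // Look up the handler for a request without running it.
    resolve(request: SketchRequest): SketchHandler {
        const labeled = request.label === undefined
            ? undefined
            : this.handlers.get(request.label);
        const handler = labeled ?? this.defaultHandler;
        if (!handler) {
            throw new Error(`No handler for label: ${request.label ?? '(none)'}`);
        }
        return handler;
    }

    // Resolve the handler and run it.
    async handle(request: SketchRequest): Promise<void> {
        await this.resolve(request)(request);
    }
}

// Usage: a start page goes to the default handler, a labeled page to 'detail'.
const visited: string[] = [];
const sketchRouter = new SketchRouter();
sketchRouter.addDefaultHandler(async ({ url }) => { visited.push(`default:${url}`); });
sketchRouter.addHandler('detail', async ({ url }) => { visited.push(`detail:${url}`); });

sketchRouter.handle({ url: 'https://crawlee.dev' })
    .then(() => sketchRouter.handle({ url: 'https://crawlee.dev/docs', label: 'detail' }))
    .then(() => console.log(visited));
```

Separating `resolve` from `handle` keeps the dispatch rule easy to check on its own, independent of what the handlers do with a page.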

Use this template to bootstrap your projects using the most up-to-date code.

We decided to split the Apify SDK into two libraries, Crawlee and Apify SDK v3. Crawlee retains all the crawling and scraping-related tools and will always strive to be the best web scraping library for its community. The Apify SDK continues to exist, but keeps only the Apify-specific features related to building Actors on the Apify platform. Read the upgrading guide to learn about the changes.

Resources

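If you're looking for examples or want to learn more, visit the Crawlee documentation (https://crawlee.dev) and the Apify SDK for JavaScript documentation (https://docs.apify.com/sdk/js).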
