Empty TypeScript project
An empty template with a basic Actor structure built on the Apify SDK, ready for you to add your own functionality.
src/main.ts
// Apify SDK - toolkit for building Apify Actors (Read more at https://docs.apify.com/sdk/js/)
import { Actor } from 'apify';
// Crawlee - web scraping and browser automation library (Read more at https://crawlee.dev)
// import { CheerioCrawler } from 'crawlee';

// This is an ESM project, so relative imports must specify file extensions.
// Read more: https://nodejs.org/docs/latest-v18.x/api/esm.html#mandatory-file-extensions
// Note that the extension must be `.js`, even inside TS files.
// import { router } from './routes.js';

// The init() call configures the Actor for its environment. It's recommended to start every Actor with an init()
await Actor.init();

console.log('Hello from the Actor!');
/**
 * Actor code
 */

// Gracefully exit the Actor process. It's recommended to quit all Actors with an exit()
await Actor.exit();
Empty TypeScript template
Start a new web scraping project quickly and easily in TypeScript (Node.js) with our empty project template. It provides a basic Actor structure built on the Apify SDK, so you can easily add your own functionality.
Included features
How it works
Insert your own code between `await Actor.init()` and `await Actor.exit()`. If you would like to use the Crawlee library, simply uncomment its import: `import { CheerioCrawler } from 'crawlee';`.
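As a rough illustration of where your own code goes, here is a minimal sketch of an Actor body that reads its input and stores one result. It is not part of the template: the `Input` shape, `buildGreeting`, and `runActor` are hypothetical names, and the `ActorLike` interface is an assumption that covers only the four `Actor` methods used, so the sketch can be type-checked and exercised without the `apify` package installed.

```typescript
// Structural type for the Actor methods this sketch relies on.
// Assumption: the `Actor` class exported by 'apify' satisfies it.
interface ActorLike {
    init(): Promise<void>;
    getInput(): Promise<unknown>;
    pushData(item: Record<string, unknown>): Promise<void>;
    exit(): Promise<void>;
}

// Hypothetical input shape; the empty template does not define one.
interface Input {
    name?: string;
}

// Hypothetical helper standing in for "your own functionality".
function buildGreeting(input: Input | null): string {
    return `Hello, ${input?.name ?? 'world'}!`;
}

async function runActor(actor: ActorLike): Promise<void> {
    // Configure the Actor for its environment first.
    await actor.init();

    // Your own code lives here, between init() and exit().
    const input = (await actor.getInput()) as Input | null;
    await actor.pushData({ greeting: buildGreeting(input) });

    // Gracefully finish the run.
    await actor.exit();
}
```

In `src/main.ts` this would presumably be wired up with `import { Actor } from 'apify';` followed by `await runActor(Actor);`. Passing the Actor in as a parameter is only a design choice made here for easier testing; calling `Actor.init()` and `Actor.exit()` directly, as the template does, works the same way.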
Resources
- TypeScript vs. JavaScript: which to use for web scraping?
- Node.js tutorials in Academy
- Video guide on getting scraped data using Apify API
- Integration with Airbyte, Make, Zapier, Google Drive, and other apps
- A short guide on how to build web scrapers using code templates
Scrape a single page at a provided URL with Axios and extract data from the page's HTML with Cheerio.
A scraper example that uses Cheerio to parse HTML. It's fast, but it can't run the website's JavaScript or pass JS anti-scraping challenges.
Example of a Puppeteer and headless Chrome web scraper. Headless browsers render JavaScript and are harder to block, but they're slower than plain HTTP.
Web scraper example with Crawlee, Playwright and headless Chrome. Playwright is more modern, user-friendly and harder to block than Puppeteer.
Example of using the Playwright Test project to run automated website tests in the cloud and display their results. Usable as an API.
Template with basic structure for an Actor using Standby mode that allows you to easily add your own functionality.