TypeScript web scraping templates
Choose from multiple web scraping templates to quickly build web scrapers in TypeScript
Scrape a single page at a provided URL with Axios and extract data from the page's HTML with Cheerio.
A scraper example that uses Cheerio to parse HTML. It's fast, but it can't run the website's JavaScript or pass JS anti-scraping challenges.
Example of a Puppeteer and headless Chrome web scraper. Headless browsers render JavaScript and are harder to block, but they're slower than plain HTTP.
Web scraper example with Crawlee, Playwright and headless Chrome. Playwright is more modern, user-friendly and harder to block than Puppeteer.
Example of using the Playwright Test project to run automated website tests in the cloud and display their results. Usable as an API.
Empty template with a basic Actor structure built on the Apify SDK, ready for you to add your own functionality.
Template with a basic structure for an Actor that uses Standby mode, ready for you to add your own functionality.
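As a rough illustration of what a Standby-mode Actor boils down to, here is a minimal sketch of an HTTP server built only on Node's standard library. It is not the template's actual code. It assumes that the platform passes the listening port in the `ACTOR_WEB_SERVER_PORT` environment variable and marks readiness probes with the `x-apify-container-server-readiness-probe` header; check both against the current Apify documentation. The names `respond` and `startServer` are hypothetical.

```typescript
import { createServer } from 'node:http';

interface StandbyResponse {
    status: number;
    contentType: string;
    body: string;
}

// Assumption: readiness probes sent by the platform carry this header.
const READINESS_HEADER = 'x-apify-container-server-readiness-probe';

// Hypothetical pure handler: decides what to send back for a request.
// Replace the JSON branch with your own functionality.
function respond(
    url: string,
    headers: Record<string, string | string[] | undefined>,
): StandbyResponse {
    if (headers[READINESS_HEADER] !== undefined) {
        // Probes only need a quick 200 response.
        return { status: 200, contentType: 'text/plain', body: 'ready' };
    }
    return {
        status: 200,
        contentType: 'application/json',
        body: JSON.stringify({ message: 'Hello from Standby', path: url }),
    };
}

// Wrap the handler in a plain Node.js HTTP server.
function startServer(port: number) {
    return createServer((req, res) => {
        const { status, contentType, body } = respond(req.url ?? '/', req.headers);
        res.writeHead(status, { 'Content-Type': contentType });
        res.end(body);
    }).listen(port);
}

// Assumption: the port is provided through this environment variable.
// The server starts only when the variable is set.
const portFromEnv = process.env.ACTOR_WEB_SERVER_PORT;
if (portFromEnv) {
    startServer(Number(portFromEnv));
}
```

To try it locally, set `ACTOR_WEB_SERVER_PORT=8080` before running the file and send a request to `http://localhost:8080/`. Keeping `respond` free of I/O makes the request logic easy to unit test without opening a socket.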
Apify Universal Scrapers
Universal Scrapers provide you with a solid boilerplate to build fully functioning scrapers directly on the Apify platform. Configure and run your web scrapers manually in a user interface or programmatically via an API.
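Running a scraper programmatically via the API could look roughly like the sketch below. It assumes the Apify API v2 "run Actor" endpoint (`POST /v2/acts/{actorId}/runs`), a bearer token for authentication, and Node.js 18 or later for the built-in `fetch`. The helper names `buildRunUrl` and `startRun` are hypothetical, and the input shape depends on the Actor you run.

```typescript
const API_BASE = 'https://api.apify.com/v2';

// Build the endpoint URL for starting a run. In API paths, the slash in an
// Actor name such as "apify/cheerio-scraper" is written as a tilde.
function buildRunUrl(actorName: string): string {
    const actorId = actorName.replace('/', '~');
    return `${API_BASE}/acts/${actorId}/runs`;
}

// Start a run and return the parsed JSON response.
// `token` is your API token; `input` is the Actor-specific input object.
async function startRun(actorName: string, token: string, input: unknown): Promise<unknown> {
    const response = await fetch(buildRunUrl(actorName), {
        method: 'POST',
        headers: {
            'Content-Type': 'application/json',
            Authorization: `Bearer ${token}`,
        },
        body: JSON.stringify(input),
    });
    if (!response.ok) {
        throw new Error(`Run request failed with HTTP ${response.status}`);
    }
    return response.json();
}
```

A call might look like `startRun('apify/cheerio-scraper', token, input)`, where `input` follows the input schema of the chosen Actor. The official `apify-client` package wraps the same API and is usually the more convenient choice in real projects.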
Cheerio Scraper
apify/cheerio-scraper
Crawls websites using raw HTTP requests, parses the HTML with the Cheerio library, and extracts data from the pages using Node.js code. Supports both recursive crawling and lists of URLs. This Actor is a high-performance alternative to apify/web-scraper for websites that do not require JavaScript.
5.6k
79
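The difference between crawling a list of URLs and crawling recursively can be sketched as follows. This is only an illustration of the idea, not how Cheerio Scraper is implemented: links are pulled out with a regular expression instead of Cheerio, only same-origin links are followed, and the names `extractLinks`, `crawl`, and `FetchHtml` are hypothetical.

```typescript
// Function that downloads a page and returns its HTML.
type FetchHtml = (url: string) => Promise<string>;

// Simplification: collect href values with a regular expression
// rather than a real HTML parser such as Cheerio.
function extractLinks(html: string, baseUrl: string): string[] {
    const links: string[] = [];
    const pattern = /<a\s[^>]*href="([^"]+)"/gi;
    let match: RegExpExecArray | null;
    while ((match = pattern.exec(html)) !== null) {
        try {
            const url = new URL(match[1], baseUrl); // resolves relative links
            url.hash = '';
            links.push(url.href);
        } catch {
            // Ignore hrefs that cannot be resolved to a URL.
        }
    }
    return links;
}

// Visit the start URLs in order. When `recursive` is true, also enqueue
// same-origin links found on each page, up to `maxPages` pages in total.
async function crawl(
    startUrls: string[],
    fetchHtml: FetchHtml,
    recursive: boolean,
    maxPages = 50,
): Promise<string[]> {
    const queue = [...startUrls];
    const seen = new Set(queue);
    const visited: string[] = [];
    while (queue.length > 0 && visited.length < maxPages) {
        const url = queue.shift()!;
        const html = await fetchHtml(url);
        visited.push(url);
        if (!recursive) continue;
        for (const link of extractLinks(html, url)) {
            const sameOrigin = new URL(link).origin === new URL(url).origin;
            if (sameOrigin && !seen.has(link)) {
                seen.add(link);
                queue.push(link);
            }
        }
    }
    return visited;
}
```

On Node.js 18 or later, `(url) => fetch(url).then((res) => res.text())` can serve as the `fetchHtml` argument. Passing the fetcher in as a parameter keeps the crawl loop testable without network access.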
Web Scraper
apify/web-scraper
Crawls arbitrary websites using the Chrome browser and extracts data from pages using JavaScript code. The Actor supports both recursive crawling and lists of URLs and automatically manages concurrency for maximum performance. This is Apify's basic tool for web crawling and scraping.
71.5k
219
Puppeteer Scraper
apify/puppeteer-scraper
Crawls websites with headless Chrome and the Puppeteer library using provided server-side Node.js code. This crawler is an alternative to apify/web-scraper that gives you finer control over the process. Supports both recursive crawling and lists of URLs. Supports logging in to websites.
4.6k
63
Playwright Scraper
apify/playwright-scraper
Crawls websites with a headless Chromium, Chrome, or Firefox browser and the Playwright library using provided server-side Node.js code. Supports both recursive crawling and lists of URLs. Supports logging in to websites.
849
17
Vanilla JS Scraper
mstephen190/vanilla-js-scraper
Scrape the web using familiar JavaScript methods! Crawls websites using raw HTTP requests, parses the HTML with the JSDOM package, and extracts data from the pages using Node.js code. Supports both recursive crawling and lists of URLs. This Actor is a non-jQuery alternative to Cheerio Scraper.
423
3
BeautifulSoup Scraper
apify/beautifulsoup-scraper
Crawls websites using raw HTTP requests. It parses the HTML with the BeautifulSoup library and extracts data from the pages using Python code. Supports both recursive crawling and lists of URLs. This Actor is a Python alternative to Cheerio Scraper.
726
4