Open Source Actors Scraper
Get all open-source Actors from Apify Store.
TypeScript Crawlee & CheerioCrawler template
A template example built with Crawlee to scrape data from a website using Cheerio wrapped into CheerioCrawler.
Included features
- Apify SDK - toolkit for building Actors
- Crawlee - web scraping and browser automation library
- Input schema - define and easily validate a schema for your Actor's input
- Dataset - store structured data where each object stored has the same attributes
- Cheerio - a fast, flexible & elegant library for parsing and manipulating HTML and XML
How it works
This code is a TypeScript script that uses Crawlee's CheerioCrawler to crawl a website and extract data from the crawled URLs with Cheerio. It then stores the page titles and URLs in a dataset.
- The crawler starts with the URLs provided in the startUrls input field, which is defined by the input schema. The number of scraped pages is limited by the maxPagesPerCrawl field from the input schema.
- The crawler runs requestHandler for each URL to extract data from the page with the Cheerio library and to save the title and URL of each page to the dataset. It also logs each result as it is saved.
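The flow described above can be sketched without any dependencies. This is a minimal illustration, not the template's actual source: PageContext is a hypothetical stand-in for the context object a Cheerio-based crawler hands to requestHandler (it only assumes a request with a URL and a Cheerio-style $ selector function), and collectRecords and fakePage are illustrative helpers. In a real Actor, Crawlee would fetch the pages and enforce the page limit itself, and the records would be pushed to a dataset instead of an array.

```typescript
// Hypothetical stand-in for the context passed to requestHandler.
// It is NOT Crawlee's own type; it only models the two members used here.
interface PageContext {
    request: { url: string; loadedUrl?: string };
    $: (selector: string) => { text(): string };
}

// Shape of one saved item: the page URL and its title.
interface PageRecord {
    url: string;
    title: string;
}

// Builds the record that would be saved for a single page.
function extractRecord({ request, $ }: PageContext): PageRecord {
    const title = $('title').text().trim();
    // Prefer the final URL after redirects when it is known.
    return { url: request.loadedUrl ?? request.url, title };
}

// Processes at most maxPagesPerCrawl pages, logging and collecting each result.
function collectRecords(pages: PageContext[], maxPagesPerCrawl: number): PageRecord[] {
    const records: PageRecord[] = [];
    for (const page of pages.slice(0, maxPagesPerCrawl)) {
        const record = extractRecord(page);
        console.log(`${record.title} (${record.url})`);
        records.push(record);
    }
    return records;
}

// Illustrative helper: fakes a loaded page so the sketch runs offline.
const fakePage = (url: string, title: string): PageContext => ({
    request: { url },
    $: (selector: string) => ({ text: () => (selector === 'title' ? title : '') }),
});

const saved = collectRecords(
    [fakePage('https://example.com/', 'Home'), fakePage('https://example.com/about', 'About')],
    1,
);
console.log(saved.length);
```

With a limit of 1, only the first fake page is processed, which mirrors how maxPagesPerCrawl caps the number of scraped pages.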
Resources
- Video tutorial on building a scraper using CheerioCrawler
- Written tutorial on building a scraper using CheerioCrawler
- Web scraping with Cheerio in 2023
- How to scrape a dynamic page using Cheerio
- TypeScript vs. JavaScript: which to use for web scraping?
- Integration with Zapier, Make, Google Drive and others
- Video guide on getting scraped data using Apify API
- A short guide on how to build web scrapers using code templates
Getting started
For complete information, see this article. To run the Actor, use the following command:
apify run
Deploy to Apify
Connect Git repository to Apify
If you've created a Git repository for the project, you can easily connect it to Apify:
- Go to the Actor creation page
- Click the Link Git Repository button
Push project on your local machine to Apify
You can also deploy the project from your local machine to Apify without a Git repository.
- Log in to Apify. You will need to provide your Apify API Token to complete this action.
  apify login
- Deploy your Actor. This command will deploy and build the Actor on the Apify Platform. You can find your newly created Actor under Actors -> My Actors.
  apify push
Documentation reference
To learn more about Apify and Actors, take a look at the following resources:
Actor Metrics
- 1 monthly user
- No stars yet
- >99% runs succeeded
- Created in Sep 2024
- Modified 11 days ago