Cheerio Scraper

Crawls websites using raw HTTP requests, parses the HTML with the Cheerio library, and extracts data from the pages using Node.js code. Supports both recursive crawling and lists of URLs. This Actor is a high-performance alternative to apify/web-scraper for websites that do not require JavaScript.

Pricing: Pay per usage

Rating: 4.6 (32)

Developer: Apify

Actor stats

Bookmarked: 304

Total users: 18K

Monthly active users: 1.2K

Last modified: 12 days ago

Categories: Developer tools, Open source

You can access Cheerio Scraper programmatically from your own applications by using the Apify API, through any of the following interfaces: Python, JavaScript, CLI, OpenAPI, HTTP, or MCP. To use the Apify API, you'll need an Apify account and your API token, which you can find in the Integrations settings in Apify Console.

The example below uses the CLI.

cat <<'EOF' | apify call apify/cheerio-scraper --silent --output-dataset
{
  "startUrls": [
    {
      "url": "https://crawlee.dev/js"
    }
  ],
  "respectRobotsTxtFile": true,
  "globs": [
    {
      "glob": "https://crawlee.dev/js/*/*"
    }
  ],
  "pseudoUrls": [],
  "excludes": [
    {
      "glob": "/**/*.{png,jpg,jpeg,pdf}"
    }
  ],
  "linkSelector": "a[href]",
  "pageFunction": "async function pageFunction(context) {\n    const { $, request, log } = context;\n\n    // The \"$\" property contains the Cheerio object which is useful\n    // for querying DOM elements and extracting data from them.\n    const pageTitle = $('title').first().text();\n\n    // The \"request\" property contains various information about the web page loaded.\n    const url = request.url;\n\n    // Use \"log\" object to print information to Actor log.\n    log.info('Page scraped', { url, pageTitle });\n\n    // Return an object with the data extracted from the page.\n    // It will be stored to the resulting dataset.\n    return {\n        url,\n        pageTitle\n    };\n}",
  "proxyConfiguration": {
    "useApifyProxy": true
  },
  "initialCookies": [],
  "additionalMimeTypes": [],
  "preNavigationHooks": "// We need to return array of (possibly async) functions here.\n// The functions accept two arguments: the \"crawlingContext\" object\n// and \"requestAsBrowserOptions\" which are passed to the `requestAsBrowser()`\n// function the crawler calls to navigate.\n[\n    async (crawlingContext, requestAsBrowserOptions) => {\n        // ...\n    }\n]",
  "postNavigationHooks": "// We need to return array of (possibly async) functions here.\n// The functions accept a single argument: the \"crawlingContext\" object.\n[\n    async (crawlingContext) => {\n        // ...\n    },\n]",
  "customData": {}
}
EOF
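
The escaped `pageFunction` string in the command above is easier to read and maintain as ordinary JavaScript. The sketch below is an illustration, not Apify's own tooling: it writes the same function as real code, serializes it with `toString()`, and lets `JSON.stringify` take care of the escaping. `buildInput` is a hypothetical helper name, and the input is a subset of the fields from the command above.

```javascript
// The page function from the CLI example, written as plain JavaScript.
async function pageFunction(context) {
    const { $, request, log } = context;

    // "$" is the Cheerio object used to query the parsed HTML.
    const pageTitle = $('title').first().text();

    // "request" describes the page that was loaded.
    const url = request.url;

    // "log" writes to the Actor log.
    log.info('Page scraped', { url, pageTitle });

    // The returned object is stored in the resulting dataset.
    return { url, pageTitle };
}

// Hypothetical helper: builds an Actor input object (a subset of the fields
// used in the CLI example) with the function embedded as a string.
function buildInput(fn) {
    return {
        startUrls: [{ url: 'https://crawlee.dev/js' }],
        respectRobotsTxtFile: true,
        globs: [{ glob: 'https://crawlee.dev/js/*/*' }],
        excludes: [{ glob: '/**/*.{png,jpg,jpeg,pdf}' }],
        linkSelector: 'a[href]',
        pageFunction: fn.toString(),
        proxyConfiguration: { useApifyProxy: true },
    };
}

// JSON.stringify escapes quotes and newlines, so no hand-escaping is needed.
const inputJson = JSON.stringify(buildInput(pageFunction), null, 2);
console.log(inputJson);
```

If this were saved under a hypothetical name such as `build-input.js`, its output could presumably be piped into the same command: `node build-input.js | apify call apify/cheerio-scraper --silent --output-dataset`.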

Cheerio Scraper - HTML scraping tool API through CLI

The Apify CLI is the official tool that allows you to use Cheerio Scraper locally, providing convenience functions and automatic retries on errors.

Install Apify CLI

Using installation script (macOS/Linux):

curl -fsSL https://apify.com/install-cli.sh | bash

Using installation script (Windows):

irm https://apify.com/install-cli.ps1 | iex

Using Homebrew:

brew install apify-cli

Using npm:

npm install -g apify-cli

Other API clients include:

Cheerio Scraper API in Python

Cheerio Scraper API in JavaScript

Cheerio Scraper OpenAPI definition

Cheerio Scraper API
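
As a rough sketch of the HTTP option, the code below targets the Apify API v2 endpoint that runs an Actor synchronously and returns its dataset items. It assumes Node.js 18 or later for the global `fetch`. `buildRunRequest`, `runCheerioScraper`, and the `APIFY_TOKEN` environment variable are hypothetical names chosen for illustration, and the input is trimmed to a small subset of the CLI example.

```javascript
// Base URL of the Apify API and the Actor ID ("~" replaces "/" in API paths).
const API_BASE = 'https://api.apify.com/v2';
const ACTOR_ID = 'apify~cheerio-scraper';

// Hypothetical helper: builds the URL and fetch options without sending anything.
function buildRunRequest(token, input) {
    return {
        url: `${API_BASE}/acts/${ACTOR_ID}/run-sync-get-dataset-items`,
        options: {
            method: 'POST',
            headers: {
                'Content-Type': 'application/json',
                Authorization: `Bearer ${token}`,
            },
            body: JSON.stringify(input),
        },
    };
}

// Hypothetical wrapper: sends the request and resolves with the dataset items.
async function runCheerioScraper(token, input) {
    const { url, options } = buildRunRequest(token, input);
    const response = await fetch(url, options);
    if (!response.ok) {
        throw new Error(`Apify API responded with HTTP ${response.status}`);
    }
    return response.json();
}

// Trimmed input: one start URL and a page function that returns the page title.
const input = {
    startUrls: [{ url: 'https://crawlee.dev/js' }],
    pageFunction: `async function pageFunction(context) {
        const { $, request } = context;
        return { url: request.url, pageTitle: $('title').first().text() };
    }`,
    proxyConfiguration: { useApifyProxy: true },
};

// Only call the API when a token is configured in the environment.
if (process.env.APIFY_TOKEN) {
    runCheerioScraper(process.env.APIFY_TOKEN, input)
        .then((items) => console.log(items))
        .catch((err) => console.error(err.message));
}
```

Keeping the request builder separate from the network call makes it possible to inspect the URL, headers, and body before anything is sent. Longer crawls may be better served by the CLI or an official client than by a single synchronous request.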

Apify Store Scraper

applora/apify-store-scraper

A powerful Apify Actor designed to extract comprehensive data from the Apify Store. This scraper can discover available Actors, collect detailed Actor information, and extract key metadata, making it perfect for market research and competitor analysis.

Applora

5.0

BeautifulSoup Scraper

apify/beautifulsoup-scraper

Crawls websites using raw HTTP requests. It parses the HTML with the BeautifulSoup library and extracts data from the pages using Python code. Supports both recursive crawling and lists of URLs. This Actor is a Python alternative to Cheerio Scraper.

Apify

5.0

Vanilla JS Scraper

mstephen190/vanilla-js-scraper

Scrape the web using familiar JavaScript methods! Crawls websites using raw HTTP requests, parses the HTML with the JSDOM package, and extracts data from the pages using Node.js code. Supports both recursive crawling and lists of URLs. This Actor is a non-jQuery alternative to Cheerio Scraper.

Matthias Stephens

525

IndiaMART Product Scraper

jungle_synthesizer/indiamart-scraper

Scrape product listings and supplier data from IndiaMART's B2B export marketplace. Extract product names, prices, MOQ, supplier details, locations, GST/IEC verification status, ratings, and export history. Search by product keyword with automatic pagination.

BowTiedRaccoon

Indiamart Product Scraper

scrapeai/indiamart-product-scraper

This Apify Actor retrieves product and supplier data through the IndiaMART search API. Search by query and city to collect structured product information including name, price, company, contact details, and location. Perfect for B2B lead generation, market research, and sourcing products from India.

ScrapeAI

5.0

IndiaMART Scraper

parseforge/indiamart-scraper

Scrape product listings and supplier data from IndiaMART, India's largest B2B marketplace. Get product names, prices, supplier details, locations, ratings, and contact info. Search by product keyword with automatic pagination.

ParseForge

IndiaMART Scraper - Suppliers, Exporters, Prices & GST

haketa/indiamart-scraper

IndiaMART scraper & API: find Indian B2B suppliers and products by keyword and export company, product, price, contact email and phone, city and profile URL. India sourcing, manufacturer and supplier discovery plus B2B lead generation — fast, no login.

Haketa

Python Scraper

sovanza.inc/python-scraper

Python Scraper extracts web page data using Requests and BeautifulSoup. It collects titles, meta tags, headings, links, images, Open Graph data, text snippets, and custom CSS selector fields, with exports to JSON, CSV, Excel, XML, or HTML.

Sovanza

5.0

Shopify Store Leads Scraper

parsebird/shopify-store-leads-scraper

Scrape Shopify store leads by keyword or category. Extract emails, phone numbers, addresses, ratings, social links, and sample products. Filter by location, price, and shipping. Export as JSON, CSV, Excel.