Pricing

$12.00/month + usage

RegExp Scraper

This actor scrapes data from a list of provided URLs using regular expressions for precise and customizable pattern matching. It can handle both static and dynamic web pages and supports depth-based crawling to explore links and extract data from multiple levels of the web.

Pricing

$12.00/month + usage

Rating

0.0

(0)

Developer

Iqbal R

Actor stats

Bookmarked

Total users

Monthly active users

a year ago

Last modified

Categories

Automation

Developer tools

You can access the RegExp Scraper programmatically from your own applications by using the Apify API. You can also choose the language preference from below. To use the Apify API, you’ll need an Apify account and your API token, found in Integrations settings in Apify Console.

Python

JavaScript

CLI

OpenAPI

HTTP

MCP

1from apify_client import ApifyClient
2
3# Initialize the ApifyClient with your Apify API token
4# Replace '<YOUR_API_TOKEN>' with your token.
5client = ApifyClient("<YOUR_API_TOKEN>")
6
7# Prepare the Actor input
8run_input = {
9    "startUrls": [{ "url": "https://apify.com" }],
10    "patterns": "(?<=href=[\"'])([^\"']+)",
11    "crawlerType": "Crawlee + Cheerio",
12}
13
14# Run the Actor and wait for it to finish
15run = client.actor("ib4ngz/regexp-scraper").call(run_input=run_input)
16
17# Fetch and print Actor results from the run's dataset (if there are any)
18print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
19for item in client.dataset(run["defaultDatasetId"]).iterate_items():
20    print(item)
21
22# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

RegExp Scraper API in Python

The Apify API client for Python is the official library that allows you to use RegExp Scraper API in Python, providing convenience functions and automatic retries on errors.

Install the apify-client

$pip install apify-client

Other API clients include:

RegExp Scraper API in JavaScript

RegExp Scraper API through CLI

RegExp Scraper OpenAPI definition

RegExp Scraper API

Web Scraper

apify/web-scraper

Crawls arbitrary websites using a web browser and extracts structured data from web pages using a provided JavaScript function. The Actor supports both recursive crawling and lists of URLs, and automatically manages concurrency for maximum performance.

Apify

121K

4.6

Email Scraper

ib4ngz/email-scraper

This actor scrapes email addresses from a list of provided URLs. It recursively crawls pages, extracts unique emails, and stores them in a dataset. The actor supports DNS validation to ensure domain authenticity and allows filtering based on custom crawling depth.

Iqbal R

547

Dynamic Web Scraper

josejet/dynamic-web-scraper

Dynamic Web Scraper is an Apify Actor that gathers information online by simulating user browsing behavior on the web. It reduces the time and amount of scraped web pages by using a model (ChatGPT) to make decisions regarding browser navigation and results evaluation.

Pepa J

353

Puppeteer Scraper

apify/puppeteer-scraper

Crawls websites with the headless Chrome and Puppeteer library using a provided server-side Node.js code. This crawler is an alternative to apify/web-scraper that gives you finer control over the process. Supports both recursive crawling and list of URLs. Supports login to website.

Apify

15K

5.0

LeadScraper

cdubiel/lead-scraper

Scrape a list of urls and receive business contact information, social media links, and a description of the services. This actor will scrape across multiple pages in the sitemap and returns a confidence score to every phone number and email that it finds. webscraper, scrape leads, web scraper

Claire Dubiel

146

5.0

Pro Web Content Crawler (With Images)

assertive_analogy/pro-web-content-crawler

Pro Web Content Crawler is a powerful tool that digs deep into web content and images. It handles complex sites, dynamic pages, and hidden content, making it perfect for extracting both data and images. Customizable and API-ready for your unique data needs.

Gideon Nesh

260

5.0

Cheerio Scraper

apify/cheerio-scraper

Crawls websites using raw HTTP requests, parses the HTML with the Cheerio library, and extracts data from the pages using a Node.js code. Supports both recursive crawling and lists of URLs. This actor is a high-performance alternative to apify/web-scraper for websites that do not require JavaScript.

Apify

18K

4.6

bcv-tasa-oficial

grupoaceivzla/bcv-tasa-oficial

Grupo ACEI

Playwright Scraper

apify/playwright-scraper

Crawls websites with the headless Chromium, Chrome, or Firefox browser and Playwright library using a provided server-side Node.js code. Supports both recursive crawling and a list of URLs. Supports login to a website.

Apify

10K

3.9

Camoufox Scraper

apify/camoufox-scraper

Crawls websites with stealthy Camoufox browser and Playwright library using a provided server-side Node.js code. Supports both recursive crawling and a list of URLs. Supports login to a website.

Apify

360

5.0