RegExp Scraper

30 minutes trial then $25.00/month - No credit card required now

ib4ngz/regexp-scraper
This Actor scrapes data from a list of provided URLs using regular expressions for precise, customizable pattern matching. It handles both static and dynamic web pages and supports depth-based crawling, following links to extract data several levels deep.

You can call RegExp Scraper from your own Python applications through the Apify API. To use the Apify API, you need an Apify account and your API token, which you can find in the Integrations settings in Apify Console.

from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "startUrls": [{ "url": "https://apify.com" }],
    "patterns": "(?<=href=[\"'])([^\"']+)",
    "crawlerType": "Crawlee + Cheerio",
}

# Run the Actor and wait for it to finish
run = client.actor("ib4ngz/regexp-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# To learn more, see https://docs.apify.com/api/client/python/docs/quick-start
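Before paying for a run, it can help to check locally what a pattern captures. The sketch below applies the `patterns` value from the sample input to a made-up HTML fragment using Python's standard `re` module. The `preview_matches` helper and the sample HTML are illustrative assumptions, and the Actor's own regex engine may differ from Python's in edge cases, so treat this as a rough preview rather than a guarantee of what the Actor returns.

```python
import re

# Same pattern as in the sample input; the raw triple-quoted form avoids
# having to escape the quotes inside a Python string literal.
HREF_PATTERN = r"""(?<=href=["'])([^"']+)"""


def preview_matches(pattern: str, html: str) -> list[str]:
    """Return every substring of `html` matched by `pattern` (hypothetical helper)."""
    return [m.group(0) for m in re.finditer(pattern, html)]


# Hypothetical HTML fragment, used only for illustration.
sample_html = (
    '<a href="https://apify.com/store">Store</a> '
    "<a href='/pricing'>Pricing</a>"
)

print(preview_matches(HREF_PATTERN, sample_html))
# → ['https://apify.com/store', '/pricing']
```

The lookbehind `(?<=href=["'])` anchors each match right after an opening quote of an `href` attribute, and `[^"']+` then consumes everything up to the next quote, so only the link target itself is returned.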

RegExp Scraper API in Python

The Apify API client for Python is the official library that allows you to use RegExp Scraper API in Python, providing convenience functions and automatic retries on errors.

Install the apify-client

pip install apify-client
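With the client installed, the API token still has to reach your code. Rather than pasting it into source, one option is to read it from an environment variable. The sketch below assumes a variable named `APIFY_TOKEN`; that name and the `load_api_token` helper are illustrative assumptions, not part of the Apify client.

```python
import os


def load_api_token(env_var: str = "APIFY_TOKEN") -> str:
    """Read the API token from the environment (hypothetical helper).

    Raises RuntimeError if the variable is unset or blank, so a missing
    token fails early instead of surfacing later as an API error.
    """
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(
            f"Set the {env_var} environment variable to your Apify API token."
        )
    return token
```

The returned string can then be passed to the client in place of the literal placeholder, for example `ApifyClient(load_api_token())`.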

Developer
Maintained by Community

Actor Metrics

  • 1 monthly user

  • 1 star

  • >99% runs succeeded

  • Created in Jan 2025

  • Modified 15 hours ago