Pricing

$12.00/month + usage

RegExp Scraper

Developed by

Iqbal R

This actor scrapes data from a list of provided URLs using regular expressions for precise and customizable pattern matching. It can handle both static and dynamic web pages and supports depth-based crawling to explore links and extract data from multiple levels of the web.

0.0 (0)

Total users

Monthly users

Runs succeeded

>99%

Last modified

2 months ago

Automation

Developer tools

Start URLs

startUrls (array, required)

URLs to start with.

Maximum Depth

maxDepth (integer, optional)

The maximum depth for crawling.

Default value of this property is 1
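
To illustrate what a depth limit means in practice, here is a minimal sketch of a depth-limited breadth-first crawl in Python. It is not the actor's implementation: the `LINKS` graph and the `crawl` function are hypothetical stand-ins, and the sketch assumes start URLs count as depth 0, which the actor may number differently.

```python
from collections import deque

# Hypothetical in-memory link graph standing in for real pages:
# each key is a URL, each value is the list of links found on that page.
LINKS = {
    "https://example.com/": ["https://example.com/a", "https://example.com/b"],
    "https://example.com/a": ["https://example.com/c"],
    "https://example.com/b": [],
    "https://example.com/c": ["https://example.com/d"],
    "https://example.com/d": [],
}


def crawl(start_urls, max_depth=1):
    """Breadth-first traversal that stops following links at max_depth.

    Assumption: start URLs are depth 0; the real actor may count differently.
    """
    unique_starts = list(dict.fromkeys(start_urls))  # drop duplicate start URLs
    seen = set(unique_starts)
    queue = deque((url, 0) for url in unique_starts)
    visited = []
    while queue:
        url, depth = queue.popleft()
        visited.append((url, depth))
        if depth >= max_depth:
            continue  # pages at the depth limit are visited but not expanded
        for link in LINKS.get(url, []):
            if link not in seen:
                seen.add(link)
                queue.append((link, depth + 1))
    return visited
```

Under this assumption, `crawl(["https://example.com/"], max_depth=1)` visits the start page and the two pages it links to, and raising `max_depth` to 2 additionally reaches `https://example.com/c`.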

Regex Patterns

patterns (string, required)

Patterns to search in the HTML content. Each pattern should be on a new line.
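
As a rough illustration of the one-pattern-per-line format, the following Python sketch splits such a string into separate regular expressions and applies each to a piece of HTML. The helper names `compile_patterns` and `extract`, the sample HTML, and the two patterns are all hypothetical. The actor's own regex dialect and output shape are not documented here and may differ (for example, JavaScript regex syntax rather than Python's).

```python
import re


def compile_patterns(patterns_text):
    """Split a newline-separated string into compiled regexes.

    Assumption: blank lines are ignored and surrounding whitespace is trimmed.
    """
    return [
        re.compile(line.strip())
        for line in patterns_text.splitlines()
        if line.strip()
    ]


def extract(html, patterns_text):
    """Map each pattern string to its unique matches, in order of appearance."""
    results = {}
    for regex in compile_patterns(patterns_text):
        matches = [m.group(0) for m in regex.finditer(html)]
        results[regex.pattern] = list(dict.fromkeys(matches))  # de-duplicate
    return results


# Hypothetical page content and patterns, one pattern per line.
html = "<p>Contact: sales@example.com or call 555-0100. Again: sales@example.com</p>"
patterns = r"""[\w.+-]+@[\w-]+\.[\w.]+
\d{3}-\d{4}"""

for pattern, found in extract(html, patterns).items():
    print(pattern, "->", found)
```

Keeping each pattern on its own line means a pattern cannot itself contain a literal newline, so multi-line matching would have to be expressed with escapes such as `\n` inside the pattern.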

Crawler Type

crawlerType (enum, optional)

Select the type of crawler to use.

Value options:

"Crawlee + Cheerio" (string)

"Crawlee + Puppeteer + Chrome" (string)

Default value of this property is "Crawlee + Cheerio"

Proxy configuration

proxyConfiguration (object, optional)

Select proxies to be used by your crawler.

Minimum Concurrency

minConcurrency (integer, optional)

The minimum number of concurrent requests or pages being processed.

Default value of this property is 1

Maximum Concurrency

maxConcurrency (integer, optional)

The maximum number of concurrent requests or pages being processed.

Default value of this property is 10
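
The input fields above can be assembled into a single JSON object. The Python sketch below builds one using the listed defaults and a few sanity checks. The `build_input` helper is hypothetical, the checks are assumptions about what a sensible input looks like rather than documented actor behavior, and the shape of each `startUrls` entry (`{"url": ...}`) is also an assumption, since the listing only says the field is an array.

```python
import json

# Value options and defaults as listed in the input reference.
CRAWLER_TYPES = ("Crawlee + Cheerio", "Crawlee + Puppeteer + Chrome")
DEFAULTS = {
    "maxDepth": 1,
    "crawlerType": "Crawlee + Cheerio",
    "minConcurrency": 1,
    "maxConcurrency": 10,
}
OPTIONAL_FIELDS = set(DEFAULTS) | {"proxyConfiguration"}


def build_input(start_urls, patterns, **overrides):
    """Return an input dict for the actor (hypothetical helper)."""
    unknown = set(overrides) - OPTIONAL_FIELDS
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")
    if not start_urls:
        raise ValueError("startUrls is required")
    if not patterns:
        raise ValueError("patterns is required")

    run_input = {**DEFAULTS, **overrides}
    if run_input["crawlerType"] not in CRAWLER_TYPES:
        raise ValueError(f"crawlerType must be one of {CRAWLER_TYPES}")
    # Assumed sanity check: the minimum should not exceed the maximum.
    if run_input["minConcurrency"] > run_input["maxConcurrency"]:
        raise ValueError("minConcurrency must not exceed maxConcurrency")

    # Assumption: each start URL is wrapped in an object with a "url" key.
    run_input["startUrls"] = [{"url": url} for url in start_urls]
    # Patterns are sent as one string, each pattern on its own line.
    run_input["patterns"] = "\n".join(patterns)
    return run_input


example = build_input(
    ["https://example.com/"],
    [r"\d{3}-\d{4}"],
    crawlerType="Crawlee + Puppeteer + Chrome",
    maxDepth=2,
)
print(json.dumps(example, indent=2))
```

Fields left out of the call fall back to the listed defaults, so only `startUrls` and `patterns` must always be supplied.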

Web Scraper

apify/web-scraper

Crawls arbitrary websites using a web browser and extracts structured data from web pages using a provided JavaScript function. The Actor supports both recursive crawling and lists of URLs, and automatically manages concurrency for maximum performance.

Apify

89K

4.5

Email Scraper

ib4ngz/email-scraper

This actor scrapes email addresses from a list of provided URLs. It recursively crawls pages, extracts unique emails, and stores them in a dataset. The actor supports DNS validation to ensure domain authenticity and allows filtering based on custom crawling depth.

Iqbal R

142

Simple Contact Info and Social Media Scraper

pajoe/simple-contact-info-and-social-media-scraper

This Apify actor is designed to crawl web pages and extract social media handles, emails, and phone numbers using Puppeteer. It can handle dynamic content and navigate through multiple pages, making it suitable for comprehensive data extraction tasks.

va-gasd

Dynamic Web Scraper

josejet/dynamic-web-scraper

Dynamic Web Scraper is an Apify Actor that gathers information online by simulating user browsing behavior. It reduces scraping time and the number of pages scraped by using a model (ChatGPT) to make decisions about browser navigation and to evaluate results.

Pepa J W̚͠h̾̔̎̿͊͛̄͊e̢̦̲̰̦̋̇͗̾̑oi̟͈̯̝̊̉́̇͑̕ğ̆͘͡e͗͛o͊̔̇̄

146

web-scrape-data

angelbabyai123/my-actor

web-scrape-data

Angel Baby

Puppeteer Scraper

apify/puppeteer-scraper

Crawls websites with headless Chrome and the Puppeteer library using provided server-side Node.js code. This crawler is an alternative to apify/web-scraper that gives you finer control over the process. Supports both recursive crawling and lists of URLs. Supports logging in to websites.

Apify

8.2K

5.0

Pro Web Content Crawler (With Images)

assertive_analogy/pro-web-content-crawler

Pro Web Content Crawler is a powerful tool that digs deep into web content and images. It handles complex sites, dynamic pages, and hidden content, making it perfect for extracting both data and images. Customizable and API-ready for your unique data needs.

Gideon Nesh

119

5.0

bcv-tasa-oficial

grupoaceivzla/bcv-tasa-oficial

Grupo ACEI

Cheerio Scraper

apify/cheerio-scraper

Crawls websites using raw HTTP requests, parses the HTML with the Cheerio library, and extracts data from the pages using provided Node.js code. Supports both recursive crawling and lists of URLs. This actor is a high-performance alternative to apify/web-scraper for websites that do not require JavaScript.

Apify

8.9K

4.7

Playwright Scraper

apify/playwright-scraper

Crawls websites with a headless Chromium, Chrome, or Firefox browser and the Playwright library using provided server-side Node.js code. Supports both recursive crawling and lists of URLs. Supports logging in to websites.

Apify

1.9K

4.3