Scrape Emails Websites

Pricing

$10.00/month + usage

Developed by

Techionik

Maintained by Community

This Actor extracts email addresses from static websites. It uses Python's requests library to download pages and BeautifulSoup to parse the HTML.

0.0 (0)

0

Total users

2

Monthly users

2

Runs succeeded

>99%

Last modified

2 days ago

📬 Email Extractor Apify Actor

This Apify Actor is a multi-threaded email scraper that crawls one or more websites, collects email addresses, and stores them in a structured format. It supports recursive crawling up to a specified depth and filters out irrelevant domains automatically.

🚀 Features

🔎 Extracts email addresses from page content using regex

🌐 Recursive crawling with domain filtering to stay on the target site

⚙️ Multi-threaded processing for faster execution (uses ThreadPoolExecutor)

📈 Progress visualization with tqdm

🧠 Smart URL normalization from input strings

🛡️ Thread-safe operation using locks for visited URLs and shared output

🔄 Apify SDK Integration – push results to the Apify dataset
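The thread-safety feature above can be illustrated with a minimal sketch. It assumes a shared state object guarded by a single lock; `CrawlState`, `process`, and `run` are hypothetical names, and the Actor's real internals may differ. Fetching and extraction are replaced by a placeholder so the example runs without network access.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class CrawlState:
    """Shared crawl state; every access goes through one lock (hypothetical helper)."""

    def __init__(self):
        self._lock = threading.Lock()
        self.visited = set()
        self.results = []

    def mark_visited(self, url):
        """Return True only for the first caller that claims this URL."""
        with self._lock:
            if url in self.visited:
                return False
            self.visited.add(url)
            return True

    def add_result(self, url, emails):
        """Append one output record in the url/emails shape."""
        with self._lock:
            self.results.append({"url": url, "emails": sorted(emails)})


def process(url, state):
    # Skip URLs that another worker has already claimed.
    if not state.mark_visited(url):
        return
    # Placeholder: a real worker would download the page and extract emails here.
    state.add_result(url, set())


def run(urls, max_workers=8):
    state = CrawlState()
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Consume the iterator so that worker exceptions surface here.
        list(pool.map(lambda u: process(u, state), urls))
    return state
```

Because the check and the insert into `visited` happen under the same lock, two workers that receive the same URL cannot both process it.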

📦 Input Schema

The actor expects the following input:

JSON with comma-delimited URLs:

{ "URL": "https://example.com, https://another.com" }

JSON with whitespace-delimited URLs:

{ "URL": "https://example.com https://another.com" }

You can provide one or multiple URLs separated by commas, spaces, or newlines.
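As a rough illustration of how such an input string could be split into individual URLs, here is a minimal sketch using only the standard library. The function name `normalize_urls` and the choice to default bare domains to https are assumptions, not the Actor's documented behavior.

```python
import re
from urllib.parse import urlparse


def normalize_urls(raw: str) -> list[str]:
    """Split a raw input string on commas and whitespace; return unique URLs in order."""
    urls = []
    seen = set()
    for token in re.split(r"[,\s]+", raw.strip()):
        if not token:
            continue
        # Assumption: a bare domain such as "example.com" is treated as https.
        if not re.match(r"^https?://", token, re.IGNORECASE):
            token = "https://" + token
        # Drop tokens that still have no host part after normalization.
        if not urlparse(token).netloc:
            continue
        if token not in seen:
            seen.add(token)
            urls.append(token)
    return urls


# Example: the value of the "URL" field from the input JSON.
print(normalize_urls("https://example.com, https://another.com"))
```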

🧪 Output Schema

Each output record follows this structure:

{ "url": "https://example.com", "emails": [ "contact@example.com", "info@example.com" ] }

🛠️ How It Works

Normalizes the input string into valid URLs.

For each URL:

Downloads and parses the page using requests + BeautifulSoup.

Extracts all email addresses using regex.

Recursively crawls valid links on the same domain, up to a depth of 4.

All data is pushed to the Apify dataset.
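The steps above can be sketched as a small depth-limited crawler. This is an illustration under stated assumptions, not the Actor's actual code: it uses the standard-library `html.parser` in place of BeautifulSoup, takes a hypothetical `fetch` callable instead of calling requests (so it runs offline), and uses a simplified email pattern. It returns records in the documented shape rather than pushing them to the Apify dataset.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

# Simplified pattern (assumption); the Actor's actual regex is not documented.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
MAX_DEPTH = 4  # depth limit stated in the description


class LinkCollector(HTMLParser):
    """Collects href values from <a> tags (stand-in for BeautifulSoup)."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def extract_emails(html):
    """Return the unique email addresses found in the page source, sorted."""
    return sorted(set(EMAIL_RE.findall(html)))


def same_domain_links(base_url, html):
    """Resolve links against base_url and keep http(s) links on the same host."""
    collector = LinkCollector()
    collector.feed(html)
    host = urlparse(base_url).netloc
    links = []
    for href in collector.hrefs:
        absolute = urljoin(base_url, href).split("#")[0]
        parsed = urlparse(absolute)
        if parsed.scheme in ("http", "https") and parsed.netloc == host:
            if absolute not in links:
                links.append(absolute)
    return links


def crawl(start_url, fetch, max_depth=MAX_DEPTH):
    """Crawl from start_url; fetch(url) returns HTML text, or None on failure."""
    visited = set()
    records = []
    stack = [(start_url, 0)]
    while stack:
        url, depth = stack.pop()
        if url in visited or depth > max_depth:
            continue
        visited.add(url)
        html = fetch(url)
        if html is None:
            continue
        records.append({"url": url, "emails": extract_emails(html)})
        for link in same_domain_links(url, html):
            stack.append((link, depth + 1))
    return records
```

In this sketch the start page counts as depth 0, so `max_depth=4` follows links up to four hops away; whether the Actor counts depth the same way is an assumption.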