Scrape Emails Websites

Pricing

$10.00/month + usage

Developed by

Techionik

Maintained by Community

This Actor extracts email addresses from static websites. It uses Python's requests and BeautifulSoup libraries to fetch and parse HTML pages.

Rating: 0.0 (0)


Total users: 20

Monthly users: 17

Runs succeeded: >99%

Last modified: 17 days ago

📬 Email Extractor Apify Actor

This Apify Actor is a multi-threaded email scraper that crawls one or more websites, collects the email addresses it finds, and stores them as structured records. It crawls recursively up to a depth limit and skips links that leave the target site.

🚀 Features

🔎 Extracts email addresses from page content using regex

🌐 Recursive crawling with domain filtering to stay on the target site

⚙️ Multi-threaded processing for faster execution (uses ThreadPoolExecutor)

📈 Progress visualization with tqdm

🧠 Smart URL normalization from input strings

🛡️ Thread-safe operation using locks for visited URLs and shared output

🔄 Apify SDK integration: pushes results to the Apify dataset
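The thread-safety feature can be illustrated with a small sketch. It is not the Actor's source code: `CrawlState`, `run_workers`, and the `handle_url` callback are hypothetical names, and the sketch only assumes what the feature list states, namely a ThreadPoolExecutor plus locks around the visited-URL set and the shared output.

```python
from concurrent.futures import ThreadPoolExecutor
from threading import Lock


class CrawlState:
    """Shared state guarded by one lock per structure (hypothetical helper)."""

    def __init__(self):
        self._visited_lock = Lock()
        self._records_lock = Lock()
        self.visited = set()
        self.records = []

    def claim(self, url):
        """Return True only for the first caller that claims this URL."""
        with self._visited_lock:
            if url in self.visited:
                return False
            self.visited.add(url)
            return True

    def add_record(self, record):
        """Append one output record while holding the output lock."""
        with self._records_lock:
            self.records.append(record)


def run_workers(urls, handle_url, max_workers=8):
    """Process each distinct URL once; handle_url(url) returns a record."""
    state = CrawlState()

    def task(url):
        if state.claim(url):
            state.add_record(handle_url(url))

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Consuming the iterator re-raises any exception from a worker.
        list(pool.map(task, urls))
    return state
```

Checking and adding a URL under the same lock is what prevents two threads from both deciding that a page is unvisited and scraping it twice.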

📦 Input Schema

The Actor expects the following input:

Comma-delimited JSON:

{ "URL": "https://example.com, https://another.com" }

Whitespace-delimited JSON:

{ "URL": "https://example.com https://another.com" }

You can provide one or more URLs in the single "URL" string, separated by commas, spaces, or newlines.
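As an illustration of how such an input string could be split into individual URLs, here is a minimal sketch. The function name `normalize_urls` is hypothetical, and two behaviors are assumptions rather than documented features of the Actor: entries without a scheme get `https://` prepended, and duplicates are dropped.

```python
import re
from urllib.parse import urlparse


def normalize_urls(raw: str) -> list[str]:
    """Split a comma/space/newline separated string into a list of URLs."""
    urls = []
    seen = set()
    for token in re.split(r"[,\s]+", raw.strip()):
        if not token:
            continue
        # Assumption: bare hosts such as "example.org" default to HTTPS.
        if not re.match(r"https?://", token, re.IGNORECASE):
            token = "https://" + token
        # Skip entries that still have no host part.
        if not urlparse(token).netloc:
            continue
        # Assumption: repeated URLs are kept only once, in input order.
        if token not in seen:
            seen.add(token)
            urls.append(token)
    return urls


print(normalize_urls("https://example.com, https://another.com"))
# ['https://example.com', 'https://another.com']
```

Splitting on the single pattern `[,\s]+` handles commas, spaces, and newlines in one pass, including mixed separators such as a comma followed by a space.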

🧪 Output Schema

Each output record follows this structure:

{
  "url": "https://example.com",
  "emails": [
    "contact@example.com",
    "info@example.com"
  ]
}

🛠️ How It Works

Normalizes the input string into valid URLs.

For each URL:

Downloads and parses the page using requests + BeautifulSoup.

Extracts all email addresses using regex.

Recursively crawls valid links on the same domain, up to a depth of 4.

Pushes all collected data to the Apify dataset.
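The steps above can be sketched as a small depth-limited crawler. This is an illustration under stated assumptions, not the Actor's code: it swaps in the standard library's `html.parser` for BeautifulSoup and takes a caller-supplied `fetch(url)` function in place of `requests`, so it runs without third-party packages or network access. The names `crawl`, `LinkCollector`, and `EMAIL_RE`, the email pattern itself, and the choice to count the start page as depth 0 are all hypothetical.

```python
import re
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse

# Assumption: a simple, permissive email pattern; the Actor's regex is not documented.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def crawl(start_url, fetch, max_depth=4):
    """Breadth-first crawl of one site; fetch(url) returns HTML or None."""
    domain = urlparse(start_url).netloc
    visited = set()
    records = []
    queue = deque([(start_url, 0)])
    while queue:
        url, depth = queue.popleft()
        if url in visited:
            continue
        visited.add(url)
        html = fetch(url)
        if html is None:
            continue
        emails = sorted(set(EMAIL_RE.findall(html)))
        if emails:
            records.append({"url": url, "emails": emails})
        if depth >= max_depth:
            continue
        collector = LinkCollector()
        collector.feed(html)
        for href in collector.hrefs:
            # Resolve relative links and drop "#fragment" parts.
            link = urldefrag(urljoin(url, href)).url
            parsed = urlparse(link)
            # Domain filter: follow only http(s) links on the start domain.
            if parsed.scheme in ("http", "https") and parsed.netloc == domain:
                queue.append((link, depth + 1))
    return records
```

Passing `fetch` as a parameter keeps the sketch testable against an in-memory dictionary of pages. In a real run it could wrap `requests.get(url, timeout=...)` and return `response.text`, and each record could then be pushed to the Apify dataset.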
