Scrape Emails Websites
Pricing
$10.00/month + usage
This Actor extracts email addresses from static websites. It uses Python's requests library to download pages and BeautifulSoup to parse the HTML.
Total users: 2
Monthly users: 2
Runs succeeded: >99%
Last modified: 2 days ago
📬 Email Extractor Apify Actor
This Apify Actor is a multi-threaded email scraper that crawls one or more websites, collects email addresses, and stores them in a structured format. It supports recursive crawling up to a specified depth and filters out irrelevant domains automatically.
🚀 Features
🔎 Extracts email addresses from page content using regex
🌐 Recursive crawling with domain filtering to stay on the target site
⚙️ Multi-threaded processing for faster execution (uses ThreadPoolExecutor)
📈 Progress visualization with tqdm
🧠 Smart URL normalization from input strings
🛡️ Thread-safe operation using locks for visited URLs and shared output
🔄 Apify SDK Integration – push results to the Apify dataset
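The multi-threading and locking features above can be illustrated with a minimal sketch. This is not the Actor's source code: `SharedState`, `process_all`, and the `worker` callback are hypothetical names chosen for illustration, and the worker count of 8 is an assumption. It shows one common way to guard a visited set and a shared result list with a single lock while a `ThreadPoolExecutor` runs the work.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class SharedState:
    """Visited URLs and collected records, guarded by one lock (hypothetical helper)."""

    def __init__(self):
        self._lock = threading.Lock()
        self.visited = set()
        self.results = []

    def claim(self, url):
        """Return True only for the first thread that claims this URL."""
        with self._lock:
            if url in self.visited:
                return False
            self.visited.add(url)
            return True

    def add_result(self, record):
        """Append a record to the shared output list under the lock."""
        with self._lock:
            self.results.append(record)


def process_all(urls, worker, max_workers=8):
    """Run worker(url) once per distinct URL across a thread pool."""
    state = SharedState()

    def task(url):
        # Skip URLs that another thread has already claimed.
        if state.claim(url):
            state.add_result(worker(url))

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Consuming the iterator re-raises any exception from a worker thread.
        list(pool.map(task, urls))
    return state
```

Checking and adding to the visited set inside the same locked block is what prevents two threads from both deciding that a URL is new.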
📦 Input Schema
The Actor expects a single URL field. You can provide one or multiple URLs separated by commas, spaces, or newlines.
Comma-delimited:
{ "URL": "https://example.com, https://another.com" }
Whitespace-delimited:
{ "URL": "https://example.com https://another.com" }
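As a rough illustration of how such an input string might be turned into a list of URLs, here is a small sketch. The function name `normalize_urls` is hypothetical, and prepending `https://` to entries without a scheme is an assumption about what "smart" normalization could mean, not documented behavior of the Actor.

```python
import re
from urllib.parse import urlparse


def normalize_urls(raw: str) -> list[str]:
    """Split a raw input string on commas and whitespace and return unique URLs."""
    urls = []
    seen = set()
    for token in re.split(r"[,\s]+", raw.strip()):
        if not token:
            continue
        # Assumption: entries without a scheme are treated as HTTPS.
        if not re.match(r"^https?://", token, re.IGNORECASE):
            token = "https://" + token
        # Drop tokens that still have no host part after normalization.
        if not urlparse(token).netloc:
            continue
        if token not in seen:
            seen.add(token)
            urls.append(token)
    return urls
```

With this sketch, both input forms shown above yield the same two-element list, and repeated entries are kept only once.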
🧪 Output Schema
Each output record follows this structure:
{ "url": "https://example.com", "emails": [ "contact@example.com", "info@example.com" ] }
🛠️ How It Works
Normalizes the input string into valid URLs.
For each URL:
Downloads and parses the page using requests + BeautifulSoup.
Extracts all email addresses using regex.
Recursively crawls valid links on the same domain, up to a depth of 4.
All data is pushed to the Apify dataset.
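The steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the Actor's implementation: the Actor uses requests and BeautifulSoup, whereas this sketch substitutes the standard library's `html.parser` and takes a `fetch` callable as a parameter so it runs without network access. The email pattern, the helper names, and the choice to count the start page as depth 0 are all assumptions.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

# Assumption: a simple, permissive email pattern; the Actor's regex is not documented.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


class LinkCollector(HTMLParser):
    """Collects href values from <a> tags (stand-in for BeautifulSoup)."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def extract_emails(html):
    """Return the unique email addresses found in the page text, sorted."""
    return sorted(set(EMAIL_RE.findall(html)))


def same_domain_links(html, base_url):
    """Return absolute http(s) links that stay on the host of base_url."""
    parser = LinkCollector()
    parser.feed(html)
    domain = urlparse(base_url).netloc
    links = []
    for href in parser.hrefs:
        absolute = urljoin(base_url, href).split("#")[0]
        parts = urlparse(absolute)
        if parts.scheme in ("http", "https") and parts.netloc == domain:
            if absolute not in links:
                links.append(absolute)
    return links


def crawl(start_url, fetch, max_depth=4):
    """Depth-limited crawl; fetch(url) returns HTML text or None on failure."""
    visited = set()
    records = []

    def visit(url, depth):
        if depth > max_depth or url in visited:
            return
        visited.add(url)
        html = fetch(url)
        if html is None:
            return
        records.append({"url": url, "emails": extract_emails(html)})
        for link in same_domain_links(html, url):
            visit(link, depth + 1)

    visit(start_url, 0)
    return records
```

In a real run, `fetch` could wrap `requests.get(url, timeout=...)` and return `response.text`, and each record could be pushed to the Apify dataset instead of being collected in a list.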