Website Emails Scraper
It goes to a website and extracts every email address. Super simple.
Pricing
$5.00 / 1,000 emails
Rating
4.6
(3)
Developer
Maxime Dupré
Actor stats
Bookmarked: 12
Total users: 536
Monthly active users: 76
Issues response: 0.42 hours
Last modified: 2 days ago
Find website contact emails without manual digging
Website Emails Scraper shallow-crawls the sites you already care about and emits one row per discovered email tied back to the exact seed URL. Start with 1 to 3 sites in URLs to scrape, keep the crawl shallow by default, and set Max emails to scrape only when you want a smaller first batch.
- ✅ One row per email - every discovered address keeps the source seed URL attached for exports, QA, and follow-up.
- ✅ Shallow by default - checks the seed plus a limited set of internal pages instead of spidering an entire site.
- ✅ Duplicate-safe - deduplicates normalized seed URLs before crawl and deduplicates emails per source URL.
- ✅ Small seed list first - test 1 to 3 sites or add a per-site cap before you scale up.
🏆 Benefits
- 📬 Turn a curated list of websites into contact-email rows without opening contact pages one by one.
- 🧭 Keep source context attached to every email so downstream CRMs, enrichment jobs, and audits stay traceable.
- 🧪 Start with a small bounded run before you commit a larger list or automation.
- ♻️ Avoid duplicate crawl work when the same website appears more than once in your input.
🚀 Quick start
- Open the Input tab and add 1 to 3 sites to `URLs to scrape`. For the fastest trust check, start with a site where you already know a public contact email should exist.
- If you want to limit one site on the first run, set `Max emails to scrape` for that specific row. Leave it empty or `0` to keep all deduplicated emails for that seed.
- Leave `Proxy configuration` on the default Apify proxy settings unless you intentionally want direct mode.
- Run the actor, then review the finished dataset in that run, or pull the same rows through the API.
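If you drive the run from code instead of the Input tab, the same quick-start choices reduce to assembling a plain run-input object. A minimal sketch, assuming the `build_run_input` helper name is mine (not part of the actor):

```python
def build_run_input(seed_urls, per_seed_cap=None):
    """Assemble a run input mirroring the quick-start steps.

    A cap of None or 0 keeps all deduplicated emails for each seed,
    so the userData override is only added for a positive cap.
    """
    urls = []
    for seed in seed_urls:
        entry = {"url": seed}
        if per_seed_cap:  # None/0 -> no per-seed override
            entry["userData"] = {"maxNbEmailsToScrape": per_seed_cap}
        urls.append(entry)
    # Omitting proxyConfiguration keeps the default Apify proxy settings;
    # pass an explicit null only when you intentionally want direct mode.
    return {"urls": urls}

print(build_run_input(["https://apify.com/contact"], per_seed_cap=3))
```

Start with one or two seeds this way, then grow the `seed_urls` list once the first batch looks right.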
⚙️ Features
- 🌐 Shallow-crawls each seed URL plus a limited set of internal pages instead of attempting a full-site crawl.
- 🧹 Deduplicates emails per source URL, so repeated mentions on the same site do not inflate the dataset.
- 📄 Emits one dataset row per email and keeps the source seed attached for traceable exports.
- 🛡️ Supports Apify proxy settings by default and explicit direct mode when you pass `null` for `proxyConfiguration`.
- 🎚️ Lets you bound first runs with an optional per-URL email cap.
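The per-URL cap behaves like a truncation of each seed's deduplicated email list. A sketch of that behavior, assuming the `apply_cap` function name (the actor's internals may differ):

```python
def apply_cap(emails, max_nb_emails_to_scrape=0):
    """Keep all deduplicated emails when the cap is 0 or unset,
    otherwise emit at most that many rows for the seed."""
    deduped = list(dict.fromkeys(emails))  # dedupe, preserving first-seen order
    if not max_nb_emails_to_scrape:
        return deduped
    return deduped[:max_nb_emails_to_scrape]

print(apply_cap(["a@x.com", "b@x.com", "a@x.com"], 0))  # → ['a@x.com', 'b@x.com']
print(apply_cap(["a@x.com", "b@x.com", "c@x.com"], 2))  # → ['a@x.com', 'b@x.com']
```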
📊 Output
See the full Output tab for the complete contract.
Example
```json
{
  "url": "https://apify.com/contact",
  "seedUrl": "https://apify.com/contact",
  "email": "hello@apify.com"
}
```
Params
| Field | Type | Description |
|---|---|---|
| `url` | string | The source website URL that produced this email row. |
| `seedUrl` | string | The original input seed URL whose shallow crawl found the email. |
| `email` | string | One deduplicated email address discovered on that site. |
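Because every row keeps its seed attached, regrouping the flat dataset by site needs no joins. A sketch using the field names from the table above (the second sample row is illustrative, not from a real run):

```python
from collections import defaultdict

# Illustrative rows in the documented output shape.
rows = [
    {"url": "https://apify.com/contact", "seedUrl": "https://apify.com/contact", "email": "hello@apify.com"},
    {"url": "https://apify.com/contact", "seedUrl": "https://apify.com/contact", "email": "press@apify.com"},
]

emails_by_seed = defaultdict(list)
for row in rows:
    emails_by_seed[row["seedUrl"]].append(row["email"])

print(dict(emails_by_seed))
```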
🛠️ Input
Example
This example is from the same live run that produced the output example above.
```json
{
  "urls": [
    {
      "url": "https://apify.com/contact",
      "userData": { "maxNbEmailsToScrape": 3 }
    }
  ],
  "proxyConfiguration": null
}
```
Params
| Field | Type | Description | Default / empty behavior |
|---|---|---|---|
| `urls` | array<object> | Required list of seed website URLs to shallow-crawl. Each unique email found for a seed becomes its own dataset row. | Must contain at least 1 item. |
| `urls[*].url` | string | Website URL to crawl for contact emails. | Required for each seed. |
| `urls[*].userData` | object | Optional per-URL settings container. | Omit it when you do not need per-seed overrides. |
| `urls[*].userData.maxNbEmailsToScrape` | integer | Optional per-URL cap for emitted email rows from that seed. | Empty or `0` keeps all deduplicated emails for that seed. |
| `proxyConfiguration` | object or null | Apify proxy settings used for the seed URLs and their shallow internal links. | Defaults to Apify Proxy with the US country; pass `null` for direct mode. |
Important
- Malformed `urls[*].url` entries are skipped with a warning before crawl. Remaining valid seeds still run in the same actor run.
- Duplicate normalized URLs are crawled once. If duplicates disagree on `Max emails to scrape`, the first valid entry wins and later conflicting duplicates are skipped with a warning.
- This actor always crawls for emails, so there is no separate scrape-emails toggle or public concurrency field to configure.
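The duplicate-handling rules above can be pictured as a single normalization pass. A sketch under assumptions: the `normalize` rule here (lowercasing and stripping a trailing slash) is my guess at what "normalized" means, and the function names are mine:

```python
def normalize(url):
    # Assumed normalization: lowercase and drop a trailing slash.
    return url.rstrip("/").lower()

def dedupe_seeds(entries):
    """Crawl each normalized URL once; when duplicates disagree on the
    per-URL cap, the first valid entry wins and later conflicting
    duplicates are skipped with a warning."""
    seen = {}
    warnings = []
    for entry in entries:
        key = normalize(entry["url"])
        if key in seen:
            if entry.get("userData") != seen[key].get("userData"):
                warnings.append(f"skipped conflicting duplicate: {entry['url']}")
            continue
        seen[key] = entry
    return list(seen.values()), warnings

seeds, warnings = dedupe_seeds([
    {"url": "https://apify.com/"},
    {"url": "https://APIFY.com", "userData": {"maxNbEmailsToScrape": 5}},
])
print(len(seeds), warnings)  # one seed kept, one conflict warning
```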
🔍 Error handling
- Malformed seed entries are skipped with a warning that points to the bad `urls[*].url` path, while the remaining valid URLs continue.
- Invalid `urls[*].userData.maxNbEmailsToScrape` values skip only the affected seed and explain that the value must be `0` or a positive integer.
- Duplicate normalized URLs with conflicting per-URL caps keep the first valid entry and warn about the skipped duplicates.
- `403` and `404` responses are ignored for extraction purposes, so those pages simply emit no email rows and do not fail the whole run.
- If cleanup leaves no valid seed URLs, the actor finishes successfully with zero dataset items and a concise warning.
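The validation rule for the per-seed cap ("`0` or a positive integer") is worth pre-checking before you submit a run. A sketch; the `valid_cap` name is hypothetical:

```python
def valid_cap(value):
    """A cap is valid when it is omitted, 0, or a positive integer;
    an invalid value would skip only the affected seed."""
    if value is None:
        return True
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool) and value >= 0

print([valid_cap(v) for v in [None, 0, 3, -1, "3", 2.5]])
# → [True, True, True, False, False, False]
```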
🆘 Support
For issues, questions, or feature requests, file a ticket and I'll fix or implement it in less than 24h 🫡
🔗 Other actors
- Product Hunt Scraper to source startup and product-launch sites before you crawl them for emails.
- Tiny Startups Scraper to pull startup homepages you can enrich for contact emails next.
- TinySeed Scraper to export portfolio-company sites, descriptions, and optional emails in one run.
- Uneed Scraper to collect fresh tool sites plus maker links before deeper outreach enrichment.
- Twitter Scraper to find accounts, posts, and company sites worth sending through this email crawler.
- Reddit Scraper to find communities and companies worth enriching with website-email discovery.
Made with ❤️ by Maxime Dupré