Website Contact Scraper | Enterprise-Grade | $12 / mo
Pricing
$11.99/month + usage
The most reliable, enterprise-grade email scraper for any website. Uses a unique hybrid search (fast check + deep scan) to find public emails with an industry-leading success rate. Perfect for building high-quality lead lists for sales, marketing, and outreach.
Rating: 5.0 (2 reviews)
Developer: Fatih Tahta
Actor stats: 4 bookmarked · 106 total users · 12 monthly active users · last modified a day ago
Enterprise Grade Website Contact Scraper
Slug: fatihtahta/email-scraper-deep
Overview
Enterprise Grade Website Contact Scraper collects structured contact signals from public websites, including email addresses, phone numbers, social profile links, and run-level metadata for each processed URL. It is designed for teams that need consistent, repeatable contact discovery across many domains without manual browsing. Each run produces normalized JSON records that are easy to filter, deduplicate, and load into analytics or CRM workflows. By automating collection in bulk, the actor reduces repetitive research effort and shortens time-to-insight for operational teams. The output format is stable and practical for both business users and technical pipelines.
Why Use This Actor
- Market research and analytics teams: Build contact coverage benchmarks across companies, regions, or segments, then track changes in available public contact channels over time.
- Product and content teams: Identify how organizations present customer contact options and social presence, with examples across service pages, footer links, and brand channels.
- Developers and data engineers: Ingest standardized JSON records into ETL jobs, enrichment pipelines, and internal data stores with minimal transformation.
- Lead generation and enrichment teams: Expand existing account lists with publicly listed email, phone, and social data to improve outbound preparation and account context.
- Monitoring and competitive tracking teams: Re-run target URL sets on a schedule to detect contact-surface updates and profile link changes.
Input Parameters
Provide any combination of URLs, queries, and filters to control what the actor processes.
| Parameter | Type | Description | Default |
|---|---|---|---|
| baseUrls | string[] | List of public website URLs to process. Each URL must be a publicly accessible http or https address. The actor analyzes each URL and returns available contact signals such as emails, phone numbers, and social media profiles. | – |
Example Input
```json
{
  "baseUrls": [
    "https://example-consulting.com",
    "https://northwindlogistics.com",
    "https://contosohealth.org"
  ]
}
```
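Before submitting a run, it can help to validate and deduplicate the URL list so that only public http/https addresses reach `baseUrls`. A minimal sketch using only the Python standard library (the helper name `valid_base_urls` is illustrative, not part of the actor):

```python
from urllib.parse import urlparse

def valid_base_urls(urls):
    """Keep only absolute http(s) URLs with a hostname, deduplicated in input order."""
    seen, out = set(), []
    for u in urls:
        s = u.strip()
        p = urlparse(s)
        if p.scheme in ("http", "https") and p.netloc and s not in seen:
            seen.add(s)
            out.append(s)
    return out

run_input = {"baseUrls": valid_base_urls([
    "https://example-consulting.com",
    "https://example-consulting.com",  # duplicate, dropped
    "ftp://not-supported.example",     # non-http scheme, dropped
])}
```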
Output
Output destination
The actor writes results to an Apify dataset as JSON records. The dataset is designed for direct consumption by analytics tools, ETL pipelines, and downstream APIs without post-processing.
Record envelope
Each record contains a stable envelope for tracking and pipeline reliability:
- input_url (string, required): URL submitted in the actor input.
- source_url (string, optional): Final processed source URL returned in the result.
- trace_id (string, optional): Unique run trace value for diagnostics and lineage.
- status (string, required): Processing outcome (for example, `ok` or `request_error`).
Recommended idempotency key: `input_url + ":" + trace_id`
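The envelope fields above can be combined into the recommended idempotency key. A small sketch (the function name is illustrative; `trace_id` is optional, so the key falls back to the URL alone when it is absent — an assumption on our part):

```python
def idempotency_key(record):
    """Build the recommended dedup key: input_url + ":" + trace_id.
    trace_id is optional, so fall back to the input URL alone."""
    trace = record.get("trace_id")
    return f'{record["input_url"]}:{trace}' if trace else record["input_url"]

key = idempotency_key({"input_url": "https://example.com", "trace_id": "abc-123"})
```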
Examples
Example: website contact record
```json
{
  "input_url": "https://example-enterprise.com",
  "source_url": "https://www.example-enterprise.com/",
  "trace_id": "a3d3f2fa-5d70-467e-b0eb-6ea7a2b1e55c",
  "status": "ok",
  "http_status": 200,
  "response_ms": 412,
  "bytes_read": 245890,
  "emails": ["contact@example-enterprise.com", "partnerships@example-enterprise.com"],
  "phone_numbers": ["+1-415-555-0198", "+1-415-555-0114"],
  "social_media": {
    "facebook": "https://www.facebook.com/exampleenterprise",
    "instagram": "https://www.instagram.com/exampleenterprise",
    "github": "https://github.com/exampleenterprise",
    "google_maps": "https://maps.google.com/?q=500+Market+Street+San+Francisco+CA"
  },
  "raw_response": {
    "status": "ok",
    "http_status": 200,
    "source_url": "https://www.example-enterprise.com/",
    "response_ms": 412,
    "bytes_read": 245890,
    "email": { "values": ["contact@example-enterprise.com", "partnerships@example-enterprise.com"] },
    "phone_number": { "values": ["+1-415-555-0198", "+1-415-555-0114"] },
    "social_media": {
      "facebook": "https://www.facebook.com/exampleenterprise",
      "instagram": "https://www.instagram.com/exampleenterprise",
      "github": "https://github.com/exampleenterprise",
      "google_maps": "https://maps.google.com/?q=500+Market+Street+San+Francisco+CA"
    },
    "trace_id": "a3d3f2fa-5d70-467e-b0eb-6ea7a2b1e55c"
  },
  "api_http_status": 200
}
```
Field reference
- input_url (string, required): URL provided in actor input.
- source_url (string, optional): Resolved source URL used for extraction.
- trace_id (string, optional): Unique trace identifier for a processed record.
- status (string, required): High-level processing status.
- http_status (number, optional): HTTP status observed for the processed page.
- response_ms (number, optional): Response time in milliseconds.
- bytes_read (number, optional): Total response payload size in bytes.
- emails (array[string], optional): Discovered email addresses.
- phone_numbers (array[string], optional): Discovered phone numbers.
- social_media (object, optional): Discovered social/profile links.
- social_media.facebook (string, optional): Facebook profile or page URL.
- social_media.instagram (string, optional): Instagram profile URL.
- social_media.linkedin (string, optional): LinkedIn page or profile URL.
- social_media.twitter (string, optional): X/Twitter profile URL.
- social_media.youtube (string, optional): YouTube channel URL.
- social_media.github (string, optional): GitHub profile or organization URL.
- social_media.google_maps (string, optional): Google Maps location link.
- raw_response (object, optional): Original structured response block for auditing.
- raw_response.status (string, optional): Status value from the source response.
- raw_response.http_status (number, optional): HTTP status value from the source response.
- raw_response.source_url (string, optional): Source URL in raw payload.
- raw_response.response_ms (number, optional): Response time in raw payload.
- raw_response.bytes_read (number, optional): Payload size in raw payload.
- raw_response.email.values (array[string], optional): Email values in raw payload.
- raw_response.phone_number.values (array[string], optional): Phone values in raw payload.
- raw_response.social_media.facebook (string, optional): Facebook link in raw payload.
- raw_response.social_media.instagram (string, optional): Instagram link in raw payload.
- raw_response.social_media.linkedin (string, optional): LinkedIn link in raw payload.
- raw_response.social_media.twitter (string, optional): X/Twitter link in raw payload.
- raw_response.social_media.youtube (string, optional): YouTube link in raw payload.
- raw_response.social_media.github (string, optional): GitHub link in raw payload.
- raw_response.social_media.google_maps (string, optional): Google Maps link in raw payload.
- raw_response.trace_id (string, optional): Trace ID in raw payload.
- api_http_status (number, optional): HTTP status returned by the upstream API call.
- error (string, optional): Error description when processing fails.
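Because most fields in the reference above are optional, downstream code should access them defensively. A hedged sketch that flattens one record into a CSV/warehouse-friendly row (the helper name and the choice of `;` as a list separator are assumptions, not part of the actor's output contract):

```python
def flatten_record(rec):
    """Flatten one dataset record into a flat row; every optional field is
    accessed defensively because extraction is best-effort."""
    social = rec.get("social_media") or {}
    return {
        "input_url": rec["input_url"],
        "status": rec.get("status"),
        "emails": ";".join(rec.get("emails") or []),
        "phone_numbers": ";".join(rec.get("phone_numbers") or []),
        "facebook": social.get("facebook"),
        "linkedin": social.get("linkedin"),
    }

row = flatten_record({
    "input_url": "https://example-enterprise.com",
    "status": "ok",
    "emails": ["contact@example-enterprise.com"],
    "social_media": {"facebook": "https://www.facebook.com/exampleenterprise"},
})
```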
Data guarantees & handling
- Best-effort extraction: available fields can vary with site layout, region, session, and experiments on the target page.
- Optional fields: treat every optional field as potentially missing or null, and check before use in downstream code.
- Deduplication: deduplicate records on the recommended idempotency key, `input_url + ":" + trace_id`.
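The deduplication guidance above can be applied across runs with a simple first-seen filter. A sketch under the assumption that records carry the envelope fields described earlier (the `dedupe` helper is illustrative):

```python
def dedupe(records):
    """Drop duplicate records using input_url + ":" + trace_id, keeping the first seen."""
    seen, unique = set(), []
    for rec in records:
        key = f'{rec["input_url"]}:{rec.get("trace_id", "")}'
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique

records = [
    {"input_url": "https://a.com", "trace_id": "t1", "status": "ok"},
    {"input_url": "https://a.com", "trace_id": "t1", "status": "ok"},  # duplicate
    {"input_url": "https://a.com", "trace_id": "t2", "status": "ok"},  # new run, kept
]
unique = dedupe(records)
```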
How to Run on Apify
- Open the Actor in Apify Console.
- Configure your search parameters (for this actor, provide one or more website URLs in `baseUrls`).
- Set the maximum number of outputs to collect.
- Click Start and wait for the run to finish.
- Download results in JSON, CSV, Excel, or other supported formats.
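Runs can also be triggered programmatically. A hedged sketch using the `apify-client` Python package (install with `pip install apify-client`); the actor slug comes from this page, while the `APIFY_TOKEN` environment variable and the `build_run_input` helper are assumptions for illustration:

```python
import os

def build_run_input(urls):
    """Construct the actor input; baseUrls is the only required parameter."""
    return {"baseUrls": list(urls)}

run_input = build_run_input(["https://example-consulting.com"])

try:
    from apify_client import ApifyClient  # pip install apify-client
except ImportError:
    ApifyClient = None  # client not installed; skip the API call below

token = os.environ.get("APIFY_TOKEN")
if token and ApifyClient:
    client = ApifyClient(token)
    # Slug taken from this actor's page; call() waits for the run to finish.
    run = client.actor("fatihtahta/email-scraper-deep").call(run_input=run_input)
    for item in client.dataset(run["defaultDatasetId"]).iterate_items():
        print(item.get("input_url"), item.get("emails"))
```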
Scheduling & Automation
Automated Data Collection
You can schedule recurring runs to keep contact datasets fresh without manual execution. Scheduled runs are useful for ongoing monitoring, enrichment refreshes, and regular reporting cycles.
- Navigate to Schedules in Apify Console
- Create a new schedule (daily, weekly, or custom cron)
- Configure input parameters
- Enable notifications for run completion
- Add webhooks for automated processing
Integration Options
- Webhooks: Trigger downstream actions when a run completes.
- Zapier: Connect to 5,000+ apps without coding.
- Make (Integromat): Build multi-step automation workflows.
- Google Sheets: Export results to a spreadsheet.
- Slack/Discord: Receive notifications and summaries.
- Email: Send automated reports via email.
Performance
Estimated run times:
- Small runs (< 1,000 outputs): ~2–3 minutes
- Medium runs (1,000–5,000 outputs): ~5–15 minutes
- Large runs (5,000+ outputs): ~15–30 minutes
Execution time varies based on filters, result volume, and how much information is returned per record.
Compliance & Ethics
Responsible Data Collection
This actor collects publicly available website contact and social profile information from public business websites for legitimate business purposes, including:
- market intelligence research and market analysis
- contact data enrichment for internal business workflows
- directory and coverage monitoring for operational reporting
This section is informational and not legal advice.
Best Practices
- Use collected data in accordance with applicable laws, regulations, and the target site’s terms.
- Respect individual privacy and personal information.
- Use data responsibly and avoid disruptive or excessive collection.
- Do not use this actor for spamming, harassment, or other harmful purposes.
- Follow relevant data protection requirements where applicable (e.g., GDPR, CCPA).
Support
If you need help, open an issue on the actor’s Issues tab or contact via the actor page in Apify Console. Include the input used (redacted), the run ID, expected vs. actual behavior, and an optional small output sample so issues can be reproduced quickly.