Email, Phone, & Social Media Scraper

Pricing

$7.99 / 1,000 results


Developed by

Intelecta.ai

Maintained by Community

A powerful Apify actor that scrapes emails, phone numbers, and social media profiles from a list of websites, following internal links for thorough contact extraction. Ideal for lead generation, research, and building structured contact databases.

Deep Web Scraper: Email, Phone, and Social Media Extractor

A robust web scraping actor for the Apify platform that extracts contact information—emails, phone numbers, and social media profiles—from a list of websites. This tool is ideal for lead generation, market research, and building contact databases.

What This Actor Does

  • Bulk Website Processing: Scrape multiple websites in a single run by providing a list of URLs.
  • Contact Extraction: Finds emails, phone numbers, and social media links from each website.
  • Intelligent Crawling: Follows internal links (contact, about, team pages, etc.) up to a configurable depth for more thorough extraction.
  • Duplicate Removal: Ensures only unique contact information is collected.
  • Flat Structured Output: Each contact found (email, phone, or social) is output as a separate record, not grouped by URL.
  • Error Handling: Gracefully skips failed requests and logs errors.
  • Proxy Support: Integrates with Apify's proxy services for reliable scraping.
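
The duplicate removal and flat output described above can be pictured with a small Python sketch. This is not the actor's source code: the function name `to_flat_records` and the rule that two contacts are duplicates when their type and lower-cased value match are assumptions made for illustration.

```python
from __future__ import annotations

from typing import Iterable


def to_flat_records(found: Iterable[tuple[str, str, str]]) -> list[dict]:
    """Turn (type, value, source_url) tuples into unique flat records.

    Assumption: a (type, lower-cased value) pair identifies a contact;
    the actor's real uniqueness rule is not documented in this README.
    """
    seen: set[tuple[str, str]] = set()
    records: list[dict] = []
    for contact_type, value, source_url in found:
        cleaned = value.strip()
        key = (contact_type, cleaned.lower())
        if key in seen:
            continue  # already emitted once, skip the duplicate
        seen.add(key)
        records.append(
            {"type": contact_type, "value": cleaned, "sourceUrl": source_url}
        )
    return records
```

Each returned dictionary is one record, so an email and a phone number found on the same site become two separate entries rather than one grouped object.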

How It Works

  1. Input:
    • Provide a list of website URLs (start_urls) and set the crawl depth (max_crawl_depth).
    • Example INPUT.json:
      {
        "start_urls": [
          "https://example.com",
          "https://another.com"
        ],
        "max_crawl_depth": 1
      }
  2. Crawling:
    • For each URL, the actor fetches the main page and follows internal links up to the specified depth, prioritizing pages likely to contain contact info.
  3. Extraction:
    • Extracts emails (using regex and mailto links), phone numbers (international and local formats), and social media links (from the platforms listed under Supported Social Media Platforms).
  4. Output:
    • For each contact found, the actor outputs a separate record with the following fields:
      • type: "email", "phone", or "social"
      • value: The email address, phone number, or social handle/username
      • sourceUrl: The root URL of the website on which the contact was found
      • platform: (only for social) The social media platform name (e.g., "Instagram")
      • url: (only for social) The full profile URL
    • Example output (email):
      {
        "type": "email",
        "value": "info@example.com",
        "sourceUrl": "https://example.com/"
      }
    • Example output (social):
      {
        "type": "social",
        "value": "bbcafrica",
        "sourceUrl": "https://www.bbc.com/",
        "platform": "Instagram",
        "url": "https://www.instagram.com/bbcafrica/"
      }
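
The crawling step can be sketched roughly as follows, using only the Python standard library. The names `select_internal_links`, `crawl`, and `PRIORITY_HINTS`, the keyword list, and the breadth-first order are all illustrative assumptions; the README does not publish the actor's actual prioritization logic. The `fetch` argument is any callable that maps a URL to an HTML string, which keeps the sketch independent of a particular HTTP client.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

# Hypothetical hints for pages likely to hold contact details.
PRIORITY_HINTS = ("contact", "about", "team", "impressum")


class _LinkCollector(HTMLParser):
    """Collects raw href values from <a> tags."""

    def __init__(self) -> None:
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def select_internal_links(html: str, page_url: str) -> list:
    """Return unique same-host links, contact-like pages first."""
    collector = _LinkCollector()
    collector.feed(html)
    host = urlparse(page_url).netloc
    links = []
    for href in collector.hrefs:
        absolute = urljoin(page_url, href).split("#")[0]
        parsed = urlparse(absolute)
        if parsed.scheme in ("http", "https") and parsed.netloc == host:
            if absolute not in links:
                links.append(absolute)
    # sorted() is stable; a priority hit yields False, which sorts first.
    return sorted(
        links, key=lambda u: not any(h in u.lower() for h in PRIORITY_HINTS)
    )


def crawl(start_url: str, max_crawl_depth: int, fetch) -> list:
    """Breadth-first visit of internal pages up to max_crawl_depth."""
    visited, frontier = [], [(start_url, 0)]
    while frontier:
        url, depth = frontier.pop(0)
        if url in visited:
            continue
        visited.append(url)
        try:
            html = fetch(url)
        except Exception:
            continue  # skip failed requests instead of aborting the run
        if depth < max_crawl_depth:
            frontier += [(u, depth + 1) for u in select_internal_links(html, url)]
    return visited
```

With `max_crawl_depth` set to 1, this sketch visits the start page and the pages it links to directly, but does not follow links found on those second-level pages.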

Supported Social Media Platforms

  • Facebook
  • Instagram
  • LinkedIn
  • WhatsApp
  • Telegram
  • Twitter/X
  • TikTok
  • YouTube
  • Discord
  • Pinterest
  • GitHub
  • Reddit
  • Snapchat
  • Medium
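
As a rough illustration of how a profile link could be turned into the `platform`, `value`, and `url` fields, the Python sketch below maps a link's host name to a platform and takes the last path segment as the handle. The table `PLATFORM_DOMAINS` covers only part of the list above, and both it and the function `parse_social_link` are hypothetical; the actor's real matching rules may differ.

```python
from urllib.parse import urlparse

# Hypothetical domain-to-platform table covering part of the list above.
PLATFORM_DOMAINS = {
    "facebook.com": "Facebook",
    "instagram.com": "Instagram",
    "linkedin.com": "LinkedIn",
    "twitter.com": "Twitter/X",
    "x.com": "Twitter/X",
    "tiktok.com": "TikTok",
    "github.com": "GitHub",
}


def parse_social_link(url: str, source_url: str):
    """Return a 'social' record for a recognised profile URL, else None."""
    parsed = urlparse(url)
    host = parsed.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    platform = PLATFORM_DOMAINS.get(host)
    segments = [s for s in parsed.path.split("/") if s]
    if platform is None or not segments:
        return None
    # Assumption: the handle is the last path segment, minus a leading "@".
    handle = segments[-1].lstrip("@")
    return {
        "type": "social",
        "value": handle,
        "sourceUrl": source_url,
        "platform": platform,
        "url": url,
    }
```

Calling `parse_social_link("https://www.instagram.com/bbcafrica/", "https://www.bbc.com/")` produces a record with the same fields as the social example shown in this README.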

Use Cases

  • Lead generation for sales and marketing
  • Building contact databases for outreach
  • Market and competitor research
  • Social media influencer discovery
  • Academic or business directory building

Limitations & Notes

  • No JavaScript Rendering: This actor does not use Playwright or any other browser. It reads the raw HTML only, so contacts that are injected by client-side JavaScript are not extracted.
  • Basic Anti-Bot: Uses standard headers and Apify proxy, but does not include advanced anti-bot evasion or CAPTCHA solving.
  • Regex-based Extraction: Email and phone detection is regex-based, so some valid contacts may be missed and some false positives may occur.
  • Phone Detection: Supports international and local formats, including DACH-region (Germany, Austria, Switzerland) numbers, but may not catch all regional variations.
  • No Obfuscated Email Decoding: Addresses hidden by Cloudflare email protection or similar obfuscation are not decoded.
  • No advanced scheduling or incremental crawling.
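
To make the regex-related limitations concrete, here is a minimal Python sketch of regex-based extraction. The patterns `EMAIL_RE`, `MAILTO_RE`, and `PHONE_RE`, the asset-suffix filter, and the 7 to 15 digit bound are assumptions chosen for illustration; they are not the patterns the actor uses.

```python
import re
from urllib.parse import unquote

# Illustrative patterns only; the actor's real expressions are not published.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
MAILTO_RE = re.compile(r'href=["\']mailto:([^"\'?]+)', re.IGNORECASE)
PHONE_RE = re.compile(r"\+?\d[\d\s()./-]{6,}\d")
ASSET_SUFFIXES = (".png", ".jpg", ".jpeg", ".gif", ".svg", ".webp")


def extract_emails(html: str) -> list:
    """Collect unique lower-cased emails from mailto links and page text."""
    candidates = [unquote(m) for m in MAILTO_RE.findall(html)]
    candidates += EMAIL_RE.findall(html)
    emails = []
    for candidate in candidates:
        email = candidate.strip().lower()
        if email.endswith(ASSET_SUFFIXES):
            continue  # e.g. "logo@2x.png" looks like an email to the regex
        if email not in emails:
            emails.append(email)
    return emails


def extract_phones(text: str) -> list:
    """Collect unique phone-like strings containing 7 to 15 digits."""
    phones = []
    for match in PHONE_RE.findall(text):
        digits = re.sub(r"\D", "", match)
        if 7 <= len(digits) <= 15 and match not in phones:
            phones.append(match)
    return phones
```

With these particular patterns, an image name such as `logo@2x.png` matches the email regex and has to be filtered out afterwards, and a year range such as `2019-2024` passes the phone check. Both are examples of the false positives that regex-based detection can produce.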

How to Use

  1. Deploy or run the actor on Apify.
  2. Provide your list of URLs and crawl depth in the input.
  3. Run the actor and wait for completion.
  4. Download the results from the Apify dataset in JSON, CSV, or Excel format.

Example Input

{
  "start_urls": [
    "https://example.com",
    "https://another.com"
  ],
  "max_crawl_depth": 1
}
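
When the URL list comes from a spreadsheet or a text file, the input object can be generated rather than typed by hand. The helper below is a hypothetical convenience, not part of the actor: `build_input`, the default to HTTPS for bare domains, and the rejection of negative depths are assumptions of this sketch. Only the field names `start_urls` and `max_crawl_depth` come from this README.

```python
import json
from urllib.parse import urlparse


def build_input(urls, max_crawl_depth=1):
    """Build the actor input from a loose list of URLs or bare domains."""
    if max_crawl_depth < 0:
        raise ValueError("max_crawl_depth must not be negative")
    start_urls = []
    for raw in urls:
        url = raw.strip()
        if not url:
            continue  # ignore empty lines
        if "://" not in url:
            url = "https://" + url  # assumption: bare domains use HTTPS
        if urlparse(url).netloc and url not in start_urls:
            start_urls.append(url)
    return {"start_urls": start_urls, "max_crawl_depth": max_crawl_depth}


if __name__ == "__main__":
    actor_input = build_input(["example.com", "https://another.com"])
    print(json.dumps(actor_input, indent=2))
```

Running the module prints a JSON object with the same shape as the example input above, ready to paste into the actor's input editor.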

Example Output

{
  "type": "email",
  "value": "info@example.com",
  "sourceUrl": "https://example.com/"
}

{
  "type": "social",
  "value": "bbcafrica",
  "sourceUrl": "https://www.bbc.com/",
  "platform": "Instagram",
  "url": "https://www.instagram.com/bbcafrica/"
}
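
Because the output is flat, a downstream script may want to regroup the records per website after downloading the dataset. The Python sketch below shows one possible way to do that; `group_by_source` is a hypothetical helper and assumes the records have already been loaded into a list of dictionaries, for example with `json.load` on the exported JSON file.

```python
from collections import defaultdict


def group_by_source(records):
    """Group flat contact records by sourceUrl, then by contact type."""
    grouped = defaultdict(lambda: {"email": [], "phone": [], "social": []})
    for record in records:
        bucket = grouped[record["sourceUrl"]]
        # setdefault keeps the sketch tolerant of unexpected type values.
        bucket.setdefault(record["type"], []).append(record["value"])
    return dict(grouped)
```

Applied to the two example records above, this yields one entry for `https://example.com/` with a single email and one entry for `https://www.bbc.com/` with a single social handle.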

This actor is perfect for anyone needing to automate the collection of contact information from multiple websites, with a focus on reliability, clarity, and structured results.