Website Contact Extractor

Pricing

Pay per usage

Extract emails, phone numbers, and social media links from any website. Perfect for lead generation, sales prospecting, and contact discovery.

Developer

Praveen Kumar

Maintained by Community

An Apify Actor that crawls websites and extracts contact information including email addresses, phone numbers, and social media profile links.

Built with Crawlee and Cheerio for fast, lightweight HTML parsing.

Features

  • Email extraction — Finds email addresses in page text and mailto: links, with filtering for common false positives (image/asset file extensions).
  • Phone number extraction — Detects Indian phone numbers (+91 format) and international numbers via tel: links and regex patterns.
  • Social media links — Extracts profile URLs from Facebook, Twitter/X, LinkedIn, Instagram, and YouTube.
  • Smart crawling — Follows same-domain links to discover contact pages, about pages, and footers automatically.
  • Proxy support — Rotates IP addresses via Apify Proxy to avoid blocking.
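The same-domain crawling and social link features above can be sketched as one link-classification step. This is a minimal illustration, not the Actor's actual code: `SOCIAL_HOSTS`, `hostMatches`, and `classifyLinks` are hypothetical names, the host list is inferred from the networks named above, and "same domain" is assumed to mean an exact hostname match.

```javascript
// Hostnames treated as social networks (assumed list, based on the features above).
const SOCIAL_HOSTS = ['facebook.com', 'twitter.com', 'x.com', 'linkedin.com', 'instagram.com', 'youtube.com'];

// True when `hostname` equals `domain` or is a subdomain of it.
function hostMatches(hostname, domain) {
    return hostname === domain || hostname.endsWith(`.${domain}`);
}

// Split the hrefs found on a page into same-domain links (crawl candidates)
// and social profile links. Relative hrefs are resolved against pageUrl.
function classifyLinks(hrefs, pageUrl) {
    const base = new URL(pageUrl);
    const internal = new Set();
    const social = new Set();
    for (const href of hrefs) {
        let url;
        try {
            url = new URL(href, base);
        } catch {
            continue; // skip malformed hrefs
        }
        // Ignore mailto:, tel:, javascript: and other non-HTTP schemes here.
        if (url.protocol !== 'http:' && url.protocol !== 'https:') continue;
        if (SOCIAL_HOSTS.some((domain) => hostMatches(url.hostname, domain))) {
            social.add(url.href);
        } else if (url.hostname === base.hostname) {
            url.hash = ''; // fragments point at the same page, so drop them
            internal.add(url.href);
        }
    }
    return { internal: [...internal], social: [...social] };
}
```

The exact hostname comparison means `www.example.com` and `example.com` count as different sites in this sketch; a real crawler may choose a looser rule.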

Input

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| startUrls | array | required | List of website URLs to crawl |
| maxPages | integer | 10 | Maximum number of pages to crawl per run |
| extractEmails | boolean | true | Extract email addresses |
| extractPhones | boolean | true | Extract phone numbers |
| extractSocials | boolean | true | Extract social media links |

Example input

{
    "startUrls": [{ "url": "https://example.com" }],
    "maxPages": 20,
    "extractEmails": true,
    "extractPhones": true,
    "extractSocials": true
}
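To illustrate how the documented defaults combine with a partial input, here is a small sketch. `DEFAULTS` and `normalizeInput` are hypothetical names used only for this example; in the Actor itself, validation and defaults would normally come from `input_schema.json`.

```javascript
// Defaults taken from the input table above.
const DEFAULTS = {
    maxPages: 10,
    extractEmails: true,
    extractPhones: true,
    extractSocials: true,
};

// Validate the required field and fill in defaults for everything else.
function normalizeInput(input) {
    const raw = input ?? {};
    if (!Array.isArray(raw.startUrls) || raw.startUrls.length === 0) {
        throw new Error('startUrls is required and must be a non-empty array');
    }
    // Drop undefined values so they do not override the defaults.
    const provided = Object.fromEntries(
        Object.entries(raw).filter(([, value]) => value !== undefined),
    );
    return { ...DEFAULTS, ...provided };
}
```

With the example input, `maxPages` stays at 20 and the three `extract*` flags stay `true`; omitting them would give the same flags and `maxPages: 10`.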

Output

Each result in the dataset represents a page where contacts were found:

{
    "url": "https://example.com/contact",
    "title": "Contact Us - Example",
    "emails": ["info@example.com", "support@example.com"],
    "phones": ["+911234567890", "+14155551234"],
    "socialLinks": [
        "https://facebook.com/example",
        "https://twitter.com/example",
        "https://linkedin.com/company/example"
    ]
}

Only pages with at least one contact item are saved to keep the dataset clean.
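The "save only when something was found" rule might look like the sketch below. `buildRecord` is a hypothetical helper, not a function from the Actor; it returns `null` for pages with no contacts so the caller can skip the dataset write.

```javascript
// Build a dataset record in the output shape shown above,
// or return null when the page yielded no contact items.
function buildRecord(url, title, { emails = [], phones = [], socialLinks = [] } = {}) {
    const total = emails.length + phones.length + socialLinks.length;
    if (total === 0) return null; // nothing found: do not save this page
    return { url, title, emails, phones, socialLinks };
}

// Usage inside a request handler (assumed context):
// const record = buildRecord(pageUrl, pageTitle, found);
// if (record) await Actor.pushData(record);
```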

Usage

Run locally

apify run -p

Deploy to Apify

apify login
apify push

Use via API

curl "https://api.apify.com/v2/acts/<YOUR_ACTOR_ID>/runs" \
  -X POST \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <YOUR_API_TOKEN>" \
  -d '{
    "startUrls": [{ "url": "https://example.com" }],
    "maxPages": 10
  }'
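The same request can be prepared from JavaScript. The sketch below only builds the URL and options that mirror the curl call; `buildRunRequest` is a hypothetical helper, and sending the request with the global `fetch` assumes Node.js 18 or later.

```javascript
// Build the URL and fetch options equivalent to the curl command above.
function buildRunRequest(actorId, token, input) {
    return {
        url: `https://api.apify.com/v2/acts/${encodeURIComponent(actorId)}/runs`,
        options: {
            method: 'POST',
            headers: {
                'Content-Type': 'application/json',
                Authorization: `Bearer ${token}`,
            },
            body: JSON.stringify(input), // the request body is the Actor input
        },
    };
}

// Usage (performs a real network call, so it is left commented out):
// const { url, options } = buildRunRequest('<YOUR_ACTOR_ID>', '<YOUR_API_TOKEN>', {
//     startUrls: [{ url: 'https://example.com' }],
//     maxPages: 10,
// });
// const response = await fetch(url, options);
// console.log(response.status);
```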

Regex Patterns

| Type | Pattern |
| --- | --- |
| Emails | `[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}` |
| Indian phones | `(\+91[\-\s]?)?[0]?(91)?[789]\d{9}` |
| International phones | `\+?[\d\s\-\(\)]{10,}` |
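As a rough sketch of how these patterns could be applied to page text: the three regexes below are copied from the table, while `ASSET_EXT_RE`, the normalization to digits and `+`, the 10-digit minimum, and the function names are assumptions made for this example rather than the Actor's confirmed behavior.

```javascript
// Patterns from the table above, with the global flag to collect all matches.
const EMAIL_RE = /[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}/g;
const INDIAN_PHONE_RE = /(\+91[\-\s]?)?[0]?(91)?[789]\d{9}/g;
const INTL_PHONE_RE = /\+?[\d\s\-\(\)]{10,}/g;

// Assumed list of asset extensions that produce email-like false positives,
// for example "logo@2x.png".
const ASSET_EXT_RE = /\.(png|jpe?g|gif|svg|webp|css|js)$/i;

// Return unique email addresses, dropping asset file names.
function extractEmails(text) {
    const matches = text.match(EMAIL_RE) ?? [];
    return [...new Set(matches)].filter((email) => !ASSET_EXT_RE.test(email));
}

// Return unique phone numbers reduced to digits and a leading "+".
function extractPhones(text) {
    const matches = [
        ...(text.match(INDIAN_PHONE_RE) ?? []),
        ...(text.match(INTL_PHONE_RE) ?? []),
    ];
    const normalized = matches
        .map((match) => match.replace(/[^\d+]/g, ''))
        // The international pattern also matches runs of spaces or brackets,
        // so keep only candidates with at least 10 digits (assumed safeguard).
        .filter((phone) => phone.replace(/\D/g, '').length >= 10);
    return [...new Set(normalized)];
}
```

Because both phone patterns can match the same number, deduplicating after normalization keeps a single entry per number.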

Project Structure

.actor/
├── actor.json # Actor configuration and metadata
├── input_schema.json # Input validation and Apify Console form
├── dataset_schema.json # Output dataset structure
└── output_schema.json # Output storage reference
src/
└── main.js # Crawler and extraction logic
Dockerfile # Container image definition
