# Website Email, Phone Number, Social Links & Link Scraper
Scrape emails, phone numbers, social links, internal links, external links, images, and files from websites page-wise with low-cost HTTP crawling.
Developer: Anas Nadeem (Maintained by Community)
Low-cost Apify Actor to scrape websites page-wise and extract:
- emails
- phone numbers
- social links
- internal/external links
- image URLs
- file URLs (PDF, DOCX, CSV, ZIP, etc.)
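Extraction of this kind is typically regex- and DOM-based. Below is a minimal illustrative sketch of pulling emails and file links out of raw HTML; it is not the Actor's actual implementation, and the regexes are deliberately simplified:

```javascript
// Illustrative only — not the Actor's real extraction logic.
// Simplified patterns: a loose email regex and a short file-extension list.
const EMAIL_RE = /[a-z0-9._%+-]+@[a-z0-9.-]+\.[a-z]{2,}/gi;
const FILE_EXT_RE = /\.(pdf|docx|csv|zip)(\?|#|$)/i;

function extractFromHtml(html) {
  // Dedupe emails case-insensitively.
  const emails = [...new Set((html.match(EMAIL_RE) ?? []).map((e) => e.toLowerCase()))];
  // Collect href targets, then keep only ones that look like file downloads.
  const links = [...html.matchAll(/href="([^"]+)"/g)].map((m) => m[1]);
  const files = links.filter((url) => FILE_EXT_RE.test(url));
  return { emails, files };
}
```

A production extractor would also need to handle obfuscated emails, relative URL resolution, and `Content-Type`-based file detection.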
It supports two output modes:

- `full` – one dataset item per crawled page with complete extracted data
- `emails_only` – one dataset item per unique email (compatible with common email-only workflows)
Each run also emits:

- one `seed_summary` row per input seed URL
- one `run_summary` row with global totals
## Why this Actor
- HTTP-first crawling (`crawlerType: "http"`) for cheaper runs
- optional browser mode (`crawlerType: "browser"`) for JS-heavy pages
- shallow crawling with `maxDepth` and `maxPages` to keep spend predictable
- include/exclude URL globs for crawl control
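The include/exclude glob behavior can be sketched roughly as follows. This is assumed semantics, not the Actor's actual matcher: a URL is crawled if it matches at least one include glob (or none are given) and matches no exclude glob, and only `*` wildcards are handled:

```javascript
// Simplified glob matcher — assumed semantics, supports only "*" wildcards.
function globToRegExp(glob) {
  // Escape regex metacharacters, then turn "*" into ".*".
  const escaped = glob.replace(/[.+?^${}()|[\]\\]/g, "\\$&");
  return new RegExp("^" + escaped.replace(/\*/g, ".*") + "$");
}

function shouldCrawl(url, includeGlobs = [], excludeGlobs = []) {
  const included =
    includeGlobs.length === 0 ||
    includeGlobs.some((g) => globToRegExp(g).test(url));
  const excluded = excludeGlobs.some((g) => globToRegExp(g).test(url));
  return included && !excluded;
}
```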
## Input
Core fields:
- `startUrls` (required)
- `mode` (`full` | `emails_only`)
- `crawlerType` (`http` | `browser`)
- `maxPages`, `maxDepth`
- `sameDomainOnly`, `includeSubdomains`
- `includeUrlGlobs`, `excludeUrlGlobs`
- `extractEmails`, `extractPhones`, `extractSocial`, `extractImages`, `extractFiles`, `extractLinks`
See `.actor/input_schema.json` for the full schema.
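A plausible input combining the fields above might look like this (all values are illustrative, not defaults):

```json
{
  "startUrls": [{ "url": "https://example.com/" }],
  "mode": "full",
  "crawlerType": "http",
  "maxPages": 50,
  "maxDepth": 2,
  "sameDomainOnly": true,
  "includeSubdomains": false,
  "excludeUrlGlobs": ["*/admin/*"],
  "extractEmails": true,
  "extractPhones": true,
  "extractSocial": true,
  "extractImages": false,
  "extractFiles": false,
  "extractLinks": true
}
```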
## Output examples
### Full mode row
```json
{
  "recordType": "page",
  "seedUrl": "https://example.com/",
  "pageUrl": "https://example.com/contact",
  "depth": 1,
  "statusCode": 200,
  "title": "Contact",
  "emails": ["hello@example.com"],
  "phoneNumbers": ["+12025550123"],
  "socialLinks": {
    "linkedin": [],
    "x": [],
    "facebook": [],
    "instagram": [],
    "youtube": [],
    "tiktok": [],
    "threads": [],
    "telegram": [],
    "whatsapp": [],
    "discord": [],
    "pinterest": [],
    "reddit": [],
    "github": []
  },
  "internalLinks": [],
  "externalLinks": [],
  "images": [],
  "files": [],
  "counts": {
    "emails": 1,
    "phoneNumbers": 1,
    "socialLinks": 0,
    "internalLinks": 0,
    "externalLinks": 0,
    "images": 0,
    "files": 0
  }
}
```
### Emails-only row
```json
{
  "recordType": "email",
  "seedUrl": "https://example.com/",
  "url": "https://example.com/contact",
  "email": "hello@example.com",
  "depth": 1,
  "statusCode": 200
}
```

### Seed summary row

```json
{
  "recordType": "seed_summary",
  "seedUrl": "https://example.com/",
  "mode": "full",
  "crawlerType": "http",
  "pagesCrawled": 14,
  "failedRequests": 1,
  "uniqueEmails": 6,
  "statusCodeHistogram": { "200": 13, "404": 1 },
  "totals": {
    "emails": 9,
    "phoneNumbers": 4,
    "socialLinks": 11,
    "internalLinks": 173,
    "externalLinks": 42,
    "images": 88,
    "files": 7
  }
}
```
### Run summary row
```json
{
  "recordType": "run_summary",
  "mode": "full",
  "crawlerType": "http",
  "seedsTotal": 3,
  "pagesCrawled": 39,
  "failedRequests": 2,
  "uniqueEmails": 15,
  "totals": {
    "emails": 24,
    "phoneNumbers": 11,
    "socialLinks": 29,
    "internalLinks": 513,
    "externalLinks": 126,
    "images": 244,
    "files": 19
  },
  "durationMs": 5841
}
```
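Downstream code can split the dataset by `recordType`. A minimal sketch, assuming items shaped like the rows above, that counts `page` rows and dedupes their emails:

```javascript
// Sketch of post-processing dataset items (assumed shape: rows as in the
// examples above). Splits by recordType and dedupes emails across pages.
function summarizeItems(items) {
  const pages = items.filter((it) => it.recordType === "page");
  const emails = new Set();
  for (const page of pages) {
    // Emails are compared case-insensitively.
    for (const email of page.emails ?? []) emails.add(email.toLowerCase());
  }
  return { pagesCrawled: pages.length, uniqueEmails: [...emails] };
}
```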
## Local development

```bash
npm install
npm run build
npm run dev
npm test
```
## Notes
- Invalid start URLs are skipped with a warning.
- 4xx/5xx pages may produce no data but do not crash the whole run.
- Use `sameDomainOnly: true` for cost-efficient, controlled crawls.
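For intuition, same-domain filtering can be sketched with hostname comparison via the standard `URL` API. This is assumed semantics for `sameDomainOnly`/`includeSubdomains`, not the Actor's actual code:

```javascript
// Sketch: is candidateUrl on the seed's domain? With includeSubdomains,
// hosts like blog.example.com also pass for a seed on example.com.
function isSameDomain(seedUrl, candidateUrl, includeSubdomains = false) {
  const seedHost = new URL(seedUrl).hostname;
  const candHost = new URL(candidateUrl).hostname;
  if (candHost === seedHost) return true;
  return includeSubdomains && candHost.endsWith("." + seedHost);
}
```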