Extract Contacts

Pricing

$12.00/month + usage

Developed by

Maged

Maintained by Community

This Apify Actor extracts contact information from specified web pages, including email addresses, phone numbers, social media profiles, and contact-related links.

5.0 (1)

0

Total users

1

Monthly users

1

Runs succeeded

>99%

Last modified

2 days ago

Contact Information Extractor

Description

This Apify Actor extracts contact information from specified web pages, including email addresses, phone numbers, social media profiles, and contact-related links. It supports both static and dynamic content through two scraping methods: HTTP requests via httpx, or browser-based scraping via nodriver for JavaScript-rendered pages. The actor routes requests through Apify's proxy service (US-based residential proxies) to improve reliability and reduce the chance of rate-limiting. Extracted data is deduplicated and validated before it is stored.
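The choice between the two scraping methods, together with the httpx fallback mentioned in the Notes, might look roughly like the sketch below. The actor's actual selection logic is not published, so choose_backend and nodriver_available are hypothetical helper names used only for illustration.

```python
import importlib.util
from typing import Callable


def nodriver_available() -> bool:
    """Return True if the nodriver package can be found in this environment."""
    return importlib.util.find_spec("nodriver") is not None


def choose_backend(use_driver: bool,
                   is_available: Callable[[], bool] = nodriver_available) -> str:
    """Pick a backend name from the `use_driver` input flag.

    Hypothetical helper: an assumption about how the selection could work,
    not the actor's real code.
    """
    if use_driver and is_available():
        return "nodriver"  # browser-based, for JavaScript-rendered pages
    return "httpx"  # plain HTTP requests, also the fallback


if __name__ == "__main__":
    print(choose_backend(use_driver=False))
```

Passing the availability check as a parameter keeps the decision testable without installing a browser driver.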

The actor is designed for tasks such as lead generation, contact list building, or market research, providing a structured dataset with emails, phone numbers, social media handles, and relevant contact page URLs.

Input Schema

The actor accepts the following input parameters in JSON format:

{
  "urls": {
    "title": "URLs to scrape",
    "type": "array",
    "description": "List of website URLs to extract contact information (emails, phone numbers, social media profiles, and contact links) from.",
    "editor": "stringList",
    "items": {
      "type": "string",
      "format": "uri"
    }
  },
  "batch_size": {
    "title": "Batch Size",
    "type": "integer",
    "description": "Number of URLs to process concurrently in a batch. Adjust to balance speed and server load.",
    "editor": "number",
    "default": 5,
    "minimum": 1,
    "maximum": 50
  },
  "use_driver": {
    "title": "Use Nodriver",
    "type": "boolean",
    "description": "If true, uses nodriver for browser-based scraping to handle JavaScript-rendered content. If false, uses HTTP requests via httpx. Requires nodriver to be installed.",
    "editor": "checkbox",
    "default": false
  }
}

Required Fields: urls

Example Input:

{
  "urls": ["https://example.com", "https://apify.com"],
  "batch_size": 5,
  "use_driver": false
}
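Before starting a run, it can help to check an input object against the constraints in the schema above. The following is a minimal sketch, not part of the actor: normalize_input is a hypothetical helper, and two of its checks are assumptions that go beyond the schema (rejecting an empty urls list, and accepting only http and https URLs).

```python
from urllib.parse import urlparse

DEFAULT_BATCH_SIZE = 5  # default taken from the input schema


def normalize_input(raw: dict) -> dict:
    """Validate an input dict against the schema limits and fill in defaults."""
    urls = raw.get("urls")
    # Assumption: an empty list is treated as an error (the schema only says "required").
    if not isinstance(urls, list) or not urls:
        raise ValueError("'urls' is required and must be a non-empty list")
    for url in urls:
        parts = urlparse(url) if isinstance(url, str) else None
        # Assumption: only absolute http(s) URLs are useful for scraping.
        if parts is None or parts.scheme not in ("http", "https") or not parts.netloc:
            raise ValueError(f"not an absolute http(s) URL: {url!r}")

    batch_size = raw.get("batch_size", DEFAULT_BATCH_SIZE)
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(batch_size, bool) or not isinstance(batch_size, int):
        raise ValueError("'batch_size' must be an integer")
    if not 1 <= batch_size <= 50:
        raise ValueError("'batch_size' must be between 1 and 50")

    use_driver = raw.get("use_driver", False)
    if not isinstance(use_driver, bool):
        raise ValueError("'use_driver' must be a boolean")

    return {"urls": urls, "batch_size": batch_size, "use_driver": use_driver}
```

For example, normalize_input({"urls": ["https://example.com"]}) returns the same URLs with batch_size set to 5 and use_driver set to False.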

Dataset Schema

Results are stored in the Apify dataset. The dataset schema below defines the overview table view and describes each field:

{
  "actorSpecification": 1,
  "views": {
    "overview": {
      "title": "Contact Information",
      "transformation": {
        "fields": [
          "type",
          "value",
          "platform",
          "url",
          "source",
          "from_url"
        ]
      },
      "display": {
        "component": "table",
        "properties": {
          "type": {
            "label": "Contact Type",
            "format": "text",
            "description": "Type of contact information (email, phone, social, or contact_link)."
          },
          "value": {
            "label": "Value",
            "format": "text",
            "description": "The extracted contact information (e.g., email address, phone number, social media username, or contact page URL)."
          },
          "platform": {
            "label": "Social Platform",
            "format": "text",
            "description": "The social media platform (e.g., linkedin, twitter, facebook) if type is 'social'. Null for other types."
          },
          "url": {
            "label": "Social Profile URL",
            "format": "text",
            "description": "The full URL of the social media profile if type is 'social'. Null for other types."
          },
          "source": {
            "label": "Source",
            "format": "text",
            "description": "Where the contact information was found (e.g., 'page_content')."
          },
          "from_url": {
            "label": "Source URL",
            "format": "text",
            "description": "The URL of the page where the contact information was extracted."
          }
        }
      }
    }
  }
}

Output Fields:

  • type: The type of contact information (email, phone, social, or contact_link).
  • value: The extracted data (e.g., email address, phone number, social media username, or contact page URL).
  • platform: The social media platform (e.g., linkedin, twitter) if type is social; otherwise, null.
  • url: The full URL of the social media profile if type is social; otherwise, null.
  • source: The source of the data (e.g., page_content).
  • from_url: The URL of the page where the data was extracted.

Example Output:

[
  {
    "type": "email",
    "value": "contact@example.com",
    "platform": null,
    "url": null,
    "source": "page_content",
    "from_url": "https://example.com"
  },
  {
    "type": "phone",
    "value": "+12025550123",
    "platform": null,
    "url": null,
    "source": "page_content",
    "from_url": "https://example.com"
  },
  {
    "type": "social",
    "value": "exampleuser",
    "platform": "linkedin",
    "url": "https://linkedin.com/in/exampleuser",
    "source": "page_content",
    "from_url": "https://example.com"
  },
  {
    "type": "contact_link",
    "value": "https://example.com/contact",
    "platform": null,
    "url": null,
    "source": "page_content",
    "from_url": "https://example.com"
  }
]
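To illustrate how records of this shape could be produced from page text, here is a rough sketch using regular expressions and per-page deduplication. It is not the actor's implementation: the patterns, extract_contacts, and make_record are assumptions made for the example, and phone extraction and contact-link detection are left out for brevity.

```python
import re
from typing import Dict, List, Optional

# Simplified illustrative patterns; real-world validation is usually stricter.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
SOCIAL_PATTERNS = {
    "linkedin": re.compile(r"https?://(?:www\.)?linkedin\.com/in/([A-Za-z0-9_-]+)"),
    "twitter": re.compile(r"https?://(?:www\.)?twitter\.com/([A-Za-z0-9_]+)"),
}


def make_record(type_: str, value: str, from_url: str,
                platform: Optional[str] = None,
                url: Optional[str] = None) -> Dict[str, Optional[str]]:
    """Build one output record with the six documented fields."""
    return {"type": type_, "value": value, "platform": platform, "url": url,
            "source": "page_content", "from_url": from_url}


def extract_contacts(text: str, from_url: str) -> List[Dict[str, Optional[str]]]:
    """Find emails and social profiles in `text`, skipping duplicates."""
    records: List[Dict[str, Optional[str]]] = []
    seen = set()  # (type, lowercased value) pairs already emitted

    def add(type_: str, value: str, **extra: Optional[str]) -> None:
        key = (type_, value.lower())
        if key not in seen:
            seen.add(key)
            records.append(make_record(type_, value, from_url, **extra))

    for match in EMAIL_RE.finditer(text):
        add("email", match.group(0))
    for platform, pattern in SOCIAL_PATTERNS.items():
        for match in pattern.finditer(text):
            add("social", match.group(1), platform=platform, url=match.group(0))
    return records
```

Deduplication here is case-insensitive and scoped to a single page; whether the actor deduplicates per page or across the whole run is not stated.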

Usage

  1. Provide Input: Specify a list of URLs to scrape, the batch size, and whether to use nodriver for JavaScript-rendered content.
  2. Run the Actor: The actor processes URLs in batches, using Apify's residential proxies (US-based) to avoid rate-limiting.
  3. Retrieve Data: The extracted contact information is stored in the Apify dataset, accessible via the Apify platform or API.
  4. Dynamic Content: Set use_driver to true for websites with JavaScript-rendered content (requires nodriver to be installed).
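The steps above can also be driven from code. The sketch below assumes the apify-client Python package and an API token in an APIFY_TOKEN environment variable. The actor ID is a placeholder because this page does not state it, and run_and_collect and main are hypothetical helper names.

```python
import os
from typing import Any, Dict, List


def run_and_collect(client: Any, actor_id: str,
                    run_input: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Start the actor, wait for the run to finish, and return its dataset items."""
    run = client.actor(actor_id).call(run_input=run_input)
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


def main() -> None:
    # Imported here so the helper above works without the package installed.
    from apify_client import ApifyClient  # pip install apify-client

    client = ApifyClient(os.environ["APIFY_TOKEN"])
    items = run_and_collect(
        client,
        "<username>/<actor-name>",  # placeholder: replace with the real actor ID
        {"urls": ["https://example.com"], "batch_size": 5, "use_driver": False},
    )
    for item in items:
        print(item["type"], item["value"])
```

Calling main() with a valid token and actor ID prints one line per extracted contact. Keeping the client as a parameter of run_and_collect makes it possible to substitute a stub when testing.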

Example Run:

  • Input:
    {
      "urls": ["https://apify.com", "https://example.com"],
      "batch_size": 3,
      "use_driver": true
    }
  • Expected Behavior: The actor scrapes both URLs using nodriver with Apify proxies, extracts emails, phone numbers, social media profiles, and contact links, and stores the results in the dataset.

Notes

  • Proxy Usage: The actor uses Apify's residential proxies (US-based) to ensure reliable scraping.
  • Nodriver Dependency: If use_driver is true, ensure the nodriver library is installed in the actor's environment. If it is unavailable, the actor falls back to httpx.
  • Rate Limiting: The actor processes URLs in batches with a 3-second delay between batches to avoid overwhelming target servers.
  • Data Quality: Emails and phone numbers are validated and deduplicated. Social media profiles are extracted for platforms like LinkedIn, Twitter, Facebook, and more.
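The batching behavior described under Rate Limiting can be sketched with asyncio as follows. This is an illustration under assumptions, not the actor's code: chunked and process_in_batches are hypothetical names, and handler stands in for whatever coroutine scrapes a single URL.

```python
import asyncio
from typing import Any, Awaitable, Callable, Iterable, List


def chunked(items: List[str], size: int) -> List[List[str]]:
    """Split `items` into consecutive lists of at most `size` elements."""
    return [items[i:i + size] for i in range(0, len(items), size)]


async def process_in_batches(urls: Iterable[str], batch_size: int,
                             handler: Callable[[str], Awaitable[Any]],
                             delay_seconds: float = 3.0) -> List[Any]:
    """Run `handler` over URLs batch by batch, pausing between batches."""
    results: List[Any] = []
    batches = chunked(list(urls), batch_size)
    for index, batch in enumerate(batches):
        # URLs inside one batch are handled concurrently.
        results.extend(await asyncio.gather(*(handler(url) for url in batch)))
        if index < len(batches) - 1:
            await asyncio.sleep(delay_seconds)  # pause before the next batch
    return results
```

With batch_size set to 5 and 12 URLs, this would run three batches (5, 5, and 2 URLs) with two pauses in between.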

For further assistance, refer to the Apify documentation or contact support.