# Website Contact Scraper (`goat255/website-contact-scraper`) Actor

Crawl any list of websites and extract emails, phone numbers, and social media profiles. Visits the homepage plus contact, about, and team pages, then returns one clean deduped row per domain. No login.

- **URL**: https://apify.com/goat255/website-contact-scraper.md
- **Developed by:** [Goutam Soni](https://apify.com/goat255) (community)
- **Categories:** Lead generation, Business
- **Stats:** 2 total users, 1 monthly user, 100.0% runs succeeded, 0 bookmarks
- **User rating**: 5.00 out of 5 stars

## Pricing

Pay per usage

This Actor is billed per platform usage. The Actor itself is free to use, so you pay only for the Apify platform usage it consumes, and that usage gets cheaper on higher subscription plans.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-usage

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## Website Contact Scraper

Extract emails, phone numbers, and social media profiles from any list of websites. No login, no API key required. Give it domains and it returns one clean, deduped contact row per site.

### What it does

This website contact scraper crawls each target site starting at the homepage, follows the pages most likely to hold contact details (contact, about, team), and pulls every public contact point into a single normalized row per domain.

### Features

- **Email extraction** from `mailto:` links and visible page text, with asset filenames, tracking IDs, and placeholder addresses filtered out so you get real, usable emails.
- **Phone number extraction** from `tel:` links and page text, validated by digit count and pattern so years, prices, dates, and IDs are never mistaken for phone numbers.
- **Social profile extraction** across Twitter / X, Facebook, Instagram, LinkedIn, YouTube, GitHub, TikTok, Pinterest, Telegram, and WhatsApp. Share widgets and intent links are skipped, and each profile is returned as a clean canonical URL plus handle.
- **Smart contact-page crawl.** Starts at the homepage, follows same-site links that look like contact pages, and probes conventional paths like `/contact` and `/about` even when they are not linked, up to the depth and page budget you choose.
- **One deduped row per domain.** Contacts found across every visited page are merged and deduplicated, with handy per-network columns and total counts.
- **Bulk friendly.** Pass hundreds of domains in one run with adjustable parallelism.

### Use cases

- **Lead generation.** Turn a list of company domains into an outreach-ready list of emails, phones, and social handles.
- **Lead list enrichment.** Add missing phone numbers and social profiles to an existing CRM or prospect list.
- **Market and competitor research.** Map the public contact footprint of companies in a niche or region.
- **Recruiting and partnerships.** Find the right inbox and social channels to reach a target organization.
- **Website audit.** Check which of your own sites publicly expose contact details, and where.

### Input

| Field | Type | Description |
|---|---|---|
| `websites` | array | Domains or URLs to crawl, with or without `https`. One result row is returned per entry. Required. |
| `crawlDepth` | integer | Link levels to follow from the homepage. `0` = homepage only, `1` also follows contact, about, and team links. Default `1`, max `3`. |
| `maxPagesPerSite` | integer | Hard cap on pages fetched per website. Keeps runs fast and predictable. Default `5`, max `30`. |
| `probeCommonPaths` | boolean | Also try `/contact`, `/about`, `/team` even when the homepage does not link them. Default `true`. |
| `concurrency` | integer | How many websites to process in parallel. Default `5`, max `20`. |
| `proxyConfig` | object | Proxy configuration. Residential is the default and most reliable option. |

#### Example input

```json
{
  "websites": ["example.com", "https://www.acme-co.example"],
  "crawlDepth": 1,
  "maxPagesPerSite": 5,
  "probeCommonPaths": true,
  "concurrency": 5
}
````
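
The ranges in the input table can be checked on the client before a run is started. The following is a minimal sketch, not part of the Actor: `build_input` and `LIMITS` are hypothetical names, and the limits are copied from the input schema in this document.

```python
from typing import Any

# Limits copied from the input schema in this document.
LIMITS = {
    "crawlDepth": (0, 3),
    "maxPagesPerSite": (1, 30),
    "concurrency": (1, 20),
}


def build_input(websites: list[str], **options: Any) -> dict[str, Any]:
    """Return an Actor input dict, rejecting values outside the documented ranges."""
    # Drop blank entries and surrounding whitespace.
    cleaned = [site.strip() for site in websites if site.strip()]
    if not cleaned:
        raise ValueError("`websites` is required and must not be empty")

    run_input: dict[str, Any] = {"websites": cleaned}
    for name, value in options.items():
        if name in LIMITS:
            low, high = LIMITS[name]
            is_int = isinstance(value, int) and not isinstance(value, bool)
            if not is_int or not low <= value <= high:
                raise ValueError(f"`{name}` must be an integer between {low} and {high}")
        elif name not in ("probeCommonPaths", "proxyConfig"):
            raise ValueError(f"unknown input field: {name}")
        run_input[name] = value
    return run_input


example = build_input(
    ["example.com", " https://www.acme-co.example "],
    crawlDepth=1,
    maxPagesPerSite=5,
)
print(example)
```

Fields that are left out fall back to the defaults listed in the table, so only the overrides need to be passed.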

### Output

One clean row per website, with the most useful columns first:

```json
{
  "type": "contact",
  "domain": "example.com",
  "websiteUrl": "https://example.com/",
  "emailCount": 2,
  "phoneCount": 1,
  "socialCount": 2,
  "emails": ["hello@example.com", "sales@example.com"],
  "phones": ["+15550100199"],
  "socialProfiles": [
    { "network": "linkedin", "handle": "acme-co", "url": "https://www.linkedin.com/company/acme-co" },
    { "network": "twitter", "handle": "example_co", "url": "https://x.com/example_co" }
  ],
  "twitter": "https://x.com/example_co",
  "facebook": null,
  "instagram": null,
  "linkedin": "https://www.linkedin.com/company/acme-co",
  "youtube": null,
  "github": null,
  "pagesCrawled": 4,
  "scrapedAt": "2026-06-18T08:00:00.000Z"
}
```

**Key fields**

- `domain` and `websiteUrl` identify the site the row belongs to.
- `emailCount`, `phoneCount`, and `socialCount` let you sort and filter rows at a glance.
- `emails`, `phones`, and `socialProfiles` hold the full deduped lists.
- `twitter`, `facebook`, `instagram`, `linkedin`, `youtube`, `github` give a single primary URL per network for easy spreadsheet use.
- `pagesCrawled` and `scrapedAt` are run metadata.

A field is `null` or an empty list when that contact type is not published on the site.
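
For spreadsheet use, the list fields usually need to be joined into single cells. The sketch below shows one way to do that with the standard library, using an abridged copy of the sample row above. The `flatten` and `to_csv` helpers and the `"; "` separator are illustrative choices, not something the Actor provides.

```python
import csv
import io

# Sample row copied (abridged) from the output example in this README.
rows = [
    {
        "domain": "example.com",
        "emails": ["hello@example.com", "sales@example.com"],
        "phones": ["+15550100199"],
        "linkedin": "https://www.linkedin.com/company/acme-co",
        "twitter": "https://x.com/example_co",
        "facebook": None,
    }
]

COLUMNS = ["domain", "emails", "phones", "linkedin", "twitter", "facebook"]


def flatten(row: dict) -> dict:
    """Join list fields with '; ' and turn missing or null values into ''."""
    flat = {}
    for column in COLUMNS:
        value = row.get(column)
        if isinstance(value, list):
            flat[column] = "; ".join(value)
        else:
            flat[column] = value or ""
    return flat


def to_csv(items: list[dict]) -> str:
    """Render the rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    for item in items:
        writer.writerow(flatten(item))
    return buffer.getvalue()


csv_text = to_csv(rows)
print(csv_text)
```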

### FAQ

**Do I need a login or API key?**
No. The scraper reads only publicly visible pages, so there is nothing to log into and no key to supply.

**How many websites can I scrape?**
As many as you like in one run. Pass hundreds of domains in the `websites` list and raise `concurrency` (up to its maximum of `20`) to process more in parallel.

**How fast is it?**
Each site is fetched over plain HTTP with a small contact-page crawl, so throughput depends mainly on `concurrency` and `maxPagesPerSite`. Most sites finish in a few seconds each, and many run in parallel.
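
Because `maxPagesPerSite` is a hard cap, the worst-case number of page fetches for a run can be worked out ahead of time. The sketch below is a rough sizing aid only: `page_budget` is a hypothetical helper, and treating sites as processed in fixed-size waves of `concurrency` is a simplifying assumption, not a description of the Actor's scheduler.

```python
import math


def page_budget(site_count: int, max_pages_per_site: int = 5, concurrency: int = 5) -> dict:
    """Upper bounds implied by the input fields; not a runtime prediction."""
    return {
        # maxPagesPerSite is a hard cap, so a run cannot fetch more pages than this.
        "max_pages": site_count * max_pages_per_site,
        # Simplification: sites handled in waves of `concurrency` at a time.
        "site_waves": math.ceil(site_count / concurrency),
    }


print(page_budget(300))
# {'max_pages': 1500, 'site_waves': 60}
print(page_budget(300, max_pages_per_site=10, concurrency=20))
# {'max_pages': 3000, 'site_waves': 15}
```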

**Why are some emails or phones empty?**
Many sites publish contact forms instead of raw addresses, or hide details behind scripts. The row returns whatever is publicly visible at crawl time, and an empty list simply means that contact type was not exposed on the pages visited. Raising `crawlDepth` and `maxPagesPerSite` lets the crawl reach more pages, which can surface more contacts.

**What is the pricing?**
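Billing follows the Actor's pricing shown on its Apify Store page. Under the pay-per-usage model described in the Pricing section, the Actor itself is free and you pay only for Apify platform usage, so cost scales with the number of domains and pages you crawl.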

**Which proxy should I use?**
Residential is the default and recommended option because it works reliably across the widest range of sites. You can change it in the proxy configuration.

# Actor input Schema

## `websites` (type: `array`):

Domains or URLs to crawl for contact details. With or without https. One result row is returned per entry. Example: example.com, https://www.example.com.

## `crawlDepth` (type: `integer`):

How many link levels to follow from the homepage when looking for contact pages. 0 scans the homepage only. 1 also follows contact, about, and team links. Higher values dig deeper.

## `maxPagesPerSite` (type: `integer`):

Hard cap on how many pages are fetched per website. Keeps runs fast and predictable.

## `probeCommonPaths` (type: `boolean`):

When on, conventional paths such as /contact, /about, and /team are tried even if the homepage does not link to them.

## `concurrency` (type: `integer`):

How many websites to process in parallel. Higher is faster but puts more load on proxies.

## `proxyConfig` (type: `object`):

Apify proxy. RESIDENTIAL is the default and recommended option for the most reliable results.

## Actor input object example

```json
{
  "websites": [
    "example.com"
  ],
  "crawlDepth": 1,
  "maxPagesPerSite": 5,
  "probeCommonPaths": true,
  "concurrency": 5,
  "proxyConfig": {
    "useApifyProxy": true,
    "apifyProxyGroups": [
      "RESIDENTIAL"
    ]
  }
}
```

# API

You can run this Actor programmatically using the Apify API. Below are code examples for JavaScript, Python, and the CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "websites": [
        "example.com"
    ],
    "proxyConfig": {
        "useApifyProxy": true,
        "apifyProxyGroups": [
            "RESIDENTIAL"
        ]
    }
};

// Run the Actor and wait for it to finish
const run = await client.actor("goat255/website-contact-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "websites": ["example.com"],
    "proxyConfig": {
        "useApifyProxy": True,
        "apifyProxyGroups": ["RESIDENTIAL"],
    },
}

# Run the Actor and wait for it to finish
run = client.actor("goat255/website-contact-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```
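
In production code it is usually worth checking that the run finished successfully before reading its dataset. The sketch below assumes the run is a plain dict, as in the example above, carrying the `id`, `status`, and `defaultDatasetId` fields listed in the run response schema. `require_success` is a hypothetical helper, and `fake_run` is made-up data used only to exercise it offline.

```python
def require_success(run):
    """Return the dataset ID of a finished run, or raise if the run did not succeed."""
    if run is None:
        raise RuntimeError("Actor run did not return any run details")
    status = run.get("status")
    if status != "SUCCEEDED":
        raise RuntimeError(f"Actor run {run.get('id')} ended with status {status}")
    return run["defaultDatasetId"]


# Made-up run object for illustration; real values come from the API client.
fake_run = {"id": "run-123", "status": "SUCCEEDED", "defaultDatasetId": "dataset-456"}
print(require_success(fake_run))
```

With the client example above, the same helper would be called as `require_success(run)` before iterating over the dataset items.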

## CLI example

```bash
echo '{
  "websites": [
    "example.com"
  ],
  "proxyConfig": {
    "useApifyProxy": true,
    "apifyProxyGroups": [
      "RESIDENTIAL"
    ]
  }
}' |
apify call goat255/website-contact-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=goat255/website-contact-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Website Contact Scraper",
        "description": "Crawl any list of websites and extract emails, phone numbers, and social media profiles. Visits the homepage plus contact, about, and team pages, then returns one clean deduped row per domain. No login.",
        "version": "0.1",
        "x-build-id": "2NvoccH1KZnQpuEgq"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/goat255~website-contact-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-goat255-website-contact-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/goat255~website-contact-scraper/runs": {
            "post": {
                "operationId": "runs-sync-goat255-website-contact-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/goat255~website-contact-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-goat255-website-contact-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "required": [
                    "websites"
                ],
                "properties": {
                    "websites": {
                        "title": "Websites",
                        "type": "array",
                        "description": "Domains or URLs to crawl for contact details. With or without https. One result row is returned per entry. Example: example.com, https://www.example.com.",
                        "default": [
                            "example.com"
                        ],
                        "items": {
                            "type": "string"
                        }
                    },
                    "crawlDepth": {
                        "title": "Crawl depth",
                        "minimum": 0,
                        "maximum": 3,
                        "type": "integer",
                        "description": "How many link levels to follow from the homepage when looking for contact pages. 0 scans the homepage only. 1 also follows contact, about, and team links. Higher values dig deeper.",
                        "default": 1
                    },
                    "maxPagesPerSite": {
                        "title": "Max pages per site",
                        "minimum": 1,
                        "maximum": 30,
                        "type": "integer",
                        "description": "Hard cap on how many pages are fetched per website. Keeps runs fast and predictable.",
                        "default": 5
                    },
                    "probeCommonPaths": {
                        "title": "Probe common contact paths",
                        "type": "boolean",
                        "description": "When on, conventional paths such as /contact, /about, and /team are tried even if the homepage does not link to them.",
                        "default": true
                    },
                    "concurrency": {
                        "title": "Concurrency",
                        "minimum": 1,
                        "maximum": 20,
                        "type": "integer",
                        "description": "How many websites to process in parallel. Higher is faster but puts more load on proxies.",
                        "default": 5
                    },
                    "proxyConfig": {
                        "title": "Proxy configuration",
                        "type": "object",
                        "description": "Apify proxy. RESIDENTIAL is the default and recommended option for the most reliable results.",
                        "default": {
                            "useApifyProxy": true,
                            "apifyProxyGroups": [
                                "RESIDENTIAL"
                            ]
                        }
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
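
For stacks without an official client, the `run-sync-get-dataset-items` path from the specification above can be called with any HTTP library. The sketch below only builds the request with Python's standard library and leaves the network call commented out. `build_request` is a hypothetical helper, and the response being a JSON list of dataset items is an assumption based on the endpoint summary, since the specification does not define a response schema for it.

```python
import json
import urllib.parse
import urllib.request

# Server URL and path copied from the OpenAPI specification above.
API_BASE = "https://api.apify.com/v2"
ACTOR_PATH = "/acts/goat255~website-contact-scraper/run-sync-get-dataset-items"


def build_request(token: str, run_input: dict) -> urllib.request.Request:
    """Build a POST request with the token as a query parameter and the input as JSON."""
    query = urllib.parse.urlencode({"token": token})
    return urllib.request.Request(
        url=f"{API_BASE}{ACTOR_PATH}?{query}",
        data=json.dumps(run_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_request("<YOUR_API_TOKEN>", {"websites": ["example.com"]})
print(request.get_method(), request.full_url)

# Sending the request starts a real run and consumes platform usage:
# with urllib.request.urlopen(request) as response:
#     items = json.load(response)  # assumed to be a list of result rows
```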
