# Website Contacts (Phone & Email) (`scraperduo/web-contacts`) Actor

Scrapes emails and phone numbers from websites. Prioritizes contact pages automatically.

- **URL**: https://apify.com/scraperduo/web-contacts.md
- **Developed by:** [Berker Ozer](https://apify.com/scraperduo) (community)
- **Categories:** Lead generation, Other, Developer tools
- **Stats:** 2 total users, 1 monthly user, 100.0% runs succeeded, 0 bookmarks
- **User rating**: 5.00 out of 5 stars

## Pricing

from $15.00 / 1,000 results

This Actor is paid per event and usage. You are charged both the fixed price for specific events and for Apify platform usage.
Because this Actor supports Apify Store discounts, the price decreases on higher subscription plans.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event
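
As a rough illustration of the per-result part of the bill, the sketch below multiplies a result count by the listed "from" rate. The names `PRICE_PER_1000_RESULTS` and `min_event_cost` are hypothetical, the rate on your plan may differ from the listed one, and platform usage is charged on top, so treat the figure as an estimate of the event charges only.

```python
from decimal import Decimal

# Listed "from" price on this page; the rate on your subscription plan may differ.
PRICE_PER_1000_RESULTS = Decimal("15.00")


def min_event_cost(results: int) -> Decimal:
    """Estimate the per-result charge in USD, excluding platform usage."""
    cost = PRICE_PER_1000_RESULTS * results / 1000
    return cost.quantize(Decimal("0.01"))


if __name__ == "__main__":
    print(min_event_cost(250))
```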

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with a capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## Web Contacts Scraper

Extract email addresses and phone numbers from any website — automatically. This Actor visits a website's homepage, detects its contact page in **100+ languages**, and collects all publicly available contact information without any manual configuration.

### What does this Actor do?

Given a list of websites or domains, the Web Contacts Scraper:

1. Visits the homepage of each site
2. Scans all links for contact-related pages (e.g. `/contact`, `/iletisim`, `/kontakt`, `/nous-contacter`, `/contatto`, etc.)
3. If a contact page is found, navigates directly to it — no unnecessary crawling
4. If no contact page is detected, crawls the site's other pages to find contact information
5. Returns all discovered emails and phone numbers along with the exact page URL where each was found

### Features

- **Multilingual contact page detection** — Recognizes contact page slugs in 100+ languages and regions (English, Turkish, German, French, Spanish, Italian, Portuguese, Arabic, Japanese, Chinese, Korean, Russian, Hindi, and many more)
- **Smart crawl strategy** — Prioritizes contact pages to minimize requests and return results faster
- **Multiple extraction strategies** — `mailto:` links, `tel:` links, text node scanning, keyword proximity detection (Tel:, GSM:, Phone:, Fax:, etc.)
- **International phone number support** — E.164 international format, Turkish local numbers, North American NANP format, keyword-based detection in any language
- **Flexible input** — Accepts full URLs (`https://example.com`), bare domains (`example.com`), or `www.` prefixed domains
- **Deduplication** — Each email and phone number is returned only once per domain, even if it appears on multiple pages
- **Source tracking** — Every result includes the page URL where it was found

### Use cases

- **Lead generation** — Build contact lists from industry directories or competitor websites
- **B2B outreach** — Collect email and phone data for sales prospecting
- **Data enrichment** — Augment existing company lists with up-to-date contact information
- **Market research** — Gather contact details from a list of businesses in a specific niche
- **CRM population** — Automatically populate your CRM with contact data from company websites

### Input

| Field | Type | Description |
|-------|------|-------------|
| `startUrls` | Array | List of websites to scrape. Accepts full URLs or bare domains. |

#### Example input

```json
{
  "startUrls": [
    { "url": "https://apify.com" },
    { "url": "example.com" },
    { "url": "www.acme.org" }
  ]
}
````

All three formats are valid. Missing `https://` is added automatically.
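
The Actor performs this normalization itself, but it can be handy to mirror it on the client when building input from a plain list of sites. The sketch below is one possible way to do that; `normalize_start_url` and `build_input` are hypothetical helper names, not part of the Actor or the Apify client.

```python
def normalize_start_url(raw: str) -> str:
    """Prepend https:// when the value has no scheme, as the README describes."""
    raw = raw.strip()
    if "://" not in raw:
        raw = "https://" + raw
    return raw


def build_input(sites: list[str]) -> dict:
    """Wrap plain strings in the {"url": ...} objects the input schema expects."""
    return {"startUrls": [{"url": normalize_start_url(site)} for site in sites]}


if __name__ == "__main__":
    print(build_input(["https://apify.com", "example.com", "www.acme.org"]))
```

The resulting dictionary can be passed as `run_input` to the Python client call shown in the API section.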

### Output

Results are saved to the **default dataset** — one record per domain.

| Field | Type | Description |
|-------|------|-------------|
| `domain` | String | The website domain (e.g. `example.com`), `www.` stripped |
| `emails` | Array | Found email addresses, each with `value` and source `url` |
| `phones` | Array | Found phone numbers, each with `value` and source `url` |

#### Example output

```json
{
  "domain": "example.com",
  "emails": [
    {
      "value": "info@example.com",
      "url": "https://example.com/contact"
    },
    {
      "value": "support@example.com",
      "url": "https://example.com/contact"
    }
  ],
  "phones": [
    {
      "value": "+1 (800) 123-4567",
      "url": "https://example.com/contact"
    }
  ]
}
```
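
Because each record nests emails and phones under a domain, downstream tools such as spreadsheets or CRMs often need one row per contact instead. The following is a minimal sketch of that reshaping, assuming records shaped like the example above; `flatten_record`, `to_csv`, and the column names are illustrative choices.

```python
import csv
import io

COLUMNS = ["domain", "type", "value", "source_url"]


def flatten_record(record: dict) -> list[dict]:
    """Turn one per-domain record into one row per email or phone number."""
    rows = []
    for kind, key in (("email", "emails"), ("phone", "phones")):
        for entry in record.get(key, []):
            rows.append({
                "domain": record.get("domain", ""),
                "type": kind,
                "value": entry["value"],
                "source_url": entry["url"],
            })
    return rows


def to_csv(records: list[dict]) -> str:
    """Serialize all flattened rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS)
    writer.writeheader()
    for record in records:
        writer.writerows(flatten_record(record))
    return buffer.getvalue()
```

Feeding the items returned by the dataset client into `to_csv` yields a file that most CRM importers can read.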

### How contact page detection works

The scraper maintains a list of contact-related URL slugs across 100+ languages. When the homepage is fetched, every link is checked against this list. If any match is found (e.g. a URL containing `contact`, `iletisim`, `kontakt`, `nous-contacter`, `contatti`, `hubungi`, `toiawase`, `sampark`, etc.), those pages are crawled first — and exclusively. This prevents unnecessary crawling of product pages, blog posts, or other irrelevant sections.

If no contact page is detected, the scraper falls back to crawling up to 24 additional pages from the homepage.
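
The decision described above can be sketched roughly as follows. This is not the Actor's code (which is built on Crawlee): `plan_crawl`, `CONTACT_SLUGS`, and `MAX_FALLBACK_PAGES` are hypothetical names, the slug tuple holds only the examples named in this README, and matching against the URL path rather than the whole URL is an assumption.

```python
from urllib.parse import urljoin, urlsplit

# Illustrative subset; the Actor's full slug list is not reproduced here.
CONTACT_SLUGS = (
    "contact", "iletisim", "kontakt", "nous-contacter",
    "contatti", "hubungi", "toiawase", "sampark",
)
MAX_FALLBACK_PAGES = 24  # fallback limit stated in this README


def plan_crawl(homepage: str, hrefs: list[str]) -> list[str]:
    """Return contact pages only if any exist, else up to 24 same-site pages."""
    site = urlsplit(homepage).hostname
    home = homepage.rstrip("/")
    seen = set()
    same_site = []
    for href in hrefs:
        # Resolve relative links and drop fragments so duplicates collapse.
        url = urljoin(homepage, href).split("#")[0]
        if urlsplit(url).hostname != site:
            continue  # external link, mailto:, tel:, etc.
        if url.rstrip("/") == home or url in seen:
            continue
        seen.add(url)
        same_site.append(url)
    contact = [
        url for url in same_site
        if any(slug in urlsplit(url).path.lower() for slug in CONTACT_SLUGS)
    ]
    return contact or same_site[:MAX_FALLBACK_PAGES]
```

Returning only the contact pages when at least one is found mirrors the "first and exclusively" behaviour; the slice on the fallback list keeps the total at 25 pages including the homepage.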

### Phone number detection strategies

| Strategy | Description |
|----------|-------------|
| `tel:` links | Most reliable — extracts the href value directly |
| International `+` prefix | Matches E.164 format for all country codes (`+1`, `+44`, `+90`, etc.) |
| Turkish local format | `05321234567`, `(0532) 123 45 67`, `0312 123 45 67` |
| NANP format | `(212) 555-1234` — US/Canada area code with parentheses |
| Keyword proximity | Detects numbers following labels like `Tel:`, `Phone:`, `GSM:`, `Fax:`, `Téléphone:`, `Telefon:`, `Handy:` in any language |
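
To make the table more concrete, here is a simplified sketch of how such strategies might be expressed as regular expressions. The patterns are assumptions written for illustration and are almost certainly looser or stricter than the ones the Actor uses; `find_phones` and the pattern names are hypothetical.

```python
import re

# Each pattern approximates one row of the table above.
TEL_HREF = re.compile(r'href=["\']tel:([^"\']+)["\']', re.IGNORECASE)
INTERNATIONAL = re.compile(r"\+\d[\d\s().-]{6,18}\d")
TURKISH_LOCAL = re.compile(r"(?<!\d)\(?0\d{3}\)?\s?\d{3}\s?\d{2}\s?\d{2}(?!\d)")
NANP = re.compile(r"\(\d{3}\)\s?\d{3}-\d{4}")
KEYWORD = re.compile(
    r"(?:Tel|Phone|GSM|Fax|Téléphone|Telefon|Handy)\s*[:.]\s*(\+?\d[\d\s().-]{5,18}\d)",
    re.IGNORECASE,
)
PATTERNS = (TEL_HREF, INTERNATIONAL, TURKISH_LOCAL, NANP, KEYWORD)


def find_phones(html: str) -> list[str]:
    """Run every strategy over the markup and return unique matches in order."""
    found = []
    for pattern in PATTERNS:
        for match in pattern.finditer(html):
            # Patterns with a capture group return only the number itself.
            value = match.group(1) if pattern.groups else match.group(0)
            value = value.strip()
            if value not in found:
                found.append(value)
    return found
```

Trying the most reliable strategy first means a number that appears both in a `tel:` link and in visible text is recorded once, in its link form.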

### Technical details

- Built with [Crawlee](https://crawlee.dev) `CheerioCrawler` — fast, no JavaScript rendering required
- Up to **5 websites** scraped in parallel
- Up to **5 concurrent requests** per website
- Maximum **25 pages** visited per website
- Automatically skips irrelevant sections: blog posts, product/shop pages, admin areas, login pages, etc.

### Limitations

- Does not execute JavaScript — contact information loaded dynamically via JS may not be extracted
- Does not bypass CAPTCHA or bot-detection systems
- Email and phone number detection is regex-based; obfuscated or image-based contacts will not be captured
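
The last limitation is easier to see with a small example. The sketch below shows one way regex-based email extraction with per-domain deduplication and source tracking could look; it is an assumption-laden illustration rather than the Actor's implementation, and `collect_emails`, `MAILTO`, and `EMAIL_TEXT` are hypothetical names. Lowercasing addresses before comparing them is also an assumption.

```python
import re

MAILTO = re.compile(r'href=["\']mailto:([^"\'?]+)', re.IGNORECASE)
EMAIL_TEXT = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def collect_emails(pages: dict[str, str]) -> list[dict]:
    """Map of page URL -> HTML in; unique emails with their first source URL out."""
    seen = set()
    results = []
    for url, html in pages.items():
        candidates = MAILTO.findall(html) + EMAIL_TEXT.findall(html)
        for value in candidates:
            key = value.strip().lower()
            if key not in seen:
                seen.add(key)
                results.append({"value": key, "url": url})
    return results


if __name__ == "__main__":
    # An obfuscated address contains no "@", so neither pattern matches it.
    print(collect_emails({"https://example.com": "info [at] example.com"}))
```

An address written as `info [at] example.com` or rendered inside an image never matches either pattern, which is why such contacts are missed.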

# Actor input Schema

## `startUrls` (type: `array`):

Websites to scrape contacts from. Accepts full URLs (https://example.com) or bare domains (example.com, www.example.com). Protocol defaults to https:// when omitted.

## Actor input object example

```json
{
  "startUrls": [
    {
      "url": "https://www.pazarama.com"
    }
  ]
}
```

# Actor output Schema

## `dataset` (type: `string`):

Dataset containing scraped emails and phone numbers grouped by domain

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "startUrls": [
        {
            "url": "https://www.pazarama.com"
        }
    ]
};

// Run the Actor and wait for it to finish
const run = await client.actor("scraperduo/web-contacts").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = { "startUrls": [{ "url": "https://www.pazarama.com" }] }

# Run the Actor and wait for it to finish
run = client.actor("scraperduo/web-contacts").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "startUrls": [
    {
      "url": "https://www.pazarama.com"
    }
  ]
}' |
apify call scraperduo/web-contacts --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=scraperduo/web-contacts",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Website Contacts (Phone & Email)",
        "description": "Scrapes emails and phone numbers from websites. Prioritizes contact pages automatically.",
        "version": "0.0",
        "x-build-id": "riBcpkPJHw0FcHlxt"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/scraperduo~web-contacts/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-scraperduo-web-contacts",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/scraperduo~web-contacts/runs": {
            "post": {
                "operationId": "runs-sync-scraperduo-web-contacts",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/scraperduo~web-contacts/run-sync": {
            "post": {
                "operationId": "run-sync-scraperduo-web-contacts",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "required": [
                    "startUrls"
                ],
                "properties": {
                    "startUrls": {
                        "title": "Start URLs / Domains",
                        "type": "array",
                        "description": "Websites to scrape contacts from. Accepts full URLs (https://example.com) or bare domains (example.com, www.example.com). Protocol defaults to https:// when omitted.",
                        "default": [],
                        "items": {
                            "type": "object",
                            "required": [
                                "url"
                            ],
                            "properties": {
                                "url": {
                                    "type": "string",
                                    "title": "URL of a web page",
                                    "format": "uri"
                                }
                            }
                        }
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
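
For projects that cannot use the official clients, the `run-sync-get-dataset-items` path from the specification above can be called directly. The following is a minimal standard-library sketch; `build_request` and `run_actor` are hypothetical helpers, and the assumption that the response body is a JSON array of dataset items should be checked against the Apify API reference, since the specification above only documents a `200 OK`.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.apify.com/v2"
ACTOR_PATH = "/acts/scraperduo~web-contacts/run-sync-get-dataset-items"


def build_request(token: str, start_urls: list[str]) -> urllib.request.Request:
    """Build the POST request: token in the query string, input as JSON body."""
    query = urllib.parse.urlencode({"token": token})
    body = json.dumps({"startUrls": [{"url": url} for url in start_urls]})
    return urllib.request.Request(
        f"{API_BASE}{ACTOR_PATH}?{query}",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_actor(token: str, start_urls: list[str]) -> list[dict]:
    """Send the request and parse the response (assumed to be a JSON array)."""
    with urllib.request.urlopen(build_request(token, start_urls)) as response:
        return json.load(response)
```

Calling `run_actor("<YOUR_API_TOKEN>", ["https://example.com"])` starts a real, billable run. The synchronous endpoint suits short runs; for larger inputs the `/runs` path, which returns run information immediately, is likely the better fit.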
