# Contact Scraper (`sharker/contact-scraper`) Actor

This scraper finds emails, phones, contact forms, chat widgets, and social links, then captures the page signals AI agents and browser automation tools can use to find them again. Built for Claude, Cursor, Codex, and similar agents to click, fill, and navigate reliably.

- **URL**: https://apify.com/sharker/contact-scraper.md
- **Developed by:** [Akula](https://apify.com/sharker) (community)
- **Categories:** AI, Automation, Lead generation
- **Stats:** 5 total users, 3 monthly users, 100.0% runs succeeded
- **User rating**: No ratings yet

## Pricing

from $100.00 / 1,000 sites analyzed

This Actor is billed per event plus usage: you pay a fixed price for specific events and, separately, for Apify platform usage.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event
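As a rough illustration of the per-event part of the bill, the listed starting price works out to $0.10 per site analyzed. The sketch below assumes one `site analyzed` event per input site and that the listed "from" rate applies; `estimate_event_cost` is a hypothetical helper, and platform usage is charged on top and is not included.

```python
from decimal import Decimal

# Listed starting price: $100.00 per 1,000 "site analyzed" events.
PRICE_PER_1000_SITES = Decimal("100.00")


def estimate_event_cost(sites: int) -> Decimal:
    """Estimate the per-event charge only; platform usage is billed separately."""
    if sites < 0:
        raise ValueError("sites must be non-negative")
    # Assumption: exactly one billable event per analyzed site.
    return (PRICE_PER_1000_SITES * sites / 1000).quantize(Decimal("0.01"))


print(estimate_event_cost(250))  # 25.00
```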

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with a capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## AI Website Contact Scraper

This scraper finds not only emails and phone numbers, but also contact forms, chat widgets, social links, and the page signals a browser automation agent can use to find those elements again.

Built for browser automation, it helps agents like Claude, Cursor, Codex, and OpenClaw understand how to reach a site and what to click, fill, or submit.

### What it does

For each website, the scraper can find:

- emails
- phone numbers
- social profile URLs
- contact forms
- chat or support widgets
- automation-friendly selector hints for forms and widgets

In plain English: it tells your AI agent **who to contact, where to go, and what to click or fill**.

### Why this is useful

Most contact scrapers stop at collecting emails.

This one goes further. When it finds a form or widget, it also returns the clues an AI browser needs to locate that element on the live page later. That makes it useful for:

- AI outreach agents
- browser automation
- lead enrichment
- sales research
- support workflow mapping
- contact-flow QA

### Input

#### Apify-style input

```json
{
  "startUrls": [
    { "url": "https://example.com" }
  ],
  "maxPages": 20
}
````

#### Main input fields

- `startUrls` — one or more seed URLs
- `maxPages` — max same-site pages to inspect for each input URL

### Output

The scraper returns a JSON array. Each item contains the input URL and the contact targets found across that site.

```json
[
  {
    "input_url": "https://example.com",
    "domain": "example.com",
    "targets": [
      {
        "page_url": "https://example.com/contact",
        "confidence": { "level": "VERIFIED" },
        "emails": [],
        "phones": [],
        "contact_forms": [],
        "social_profiles": [],
        "contact_widgets": []
      }
    ],
    "meta": {
      "site": null,
      "domain": null
    }
  }
]
```
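To show how this structure might be consumed, here is a minimal sketch that flattens the `emails` and `phones` of every target into one contact list per domain. It assumes each entry in those arrays is an object with a `value` key, as in the Emails and Phones examples in this README; `collect_contacts` and the sample data are illustrative only.

```python
from typing import Any


def collect_contacts(results: list[dict[str, Any]]) -> dict[str, dict[str, list[str]]]:
    """Group unique emails and phones by domain, preserving first-seen order."""
    contacts: dict[str, dict[str, list[str]]] = {}
    for item in results:
        domain = item.get("domain") or ""
        entry = contacts.setdefault(domain, {"emails": [], "phones": []})
        for target in item.get("targets", []):
            for key in ("emails", "phones"):
                for found in target.get(key, []):
                    value = found.get("value")
                    if value and value not in entry[key]:
                        entry[key].append(value)
    return contacts


# Made-up sample shaped like the output shown above.
sample = [
    {
        "input_url": "https://example.com",
        "domain": "example.com",
        "targets": [
            {"page_url": "https://example.com/contact",
             "emails": [{"value": "contact@example.com"}],
             "phones": [{"value": "+18668323090"}]},
            {"page_url": "https://example.com/about",
             "emails": [{"value": "contact@example.com"}],
             "phones": []},
        ],
    }
]
print(collect_contacts(sample))
```

The duplicate email found on the second page is dropped, so each domain ends up with one entry per distinct address or number.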

### What you get back

#### Emails

Direct email addresses found on the page.

```json
[{ "value": "contact@example.com" }]
```

#### Phones

Phone numbers, normalized when possible.

```json
[{ "value": "+18668323090" }]
```

#### Social profiles

Profile URLs on networks such as X, LinkedIn, Facebook, Instagram, and others.

```json
[{ "network": "x", "value": "https://twitter.com/example" }]
```

#### Contact forms

When a form is found, the scraper returns field and submit hints your automation can use later.

Typical fields include:

- `field_locator_hints`
- `form_locator_hints`
- `submit_locator_hints`
- `form_signature`
- `resolution_method`
- `usage_notes`

Short example:

```json
{
  "method": "POST",
  "resolution_method": "snapshot_ref",
  "field_locator_hints": {
    "name": { "label": "Your Name", "id": "contact_form_Name" },
    "email": { "label": "Email Address", "id": "contact_form_Email" },
    "message": { "label": "Message", "id": "contact_form_Message" }
  },
  "submit_locator_hints": {
    "tag": "button",
    "control_type": "submit",
    "text": "Send Message"
  }
}
```
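One way an automation layer might use such an object is to turn it into an ordered list of steps before touching the browser. The sketch below is an assumption about how a consumer could do that, not something the scraper provides: `build_fill_plan` and the step dictionaries are hypothetical, and the form data is copied from the short example above.

```python
def build_fill_plan(form: dict, values: dict[str, str]) -> list[dict]:
    """Pair each value with its field hints, then append the submit step."""
    hints = form.get("field_locator_hints", {})
    missing = [name for name in values if name not in hints]
    if missing:
        raise KeyError(f"no locator hints for fields: {missing}")
    plan = [
        {"action": "fill", "hints": hints[name], "value": value}
        for name, value in values.items()
    ]
    plan.append({"action": "click", "hints": form.get("submit_locator_hints", {})})
    return plan


form = {
    "method": "POST",
    "field_locator_hints": {
        "name": {"label": "Your Name", "id": "contact_form_Name"},
        "email": {"label": "Email Address", "id": "contact_form_Email"},
        "message": {"label": "Message", "id": "contact_form_Message"},
    },
    "submit_locator_hints": {"tag": "button", "control_type": "submit", "text": "Send Message"},
}
plan = build_fill_plan(form, {"name": "Ada", "email": "ada@example.com"})
for step in plan:
    print(step["action"], step["hints"])
```

Raising on a field without hints keeps the agent from submitting a partially filled form.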

#### Contact widgets

When a live chat or messaging widget is found, the scraper returns provider and launcher hints.

Typical providers include:

- LiveChat
- Crisp
- WhatsApp widget
- Intercom
- Drift
- Zendesk
- Tawk.to

Short example:

```json
{
  "provider": "LiveChat",
  "resolution_method": "snapshot_ref",
  "widget_locator_hints": {
    "provider": "LiveChat",
    "matched_by": ["script_host", "static_launcher"],
    "launcher_text": "Chat with us"
  }
}
```

### How AI browsers should use the selectors

Do not treat scraped CSS paths as permanent selectors.

Best workflow:

1. Open `page_url`
2. Take a fresh browser snapshot
3. Match elements using label, text, id, placeholder, name, role, or aria-label
4. Use the matched live element for fill/click actions

`debug_css_path` is only a backup clue for debugging.
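Step 3 of the workflow can be sketched as a small matcher. It assumes the fresh snapshot has already been reduced to a list of dictionaries, one per element, with keys such as `id`, `label`, `text`, `name`, `placeholder`, `role`, and `aria-label`. That snapshot shape, the `ref` handle, and the `match_element` helper are hypothetical and not part of the scraper's output.

```python
from typing import Optional

# Attributes compared, taken from the list in step 3 of the workflow.
MATCH_KEYS = ("label", "text", "id", "placeholder", "name", "role", "aria-label")


def normalize(value: object) -> str:
    """Lowercase and trim a value so cosmetic differences do not block a match."""
    return str(value or "").strip().lower()


def match_element(hints: dict, snapshot: list[dict]) -> Optional[dict]:
    """Return the live element that agrees with the most hint attributes."""
    best, best_score = None, 0
    for element in snapshot:
        score = sum(
            1
            for key in MATCH_KEYS
            if hints.get(key) and normalize(hints[key]) == normalize(element.get(key))
        )
        if score > best_score:
            best, best_score = element, score
    return best  # None when no attribute matched at all


snapshot = [
    {"ref": "e12", "id": "contact_form_Name", "label": "Your Name"},
    {"ref": "e13", "id": "contact_form_Email", "label": "Email Address"},
]
hint = {"label": "Email Address", "id": "contact_form_Email"}
print(match_element(hint, snapshot))
```

Scoring on several attributes instead of a single selector means a changed `id` can still be recovered through the label or text.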

### Confidence

Each contact target includes a confidence level.

- `VERIFIED` — strong contact signals confirmed
- `LIKELY` — good signal, but less certainty
- `POSSIBLE` or `PLAUSIBLE` — weaker or indirect signal

This helps your agent choose the best contact path first.
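A minimal sketch of that prioritization follows. The numeric ranking is an assumption derived from the level descriptions in this section, and `rank_targets` is a hypothetical helper; levels missing from the table sort last, so extend it if your output contains others.

```python
from typing import Any

# Assumed ordering; lower rank is tried first.
CONFIDENCE_RANK = {"VERIFIED": 0, "LIKELY": 1, "POSSIBLE": 2, "PLAUSIBLE": 2}


def rank_targets(targets: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Sort targets from strongest to weakest confidence (stable sort)."""

    def rank(target: dict[str, Any]) -> int:
        level = (target.get("confidence") or {}).get("level")
        return CONFIDENCE_RANK.get(level, len(CONFIDENCE_RANK))

    return sorted(targets, key=rank)


targets = [
    {"page_url": "https://example.com/about", "confidence": {"level": "PLAUSIBLE"}},
    {"page_url": "https://example.com/contact", "confidence": {"level": "VERIFIED"}},
    {"page_url": "https://example.com/support", "confidence": {"level": "LIKELY"}},
]
for target in rank_targets(targets):
    print(target["confidence"]["level"], target["page_url"])
```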

### Good use cases

- find the best way to contact a company
- map forms and chat widgets before automation
- enrich leads with emails, phones, and socials
- power AI outreach workflows
- monitor whether contact paths changed on a site

# Actor input Schema

## `startUrls` (type: `array`):

One or more website homepages or contact pages to scan.

## `maxPages` (type: `integer`):

Maximum number of same-site pages to inspect for each input URL.

## `timeoutSecs` (type: `number`):

HTTP and rendered-page timeout in seconds.

## `retries` (type: `integer`):

Retry count for transient request failures.

## `delaySecs` (type: `number`):

Polite delay between page fetches on the same site.

## `includePlausible` (type: `boolean`):

Include lower-confidence PLAUSIBLE results in addition to VERIFIED and STRONG matches.

## `includeMetadata` (type: `boolean`):

Extract public business metadata such as name, description, addresses, and sameAs profiles.

## `enrichDns` (type: `boolean`):

Resolve MX, SPF, DMARC, and root DNS records for the target and detected email domains.

## `dnsTimeoutSecs` (type: `number`):

Timeout in seconds for DNS lookups when DNS enrichment is enabled.

## `proxySettings` (type: `object`):

Optional. Enable Apify Proxy or supply your own HTTP(S) proxy URLs when needed.

## Actor input object example

```json
{
  "startUrls": [
    {
      "url": "https://www.hummel.net/"
    }
  ],
  "maxPages": 12,
  "timeoutSecs": 12,
  "retries": 2,
  "delaySecs": 0.2,
  "includePlausible": false,
  "includeMetadata": false,
  "enrichDns": false,
  "dnsTimeoutSecs": 2.5,
  "proxySettings": {
    "useApifyProxy": false
  }
}
```
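Before starting a run, it can help to build and check the input on the client side. The sketch below uses the defaults and minimums listed in the OpenAPI specification in this document; `build_input` is a hypothetical helper, and treating an empty URL list as an error is an assumption, since the schema only marks `startUrls` as required.

```python
# Defaults and minimums copied from the Actor's published input schema.
DEFAULTS = {"maxPages": 12, "timeoutSecs": 12, "retries": 2,
            "delaySecs": 0.2, "dnsTimeoutSecs": 2.5}
MINIMUMS = {"maxPages": 1, "timeoutSecs": 1, "retries": 0,
            "delaySecs": 0, "dnsTimeoutSecs": 0.1}
BOOLEAN_FLAGS = ("includePlausible", "includeMetadata", "enrichDns")


def build_input(urls: list[str], **overrides) -> dict:
    """Return an input object, rejecting unknown fields and out-of-range values."""
    if not urls:
        # Assumption: a run without any start URL is not useful.
        raise ValueError("at least one start URL is needed")
    unknown = set(overrides) - set(DEFAULTS) - set(BOOLEAN_FLAGS)
    if unknown:
        raise ValueError(f"unknown input fields: {sorted(unknown)}")
    run_input = {"startUrls": [{"url": url} for url in urls], **DEFAULTS, **overrides}
    for key, minimum in MINIMUMS.items():
        if run_input[key] < minimum:
            raise ValueError(f"{key} must be >= {minimum}")
    return run_input


print(build_input(["https://www.hummel.net/"], maxPages=5, enrichDns=True))
```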

# Actor output Schema

## `results` (type: `string`):

Default dataset items, with the overview dataset view selected by default in Apify Console.

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "startUrls": [
        {
            "url": "https://www.hummel.net/"
        }
    ]
};

// Run the Actor and wait for it to finish
const run = await client.actor("sharker/contact-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = { "startUrls": [{ "url": "https://www.hummel.net/" }] }

# Run the Actor and wait for it to finish
run = client.actor("sharker/contact-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "startUrls": [
    {
      "url": "https://www.hummel.net/"
    }
  ]
}' |
apify call sharker/contact-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=sharker/contact-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Contact Scraper",
        "description": "This scraper finds emails, phones, contact forms, chat widgets, and social links, then captures the page signals AI agents and browser automation tools can use to find them again. Built for Claude, Cursor, Codex, and similar agents to click, fill, and navigate reliably.",
        "version": "0.1",
        "x-build-id": "ImsbshVAJeKiQfmWl"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/sharker~contact-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-sharker-contact-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/sharker~contact-scraper/runs": {
            "post": {
                "operationId": "runs-sync-sharker-contact-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/sharker~contact-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-sharker-contact-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "required": [
                    "startUrls"
                ],
                "properties": {
                    "startUrls": {
                        "title": "Start URLs",
                        "type": "array",
                        "description": "One or more website homepages or contact pages to scan.",
                        "items": {
                            "type": "object",
                            "required": [
                                "url"
                            ],
                            "properties": {
                                "url": {
                                    "type": "string",
                                    "title": "URL of a web page",
                                    "format": "uri"
                                }
                            }
                        }
                    },
                    "maxPages": {
                        "title": "Max pages per site",
                        "minimum": 1,
                        "type": "integer",
                        "description": "Maximum number of same-site pages to inspect for each input URL.",
                        "default": 12
                    },
                    "timeoutSecs": {
                        "title": "Request timeout",
                        "minimum": 1,
                        "type": "number",
                        "description": "HTTP and rendered-page timeout in seconds.",
                        "default": 12
                    },
                    "retries": {
                        "title": "Retries",
                        "minimum": 0,
                        "type": "integer",
                        "description": "Retry count for transient request failures.",
                        "default": 2
                    },
                    "delaySecs": {
                        "title": "Delay between pages",
                        "minimum": 0,
                        "type": "number",
                        "description": "Polite delay between page fetches on the same site.",
                        "default": 0.2
                    },
                    "includePlausible": {
                        "title": "Include plausible matches",
                        "type": "boolean",
                        "description": "Include lower-confidence PLAUSIBLE results in addition to VERIFIED and STRONG matches.",
                        "default": false
                    },
                    "includeMetadata": {
                        "title": "Include site metadata",
                        "type": "boolean",
                        "description": "Extract public business metadata such as name, description, addresses, and sameAs profiles.",
                        "default": false
                    },
                    "enrichDns": {
                        "title": "Enrich DNS",
                        "type": "boolean",
                        "description": "Resolve MX, SPF, DMARC, and root DNS records for the target and detected email domains.",
                        "default": false
                    },
                    "dnsTimeoutSecs": {
                        "title": "DNS timeout",
                        "minimum": 0.1,
                        "type": "number",
                        "description": "Timeout in seconds for DNS lookups when DNS enrichment is enabled.",
                        "default": 2.5
                    },
                    "proxySettings": {
                        "title": "Proxy configuration",
                        "type": "object",
                        "description": "Optional. Enable Apify Proxy or supply your own HTTP(S) proxy URLs when needed.",
                        "default": {
                            "useApifyProxy": false
                        }
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
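For projects that call the REST API directly, the `run-sync-get-dataset-items` path from the specification above can be reached with the Python standard library alone. This is a minimal sketch, not an official client: `build_request` is a hypothetical helper, and the commented-out call assumes the response body is a JSON array of dataset items.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.apify.com/v2"
ACTOR_PATH = "/acts/sharker~contact-scraper/run-sync-get-dataset-items"


def build_request(token: str, run_input: dict) -> urllib.request.Request:
    """Prepare the POST request; the token goes in the query string as the spec describes."""
    query = urllib.parse.urlencode({"token": token})
    return urllib.request.Request(
        url=f"{API_BASE}{ACTOR_PATH}?{query}",
        data=json.dumps(run_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_request("<YOUR_API_TOKEN>", {"startUrls": [{"url": "https://example.com"}]})
print(request.get_method(), request.full_url)

# To actually send it (needs a valid token and network access):
# with urllib.request.urlopen(request) as response:
#     items = json.load(response)
```

Because this endpoint waits for the run to finish, long crawls may be better served by the asynchronous `/runs` path and polling.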
