# My Actor 1 (`fortuitous_pirate/my-actor-1`) Actor

- **URL**: https://apify.com/fortuitous_pirate/my-actor-1.md
- **Developed by:** [Fortuitous Pirate](https://apify.com/fortuitous_pirate) (community)
- **Categories:** Developer tools
- **Stats:** 2 total users, 1 monthly user, 100.0% runs succeeded
- **User rating**: No ratings yet

## Pricing

Pay per usage

This Actor is paid per platform usage. The Actor is free to use, and you only pay for the Apify platform usage, which gets cheaper the higher subscription plan you have.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-usage

## What's an Apify Actor?

Actors are software tools that run on the Apify platform and cover all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action that can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server that can be used as a website, an API, or an MCP server.
The word "Actor" is always written with a capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## TheKnot Wedding Vendor Scraper

Scrape wedding vendor listings from TheKnot.com marketplace. Extract comprehensive data including contact info, pricing, reviews, photos, and social links for photographers, venues, DJs, florists, and more.

### Features

- **Full pagination support** - Scrape entire categories, not limited to a few URLs
- **18 vendor categories** - Photographers, venues, DJs, florists, planners, caterers, etc.
- **Complete contact data** - Phone, email, website, social media links
- **Pricing information** - Starting prices and price ranges
- **Reviews and ratings** - Star ratings and review counts
- **Media extraction** - Photo URLs with dimensions
- **Business details** - Awards, service areas, owner bios

### Input Configuration

| Field | Type | Description |
|-------|------|-------------|
| `startUrls` | array | TheKnot marketplace URLs to scrape |
| `category` | string | Vendor category (photographers, venues, etc.) |
| `location` | string | City/state in URL format (e.g., `new-york-ny`) |
| `maxPages` | integer | Max listing pages (30 vendors/page). 0 = unlimited |
| `maxVendors` | integer | Max total vendors. 0 = unlimited |
| `scrapeDetails` | boolean | Visit detail pages for full contact info |
| `proxyConfiguration` | object | Proxy settings |

### Example Input

```json
{
    "category": "wedding-photographers",
    "location": "los-angeles-ca",
    "maxPages": 10,
    "maxVendors": 200,
    "scrapeDetails": true,
    "proxyConfiguration": {
        "useApifyProxy": true,
        "apifyProxyGroups": ["RESIDENTIAL"]
    }
}
````
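
The start URL used in the examples on this page (`.../marketplace/wedding-photographers-new-york-ny`) suggests that listing URLs are formed as `<category>-<location>`. The following is a minimal sketch under that assumption; `build_listing_url` and `build_input` are hypothetical helpers for illustration and are not part of the Actor.

```python
BASE_URL = "https://www.theknot.com/marketplace"


def build_listing_url(category: str, location: str) -> str:
    """Join a category slug and a location slug into a listing URL.

    Assumes the `<category>-<location>` pattern seen in the example start URL.
    """
    category = category.strip().strip("-").lower()
    location = location.strip().strip("-").lower()
    if not category or not location:
        raise ValueError("category and location must be non-empty slugs")
    return f"{BASE_URL}/{category}-{location}"


def build_input(category: str, location: str, max_pages: int = 5,
                max_vendors: int = 100, scrape_details: bool = True) -> dict:
    """Assemble an input object using the field names from the input table."""
    return {
        "startUrls": [{"url": build_listing_url(category, location)}],
        "category": category,
        "location": location,
        "maxPages": max_pages,
        "maxVendors": max_vendors,
        "scrapeDetails": scrape_details,
        "proxyConfiguration": {
            "useApifyProxy": True,
            "apifyProxyGroups": ["RESIDENTIAL"],
        },
    }
```

Whether the Actor needs both `startUrls` and the `category`/`location` pair is not stated here, so the sketch simply fills in all of them.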

### Output Data

Each vendor record includes:

```json
{
    "name": "Studio Name",
    "phone": "(555) 123-4567",
    "email": "contact@studio.com",
    "website": "https://studio.com",
    "city": "New York",
    "state": "NY",
    "serviceArea": "Tri-State area",
    "startingPrice": "$5,000+",
    "rating": 5,
    "reviewCount": 150,
    "category": "Wedding Photographers",
    "facebookUrl": "https://facebook.com/...",
    "instagramUsername": "studiogram",
    "photos": [
        { "url": "https://...", "width": 1600, "height": 1067 }
    ],
    "awards": ["2024 Best of Weddings Winner", "Hall of Fame"],
    "ownerName": "Jane Doe",
    "ownerRole": "Lead Photographer",
    "profileUrl": "https://www.theknot.com/marketplace/..."
}
```
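
Because each record nests `photos` and `awards`, exporting to a spreadsheet usually needs a flattening step. The sketch below shows one possible approach using only the standard library; the column selection and the `photoCount` column are illustrative choices, not something the Actor produces.

```python
import csv
import io

# Columns to export; all but `photoCount` are field names from the sample record.
CSV_FIELDS = [
    "name", "phone", "email", "website", "city", "state",
    "startingPrice", "rating", "reviewCount", "category",
    "instagramUsername", "profileUrl", "photoCount", "awards",
]


def flatten_vendor(record: dict) -> dict:
    """Turn one vendor record into a flat row; missing fields become ''."""
    row = {field: record.get(field, "") for field in CSV_FIELDS}
    row["photoCount"] = len(record.get("photos") or [])
    row["awards"] = "; ".join(record.get("awards") or [])
    return row


def vendors_to_csv(records) -> str:
    """Serialize an iterable of vendor records to CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=CSV_FIELDS)
    writer.writeheader()
    for record in records:
        writer.writerow(flatten_vendor(record))
    return buffer.getvalue()
```

With the Python client, `vendors_to_csv(client.dataset(run["defaultDatasetId"]).iterate_items())` would be one way to feed it.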

### Available Categories

- `wedding-photographers`
- `wedding-videographers`
- `wedding-reception-venues`
- `wedding-djs`
- `live-wedding-bands`
- `florists`
- `wedding-planners`
- `wedding-cake-bakeries`
- `catering`
- `beauty-services`
- `bridal-salons`
- `wedding-officiants`
- `wedding-photo-booth-rentals`
- `transportation-services`
- `bar-services`
- `wedding-decor-shops`
- `rehearsal-dinners-bridal-showers`
- `jewelers`

### Usage Tips

1. **For bulk data**: Set `maxPages` high and `scrapeDetails: true` for complete contact info
2. **For speed**: Set `scrapeDetails: false` to skip detail pages (less data but 10x faster)
3. **Large cities**: NYC, LA, Chicago have 300+ vendors per category
4. **Proxies recommended**: Use residential proxies to avoid rate limiting
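
To size a run before starting it, the limits can be combined into a rough upper bound. This is a sketch only: the README does not say how `maxPages` and `maxVendors` interact, so the code assumes the run stops at whichever limit is reached first, that every listing page is full (30 vendors), and that `scrapeDetails` adds one request per vendor.

```python
import math
from typing import Optional

VENDORS_PER_PAGE = 30  # figure stated in this README


def vendor_cap(max_pages: int, max_vendors: int) -> Optional[int]:
    """Upper bound on vendors for a run; None means unlimited (both limits 0)."""
    limits = []
    if max_pages > 0:
        limits.append(max_pages * VENDORS_PER_PAGE)
    if max_vendors > 0:
        limits.append(max_vendors)
    return min(limits) if limits else None


def estimate_requests(max_pages: int, max_vendors: int,
                      scrape_details: bool) -> Optional[int]:
    """Rough request count: listing pages plus one detail page per vendor."""
    cap = vendor_cap(max_pages, max_vendors)
    if cap is None:
        return None
    listing_pages = math.ceil(cap / VENDORS_PER_PAGE)
    return listing_pages + (cap if scrape_details else 0)
```

For the example input above (`maxPages: 10`, `maxVendors: 200`), the cap is 200 vendors, which corresponds to 7 listing pages plus 200 detail pages when `scrapeDetails` is enabled.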

### Pricing

Uses pay-per-result pricing via `Actor.charge()`:

- Charged per vendor scraped
- No charge for failed requests

### Technical Notes

- Data source: `window.__INITIAL_STATE__` JSON embedded in pages
- No heavy anti-bot protection when using proper browser headers
- 30 vendors per listing page
- Includes 500ms delay between detail page requests to avoid rate limiting
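
As an illustration of the first note, the sketch below pulls an embedded state object out of raw HTML. It assumes the page assigns a plain JSON literal (`window.__INITIAL_STATE__ = {...};`); if the site wraps the value in `JSON.parse(...)` or uses JavaScript-only values, this approach would not work as written. It is not the Actor's actual implementation.

```python
import json

MARKER = "window.__INITIAL_STATE__"


def extract_initial_state(html: str):
    """Return the parsed state object, or None if it cannot be found or parsed."""
    start = html.find(MARKER)
    if start == -1:
        return None
    equals = html.find("=", start + len(MARKER))
    if equals == -1:
        return None
    # raw_decode does not skip leading whitespace, so strip it first.
    payload = html[equals + 1:].lstrip()
    try:
        # raw_decode parses one JSON value and ignores whatever follows it,
        # such as the trailing ';' and '</script>'.
        state, _end = json.JSONDecoder().raw_decode(payload)
    except json.JSONDecodeError:
        return None
    return state
```

Using `raw_decode` avoids a fragile regular expression for finding the closing brace, since the JSON parser itself decides where the object ends.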

# Actor input Schema

## `startUrls` (type: `array`):

TheKnot marketplace URLs to scrape. Can be category listings (e.g., wedding-photographers-new-york-ny) or individual vendor pages.

## `category` (type: `string`):

Type of wedding vendor to search for

## `location` (type: `string`):

City and state in URL format (e.g., 'new-york-ny', 'los-angeles-ca', 'chicago-il')

## `maxPages` (type: `integer`):

Maximum number of listing pages to scrape (30 vendors per page). Set to 0 for unlimited.

## `scrapeDetails` (type: `boolean`):

Visit each vendor's detail page to get full contact info (phone, email, social links). Slower but more complete data.

## `maxVendors` (type: `integer`):

Maximum total vendors to scrape. Set to 0 for unlimited.

## `proxyConfiguration` (type: `object`):

Proxy settings for avoiding blocks

## Actor input object example

```json
{
  "startUrls": [
    {
      "url": "https://www.theknot.com/marketplace/wedding-photographers-new-york-ny"
    }
  ],
  "location": "new-york-ny",
  "maxPages": 5,
  "scrapeDetails": true,
  "maxVendors": 100,
  "proxyConfiguration": {
    "useApifyProxy": true,
    "apifyProxyGroups": [
      "RESIDENTIAL"
    ]
  }
}
```
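
Checking an input object locally before starting a run can catch typos early. The sketch below mirrors the constraints listed in the OpenAPI specification on this page (the `category` enum, `maxPages` between 0 and 100, non-negative `maxVendors`, and a required `url` in each `startUrls` item). `validate_input` is a hypothetical helper; the platform performs its own validation regardless.

```python
from typing import List

CATEGORIES = {
    "wedding-photographers", "wedding-videographers", "wedding-reception-venues",
    "wedding-djs", "live-wedding-bands", "florists", "wedding-planners",
    "wedding-cake-bakeries", "catering", "beauty-services", "bridal-salons",
    "wedding-officiants", "wedding-photo-booth-rentals", "transportation-services",
    "bar-services", "wedding-decor-shops", "rehearsal-dinners-bridal-showers",
    "jewelers",
}


def _is_int(value) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool)


def validate_input(actor_input: dict) -> List[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    errors: List[str] = []
    category = actor_input.get("category")
    if category is not None and category not in CATEGORIES:
        errors.append(f"unknown category: {category!r}")
    max_pages = actor_input.get("maxPages", 5)  # schema default
    if not _is_int(max_pages) or not 0 <= max_pages <= 100:
        errors.append("maxPages must be an integer between 0 and 100")
    max_vendors = actor_input.get("maxVendors", 100)  # schema default
    if not _is_int(max_vendors) or max_vendors < 0:
        errors.append("maxVendors must be a non-negative integer")
    for index, entry in enumerate(actor_input.get("startUrls", [])):
        if not isinstance(entry, dict) or not entry.get("url"):
            errors.append(f"startUrls[{index}] is missing a 'url'")
    return errors
```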

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "startUrls": [
        {
            "url": "https://www.theknot.com/marketplace/wedding-photographers-new-york-ny"
        }
    ],
    "location": "new-york-ny",
    "proxyConfiguration": {
        "useApifyProxy": true,
        "apifyProxyGroups": [
            "RESIDENTIAL"
        ]
    }
};

// Run the Actor and wait for it to finish
const run = await client.actor("fortuitous_pirate/my-actor-1").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "startUrls": [{ "url": "https://www.theknot.com/marketplace/wedding-photographers-new-york-ny" }],
    "location": "new-york-ny",
    "proxyConfiguration": {
        "useApifyProxy": True,
        "apifyProxyGroups": ["RESIDENTIAL"],
    },
}

# Run the Actor and wait for it to finish
run = client.actor("fortuitous_pirate/my-actor-1").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "startUrls": [
    {
      "url": "https://www.theknot.com/marketplace/wedding-photographers-new-york-ny"
    }
  ],
  "location": "new-york-ny",
  "proxyConfiguration": {
    "useApifyProxy": true,
    "apifyProxyGroups": [
      "RESIDENTIAL"
    ]
  }
}' |
apify call fortuitous_pirate/my-actor-1 --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=fortuitous_pirate/my-actor-1",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "My Actor 1",
        "description": "",
        "version": "0.0",
        "x-build-id": "Woj3yrh2sv5VIR0Wt"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/fortuitous_pirate~my-actor-1/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-fortuitous_pirate-my-actor-1",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/fortuitous_pirate~my-actor-1/runs": {
            "post": {
                "operationId": "runs-sync-fortuitous_pirate-my-actor-1",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/fortuitous_pirate~my-actor-1/run-sync": {
            "post": {
                "operationId": "run-sync-fortuitous_pirate-my-actor-1",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "properties": {
                    "startUrls": {
                        "title": "Start URLs",
                        "type": "array",
                        "description": "TheKnot marketplace URLs to scrape. Can be category listings (e.g., wedding-photographers-new-york-ny) or individual vendor pages.",
                        "items": {
                            "type": "object",
                            "required": [
                                "url"
                            ],
                            "properties": {
                                "url": {
                                    "type": "string",
                                    "title": "URL of a web page",
                                    "format": "uri"
                                }
                            }
                        }
                    },
                    "category": {
                        "title": "Vendor Category",
                        "enum": [
                            "wedding-photographers",
                            "wedding-videographers",
                            "wedding-reception-venues",
                            "wedding-djs",
                            "live-wedding-bands",
                            "florists",
                            "wedding-planners",
                            "wedding-cake-bakeries",
                            "catering",
                            "beauty-services",
                            "bridal-salons",
                            "wedding-officiants",
                            "wedding-photo-booth-rentals",
                            "transportation-services",
                            "bar-services",
                            "wedding-decor-shops",
                            "rehearsal-dinners-bridal-showers",
                            "jewelers"
                        ],
                        "type": "string",
                        "description": "Type of wedding vendor to search for"
                    },
                    "location": {
                        "title": "Location",
                        "type": "string",
                        "description": "City and state in URL format (e.g., 'new-york-ny', 'los-angeles-ca', 'chicago-il')"
                    },
                    "maxPages": {
                        "title": "Max Pages",
                        "minimum": 0,
                        "maximum": 100,
                        "type": "integer",
                        "description": "Maximum number of listing pages to scrape (30 vendors per page). Set to 0 for unlimited.",
                        "default": 5
                    },
                    "scrapeDetails": {
                        "title": "Scrape Vendor Details",
                        "type": "boolean",
                        "description": "Visit each vendor's detail page to get full contact info (phone, email, social links). Slower but more complete data.",
                        "default": true
                    },
                    "maxVendors": {
                        "title": "Max Vendors",
                        "minimum": 0,
                        "type": "integer",
                        "description": "Maximum total vendors to scrape. Set to 0 for unlimited.",
                        "default": 100
                    },
                    "proxyConfiguration": {
                        "title": "Proxy Configuration",
                        "type": "object",
                        "description": "Proxy settings for avoiding blocks"
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
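
For projects that cannot use a client library, the `run-sync-get-dataset-items` path from the specification above can be called directly. The following is a minimal sketch using only the Python standard library; it passes the token as the `token` query parameter, as the specification describes, and assumes the response body is a JSON array of dataset items. The function names are illustrative.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.apify.com/v2"
ACTOR_PATH = "fortuitous_pirate~my-actor-1"  # '~' replaces '/' in API paths


def build_sync_request(token: str, actor_input: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request that runs the Actor."""
    query = urllib.parse.urlencode({"token": token})
    url = f"{API_BASE}/acts/{ACTOR_PATH}/run-sync-get-dataset-items?{query}"
    body = json.dumps(actor_input).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


def run_and_fetch(token: str, actor_input: dict, timeout: float = 300):
    """Send the request and return the parsed response.

    Assumption: the endpoint answers with a JSON array of dataset items.
    The call blocks until the run finishes or the timeout expires.
    """
    request = build_sync_request(token, actor_input)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Long scrapes may exceed what a single synchronous HTTP call can comfortably wait for; in that case the `/runs` path in the specification starts the run and returns immediately with run metadata such as `defaultDatasetId`.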
