# Houzz Scraper (`crawlerbros/houzz-scraper`) Actor

Scrape public Houzz professional directory and profile pages. Extract business details, ratings, reviews, hires, services, location data, and profile URLs from Houzz contractors and design professionals.

- **URL**: https://apify.com/crawlerbros/houzz-scraper.md
- **Developed by:** [Crawler Bros](https://apify.com/crawlerbros) (community)
- **Categories:** Developer tools, Lead generation, Automation
- **Stats:** 2 total users, 1 monthly users, 0.0% runs succeeded, 11 bookmarks
- **User rating**: 5.00 out of 5 stars

## Pricing

from $3.00 / 1,000 results

This Actor is paid per event and usage. You are charged both the fixed price for specific events and for Apify platform usage.
Since this Actor supports Apify Store discounts, the higher your subscription plan, the lower the price.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with a capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows (PowerShell)
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## Houzz Scraper

Scrape public Houzz professional directory and profile pages with an HTTP-first actor. Extract contractor and design professional details such as ratings, review counts, hires on Houzz, business contact fields, service lists, and location metadata.

### Modes

This actor supports four input modes — pick whichever is most convenient:

1. **Profile URLs** — paste direct Houzz pro profile URLs into `profileUrls`.
2. **Directory URLs** — paste a directory or category page URL (such as a category filtered by a city) into `directoryUrls`.
3. **By city + category** — set the `city` (e.g. `Los-Angeles--CA`, `London`, `Sydney`) and the `category` dropdown (e.g. `general-contractors`). The actor builds the directory URL automatically for the selected `country`.
4. **Hybrid** — combine any of the above; profile URLs always extract first, then directory URLs are crawled until `maxItems` is reached.
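The hybrid ordering can be pictured with a small sketch. It assumes records arrive as two lazy streams, profile records first and directory records second, and that the cap is applied across both. The function name `plan_records` is hypothetical and is not part of the actor.

```python
from itertools import chain, islice
from typing import Iterable, List


def plan_records(profile_records: Iterable[str],
                 directory_records: Iterable[str],
                 max_items: int) -> List[str]:
    """Emit profile records first, then directory records, stopping at max_items."""
    # chain() keeps the profile stream ahead of the directory stream;
    # islice() stops consuming as soon as the cap is reached.
    return list(islice(chain(profile_records, directory_records), max_items))


# Two profile URLs were supplied, so only two slots remain for directory results.
print(plan_records(["profile-1", "profile-2"],
                   ["dir-1", "dir-2", "dir-3"],
                   max_items=4))
# → ['profile-1', 'profile-2', 'dir-1', 'dir-2']
```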

### Markets supported

Houzz operates in 16 markets and the actor supports all of them via the `country` dropdown:

US, UK, Canada, Australia, Germany, France, Ireland, Italy, Spain, New Zealand, India, Singapore, Japan, Russia, Denmark, Sweden.

When `country` + `city` + `category` are set, the actor constructs the matching country-domain directory URL (`www.houzz.com`, `www.houzz.co.uk`, `www.houzz.de`, etc.).
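As a rough illustration of the URL construction, the sketch below assembles a directory URL from `country`, `category`, and `city`. The path layout is copied from the US example URL used elsewhere in this document; whether other markets use the same layout is an assumption. The pairing of `uk` and `de` with their hosts is inferred from the host list above, and the remaining markets are left out because their hosts are not listed here. `build_directory_url` and `HOUZZ_HOSTS` are hypothetical names.

```python
from urllib.parse import quote

# Only the hosts named in this README; pairing with country codes is assumed.
HOUZZ_HOSTS = {
    "us": "www.houzz.com",
    "uk": "www.houzz.co.uk",
    "de": "www.houzz.de",
}


def build_directory_url(country: str, category: str, city: str) -> str:
    """Assemble a directory URL following the documented US example."""
    host = HOUZZ_HOSTS.get(country)
    if host is None:
        raise ValueError(f"host for country {country!r} is not covered by this sketch")
    if not category or not city:
        raise ValueError("both category and city are required")
    # Assumption: /professionals/<category>/c/<city> applies to every market.
    return f"https://{host}/professionals/{quote(category)}/c/{quote(city)}"


print(build_directory_url("us", "general-contractors", "Los-Angeles--CA"))
# → https://www.houzz.com/professionals/general-contractors/c/Los-Angeles--CA
```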

### Categories supported

The `category` dropdown exposes 25+ professional categories — general contractors, architects, interior designers, kitchen/bath designers, painters, landscapers, home builders, and many more.

### Anti-bot

By default, the actor runs without a proxy, from datacenter IPs (Houzz is normally reachable). Two settings control proxy use when a fetch is blocked:

- `useProxy` (default `false`) — when enabled, all requests go through the configured `proxyConfiguration`.
- `autoEscalateOnBlock` (default `true`) — when a profile or directory fetch returns nothing, the actor automatically retries that single URL through Apify residential proxy. The directory-card data is still emitted as a fallback if the profile page itself is blocked.
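The escalation and fallback behaviour described above might look roughly like the following. This is a sketch under stated assumptions, not the actor's code: the two fetchers are injected callables that return a page body or `None` when blocked, and the names `fetch_with_escalation` and `record_for` are hypothetical.

```python
from typing import Callable, Dict, Optional, Tuple

# A fetcher returns the page body, or None when the request came back empty.
Fetcher = Callable[[str], Optional[str]]


def fetch_with_escalation(url: str,
                          fetch_direct: Fetcher,
                          fetch_residential: Fetcher,
                          auto_escalate: bool = True) -> Tuple[Optional[str], str]:
    """Try the default route first; retry once via residential proxy if allowed."""
    body = fetch_direct(url)
    if body:
        return body, "direct"
    if auto_escalate:
        body = fetch_residential(url)
        if body:
            return body, "residential"
    return None, "blocked"


def record_for(card: Dict[str, str], profile_body: Optional[str]) -> Dict[str, str]:
    """Tag a record by source; fall back to the directory card when blocked."""
    if profile_body is None:
        return {**card, "recordSource": "directory_card"}
    # A real scraper would parse profile_body here; this sketch only tags it.
    return {**card, "recordSource": "full_profile"}


# Stub fetchers and a made-up card, for illustration only.
card = {"businessName": "Example Builders"}
body, route = fetch_with_escalation(
    "https://example.com/profile",
    fetch_direct=lambda url: None,                      # simulates a block
    fetch_residential=lambda url: "<html>ok</html>",    # retry succeeds
)
print(route, record_for(card, body)["recordSource"])
# → residential full_profile
```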

### Input fields

| Field | Type | Description |
|---|---|---|
| `directoryUrls` | array | Houzz directory or category page URLs. |
| `profileUrls` | array | Direct Houzz professional profile URLs. |
| `city` | string | City + state slug, e.g. `Los-Angeles--CA`, `London`. |
| `category` | enum | Houzz professional category slug. |
| `country` | enum | Houzz market/domain (us, uk, ca, au, de, fr, ie, it, es, nz, in, sg, jp, ru, dk, se). |
| `maxItems` | integer (1-250) | Hard cap on emitted records. |
| `useProxy` | boolean | Route requests through Apify proxy. |
| `autoEscalateOnBlock` | boolean | Auto-retry blocked fetches via residential proxy. |
| `proxyConfiguration` | object | Apify proxy configuration (advanced). |
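When assembling input programmatically, a small helper can catch mistakes before a run is started. The sketch below is a client-side convenience only; `build_run_input` is a hypothetical name, and the rule that `city` and `category` must be set together is an assumption drawn from the field descriptions, not a check the actor is known to enforce.

```python
from typing import Any, Dict, Sequence


def build_run_input(profile_urls: Sequence[str] = (),
                    directory_urls: Sequence[str] = (),
                    city: str = "",
                    category: str = "",
                    country: str = "us",
                    max_items: int = 25,
                    use_proxy: bool = False,
                    auto_escalate_on_block: bool = True) -> Dict[str, Any]:
    """Build an input object using the field names from the table above."""
    # The table documents maxItems as an integer in the range 1-250.
    if not 1 <= max_items <= 250:
        raise ValueError("maxItems must be between 1 and 250")
    # Assumption: city and category only make sense as a pair.
    if bool(city) != bool(category):
        raise ValueError("set city and category together, or neither")
    return {
        "profileUrls": list(profile_urls),
        "directoryUrls": list(directory_urls),
        "city": city,
        "category": category,
        "country": country,
        "maxItems": max_items,
        "useProxy": use_proxy,
        "autoEscalateOnBlock": auto_escalate_on_block,
    }


# City + category mode for the UK market.
print(build_run_input(city="London", category="architects-and-building-designers",
                      country="uk", max_items=50))
```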

### Output

Each record can include:

- `title` / `businessName` / `description`
- `phoneNumber` / `websiteUrl` / `licenseNumber`
- `ratingValue` / `reviewCount` / `hiresOnHouzz` / `followersCount` / `projectCount`
- `verifiedLicense` / `awards`
- `typicalJobCost` / `servicesProvided` / `areaServed`
- `streetAddress` / `city` / `state` / `postalCode` / `countryCode` / `country` / `fullAddress`
- `latitude` / `longitude`
- `imageUrl` / `directorySnippet`
- `sourceUrl` / `profileUrl` / `canonicalUrl` / `slug`
- `siteName` / `recordType` / `recordSource` / `scrapedAt`

`recordSource` is `"full_profile"` when the profile page was successfully fetched (full set of fields including `description`, `licenseNumber`, `websiteUrl`, `servicesProvided`, etc.), or `"directory_card"` when only the directory-card subset was available (anti-bot fallback). Directory-card records typically lack `description`, `licenseNumber`, `websiteUrl`, `typicalJobCost`, `servicesProvided`, and `followersCount`.

Empty values are omitted from output (no `null`, blank strings, or empty arrays).
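Because empty fields are omitted rather than set to `null`, downstream code should not assume that any given key is present. A minimal sketch of one way to handle this follows; the sample records are made up for illustration and are not real actor output, and the helper names are hypothetical.

```python
from typing import Any, Dict, Iterable, List, Tuple

Record = Dict[str, Any]


def split_by_source(items: Iterable[Record]) -> Tuple[List[Record], List[Record]]:
    """Separate full-profile records from directory-card fallbacks."""
    full, cards = [], []
    for item in items:
        target = full if item.get("recordSource") == "full_profile" else cards
        target.append(item)
    return full, cards


def to_row(item: Record, columns: List[str]) -> Record:
    """Project a record onto fixed columns; omitted fields become None."""
    return {column: item.get(column) for column in columns}


# Made-up records showing the two shapes described above.
items = [
    {"businessName": "Example Builders", "recordSource": "full_profile",
     "websiteUrl": "https://example.com", "ratingValue": 4.9},
    {"businessName": "Sample Design Co", "recordSource": "directory_card",
     "ratingValue": 5},
]

full, cards = split_by_source(items)
print(len(full), len(cards))
# → 1 1
print(to_row(cards[0], ["businessName", "websiteUrl"]))
# → {'businessName': 'Sample Design Co', 'websiteUrl': None}
```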

#### Field notes

- `phoneNumber` is returned as a human-readable formatted string exactly as Houzz publishes it (e.g. `"(424) 280-3130"`), not a digits-only normalized form. If you need a digits-only value, strip non-digits in your downstream pipeline (e.g. `re.sub(r"\D", "", phoneNumber)`).
- `ratingValue` is a JSON number on a `1.0`–`5.0` scale (e.g. `4.9` or `5`). Whole numbers may serialize without a decimal point; cast to float in your downstream pipeline if you need a uniform numeric type.
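Both normalizations from the field notes fit in a few lines. The sketch below copies a record and adds two derived keys, `phoneDigits` and a float `ratingValue`; the key name `phoneDigits` and the function names are hypothetical choices, and missing fields are left missing.

```python
import re
from typing import Any, Dict


def digits_only(phone: str) -> str:
    """Strip every non-digit character from a formatted phone string."""
    return re.sub(r"\D", "", phone)


def normalize(item: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy with a digits-only phone and a float rating, when present."""
    out = dict(item)
    if "phoneNumber" in out:
        out["phoneDigits"] = digits_only(out["phoneNumber"])
    if "ratingValue" in out:
        # JSON may deliver 5 as an int; float() makes the type uniform.
        out["ratingValue"] = float(out["ratingValue"])
    return out


print(digits_only("(424) 280-3130"))
# → 4242803130
print(normalize({"phoneNumber": "(424) 280-3130", "ratingValue": 5}))
# → {'phoneNumber': '(424) 280-3130', 'ratingValue': 5.0, 'phoneDigits': '4242803130'}
```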

### Limitations

- Houzz page structure varies by professional category and locale, so some fields appear only when publicly shown on the target page.
- When a profile page is blocked even after escalation, the actor falls back to the directory-card data (a smaller but still useful subset).

# Actor input Schema

## `directoryUrls` (type: `array`):

Houzz directory or category URLs, such as a contractor directory filtered by city. Used directly without modification.
## `profileUrls` (type: `array`):

Direct Houzz professional profile URLs.
## `city` (type: `string`):

City + state slug for the byCategory mode, e.g. 'Los-Angeles--CA', 'New-York--NY', 'Chicago--IL'. When set together with `category`, the actor builds a directory URL automatically.
## `category` (type: `string`):

Houzz professional category slug. When set together with `city`, the actor builds a directory URL automatically.
## `country` (type: `string`):

Houzz market/country domain to use when building directory URLs from `city` + `category`. Defaults to the US site (www.houzz.com).
## `maxItems` (type: `integer`):

Hard cap on emitted Houzz professional records.
## `useProxy` (type: `boolean`):

Route requests through Apify rotating proxy. Recommended for cloud runs — Houzz blocks known datacenter IP ranges without proxy rotation.
## `autoEscalateOnBlock` (type: `boolean`):

If a profile fetch is blocked, automatically retry with residential Apify proxy. The directory-card data is still emitted as a fallback.
## `proxyConfiguration` (type: `object`):

Optional Apify proxy configuration. Only used when `useProxy` is enabled. Leave empty to let `useProxy` + `autoEscalateOnBlock` pick sane defaults.

## Actor input object example

```json
{
  "directoryUrls": [
    "https://www.houzz.com/professionals/general-contractors/c/Los-Angeles--CA"
  ],
  "profileUrls": [],
  "city": "",
  "category": "",
  "country": "us",
  "maxItems": 25,
  "useProxy": true,
  "autoEscalateOnBlock": true,
  "proxyConfiguration": {
    "useApifyProxy": true
  }
}
````

# Actor output Schema

## `professionals` (type: `string`):

Dataset containing Houzz professional records.

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "directoryUrls": [
        "https://www.houzz.com/professionals/general-contractors/c/Los-Angeles--CA"
    ],
    "profileUrls": [],
    "city": "",
    "category": "",
    "country": "us",
    "maxItems": 25,
    "useProxy": true,
    "autoEscalateOnBlock": true,
    "proxyConfiguration": {
        "useApifyProxy": true
    }
};

// Run the Actor and wait for it to finish
const run = await client.actor("crawlerbros/houzz-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "directoryUrls": ["https://www.houzz.com/professionals/general-contractors/c/Los-Angeles--CA"],
    "profileUrls": [],
    "city": "",
    "category": "",
    "country": "us",
    "maxItems": 25,
    "useProxy": True,
    "autoEscalateOnBlock": True,
    "proxyConfiguration": { "useApifyProxy": True },
}

# Run the Actor and wait for it to finish
run = client.actor("crawlerbros/houzz-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "directoryUrls": [
    "https://www.houzz.com/professionals/general-contractors/c/Los-Angeles--CA"
  ],
  "profileUrls": [],
  "city": "",
  "category": "",
  "country": "us",
  "maxItems": 25,
  "useProxy": true,
  "autoEscalateOnBlock": true,
  "proxyConfiguration": {
    "useApifyProxy": true
  }
}' |
apify call crawlerbros/houzz-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=crawlerbros/houzz-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Houzz Scraper",
        "description": "Scrape public Houzz professional directory and profile pages. Extract business details, ratings, reviews, hires, services, location data, and profile URLs from Houzz contractors and design professionals.",
        "version": "1.0",
        "x-build-id": "OnbdtKIxWh86K5VaD"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/crawlerbros~houzz-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-crawlerbros-houzz-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/crawlerbros~houzz-scraper/runs": {
            "post": {
                "operationId": "runs-sync-crawlerbros-houzz-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/crawlerbros~houzz-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-crawlerbros-houzz-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "properties": {
                    "directoryUrls": {
                        "title": "Directory URLs",
                        "type": "array",
                        "description": "Houzz directory or category URLs, such as a contractor directory filtered by city. Used directly without modification.",
                        "default": [
                            "https://www.houzz.com/professionals/general-contractors/c/Los-Angeles--CA"
                        ],
                        "items": {
                            "type": "string"
                        }
                    },
                    "profileUrls": {
                        "title": "Profile URLs",
                        "type": "array",
                        "description": "Direct Houzz professional profile URLs.",
                        "default": [],
                        "items": {
                            "type": "string"
                        }
                    },
                    "city": {
                        "title": "City (optional, used with category to build a directory URL)",
                        "type": "string",
                        "description": "City + state slug for the byCategory mode, e.g. 'Los-Angeles--CA', 'New-York--NY', 'Chicago--IL'. When set together with `category`, the actor builds a directory URL automatically.",
                        "default": ""
                    },
                    "category": {
                        "title": "Professional category (optional, used with city)",
                        "enum": [
                            "",
                            "general-contractors",
                            "architects-and-building-designers",
                            "interior-designers-and-decorators",
                            "kitchen-and-bath-designers",
                            "kitchen-and-bath-remodelers",
                            "design-build-firms",
                            "home-builders",
                            "landscape-contractors",
                            "landscape-architects-and-designers",
                            "tile-and-stone-contractors",
                            "carpenters",
                            "cabinets-and-cabinetry",
                            "hardwood-flooring-dealers",
                            "custom-closet-designers",
                            "basement-remodelers",
                            "decks-and-patios",
                            "fencing-and-gates",
                            "driveways-and-paving",
                            "pools-and-spas",
                            "painters",
                            "roofing-and-gutters",
                            "professional-organizers",
                            "home-stagers",
                            "windows",
                            "furniture-and-accessories"
                        ],
                        "type": "string",
                        "description": "Houzz professional category slug. When set together with `city`, the actor builds a directory URL automatically.",
                        "default": ""
                    },
                    "country": {
                        "title": "Country (Houzz domain)",
                        "enum": [
                            "us",
                            "uk",
                            "ca",
                            "au",
                            "de",
                            "fr",
                            "ie",
                            "it",
                            "es",
                            "nz",
                            "in",
                            "sg",
                            "jp",
                            "ru",
                            "dk",
                            "se"
                        ],
                        "type": "string",
                        "description": "Houzz market/country domain to use when building directory URLs from `city` + `category`. Defaults to the US site (www.houzz.com).",
                        "default": "us"
                    },
                    "maxItems": {
                        "title": "Max items",
                        "minimum": 1,
                        "maximum": 250,
                        "type": "integer",
                        "description": "Hard cap on emitted Houzz professional records.",
                        "default": 25
                    },
                    "useProxy": {
                        "title": "Use Apify proxy",
                        "type": "boolean",
                        "description": "Route requests through Apify rotating proxy. Recommended for cloud runs — Houzz blocks known datacenter IP ranges without proxy rotation.",
                        "default": false
                    },
                    "autoEscalateOnBlock": {
                        "title": "Auto-escalate on block",
                        "type": "boolean",
                        "description": "If a profile fetch is blocked, automatically retry with residential Apify proxy. The directory-card data is still emitted as a fallback.",
                        "default": true
                    },
                    "proxyConfiguration": {
                        "title": "Proxy configuration (advanced)",
                        "type": "object",
                        "description": "Optional Apify proxy configuration. Only used when `useProxy` is enabled. Leave empty to let `useProxy` + `autoEscalateOnBlock` pick sane defaults.",
                        "default": {}
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
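For projects that do not use the client libraries, the `run-sync-get-dataset-items` path from the specification above can be called with the Python standard library alone. This is a sketch under stated assumptions: the response body is assumed to be a JSON array of dataset items, which the specification does not spell out, and `build_request` and `run_sync` are hypothetical helper names. The token travels in the query string because that is how the specification declares it.

```python
import json
import urllib.request
from typing import Any, Dict, List
from urllib.parse import urlencode

API_BASE = "https://api.apify.com/v2"
ACTOR_PATH = "/acts/crawlerbros~houzz-scraper/run-sync-get-dataset-items"


def build_request(token: str, run_input: Dict[str, Any]) -> urllib.request.Request:
    """Prepare the POST request without sending it."""
    url = f"{API_BASE}{ACTOR_PATH}?{urlencode({'token': token})}"
    body = json.dumps(run_input).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_sync(token: str, run_input: Dict[str, Any],
             timeout: float = 300.0) -> List[Dict[str, Any]]:
    """Send the request and decode the response (assumed to be a JSON array)."""
    request = build_request(token, run_input)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Inspect the prepared request; call run_sync(...) with a real token to execute it.
request = build_request("<YOUR_API_TOKEN>", {"maxItems": 5, "country": "us"})
print(request.get_method(), request.full_url.split("?")[0])
```

A synchronous call holds the HTTP connection open for the whole run, so for long crawls the asynchronous `/runs` path may be the safer choice.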
