Under maintenance

Pricing

from $3.00 / 1,000 results

Houzz Pro Scraper

Scrapes Houzz professional listings into a clean lead dataset (name, phone, address; optional rating/reviews/website/license). Two-tier crawl. Pay-Per-Result: priced per record (use maxResults to cap spend).

Rating

0.0

(0)

Developer

Nathan Black

Last modified

a month ago

Output fields & measured fill rates

Honesty note. Nulls are kept in the output — a null is an honest signal that Houzz didn't expose that field for that pro, not a dropped field. Fill rates below come from the real runs described under Coverage & caveats. Your numbers will vary by category and location.

Tier 1 — base record

Measured on 500 records, general-contractor + architects across Austin/Dallas/Houston (TX) plus a few IL/NC, 2026-06-22.

Field	Fill	Notes
`profile_id`	100%	The `pf~<id>`; stable dedup key.
`profile_url`	100%	Canonical Houzz profile URL (from JSON-LD `sameAs`).
`name`	100%	Business name.
`telephone`	98%	The field that makes this a lead list.
`locality` (city)	99%
`region` (state)	98%
`postal_code`	91%
`country`	100%
`street_address`	72%	Many pros list city/region only, no street.
`latitude` / `longitude`	80%	Geo present for ~4 of 5 pros.
`image`	100%	Primary photo URL.
`area_served`	100%

Dedup verified clean: 500 unique profile_ids out of 500 pushed (546 overlapping/sponsored duplicates were dropped before push).
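The dedup rule described above (key on profile_id, skip records that have none) can be sketched in a few lines. This is a minimal illustration and not the actor's actual code: the function name and the sample records are hypothetical, and only the `profile_id` field and its `pf~<id>` shape come from the table above.

```python
from typing import Iterable, Iterator


def dedup_by_profile_id(records: Iterable[dict]) -> Iterator[dict]:
    """Yield each record once, keyed on profile_id; skip records without one."""
    seen: set[str] = set()
    for record in records:
        profile_id = record.get("profile_id")
        if not profile_id:
            continue  # no stable key, so the record is skipped rather than pushed
        if profile_id in seen:
            continue  # overlapping or sponsored repeat of a pro already seen
        seen.add(profile_id)
        yield record


# Hypothetical listing rows, including one repeat and one row with no id.
listing = [
    {"profile_id": "pf~101", "name": "Acme Builders"},
    {"profile_id": "pf~102", "name": "Lone Star Homes"},
    {"profile_id": "pf~101", "name": "Acme Builders"},
    {"name": "No Id Remodeling"},
]
unique = list(dedup_by_profile_id(listing))
print(len(unique))  # 2
```

Keeping the first occurrence and dropping later ones preserves listing order, which matters when a maxResults cap cuts the run short.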

Tier 2 — enrichment (`enrich: true`)

Measured on 20 enriched records, general-contractor / Austin TX, full review history, 2026-06-22.

Field	Fill	Notes
`full_address`	100%	Full street address from the profile.
`about`	100%	Profile description.
`services`	100%	Service list.
`rating`	90%	Star rating (stored ×10 on Houzz, normalized to a 0–5 scale, e.g. 5.0).
`rating_count`	90%	Number of reviews behind the rating.
`reviews`	90%	Full review objects (see below).
`sub_ratings`	85%	Quality / responsiveness / value aspect ratings.
`website`	85%	The pro's own site (`rawDomain`); junk `http://` dropped.
`badges`	85%	Houzz merit / standout badges.
`cost_estimate`	75%	Typical project cost string.
`awards`	40%	Best-of-Houzz etc.
`estimate_details`	35%
`license_number` / `license_status` / `license_verified_status`	0% (in this sample)	Texas issues no state general-contractor license, so this is a true null, not a parse miss. The parser extracts license number/status where Houzz exposes it; fill in a licensed trade/state has not been separately measured — don't advertise it as guaranteed.

The 10% of pros with no rating/reviews are simply new pros with zero reviews — an honest null, not a gap.
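As an illustration of the rating note in the table, the sketch below divides a ×10 raw value by ten and keeps a missing rating as null. The function name is hypothetical, and the assumption that the raw value arrives as a number or numeric string is mine, not something the source states.

```python
from typing import Optional


def normalize_rating(raw) -> Optional[float]:
    """Convert a x10 raw rating (e.g. 47) to a 0-5 star value (4.7)."""
    if raw is None:
        return None  # pro has no rating yet: keep the null, do not invent 0.0
    try:
        value = float(raw)
    except (TypeError, ValueError):
        return None  # unparseable value is treated as missing (assumption)
    return round(value / 10, 1)


print(normalize_rating(50))    # 5.0
print(normalize_rating(47))    # 4.7
print(normalize_rating(None))  # None
```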

Review objects (reviews[], measured over 407 real reviews): body, rating, author, created all 100% filled; review_id, relationship, project_date, modified, num_likes, status present; project_price only ~21% (Houzz's "Choose one" placeholder is dropped to null). Full review history is fetched by default in one request per pro (up to Houzz's 1024-review cap); use maxReviews to cap it.
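The two review rules above (the "Choose one" placeholder becomes null, and maxReviews caps the list with 0 meaning full history) might look like this in code. It is a sketch under assumptions: the helper names are hypothetical and the sample reviews are made up; only the field names and the placeholder string come from the text.

```python
PLACEHOLDER_PRICES = {"Choose one"}  # the placeholder named in the text above


def clean_review(raw: dict) -> dict:
    """Return a copy of the review with a placeholder project_price set to None."""
    price = raw.get("project_price")
    if isinstance(price, str):
        price = price.strip()
    if not price or price in PLACEHOLDER_PRICES:
        price = None
    return {**raw, "project_price": price}


def cap_reviews(reviews: list[dict], max_reviews: int = 0) -> list[dict]:
    """Apply a per-pro cap; 0 (or less) means keep the full history."""
    return reviews if max_reviews <= 0 else reviews[:max_reviews]


# Made-up reviews for illustration only.
raw_reviews = [
    {"review_id": "r1", "rating": 5, "project_price": "Choose one"},
    {"review_id": "r2", "rating": 4, "project_price": "$10,000 - $49,999"},
    {"review_id": "r3", "rating": 5},
]
cleaned = [clean_review(r) for r in cap_reviews(raw_reviews, max_reviews=2)]
print([r["project_price"] for r in cleaned])  # [None, '$10,000 - $49,999']
```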

See samples/sample_dataset.json (20 real enriched records) and samples/sample_record_enriched.json (one record, trimmed to 2 reviews, for a quick look).
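To re-measure fill rates on your own slice, something like the following could be run against a list of records. It is a minimal sketch: it assumes the dataset is a JSON array of flat record objects, and it counts null, empty strings, and empty lists or objects as unfilled, which may not match how the rates above were measured.

```python
def is_filled(value) -> bool:
    """Treat None, '', [] and {} as unfilled (an assumption, see lead-in)."""
    return value is not None and value != "" and value != [] and value != {}


def fill_rates(records: list[dict]) -> dict[str, float]:
    """Percentage of records in which each field is filled."""
    if not records:
        return {}
    fields = sorted({key for record in records for key in record})
    total = len(records)
    return {
        field: 100.0 * sum(is_filled(r.get(field)) for r in records) / total
        for field in fields
    }


# Made-up records for illustration only.
sample = [
    {"profile_id": "pf~1", "telephone": "512-555-0100", "street_address": None},
    {"profile_id": "pf~2", "telephone": "512-555-0101", "street_address": "1 Main St"},
    {"profile_id": "pf~3", "telephone": None, "street_address": ""},
    {"profile_id": "pf~4", "telephone": "512-555-0103", "street_address": "9 Oak Ave"},
]
for field, rate in fill_rates(sample).items():
    print(f"{field}: {rate:.0f}%")
```

For a real file, `json.load(open("samples/sample_dataset.json"))` would replace the inline `sample`, provided the file has the array shape assumed here.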

Coverage & caveats

Fill rates are sample-specific. The Tier-1 numbers are TX-heavy: with a maxResults cap, the run fills from the first seeds, so Austin/Dallas dominated before later cities were reached. Different categories/regions will differ — re-measure for your slice.
Location resolution depends on Houzz's in-page region links. Locations are resolved by matching the slug against the region links Houzz renders on the category page. If a slug doesn't appear (e.g. san-antonio-tx-us did not resolve in testing), it's reported in unresolved_locations and skipped — it does not crash the run. Prefer major metros, or pass an explicit listing URL via startUrls.
totalResults is a pagination ceiling, not a true count. Houzz reports ~1499 for a category regardless of region. The crawl early-stops when a page yields no new pros rather than paginating to the ceiling, so a single city yields its real unique count (a few hundred), not 1499.
No "complete national database" claim. Coverage is whatever the crawl + dedup actually yields for the categories/locations you request, stated plainly.
License/awards/estimate fields are genuinely partial. They reflect what each pro filled in on Houzz; treat the rates above as the expectation, not a guarantee.
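The early-stop behaviour described above (stop when a page yields no new pros instead of paginating to the reported ceiling) can be sketched as follows. The `fetch_page` callable and `max_pages` are hypothetical stand-ins for the real listing fetch and the totalResults-derived ceiling; the fake pages exist only to make the example runnable.

```python
from typing import Callable


def crawl_listing(fetch_page: Callable[[int], list[dict]], max_pages: int) -> list[dict]:
    """Collect unique pros page by page; stop at the first page with nothing new."""
    seen: set[str] = set()
    collected: list[dict] = []
    for page in range(max_pages):
        new_on_page = 0
        for record in fetch_page(page):
            profile_id = record.get("profile_id")
            if not profile_id or profile_id in seen:
                continue
            seen.add(profile_id)
            collected.append(record)
            new_on_page += 1
        if new_on_page == 0:
            break  # the ceiling is not a true count, so stop once pages only repeat
    return collected


# Fake listing: after two pages, every later page only repeats known pros.
FAKE_PAGES = {
    0: [{"profile_id": "pf~1"}, {"profile_id": "pf~2"}],
    1: [{"profile_id": "pf~2"}, {"profile_id": "pf~3"}],
}
pages_requested: list[int] = []


def fake_fetch(page: int) -> list[dict]:
    pages_requested.append(page)
    return FAKE_PAGES.get(page, [{"profile_id": "pf~1"}, {"profile_id": "pf~3"}])


pros = crawl_listing(fake_fetch, max_pages=100)
print(len(pros), pages_requested)  # 3 [0, 1, 2]
```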

Reliability & volume (what was actually exercised)

Recon (R4) found no rate-limit or anti-bot infrastructure: the site is served via Fastly + Envoy, with no Cloudflare, no x-ratelimit-* headers, and no 429 responses observed. Build 6 validated this with a gradual real volume ramp, watching for the first non-200:

Run	Scope	Pages	Pushed	Result
50	GC / Austin	4	50	clean — 0 failures, 0 non-200
200	GC / Austin+Dallas+Houston	15	200	clean — 0 failures, 0 non-200
500	GC + architects / 5 metros (8 seeds)	39	500	clean — 0 failures, 0 non-200
20 (enrich)	GC / Austin, full reviews	2 + 40	20	clean — 407 reviews, 0 failures

Honest limit of this validation: the largest clean run exercised here was 500 Tier-1 records / 39 pages and a 20-pro full enrichment. Sustained tens-of-thousands / full multi-geo crawls were not exercised (respectful-volume guardrail) — do not assume 10k+ works untested. Ramp gradually and watch for the first non-200.

Resilience built in (Build 4): jittered ~1–3s pacing (configurable), exponential backoff + retry on transient errors / 429 / 5xx, per-seed isolation (one bad seed doesn't kill the run), and an optional proxy-rotation fallback for IP blocks (proxyConfiguration). Proxy is not required at modest volume; it's a tested contingency.
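The pacing and retry behaviour listed above can be illustrated with a small sketch. It is not the actor's implementation: the function names, the backoff base and cap, and the `fetch` callable returning a `(status, body)` pair are all assumptions chosen to keep the example self-contained.

```python
import random
import time


def jittered_delay(min_secs: float = 1.0, max_secs: float = 3.0) -> float:
    """Pick a random pause between the configured pacing bounds."""
    return random.uniform(min_secs, max_secs)


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 60.0) -> float:
    """Exponential backoff: base, 2*base, 4*base, ... up to cap."""
    return min(cap, base * 2 ** attempt)


def is_retryable(status: int) -> bool:
    return status == 429 or 500 <= status < 600


def fetch_with_retry(fetch, url: str, max_attempts: int = 4, sleep=time.sleep) -> str:
    """Call fetch(url) -> (status, body); retry on 429 / 5xx with backoff."""
    for attempt in range(max_attempts):
        status, body = fetch(url)
        if status == 200:
            return body
        if not is_retryable(status) or attempt == max_attempts - 1:
            raise RuntimeError(f"giving up on {url}: HTTP {status}")
        sleep(backoff_delay(attempt))
    raise RuntimeError(f"no attempts made for {url}")


# Fake fetcher that fails twice before succeeding; sleeps are recorded, not slept.
responses = iter([(503, ""), (429, ""), (200, "<html>ok</html>")])
slept: list[float] = []
body = fetch_with_retry(lambda url: next(responses), "https://example.invalid/page", sleep=slept.append)
print(body, slept)  # <html>ok</html> [1.0, 2.0]
```

Between successful requests, a crawler built this way would call `time.sleep(jittered_delay(minDelaySecs, maxDelaySecs))` so that request timing does not form a fixed rhythm.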

Pricing — Pay-Per-Result

The actor is monetized per result: one price per unique record written to the dataset, via Apify's built-in apify-default-dataset-item event. There are no additional charge events to account for; you pay per lead.

maxResults is your spend cap. The run stops after that many unique pros.
Only clean, unique, non-junk records are pushed (dedup by profile_id; records with no profile id are skipped), so you're never billed for duplicates or junk.
enrich: true yields richer rows at the same per-result price (it costs ~2 extra requests/pro behind the scenes; the per-result price is set at publish time).
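Because the price is per record, the spend implied by a maxResults cap is simple arithmetic. The sketch below uses the listed "from $3.00 / 1,000 results" figure as an assumed price; the actual per-result price is whatever is set at publish time, and both function names are hypothetical.

```python
from decimal import Decimal

LISTED_PRICE_PER_1000 = Decimal("3.00")  # assumption: the "from" price on the listing


def estimate_cost(max_results: int, price_per_1000: Decimal = LISTED_PRICE_PER_1000) -> Decimal:
    """Upper bound on spend for a run capped at max_results records."""
    if max_results <= 0:
        raise ValueError("maxResults of 0 means 'until exhausted', so there is no cap")
    return Decimal(max_results) * price_per_1000 / 1000


def max_results_for_budget(budget_usd: Decimal, price_per_1000: Decimal = LISTED_PRICE_PER_1000) -> int:
    """Largest maxResults whose worst-case cost stays within the budget."""
    return int(budget_usd * 1000 / price_per_1000)


print(estimate_cost(30))                       # 0.09
print(estimate_cost(500))                      # 1.5
print(max_results_for_budget(Decimal("5")))    # 1666
```

The cost is an upper bound: a run that exhausts the listing before reaching the cap pushes, and bills, fewer records.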

Usage

Input

Field	Type	Default	Meaning
`categories`	string[]	`["general-contractor"]`	Houzz category slugs. Resolved to the canonical `probr0-bo~t_<id>` listing URL.
`locations`	string[]	—	Location slugs (e.g. `austin-tx-us`). Empty = national listing.
`startUrls`	url[]	—	Explicit Houzz listing URLs, instead of / in addition to categories+locations.
`enrich`	bool	`false`	Fetch profiles for Tier-2 fields.
`maxReviews`	int	`0`	Per-pro review cap; `0` = full history.
`maxResults`	int	`0`	Spend cap (unique pros); `0` = until exhausted.
`minDelaySecs` / `maxDelaySecs`	int	`1` / `3`	Jittered pacing bounds. Keep `min ≥ 1` to be a good citizen.
`proxyConfiguration`	object	off	Optional proxy / IP-block contingency.

Example (inputs/example.json):

{
  "categories": ["general-contractor"],
  "locations": ["austin-tx-us"],
  "enrich": false,
  "maxResults": 30,
  "minDelaySecs": 1,
  "maxDelaySecs": 3
}
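If you build inputs programmatically, a small helper can apply the defaults from the table and catch obvious mistakes before a run. This is a sketch, not part of the actor: `build_input` is a hypothetical name, the empty-list defaults for `locations` and `startUrls` are assumptions, and `proxyConfiguration` is left out because its object shape is not described here.

```python
import copy
import json

# Defaults taken from the input table; empty lists stand in for "no default".
DEFAULTS = {
    "categories": ["general-contractor"],
    "locations": [],
    "startUrls": [],
    "enrich": False,
    "maxReviews": 0,
    "maxResults": 0,
    "minDelaySecs": 1,
    "maxDelaySecs": 3,
}


def build_input(**overrides) -> dict:
    """Merge overrides onto the documented defaults and sanity-check them."""
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown input fields: {sorted(unknown)}")
    run_input = {**copy.deepcopy(DEFAULTS), **overrides}
    if run_input["minDelaySecs"] > run_input["maxDelaySecs"]:
        raise ValueError("minDelaySecs must not exceed maxDelaySecs")
    for key in ("maxReviews", "maxResults"):
        if run_input[key] < 0:
            raise ValueError(f"{key} must be 0 (no cap) or a positive integer")
    return run_input


example = build_input(locations=["austin-tx-us"], maxResults=30)
print(json.dumps(example, indent=2))
```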

Run locally (without the Apify platform)

pip install -r requirements.txt            # ideally in a venv (Python 3.13)
scripts/run_local.sh                       # uses inputs/example.json
scripts/run_local.sh inputs/enrich_smoke.json   # an enrichment example

Results land in storage/datasets/default/ (gitignored); logs print a per-run summary (pages_fetched, records_seen, duplicates, pushed, failed_seeds, enriched, reviews_fetched).
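After a local run, the records can be read back with the standard library. The sketch assumes the usual Apify local-storage layout of one JSON object per file in that directory, and that files whose names start with `__` are metadata; neither detail is confirmed by this README, so check the directory contents first.

```python
import json
from pathlib import Path


def load_dataset(directory: str = "storage/datasets/default") -> list[dict]:
    """Load every record file in the dataset directory, in filename order."""
    folder = Path(directory)
    if not folder.is_dir():
        return []  # no run yet, or a different storage location
    records = []
    for path in sorted(folder.glob("*.json")):
        if path.name.startswith("__"):
            continue  # assumed to be a metadata file, not a record
        with path.open(encoding="utf-8") as handle:
            item = json.load(handle)
        if isinstance(item, dict):
            records.append(item)
    return records


records = load_dataset()
with_phone = [r for r in records if r.get("telephone")]
print(f"{len(with_phone)} of {len(records)} records have a telephone")
```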

Tests

pip install -r requirements-dev.txt
python -m pytest -q          # 41 tests, fixture-based, no network

Legal / Terms note

Houzz's Terms of Use prohibit scraping and commercial reuse of listings; robots.txt allowing the paths does not override that contract (CA jurisdiction). Publishing this as a paid public actor is a deliberate, owner-accepted business-risk decision (see docs/PLAN.md §7). The actor is built to be a good citizen — polite pacing, gradual ramp, honest spec — to reduce both ethical and enforcement risk.

How this project is built

LLM-driven, multi-session project. GitHub Issues are the source of truth. See CLAUDE.md (operating rules), docs/PLAN.md (the approved plan), docs/WORKFLOW.md (conventions), and pinned issue #1 (state & handoff log).
