Home Service Business Lead Scraper
Pricing
from $0.20 / 1,000 results
Scrape publicly available home service business leads from Houzz, Yellow Pages Directory, and BuildZoom.
Developer
DigitalNomadPH
Maintained by Community
Collect publicly listed contact data (phone numbers, addresses, websites, ratings, and service details) for home service contractors across the United States. The actor simultaneously scrapes Houzz, Yellow Pages / YP, Yellow Pages Directory, and BuildZoom, deduplicates records across sources, and scores each lead by data completeness. Use it to build targeted outreach lists for plumbers, electricians, HVAC companies, roofers, landscapers, and more, with no code required.
Features
- Scrapes four validated public directories: Houzz, Yellow Pages / YP, Yellow Pages Directory, and BuildZoom
- Supports 8 home service trade categories
- Deduplicates records across sources by phone, website domain, or name + location
- Quality scoring: each lead receives a 0–100 quality score and a `high`/`medium`/`low` band
- Configurable result cap (1–1,000 records) with per-source distribution logic
- Optional email extraction (only from explicitly labeled email fields)
- Optional website extraction
- Apify Proxy support for reliable access
- Debug mode for diagnosing source-level parsing issues
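
The cross-source deduplication listed above can be pictured with a minimal Python sketch. The actor's real matching rules are not published here, so everything below is an assumption: the helper names `dedup_keys` and `deduplicate`, the choice to compare phones on their last 10 digits, and the single-pass merge strategy are all illustrative. Field names follow the output schema shown later on this page.

```python
import re
from urllib.parse import urlparse


def dedup_keys(record: dict) -> list[str]:
    """Build candidate match keys: phone, website domain, and name + location."""
    keys = []
    phone = re.sub(r"\D", "", record.get("phone") or "")
    if phone:
        keys.append(f"phone:{phone[-10:]}")  # assumption: compare the last 10 digits
    website = record.get("website") or ""
    host = urlparse(website).netloc.lower().removeprefix("www.")
    if host:
        keys.append(f"domain:{host}")
    name = re.sub(r"[^a-z0-9]+", " ", (record.get("businessName") or "").lower()).strip()
    city = (record.get("city") or "").strip().lower()
    region = (record.get("region") or "").strip().lower()
    if name and city:
        keys.append(f"name:{name}|{city}|{region}")
    return keys


def deduplicate(records: list[dict]) -> list[dict]:
    """Merge records that share any key; the first record seen wins on conflicts."""
    merged: list[dict] = []
    index: dict[str, dict] = {}  # match key -> merged record
    for record in records:
        keys = dedup_keys(record)
        target = next((index[k] for k in keys if k in index), None)
        if target is None:
            target = dict(record)
            target["sourcesSeen"] = [record["source"]]
            target["profileUrlsSeen"] = [record["profileUrl"]]
            merged.append(target)
        else:
            # Fill gaps in the existing record with values from the duplicate.
            for field, value in record.items():
                if target.get(field) is None and value is not None:
                    target[field] = value
            if record["source"] not in target["sourcesSeen"]:
                target["sourcesSeen"].append(record["source"])
            if record["profileUrl"] not in target["profileUrlsSeen"]:
                target["profileUrlsSeen"].append(record["profileUrl"])
        for k in keys:
            index.setdefault(k, target)
    return merged
```

This sketch is greedy: a record that bridges two already-merged businesses is attached to the first match only, which a production implementation would likely handle with a union-find or a second pass.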
Why use Home Service Business Lead Scraper?
This actor is useful for anyone who needs a list of local home service contractors:
- Sales teams prospecting HVAC, roofing, or plumbing companies for B2B outreach
- Marketing agencies building contact lists for local service verticals
- Local SEO tools seeding contractor data for a new market
- Aggregator platforms bootstrapping a contractor directory without manual data entry
- Researchers studying the density and distribution of trade contractors by metro area
How much will it cost?
This actor uses Cheerio (fast HTTP scraping) for Yellow Pages Directory and BuildZoom, and Playwright (headless browser) for Houzz and Yellow Pages / YP. Playwright runs cost more than plain HTTP requests.
| Run size | Approx. compute units | Approx. cost (pay-as-you-go) |
|---|---|---|
| 30 results | 0.05–0.20 CU | ~$0.01–$0.04 |
| 100 results | 0.20–0.60 CU | ~$0.04–$0.12 |
| 500 results | 0.80–2.50 CU | ~$0.16–$0.50 |
Proxy usage (Apify Proxy residential) adds approximately $0.40/GB. Typical runs consume under 20 MB per 100 results. Costs vary based on selected sources and proxy tier.
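
As a rough worked example of the figures above, the sketch below combines compute and proxy cost. The $0.20 per compute unit rate is not stated on this page; it is inferred from the table (0.20 CU ≈ $0.04), and the function name `estimate_cost` is hypothetical. Treat the result as a ballpark only.

```python
def estimate_cost(compute_units: float, proxy_mb: float = 0.0,
                  usd_per_cu: float = 0.20, usd_per_gb: float = 0.40) -> float:
    """Estimate run cost in USD from compute units and proxy traffic.

    usd_per_cu is an assumption inferred from the cost table;
    usd_per_gb is the approximate residential proxy rate quoted above.
    """
    proxy_gb = proxy_mb / 1000  # decimal gigabytes
    return round(compute_units * usd_per_cu + proxy_gb * usd_per_gb, 4)


# Upper end of a 100-result run with 20 MB of residential proxy traffic:
# 0.60 CU * $0.20 + 0.02 GB * $0.40
print(estimate_cost(0.60, proxy_mb=20))
```

At this scale the proxy traffic adds well under a cent, so compute time (and therefore the share of Playwright sources) dominates the bill.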
How to use
- Go to the Apify Store page for this actor and click Try for free.
- In the Input tab, select a Category (e.g., Plumber) and enter a Location (e.g., `Chicago, IL`).
- Adjust Maximum results and select which Sources to scrape.
- Click Start to run the actor.
- When the run completes, open the Storage tab to download your leads as JSON, CSV, or Excel.
You can also run this actor via the Apify API or schedule it to run on a recurring basis from the Schedules tab.
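
For API use, a minimal Python sketch using only the standard library might look like the following. It calls the Apify API's synchronous `run-sync-get-dataset-items` endpoint, which starts a run and returns the dataset items in one response (it is intended for runs that finish within a few minutes). The actor ID is a placeholder because this page does not state it, and the helper names `build_run_request` and `run_actor` are hypothetical.

```python
import json
import urllib.request

API_BASE = "https://api.apify.com/v2"


def build_run_request(actor_id: str, token: str, run_input: dict) -> urllib.request.Request:
    """Build a POST request that runs the actor and returns its dataset items."""
    url = f"{API_BASE}/acts/{actor_id}/run-sync-get-dataset-items?format=json"
    return urllib.request.Request(
        url,
        data=json.dumps(run_input).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def run_actor(actor_id: str, token: str, run_input: dict, timeout: float = 300) -> list:
    """Send the request and parse the JSON array of leads."""
    request = build_run_request(actor_id, token, run_input)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Usage (placeholder actor ID in "username~actor-name" form; token from your account):
# leads = run_actor("<username>~<actor-name>", "<APIFY_TOKEN>",
#                   {"category": "plumber", "location": "Austin, TX", "maxResults": 30})
```

For larger runs, starting the run asynchronously and reading the dataset afterwards is the safer pattern, since the synchronous endpoint has a request time limit.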
Input Parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| `category` | string | Yes | `plumber` | Trade category to search. One of: `plumber`, `electrician`, `hvac`, `roofer`, `landscaper`, `cleaning_service`, `handyman`, `general_contractor` |
| `location` | string | Yes | `Austin, TX` | US city or metro area. Example: `Denver, CO` or `Portland, OR` |
| `maxResults` | integer | No | `30` | Maximum unique records to save (1–1,000). Distributed evenly across sources. |
| `sources` | array | No | `["houzz", "yp", "yellow_pages_directory", "build_zoom"]` | Directories to scrape. Select one or more. |
| `includeEmails` | boolean | No | `false` | Extract emails from explicitly labeled email fields on directory pages. |
| `includeWebsite` | boolean | No | `true` | Extract business website URLs. |
| `deduplicate` | boolean | No | `true` | Merge duplicate businesses found across sources. |
| `debugMode` | boolean | No | `false` | Log detailed source and parsing diagnostics to the run log. |
| `proxyConfiguration` | object | No | `{ "useApifyProxy": true }` | Apify Proxy settings. Recommended to keep enabled for reliable scraping. |
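
When building inputs programmatically, it can help to check them against the constraints in the table before starting a run. The sketch below is a client-side convenience, not the actor's own validation; the function name `validate_input` is hypothetical, and it only accepts canonical category values even though the actor itself also accepts aliases.

```python
ALLOWED_CATEGORIES = {
    "plumber", "electrician", "hvac", "roofer",
    "landscaper", "cleaning_service", "handyman", "general_contractor",
}
ALLOWED_SOURCES = ["houzz", "yp", "yellow_pages_directory", "build_zoom"]


def validate_input(raw: dict) -> dict:
    """Check a run input against the documented constraints and fill in defaults."""
    errors = []
    category = raw.get("category")
    if not isinstance(category, str) or category not in ALLOWED_CATEGORIES:
        errors.append(f"category must be one of {sorted(ALLOWED_CATEGORIES)}")
    location = raw.get("location")
    if not isinstance(location, str) or not location.strip():
        errors.append("location must be a non-empty string such as 'Denver, CO'")
    max_results = raw.get("maxResults", 30)
    if (isinstance(max_results, bool) or not isinstance(max_results, int)
            or not 1 <= max_results <= 1000):
        errors.append("maxResults must be an integer from 1 to 1000")
    sources = raw.get("sources", ALLOWED_SOURCES)
    if (not isinstance(sources, list) or not sources
            or any(s not in ALLOWED_SOURCES for s in sources)):
        errors.append(f"sources must be a non-empty list drawn from {ALLOWED_SOURCES}")
    if errors:
        raise ValueError("; ".join(errors))
    return {
        "category": category,
        "location": location.strip(),
        "maxResults": max_results,
        "sources": list(sources),
        "includeEmails": raw.get("includeEmails", False),
        "includeWebsite": raw.get("includeWebsite", True),
        "deduplicate": raw.get("deduplicate", True),
        "debugMode": raw.get("debugMode", False),
        "proxyConfiguration": raw.get("proxyConfiguration", {"useApifyProxy": True}),
    }
```

Collecting all errors before raising means a caller sees every problem with an input in one pass instead of fixing them one at a time.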
Example input
```json
{
  "category": "plumber",
  "location": "Austin, TX",
  "maxResults": 30,
  "includeEmails": false,
  "includeWebsite": true,
  "deduplicate": true,
  "sources": ["houzz", "yp", "yellow_pages_directory", "build_zoom"],
  "proxyConfiguration": {
    "useApifyProxy": true,
    "apifyProxyGroups": ["RESIDENTIAL"]
  }
}
```
Output
Each saved dataset item follows this schema:
```json
{
  "businessName": "ABC Plumbing LLC",
  "sector": "home_services",
  "category": "plumber",
  "trade": "plumber",
  "phone": "+15125551212",
  "email": null,
  "website": "https://example.com",
  "profileUrl": "https://www.buildzoom.com/contractor/abc-plumbing-llc",
  "address": "123 Main St",
  "city": "Austin",
  "region": "TX",
  "postalCode": "78701",
  "country": "US",
  "rating": 4.7,
  "reviewCount": 128,
  "description": "Local plumbing company serving Austin and surrounding areas.",
  "services": ["Drain cleaning", "Water heater repair"],
  "serviceArea": ["Austin", "Round Rock"],
  "licenseNumber": null,
  "yearsInBusiness": null,
  "emergencyService": true,
  "source": "build_zoom",
  "sourceName": "BuildZoom",
  "sourcesSeen": ["build_zoom", "houzz"],
  "profileUrlsSeen": [
    "https://www.buildzoom.com/contractor/abc-plumbing-llc",
    "https://www.houzz.com/professionals/abc-plumbing"
  ],
  "scrapedAt": "2026-06-15T00:00:00.000Z",
  "qualityScore": 87,
  "qualityBand": "high"
}
```
Field notes:
- `qualityScore`: 0–100 based on data completeness (phone, address, website, email, rating, description).
- `qualityBand`: `high` (≥80), `medium` (≥50), or `low` (<50).
- `sourcesSeen`: all source IDs where this business was found (populated after deduplication).
- `profileUrlsSeen`: all directory profile URLs found for this business.
- `emergencyService`: `true` if the description mentions 24/7 or emergency service; `null` if unknown.
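
To make the completeness scoring concrete, here is a minimal Python sketch. Only the six contributing fields and the band thresholds come from the notes above. The per-field weights are invented for illustration and will not reproduce the actor's actual scores (for instance, the sample record above scores 87 in the real output but 85 here). The names `WEIGHTS`, `quality_score`, and `quality_band` are hypothetical.

```python
# Hypothetical weights summing to 100; the actor's real weights are not published.
WEIGHTS = {
    "phone": 25,
    "address": 20,
    "website": 20,
    "email": 15,
    "rating": 10,
    "description": 10,
}


def quality_score(record: dict) -> int:
    """Sum the weights of the fields that are present and non-empty."""
    return sum(
        weight
        for field, weight in WEIGHTS.items()
        if record.get(field) not in (None, "", [])
    )


def quality_band(score: int) -> str:
    """Map a 0-100 score to the documented bands."""
    if score >= 80:
        return "high"
    if score >= 50:
        return "medium"
    return "low"


lead = {"phone": "+15125551212", "address": "123 Main St", "rating": 4.7, "email": None}
score = quality_score(lead)
print(score, quality_band(score))
```

Filtering a downloaded dataset on `qualityBand == "high"` is a simple way to prioritize leads that already have the most contact fields filled in.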
Supported Sources
| Source ID | Display Name | Crawler type |
|---|---|---|
| `houzz` | Houzz | Playwright |
| `yp` | Yellow Pages / YP | Playwright |
| `yellow_pages_directory` | Yellow Pages Directory | Cheerio |
| `build_zoom` | BuildZoom | Cheerio |
Houzz and Yellow Pages / YP use a headless browser (Playwright) due to JavaScript rendering and bot protection. Yellow Pages Directory and BuildZoom are scraped via fast HTTP requests (Cheerio).
Supported Categories
| Input value | Description |
|---|---|
| `plumber` | Plumbers |
| `electrician` | Electricians |
| `hvac` | HVAC companies |
| `roofer` | Roofers |
| `landscaper` | Landscapers |
| `cleaning_service` | Cleaning services |
| `handyman` | Handyman services |
| `general_contractor` | General contractors |
Category aliases are accepted (e.g. "plumbing contractor" normalizes to "plumber").
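
Alias handling of this kind can be sketched in a few lines of Python. Only the `"plumbing contractor"` to `"plumber"` mapping is documented; every other alias below, along with the function name `normalize_category`, is a hypothetical example, and the actor's real alias table may differ.

```python
import re

CATEGORIES = {
    "plumber", "electrician", "hvac", "roofer",
    "landscaper", "cleaning_service", "handyman", "general_contractor",
}

ALIASES = {
    "plumbing contractor": "plumber",      # documented example
    "electrical contractor": "electrician",  # hypothetical
    "heating and cooling": "hvac",           # hypothetical
    "roofing contractor": "roofer",          # hypothetical
}


def normalize_category(value: str) -> str:
    """Map free-form category text to one of the supported input values."""
    key = re.sub(r"[\s_\-]+", " ", value.strip().lower())
    canonical = key.replace(" ", "_")
    if canonical in CATEGORIES:
        return canonical
    if key in ALIASES:
        return ALIASES[key]
    raise ValueError(f"Unsupported category: {value!r}")
```

Normalizing case, underscores, and hyphens before the lookup keeps the alias table small, since `"General Contractor"` and `"general-contractor"` both resolve without needing their own entries.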
Tips
- Per-source cap: When multiple sources are selected, the actor applies a per-source limit of `ceil(maxResults / numberOfSources)` to prevent a single fast source from filling the entire result quota before slower sources contribute.
- Proxy: Residential proxy is recommended (and set by default). Yellow Pages / YP and Houzz block datacenter IPs on the Apify platform. Yellow Pages Directory and BuildZoom work with datacenter proxy if you want to lower costs.
- Yellow Pages / YP reliability: YP actively rotates bot defenses (403, rate limits, empty pages). Expect 24–28 records on a 30-result run; when YP is blocked, the other three sources still deliver results cleanly. This is a target-site limitation, not an actor bug.
- Location format: Use `City, ST` format (e.g. `Austin, TX`, `Miami, FL`). Multi-word cities work fine: `San Antonio, TX`.
- Email extraction: Enable `includeEmails` only when you specifically need email addresses. Emails are only extracted from labeled email fields on profile pages, not guessed from description text.
- Deduplication: When `deduplicate` is `true`, records found across multiple sources are merged and the `sourcesSeen` array reflects all sources where the business was found.
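
The per-source cap from the first tip is easy to check by hand. The sketch below applies `ceil(maxResults / numberOfSources)` and then trims to the overall limit. The function names and the final trimming step are assumptions about how the pieces fit together, not the actor's code.

```python
import math


def per_source_cap(max_results: int, num_sources: int) -> int:
    """Per-source limit: ceil(maxResults / numberOfSources)."""
    if num_sources < 1:
        raise ValueError("at least one source is required")
    return math.ceil(max_results / num_sources)


def collect(results_by_source: dict[str, list], max_results: int) -> list:
    """Take up to the cap from each source, then trim to max_results overall."""
    cap = per_source_cap(max_results, len(results_by_source))
    combined = []
    for items in results_by_source.values():
        combined.extend(items[:cap])
    return combined[:max_results]


# 30 results over 4 sources gives a cap of 8 per source.
print(per_source_cap(30, 4))

# If one source returns nothing (e.g. it is blocked), the other three
# contribute at most 8 each, so the run tops out at 24 records.
blocked_run = {
    "houzz": list(range(20)),
    "yp": [],
    "yellow_pages_directory": list(range(20)),
    "build_zoom": list(range(20)),
}
print(len(collect(blocked_run, 30)))
```

Under these assumptions, a fully blocked source on a 30-result run yields 24 records, which lines up with the lower end of the 24–28 range mentioned in the YP reliability tip.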
Limitations
- US locations only (MVP scope).
- Public pages only — no login support.
- No CAPTCHA solving.
- No deep website crawling.
- No email verification.
- No CRM integrations.
- Source selectors may require maintenance when directory layouts change.
Responsible Use
This actor extracts publicly available business information from supported directory pages. Users are responsible for ensuring that their use of the scraped data complies with applicable laws, platform terms of service, privacy regulations (including CAN-SPAM, GDPR where applicable), and marketing rules. The actor must not be used to collect private, login-gated, sensitive, or restricted information.
Local Development
```bash
npm install
npx playwright install chromium  # required for Houzz (Playwright source)
npm test
npm run build
npm start
```
Local runs use Apify local storage under storage/. Set APIFY_HEADLESS=1 to run Playwright in headless mode locally.
Support
Found a bug or want to request a new source or category? Open an issue on the actor's GitHub repository or contact the author through Apify.