Pricing

from $3.00 / 1,000 domain prospects

Domain Leads Scraper

Domain Prospecting & Enrichment Actor discovers, normalises, scores, and enriches domain leads from manual lists, CSV uploads, search queries, or DomainLeads sources. It provides SEO signals, lead quality scores, website checks, and CRM-ready export data for prospecting workflows.

Rating

0.0

(0)

Developer

Sovanza

Last modified: 2 months ago

Domain Leads Scraper – Extract Emails & Business Data from Websites

Extract valuable domain-level leads: candidate websites, normalized domains, marketplace/listing signals, and—when enabled—public homepage metadata (titles, descriptions, optional contact/about URLs, social profile links). Ideal for marketers, agencies, and lead-gen teams who build prospect lists before outreach enrichment. Export clean, structured data to JSON, CSV, or Excel on Apify—no coding required.

What this actor is (and isn’t): It discovers and scores domains from multiple sources and can check public webpages for contact/about signals and social links. It does not parse mailto: addresses, scrape inline email text, or extract phone numbers from page HTML—you can plug this dataset into a separate enrichment step for phones/emails if needed.

Easily turn any list of domains (or CSV rows, or search-keyword discovery) into a ranked prospect table with quality and SEO-style heuristics.

Start scraping now

👉 No manual coding — configure input in Apify Console.
👉 Stable modes: manual domains and CSV uploads.
👉 Structured rows for spreadsheets, CRMs, and pipelines.

✅ What this actor does

Takes domains or discovers them from configurable sources, then outputs one prospect row per normalized domain candidate with:

| Category | Examples |
| --- | --- |
| Identity & sourcing | `domain`, `rootDomain`, `normalizedRootDomain`, `subdomain`, `tld`, `searchKeyword`, ranks, source URLs |
| Business / listing signals | Titles/snippets, marketplace fields when present (`saleStatus`, prices, listing URLs; mode-dependent) |
| Website enrichment (optional) | `websiteReachable`, `pageTitle`, `metaDescription`, `socialLinksFound`, `contactLinkFound`, `aboutLinkFound`, guessed public contact/about page URLs (`includeWebsiteCheck` + `includeContactPageGuess`) |
| Prioritization | `leadQualityScore`, keyword/SEO/commercial heuristic fields, extraction confidence |

⚡ Key features

Multiple source modes: manual domains, CSV paste/upload, search-engine discovery, and optional DomainLeads browsing (experimental)
Shared pipeline: normalization, deduplication (searchKeyword + normalizedRootDomain), scoring
Optional public website checks for lightweight, policy-safe signals (no scraping of private areas)
Proxy-ready with Apify Proxy, optional cookiesText for difficult sessions
Debug tooling (optional): snapshots, screenshots, parsed row payloads
Export: JSON / CSV / Excel from the Apify dataset
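
The normalization step of the shared pipeline can be pictured with a small sketch. This is not the actor's code: `normalize_domain` is a hypothetical helper, and treating the last two labels as the root domain is a simplifying assumption (a real implementation would likely consult the Public Suffix List so that names such as `example.co.uk` are handled correctly).

```python
from urllib.parse import urlsplit


def normalize_domain(raw: str) -> dict:
    """Hypothetical helper: split a raw domain or URL into prospect fields."""
    text = raw.strip().lower()
    # urlsplit only fills in the host when a scheme (or leading //) is present.
    if "://" not in text:
        text = "//" + text
    host = (urlsplit(text).hostname or "").rstrip(".")
    labels = host.split(".") if host else []
    # Assumption: the root is the last two labels; multi-part suffixes such as
    # co.uk would need the Public Suffix List.
    return {
        "domain": host,
        "normalizedRootDomain": ".".join(labels[-2:]),
        "subdomain": ".".join(labels[:-2]),
        "tld": labels[-1] if labels else "",
    }
```

The returned keys reuse field names from the table above, but the exact values the actor emits for them may differ.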

🎯 Use cases

| Use case | What you get |
| --- | --- |
| Agency lead lists | Normalized domains + prioritization scores to feed CRM or enrichment tools |
| B2B prospecting | Keyword-driven discovery + TLD filtering + lead quality sorting |
| Competitor & market scans | Search-engine discovery slices by niche/query |
| Domain research / SEO scouting | SEO-style domain signals (`tldQualityTier`, readability, keyword match proxies) |
| Operations & compliance | Public-only enrichment; structured confidence fields for QA |

🛠️ How to use

1. Set sourceMode: recommended defaults are manual_domains, csv_upload, or search_engine_discovery. DomainLeads mode is optional and can be brittle.
2. Add domains and/or csvText, or searchEngineQueries + limits.
3. Toggle includeWebsiteCheck / includeContactPageGuess if you want homepage + public contact/about path checks.
4. Configure proxyConfiguration (residential-quality proxy often helps on protected surfaces).
5. Click Run → Dataset → download CSV / JSON / Excel.

Apify Console (health check & quality score)

After deploying build 0.3+:

Input prefills are tuned for health checks: sourceMode: manual_domains, domains: ["apify.com", "example.com"], includeWebsiteCheck: false. This path skips Playwright entirely and finishes in under 1 minute on Apify.
Use residential proxy via proxyConfiguration when running domainleads or search_engine_discovery (default in schema).
Re-run Try actor with the default prefilled input; it should return normalized domain rows without starting a browser.

For heavier discovery runs, use search_engine_discovery or domainleads with conservative resultsPerKeyword, maxPagesPerKeyword, and optional includeWebsiteCheck.

Input highlights

Full schema: INPUT_SCHEMA.json. Example:

```json
{
  "sourceMode": "manual_domains",
  "searchKeywords": ["ai tools"],
  "domains": ["aitools.com", "futurecrm.io", "smartworkflows.ai"],
  "csvText": "",
  "csvFile": "",
  "discoverFromKeywords": false,
  "searchEngineQueries": [],
  "resultsPerKeyword": 75,
  "maxPagesPerKeyword": 3,
  "tldFilter": ["com", "io", "ai"],
  "includeSubdomains": false,
  "includeDerivedMetrics": true,
  "includeWhoisLikeSignals": false,
  "includeSeoSignals": true,
  "includeWebsiteCheck": false,
  "includeContactPageGuess": false,
  "websiteCheckTimeoutSecs": 15,
  "languageHint": "en",
  "maxConcurrency": 8,
  "requestTimeoutSecs": 60,
  "challengeWaitSecs": 15,
  "cookiesText": "",
  "proxyCountry": "AUTO_SELECT_PROXY_COUNTRY",
  "blockAssets": true,
  "proxyConfiguration": { "useApifyProxy": true },
  "saveHtmlSnapshot": false,
  "debugMode": false,
  "saveScreenshot": false,
  "saveParsedRows": false
}
```

Every field is documented in INPUT_SCHEMA.json (also shown in Apify Console when you edit input).
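
Before starting a run, it can help to check that the chosen sourceMode has something to work with. The sketch below is an assumption-laden pre-flight check, not the actor's own validation: the mode-to-field mapping is inferred from the How to use steps, and `check_input` is a hypothetical helper.

```python
# Assumed mapping from sourceMode to the input fields that feed it, inferred
# from the How to use steps; the actor's own validation may differ.
REQUIRED_ANY = {
    "manual_domains": ["domains"],
    "csv_upload": ["csvText", "csvFile"],
    "search_engine_discovery": ["searchEngineQueries"],
}


def check_input(run_input: dict) -> list[str]:
    """Return a list of human-readable problems; empty means it looks usable."""
    problems = []
    mode = run_input.get("sourceMode")
    fields = REQUIRED_ANY.get(mode)
    if fields is None:
        # Modes not listed here (for example domainleads) are not checked.
        return problems
    if not any(run_input.get(name) for name in fields):
        problems.append(f"{mode} needs at least one of: {', '.join(fields)}")
    return problems
```

For the example input above, `check_input` returns an empty list because domains is populated for manual_domains.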

Environment variables (optional)

| Variable | Purpose |
| --- | --- |
| `APIFY_TOKEN` | Lets the Apify SDK attach proxy credentials automatically when available. |
| `APIFY_PROXY_PASSWORD` | Optional direct proxy password for local debugging. |
| `PLAYWRIGHT_HEADLESS` | Set to `0` / `false` for headed Chromium during local debugging. |
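
As an illustration of how a flag like `PLAYWRIGHT_HEADLESS` is typically interpreted, here is a minimal sketch. `headless_from_env` is a hypothetical helper, and defaulting to headless when the variable is unset is an assumption rather than documented behaviour.

```python
import os


def headless_from_env(env=os.environ) -> bool:
    """Hypothetical helper: headless unless PLAYWRIGHT_HEADLESS is 0/false."""
    value = env.get("PLAYWRIGHT_HEADLESS", "").strip().lower()
    # Assumption: anything other than "0" or "false" keeps the browser headless.
    return value not in {"0", "false"}
```

The result could then be passed as the `headless` argument when launching Chromium with Playwright.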

📦 Output (what each row may include)

Exports match .actor/dataset_schema.json. Common fields:

Prospect IDs: normalized domain/root, ranks, snippet/title
URLs: listingUrl, websiteUrl, contactUrl (included only when the source exposes them; nothing is harvested from email addresses found in page HTML)
Optional homepage pass: reachability, pageTitle, metaDescription, socialLinksFound, publicContactPages
Scores: leadQualityScore, keywordMatchScore, SEO/business heuristic fields
Diagnostics: extraction confidence, challenge metadata (DomainLeads / protected pages), error rows
Summary row (type: __summary__): run aggregates (counts, dedup stats, blocked pages)

Example record shape (illustrative):

```json
{
  "searchKeyword": "ai tools",
  "domain": "aitools.com",
  "normalizedRootDomain": "aitools.com",
  "tld": "com",
  "websiteUrl": "https://aitools.com",
  "listingUrl": "https://www.domainleads.com/domain/aitools.com",
  "title": "AI Tools",
  "extractionConfidence": "high",
  "leadQualityScore": 0.86,
  "keywordMatchScore": 0.92,
  "timestamp": "2026-03-27T19:00:00+00:00"
}
```
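
Once the dataset is downloaded, rows shaped like the record above can be filtered and ranked in a few lines. This is a sketch under stated assumptions: `rank_prospects` is a hypothetical helper, the 0.5 threshold is arbitrary, and only the confidence value `"high"` is known from the example record.

```python
def rank_prospects(rows, min_score=0.5, confidences=("high",)):
    """Split off summary rows, then keep and sort the strongest prospects."""
    summary = [r for r in rows if r.get("type") == "__summary__"]
    prospects = [
        r for r in rows
        if r.get("type") != "__summary__"
        and r.get("extractionConfidence") in confidences
        and (r.get("leadQualityScore") or 0) >= min_score
    ]
    # Highest leadQualityScore first; missing scores are treated as 0.
    prospects.sort(key=lambda r: r.get("leadQualityScore") or 0, reverse=True)
    return prospects, summary
```

Separating the `__summary__` row first keeps the run aggregates out of any CRM import.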

Why this scraper stands out

Rather than returning a flat found/not-found list, this actor separates discovery, normalization, trust/extraction confidence, and lead prioritization. That makes downstream outreach safer: your team can sort by signal strength and confidence before spending human review time.

Frequently asked questions

How does it “find contacts”?

With includeWebsiteCheck, the actor loads the public homepage and collects visible metadata; with includeContactPageGuess also enabled, it checks common contact/about URLs. It does not extract email addresses from page content.

Can it extract emails from every site?

No. The actor does not extract email addresses itself, and if a site hides its contact details behind scripts, login walls, or forms, the contact/about enrichment will be incomplete, as with any respectful public crawler.

Does it crawl beyond the homepage?

Only when includeContactPageGuess is enabled. The contact/about guesses then check a small set of public paths.
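
A contact/about guess of this kind usually amounts to probing a short list of conventional paths. The sketch below only builds the candidate URLs; the path list and the `candidate_urls` helper are assumptions for illustration, since the actor's real list is not documented here.

```python
from urllib.parse import urljoin

# Assumed candidate paths; the actor's real list is not documented here.
CANDIDATE_PATHS = ["/contact", "/contact-us", "/about", "/about-us"]


def candidate_urls(website_url: str) -> list[str]:
    """Build the small set of public URLs a contact/about guess might probe."""
    base = website_url if website_url.endswith("/") else website_url + "/"
    # Root-relative paths resolve against the site origin, not the current page.
    return [urljoin(base, path) for path in CANDIDATE_PATHS]
```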

How many domains per run?

Controlled by resultsPerKeyword, maxPagesPerKeyword, and overall Apify runtime/resources. Prefer conservative concurrency for reliability.

Why are some domains empty?

The site may have been unreachable (a DNS or TLS failure, a block, or a timeout, reported as websiteReachable: false), or the row may have fallen below the extraction confidence threshold.

Does it deduplicate results?

Yes. Rows are deduplicated on searchKeyword + normalizedRootDomain, so uniqueness is at the domain level for this pipeline, not the email level.
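
Deduplication on that composite key can be sketched as follows. `dedupe_rows` is a hypothetical helper, and keeping the first row seen for each key is an assumption; the actor may pick a different winner.

```python
def dedupe_rows(rows):
    """Keep the first row seen for each (searchKeyword, normalizedRootDomain)."""
    seen = set()
    unique = []
    for row in rows:
        key = (row.get("searchKeyword"), row.get("normalizedRootDomain"))
        if key in seen:
            continue
        seen.add(key)
        unique.append(row)
    return unique
```

Because the keyword is part of the key, the same root domain can still appear once per search keyword.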

Compliance for outreach?

Use publicly available listings and webpage metadata responsibly. Complying with cold-email and privacy rules (GDPR, CAN-SPAM, etc.) remains your obligation.

Integrations & API

Run on Apify, export datasets, automate with API/webhooks/Make/Zapier. See routes/domain_leads_search.example.js for a sample integration pattern.
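
For API-driven runs, a thin wrapper along these lines may be enough. It is a sketch, not the sample in the routes file: `run_actor` is a hypothetical helper, and it assumes a client object with the `actor(...).call(run_input=...)` and `dataset(...).iterate_items()` methods offered by the official `apify-client` Python package.

```python
def run_actor(client, actor_id: str, run_input: dict) -> list[dict]:
    """Start a run, wait for it to finish, and return the dataset rows."""
    # call() blocks until the run ends and returns the run record.
    run = client.actor(actor_id).call(run_input=run_input)
    # The run record points at the default dataset holding the prospect rows.
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())
```

With `apify-client` installed, the client would be created as `ApifyClient(os.environ["APIFY_TOKEN"])` and the actor ID copied from the actor's page in Apify Console; the health-check input (`sourceMode: manual_domains` with two domains) makes a cheap first call.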

Run locally

```bash
cd domain-leads-search-scraper
pip install -r requirements.txt
playwright install chromium
cp INPUT.example.json INPUT.json
python main.py
```

Set APIFY_TOKEN / proxy env vars for realistic network conditions.

SEO keywords

domain leads scraper, domain prospecting scraper, domain enrichment actor, seo domain scraper, b2b lead generation domains, competitor domain discovery, keyword domain extractor, apify domain leads

Actor permissions

Runs with typical limited actor permissions: input + default dataset (+ optional KV for debug snapshots when enabled).

Limitations

Source layouts and anti-bot rules change—selector maintenance may be needed (domainleads mode especially).
Heuristic scores are directional—not financial valuations.
Website checks are intentionally lightweight public passes.
