Domain Leads Scraper

Under maintenance

Pricing

from $3.00 / 1,000 domain prospects

Domain Prospecting & Enrichment Actor discovers, normalises, scores, and enriches domain leads from manual lists, CSV uploads, search queries, or DomainLeads sources. It provides SEO signals, lead quality scores, website checks, and CRM-ready export data for prospecting workflows.

Rating: 0.0 (0)

Developer: Sovanza (maintained by Community)

Actor stats: 0 bookmarked, 2 total users, 1 monthly active user

Last modified: 8 days ago

Domain Leads Scraper – Extract Emails & Business Data from Websites

Extract valuable domain-level leads: candidate websites, normalized domains, marketplace/listing signals, and—when enabled—public homepage metadata (titles, descriptions, optional contact/about URLs, social profile links). Ideal for marketers, agencies, and lead-gen teams who build prospect lists before outreach enrichment. Export clean, structured data to JSON, CSV, or Excel on Apify—no coding required.

What this actor is (and isn’t): It discovers and scores domains from multiple sources and can check public webpages for contact/about signals and social links. It does not parse mailto: addresses, scrape inline email text, or extract phone numbers from page HTML—you can plug this dataset into a separate enrichment step for phones/emails if needed.

Easily turn any list of domains (or CSV rows, or search-keyword discovery) into a ranked prospect table with quality and SEO-style heuristics.

Start scraping now

👉 No manual coding — configure input in Apify Console.
👉 Stable modes: manual domains and CSV uploads.
👉 Structured rows for spreadsheets, CRMs, and pipelines.


✅ What this actor does

Takes domains or discovers them from configurable sources, then outputs one prospect row per normalized domain candidate with:

| Category | Examples |
| --- | --- |
| Identity & sourcing | domain, rootDomain, normalizedRootDomain, subdomain, tld, searchKeyword, ranks, source URLs |
| Business / listing signals | Titles/snippets, marketplace fields when present (saleStatus, prices, listing URLs; mode-dependent) |
| Website enrichment (optional) | websiteReachable, pageTitle, metaDescription, socialLinksFound, contactLinkFound, aboutLinkFound, guessed public contact/about page URLs (includeWebsiteCheck + includeContactPageGuess) |
| Prioritization | leadQualityScore, keyword/SEO/commercial heuristic fields, extraction confidence |
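
To make the identity fields in the table above more concrete, here is a minimal sketch of how a raw URL or hostname could be split into domain, normalizedRootDomain, subdomain, and tld. It is not the actor's implementation: the function name split_domain is hypothetical, and it assumes the registrable domain is simply the last two labels, which is wrong for multi-part suffixes such as co.uk.

```python
from urllib.parse import urlsplit


def split_domain(raw: str) -> dict:
    """Naively split a URL or hostname into prospect-row identity fields."""
    value = raw.strip().lower()
    # urlsplit only fills .hostname when a scheme or '//' is present.
    if "//" not in value:
        value = "//" + value
    host = (urlsplit(value).hostname or "").rstrip(".")
    labels = host.split(".")
    # Assumption: the registrable domain is the last two labels.
    # This breaks for multi-part suffixes such as "co.uk".
    root = ".".join(labels[-2:])
    return {
        "domain": host,
        "normalizedRootDomain": root,
        "subdomain": ".".join(labels[:-2]),
        "tld": labels[-1],
    }


print(split_domain("https://Blog.AiTools.com/pricing"))
```

A production version would normally consult the Public Suffix List instead of counting labels.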

⚡ Key features

  • Multiple source modes: manual domains, CSV paste/upload, search-engine discovery, and optional DomainLeads browsing (experimental)
  • Shared pipeline: normalization, deduplication (searchKeyword + normalizedRootDomain), scoring
  • Optional public website checks for lightweight, policy-safe signals (no scraping of private areas)
  • Proxy-ready with Apify Proxy, optional cookiesText for difficult sessions
  • Debug tooling (optional): snapshots, screenshots, parsed row payloads
  • Export: JSON / CSV / Excel from the Apify dataset

🎯 Use cases

| Use case | What you get |
| --- | --- |
| Agency lead lists | Normalized domains + prioritization scores to feed CRM or enrichment tools |
| B2B prospecting | Keyword-driven discovery + TLD filtering + lead quality sorting |
| Competitor & market scans | Search-engine discovery slices by niche/query |
| Domain research / SEO scouting | SEO-style domain signals (tldQualityTier, readability, keyword match proxies) |
| Operations & compliance | Public-only enrichment; structured confidence fields for QA |

🛠️ How to use

  1. Set sourceMode: recommended defaults are manual_domains, csv_upload, or search_engine_discovery. DomainLeads mode is optional and can be brittle.
  2. Add domains and/or csvText, or searchEngineQueries + limits.
  3. Toggle includeWebsiteCheck / includeContactPageGuess if you want homepage + public contact/about path checks.
  4. Configure proxyConfiguration (residential-quality proxy often helps on protected surfaces).
  5. Click Run → Dataset → download CSV / JSON / Excel.
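
If you prefer to pre-clean a CSV locally and pass the result as domains (step 2), something like the sketch below may help. The column name "domain" and the helper domains_from_csv are assumptions for illustration; the CSV layout the actor itself expects for csvText is defined in INPUT_SCHEMA.json, not here.

```python
import csv
import io


def domains_from_csv(csv_text: str, column: str = "domain") -> list[str]:
    """Collect unique, lower-cased, non-empty values from one CSV column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    seen = set()
    domains = []
    for row in reader:
        # Missing or empty cells are skipped rather than treated as errors.
        value = (row.get(column) or "").strip().lower()
        if value and value not in seen:
            seen.add(value)
            domains.append(value)
    return domains


sample = "domain,note\nAiTools.com,first\nfuturecrm.io,second\naitools.com,duplicate\n"
print(domains_from_csv(sample))
```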

Input highlights

Full schema: INPUT_SCHEMA.json. Example:

{
  "sourceMode": "manual_domains",
  "searchKeywords": ["ai tools"],
  "domains": ["aitools.com", "futurecrm.io", "smartworkflows.ai"],
  "csvText": "",
  "csvFile": "",
  "discoverFromKeywords": false,
  "searchEngineQueries": [],
  "resultsPerKeyword": 75,
  "maxPagesPerKeyword": 3,
  "tldFilter": ["com", "io", "ai"],
  "includeSubdomains": false,
  "includeDerivedMetrics": true,
  "includeWhoisLikeSignals": false,
  "includeSeoSignals": true,
  "includeWebsiteCheck": false,
  "includeContactPageGuess": false,
  "websiteCheckTimeoutSecs": 15,
  "languageHint": "en",
  "maxConcurrency": 8,
  "requestTimeoutSecs": 60,
  "challengeWaitSecs": 15,
  "cookiesText": "",
  "proxyCountry": "AUTO_SELECT_PROXY_COUNTRY",
  "blockAssets": true,
  "proxyConfiguration": { "useApifyProxy": true },
  "saveHtmlSnapshot": false,
  "debugMode": false,
  "saveScreenshot": false,
  "saveParsedRows": false
}

Every field is documented in INPUT_SCHEMA.json (also shown in Apify Console when you edit input).
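
When generating input programmatically, a small builder can keep the field names consistent with the example above. This is a sketch only: build_run_input is a hypothetical helper, it sets just a handful of fields, and whether omitted fields fall back to sensible defaults depends on INPUT_SCHEMA.json.

```python
import json


def build_run_input(domains, tld_filter=None, website_check=False, contact_guess=False):
    """Return a manual_domains input dict using field names from the example input."""
    return {
        "sourceMode": "manual_domains",
        # Trim, lower-case, drop empties, and de-duplicate before sending.
        "domains": sorted({d.strip().lower() for d in domains if d.strip()}),
        "tldFilter": list(tld_filter or []),
        "includeWebsiteCheck": website_check,
        "includeContactPageGuess": contact_guess,
        "proxyConfiguration": {"useApifyProxy": True},
    }


run_input = build_run_input(["AiTools.com", "futurecrm.io ", ""], tld_filter=["com", "io"])
print(json.dumps(run_input, indent=2))
```

The printed JSON can be pasted into the Apify Console input editor or saved as INPUT.json for a local run.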

Environment variables (optional)

| Variable | Purpose |
| --- | --- |
| APIFY_TOKEN | Lets the Apify SDK attach proxy credentials automatically when available. |
| APIFY_PROXY_PASSWORD | Optional direct proxy password for local debugging. |
| PLAYWRIGHT_HEADLESS | Set to 0 / false for headed Chromium during local debugging. |
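
As an illustration of the PLAYWRIGHT_HEADLESS convention described above, a flag like this is typically parsed as follows. How main.py actually reads the variable is not shown in this README, so treat headless_from_env as a hypothetical helper.

```python
import os


def headless_from_env(environ=os.environ) -> bool:
    """Return False only when PLAYWRIGHT_HEADLESS is set to 0 or false."""
    raw = environ.get("PLAYWRIGHT_HEADLESS", "").strip().lower()
    # Unset or any other value keeps the browser headless (assumed default).
    return raw not in {"0", "false"}


print(headless_from_env({"PLAYWRIGHT_HEADLESS": "0"}))
```

The resulting boolean could then be passed to Playwright's chromium.launch(headless=...) when debugging locally.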

📦 Output (what each row may include)

Exports match .actor/dataset_schema.json. Common fields:

  • Prospect IDs: normalized domain/root, ranks, snippet/title
  • URLs: listingUrl, websiteUrl, contactUrl (when the source exposes them — not harvested from arbitrary HTML emails)
  • Optional homepage pass: reachability, pageTitle, metaDescription, socialLinksFound, publicContactPages
  • Scores: leadQualityScore, keywordMatchScore, SEO/business heuristic fields
  • Diagnostics: extraction confidence, challenge metadata (DomainLeads / protected pages), error rows
  • Summary row (type: __summary__): run aggregates (counts, dedup stats, blocked pages)

Example record shape (illustrative):

{
  "searchKeyword": "ai tools",
  "domain": "aitools.com",
  "normalizedRootDomain": "aitools.com",
  "tld": "com",
  "websiteUrl": "https://aitools.com",
  "listingUrl": "https://www.domainleads.com/domain/aitools.com",
  "title": "AI Tools",
  "extractionConfidence": "high",
  "leadQualityScore": 0.86,
  "keywordMatchScore": 0.92,
  "timestamp": "2026-03-27T19:00:00+00:00"
}

Why this scraper stands out

Unlike simplistic “binary” tooling, this actor separates discovery, normalization, trust/extraction confidence, and lead prioritization. That makes downstream outreach safer: your team sorts by signal strength and confidence before spending human review time.
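
Sorting by signal strength and confidence can be done downstream on the exported rows. The sketch below is one possible approach, not part of the actor: rank_prospects is a hypothetical helper, and the confidence ordering low < medium < high is an assumption (only "high" appears in the example record).

```python
def rank_prospects(rows, min_confidence="medium"):
    """Drop summary and low-confidence rows, then sort by leadQualityScore."""
    # Assumed label ordering; unknown or missing labels rank below "low".
    order = {"low": 0, "medium": 1, "high": 2}
    threshold = order[min_confidence]
    kept = [
        row for row in rows
        if row.get("type") != "__summary__"
        and order.get(row.get("extractionConfidence"), -1) >= threshold
    ]
    return sorted(kept, key=lambda row: row.get("leadQualityScore") or 0.0, reverse=True)


rows = [
    {"domain": "a.com", "extractionConfidence": "high", "leadQualityScore": 0.50},
    {"domain": "b.com", "extractionConfidence": "low", "leadQualityScore": 0.99},
    {"type": "__summary__"},
    {"domain": "c.com", "extractionConfidence": "high", "leadQualityScore": 0.86},
]
print([row["domain"] for row in rank_prospects(rows)])
```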


Frequently asked questions

How does it “find contacts”?

With includeWebsiteCheck, the actor loads the public homepage and collects visible metadata plus common contact/about URLs (when includeContactPageGuess is enabled). It does not mine email bodies from arbitrary pages.

Can it extract emails from every site?

No. The actor does not extract email addresses itself, and if a domain hides its contact pages behind scripts, login walls, or forms, the contact/about enrichment will be incomplete, as with any respectful public crawler.

Does it crawl beyond the homepage?

Optionally. The contact/about guesses check a small set of public paths, and only when includeContactPageGuess is enabled.
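
Conceptually, a contact/about guess amounts to joining a few well-known paths onto the site's base URL and checking whether they respond. The sketch below only builds the candidate URLs. The path list is hypothetical, since the actor's real candidates are not documented here, and guess_public_pages is an illustrative name.

```python
from urllib.parse import urljoin

# Hypothetical path list; the actor's real candidates are not documented here.
CANDIDATE_PATHS = ("/contact", "/contact-us", "/about", "/about-us")


def guess_public_pages(website_url: str, paths=CANDIDATE_PATHS) -> list[str]:
    """Build absolute candidate URLs for public contact/about pages."""
    base = website_url.rstrip("/") + "/"
    # Paths start with "/", so they resolve against the site root.
    return [urljoin(base, path) for path in paths]


print(guess_public_pages("https://aitools.com"))
```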

How many domains per run?

Controlled by resultsPerKeyword, maxPagesPerKeyword, and overall Apify runtime/resources. Prefer a conservative maxConcurrency for reliability.

Why are some domains empty?

The site may be unreachable (DNS or TLS failures, blocks, or timeouts, reported as websiteReachable: false), or the row may fall below extraction confidence thresholds.

Does it remove duplicates?

Yes. Deduplication is keyed on searchKeyword + normalizedRootDomain, so uniqueness is enforced at the domain level for this pipeline, not at the email level.
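
The same key can be reused if you merge datasets from several runs. A minimal sketch, assuming a keep-first policy (the actor's own tie-breaking is not documented here) and a hypothetical helper name dedupe_rows:

```python
def dedupe_rows(rows):
    """Keep the first row seen for each (searchKeyword, normalizedRootDomain) pair."""
    seen = set()
    unique = []
    for row in rows:
        key = (row.get("searchKeyword"), row.get("normalizedRootDomain"))
        if key in seen:
            continue
        seen.add(key)
        unique.append(row)
    return unique


merged = [
    {"searchKeyword": "ai tools", "normalizedRootDomain": "aitools.com", "run": 1},
    {"searchKeyword": "ai tools", "normalizedRootDomain": "aitools.com", "run": 2},
    {"searchKeyword": "crm", "normalizedRootDomain": "aitools.com", "run": 2},
]
print(len(dedupe_rows(merged)))
```

Note that the same domain found under two different keywords is kept twice, because the keyword is part of the key.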

Compliance for outreach?

Only consume publicly available listings and webpage metadata responsibly. Cold email and privacy rules (GDPR, CAN-SPAM, etc.) remain your obligation.


Integrations & API

Run on Apify, export datasets, automate with API/webhooks/Make/Zapier. See routes/domain_leads_search.example.js for a sample integration pattern.


Run locally

cd domain-leads-search-scraper
pip install -r requirements.txt
playwright install chromium
cp INPUT.example.json INPUT.json
python main.py

Set APIFY_TOKEN / proxy env vars for realistic network conditions.


SEO keywords

domain leads scraper, domain prospecting scraper, domain enrichment actor, seo domain scraper, b2b lead generation domains, competitor domain discovery, keyword domain extractor, apify domain leads


Actor permissions

Runs with typical limited actor permissions: input + default dataset (+ optional KV for debug snapshots when enabled).


Limitations

  • Source layouts and anti-bot rules change—selector maintenance may be needed (domainleads mode especially).
  • Heuristic scores are directional—not financial valuations.
  • Website checks are intentionally lightweight public passes.