Domain Leads Scraper
Under maintenance
Domain Prospecting & Enrichment Actor discovers, normalises, scores, and enriches domain leads from manual lists, CSV uploads, search queries, or DomainLeads sources. It provides SEO signals, lead quality scores, website checks, and CRM-ready export data for prospecting workflows.
Pricing
from $3.00 / 1,000 domain prospects
Developer
Sovanza
Maintained by Community
Domain Leads Scraper – Extract Emails & Business Data from Websites
Extract valuable domain-level leads: candidate websites, normalized domains, marketplace/listing signals, and—when enabled—public homepage metadata (titles, descriptions, optional contact/about URLs, social profile links). Ideal for marketers, agencies, and lead-gen teams who build prospect lists before outreach enrichment. Export clean, structured data to JSON, CSV, or Excel on Apify—no coding required.
What this actor is (and isn’t): It discovers and scores domains from multiple sources and can check public webpages for contact/about signals and social links. It does not parse mailto: addresses, scrape inline email text, or extract phone numbers from page HTML—you can plug this dataset into a separate enrichment step for phones/emails if needed.
Easily turn any list of domains (or CSV rows, or search-keyword discovery) into a ranked prospect table with quality and SEO-style heuristics.
Start scraping now
👉 No manual coding — configure input in Apify Console.
👉 Stable modes: manual domains and CSV uploads.
👉 Structured rows for spreadsheets, CRMs, and pipelines.
✅ What this actor does
Takes domains or discovers them from configurable sources, then outputs one prospect row per normalized domain candidate with:
| Category | Examples |
|---|---|
| Identity & sourcing | domain, rootDomain, normalizedRootDomain, subdomain, tld, searchKeyword, ranks, source URLs |
| Business / listing signals | Titles/snippets, marketplace fields when present (saleStatus, prices, listing URLs — mode-dependent) |
| Website enrichment (optional) | websiteReachable, pageTitle, metaDescription, socialLinksFound, contactLinkFound, aboutLinkFound, guessed public contact/about page URLs (includeWebsiteCheck + includeContactPageGuess) |
| Prioritization | leadQualityScore, keyword/SEO/commercial heuristic fields, extraction confidence |
⚡ Key features
- Multiple source modes: manual domains, CSV paste/upload, search-engine discovery, and optional DomainLeads browsing (experimental)
- Shared pipeline: normalization, deduplication (`searchKeyword` + `normalizedRootDomain`), scoring
- Optional public website checks for lightweight, policy-safe signals (no scraping of private areas)
- Proxy-ready with Apify Proxy, optional `cookiesText` for difficult sessions
- Debug tooling (optional): snapshots, screenshots, parsed row payloads
- Export: JSON / CSV / Excel from the Apify dataset
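The normalization and TLD-filtering steps of the shared pipeline can be sketched roughly as follows. This is a minimal illustration, not the actor's actual code: the function names are hypothetical, and the "last two labels" rule for the root domain is a deliberate simplification (a real pipeline would consult the Public Suffix List to handle suffixes such as `co.uk`).

```python
from urllib.parse import urlsplit


def normalize_domain(raw: str) -> str:
    """Lowercase a domain or URL and strip scheme, port, path, and a leading 'www.'."""
    text = raw.strip().lower()
    # urlsplit only fills in the host when the input starts with a scheme or '//'.
    if not text.startswith(("http://", "https://", "//")):
        text = "//" + text
    host = urlsplit(text).hostname or ""
    if host.startswith("www."):
        host = host[4:]
    return host.rstrip(".")


def naive_root_domain(host: str) -> str:
    """ASSUMPTION: keep the last two labels. Wrong for suffixes like 'co.uk'."""
    labels = host.split(".")
    return ".".join(labels[-2:]) if len(labels) >= 2 else host


def passes_tld_filter(host: str, tld_filter: list[str]) -> bool:
    """An empty filter accepts everything; otherwise compare the final label."""
    if not tld_filter:
        return True
    allowed = {t.lower().lstrip(".") for t in tld_filter}
    return host.rsplit(".", 1)[-1] in allowed
```

For example, `normalize_domain("HTTPS://WWW.AITools.com/pricing")` yields `aitools.com`, which passes a `tldFilter` of `["com", "io", "ai"]`.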
🎯 Use cases
| Use case | What you get |
|---|---|
| Agency lead lists | Normalized domains + prioritization scores to feed CRM or enrichment tools |
| B2B prospecting | Keyword-driven discovery + TLD filtering + lead quality sorting |
| Competitor & market scans | Search-engine discovery slices by niche/query |
| Domain research / SEO scouting | SEO-style domain signals (tldQualityTier, readability, keyword match proxies) |
| Operations & compliance | Public-only enrichment; structured confidence fields for QA |
🛠️ How to use
- Set `sourceMode`: recommended defaults are `manual_domains`, `csv_upload`, or `search_engine_discovery`. DomainLeads mode is optional and can be brittle.
- Add `domains` and/or `csvText`, or `searchEngineQueries` + limits.
- Toggle `includeWebsiteCheck` / `includeContactPageGuess` if you want homepage + public contact/about path checks.
- Configure `proxyConfiguration` (residential-quality proxy often helps on protected surfaces).
- Click Run → Dataset → download CSV / JSON / Excel.
Input highlights
Full schema: INPUT_SCHEMA.json. Example:

    {
      "sourceMode": "manual_domains",
      "searchKeywords": ["ai tools"],
      "domains": ["aitools.com", "futurecrm.io", "smartworkflows.ai"],
      "csvText": "",
      "csvFile": "",
      "discoverFromKeywords": false,
      "searchEngineQueries": [],
      "resultsPerKeyword": 75,
      "maxPagesPerKeyword": 3,
      "tldFilter": ["com", "io", "ai"],
      "includeSubdomains": false,
      "includeDerivedMetrics": true,
      "includeWhoisLikeSignals": false,
      "includeSeoSignals": true,
      "includeWebsiteCheck": false,
      "includeContactPageGuess": false,
      "websiteCheckTimeoutSecs": 15,
      "languageHint": "en",
      "maxConcurrency": 8,
      "requestTimeoutSecs": 60,
      "challengeWaitSecs": 15,
      "cookiesText": "",
      "proxyCountry": "AUTO_SELECT_PROXY_COUNTRY",
      "blockAssets": true,
      "proxyConfiguration": { "useApifyProxy": true },
      "saveHtmlSnapshot": false,
      "debugMode": false,
      "saveScreenshot": false,
      "saveParsedRows": false
    }

Every field is documented in INPUT_SCHEMA.json (also shown in Apify Console when you edit input).
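If you generate inputs from a script rather than the Console, a small helper like the one below can assemble the dict and catch empty runs early. It is only a sketch: `build_run_input` is a hypothetical name, the per-mode "required field" check is an assumption (the authoritative rules live in INPUT_SCHEMA.json), and the DomainLeads mode is left out because its exact `sourceMode` value is not listed above.

```python
import json

# ASSUMPTION: which fields each mode needs is inferred from the "How to use"
# steps, not taken from INPUT_SCHEMA.json.
REQUIRED_BY_MODE = {
    "manual_domains": ("domains",),
    "csv_upload": ("csvText", "csvFile"),
    "search_engine_discovery": ("searchEngineQueries",),
}


def build_run_input(source_mode, domains=None, csv_text="", queries=None, **overrides):
    """Assemble a run input and reject modes that have nothing to process."""
    if source_mode not in REQUIRED_BY_MODE:
        raise ValueError(f"unexpected sourceMode: {source_mode!r}")
    run_input = {
        "sourceMode": source_mode,
        "domains": list(domains or []),
        "csvText": csv_text,
        "searchEngineQueries": list(queries or []),
        "includeWebsiteCheck": False,
        "includeContactPageGuess": False,
        "proxyConfiguration": {"useApifyProxy": True},
    }
    # Any other documented field (tldFilter, maxConcurrency, ...) can be passed through.
    run_input.update(overrides)
    if not any(run_input.get(field) for field in REQUIRED_BY_MODE[source_mode]):
        needed = " or ".join(REQUIRED_BY_MODE[source_mode])
        raise ValueError(f"{source_mode} needs a non-empty {needed}")
    return run_input


def to_input_json(run_input) -> str:
    """Serialize for INPUT.json or for pasting into the Console's JSON editor."""
    return json.dumps(run_input, indent=2)
```

Calling `build_run_input("manual_domains", domains=["aitools.com"], tldFilter=["com"])` returns a dict that can be written out with `to_input_json`.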
Environment variables (optional)
| Variable | Purpose |
|---|---|
| `APIFY_TOKEN` | Lets the Apify SDK attach proxy credentials automatically when available. |
| `APIFY_PROXY_PASSWORD` | Optional direct proxy password for local debugging. |
| `PLAYWRIGHT_HEADLESS` | Set to `0` / `false` for headed Chromium during local debugging. |
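One way the `PLAYWRIGHT_HEADLESS` toggle could be interpreted is sketched below. The helper name is hypothetical, and treating an unset variable as "headless" is an assumption; the table above only says that `0` / `false` selects a headed browser.

```python
import os


def headless_from_env(env=None) -> bool:
    """Return False (headed browser) when PLAYWRIGHT_HEADLESS is '0' or 'false'."""
    env = os.environ if env is None else env
    value = env.get("PLAYWRIGHT_HEADLESS", "").strip().lower()
    # ASSUMPTION: anything else, including an unset variable, means headless.
    return value not in {"0", "false"}
```

The result could then be passed as the `headless` argument when launching Chromium through Playwright.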
📦 Output (what each row may include)
Exports match .actor/dataset_schema.json. Common fields:
- Prospect IDs: normalized domain/root, ranks, snippet/title
- URLs: `listingUrl`, `websiteUrl`, `contactUrl` (when the source exposes them; not harvested from arbitrary HTML emails)
- Optional homepage pass: reachability, `pageTitle`, `metaDescription`, `socialLinksFound`, `publicContactPages`
- Scores: `leadQualityScore`, `keywordMatchScore`, SEO/business heuristic fields
- Diagnostics: extraction confidence, challenge metadata (DomainLeads / protected pages), `error` rows
- `type: __summary__`: run aggregates (counts, dedup stats, blocked pages)
Example record shape (illustrative):

    {
      "searchKeyword": "ai tools",
      "domain": "aitools.com",
      "normalizedRootDomain": "aitools.com",
      "tld": "com",
      "websiteUrl": "https://aitools.com",
      "listingUrl": "https://www.domainleads.com/domain/aitools.com",
      "title": "AI Tools",
      "extractionConfidence": "high",
      "leadQualityScore": 0.86,
      "keywordMatchScore": 0.92,
      "timestamp": "2026-03-27T19:00:00+00:00"
    }

Why this scraper stands out
Unlike simplistic “binary” tooling, this actor separates discovery, normalization, trust/extraction confidence, and lead prioritization. That makes downstream outreach safer: your team sorts by signal strength and confidence before spending human review time.
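Sorting by signal strength and confidence can be done on the exported rows with a few lines of Python. The sketch below is illustrative only: `rank_prospects` is a hypothetical helper, the thresholds are arbitrary, and the `medium` / `low` confidence labels are assumptions (only `high` appears in the example record).

```python
# ASSUMPTION: confidence labels other than "high" are guesses.
CONFIDENCE_ORDER = {"low": 0, "medium": 1, "high": 2}


def rank_prospects(rows, min_score=0.5, min_confidence="medium"):
    """Drop summary and weak rows, then sort the rest best-first."""
    floor = CONFIDENCE_ORDER[min_confidence]
    kept = []
    for row in rows:
        if row.get("type") == "__summary__":
            continue  # run aggregates, not a prospect
        score = row.get("leadQualityScore")
        confidence = CONFIDENCE_ORDER.get(row.get("extractionConfidence"), -1)
        if score is None or score < min_score or confidence < floor:
            continue
        kept.append(row)
    # Highest score first; confidence breaks ties.
    return sorted(
        kept,
        key=lambda r: (r["leadQualityScore"], CONFIDENCE_ORDER[r["extractionConfidence"]]),
        reverse=True,
    )
```

Feeding it the rows of a JSON export returns only the prospects worth a human review, in priority order.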
Frequently asked questions
How does it “find contacts”?
With includeWebsiteCheck, the actor loads the public homepage and collects visible metadata plus common contact/about URLs (when includeContactPageGuess is enabled). It does not mine email bodies from arbitrary pages.
Can it extract emails from every site?
No. If a domain hides contacts behind scripts, login walls, or forms, enrichment will be incomplete—like any respectful public crawler.
Does it crawl beyond the homepage?
Optionally—the contact/about guesses check a small set of public paths only when enabled.
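As a rough picture of what "a small set of public paths" might look like, the sketch below builds candidate URLs for a site. The path list and the function name are assumptions for illustration; the actor's real list is not documented here, and this snippet only constructs URLs without fetching them.

```python
from urllib.parse import urljoin

# ASSUMPTION: illustrative paths only, not the actor's actual list.
CANDIDATE_PATHS = ("/contact", "/contact-us", "/about", "/about-us")


def guess_public_pages(website_url, paths=CANDIDATE_PATHS):
    """Return absolute candidate URLs for common contact/about pages."""
    return [urljoin(website_url, path) for path in paths]
```

For `https://aitools.com` this yields `https://aitools.com/contact`, `https://aitools.com/contact-us`, and so on.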
How many domains per run?
Controlled by resultsPerKeyword, maxPagesPerKeyword, and overall Apify runtime/resources. Prefer conservative concurrency for reliability.
Why are some domains empty?
The site may fail DNS resolution or the TLS handshake, be blocked, or time out (websiteReachable: false), or the row may fall below extraction confidence thresholds.
Does it deduplicate results?
Yes—deduplication tracks searchKeyword + normalizedRootDomain (domain-level uniqueness for this pipeline—not email-level dedup).
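Deduplicating on that composite key can be sketched as below. Keeping the first row seen and lowercasing the keyword before comparison are both assumptions about behavior the text does not spell out; `dedupe_rows` is a hypothetical name.

```python
def dedupe_rows(rows):
    """Keep the first row for each (searchKeyword, normalizedRootDomain) pair."""
    seen = set()
    unique = []
    for row in rows:
        key = (
            row.get("searchKeyword", "").strip().lower(),
            row.get("normalizedRootDomain", "").lower(),
        )
        if key in seen:
            continue
        seen.add(key)
        unique.append(row)
    return unique
```

Note that the same domain found under two different keywords stays in the output twice, because the keyword is part of the key.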
Compliance for outreach?
Use only publicly available listings and webpage metadata, and use them responsibly. Compliance with cold-email and privacy rules (GDPR, CAN-SPAM, etc.) remains your obligation.
Integrations & API
Run on Apify, export datasets, automate with API/webhooks/Make/Zapier. See routes/domain_leads_search.example.js for a sample integration pattern.
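For Python users, a run can also be started through the official `apify-client` package. The sketch below assumes that package is installed and that `APIFY_TOKEN` is set; the actor ID is a placeholder, since the real one is not given in this README and should be copied from the actor's page in Apify Console.

```python
import os


def run_and_collect(client, actor_id, run_input):
    """Start the actor, wait for it to finish, and return its dataset rows."""
    run = client.actor(actor_id).call(run_input=run_input)
    if run is None:
        raise RuntimeError("actor run did not start")
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


def main():
    # Imported here so the helper above stays usable without the package.
    from apify_client import ApifyClient  # pip install apify-client

    client = ApifyClient(os.environ["APIFY_TOKEN"])
    rows = run_and_collect(
        client,
        "<username>/<actor-name>",  # PLACEHOLDER: replace with the real actor ID
        {"sourceMode": "manual_domains", "domains": ["aitools.com"]},
    )
    for row in rows:
        print(row.get("domain"), row.get("leadQualityScore"))
```

Call `main()` from your own script once the placeholder has been replaced.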
Run locally

    cd domain-leads-search-scraper
    pip install -r requirements.txt
    playwright install chromium
    cp INPUT.example.json INPUT.json
    python main.py

Set APIFY_TOKEN / proxy env vars for realistic network conditions.
Actor permissions
Runs with typical limited actor permissions: input + default dataset (+ optional KV for debug snapshots when enabled).
Limitations
- Source layouts and anti-bot rules change; selector maintenance may be needed (`domainleads` mode especially).
- Heuristic scores are directional, not financial valuations.
- Website checks are intentionally lightweight public passes.