Lead Scraper ✅ — Verified B2B Emails & Phones Finder ✅

Pricing: Pay per event

A global lead scraper that searches the web for real business websites and extracts emails, phones, socials, and contact names. Supports any keyword and location, crawls domains directly, and returns clean, verified leads up to your chosen limit.

Developer: Manish
Maintained by Community

🌍 Global B2B Lead Generator

A powerful, location-agnostic lead generation Actor that finds real business websites worldwide and extracts comprehensive contact information including emails, phones, social profiles, and key personnel.

🎯 What This Actor Does

This Actor searches the web for businesses matching your keywords and location, then crawls their websites to extract:

  • ✉️ Email addresses (primary + all found)
  • 📞 Phone numbers (primary + all found, validated for international formats)
  • 🔗 Social media profiles (LinkedIn, Facebook, Instagram, Twitter, etc.)
  • 👥 Contact people (names, roles, direct contact info)
  • 🏢 Business details (name, website, address)
  • 📄 Page metadata (titles, structured data)

✨ Key Features

🌐 Truly Global

  • Works in any country: US, UK, Australia, Canada, India, Germany, and more
  • Supports any language and location format
  • Auto-detects regional TLDs and phone formats

🎯 Industry Agnostic

  • Works for any niche: HVAC contractors, lawyers, digital agencies, plumbers, consultants, etc.
  • Configurable search depth and request limits
  • Smart filtering to avoid directories and aggregators

🚀 Intelligent Crawling

  • Multi-source search: Combines Bing + DuckDuckGo for maximum coverage
  • Listicle extraction: Automatically detects "Top 10" lists and extracts business links
  • Deep crawling: Visits contact/about/team pages for richer data
  • Smart deduplication: One lead per domain

🧹 High-Quality Data

  • Advanced email validation (filters out image files, test emails)
  • International phone number validation (8-15 digits)
  • Structured data extraction (JSON-LD, microdata)
  • Clean business name extraction

📥 Input Configuration

{
  "keywords": "digital marketing agency",
  "location": "Sydney, Australia",
  "maxLeads": 100,
  "maxDepth": 1,
  "maxRequestsPerCrawl": 500,
  "proxy": false,
  "useSearchApi": true,
  "strictLocationMatch": false
}

Input Parameters

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| keywords | String | ✅ Yes | - | Business type or niche. Use commas to separate multiple: "agency 1, agency 2" |
| location | String | ✅ Yes | - | City, state, region, or country: "Dallas, TX", "Sydney, AU", "Toronto, Canada" |
| maxLeads | Integer | No | 50 | Maximum number of leads to collect (1-1000) |
| maxDepth | Integer | No | 1 | Crawl depth per site. 0 = homepage only, 1-3 = include subpages |
| maxRequestsPerCrawl | Integer | No | 300 | Total HTTP request limit. For large runs, use maxLeads × 5 |
| proxy | Boolean | No | false | Use Apify proxy (recommended for large runs or to avoid rate limits) |
| useSearchApi | Boolean | No | true | Enable Bing search API. If false, uses DuckDuckGo only |
| strictLocationMatch | Boolean | No | false | Enforce location filtering (may reduce results) |
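
If you build the input programmatically, it can help to apply the defaults and range checks from the table before starting a run. The sketch below is only an illustration: `normalizeInput` is a hypothetical helper, not part of the Actor, and the Actor's own validation may differ.

```javascript
// Defaults copied from the parameter table above.
const DEFAULTS = {
  maxLeads: 50,
  maxDepth: 1,
  maxRequestsPerCrawl: 300,
  proxy: false,
  useSearchApi: true,
  strictLocationMatch: false,
};

// Hypothetical helper: fills in defaults and rejects out-of-range values.
function normalizeInput(input) {
  const merged = { ...DEFAULTS, ...input };
  for (const field of ["keywords", "location"]) {
    if (typeof merged[field] !== "string" || merged[field].trim() === "") {
      throw new Error(`"${field}" is required and must be a non-empty string`);
    }
  }
  if (!Number.isInteger(merged.maxLeads) || merged.maxLeads < 1 || merged.maxLeads > 1000) {
    throw new Error('"maxLeads" must be an integer between 1 and 1000');
  }
  if (!Number.isInteger(merged.maxDepth) || merged.maxDepth < 0 || merged.maxDepth > 3) {
    throw new Error('"maxDepth" must be an integer between 0 and 3');
  }
  return merged;
}
```

Calling `normalizeInput({ keywords: "electrician", location: "Toronto, Canada" })` returns the full input object with every optional field set to its documented default.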

📤 Output Format

Each lead contains:

{
  "businessName": "Acme Digital Agency",
  "website": "https://acmedigital.com",
  "primaryEmail": "contact@acmedigital.com",
  "primaryPhone": "+61 2 1234 5678",
  "emails": ["contact@acmedigital.com", "sales@acmedigital.com"],
  "phones": ["+61 2 1234 5678", "+61 400 123 456"],
  "socials": [
    "https://linkedin.com/company/acmedigital",
    "https://twitter.com/acmedigital"
  ],
  "pageTitle": "Acme Digital - Leading Marketing Agency",
  "addressText": "123 Main St, Sydney NSW 2000, Australia",
  "contactPeople": [
    {
      "name": "John Smith",
      "role": "CEO & Founder",
      "email": "john@acmedigital.com",
      "phone": "+61 400 123 456"
    }
  ]
}

🎓 Usage Examples

Example 1: Local Service Business

{
  "keywords": "hvac contractor",
  "location": "Dallas, TX",
  "maxLeads": 50,
  "maxDepth": 1
}

Example 2: Professional Services (Multiple Keywords)

{
  "keywords": "personal injury lawyer, car accident attorney",
  "location": "Los Angeles, CA",
  "maxLeads": 100,
  "maxDepth": 2,
  "maxRequestsPerCrawl": 600
}

Example 3: Large Run with Proxy

{
  "keywords": "digital marketing agency, seo agency",
  "location": "Melbourne, Australia",
  "maxLeads": 200,
  "maxRequestsPerCrawl": 1000,
  "proxy": true
}

Example 4: Quick Scan (Homepage Only)

{
  "keywords": "electrician",
  "location": "Toronto, Canada",
  "maxLeads": 30,
  "maxDepth": 0,
  "maxRequestsPerCrawl": 150
}

🔧 Configuration Tips

For Best Results

1. Use Specific Keywords

  • ✅ Good: "residential hvac contractor"
  • ❌ Too broad: "contractor"

2. Include Multiple Related Keywords

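{
  "keywords": "plumber, plumbing contractor, emergency plumber"
}

Each keyword is searched separately, so three keywords produce three sets of search results. Domains that appear under more than one keyword still count as a single lead.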
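
To make the per-keyword behavior concrete, here is a minimal sketch of how a comma-separated `keywords` string could be expanded into one query per keyword. `buildQueries` is a hypothetical helper, and the query shape (keyword followed by location) is an assumption; the Actor's actual query construction is not documented here.

```javascript
// Hypothetical helper: one search query per comma-separated keyword.
function buildQueries(keywords, location) {
  return keywords
    .split(",")
    .map((keyword) => keyword.trim())
    .filter((keyword) => keyword.length > 0) // ignore empty entries such as a trailing comma
    .map((keyword) => `${keyword} ${location}`);
}

const queries = buildQueries("plumber, plumbing contractor, emergency plumber", "Austin, TX");
console.log(queries.length); // one query per keyword
```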

3. Specify Location Precisely

  • ✅ City + State: "Austin, TX"
  • ✅ City + Country: "Brisbane, AU"
  • ⚠️ Too broad: "USA" (may return mixed results)

4. Scale Request Budget with Lead Count

  • 50 leads → maxRequestsPerCrawl: 300
  • 100 leads → maxRequestsPerCrawl: 600
  • 200+ leads → maxRequestsPerCrawl: 1000+
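
The tiers above can be approximated with a small helper. This is a sketch under stated assumptions: `suggestRequestBudget` is a hypothetical name, the multiplier of 6 is chosen because it reproduces the 50 and 100 lead tiers while staying above the maxLeads × 5 minimum, and the floor of 300 mirrors the documented default.

```javascript
// Hypothetical helper: derive maxRequestsPerCrawl from the lead target.
function suggestRequestBudget(maxLeads, requestsPerLead = 6, floor = 300) {
  return Math.max(floor, maxLeads * requestsPerLead);
}

for (const leads of [50, 100, 200]) {
  console.log(`${leads} leads -> maxRequestsPerCrawl: ${suggestRequestBudget(leads)}`);
}
```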

5. Enable Proxy for Large Runs

{
  "maxLeads": 200,
  "proxy": true
}

This avoids rate limiting on search engines.

6. Adjust Depth Based on Data Needs

  • maxDepth: 0 - Fast, homepage only (basic contact info)
  • maxDepth: 1 - Balanced, includes /contact and /about pages
  • maxDepth: 2 - Thorough, includes /team pages (more people data)

📊 Performance Expectations

Typical Success Rates

| Scenario | Sites Processed | Leads Found | Success Rate |
| --- | --- | --- | --- |
| Local services (HVAC, plumber) | 50 | 30-40 | 60-80% |
| Professional services (lawyers) | 50 | 25-35 | 50-70% |
| Digital agencies | 50 | 20-30 | 40-60% |
| Niche B2B services | 50 | 15-25 | 30-50% |

Run Times

  • 50 leads: 2-5 minutes
  • 100 leads: 5-10 minutes
  • 200 leads: 10-20 minutes

Note: Times vary based on maxDepth, proxy usage, and site response times.

🛡️ Data Quality Features

Email Validation

  • ❌ Filters: logo@2x.png, image@example.com, tracking pixels
  • ✅ Validates: Format, domain, local part length
  • ✅ Prioritizes: Contact emails over info/support emails
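
If you post-process leads yourself, the same kinds of checks can be sketched in a few lines. The regular expressions, the placeholder-domain list, and the helper names `isPlausibleEmail` and `pickPrimaryEmail` are assumptions made for illustration; they are not the Actor's actual rules.

```javascript
// Assumed patterns: asset file names that look like emails, and placeholder domains.
const ASSET_EXTENSIONS = /\.(png|jpe?g|gif|svg|webp|bmp|ico)$/i;
const PLACEHOLDER_DOMAINS = new Set(["example.com", "example.org", "test.com"]);
const EMAIL_SHAPE = /^[^\s@]+@[^\s@]+\.[^\s@]{2,}$/;
const LOW_PRIORITY_LOCALS = new Set(["info", "support"]);

function isPlausibleEmail(candidate) {
  const email = candidate.trim().toLowerCase();
  if (!EMAIL_SHAPE.test(email)) return false;     // basic format check
  if (ASSET_EXTENSIONS.test(email)) return false; // e.g. logo@2x.png
  const [local, domain] = email.split("@");
  if (local.length > 64) return false;            // local part length limit
  if (PLACEHOLDER_DOMAINS.has(domain)) return false;
  return true;
}

// Prefer a non-generic mailbox; fall back to the first valid address.
function pickPrimaryEmail(emails) {
  const valid = emails.filter(isPlausibleEmail);
  const preferred = valid.find(
    (email) => !LOW_PRIORITY_LOCALS.has(email.split("@")[0].toLowerCase())
  );
  return preferred ?? valid[0] ?? null;
}
```

For instance, given `info@acme.com`, `logo@2x.png`, and `contact@acme.com`, this sketch drops the image file name and picks `contact@acme.com` as primary.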

Phone Validation

  • ❌ Filters: Dates (20241116), extensions (x123), sequences (12345)
  • ✅ Validates: 8-15 digits, international formats
  • ✅ Supports: All country codes (+1, +44, +61, etc.)
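
A rough equivalent of these phone checks, for anyone re-validating exported data, might look like the following. It is a sketch only: `isPlausiblePhone` and `isCompactDate` are hypothetical helpers, and the date and sequence heuristics are assumptions rather than the Actor's real logic.

```javascript
// Treat an unformatted 8-digit string that parses as YYYYMMDD as a date, not a phone.
function isCompactDate(text) {
  const match = /^(\d{4})(\d{2})(\d{2})$/.exec(text);
  if (!match) return false;
  const [year, month, day] = match.slice(1).map(Number);
  return year >= 1900 && year <= 2100 && month >= 1 && month <= 12 && day >= 1 && day <= 31;
}

function isPlausiblePhone(candidate) {
  const digits = candidate.replace(/\D/g, "");
  if (digits.length < 8 || digits.length > 15) return false;  // also rejects "x123" and "12345"
  if (isCompactDate(candidate.trim())) return false;          // e.g. 20241116
  if (/^(\d)\1+$/.test(digits)) return false;                 // one digit repeated
  if ("01234567890123456789".includes(digits)) return false;  // ascending run such as 12345678
  return true;
}
```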

Business Name Extraction

  • Tries JSON-LD structured data first (most reliable)
  • Falls back to Open Graph tags
  • Cleans page titles (removes "| Home", "- Welcome")
  • Last resort: Capitalizes domain name

Listicle Handling

When the Actor encounters pages like "Top 30 Digital Marketing Agencies":

  1. Detects it's a listicle (won't save as a lead)
  2. Extracts all business links from the page (20-50 links typically)
  3. Queues those businesses for crawling
  4. Result: One listicle page → 15-30 quality leads!
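
The listicle steps can be approximated as follows. This is a minimal sketch, assuming that a title pattern such as "Top 30" or "10 Best" is enough to flag a listicle and that any outbound link to a different host is a candidate business; `isListicleTitle` and `externalBusinessLinks` are hypothetical names.

```javascript
// Assumed heuristic: "Top 30 ...", "Best 10 ...", "10 Best ..." style titles.
const LISTICLE_TITLE = /\b(top|best)\s+\d+\b|\b\d+\s+(best|top)\b/i;

function isListicleTitle(title) {
  return LISTICLE_TITLE.test(title);
}

// Collect one origin per external host linked from the listicle page.
function externalBusinessLinks(pageUrl, hrefs) {
  const pageHost = new URL(pageUrl).hostname;
  const seenHosts = new Set();
  const links = [];
  for (const href of hrefs) {
    let url;
    try {
      url = new URL(href, pageUrl); // resolves relative links against the page
    } catch {
      continue; // skip malformed hrefs
    }
    if (!/^https?:$/.test(url.protocol)) continue; // skip mailto:, tel:, etc.
    if (url.hostname === pageHost || seenHosts.has(url.hostname)) continue;
    seenHosts.add(url.hostname);
    links.push(url.origin);
  }
  return links;
}
```

The returned origins would then be queued for crawling, while the listicle page itself is not saved as a lead.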

🚫 What Gets Filtered Out

The Actor automatically skips:

  • Directories: Yelp, Yellow Pages, Clutch, GoodFirms
  • Social media: Facebook, LinkedIn, Twitter (saved as socials, not leads)
  • Job boards: Indeed, Glassdoor
  • Marketplaces: Thumbtack, HomeAdvisor, Bark
  • Login/signup pages
  • PDF/document files
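
A filter of this kind can be sketched with a domain blocklist plus a few path checks. The domain list below covers only the sites named above, and the path patterns and the `shouldSkip` helper are assumptions for illustration; the Actor's real list is likely longer.

```javascript
// Domains named in the list above (directories, social media, job boards, marketplaces).
const BLOCKED_DOMAINS = [
  "yelp.com", "yellowpages.com", "clutch.co", "goodfirms.co",
  "facebook.com", "linkedin.com", "twitter.com",
  "indeed.com", "glassdoor.com",
  "thumbtack.com", "homeadvisor.com", "bark.com",
];

function shouldSkip(urlString) {
  const url = new URL(urlString);
  const host = url.hostname.toLowerCase();
  // Match the domain itself and any subdomain (www.yelp.com, m.facebook.com, ...)
  if (BLOCKED_DOMAINS.some((domain) => host === domain || host.endsWith(`.${domain}`))) {
    return true;
  }
  // Document files
  if (/\.(pdf|docx?|xlsx?|pptx?)$/i.test(url.pathname)) return true;
  // Login and signup pages (assumed path names)
  if (/\/(login|signin|sign-in|signup|sign-up|register)(\/|$)/i.test(url.pathname)) return true;
  return false;
}
```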

🐛 Troubleshooting

Issue: Few or No Results

Possible causes:

  1. Keywords too specific
  2. Location too broad or misspelled
  3. maxRequestsPerCrawl too low

Solutions:

{
  "keywords": "broader term, alternative term",
  "location": "specific city, state",
  "maxRequestsPerCrawl": 1000,
  "proxy": true
}

Issue: Bing Returning 0 Results

This is a known issue with the Bing search Actor that this Actor relies on (tri_angle/bing-search-scraper). When it happens, the run automatically continues with DuckDuckGo only.

Workaround: Disable Bing search:

{
"useSearchApi": false
}

This uses DuckDuckGo exclusively, which is more reliable.

Issue: Missing Contact Information

Solution: Increase crawl depth:

{
"maxDepth": 2
}

At this depth the Actor follows links to /contact, /about, and /team pages when the site has them.

Issue: Rate Limiting / Timeouts

Solution: Enable proxy and lower the request budget:

{
  "proxy": true,
  "maxRequestsPerCrawl": 500
}

💰 Cost Optimization

Budget-Friendly Settings

{
  "maxLeads": 50,
  "maxDepth": 0,
  "maxRequestsPerCrawl": 200,
  "proxy": false,
  "useSearchApi": false
}

Cost: ~$0.10-0.20 per run

Balanced Settings

{
  "maxLeads": 100,
  "maxDepth": 1,
  "maxRequestsPerCrawl": 500,
  "proxy": false
}

Cost: ~$0.30-0.50 per run

Premium Quality Settings

{
  "maxLeads": 200,
  "maxDepth": 2,
  "maxRequestsPerCrawl": 1500,
  "proxy": true
}

Cost: ~$0.80-1.50 per run

🔗 Integration Examples

Export to CSV

  1. Run the Actor
  2. Go to Storage → Dataset
  3. Click "Export" → CSV
  4. Open in Excel/Google Sheets
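
If you would rather build the CSV in code, for example to choose your own columns, a minimal sketch follows. `leadsToCsv` and `csvCell` are hypothetical helpers; the column list uses field names from the output format, and joining array fields with "; " is a formatting choice made here, not something the Actor does.

```javascript
// Quote a cell only when it contains a comma, quote, or newline.
function csvCell(value) {
  const text = Array.isArray(value) ? value.join("; ") : String(value ?? "");
  return /[",\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

function leadsToCsv(
  leads,
  columns = ["businessName", "website", "primaryEmail", "primaryPhone", "emails", "phones"]
) {
  const header = columns.join(",");
  const rows = leads.map((lead) => columns.map((column) => csvCell(lead[column])).join(","));
  return [header, ...rows].join("\n");
}
```

The `leads` argument is expected to be the array of dataset items, such as the `items` array in the workflow example in the next section.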

Use in Workflows

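import { Actor } from "apify";

await Actor.init();

// Start the lead scraper and wait for the run to finish
const run = await Actor.call("your-actor-name", {
  keywords: "digital marketing agency",
  location: "New York, NY",
  maxLeads: 50,
});

// Read the leads from the run's default dataset
const dataset = await Actor.openDataset(run.defaultDatasetId);
const { items } = await dataset.getData();

// Process leads
for (const lead of items) {
  console.log(`${lead.businessName}: ${lead.primaryEmail}`);
}

await Actor.exit();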

API Access

curl -X POST https://api.apify.com/v2/acts/YOUR_ACTOR_ID/runs \
  -H "Authorization: Bearer YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "keywords": "hvac contractor",
    "location": "Austin, TX",
    "maxLeads": 100
  }'
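
The same request can be prepared from JavaScript. The sketch below only builds the URL and options so it stays testable offline; `buildRunRequest` is a hypothetical helper, and the placeholders are the same ones used in the curl command.

```javascript
// Hypothetical helper: mirrors the curl command above as fetch() arguments.
function buildRunRequest(actorId, apiToken, input) {
  return {
    url: `https://api.apify.com/v2/acts/${encodeURIComponent(actorId)}/runs`,
    options: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiToken}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify(input),
    },
  };
}

const { url, options } = buildRunRequest("YOUR_ACTOR_ID", "YOUR_API_TOKEN", {
  keywords: "hvac contractor",
  location: "Austin, TX",
  maxLeads: 100,
});
// To send it (Node 18+ or a browser): const response = await fetch(url, options);
```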

📈 Advanced Use Cases

Multi-Location Campaign

Run the Actor multiple times with different locations:

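import { Actor } from "apify";

await Actor.init();
const locations = ["Dallas, TX", "Houston, TX", "Austin, TX"];

// Runs execute one after another; each call waits for its run to finish
for (const location of locations) {
  await Actor.call("your-actor-name", {
    keywords: "hvac contractor",
    location,
    maxLeads: 50,
  });
}

await Actor.exit();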
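
Each run deduplicates within itself, but a business serving several cities may show up in more than one run. A minimal sketch for merging the results afterwards, assuming you have already loaded each run's dataset items into an array; `domainOf` and `mergeLeadsByDomain` are hypothetical helpers.

```javascript
// Normalize a website URL to its bare domain (lowercase, no leading "www.").
function domainOf(website) {
  return new URL(website).hostname.toLowerCase().replace(/^www\./, "");
}

// Keep the first lead seen for each domain across all runs.
function mergeLeadsByDomain(...runs) {
  const byDomain = new Map();
  for (const lead of runs.flat()) {
    let domain;
    try {
      domain = domainOf(lead.website);
    } catch {
      continue; // skip leads without a parseable website
    }
    if (!byDomain.has(domain)) byDomain.set(domain, lead);
  }
  return [...byDomain.values()];
}
```

Keeping the first occurrence is the simplest policy; you could instead merge the `emails` and `phones` arrays of duplicates if you want the union of contact details.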

Competitive Analysis

Find all businesses in a niche:

{
  "keywords": "saas marketing agency, b2b marketing agency",
  "location": "United States",
  "maxLeads": 500,
  "maxRequestsPerCrawl": 2500
}

Lead Enrichment

Combine with other Actors:

  1. Use this Actor to find business websites
  2. Pipe results to LinkedIn scraper for company profiles
  3. Enrich with Clearbit/Hunter.io for verification

📝 Output Schema

Complete field descriptions:

| Field | Type | Description |
| --- | --- | --- |
| businessName | String | Company/business name |
| website | String | Primary website URL |
| primaryEmail | String | Main contact email (first found) |
| primaryPhone | String | Main phone number (first found) |
| emails | Array | All email addresses found |
| phones | Array | All phone numbers found |
| socials | Array | Social media profile URLs |
| pageTitle | String | HTML page title |
| addressText | String | Extracted address information |
| contactPeople | Array | Key personnel with roles and contact info |

Contact People Structure

{
  "name": "Jane Doe",
  "role": "VP of Marketing",
  "email": "jane@company.com",
  "phone": "+1 555 123 4567"
}
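
For outreach tools that expect one row per person, the nested `contactPeople` array can be flattened. This is a sketch only: `flattenContacts` is a hypothetical helper, and falling back to the lead's primary email and phone when a person has none is a choice made here, not Actor behavior.

```javascript
// One output row per contact person, carrying the business fields along.
function flattenContacts(leads) {
  return leads.flatMap((lead) =>
    (lead.contactPeople ?? []).map((person) => ({
      businessName: lead.businessName,
      website: lead.website,
      name: person.name,
      role: person.role ?? "",
      email: person.email ?? lead.primaryEmail ?? "",
      phone: person.phone ?? lead.primaryPhone ?? "",
    }))
  );
}
```

Leads without any contact people produce no rows, so keep the original lead list if you also need business-level contacts.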

⚡ Performance Tips

  1. Start small - Test with 20-30 leads first
  2. Use multiple keywords - Better coverage than one broad term
  3. Enable proxy for 100+ leads to avoid rate limits
  4. Adjust maxRequestsPerCrawl - Should be ≥ maxLeads × 5
  5. Check dataset during run - Monitor quality in real-time
  6. Use specific locations - "City, State" better than just "State"

🆘 Support

If you encounter issues:

  1. Check the Actor logs for specific error messages
  2. Reduce maxLeads and test with smaller runs
  3. Try disabling Bing: "useSearchApi": false
  4. Enable proxy: "proxy": true

📜 License

This Actor is provided as-is for lead generation purposes. Users are responsible for complying with applicable data protection laws and website terms of service.

🎉 Credits

Built with:

  • Apify SDK
  • Crawlee
  • Bing Search API (via tri_angle/bing-search-scraper)
  • DuckDuckGo HTML search

Happy Lead Hunting! 🎯