Website Lead Intelligence

Pricing

from $28.00 / 1,000 enriched leads

Crawl any website and turn it into a sales-ready lead profile—no APIs needed. AI identifies industries, detects 50+ technologies, extracts emails and phones, estimates company size, and scores leads 0–100 against your ICP. Built for B2B sales teams and marketers qualifying leads at scale.

Developer

Juyeop Park

Maintained by Community

Turn any company website into actionable lead intelligence — with AI-powered industry classification, technology detection, contact extraction, company size estimation, and ICP-based lead scoring. No external API keys required.

What It Does

Give this Actor a list of company domains or URLs. It crawls each website (homepage + key subpages), then returns a rich, structured dataset for every lead:

| Feature | Description |
| --- | --- |
| Industry Classification | Auto-classifies into 20+ industries using AI (zero-shot) or keyword analysis |
| Technology Detection | Identifies 50+ technologies across 8 categories (frameworks, CMS, analytics, marketing, etc.) |
| Contact Extraction | Finds emails, phone numbers, and social media profiles (LinkedIn, Twitter, Facebook, etc.) |
| Company Size Estimation | Estimates employee count from text patterns, careers pages, team pages, and tech sophistication |
| Lead Scoring | Scores each lead 0-100 against your Ideal Customer Profile (ICP) |

How It Works

Input: ["stripe.com", "hubspot.com", "shopify.com"]
                    |
        +-----------+-----------+
        |           |           |
      Crawl       Crawl       Crawl
    (homepage   (homepage   (homepage
     + /about    + /about    + /about
     + /contact  + /contact  + /contact
     + /careers) + /careers) + /careers)
        |           |           |
        +-----+-----+-----+----+
              |           |
  Extract Contacts     Detect Tech Stack
              |           |
 Classify Industry     Estimate Size
              |           |
               Score vs ICP
                    |
   Structured Output (sorted by score)

  1. Crawl — Fetches the homepage plus up to 10 subpages (/about, /contact, /careers, /pricing, /team, etc.)
  2. Extract — Pulls emails (from text + mailto: links), phones (from tel: links), and social profiles from anchor tags
  3. Detect — Matches 50+ technology signatures in HTML and HTTP headers (React, WordPress, HubSpot, Stripe, AWS, etc.)
  4. Classify — Determines the company's industry using a hybrid approach:
    • AI mode (default): Runs a zero-shot classification model (distilbert-base-uncased-mnli via transformers.js) locally — no API keys needed
    • Keyword mode: Fast regex-based matching against 20+ industry keyword dictionaries
    • Results from both methods are cross-validated to improve accuracy
  5. Estimate — Determines company size from employee count patterns, job listings, team profiles, and tech stack sophistication
  6. Score — Calculates a 0-100 lead score based on how well the company matches your ICP criteria
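
The Extract step above can be illustrated with a minimal sketch. It assumes only what the step describes (emails from text and mailto: links, phones from tel: links); the class name `ContactParser`, the function `extract_contacts`, and the email regex are hypothetical and are not taken from the Actor's source.

```python
import re
from html.parser import HTMLParser

# Simplified email pattern (an assumption; the Actor's validation may differ).
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


class ContactParser(HTMLParser):
    """Collects emails (mailto: links and visible text) and phones (tel: links)."""

    def __init__(self):
        super().__init__()
        self.emails = []
        self.phones = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href") or ""
        if href.lower().startswith("mailto:"):
            # Drop any "?subject=..." query part after the address.
            self._add(self.emails, href[7:].split("?")[0].strip().lower())
        elif href.lower().startswith("tel:"):
            self._add(self.phones, href[4:].strip())

    def handle_data(self, data):
        for match in EMAIL_RE.findall(data):
            self._add(self.emails, match.lower())

    @staticmethod
    def _add(bucket, value):
        # Deduplicate while preserving first-seen order.
        if value and value not in bucket:
            bucket.append(value)


def extract_contacts(html):
    parser = ContactParser()
    parser.feed(html)
    parser.close()
    return {"emails": parser.emails, "phones": parser.phones}
```

Calling `extract_contacts(page_html)` on each crawled page and merging the lists would give per-domain contact data similar in shape to the `emails` and `phones` output fields.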

Input

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| urls | string[] | (required) | Website URLs or domains to analyze (e.g., stripe.com, https://hubspot.com) |
| maxPagesPerDomain | integer | 5 | Max subpages to crawl per domain (1-10) |
| concurrency | integer | 3 | Number of domains to process in parallel (1-10) |
| enableAiClassification | boolean | true | Use AI model for industry classification. Disable for faster keyword-only mode |
| targetIndustries | string[] | [] | ICP: Target industries for lead scoring (e.g., ["SaaS", "E-commerce"]) |
| targetSizeMin | integer | 10 | ICP: Minimum preferred employee count |
| targetSizeMax | integer | 500 | ICP: Maximum preferred employee count |
| targetTechnologies | string[] | [] | ICP: Target technologies (e.g., ["React", "HubSpot"]) |
| requiredContactTypes | string[] | ["email"] | ICP: Required contact types: email, phone, linkedin, twitter |

Example Input

{
  "urls": ["stripe.com", "hubspot.com", "shopify.com", "coursera.org"],
  "enableAiClassification": true,
  "maxPagesPerDomain": 5,
  "concurrency": 3,
  "targetIndustries": ["SaaS and Software"],
  "targetSizeMin": 50,
  "targetSizeMax": 1000,
  "targetTechnologies": ["React", "Node.js"],
  "requiredContactTypes": ["email"]
}
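
When building this input programmatically, it can help to check values against the documented ranges before starting a run. The following is a rough sketch: the helper `build_input` is hypothetical, the defaults and the 1-10 ranges come from the Input table, and the check that the minimum size does not exceed the maximum is an added assumption.

```python
ALLOWED_CONTACT_TYPES = {"email", "phone", "linkedin", "twitter"}


def build_input(urls, max_pages_per_domain=5, concurrency=3,
                enable_ai_classification=True, target_industries=None,
                target_size_min=10, target_size_max=500,
                target_technologies=None, required_contact_types=None):
    """Return an input dict using the field names from the Input table."""
    if not urls:
        raise ValueError("urls is required and must contain at least one entry")
    if not 1 <= max_pages_per_domain <= 10:
        raise ValueError("maxPagesPerDomain must be between 1 and 10")
    if not 1 <= concurrency <= 10:
        raise ValueError("concurrency must be between 1 and 10")
    # Assumption: the Actor expects min <= max; the page does not say so explicitly.
    if target_size_min > target_size_max:
        raise ValueError("targetSizeMin must not exceed targetSizeMax")
    if required_contact_types is None:
        contact_types = ["email"]
    else:
        contact_types = list(required_contact_types)
    unknown = set(contact_types) - ALLOWED_CONTACT_TYPES
    if unknown:
        raise ValueError(f"unsupported contact types: {sorted(unknown)}")
    return {
        "urls": list(urls),
        "maxPagesPerDomain": max_pages_per_domain,
        "concurrency": concurrency,
        "enableAiClassification": enable_ai_classification,
        "targetIndustries": list(target_industries or []),
        "targetSizeMin": target_size_min,
        "targetSizeMax": target_size_max,
        "targetTechnologies": list(target_technologies or []),
        "requiredContactTypes": contact_types,
    }
```

The returned dict can be serialized with `json.dumps` and pasted into the Actor's input editor or passed to whichever client is used to start the run.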

Output

Each lead is returned as a structured JSON object with the following fields:

{
  "domain": "stripe.com",
  "url": "https://stripe.com",
  "title": "Stripe | Financial Infrastructure for the Internet",
  "description": "Stripe powers online and in-person payment processing...",
  "emails": ["sales@stripe.com"],
  "phones": ["+1-888-926-2289"],
  "socialLinks": {
    "linkedin": "https://www.linkedin.com/company/stripe",
    "twitter": "https://twitter.com/stripe",
    "facebook": "https://www.facebook.com/StripeHQ",
    "instagram": null,
    "github": "https://github.com/stripe",
    "youtube": null
  },
  "industry": {
    "primary": "SaaS and Software",
    "secondary": "E-commerce and Retail",
    "confidence": 0.485,
    "allScores": [
      { "label": "SaaS and Software", "score": 0.485 },
      { "label": "E-commerce and Retail", "score": 0.389 }
    ],
    "method": "keyword"
  },
  "companySize": {
    "estimate": "enterprise",
    "employeeRange": "1000+",
    "confidence": 0.25,
    "signals": ["Enterprise-level language detected"]
  },
  "technologies": ["React", "Next.js", "AWS", "Vercel", "Fastly", "Bootstrap", "Stripe"],
  "techCategories": {
    "frameworks": ["React", "Next.js"],
    "cms": [],
    "analytics": [],
    "marketing": [],
    "infrastructure": ["AWS", "Vercel", "Fastly"],
    "libraries": ["Bootstrap"],
    "ecommerce": ["Stripe"],
    "payments": []
  },
  "leadScore": 34,
  "scoreBreakdown": {
    "industryMatch": 15,
    "sizeMatch": 0,
    "techMatch": 10,
    "contactQuality": 0,
    "webPresence": 9
  },
  "pagesAnalyzed": 3,
  "analyzedAt": "2026-03-08T08:03:53.138Z"
}

Output Field Reference

| Field | Type | Description |
| --- | --- | --- |
| domain | string | Normalized domain name |
| url | string | Full URL of the homepage |
| title | string | Page title from <title> tag |
| description | string | Meta description |
| emails | string[] | Extracted email addresses (deduplicated, validated) |
| phones | string[] | Phone numbers from tel: links (max 10, deduplicated) |
| socialLinks | object | Social media profile URLs (LinkedIn, Twitter/X, Facebook, Instagram, GitHub, YouTube) |
| industry.primary | string | Top industry classification |
| industry.secondary | string or null | Second-best industry (if confidence > threshold) |
| industry.confidence | number | Classification confidence (0-1) |
| industry.method | string | "ai" or "keyword": which method produced the final result |
| companySize.estimate | string | Size category: micro, small, medium, large, enterprise, or unknown |
| companySize.employeeRange | string | Human-readable range (e.g., "51-200") |
| companySize.signals | string[] | Evidence used for estimation |
| technologies | string[] | All detected technologies |
| techCategories | object | Technologies grouped by category |
| leadScore | integer | ICP match score (0-100), higher = better fit |
| scoreBreakdown | object | Score components: industryMatch, sizeMatch, techMatch, contactQuality, webPresence |
| pagesAnalyzed | integer | Number of pages successfully crawled |
| analyzedAt | string | ISO 8601 timestamp |

Supported Industries

The classifier recognizes 20+ industries:

| Industry | Example Companies |
| --- | --- |
| SaaS and Software | Stripe, HubSpot, GitHub, Salesforce |
| E-commerce and Retail | Shopify, Amazon, Etsy |
| Marketing and Advertising | Creative agencies, ad networks |
| Finance and Payments | Banks, investment platforms, fintech |
| Healthcare and Biotech | Hospitals, pharma, telehealth |
| Education and Training | Coursera, universities, edtech |
| Manufacturing and Industrial | Factories, industrial equipment |
| Consulting and Professional Services | McKinsey, Deloitte |
| Media and Entertainment | Streaming, publishing, podcasts |
| Travel & Hospitality | Airbnb, hotels, tourism |
| Real Estate | Property management, mortgage |
| Food & Beverage | Restaurants, food delivery |
| Telecommunications | Telecom, ISPs |
| Energy | Solar, oil & gas, utilities |
| Non-Profit | Charities, foundations, NGOs |
| Legal | Law firms, legal services |
| Construction | Contractors, architecture firms |
| Transportation & Logistics | Shipping, freight, fleet management |
| Cybersecurity | InfoSec, threat detection |
| AI & Machine Learning | AI research, ML platforms |
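
To make the keyword mode more concrete, here is a minimal sketch of dictionary-based industry matching. The keyword lists are invented for illustration and cover only three of the industries above; the Actor's real dictionaries, scoring formula, and handling of unmatched text are not documented on this page, so everything below is an assumption apart from the output field names.

```python
import re

# Hypothetical keyword dictionaries (illustrative only, not the Actor's lists).
INDUSTRY_KEYWORDS = {
    "SaaS and Software": ["saas", "software", "api", "platform"],
    "E-commerce and Retail": ["shop", "cart", "checkout", "store"],
    "Education and Training": ["course", "learning", "university"],
}


def classify_by_keywords(text):
    """Score each industry by its share of all keyword hits in the text."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = {
        label: sum(words.count(keyword) for keyword in keywords)
        for label, keywords in INDUSTRY_KEYWORDS.items()
    }
    total = sum(counts.values())
    if total == 0:
        # Assumption: no keyword hits means no classification.
        return {"primary": None, "secondary": None, "confidence": 0.0,
                "allScores": [], "method": "keyword"}
    ranked = sorted(
        ({"label": label, "score": count / total}
         for label, count in counts.items() if count > 0),
        key=lambda item: item["score"],
        reverse=True,
    )
    return {
        "primary": ranked[0]["label"],
        "secondary": ranked[1]["label"] if len(ranked) > 1 else None,
        "confidence": ranked[0]["score"],
        "allScores": ranked,
        "method": "keyword",
    }
```

The return value mirrors the shape of the `industry` object in the example output, which makes it easy to compare a keyword-only result with one produced in AI mode.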

Detected Technologies (50+)

| Category | Technologies |
| --- | --- |
| Frameworks | React, Next.js, Vue.js, Nuxt.js, Angular, Svelte, Gatsby, Remix |
| CMS | WordPress, Shopify, Squarespace, Wix, Webflow, Drupal, Ghost, Contentful |
| Analytics | Google Analytics, Google Tag Manager, Facebook Pixel, Hotjar, Mixpanel, Segment, Amplitude, Plausible |
| Marketing | HubSpot, Mailchimp, Intercom, Drift, Zendesk, Crisp, Salesforce, Marketo, ActiveCampaign |
| Infrastructure | Cloudflare, AWS, Vercel, Netlify, Heroku, Google Cloud, Azure, Fastly |
| Libraries | jQuery, Bootstrap, Tailwind CSS, Material UI, Font Awesome, GSAP, Three.js, D3.js |
| E-commerce | Stripe, PayPal, WooCommerce, BigCommerce, Magento |
| Payments | Stripe Payments, PayPal Checkout, Square, Braintree |
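
Signature matching of this kind can be sketched as a lookup of regular expressions against page HTML and response headers. The four signatures below are common public fingerprints chosen for illustration; they are assumptions, not the Actor's actual signature set, and the names `SIGNATURES` and `detect_technologies` are hypothetical.

```python
import re

# Illustrative signatures only; each maps a technology to a category plus
# an HTML pattern and/or a (header name, pattern) pair.
SIGNATURES = {
    "WordPress": {"category": "cms", "html": r"/wp-content/|/wp-includes/"},
    "Next.js": {"category": "frameworks", "html": r"/_next/static/|__NEXT_DATA__"},
    "Google Tag Manager": {"category": "analytics",
                           "html": r"googletagmanager\.com/gtm\.js"},
    "Cloudflare": {"category": "infrastructure", "header": ("server", r"cloudflare")},
}


def detect_technologies(html, headers):
    """Return detected technology names, flat and grouped by category."""
    lowered = {name.lower(): value for name, value in headers.items()}
    technologies = []
    categories = {}
    for name, signature in SIGNATURES.items():
        hit = False
        if "html" in signature and re.search(signature["html"], html, re.IGNORECASE):
            hit = True
        if "header" in signature:
            header_name, pattern = signature["header"]
            if re.search(pattern, lowered.get(header_name, ""), re.IGNORECASE):
                hit = True
        if hit:
            technologies.append(name)
            categories.setdefault(signature["category"], []).append(name)
    return {"technologies": technologies, "techCategories": categories}
```

Header names are lowercased before lookup because HTTP header names are case-insensitive.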

Lead Score Breakdown

The lead score (0-100) is composed of five weighted dimensions:

| Dimension | Max Points | What It Measures |
| --- | --- | --- |
| Industry Match | 30 | How well the company's industry matches your target industries |
| Size Match | 20 | How close the company size is to your target range |
| Tech Match | 20 | How many of your target technologies the company uses |
| Contact Quality | 15 | Availability of emails, phones, and social profiles |
| Web Presence | 15 | Tech sophistication, analytics usage, marketing tools, active website |
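
The five maximums add up to 100, and the example output is consistent with a plain sum of the components (15 + 0 + 10 + 0 + 9 = 34). A small sketch of that composition follows; clamping each component to its maximum is an assumption added for safety, and `total_lead_score` is a hypothetical helper name.

```python
# Maximum points per dimension, taken from the table above.
MAX_POINTS = {
    "industryMatch": 30,
    "sizeMatch": 20,
    "techMatch": 20,
    "contactQuality": 15,
    "webPresence": 15,
}


def total_lead_score(breakdown):
    """Sum the scoreBreakdown components, clamping each to [0, max]."""
    total = 0
    for dimension, cap in MAX_POINTS.items():
        points = breakdown.get(dimension, 0)
        total += max(0, min(points, cap))
    return total
```

This is handy for sanity-checking exported data: for every row, `total_lead_score(row["scoreBreakdown"])` would be expected to equal `row["leadScore"]` if the score really is a plain sum.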

Company Size Categories

| Category | Employee Range | Midpoint |
| --- | --- | --- |
| Micro | 1-10 | 5 |
| Small | 11-50 | 30 |
| Medium | 51-200 | 125 |
| Large | 201-1000 | 600 |
| Enterprise | 1000+ | 2000 |
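
If you already know an employee count from another source, the table can be applied directly. The sketch below is an illustration, not the Actor's estimator: the function `size_category` is hypothetical, and because the table lists 1000 under both Large and Enterprise, it assumes exactly 1000 employees counts as Large.

```python
# (category, lower bound, upper bound, range label, midpoint) from the table above.
SIZE_CATEGORIES = [
    ("micro", 1, 10, "1-10", 5),
    ("small", 11, 50, "11-50", 30),
    ("medium", 51, 200, "51-200", 125),
    ("large", 201, 1000, "201-1000", 600),
]


def size_category(employee_count):
    """Map a known employee count to a category, range label, and midpoint."""
    if employee_count is None or employee_count < 1:
        return {"estimate": "unknown", "employeeRange": None, "midpoint": None}
    for name, low, high, label, midpoint in SIZE_CATEGORIES:
        if low <= employee_count <= high:
            return {"estimate": name, "employeeRange": label, "midpoint": midpoint}
    # Assumption: anything above 1000 is enterprise.
    return {"estimate": "enterprise", "employeeRange": "1000+", "midpoint": 2000}
```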

Use Cases

  • Sales Prospecting — Score and prioritize leads from cold outreach lists
  • Market Research — Map competitors' tech stacks and company sizes at scale
  • Lead Qualification — Auto-filter leads that match your ICP before CRM import
  • Data Enrichment — Append industry, size, and tech data to existing lead lists
  • Competitive Intelligence — Monitor which technologies competitors are adopting

Integration Tips

Chain with Other Scrapers

Use this Actor as a post-processing step after collecting domains from:

  • Google Maps Email Extractor
  • LinkedIn Company Scraper
  • Any domain list scraper

Export to CRM

Results can be exported as JSON, CSV, or Excel directly from the Apify dataset. Key fields map naturally to CRM fields:

  • emails → Contact Email
  • phones → Contact Phone
  • industry.primary → Company Industry
  • companySize.employeeRange → Company Size
  • leadScore → Lead Score / Priority
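
The field mapping above can be applied to exported JSON with a few lines of code. This is a sketch under assumptions: the CRM column names are copied from the list, while the helper names, the "; " separator for multiple values, and the optional `min_score` filter are hypothetical choices.

```python
import csv
import io

CRM_COLUMNS = ["Contact Email", "Contact Phone", "Company Industry",
               "Company Size", "Lead Score"]


def lead_to_crm_row(lead):
    """Flatten one lead object into the CRM columns listed above."""
    return {
        "Contact Email": "; ".join(lead.get("emails", [])),
        "Contact Phone": "; ".join(lead.get("phones", [])),
        "Company Industry": (lead.get("industry") or {}).get("primary") or "",
        "Company Size": (lead.get("companySize") or {}).get("employeeRange") or "",
        "Lead Score": lead.get("leadScore", 0),
    }


def leads_to_csv(leads, min_score=0):
    """Return CSV text for leads at or above min_score, best score first."""
    rows = [lead_to_crm_row(lead) for lead in leads
            if lead.get("leadScore", 0) >= min_score]
    rows.sort(key=lambda row: row["Lead Score"], reverse=True)
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=CRM_COLUMNS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

Most CRMs accept a CSV import with a column-mapping step, so the header names only need to be recognizable rather than exact.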

Webhook Integration

Set up an Apify webhook to automatically send enriched leads to your CRM, Slack, or any HTTP endpoint when a run completes.
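
On the receiving side, a handler typically reads the run's dataset ID from the webhook body and then fetches the items. The sketch below assumes Apify's default webhook payload (an `eventType` field plus a `resource` object containing `defaultDatasetId`) and the public dataset items endpoint; if you customize the payload template, adjust the lookup. The function name `dataset_items_url` is hypothetical.

```python
from urllib.parse import urlencode


def dataset_items_url(payload, fmt="json"):
    """Build the dataset items URL for a successful run, or return None."""
    # Assumption: default payload template with eventType and resource fields.
    if payload.get("eventType") != "ACTOR.RUN.SUCCEEDED":
        return None
    dataset_id = (payload.get("resource") or {}).get("defaultDatasetId")
    if not dataset_id:
        return None
    query = urlencode({"format": fmt})
    return f"https://api.apify.com/v2/datasets/{dataset_id}/items?{query}"
```

Fetching that URL usually also requires an API token for datasets that are not public; how the token is supplied is left out of the sketch.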

Performance

| Metric | Value |
| --- | --- |
| Speed | ~10-15 seconds per domain (with AI), ~5-8 seconds (keyword-only) |
| Memory | Recommended 4096 MB (for AI model). Minimum 1024 MB (keyword-only) |
| Concurrency | Up to 10 domains in parallel |
| AI Model | distilbert-base-uncased-mnli (~100MB, loaded once, runs locally) |

Limitations

  • Contact extraction relies on publicly visible information (emails in HTML, tel: links, social media links)
  • Company size estimation is approximate — based on text patterns, careers pages, and tech sophistication
  • Some websites may block automated crawlers, resulting in fewer pages analyzed
  • AI classification requires 4096 MB memory; use keyword-only mode for lower memory usage

Cost

This Actor runs on the Apify platform. Costs depend on compute usage:

  • Memory: 4096 MB recommended (AI mode) or 1024 MB (keyword mode)
  • Compute units: ~0.01-0.02 CU per domain analyzed
  • No external API keys or additional costs required
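
For budgeting, the figures on this page can be turned into a rough estimate. The sketch below simply multiplies the quoted ranges: ~0.01-0.02 CU per domain and the listed "from $28.00 / 1,000" lead price. How these two figures combine on an actual invoice depends on your Apify plan and is not stated here, so treat the result as an order-of-magnitude guide; `estimate_run` is a hypothetical helper.

```python
def estimate_run(domains, cu_per_domain=(0.01, 0.02), price_per_1000_leads=28.00):
    """Rough estimate of compute units and lead fee for a batch of domains."""
    low_cu = domains * cu_per_domain[0]
    high_cu = domains * cu_per_domain[1]
    lead_fee = domains / 1000 * price_per_1000_leads
    return {
        "computeUnits": (round(low_cu, 2), round(high_cu, 2)),
        "leadFeeUsd": round(lead_fee, 2),
    }
```

For example, `estimate_run(500)` gives roughly 5 to 10 compute units and a lead fee of about $14 at the listed starting price.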