Website Lead Intelligence
Pricing
from $28.00 / 1,000 enriched leads
Crawl any website and turn it into a sales-ready lead profile—no APIs needed. AI identifies industries, detects 50+ technologies, extracts emails and phones, estimates company size, and scores leads 0–100 against your ICP. Built for B2B sales teams and marketers qualifying leads at scale.
Developer: Juyeop Park
Turn any company website into actionable lead intelligence — with AI-powered industry classification, technology detection, contact extraction, company size estimation, and ICP-based lead scoring. No external API keys required.
What It Does
Give this Actor a list of company domains or URLs. It crawls each website (homepage + key subpages), then returns a rich, structured dataset for every lead:
| Feature | Description |
|---|---|
| Industry Classification | Auto-classifies into 20+ industries using AI (zero-shot) or keyword analysis |
| Technology Detection | Identifies 50+ technologies across 8 categories (frameworks, CMS, analytics, marketing, etc.) |
| Contact Extraction | Finds emails, phone numbers, and social media profiles (LinkedIn, Twitter, Facebook, etc.) |
| Company Size Estimation | Estimates employee count from text patterns, careers pages, team pages, and tech sophistication |
| Lead Scoring | Scores each lead 0-100 against your Ideal Customer Profile (ICP) |
How It Works
    Input: ["stripe.com", "hubspot.com", "shopify.com"]
                      |
          +-----------+-----------+
          |           |           |
        Crawl       Crawl       Crawl
     (homepage   (homepage   (homepage
      + /about    + /about    + /about
      + /contact  + /contact  + /contact
      + /careers) + /careers) + /careers)
          |           |           |
          +-----+-----+-----+-----+
                |           |
        Extract Contacts   Detect Tech Stack
                |           |
        Classify Industry  Estimate Size
                |           |
                Score vs ICP
                      |
        Structured Output (sorted by score)
- Crawl — Fetches the homepage plus up to 10 subpages (/about, /contact, /careers, /pricing, /team, etc.)
- Extract — Pulls emails (from text + mailto: links), phones (from tel: links), and social profiles from anchor tags
- Detect — Matches 50+ technology signatures in HTML and HTTP headers (React, WordPress, HubSpot, Stripe, AWS, etc.)
- Classify — Determines the company's industry using a hybrid approach:
  - AI mode (default): Runs a zero-shot classification model (distilbert-base-uncased-mnli via transformers.js) locally, so no API keys are needed
  - Keyword mode: Fast regex-based matching against 20+ industry keyword dictionaries
  - Results from the two methods are cross-validated against each other to improve accuracy
- Estimate — Determines company size from employee count patterns, job listings, team profiles, and tech stack sophistication
- Score — Calculates a 0-100 lead score based on how well the company matches your ICP criteria
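The Extract step can be illustrated with a minimal Python sketch. It is not the Actor's implementation: the email regex, the `ContactParser` class, and the `extract_contacts` helper are assumptions made for illustration, using only the rules stated above (emails from text and mailto: links, phones from tel: links, deduplicated).

```python
import re
from html.parser import HTMLParser
from urllib.parse import unquote

# Assumed pattern; the Actor's real validation rules are not documented.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


class ContactParser(HTMLParser):
    """Collects emails (mailto: links and visible text) and phones (tel: links)."""

    def __init__(self):
        super().__init__()
        self.emails = []
        self.phones = []

    def _add(self, bucket, value):
        # Keep first-seen order while dropping duplicates.
        if value and value not in bucket:
            bucket.append(value)

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href") or ""
        if href.lower().startswith("mailto:"):
            # Drop any ?subject=... query part after the address.
            address = unquote(href[7:].split("?")[0]).strip().lower()
            self._add(self.emails, address)
        elif href.lower().startswith("tel:"):
            self._add(self.phones, unquote(href[4:]).strip())

    def handle_data(self, data):
        for match in EMAIL_RE.findall(data):
            self._add(self.emails, match.lower())


def extract_contacts(html):
    """Hypothetical helper: return deduplicated emails and phones from one page."""
    parser = ContactParser()
    parser.feed(html)
    parser.close()
    return {"emails": parser.emails, "phones": parser.phones}
```

Restricting phone extraction to tel: links, as the step describes, avoids the false positives that free-text phone regexes tend to produce on dates, prices, and IDs.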
Input
| Field | Type | Default | Description |
|---|---|---|---|
| `urls` | string[] | (required) | Website URLs or domains to analyze (e.g., `stripe.com`, `https://hubspot.com`) |
| `maxPagesPerDomain` | integer | 5 | Max subpages to crawl per domain (1-10) |
| `concurrency` | integer | 3 | Number of domains to process in parallel (1-10) |
| `enableAiClassification` | boolean | true | Use AI model for industry classification. Disable for faster keyword-only mode |
| `targetIndustries` | string[] | [] | ICP: Target industries for lead scoring (e.g., `["SaaS", "E-commerce"]`) |
| `targetSizeMin` | integer | 10 | ICP: Minimum preferred employee count |
| `targetSizeMax` | integer | 500 | ICP: Maximum preferred employee count |
| `targetTechnologies` | string[] | [] | ICP: Target technologies (e.g., `["React", "HubSpot"]`) |
| `requiredContactTypes` | string[] | ["email"] | ICP: Required contact types: email, phone, linkedin, twitter |
Example Input
    {
      "urls": ["stripe.com", "hubspot.com", "shopify.com", "coursera.org"],
      "enableAiClassification": true,
      "maxPagesPerDomain": 5,
      "concurrency": 3,
      "targetIndustries": ["SaaS and Software"],
      "targetSizeMin": 50,
      "targetSizeMax": 1000,
      "targetTechnologies": ["React", "Node.js"],
      "requiredContactTypes": ["email"]
    }
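The same input can be assembled and submitted from code. The sketch below is one possible approach in Python with the `apify-client` package. `build_input` and `run_actor` are hypothetical helper names, and the Actor ID is not given in this README, so it is passed in as a parameter you must supply yourself.

```python
def build_input(urls, max_pages=5, concurrency=3, **icp):
    """Build a run input dict, enforcing the documented 1-10 ranges."""
    if not urls:
        raise ValueError("urls is required")
    checks = (("maxPagesPerDomain", max_pages), ("concurrency", concurrency))
    for name, value in checks:
        if not 1 <= value <= 10:
            raise ValueError(f"{name} must be between 1 and 10, got {value}")
    return {
        "urls": list(urls),
        "maxPagesPerDomain": max_pages,
        "concurrency": concurrency,
        **icp,  # e.g. targetIndustries, targetSizeMin, targetTechnologies
    }


def run_actor(token, actor_id, run_input):
    """Start a run and return its dataset items (needs `pip install apify-client`).

    actor_id is a placeholder such as "<username>/<actor-name>"; this README
    does not state the real ID.
    """
    from apify_client import ApifyClient  # lazy import: build_input works without it

    client = ApifyClient(token)
    run = client.actor(actor_id).call(run_input=run_input)
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())
```

Validating the ranges locally surfaces a bad `maxPagesPerDomain` or `concurrency` value before a run is started.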
Output
Each lead is returned as a structured JSON object with the following fields:
    {
      "domain": "stripe.com",
      "url": "https://stripe.com",
      "title": "Stripe | Financial Infrastructure for the Internet",
      "description": "Stripe powers online and in-person payment processing...",
      "emails": ["sales@stripe.com"],
      "phones": ["+1-888-926-2289"],
      "socialLinks": {
        "linkedin": "https://www.linkedin.com/company/stripe",
        "twitter": "https://twitter.com/stripe",
        "facebook": "https://www.facebook.com/StripeHQ",
        "instagram": null,
        "github": "https://github.com/stripe",
        "youtube": null
      },
      "industry": {
        "primary": "SaaS and Software",
        "secondary": "E-commerce and Retail",
        "confidence": 0.485,
        "allScores": [
          { "label": "SaaS and Software", "score": 0.485 },
          { "label": "E-commerce and Retail", "score": 0.389 }
        ],
        "method": "keyword"
      },
      "companySize": {
        "estimate": "enterprise",
        "employeeRange": "1000+",
        "confidence": 0.25,
        "signals": ["Enterprise-level language detected"]
      },
      "technologies": ["React", "Next.js", "AWS", "Vercel", "Fastly", "Bootstrap", "Stripe"],
      "techCategories": {
        "frameworks": ["React", "Next.js"],
        "cms": [],
        "analytics": [],
        "marketing": [],
        "infrastructure": ["AWS", "Vercel", "Fastly"],
        "libraries": ["Bootstrap"],
        "ecommerce": ["Stripe"],
        "payments": []
      },
      "leadScore": 34,
      "scoreBreakdown": {
        "industryMatch": 15,
        "sizeMatch": 0,
        "techMatch": 10,
        "contactQuality": 0,
        "webPresence": 9
      },
      "pagesAnalyzed": 3,
      "analyzedAt": "2026-03-08T08:03:53.138Z"
    }
Output Field Reference
| Field | Type | Description |
|---|---|---|
| `domain` | string | Normalized domain name |
| `url` | string | Full URL of the homepage |
| `title` | string | Page title from `<title>` tag |
| `description` | string | Meta description |
| `emails` | string[] | Extracted email addresses (deduplicated, validated) |
| `phones` | string[] | Phone numbers from tel: links (max 10, deduplicated) |
| `socialLinks` | object | Social media profile URLs (LinkedIn, Twitter/X, Facebook, Instagram, GitHub, YouTube) |
| `industry.primary` | string | Top industry classification |
| `industry.secondary` | string \| null | Second-best industry (if confidence > threshold) |
| `industry.confidence` | number | Classification confidence (0-1) |
| `industry.method` | string | "ai" or "keyword", indicating which method produced the final result |
| `companySize.estimate` | string | Size category: micro, small, medium, large, enterprise, or unknown |
| `companySize.employeeRange` | string | Human-readable range (e.g., "51-200") |
| `companySize.signals` | string[] | Evidence used for estimation |
| `technologies` | string[] | All detected technologies |
| `techCategories` | object | Technologies grouped by category |
| `leadScore` | integer | ICP match score (0-100), higher = better fit |
| `scoreBreakdown` | object | Score components: industryMatch, sizeMatch, techMatch, contactQuality, webPresence |
| `pagesAnalyzed` | integer | Number of pages successfully crawled |
| `analyzedAt` | string | ISO 8601 timestamp |
Supported Industries
The classifier recognizes 20+ industries:
| Industry | Example Companies |
|---|---|
| SaaS and Software | Stripe, HubSpot, GitHub, Salesforce |
| E-commerce and Retail | Shopify, Amazon, Etsy |
| Marketing and Advertising | Creative agencies, ad networks |
| Finance and Payments | Banks, investment platforms, fintech |
| Healthcare and Biotech | Hospitals, pharma, telehealth |
| Education and Training | Coursera, universities, edtech |
| Manufacturing and Industrial | Factories, industrial equipment |
| Consulting and Professional Services | McKinsey, Deloitte |
| Media and Entertainment | Streaming, publishing, podcasts |
| Travel & Hospitality | Airbnb, hotels, tourism |
| Real Estate | Property management, mortgage |
| Food & Beverage | Restaurants, food delivery |
| Telecommunications | Telecom, ISPs |
| Energy | Solar, oil & gas, utilities |
| Non-Profit | Charities, foundations, NGOs |
| Legal | Law firms, legal services |
| Construction | Contractors, architecture firms |
| Transportation & Logistics | Shipping, freight, fleet management |
| Cybersecurity | InfoSec, threat detection |
| AI & Machine Learning | AI research, ML platforms |
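To give a feel for keyword mode, here is a minimal Python sketch of dictionary-based classification. The keyword lists and the `classify_by_keywords` function are invented for illustration; the Actor's real dictionaries and scoring are not published, and this sketch normalizes hit counts into shares only so the output resembles the `allScores` shape.

```python
import re

# Hypothetical keyword dictionaries; the Actor's real lists are not published.
INDUSTRY_KEYWORDS = {
    "SaaS and Software": ["saas", "software", "api", "platform"],
    "E-commerce and Retail": ["shop", "cart", "checkout", "retail"],
    "Cybersecurity": ["security", "threat", "firewall"],
}


def classify_by_keywords(text):
    """Return (label, score) pairs sorted by share of keyword hits, highest first."""
    words = re.findall(r"[a-z]+", text.lower())
    hits = {
        label: sum(words.count(keyword) for keyword in keywords)
        for label, keywords in INDUSTRY_KEYWORDS.items()
    }
    total = sum(hits.values())
    if total == 0:
        return []  # no keyword matched, so no classification is offered
    ranked = sorted(hits.items(), key=lambda item: item[1], reverse=True)
    return [(label, round(count / total, 3)) for label, count in ranked if count]
```

The first pair plays the role of `industry.primary` and the second, when present, of `industry.secondary`.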
Detected Technologies (50+)
| Category | Technologies |
|---|---|
| Frameworks | React, Next.js, Vue.js, Nuxt.js, Angular, Svelte, Gatsby, Remix |
| CMS | WordPress, Shopify, Squarespace, Wix, Webflow, Drupal, Ghost, Contentful |
| Analytics | Google Analytics, Google Tag Manager, Facebook Pixel, Hotjar, Mixpanel, Segment, Amplitude, Plausible |
| Marketing | HubSpot, Mailchimp, Intercom, Drift, Zendesk, Crisp, Salesforce, Marketo, ActiveCampaign |
| Infrastructure | Cloudflare, AWS, Vercel, Netlify, Heroku, Google Cloud, Azure, Fastly |
| Libraries | jQuery, Bootstrap, Tailwind CSS, Material UI, Font Awesome, GSAP, Three.js, D3.js |
| E-commerce | Stripe, PayPal, WooCommerce, BigCommerce, Magento |
| Payments | Stripe Payments, PayPal Checkout, Square, Braintree |
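Signature matching of this kind can be sketched in a few lines of Python. The four signatures below are common public fingerprints chosen for illustration, not the Actor's actual rule set, and `detect_technologies` is a hypothetical helper name.

```python
import re

# Illustrative signatures only; a real rule set would be larger and more careful.
SIGNATURES = {
    "WordPress": {"category": "cms", "html": r"/wp-content/"},
    "Next.js": {"category": "frameworks", "html": r"/_next/static/"},
    "Google Tag Manager": {"category": "analytics", "html": r"googletagmanager\.com"},
    "Cloudflare": {"category": "infrastructure", "header": ("server", r"cloudflare")},
}


def detect_technologies(html, headers):
    """Return {category: [technology, ...]} for every signature that matches."""
    # HTTP header names are case-insensitive, so normalize them once.
    lowered = {name.lower(): value for name, value in headers.items()}
    found = {}
    for tech, sig in SIGNATURES.items():
        matched = bool(sig.get("html") and re.search(sig["html"], html, re.I))
        if not matched and "header" in sig:
            name, pattern = sig["header"]
            matched = bool(re.search(pattern, lowered.get(name, ""), re.I))
        if matched:
            found.setdefault(sig["category"], []).append(tech)
    return found
```

Grouping matches by category as they are found yields a structure similar to the `techCategories` output field.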
Lead Score Breakdown
The lead score (0-100) is composed of five weighted dimensions:
| Dimension | Max Points | What It Measures |
|---|---|---|
| Industry Match | 30 | How well the company's industry matches your target industries |
| Size Match | 20 | How close the company size is to your target range |
| Tech Match | 20 | How many of your target technologies the company uses |
| Contact Quality | 15 | Availability of emails, phones, and social profiles |
| Web Presence | 15 | Tech sophistication, analytics usage, marketing tools, active website |
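The table above fixes only the maximum points per dimension. As a rough Python sketch, the total is the sum of the five components, each capped at its maximum. The `tech_match` rule (points proportional to the share of target technologies found) is an assumption for illustration, not the Actor's documented formula.

```python
# Maximum points per dimension, taken from the table above.
MAX_POINTS = {
    "industryMatch": 30,
    "sizeMatch": 20,
    "techMatch": 20,
    "contactQuality": 15,
    "webPresence": 15,
}


def tech_match(detected, targets):
    """Assumed rule: points proportional to the share of target technologies found."""
    if not targets:
        return 0
    found = {tech.lower() for tech in detected}
    matched = sum(1 for tech in targets if tech.lower() in found)
    return round(MAX_POINTS["techMatch"] * matched / len(targets))


def lead_score(breakdown):
    """Sum the five components, capping each at its documented maximum."""
    return sum(min(breakdown.get(key, 0), cap) for key, cap in MAX_POINTS.items())
```

Applied to the sample output's `scoreBreakdown` (15, 0, 10, 0, 9), `lead_score` gives 34, which matches the sample `leadScore`.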
Company Size Categories
| Category | Employee Range | Midpoint |
|---|---|---|
| Micro | 1-10 | 5 |
| Small | 11-50 | 30 |
| Medium | 51-200 | 125 |
| Large | 201-1000 | 600 |
| Enterprise | 1000+ | 2000 |
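The category table translates directly into a lookup. The Python sketch below makes two assumptions that the README does not confirm: a head count of exactly 1000 is treated as Large (the table lists 1000 in both Large and Enterprise), and the midpoint is what gets compared against `targetSizeMin` and `targetSizeMax`.

```python
# (category, lower bound, upper bound, midpoint) from the table above.
SIZE_CATEGORIES = [
    ("micro", 1, 10, 5),
    ("small", 11, 50, 30),
    ("medium", 51, 200, 125),
    ("large", 201, 1000, 600),
    ("enterprise", 1001, None, 2000),  # assumption: exactly 1000 counts as large
]


def size_category(employees):
    """Map an employee count to a size category, or "unknown" if unusable."""
    if employees is None or employees < 1:
        return "unknown"
    for name, _low, high, _mid in SIZE_CATEGORIES:
        if high is None or employees <= high:
            return name
    return "unknown"


def in_target_range(category, target_min, target_max):
    """Assumed check: True when the category midpoint lies inside the ICP range."""
    for name, _low, _high, mid in SIZE_CATEGORIES:
        if name == category:
            return target_min <= mid <= target_max
    return False
```

Under these assumptions, an enterprise lead (midpoint 2000) falls outside a 50-1000 target range, while a medium lead (midpoint 125) falls inside it.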
Use Cases
- Sales Prospecting — Score and prioritize leads from cold outreach lists
- Market Research — Map competitors' tech stacks and company sizes at scale
- Lead Qualification — Auto-filter leads that match your ICP before CRM import
- Data Enrichment — Append industry, size, and tech data to existing lead lists
- Competitive Intelligence — Monitor which technologies competitors are adopting
Integration Tips
Chain with Other Scrapers
Use this Actor as a post-processing step after collecting domains from:
- Google Maps Email Extractor
- LinkedIn Company Scraper
- Any domain list scraper
Export to CRM
Results can be exported as JSON, CSV, or Excel directly from the Apify dataset. Key fields map naturally to CRM fields:
- `emails` → Contact Email
- `phones` → Contact Phone
- `industry.primary` → Company Industry
- `companySize.employeeRange` → Company Size
- `leadScore` → Lead Score / Priority
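If you prefer to shape the export yourself, the mapping above can be applied to dataset items before import. This Python sketch assumes items follow the documented output schema. `to_crm_row` and `leads_to_csv` are hypothetical helpers, and joining multiple emails with "; " is an arbitrary choice that your CRM may not accept.

```python
import csv
import io

CRM_FIELDS = ["Contact Email", "Contact Phone", "Company Industry", "Company Size", "Lead Score"]


def to_crm_row(lead):
    """Flatten one lead object into the CRM columns listed above."""
    return {
        "Contact Email": "; ".join(lead.get("emails", [])),
        "Contact Phone": "; ".join(lead.get("phones", [])),
        "Company Industry": lead.get("industry", {}).get("primary", ""),
        "Company Size": lead.get("companySize", {}).get("employeeRange", ""),
        "Lead Score": lead.get("leadScore", 0),
    }


def leads_to_csv(leads, min_score=0):
    """Return CSV text for leads at or above min_score, best score first."""
    kept = [lead for lead in leads if lead.get("leadScore", 0) >= min_score]
    kept.sort(key=lambda lead: lead.get("leadScore", 0), reverse=True)
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=CRM_FIELDS)
    writer.writeheader()
    writer.writerows(to_crm_row(lead) for lead in kept)
    return buffer.getvalue()
```

Filtering on `min_score` before import keeps low-fit leads out of the CRM.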
Webhook Integration
Set up an Apify webhook to automatically send enriched leads to your CRM, Slack, or any HTTP endpoint when a run completes.
Performance
| Metric | Value |
|---|---|
| Speed | ~10-15 seconds per domain (with AI), ~5-8 seconds (keyword-only) |
| Memory | Recommended 4096 MB (for AI model). Minimum 1024 MB (keyword-only) |
| Concurrency | Up to 10 domains in parallel |
| AI Model | distilbert-base-uncased-mnli (~100MB, loaded once, runs locally) |
Limitations
- Contact extraction relies on publicly visible information (emails in HTML, tel: links, social media links)
- Company size estimation is approximate — based on text patterns, careers pages, and tech sophistication
- Some websites may block automated crawlers, resulting in fewer pages analyzed
- AI classification is memory-intensive (4096 MB recommended); use keyword-only mode (1024 MB minimum) for lower memory usage
Cost
This Actor runs on the Apify platform. Costs depend on compute usage:
- Memory: 4096 MB recommended (AI mode) or 1024 MB (keyword mode)
- Compute units: ~0.01-0.02 CU per domain analyzed
- No external API keys or additional costs required
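For budgeting, the figures in this README can be turned into a rough estimate. This Python sketch simply multiplies the quoted ranges. `estimate_run` is a hypothetical helper, and the runtime model (domains processed in equal batches of `concurrency`) is a simplifying assumption, so treat the results as order-of-magnitude only.

```python
import math


def estimate_run(domains, concurrency=3, ai=True):
    """Estimate compute units, wall-clock seconds, and memory for a run."""
    # Per-domain seconds quoted in the Performance table.
    per_domain = (10, 15) if ai else (5, 8)
    # Assumption: domains are processed in full batches of `concurrency`.
    batches = math.ceil(domains / concurrency)
    return {
        # ~0.01-0.02 CU per domain, as quoted in the Cost section.
        "compute_units": (round(0.01 * domains, 2), round(0.02 * domains, 2)),
        "seconds": (batches * per_domain[0], batches * per_domain[1]),
        "memory_mb": 4096 if ai else 1024,
    }
```

For example, 100 domains at the default concurrency of 3 in AI mode come out at roughly 1 to 2 compute units under these assumptions.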