Website Intelligence Extractor
Pricing
from $10.00 / 1,000 results
A powerful Apify actor that crawls websites to extract key intelligence, including emails, phone numbers, social media profiles, technology stack, SEO metadata, and structured data (JSON-LD). Ideal for lead generation, competitive analysis, marketing research, and SEO audits.
Developer
Jamshaid Arif
Website Intelligence Extractor
A powerful Apify actor that crawls any website and extracts actionable intelligence: emails, phone numbers, social media profiles, technology stack, SEO metadata, and structured data (JSON-LD).
Perfect for lead generation, competitive analysis, marketing research, and SEO auditing.
What It Extracts
| Category | Details |
|---|---|
| Emails | All email addresses found on pages + mailto: links, with junk filtering |
| Phones | Phone numbers from text + tel: links, international format support |
| Social Media | Facebook, Twitter/X, LinkedIn, Instagram, YouTube, GitHub, TikTok, Reddit, Threads, Bluesky, and 15+ more platforms |
| Tech Stack | CMS (WordPress, Shopify, Webflow, ...), Frameworks (React, Next.js, Vue, ...), Analytics (GA, Mixpanel, PostHog, ...), Marketing tools (HubSpot, Intercom, ...), CDN, Hosting, Payments; 60+ technologies in total |
| SEO Data | Title, meta description, canonical URL, OG tags, Twitter cards, heading hierarchy, word count, image alt audit, internal/external links, and a computed SEO Score (0-100) |
| Structured Data | JSON-LD schemas (Organization, Product, Article, FAQ, etc.) |
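To make the "mailto: links, with junk filtering" row more concrete, here is a minimal Python sketch of how email extraction of this kind could work. It is not the actor's actual implementation: the regular expressions, the `JUNK_SUFFIXES` and `JUNK_DOMAINS` rules, and the `extract_emails` name are all assumptions made for illustration.

```python
import re
from html import unescape

# Assumed pattern: deliberately simple, not a full RFC 5322 matcher.
EMAIL_RE = re.compile(
    r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)*\.[A-Za-z]{2,}"
)
# Captures the address part of href="mailto:..." (stops before any ?subject=).
MAILTO_RE = re.compile(r'href=["\']mailto:([^"\'?]+)', re.IGNORECASE)

# Hypothetical junk rules: asset names such as logo@2x.png and
# addresses on domains that usually belong to embedded tooling.
JUNK_SUFFIXES = (".png", ".jpg", ".jpeg", ".gif", ".svg", ".webp")
JUNK_DOMAINS = {"sentry.io", "wixpress.com"}


def extract_emails(html: str) -> list[str]:
    """Return unique, lower-cased emails from page HTML in first-seen order."""
    text = unescape(html)
    candidates = MAILTO_RE.findall(text) + EMAIL_RE.findall(text)
    seen: dict[str, None] = {}  # dict keeps insertion order and de-duplicates
    for raw in candidates:
        email = raw.strip().lower()
        if not EMAIL_RE.fullmatch(email):
            continue  # malformed mailto target
        if email.endswith(JUNK_SUFFIXES):
            continue  # image file name that only looks like an email
        if email.split("@", 1)[1] in JUNK_DOMAINS:
            continue  # address on a filtered domain
        seen.setdefault(email, None)
    return list(seen)
```

Collecting mailto: targets first and free-text matches second means an address that appears in both places is reported once.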
Quick Start
Input Example
{
  "startUrls": [{ "url": "https://example.com" }],
  "maxPages": 30,
  "maxDepth": 3,
  "extractEmails": true,
  "extractPhones": true,
  "extractSocials": true,
  "detectTechStack": true,
  "extractSEO": true,
  "extractStructuredData": true,
  "proxyConfiguration": {
    "useApifyProxy": true
  }
}
Input Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| startUrls | array | required | URLs to crawl |
| maxPages | integer | 20 | Max pages per run (1-500) |
| maxDepth | integer | 3 | Link depth to follow (0-10) |
| extractEmails | boolean | true | Find email addresses |
| extractPhones | boolean | true | Find phone numbers |
| extractSocials | boolean | true | Find social media links |
| detectTechStack | boolean | true | Identify technologies |
| extractSEO | boolean | true | Collect SEO metadata |
| extractStructuredData | boolean | true | Parse JSON-LD |
| proxyConfiguration | object | Apify proxy | Proxy settings |
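The defaults and ranges in the table can be checked on the client side before a run is started. The Python sketch below builds an input object from them; `build_run_input`, `DEFAULTS`, and `RANGES` are hypothetical helpers, not part of the actor, and the proxy default is assumed to be `{"useApifyProxy": true}` as in the input example.

```python
import copy
from typing import Any

# Defaults and ranges copied from the parameter table; the proxy default
# is an assumption based on the input example.
DEFAULTS: dict[str, Any] = {
    "maxPages": 20,
    "maxDepth": 3,
    "extractEmails": True,
    "extractPhones": True,
    "extractSocials": True,
    "detectTechStack": True,
    "extractSEO": True,
    "extractStructuredData": True,
    "proxyConfiguration": {"useApifyProxy": True},
}
RANGES = {"maxPages": (1, 500), "maxDepth": (0, 10)}


def build_run_input(start_urls: list[str], **overrides: Any) -> dict[str, Any]:
    """Merge overrides into the documented defaults and validate ranges."""
    if not start_urls:
        raise ValueError("startUrls is required")
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    run_input = {**copy.deepcopy(DEFAULTS), **overrides}
    for name, (low, high) in RANGES.items():
        value = run_input[name]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, int):
            raise ValueError(f"{name} must be an integer")
        if not low <= value <= high:
            raise ValueError(f"{name} must be between {low} and {high}")
    run_input["startUrls"] = [{"url": url} for url in start_urls]
    return run_input
```

The resulting dictionary has the same shape as the input example, so it could be passed as `run_input` when starting the actor, for instance through the `apify-client` package's `ApifyClient(token).actor(actor_id).call(run_input=...)`. The actor ID is not stated in this document and would have to be taken from the store page.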
Output Format
Per-Page Dataset Record
{
  "url": "https://example.com/about",
  "statusCode": 200,
  "crawledAt": "2025-01-15T10:30:00.000Z",
  "title": "About Us - Example Corp",
  "metaDescription": "Learn about Example Corp...",
  "seoScore": 82,
  "wordCount": 1450,
  "emails": ["hello@example.com", "careers@example.com"],
  "phones": ["+1 (555) 123-4567"],
  "socialLinks": {
    "twitter": ["https://twitter.com/examplecorp"],
    "linkedin": ["https://linkedin.com/company/example"],
    "github": ["https://github.com/example"]
  },
  "techStack": [
    { "name": "Next.js", "category": "Framework" },
    { "name": "Vercel", "category": "Hosting" },
    { "name": "Google Analytics", "category": "Analytics" },
    { "name": "Stripe", "category": "Payments" }
  ],
  "seo": {
    "title": "About Us - Example Corp",
    "titleLength": 25,
    "metaDescription": "Learn about Example Corp...",
    "metaDescriptionLength": 145,
    "canonicalUrl": "https://example.com/about",
    "language": "en",
    "openGraph": { "title": "...", "image": "..." },
    "headings": {
      "h1": ["About Example Corp"],
      "h2": ["Our Mission", "Our Team", "Contact"]
    },
    "totalImages": 12,
    "imagesWithoutAlt": 2,
    "internalLinks": 34,
    "externalLinks": 8,
    "seoScore": 82
  },
  "structuredData": [
    {
      "@type": "Organization",
      "name": "Example Corp",
      "url": "https://example.com"
    }
  ]
}
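Records with this shape are straightforward to post-process. As one possible illustration, the Python sketch below flags pages that may need SEO attention, using only fields shown in the record above. The `seo_issues` helper and the `min_score` threshold of 70 are assumptions for the example, not values defined by the actor.

```python
from typing import Any, Iterable


def seo_issues(
    records: Iterable[dict[str, Any]], min_score: int = 70
) -> list[dict[str, Any]]:
    """List pages with a low SEO score, missing alt text, or no canonical URL."""
    report = []
    for rec in records:
        seo = rec.get("seo") or {}
        problems = []
        # seoScore appears both at the top level and inside "seo".
        score = rec.get("seoScore", seo.get("seoScore"))
        if score is not None and score < min_score:
            problems.append(f"seoScore {score} < {min_score}")
        missing_alt = seo.get("imagesWithoutAlt", 0)
        if missing_alt:
            total = seo.get("totalImages", "?")
            problems.append(f"{missing_alt} of {total} images lack alt text")
        if not seo.get("canonicalUrl"):
            problems.append("no canonical URL")
        if problems:
            report.append({"url": rec.get("url"), "problems": problems})
    return report
```

Pages without problems are left out of the report, so an empty list means every record passed all three checks.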
Domain Summary (Key-Value Store key: DOMAIN_SUMMARY)
After crawling completes, a rolled-up summary is saved:
{
  "totalPagesCrawled": 25,
  "totalUniqueEmails": ["hello@example.com", "sales@example.com"],
  "totalUniquePhones": ["+1 (555) 123-4567"],
  "socialProfiles": {
    "twitter": ["https://twitter.com/examplecorp"],
    "linkedin": ["https://linkedin.com/company/example"]
  },
  "technologiesDetected": [
    { "name": "Next.js", "category": "Framework" },
    { "name": "Stripe", "category": "Payments" }
  ]
}
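The actor's own aggregation logic is not shown here, but a roll-up of this kind can be sketched in a few lines of Python. The `summarize` function below is a hypothetical reconstruction. It assumes that the summary de-duplicates values across per-page records while keeping first-seen order.

```python
from typing import Any


def summarize(records: list[dict[str, Any]]) -> dict[str, Any]:
    """Roll per-page records up into a domain-level summary."""
    # Dicts are used as ordered sets: keys de-duplicate and keep order.
    emails: dict[str, None] = {}
    phones: dict[str, None] = {}
    socials: dict[str, dict[str, None]] = {}
    tech: dict[tuple[str, str], None] = {}

    for rec in records:
        for email in rec.get("emails", []):
            emails.setdefault(email, None)
        for phone in rec.get("phones", []):
            phones.setdefault(phone, None)
        for platform, links in rec.get("socialLinks", {}).items():
            bucket = socials.setdefault(platform, {})
            for link in links:
                bucket.setdefault(link, None)
        for item in rec.get("techStack", []):
            tech.setdefault((item["name"], item["category"]), None)

    return {
        "totalPagesCrawled": len(records),
        "totalUniqueEmails": list(emails),
        "totalUniquePhones": list(phones),
        "socialProfiles": {p: list(links) for p, links in socials.items()},
        "technologiesDetected": [
            {"name": name, "category": category} for name, category in tech
        ],
    }
```

The same function can also be applied to a filtered subset of the dataset, for example only pages under a given path, to get a summary that the stored DOMAIN_SUMMARY does not provide.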
Use Cases
- Lead Generation: Crawl prospect websites to harvest contact emails and phone numbers
- Competitive Analysis: Discover which tech stack competitors use
- SEO Auditing: Bulk-audit SEO health across hundreds of pages
- Market Research: Map social media presence across an industry
- Sales Intelligence: Enrich CRM records with fresh website data
- Content Analysis: Extract structured data and content metrics
License
MIT. See LICENSE for details.