Website Tech & Contact Scanner
Pricing
from $5.00 / 1,000 website scan results
Detect the tech stack (Shopify, WordPress, Next.js…) and extract email, phone & social profiles (FB, IG, TikTok, Pinterest…) from any website. Bypasses Cloudflare. Bulk scan up to 1000 URLs in ~2 min. Ideal for lead enrichment & competitor research.
Rating: 5.0 (1)
Developer: Alexandre Manguis
Scan websites to detect technology stacks, extract contact details, and discover social media profiles — with structured, consistent dataset output suitable for lead generation, market research, and competitive analysis.
No proxy required. No API keys required. Works out of the box with the prefilled input.
What it does
For each URL you provide, the actor:
- Fetches the homepage and optionally crawls relevant subpages (`/contact`, `/about`, etc.)
- Extracts emails, phone numbers, and social media links
- Detects the technology stack from HTML signatures, script tags, and HTTP headers
- Calculates business presence signals
- Pushes a structured, schema-consistent item to the dataset
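As an illustration of the contact-extraction step listed above, here is a minimal sketch of pulling unique emails out of raw HTML. The regular expression, the asset-suffix filter, and the function name `extract_emails` are assumptions made for this example, not the actor's actual implementation.

```python
import re
from html import unescape
from typing import List

# Hypothetical, deliberately conservative pattern (not the actor's real regex).
EMAIL_RE = re.compile(
    r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)*\.[A-Za-z]{2,}"
)

# Asset names such as "logo@2x.png" look like emails but are not.
ASSET_SUFFIXES = (".png", ".jpg", ".jpeg", ".gif", ".svg", ".webp")


def extract_emails(html: str) -> List[str]:
    """Return unique, lower-cased emails in order of first appearance."""
    text = unescape(html)  # decode entities such as &#64; for "@"
    seen = {}
    for match in EMAIL_RE.findall(text):
        email = match.lower()
        if email.endswith(ASSET_SUFFIXES):
            continue  # skip image file names that match the pattern
        seen.setdefault(email, None)  # dict keeps insertion order
    return list(seen)
```

Lower-casing before de-duplication means `Help@Example.com` and `help@example.com` collapse into one entry, which fits the "unique emails" wording in the output schema.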
Key features
- Self-contained: no external API calls, no API keys, works on every Apify plan
- Multi-platform social detection: Facebook, Twitter/X, LinkedIn, TikTok, Instagram, Pinterest, Yelp, Google Maps
- Categorised tech detection: CMS, frontend framework, analytics, tag manager, ads/tracking, payments, chat, email marketing, CDN, hosting
- Subpage crawling: discovers `/contact`, `/about`, `/privacy`, `/imprint` for richer contact extraction
- Graceful failure: one unreachable URL never stops the run; a valid dataset item is always emitted
- Fast: HTTP-first (no browser), easily processes hundreds of URLs per run
- Stable output: every item follows a fixed schema whether the scan succeeds or fails
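The graceful-failure and stable-output features above can be pictured with the following sketch. It assumes a hypothetical `scan` callable that returns a partial result or raises; `empty_item` and `safe_scan` are illustrative names, and copying the input URL into `normalizedUrl` is a placeholder for real normalisation.

```python
from datetime import datetime, timezone
from typing import Any, Callable, Dict

SOCIAL_KEYS = ("facebook", "twitter", "linkedin", "tiktok",
               "instagram", "pinterest", "yelp", "google")
TECH_KEYS = ("cms", "frontendFramework", "analytics", "tagManager",
             "adsTracking", "payment", "chatWidget", "emailMarketing",
             "cdn", "hosting", "misc")
SIGNAL_KEYS = ("hasContactPage", "hasAboutPage", "hasPhone", "hasEmail",
               "hasSocialPresence", "hasEcommerceSignals",
               "hasTrackingPixels", "hasChatWidget")


def empty_item(input_url: str) -> Dict[str, Any]:
    """Build an item with every documented field set to a neutral value."""
    return {
        "inputUrl": input_url,
        "normalizedUrl": input_url,  # assumption: normalisation happens elsewhere
        "finalUrl": None,
        "statusCode": None,
        "title": None,
        "metaDescription": None,
        "language": None,
        "emails": [],
        "phoneNumbers": [],
        "contactPages": [],
        "aboutPages": [],
        "socialLinks": {key: [] for key in SOCIAL_KEYS},
        "technologies": {key: [] for key in TECH_KEYS},
        "businessSignals": {key: False for key in SIGNAL_KEYS},
        "scanTimestamp": datetime.now(timezone.utc).isoformat(),
        "success": False,
        "error": None,
    }


def safe_scan(input_url: str, scan: Callable[[str], Dict[str, Any]]) -> Dict[str, Any]:
    """Run `scan` and always return a full-schema item, even on failure."""
    item = empty_item(input_url)
    try:
        item.update(scan(input_url))  # overlay whatever the scan produced
        item["success"] = True
    except Exception as exc:  # one bad URL must not stop the run
        item["error"] = f"{type(exc).__name__}: {exc}"
    return item
```

Starting from a fully populated skeleton and overlaying scan results is one simple way to guarantee that failed and successful items share the same keys.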
Input
| Field | Type | Default | Description |
|---|---|---|---|
| `startUrls` | array | `[{url:"https://example.com"},{url:"https://en.wikipedia.org"}]` | URLs to scan. Each item must be `{"url":"..."}`. |
| `maxPagesPerSite` | integer | `3` | Max pages per site (homepage + subpages). |
| `requestTimeoutSecs` | integer | `30` | Per-request HTTP timeout. |
| `detectTechnologies` | boolean | `true` | Detect CMS, frameworks, analytics, etc. |
| `detectContacts` | boolean | `true` | Extract emails and phone numbers. |
| `detectSocialMedia` | boolean | `true` | Extract social media profile links. |
| `includeSubpages` | boolean | `true` | Crawl contact/about subpages. |
| `maxConcurrency` | integer | `5` | Parallel requests (max 20). |
| `debugMode` | boolean | `false` | Enable verbose logging. |
Example input
{"startUrls": [{ "url": "https://allbirds.com" },{ "url": "https://framer.com" }],"maxPagesPerSite": 3,"requestTimeoutSecs": 30,"detectTechnologies": true,"detectContacts": true,"detectSocialMedia": true,"includeSubpages": true,"maxConcurrency": 5}
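When the URL list comes from a spreadsheet or CRM export, it can help to build the input programmatically. The sketch below is one possible helper; `build_run_input` is a hypothetical name, and prefixing bare domains with `https://` is an assumption of this example rather than documented actor behaviour.

```python
from typing import Any, Dict, Iterable
from urllib.parse import urlparse

MAX_CONCURRENCY = 20  # upper bound stated in the input table


def build_run_input(
    urls: Iterable[str],
    max_pages_per_site: int = 3,
    max_concurrency: int = 5,
    include_subpages: bool = True,
) -> Dict[str, Any]:
    """Turn a flat list of URLs into the documented input object."""
    start_urls = []
    for raw in urls:
        url = raw.strip()
        if not url:
            continue  # ignore blank rows
        if "://" not in url:
            url = "https://" + url  # assumption: default to HTTPS
        if not urlparse(url).netloc:
            raise ValueError(f"not a valid URL: {raw!r}")
        start_urls.append({"url": url})
    return {
        "startUrls": start_urls,
        "maxPagesPerSite": max(1, max_pages_per_site),
        "requestTimeoutSecs": 30,
        "detectTechnologies": True,
        "detectContacts": True,
        "detectSocialMedia": True,
        "includeSubpages": include_subpages,
        # Clamp to the documented range instead of failing the run.
        "maxConcurrency": min(max(1, max_concurrency), MAX_CONCURRENCY),
    }
```

The returned dictionary serialises directly to the JSON shape shown in the example input.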
Output
One dataset item per input URL. The schema is always consistent.
| Field | Type | Description |
|---|---|---|
| `inputUrl` | `string` | URL as provided in input |
| `normalizedUrl` | `string` | URL with scheme normalised |
| `finalUrl` | `string \| null` | Final URL after redirects |
| `statusCode` | `integer \| null` | HTTP response code |
| `title` | `string \| null` | HTML page title |
| `metaDescription` | `string \| null` | `<meta name="description">` content |
| `language` | `string \| null` | HTML `lang` attribute (e.g. `en`) |
| `emails` | `string[]` | Unique emails found across all scanned pages |
| `phoneNumbers` | `string[]` | Phone numbers found |
| `contactPages` | `string[]` | Discovered `/contact` subpage URLs |
| `aboutPages` | `string[]` | Discovered `/about` subpage URLs |
| `socialLinks` | `object` | Social links by platform (see below) |
| `technologies` | `object` | Tech stack by category (see below) |
| `businessSignals` | `object` | Boolean presence flags (see below) |
| `scanTimestamp` | `string` | ISO 8601 timestamp |
| `success` | `boolean` | `false` if the site was unreachable |
| `error` | `string \| null` | Error message, `null` on success |
socialLinks structure
{"facebook": ["https://www.facebook.com/mybrand"],"twitter": ["https://twitter.com/mybrand"],"linkedin": ["https://www.linkedin.com/company/mybrand"],"tiktok": ["https://www.tiktok.com/@mybrand"],"instagram": ["https://www.instagram.com/mybrand"],"pinterest": ["https://www.pinterest.com/mybrand"],"yelp": ["https://www.yelp.com/biz/mybrand"],"google": ["https://maps.google.com/maps?q=mybrand"]}
technologies structure
{"cms": ["WordPress"],"frontendFramework":["Next.js"],"analytics": ["Google Analytics 4"],"tagManager": ["Google Tag Manager"],"adsTracking": ["Meta Pixel"],"payment": ["Stripe"],"chatWidget": ["Intercom"],"emailMarketing": ["Mailchimp"],"cdn": ["Cloudflare"],"hosting": ["Vercel"],"misc": ["jQuery", "Google Fonts"]}
businessSignals structure
{"hasContactPage": true,"hasAboutPage": false,"hasPhone": true,"hasEmail": true,"hasSocialPresence": true,"hasEcommerceSignals": false,"hasTrackingPixels": true,"hasChatWidget": false}
Example output item
{"inputUrl": "https://allbirds.com","normalizedUrl": "https://allbirds.com/","finalUrl": "https://www.allbirds.com/","statusCode": 200,"title": "Allbirds | Sustainable & Comfortable Shoes, Clothing & More","metaDescription": "Allbirds makes the world's most comfortable shoes using natural, sustainable materials.","language": "en","emails": ["help@allbirds.com"],"phoneNumbers": [],"contactPages": ["https://www.allbirds.com/contact"],"aboutPages": ["https://www.allbirds.com/about"],"socialLinks": {"facebook": ["https://www.facebook.com/allbirds"],"twitter": ["https://twitter.com/allbirds"],"linkedin": ["https://www.linkedin.com/company/allbirds"],"tiktok": ["https://www.tiktok.com/@allbirds"],"instagram": ["https://www.instagram.com/allbirds"],"pinterest": [],"yelp": [],"google": []},"technologies": {"cms": ["Shopify"],"frontendFramework": [],"analytics": ["Google Analytics 4"],"tagManager": ["Google Tag Manager"],"adsTracking": ["Meta Pixel"],"payment": [],"chatWidget": [],"emailMarketing": ["Klaviyo"],"cdn": ["Cloudflare"],"hosting": [],"misc": ["Google Fonts"]},"businessSignals": {"hasContactPage": true,"hasAboutPage": true,"hasPhone": false,"hasEmail": true,"hasSocialPresence": true,"hasEcommerceSignals": true,"hasTrackingPixels": true,"hasChatWidget": false},"scanTimestamp": "2026-04-01T12:00:00.000Z","success": true,"error": null}
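The `businessSignals` flags in the example above are consistent with a simple derivation from the other fields. The sketch below shows one plausible mapping; the rules for `hasEcommerceSignals` and `hasTrackingPixels`, the `ECOMMERCE_CMS` set, and the function name are assumptions, since the README does not spell out how the actor computes them.

```python
from typing import Any, Dict

# Assumption: CMS names treated as e-commerce platforms in this sketch.
ECOMMERCE_CMS = {"Shopify", "Magento", "BigCommerce", "PrestaShop"}


def derive_business_signals(item: Dict[str, Any]) -> Dict[str, bool]:
    """Compute boolean presence flags from an item's extracted data."""
    tech = item.get("technologies", {})
    social = item.get("socialLinks", {})
    has_shop_cms = bool(ECOMMERCE_CMS & set(tech.get("cms", [])))
    return {
        "hasContactPage": bool(item.get("contactPages")),
        "hasAboutPage": bool(item.get("aboutPages")),
        "hasPhone": bool(item.get("phoneNumbers")),
        "hasEmail": bool(item.get("emails")),
        # True when at least one platform has a non-empty list of links.
        "hasSocialPresence": any(social.values()),
        "hasEcommerceSignals": has_shop_cms or bool(tech.get("payment")),
        "hasTrackingPixels": bool(tech.get("adsTracking")),
        "hasChatWidget": bool(tech.get("chatWidget")),
    }
```

Applied to the Allbirds example item, this mapping reproduces the flags shown there, which is a useful sanity check but not proof that the actor uses the same rules.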
Supported social media platforms
| Platform | Profiles detected |
|---|---|
| Facebook | Company pages, personal pages |
| Twitter / X | twitter.com and x.com profiles |
| LinkedIn | /company/, /in/, /school/ profiles |
| TikTok | @handle profiles |
| Instagram | User profiles (excludes post and reel links) |
| Pinterest | Board and profile pages |
| Yelp | Business listing pages (/biz/) |
| Google | Google Maps, Google Business profile, and Maps short links |
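To make the profile rules in the table concrete, here is a rough sketch of classifying a link by host and path. It covers only a subset of the platforms (Twitter/X, LinkedIn, TikTok, Instagram, Yelp); the regular expressions and the name `classify_social_link` are assumptions for illustration and may differ from the actor's own matching.

```python
import re
from typing import Optional
from urllib.parse import urlparse

# (platform, host pattern, path pattern); hypothetical rules based on the table.
RULES = [
    ("twitter", r"(^|\.)(twitter|x)\.com$",
     r"^/(?!(?:share|intent|home)/?$)[A-Za-z0-9_]{1,15}/?$"),
    ("linkedin", r"(^|\.)linkedin\.com$", r"^/(company|in|school)/[^/]+"),
    ("tiktok", r"(^|\.)tiktok\.com$", r"^/@[^/]+/?$"),
    # Single path segment only, so /p/... and /reel/... links are rejected.
    ("instagram", r"(^|\.)instagram\.com$", r"^/(?!p/|reel/|reels/)[^/]+/?$"),
    ("yelp", r"(^|\.)yelp\.com$", r"^/biz/[^/]+"),
]


def classify_social_link(url: str) -> Optional[str]:
    """Return the platform key for a profile URL, or None if it is not one."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    for platform, host_re, path_re in RULES:
        if re.search(host_re, host) and re.match(path_re, parsed.path):
            return platform
    return None
```

Anchoring the host pattern on a leading dot or start of string avoids treating look-alike domains (for example a host that merely ends in `x.com`) as a social platform.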
Technology detection
CMS: WordPress, Shopify, Wix, Squarespace, Webflow, Drupal, Joomla, Magento, BigCommerce, Ghost, Framer, PrestaShop, TYPO3, HubSpot CMS, Contentful, Notion
Frontend Frameworks: Next.js, Nuxt.js, Gatsby, Angular, Vue.js, React, Svelte, Astro, Remix, Ember
Analytics: Google Analytics 4, Google Analytics UA, Segment, Hotjar, Mixpanel, Amplitude, Plausible, Matomo, Heap, FullStory, Microsoft Clarity, Yandex.Metrika
Tag Managers: Google Tag Manager, Adobe Launch, Tealium, Segment
Ads / Tracking Pixels: Meta Pixel, Google Ads, Twitter Pixel, TikTok Pixel, LinkedIn Pixel, Pinterest Pixel, Criteo, Outbrain, Taboola
Payments: Stripe, PayPal, Braintree, Square, Mollie, Klarna, Adyen
Chat Widgets: Intercom, Drift, Zendesk, Crisp, Tidio, Tawk.to, LiveChat, Freshchat, Olark
Email Marketing: Mailchimp, Klaviyo, Brevo, ActiveCampaign, Mailerlite, ConvertKit, Omnisend, HubSpot Forms, Drip
CDN: Cloudflare, AWS CloudFront, Fastly, jsDelivr, unpkg, Bunny CDN, Akamai
Hosting: Vercel, Netlify, GitHub Pages, Firebase, Heroku, Render, AWS S3
Misc: Bootstrap, jQuery, Tailwind CSS, Font Awesome, Google Fonts, reCAPTCHA
Detection is based on HTML source signatures, script tags, and HTTP headers. Results are best-effort heuristics, so accuracy is not guaranteed.
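Signature-based detection of this kind can be sketched in a few lines. The signature list below is a small illustrative sample built from widely known markers (for example `cdn.shopify.com` asset URLs or the `cf-ray` response header); it is an assumption of this example and not the actor's real rule set.

```python
import re
from typing import Dict, List, Mapping

# (category, technology, source, pattern); illustrative signatures only.
SIGNATURES = [
    ("cms", "WordPress", "html", r"/wp-content/|/wp-includes/"),
    ("cms", "Shopify", "html", r"cdn\.shopify\.com"),
    ("frontendFramework", "Next.js", "html", r'/_next/static/|id="__NEXT_DATA__"'),
    ("tagManager", "Google Tag Manager", "html", r"googletagmanager\.com/gtm\.js"),
    ("cdn", "Cloudflare", "headers", r"^(server: cloudflare|cf-ray: )"),
    ("hosting", "Vercel", "headers", r"^x-vercel-id: "),
]


def detect_technologies(html: str, headers: Mapping[str, str]) -> Dict[str, List[str]]:
    """Match each signature against the page source or the response headers."""
    # Flatten headers to "name: value" lines so one regex can test them.
    header_text = "\n".join(f"{k.lower()}: {v.lower()}" for k, v in headers.items())
    found: Dict[str, List[str]] = {}
    for category, name, source, pattern in SIGNATURES:
        haystack = html if source == "html" else header_text
        if re.search(pattern, haystack, re.IGNORECASE | re.MULTILINE):
            found.setdefault(category, []).append(name)
    return found
```

Unlike the actor's output, this sketch returns only the categories that matched; filling in empty lists for the remaining categories would be a straightforward extension.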
Use cases
- Lead generation — enrich a list of prospect websites with tech stack and contact info
- Market research — identify which CMS or analytics tools competitors use
- Sales intelligence — filter leads by detected technology (e.g. Shopify stores)
- Agency outreach — find contact emails and social profiles for cold outreach
- Competitive analysis — compare tech stacks across multiple companies
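For the sales-intelligence case, filtering exported dataset items by detected technology takes only a short helper. This sketch assumes the items have already been loaded as dictionaries in the documented output shape; `filter_by_technology` is a hypothetical name.

```python
from typing import Any, Dict, Iterable, List


def filter_by_technology(
    items: Iterable[Dict[str, Any]], category: str, name: str
) -> List[Dict[str, Any]]:
    """Keep successful items whose `technologies[category]` contains `name`."""
    wanted = name.casefold()
    return [
        item
        for item in items
        if item.get("success")
        and any(
            tech.casefold() == wanted
            for tech in item.get("technologies", {}).get(category, [])
        )
    ]
```

For example, `filter_by_technology(items, "cms", "Shopify")` would return only the reachable sites where Shopify was detected.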
Performance notes
- The actor uses HTTP requests only (no browser/Playwright).
- With default settings (`maxConcurrency=5`, `maxPagesPerSite=3`), expect roughly 10–30 URLs per minute depending on target response times.
- For large lists, increase `maxConcurrency` up to 20 (monitor memory usage).
- Sites protected by Cloudflare or similar WAFs may return partial content; the actor will still emit a success item with whatever data was available.
- Set `includeSubpages: false` to scan only homepages and maximise throughput.
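The default-settings throughput quoted above translates into a rough run-time range. The helper below is simple arithmetic on that 10–30 URLs per minute figure; `estimate_minutes` is a hypothetical name, and the sketch makes no assumption about how throughput changes with higher concurrency.

```python
import math
from typing import Tuple


def estimate_minutes(
    url_count: int, urls_per_minute: Tuple[int, int] = (10, 30)
) -> Tuple[int, int]:
    """Return (best case, worst case) run time in whole minutes."""
    slow, fast = urls_per_minute
    return (math.ceil(url_count / fast), math.ceil(url_count / slow))
```

With the default range, `estimate_minutes(300)` gives `(10, 30)`, meaning a 300-URL list would take somewhere between 10 and 30 minutes.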
Limitations
- Single-page apps (SPA) — pages that require JavaScript execution may yield fewer results, since the actor uses plain HTTP requests. Tech signals embedded in JS bundles are often still detectable via script src attributes.
- CAPTCHA / bot protection — heavily protected sites may block requests or return incomplete pages.
- Private pages — login-protected content is not accessible.
- Phone number accuracy — phone extraction uses pattern matching and may produce false positives on some pages.
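The false-positive risk in phone extraction is easier to see with a concrete pattern. The sketch below uses a hypothetical regular expression plus two filters (digit count and ISO-date look-alikes); it is not the actor's actual logic, and real pages will still produce edge cases.

```python
import re
from typing import List

# Hypothetical pattern: optional "+", then digits mixed with common separators.
PHONE_RE = re.compile(r"(?<![\w.])\+?\d[\d\s().-]{6,18}\d(?![\w])")


def extract_phone_numbers(text: str) -> List[str]:
    """Return unique phone-like strings, dropping obvious false positives."""
    results: List[str] = []
    for match in PHONE_RE.finditer(text):
        candidate = match.group().strip()
        digits = re.sub(r"\D", "", candidate)
        if not 8 <= len(digits) <= 15:  # E.164 allows at most 15 digits
            continue
        if re.fullmatch(r"\d{4}-\d{2}-\d{2}", candidate):
            continue  # looks like an ISO date, not a phone number
        if candidate not in results:
            results.append(candidate)
    return results
```

Dates, order numbers, and prices are typical sources of false matches, so downstream validation of `phoneNumbers` is advisable before using them for outreach.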
Privacy & compliance
This actor fetches publicly accessible pages only. It does not:
- Submit forms
- Log in to any service
- Store or transmit personal data beyond the configured Apify dataset
- Bypass authentication or access controls
Users are responsible for ensuring their use complies with applicable data protection regulations (GDPR, CCPA, etc.) and the terms of service of the target websites.
Changelog
v2.0 (2026-04)
- Full rewrite — actor is now completely self-contained, no external API required
- New input format: `startUrls` (array of `{url}` objects) replacing flat `urls` array
- New structured output: categorised `technologies`, `socialLinks`, `businessSignals`
- Added Yelp and Google Maps social link detection
- Added `/contact`, `/about`, `/privacy`, `/imprint` subpage crawling
- Added 64 unit tests + integration test
- Fixed input schema: `required: []` with safe prefilled values for automated testing
- Improved Magento detection (reduced false positives)
v1.0 (2026-03)
- Initial release