Website Tech & Contact Scanner

Pricing

from $5.00 / 1,000 website scan results

Detect the tech stack (Shopify, WordPress, Next.js…) and extract email, phone & social profiles (FB, IG, TikTok, Pinterest…) from any website. Bypasses Cloudflare. Bulk scan up to 1000 URLs in ~2 min. Ideal for lead enrichment & competitor research.

Rating: 5.0 (1)
Developer: Alexandre Manguis
Maintained by Community
Bookmarked: 2
Total users: 4
Monthly active users: 1
Last modified: 4 days ago

Scan websites to detect technology stacks, extract contact details, and discover social media profiles — with structured, consistent dataset output suitable for lead generation, market research, and competitive analysis.

No proxy required. No API keys required. Works out of the box with the prefilled input.


What it does

For each URL you provide, the actor:

  1. Fetches the homepage and optionally crawls relevant subpages (/contact, /about, etc.)
  2. Extracts emails, phone numbers, and social media links
  3. Detects the technology stack from HTML signatures, script tags, and HTTP headers
  4. Calculates business presence signals
  5. Pushes a structured, schema-consistent item to the dataset
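
Steps 1 and 2 above can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the actor's actual code: the names `SUBPAGE_HINTS`, `discover_subpages`, and `extract_emails` are hypothetical, and the email regex is a common approximation that can produce false positives.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

# Hypothetical hint list, based on the subpages this README names.
SUBPAGE_HINTS = ("/contact", "/about", "/privacy", "/imprint")
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


class LinkCollector(HTMLParser):
    """Collects every href found on <a> tags."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def discover_subpages(html, base_url, max_pages=3):
    """Return same-host subpage URLs, leaving one page of budget for the homepage."""
    collector = LinkCollector()
    collector.feed(html)
    host = urlparse(base_url).netloc
    found = []
    for href in collector.hrefs:
        url = urljoin(base_url, href)
        parsed = urlparse(url)
        if parsed.netloc != host:
            continue  # skips external links and mailto: hrefs
        path = parsed.path.lower()
        if any(path.startswith(hint) for hint in SUBPAGE_HINTS) and url not in found:
            found.append(url)
    return found[: max(max_pages - 1, 0)]


def extract_emails(html):
    """Return unique, lowercased emails in order of first appearance."""
    seen = []
    for match in EMAIL_RE.findall(html):
        email = match.lower()
        if email not in seen:
            seen.append(email)
    return seen
```

The cap of `max_pages - 1` reflects the assumption that the homepage counts toward maxPagesPerSite, as the Input table suggests.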

Key features

  • Self-contained — no external API calls, no API keys, works on every Apify plan
  • Multi-platform social detection — Facebook, Twitter/X, LinkedIn, TikTok, Instagram, Pinterest, Yelp, Google Maps
  • Categorised tech detection — CMS, frontend framework, analytics, tag manager, ads/tracking, payments, chat, email marketing, CDN, hosting
  • Subpage crawling — discovers /contact, /about, /privacy, /imprint for richer contact extraction
  • Graceful failure — one unreachable URL never stops the run; a valid dataset item is always emitted
  • Fast — HTTP-first (no browser), easily processes hundreds of URLs per run
  • Stable output — every item follows a fixed schema whether the scan succeeds or fails

Input

Field               Type     Default                                                         Description
startUrls           array    [{url:"https://example.com"},{url:"https://en.wikipedia.org"}]  URLs to scan. Each item must be {"url":"..."}.
maxPagesPerSite     integer  3                                                               Max pages per site (homepage + subpages).
requestTimeoutSecs  integer  30                                                              Per-request HTTP timeout in seconds.
detectTechnologies  boolean  true                                                            Detect CMS, frameworks, analytics, etc.
detectContacts      boolean  true                                                            Extract emails and phone numbers.
detectSocialMedia   boolean  true                                                            Extract social media profile links.
includeSubpages     boolean  true                                                            Crawl contact/about subpages.
maxConcurrency      integer  5                                                               Parallel requests (max 20).
debugMode           boolean  false                                                           Enable verbose logging.

Example input

{
  "startUrls": [
    { "url": "https://allbirds.com" },
    { "url": "https://framer.com" }
  ],
  "maxPagesPerSite": 3,
  "requestTimeoutSecs": 30,
  "detectTechnologies": true,
  "detectContacts": true,
  "detectSocialMedia": true,
  "includeSubpages": true,
  "maxConcurrency": 5
}
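
When the URL list comes from a spreadsheet or CRM export, it can help to build the input programmatically. The sketch below assumes only the field names and the maxConcurrency limit of 20 documented in the Input table; the helper name `build_run_input` and the lower bounds it enforces are hypothetical.

```python
import json


def build_run_input(urls, max_pages_per_site=3, max_concurrency=5, include_subpages=True):
    """Build an input object using the field names from the Input table."""
    if not 1 <= max_concurrency <= 20:
        # The README documents a maximum of 20; the lower bound of 1 is assumed.
        raise ValueError("maxConcurrency must be between 1 and 20")
    if max_pages_per_site < 1:
        # Assumed lower bound: at least the homepage is scanned.
        raise ValueError("maxPagesPerSite must be at least 1")
    return {
        # Each entry must be an object of the form {"url": "..."}.
        "startUrls": [{"url": u.strip()} for u in urls if u.strip()],
        "maxPagesPerSite": max_pages_per_site,
        "includeSubpages": include_subpages,
        "maxConcurrency": max_concurrency,
    }


if __name__ == "__main__":
    run_input = build_run_input(["https://allbirds.com", "https://framer.com"])
    print(json.dumps(run_input, indent=2))
```

Fields left out of the dictionary fall back to the defaults listed in the Input table. With the official `apify-client` package, a dictionary like this is normally passed as `run_input` to `ApifyClient(token).actor(actor_id).call(...)`; the actor ID is not stated in this README, so it is omitted here.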

Output

One dataset item per input URL. Every item has the same fields, whether the scan succeeds or fails.

Field            Type            Description
inputUrl         string          URL as provided in input
normalizedUrl    string          URL with scheme normalised
finalUrl         string | null   Final URL after redirects
statusCode       integer | null  HTTP response code
title            string | null   HTML page title
metaDescription  string | null   <meta name="description"> content
language         string | null   HTML lang attribute (e.g. en)
emails           string[]        Unique emails found across all scanned pages
phoneNumbers     string[]        Phone numbers found
contactPages     string[]        Discovered /contact subpage URLs
aboutPages       string[]        Discovered /about subpage URLs
socialLinks      object          Social links by platform (see below)
technologies     object          Tech stack by category (see below)
businessSignals  object          Boolean presence flags (see below)
scanTimestamp    string          ISO 8601 timestamp
success          boolean         false if the site was unreachable
error            string | null   Error message, null on success

socialLinks structure

{
  "facebook": ["https://www.facebook.com/mybrand"],
  "twitter": ["https://twitter.com/mybrand"],
  "linkedin": ["https://www.linkedin.com/company/mybrand"],
  "tiktok": ["https://www.tiktok.com/@mybrand"],
  "instagram": ["https://www.instagram.com/mybrand"],
  "pinterest": ["https://www.pinterest.com/mybrand"],
  "yelp": ["https://www.yelp.com/biz/mybrand"],
  "google": ["https://maps.google.com/maps?q=mybrand"]
}

technologies structure

{
  "cms": ["WordPress"],
  "frontendFramework": ["Next.js"],
  "analytics": ["Google Analytics 4"],
  "tagManager": ["Google Tag Manager"],
  "adsTracking": ["Meta Pixel"],
  "payment": ["Stripe"],
  "chatWidget": ["Intercom"],
  "emailMarketing": ["Mailchimp"],
  "cdn": ["Cloudflare"],
  "hosting": ["Vercel"],
  "misc": ["jQuery", "Google Fonts"]
}

businessSignals structure

{
  "hasContactPage": true,
  "hasAboutPage": false,
  "hasPhone": true,
  "hasEmail": true,
  "hasSocialPresence": true,
  "hasEcommerceSignals": false,
  "hasTrackingPixels": true,
  "hasChatWidget": false
}
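
This README does not state how the flags are computed. One plausible reading is that each flag summarises the other fields of the item, which the sketch below illustrates. The function name `derive_business_signals`, the `ECOMMERCE_CMS` set, and the rule for hasEcommerceSignals are assumptions made for illustration only.

```python
# Assumed rule: these CMS names from the Technology detection list count as e-commerce platforms.
ECOMMERCE_CMS = {"Shopify", "Magento", "BigCommerce", "PrestaShop"}


def derive_business_signals(item):
    """Recompute presence flags from the list fields of a dataset item."""
    tech = item.get("technologies", {})
    social = item.get("socialLinks", {})
    has_shop_cms = bool(ECOMMERCE_CMS & set(tech.get("cms", [])))
    return {
        "hasContactPage": bool(item.get("contactPages")),
        "hasAboutPage": bool(item.get("aboutPages")),
        "hasPhone": bool(item.get("phoneNumbers")),
        "hasEmail": bool(item.get("emails")),
        # True when at least one platform has a non-empty list.
        "hasSocialPresence": any(social.values()),
        # Assumption: a shop CMS or any payment provider counts as an e-commerce signal.
        "hasEcommerceSignals": has_shop_cms or bool(tech.get("payment")),
        "hasTrackingPixels": bool(tech.get("adsTracking")),
        "hasChatWidget": bool(tech.get("chatWidget")),
    }
```

A helper like this can serve as a sanity check on exported data, or to rebuild the flags after you have filtered the email or phone lists yourself.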

Example output item

{
  "inputUrl": "https://allbirds.com",
  "normalizedUrl": "https://allbirds.com/",
  "finalUrl": "https://www.allbirds.com/",
  "statusCode": 200,
  "title": "Allbirds | Sustainable & Comfortable Shoes, Clothing & More",
  "metaDescription": "Allbirds makes the world's most comfortable shoes using natural, sustainable materials.",
  "language": "en",
  "emails": ["help@allbirds.com"],
  "phoneNumbers": [],
  "contactPages": ["https://www.allbirds.com/contact"],
  "aboutPages": ["https://www.allbirds.com/about"],
  "socialLinks": {
    "facebook": ["https://www.facebook.com/allbirds"],
    "twitter": ["https://twitter.com/allbirds"],
    "linkedin": ["https://www.linkedin.com/company/allbirds"],
    "tiktok": ["https://www.tiktok.com/@allbirds"],
    "instagram": ["https://www.instagram.com/allbirds"],
    "pinterest": [],
    "yelp": [],
    "google": []
  },
  "technologies": {
    "cms": ["Shopify"],
    "frontendFramework": [],
    "analytics": ["Google Analytics 4"],
    "tagManager": ["Google Tag Manager"],
    "adsTracking": ["Meta Pixel"],
    "payment": [],
    "chatWidget": [],
    "emailMarketing": ["Klaviyo"],
    "cdn": ["Cloudflare"],
    "hosting": [],
    "misc": ["Google Fonts"]
  },
  "businessSignals": {
    "hasContactPage": true,
    "hasAboutPage": true,
    "hasPhone": false,
    "hasEmail": true,
    "hasSocialPresence": true,
    "hasEcommerceSignals": true,
    "hasTrackingPixels": true,
    "hasChatWidget": false
  },
  "scanTimestamp": "2026-04-01T12:00:00.000Z",
  "success": true,
  "error": null
}
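
Because every item shares this shape, post-processing can stay simple. The sketch below filters successful items by detected technology and flattens them into lead rows. The helper names (`uses_technology`, `to_lead_row`, `shopify_leads`) and the choice of columns are hypothetical; `items` is assumed to be a list of dataset items already loaded as dictionaries.

```python
def uses_technology(item, name, category=None):
    """True if `name` appears in the given category, or in any category when None."""
    tech = item.get("technologies", {})
    categories = [category] if category else list(tech)
    wanted = name.lower()
    return any(wanted == t.lower() for c in categories for t in tech.get(c, []))


def to_lead_row(item):
    """Flatten one dataset item into a row with the first value of each list."""
    social = item.get("socialLinks", {})
    return {
        "url": item.get("finalUrl") or item.get("normalizedUrl") or item["inputUrl"],
        "email": (item.get("emails") or [""])[0],
        "phone": (item.get("phoneNumbers") or [""])[0],
        "linkedin": (social.get("linkedin") or [""])[0],
    }


def shopify_leads(items):
    """Keep only successful scans where Shopify was detected as the CMS."""
    return [
        to_lead_row(i)
        for i in items
        if i.get("success") and uses_technology(i, "Shopify", "cms")
    ]
```

The rows can be written out with `csv.DictWriter` for import into a CRM. Failed scans are skipped explicitly because they still appear in the dataset with success set to false.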

Supported social media platforms

Platform     Profiles detected
Facebook     Company pages, personal pages
Twitter / X  twitter.com and x.com profiles
LinkedIn     /company/, /in/, /school/ profiles
TikTok       @handle profiles
Instagram    User profiles (excludes post and reel links)
Pinterest    Board and profile pages
Yelp         Business listing pages (/biz/)
Google       Google Maps, Google Business profile, and Maps short links
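
The rules in the table above can be approximated with host and path checks. The following is a rough sketch, not the actor's implementation: `classify_social_url` and `group_social_links` are hypothetical names, and the Google hosts listed (maps.google.com, google.com/maps, maps.app.goo.gl, g.page) are an assumption about what counts as a Maps or Business link.

```python
from urllib.parse import urlparse

PLATFORMS = ("facebook", "twitter", "linkedin", "tiktok",
             "instagram", "pinterest", "yelp", "google")


def classify_social_url(url):
    """Return the platform key for a profile URL, or None if it is not a profile."""
    parsed = urlparse(url)
    host = parsed.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    parts = [p for p in parsed.path.split("/") if p]
    first = parts[0].lower() if parts else ""

    if host in ("twitter.com", "x.com"):
        return "twitter" if parts else None
    if host == "facebook.com":
        return "facebook" if parts else None
    if host == "linkedin.com":
        return "linkedin" if first in ("company", "in", "school") and len(parts) > 1 else None
    if host == "tiktok.com":
        return "tiktok" if first.startswith("@") else None
    if host == "instagram.com":
        # Post and reel links are not profiles.
        return "instagram" if parts and first not in ("p", "reel", "reels") else None
    if host == "pinterest.com":
        return "pinterest" if parts else None
    if host == "yelp.com":
        return "yelp" if first == "biz" and len(parts) > 1 else None
    if host in ("maps.google.com", "maps.app.goo.gl", "g.page"):
        return "google"  # assumed set of Maps and Business hosts
    if host == "google.com" and first == "maps":
        return "google"
    return None


def group_social_links(urls):
    """Group URLs by platform, always returning all eight keys."""
    grouped = {platform: [] for platform in PLATFORMS}
    for url in urls:
        platform = classify_social_url(url)
        if platform and url not in grouped[platform]:
            grouped[platform].append(url)
    return grouped
```

A production version would also need to reject share and intent links (for example twitter.com/intent/tweet), which this sketch would classify as profiles.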

Technology detection

CMS: WordPress, Shopify, Wix, Squarespace, Webflow, Drupal, Joomla, Magento, BigCommerce, Ghost, Framer, PrestaShop, TYPO3, HubSpot CMS, Contentful, Notion

Frontend Frameworks: Next.js, Nuxt.js, Gatsby, Angular, Vue.js, React, Svelte, Astro, Remix, Ember

Analytics: Google Analytics 4, Google Analytics UA, Segment, Hotjar, Mixpanel, Amplitude, Plausible, Matomo, Heap, FullStory, Microsoft Clarity, Yandex.Metrika

Tag Managers: Google Tag Manager, Adobe Launch, Tealium, Segment

Ads / Tracking Pixels: Meta Pixel, Google Ads, Twitter Pixel, TikTok Pixel, LinkedIn Pixel, Pinterest Pixel, Criteo, Outbrain, Taboola

Payments: Stripe, PayPal, Braintree, Square, Mollie, Klarna, Adyen

Chat Widgets: Intercom, Drift, Zendesk, Crisp, Tidio, Tawk.to, LiveChat, Freshchat, Olark

Email Marketing: Mailchimp, Klaviyo, Brevo, ActiveCampaign, Mailerlite, ConvertKit, Omnisend, HubSpot Forms, Drip

CDN: Cloudflare, AWS CloudFront, Fastly, jsDelivr, unpkg, Bunny CDN, Akamai

Hosting: Vercel, Netlify, GitHub Pages, Firebase, Heroku, Render, AWS S3

Misc: Bootstrap, jQuery, Tailwind CSS, Font Awesome, Google Fonts, reCAPTCHA

Detection relies on HTML source signatures, script tags, and HTTP headers. Results are best-effort heuristics, so accuracy is not guaranteed.
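
Signature-based detection of this kind usually comes down to substring checks against the page source and response headers. The sketch below shows the idea with a handful of widely known markers. The signature tables and the function name `detect_technologies` are illustrative assumptions, not the actor's real rule set.

```python
# Illustrative markers only; a real rule set would be much larger.
HTML_SIGNATURES = {
    "cms": {
        "WordPress": ["/wp-content/", "/wp-includes/"],
        "Shopify": ["cdn.shopify.com"],
    },
    "frontendFramework": {"Next.js": ["/_next/static/", "__NEXT_DATA__"]},
    "tagManager": {"Google Tag Manager": ["googletagmanager.com/gtm.js"]},
    "adsTracking": {"Meta Pixel": ["connect.facebook.net", "fbevents.js"]},
    "misc": {"Google Fonts": ["fonts.googleapis.com"]},
}

# (header name, substring expected in the header value)
HEADER_SIGNATURES = {
    "cdn": {"Cloudflare": ("server", "cloudflare")},
    "hosting": {"Vercel": ("server", "vercel")},
}


def detect_technologies(html, headers):
    """Return detected technology names grouped by category."""
    body = html.lower()
    lowered = {k.lower(): str(v).lower() for k, v in headers.items()}
    result = {}
    for category, techs in HTML_SIGNATURES.items():
        hits = [name for name, needles in techs.items()
                if any(needle.lower() in body for needle in needles)]
        result.setdefault(category, []).extend(hits)
    for category, techs in HEADER_SIGNATURES.items():
        hits = [name for name, (header, needle) in techs.items()
                if needle in lowered.get(header, "")]
        result.setdefault(category, []).extend(hits)
    return result
```

Substring matching is cheap and works without a browser, but it explains the best-effort caveat: a page that merely mentions a marker string would be reported as using that technology.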


Use cases

  • Lead generation — enrich a list of prospect websites with tech stack and contact info
  • Market research — identify which CMS or analytics tools competitors use
  • Sales intelligence — filter leads by detected technology (e.g. Shopify stores)
  • Agency outreach — find contact emails and social profiles for cold outreach
  • Competitive analysis — compare tech stacks across multiple companies

Performance notes

  • The actor uses HTTP requests only (no browser/Playwright).
  • With default settings (maxConcurrency=5, maxPagesPerSite=3), expect roughly 10–30 URLs per minute depending on target response times.
  • For large lists, increase maxConcurrency up to 20 (monitor memory usage).
  • Sites protected by Cloudflare or similar WAFs may return partial content; the actor will still emit a success item with whatever data was available.
  • Set includeSubpages: false to scan only homepages and maximise throughput.
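
For planning purposes, the throughput figure above translates into a run-time range with simple arithmetic. The helper below is a hypothetical convenience that assumes the default-settings rate of 10 to 30 URLs per minute quoted above; it does not model how the rate changes with higher concurrency.

```python
def estimate_minutes(url_count, urls_per_minute_low=10, urls_per_minute_high=30):
    """Return (best_case, worst_case) run time in minutes for a list of URLs."""
    if url_count < 0:
        raise ValueError("url_count must not be negative")
    best_case = url_count / urls_per_minute_high
    worst_case = url_count / urls_per_minute_low
    return best_case, worst_case


if __name__ == "__main__":
    best, worst = estimate_minutes(300)
    print(f"300 URLs: roughly {best:.0f} to {worst:.0f} minutes at default settings")
```

For a list of 300 URLs this gives between 10 and 30 minutes.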

Limitations

  • Single-page apps (SPA) — pages that require JavaScript execution may yield fewer results, since the actor uses plain HTTP requests. Tech signals embedded in JS bundles are often still detectable via script src attributes.
  • CAPTCHA / bot protection — heavily protected sites may block requests or return incomplete pages.
  • Private pages — login-protected content is not accessible.
  • Phone number accuracy — phone extraction uses pattern matching and may produce false positives on some pages.
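
If phone false positives matter for your use case, a light post-filter on the phoneNumbers field can remove the most obvious ones. The sketch below is one possible approach, not part of the actor: it keeps candidates with 7 to 15 digits (15 is the E.164 maximum, 7 is an assumed minimum) and drops strings that look like dates. The names `plausible_phone` and `filter_phones` are hypothetical.

```python
import re

# Strings such as 2024-01-15 have enough digits to pass a length check, so reject them explicitly.
DATE_RE = re.compile(r"^\d{4}[-/.]\d{1,2}[-/.]\d{1,2}$")


def plausible_phone(candidate, min_digits=7, max_digits=15):
    """Heuristic check: digit count within range and not shaped like a date."""
    text = candidate.strip()
    if DATE_RE.match(text):
        return False
    digits = re.sub(r"\D", "", text)
    return min_digits <= len(digits) <= max_digits


def filter_phones(candidates):
    """Keep plausible numbers, de-duplicated by their digits."""
    seen = set()
    kept = []
    for candidate in candidates:
        digits = re.sub(r"\D", "", candidate)
        if plausible_phone(candidate) and digits not in seen:
            seen.add(digits)
            kept.append(candidate.strip())
    return kept
```

This remains a heuristic: order numbers or IDs of the right length will still pass, so manual review is advisable before outreach.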

Privacy & compliance

This actor fetches publicly accessible pages only. It does not:

  • Submit forms
  • Log in to any service
  • Store or transmit personal data beyond the configured Apify dataset
  • Bypass authentication or access controls

Users are responsible for ensuring their use complies with applicable data protection regulations (GDPR, CCPA, etc.) and the terms of service of the target websites.


Changelog

v2.0 (2026-04)

  • Full rewrite — actor is now completely self-contained, no external API required
  • New input format: startUrls (array of {url} objects) replaces the flat urls array
  • New structured output: categorised technologies, socialLinks, businessSignals
  • Added Yelp and Google Maps social link detection
  • Added /contact, /about, /privacy, /imprint subpage crawling
  • Added 64 unit tests + integration test
  • Fixed input schema: required: [] with safe prefilled values for automated testing
  • Improved Magento detection (reduced false positives)

v1.0 (2026-03)

  • Initial release

Keywords

website scanner, tech stack detector, technology detection, CMS detection, contact extractor, email extractor, email scraper, phone number extractor, social media scraper, social links extractor, Facebook scraper, LinkedIn scraper, Twitter scraper, TikTok scraper, Instagram scraper, Pinterest scraper, Yelp scraper, Google Maps scraper, lead enrichment, lead generation, B2B data enrichment, prospect research, sales intelligence, competitive analysis, market research, website analysis, SEO tool, WordPress detector, Shopify detector, Wix detector, Squarespace detector, Next.js detector, React detector, Google Analytics detector, Google Tag Manager detector, Meta Pixel detector, Stripe detector, Intercom detector, Klaviyo detector, Cloudflare detector, Vercel detector, Netlify detector, bulk URL scanner, batch website scanner, web scraping, data extraction, HTML parser, HTTP scraper, no proxy required, no API key required, startup tool, agency tool, SaaS tool, ecommerce detector, WooCommerce detector, Magento detector, BigCommerce detector, PrestaShop detector, Hotjar detector, Mixpanel detector, Segment detector, Plausible detector, Matomo detector, PayPal detector, Braintree detector, Drift detector, Zendesk detector, Crisp detector, Tidio detector, Tawk.to detector, Mailchimp detector, ActiveCampaign detector, Brevo detector, ConvertKit detector, contact page finder, about page finder, business signals, digital presence analysis, outbound sales, cold outreach, domain enrichment, URL enrichment