Website Tech Stack Detector

Pricing

from $8.00 / 1,000 results

Detect technologies on any website using the real Wappalyzer browser extension via Playwright, not HTTP guessing. Identifies CMS, JS frameworks, analytics, CDN, payments, and 1,000+ more. Built for bulk lead qualification, competitive analysis, and tech market research.

Developer

Saregaa

Maintained by Community

๐Ÿ” Website Tech Stack Detector (Wappalyzer Engine)

High-precision technology stack detection powered by the real Wappalyzer browser extension and Playwright.
A free, open-source alternative to paywalled BuiltWith and Wappalyzer APIs: pay only for what you scan.


What it does

This actor launches a real headless Chromium browser with the Wappalyzer extension injected, visits each target URL, and returns a full technology fingerprint, exactly like running Wappalyzer in your own browser, but at scale and via API.

It detects CMS, JavaScript frameworks, analytics tools, CDN providers, marketing pixels, databases, web servers, ecommerce platforms, and more, with confidence scores, version numbers where available, and optional security risk flags.


✨ Key features

  • Real browser fingerprinting: uses the actual Wappalyzer extension via Playwright, not HTTP header guessing
  • Confidence scoring: every technology is tagged as high, medium, or low confidence
  • Version detection: captures version strings where Wappalyzer exposes them (e.g. Next.js 15.1.12)
  • Category grouping: results are grouped by category (Analytics, CMS, CDN, etc.) for easy filtering
  • Security risk flags: optional checks for outdated jQuery, CMS without CDN/WAF, forms without CAPTCHA, missing security headers, and exposed X-Powered-By
  • Bulk processing: scan hundreds of URLs in a single run
  • Run summary: a final dataset record with tech distribution stats and cost breakdown
  • Graceful error handling: failed URLs are recorded with error details and do not abort the run

📥 Input

| Field | Type | Default | Description |
|---|---|---|---|
| urls | string[] | - | List of URLs to scan. Prefix with https:// or leave bare; normalization is automatic |
| include_risk | boolean | true | Run security risk checks on each result |

Example input:

{
  "urls": [
    "https://apify.com",
    "react.dev",
    "https://www.gymshark.com"
  ],
  "include_risk": true
}
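
An input object like the one above can be generated from a raw domain list. The following is a minimal sketch, not the actor's own code: `with_scheme` and `build_input` are hypothetical helper names, and the actor's internal normalization is not documented here. The sketch only pre-applies the https:// prefix and drops duplicates before submission.

```python
import json


def with_scheme(raw: str) -> str:
    """Return the URL with an https:// prefix if it has no scheme yet."""
    url = raw.strip()
    if url.startswith(("http://", "https://")):
        return url
    return "https://" + url


def build_input(raw_urls, include_risk=True):
    """Build an input object with the two documented fields.

    Hypothetical helper: blank entries are skipped and exact
    duplicates (after prefixing) are removed, keeping first-seen order.
    """
    seen = set()
    urls = []
    for raw in raw_urls:
        if not raw.strip():
            continue
        url = with_scheme(raw)
        if url not in seen:
            seen.add(url)
            urls.append(url)
    return {"urls": urls, "include_risk": include_risk}


if __name__ == "__main__":
    payload = build_input(["https://apify.com", "react.dev", " react.dev "])
    print(json.dumps(payload, indent=2))
```

Adding the scheme up front is optional, since bare domains are accepted, but it keeps the submitted list unambiguous.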

🔌 Proxy

A proxy is required for reliable operation. The actor uses a single browser instance and rotates requests through datacenter proxies to avoid blocks and rate limits. Datacenter proxies are sufficient; residential proxies are not needed.

You can use:

  • Apify Proxy (datacenter): available directly in the actor's proxy settings
  • Your own proxy: pass it via the standard Apify proxy configuration

Without a proxy, many sites will block or rate-limit the scanner after just a few requests.


โฑ๏ธ Performance

The actor runs a single browser instance sequentially through the URL list.

MetricValue
Throughput~120 URLs / hour
100 URLs~50 min
500 URLs~4 hours
1,000 URLs~8 hours

For large batches, consider splitting across multiple actor runs.
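
To plan a split, the throughput figure above can be turned into a rough estimate. This is a sketch under stated assumptions: `estimate_hours` and `split_batches` are hypothetical helpers, the 120 URLs/hour rate is the approximate figure from the table, and the batch size of 250 is an arbitrary illustration rather than a recommended value.

```python
URLS_PER_HOUR = 120  # approximate throughput quoted in the table above


def estimate_hours(url_count: int) -> float:
    """Rough sequential run time in hours for one actor run."""
    return url_count / URLS_PER_HOUR


def split_batches(urls, batch_size=250):
    """Split a URL list into consecutive batches, one per actor run."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [urls[i:i + batch_size] for i in range(0, len(urls), batch_size)]


if __name__ == "__main__":
    urls = [f"https://example{i}.com" for i in range(1000)]
    batches = split_batches(urls)
    for number, batch in enumerate(batches, start=1):
        print(f"run {number}: {len(batch)} URLs, ~{estimate_hours(len(batch)):.1f} h")
```

Batches submitted as separate runs can execute in parallel, so wall-clock time is bounded by the largest batch rather than the total list.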


📤 Output

Each scanned URL produces one JSON record in the dataset:

{
  "url": "https://apify.com",
  "url_normalized": "apify.com",
  "scanned_at": "2026-05-26T17:02:05+00:00",
  "fetch_method": "wappalyzer_playwright",
  "status_code": 200,
  "technologies": [
    {
      "name": "Next.js",
      "category": "JavaScript frameworks",
      "confidence": "high",
      "version": "16.2.6",
      "detected_by": ["wappalyzer_browser_extension"]
    }
  ],
  "categories": {
    "JavaScript frameworks": ["React", "Next.js"],
    "Analytics": ["Google Analytics", "Microsoft Clarity"]
  },
  "risk_flags": [],
  "tech_count": 22,
  "error": null
}

The last record in every dataset is a run_summary with aggregated stats, top technology distribution across all scanned sites, and estimated cost breakdown.
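
Because the summary is stored alongside the per-URL records, downstream code usually needs to separate the two. A minimal sketch, assuming the dataset items have already been loaded into a Python list in dataset order; `split_dataset` and `technology_counts` are hypothetical helpers, and the summary's internal fields are not relied on because they are not specified here.

```python
from collections import Counter


def split_dataset(items):
    """Separate per-URL records from the trailing run_summary record.

    Assumes `items` preserves dataset order, so the summary is last.
    """
    if not items:
        return [], None
    return items[:-1], items[-1]


def technology_counts(results):
    """Count how many scanned sites each technology name appears on."""
    counts = Counter()
    for record in results:
        for tech in record.get("technologies") or []:
            counts[tech["name"]] += 1
    return counts


if __name__ == "__main__":
    items = [
        {"url": "https://a.example", "technologies": [{"name": "React"}]},
        {"url": "https://b.example", "technologies": [{"name": "React"}, {"name": "Shopify"}]},
        {"placeholder": "run_summary record"},
    ]
    results, summary = split_dataset(items)
    print(technology_counts(results).most_common())
```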

Output fields

| Field | Description |
|---|---|
| url | Original input URL |
| url_normalized | Hostname only, for deduplication |
| scanned_at | ISO 8601 timestamp |
| fetch_method | Always wappalyzer_playwright |
| status_code | 200 if technologies were detected, 0 on failure |
| technologies | Array of detected tech objects with name, category, confidence, version |
| categories | Technologies grouped by category |
| http_headers | Security-relevant headers (server, CSP, HSTS, etc.) |
| risk_flags | Array of security risk objects (see below) |
| tech_count | Total number of detected technologies |
| error | null on success, error object on failure |
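
Since failed URLs stay in the dataset with a non-null `error`, a consumer can separate them for a retry run. A minimal sketch using only the `url` and `error` fields documented above; `partition_results` and `retry_urls` are hypothetical helpers, and the structure of the error object itself is not assumed.

```python
def partition_results(results):
    """Split per-URL records into successful and failed lists."""
    succeeded, failed = [], []
    for record in results:
        if record.get("error") is None:
            succeeded.append(record)
        else:
            failed.append(record)
    return succeeded, failed


def retry_urls(failed):
    """Collect the original input URLs of failed records for a re-run."""
    return [record["url"] for record in failed]


if __name__ == "__main__":
    results = [
        {"url": "https://apify.com", "status_code": 200, "error": None},
        {"url": "https://down.example", "status_code": 0, "error": {"note": "placeholder"}},
    ]
    succeeded, failed = partition_results(results)
    print(len(succeeded), "ok;", "retry:", retry_urls(failed))
```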

๐Ÿ›ก๏ธ Security risk flags

When include_risk: true, the actor runs lightweight security checks and adds flags to each result:

| Code | Level | Trigger |
|---|---|---|
| CMS_WITHOUT_CDN | medium | WordPress/Drupal/Joomla detected without Cloudflare, Fastly, Akamai, or similar |
| OUTDATED_JQUERY | medium | jQuery version < 3.x (known XSS vectors) |
| JQUERY_ON_ECOMMERCE | low | jQuery detected on a shop with unconfirmed version |
| FORMS_WITHOUT_CAPTCHA | low | Gravity Forms / Typeform / Formstack without reCAPTCHA or hCaptcha |
| EXPOSED_X_POWERED_BY | low | X-Powered-By header reveals PHP/ASP.NET/Express version |
| MISSING_SECURITY_HEADERS | low | Two or more of CSP, X-Frame-Options, X-Content-Type-Options, HSTS are absent |
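
To make two of the triggers above concrete, here is an illustrative re-implementation. It is a sketch of the documented rules, not the actor's code: the function names and the shape of the returned flag dictionaries (`code`, `level`, plus one detail key) are assumptions, as is the idea that headers arrive as a name-to-value mapping.

```python
# Standard header names for the four checks named in the table.
SECURITY_HEADERS = (
    "content-security-policy",
    "x-frame-options",
    "x-content-type-options",
    "strict-transport-security",
)


def missing_security_headers(headers):
    """Flag a response when two or more of the four headers are absent."""
    present = {name.lower() for name in headers}
    missing = [name for name in SECURITY_HEADERS if name not in present]
    if len(missing) >= 2:
        return {"code": "MISSING_SECURITY_HEADERS", "level": "low", "missing": missing}
    return None


def outdated_jquery(technologies):
    """Flag jQuery when its detected major version is below 3."""
    for tech in technologies:
        if tech.get("name") != "jQuery":
            continue
        version = tech.get("version") or ""
        major = version.split(".")[0]
        if major.isdigit() and int(major) < 3:
            return {"code": "OUTDATED_JQUERY", "level": "medium", "version": version}
    return None


if __name__ == "__main__":
    print(missing_security_headers({"Server": "nginx"}))
    print(outdated_jquery([{"name": "jQuery", "version": "1.12.4"}]))
```

An unknown jQuery version produces no OUTDATED_JQUERY flag in this sketch, which mirrors the table's separate handling of unconfirmed versions.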

💡 Use cases

  • Lead enrichment & sales prospecting: identify which CRM, chat, or marketing stack a prospect uses before outreach
  • Competitive analysis: benchmark your tech choices against competitors or industry peers
  • Market research: map technology adoption across a list of domains
  • Security audits: surface missing headers and outdated dependencies at scale
  • Tech stack migrations: inventory what needs replacing before a platform switch
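
For the prospecting and market-research cases above, the `categories` field of each record is usually enough. A minimal sketch, assuming per-URL records shaped like the documented output; `sites_with_category` is a hypothetical helper name.

```python
def sites_with_category(results, category):
    """Map each URL to the technologies it uses in one category.

    Records without that category (or without a categories field)
    are left out of the result.
    """
    matches = {}
    for record in results:
        names = (record.get("categories") or {}).get(category)
        if names:
            matches[record["url"]] = list(names)
    return matches


if __name__ == "__main__":
    results = [
        {
            "url": "https://apify.com",
            "categories": {
                "JavaScript frameworks": ["React", "Next.js"],
                "Analytics": ["Google Analytics", "Microsoft Clarity"],
            },
        },
        {"url": "https://no-analytics.example", "categories": {}},
    ]
    for url, names in sites_with_category(results, "Analytics").items():
        print(url, "->", ", ".join(names))
```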

โš–๏ธ How it compares

This actornexgendata/wappalyzer-replacementmisterkhan/tech-stack-scannerWappalyzer APIBuiltWith
Detection engineReal browser + extensionOSS fingerprint rules (HTTP)Multi-tier HTTPClosed-sourceProprietary
Browser renderingโœ… YesโŒ NoโŒ Noโœ… Yesโœ… Yes
Confidence scoresโœ…โŒโœ…โœ…โœ…
Version detectionโœ…Partialโœ…โœ…โœ…
Security risk flagsโœ… Built-inโŒโŒโŒโŒ
Price / 1,000 URLs~$8~$10~$5$250/mo cap$500+/mo
Open sourceโœ…โœ…โŒโŒโŒ

The key differentiator: HTTP-only detectors miss technologies that are injected client-side (analytics pixels, chat widgets, A/B testing tools). A real browser with the Wappalyzer extension catches them reliably.


💰 Pricing

This actor uses pay-per-result pricing on the Apify platform, with no subscription and no minimums.

| Volume | Estimated cost |
|---|---|
| 100 URLs | ~$0.84 |
| 1,000 URLs | ~$8.04 |
| 10,000 URLs | ~$80.04 |

Costs include the Apify platform run fee ($0.035/run) plus $0.008 per scanned URL.
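
The estimates in the table follow directly from those two figures. A small sketch of the arithmetic: `estimate_cost` is a hypothetical helper, the `runs` parameter assumes the run fee is charged once per actor run, and `Decimal` is used so that half-cent totals round up the same way the table does.

```python
from decimal import Decimal, ROUND_HALF_UP

RUN_FEE = Decimal("0.035")  # platform fee per run, from the text above
PER_URL = Decimal("0.008")  # price per scanned URL, from the text above


def estimate_cost(url_count: int, runs: int = 1) -> Decimal:
    """Estimated cost in USD, rounded to cents."""
    total = RUN_FEE * runs + PER_URL * url_count
    return total.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


if __name__ == "__main__":
    for count in (100, 1_000, 10_000):
        print(f"{count} URLs: ~${estimate_cost(count)}")
```

Splitting a batch across several runs adds one run fee per extra run, which is small next to the per-URL charge.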

New to Apify? The free tier includes enough credits to scan ~60 URLs before any payment is required.


🔧 Tips for best results

  • Always configure a proxy: datacenter proxies are sufficient, residential are not required
  • Include https:// in URLs for fastest processing (normalization adds a small overhead)
  • Large batches (500+ URLs) run best split across multiple actor runs for better reliability
  • SPAs and JS-heavy sites (React, Next.js, Angular) are where this actor shines over HTTP-only alternatives: the real browser executes all client-side code before fingerprinting

📋 Example results

Detected on https://apify.com (22 technologies): React, Next.js 16.2.6, Algolia, styled-components, Turbopack, Sentry, Amazon CloudFront, Amazon S3, Google Analytics, Google Tag Manager, Intercom, HubSpot, Segment, Microsoft Clarity, LinkedIn Insight Tag, and more.

Detected on https://www.whitehouse.gov (13 technologies): WordPress, MySQL, PHP, Nginx, Yoast SEO, Google Analytics, Google Tag Manager, PWA, Preact, Parse.ly, plus a CMS_WITHOUT_CDN risk flag.

Detected on https://www.gymshark.com (14 technologies): Shopify, Next.js 15.5.18, React, Algolia, Amazon CloudFront, Intercom, Braze, Datadog, mParticle, LinkedIn Insight Tag.


📄 License

Actor code is open source. The Wappalyzer fingerprint ruleset is MIT-licensed via the community fork. Output data is yours to use commercially; check the target site's Terms of Service for any scraping restrictions.