Website Tech Stack Detector
Pricing
from $8.00 / 1,000 results
Detect technologies on any website using the real Wappalyzer browser extension via Playwright, not HTTP guessing. Identifies CMS, JS frameworks, analytics, CDN, payments, and 1,000+ more. Built for bulk lead qualification, competitive analysis, and tech market research.
Developer
Saregaa
Website Tech Stack Detector (Wappalyzer Engine)
High-precision technology stack detection powered by the real Wappalyzer browser extension and Playwright.
A free, open-source alternative to the paywalled BuiltWith and Wappalyzer APIs: pay only for what you scan.
What it does
This actor launches a real headless Chromium browser with the Wappalyzer extension injected, visits each target URL, and returns a full technology fingerprint, exactly like running Wappalyzer in your own browser, but at scale and via API.
It detects CMS, JavaScript frameworks, analytics tools, CDN providers, marketing pixels, databases, web servers, ecommerce platforms, and more, with confidence scores, version numbers where available, and optional security risk flags.
Key features
- Real browser fingerprinting: uses the actual Wappalyzer extension via Playwright, not HTTP header guessing
- Confidence scoring: every technology is tagged as `high`, `medium`, or `low` confidence
- Version detection: captures version strings where Wappalyzer exposes them (e.g. `Next.js 15.1.12`)
- Category grouping: results are grouped by category (Analytics, CMS, CDN, etc.) for easy filtering
- Security risk flags: optional checks for outdated jQuery, CMS without CDN/WAF, forms without CAPTCHA, missing security headers, and exposed `X-Powered-By`
- Bulk processing: scan hundreds of URLs in a single run
- Run summary: a final dataset record with tech distribution stats and cost breakdown
- Graceful error handling: failed URLs are logged with error details, and the run never crashes
Input
| Field | Type | Default | Description |
|---|---|---|---|
| `urls` | `string[]` | none | List of URLs to scan. Prefix with `https://` or leave bare; normalization is automatic |
| `include_risk` | `boolean` | `true` | Run security risk checks on each result |
Example input:
{"urls": ["https://apify.com","react.dev","https://www.gymshark.com"],"include_risk": true}
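The actor's own normalization code is not shown here, so the following is only a minimal sketch of preparing the input locally, assuming a bare hostname simply needs an `https://` prefix. The helper names `normalize_url`, `hostname`, and `build_input` are hypothetical and not part of the actor.

```python
from urllib.parse import urlsplit


def normalize_url(raw: str) -> str:
    """Add an https:// scheme when the input has no scheme (assumed behavior)."""
    raw = raw.strip()
    if "://" not in raw:
        raw = "https://" + raw
    return raw


def hostname(url: str) -> str:
    """Return the lowercase hostname, useful as a local deduplication key."""
    return urlsplit(normalize_url(url)).hostname or ""


def build_input(urls, include_risk=True):
    """Build a payload with the two documented fields: urls and include_risk."""
    return {
        "urls": [normalize_url(u) for u in urls],
        "include_risk": include_risk,
    }


if __name__ == "__main__":
    payload = build_input(["https://apify.com", "react.dev", "https://www.gymshark.com"])
    print(payload)
    print(hostname("react.dev"))
```

Normalizing before submission is optional, since the actor accepts bare hostnames, but it makes duplicates easier to spot before they are billed as separate results.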
Proxy
A proxy is required for reliable operation. The actor uses a single browser instance and rotates requests through datacenter proxies to avoid blocks and rate limits. Datacenter proxies are sufficient; residential proxies are not needed.
You can use:
- Apify Proxy (datacenter): available directly in the actor's proxy settings
- Your own proxy: pass it via the standard Apify proxy configuration
Without a proxy, many sites will block or rate-limit the scanner after just a few requests.
Performance
The actor runs a single browser instance sequentially through the URL list.
| Metric | Value |
|---|---|
| Throughput | ~120 URLs / hour |
| 100 URLs | ~50 min |
| 500 URLs | ~4 hours |
| 1,000 URLs | ~8 hours |
For large batches, consider splitting across multiple actor runs.
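As a rough planning aid, the sketch below estimates run time from the ~120 URLs/hour figure and splits a URL list into batches for separate runs. The batch size and the helper names `estimate_hours` and `split_batches` are illustrative assumptions; real throughput will vary with the target sites and proxy.

```python
URLS_PER_HOUR = 120  # approximate sequential throughput quoted in the table above


def estimate_hours(n_urls: int, urls_per_hour: float = URLS_PER_HOUR) -> float:
    """Rough run time in hours for one sequential run."""
    return n_urls / urls_per_hour


def split_batches(urls, batch_size: int):
    """Split a URL list into consecutive batches, one batch per actor run."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [urls[i:i + batch_size] for i in range(0, len(urls), batch_size)]


if __name__ == "__main__":
    urls = [f"example-{i}.com" for i in range(1000)]  # placeholder hostnames
    batches = split_batches(urls, 250)
    print(len(batches), "batches")
    print(f"about {estimate_hours(len(batches[0])):.1f} hours per batch")
```

If the batches run in parallel, wall-clock time is roughly that of the largest batch rather than the whole list.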
Output
Each scanned URL produces one JSON record in the dataset:
{"url": "https://apify.com","url_normalized": "apify.com","scanned_at": "2026-05-26T17:02:05+00:00","fetch_method": "wappalyzer_playwright","status_code": 200,"technologies": [{"name": "Next.js","category": "JavaScript frameworks","confidence": "high","version": "16.2.6","detected_by": ["wappalyzer_browser_extension"]}],"categories": {"JavaScript frameworks": ["React", "Next.js"],"Analytics": ["Google Analytics", "Microsoft Clarity"]},"risk_flags": [],"tech_count": 22,"error": null}
The last record in every dataset is a run_summary with aggregated stats, top technology distribution across all scanned sites, and estimated cost breakdown.
Output fields
| Field | Description |
|---|---|
| `url` | Original input URL |
| `url_normalized` | Hostname only, for deduplication |
| `scanned_at` | ISO 8601 timestamp |
| `fetch_method` | Always `wappalyzer_playwright` |
| `status_code` | `200` if technologies detected, `0` on failure |
| `technologies` | Array of detected tech objects with `name`, `category`, `confidence`, `version` |
| `categories` | Technologies grouped by category |
| `http_headers` | Security-relevant headers (server, CSP, HSTS, etc.) |
| `risk_flags` | Array of security risk objects (see below) |
| `tech_count` | Total number of detected technologies |
| `error` | `null` on success, error object on failure |
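To illustrate how the per-URL records might be consumed downstream, here is a small sketch that counts how often each technology appears across successful scans and collects the failed URLs. It assumes records shaped like the sample output; the exact shape of the trailing summary record and of the `error` object is not documented here, so the code simply skips any record without a `technologies` list. The function name `tech_distribution` is hypothetical.

```python
from collections import Counter

# Ordering used to compare the documented confidence labels.
CONFIDENCE_RANK = {"low": 0, "medium": 1, "high": 2}


def tech_distribution(records, min_confidence="medium"):
    """Count technologies at or above min_confidence; return (counts, failed_urls)."""
    threshold = CONFIDENCE_RANK[min_confidence]
    counts = Counter()
    failed = []
    for rec in records:
        if rec.get("error") is not None:
            failed.append(rec.get("url"))
            continue
        # Records without a technologies list (e.g. a summary record) add nothing.
        for tech in rec.get("technologies") or []:
            if CONFIDENCE_RANK.get(tech.get("confidence"), -1) >= threshold:
                counts[tech["name"]] += 1
    return counts, failed


if __name__ == "__main__":
    sample = [
        {"url": "https://apify.com", "error": None,
         "technologies": [{"name": "Next.js", "confidence": "high"}]},
        {"url": "https://down.example", "error": {"message": "timeout"}},
    ]
    counts, failed = tech_distribution(sample)
    print(counts.most_common(5), failed)
```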
Security risk flags
When include_risk: true, the actor runs lightweight security checks and adds flags to each result:
| Code | Level | Trigger |
|---|---|---|
| `CMS_WITHOUT_CDN` | medium | WordPress/Drupal/Joomla detected without Cloudflare, Fastly, Akamai, or similar |
| `OUTDATED_JQUERY` | medium | jQuery version < 3.x (known XSS vectors) |
| `JQUERY_ON_ECOMMERCE` | low | jQuery detected on a shop with unconfirmed version |
| `FORMS_WITHOUT_CAPTCHA` | low | Gravity Forms / Typeform / Formstack without reCAPTCHA or hCaptcha |
| `EXPOSED_X_POWERED_BY` | low | `X-Powered-By` header reveals PHP/ASP.NET/Express version |
| `MISSING_SECURITY_HEADERS` | low | Two or more of CSP, X-Frame-Options, X-Content-Type-Options, HSTS are absent |
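To make two of the triggers concrete, the sketch below re-implements `MISSING_SECURITY_HEADERS` and `OUTDATED_JQUERY` as described in the table. It is an illustration, not the actor's code: the flag object shape (`code`, `level`, and the extra detail keys) and the function names are assumptions.

```python
# Standard HTTP header names for CSP, X-Frame-Options, X-Content-Type-Options, HSTS.
SECURITY_HEADERS = (
    "content-security-policy",
    "x-frame-options",
    "x-content-type-options",
    "strict-transport-security",
)


def missing_security_headers(headers: dict) -> list:
    """Return the security headers absent from a response (case-insensitive)."""
    present = {name.lower() for name in headers}
    return [h for h in SECURITY_HEADERS if h not in present]


def risk_flags(technologies: list, headers: dict) -> list:
    """Apply two of the documented triggers; the flag dict shape is hypothetical."""
    flags = []
    missing = missing_security_headers(headers)
    if len(missing) >= 2:  # "two or more ... are absent"
        flags.append({"code": "MISSING_SECURITY_HEADERS", "level": "low", "missing": missing})
    for tech in technologies:
        version = tech.get("version")
        if tech.get("name") == "jQuery" and version:
            major = version.split(".")[0]
            if major.isdigit() and int(major) < 3:  # "jQuery version < 3.x"
                flags.append({"code": "OUTDATED_JQUERY", "level": "medium", "version": version})
    return flags


if __name__ == "__main__":
    techs = [{"name": "jQuery", "version": "1.12.4"}]
    print(risk_flags(techs, {"X-Frame-Options": "DENY"}))
```

A jQuery detection without a version string is deliberately left unflagged here; in the table that situation is covered by the separate `JQUERY_ON_ECOMMERCE` code.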
Use cases
- Lead enrichment & sales prospecting: identify which CRM, chat, or marketing stack a prospect uses before outreach
- Competitive analysis: benchmark your tech choices against competitors or industry peers
- Market research: map technology adoption across a list of domains
- Security audits: surface missing headers and outdated dependencies at scale
- Tech stack migrations: inventory what needs replacing before a platform switch
How it compares
| | This actor | nexgendata/wappalyzer-replacement | misterkhan/tech-stack-scanner | Wappalyzer API | BuiltWith |
|---|---|---|---|---|---|
| Detection engine | Real browser + extension | OSS fingerprint rules (HTTP) | Multi-tier HTTP | Closed-source | Proprietary |
| Browser rendering | Yes | No | No | Yes | Yes |
| Confidence scores | โ | โ | โ | โ | โ |
| Version detection | โ | Partial | โ | โ | โ |
| Security risk flags | โ Built-in | โ | โ | โ | โ |
| Price / 1,000 URLs | ~$8 | ~$10 | ~$5 | $250/mo cap | $500+/mo |
| Open source | โ | โ | โ | โ | โ |
The key differentiator: HTTP-only detectors miss technologies that are injected client-side (analytics pixels, chat widgets, A/B testing tools). A real browser with the Wappalyzer extension catches them reliably.
Pricing
This actor uses pay-per-result pricing on the Apify platform: no subscription, no minimums.
| Volume | Estimated cost |
|---|---|
| 100 URLs | ~$0.84 |
| 1,000 URLs | ~$8.04 |
| 10,000 URLs | ~$80.04 |
Costs include the Apify platform run fee ($0.035/run) plus $0.008 per scanned URL.
New to Apify? The free tier includes enough credits to scan ~60 URLs before any payment is required.
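The estimates in the table follow directly from the two quoted rates. A minimal sketch of the arithmetic, assuming the run fee is charged once per actor run (the function name `estimate_cost` is hypothetical):

```python
RUN_FEE_USD = 0.035   # platform fee per run, as quoted above
PER_URL_USD = 0.008   # price per scanned URL, as quoted above


def estimate_cost(n_urls: int, runs: int = 1) -> float:
    """Estimated cost in USD for n_urls spread over a number of runs."""
    return runs * RUN_FEE_USD + n_urls * PER_URL_USD


if __name__ == "__main__":
    for n in (100, 1_000, 10_000):
        print(f"{n:>6} URLs: ${estimate_cost(n):.3f}")
    # Splitting a batch adds one run fee per extra run.
    print(f"1,000 URLs over 4 runs: ${estimate_cost(1_000, runs=4):.3f}")
```

Under that assumption, splitting a large batch across several runs changes the total only by the extra run fees, which are small next to the per-URL charge.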
Tips for best results
- Always configure a proxy: datacenter proxies are sufficient, residential proxies are not required
- Include `https://` in URLs for fastest processing (normalization adds a small overhead)
- Large batches (500+ URLs) run best split across multiple actor runs for better reliability
- SPAs and JS-heavy sites (React, Next.js, Angular) are where this actor shines over HTTP-only alternatives: the real browser executes all client-side code before fingerprinting
Example results
Detected on https://apify.com (22 technologies):
React, Next.js 16.2.6, Algolia, styled-components, Turbopack, Sentry, Amazon CloudFront, Amazon S3, Google Analytics, Google Tag Manager, Intercom, HubSpot, Segment, Microsoft Clarity, LinkedIn Insight Tag, and more.
Detected on https://www.whitehouse.gov (13 technologies):
WordPress, MySQL, PHP, Nginx, Yoast SEO, Google Analytics, Google Tag Manager, PWA, Preact, Parse.ly, plus a `CMS_WITHOUT_CDN` risk flag.
Detected on https://www.gymshark.com (14 technologies):
Shopify, Next.js 15.5.18, React, Algolia, Amazon CloudFront, Intercom, Braze, Datadog, mParticle, LinkedIn Insight Tag.
License
Actor code is open source. The Wappalyzer fingerprint ruleset is MIT-licensed via the community fork. Output data is yours to use commercially, but check the target site's Terms of Service for any scraping restrictions.