Tech Stack Profiler
Pricing
from $1.00 / 1,000 results
Developer
Crawler Bros
Detect 330+ web technologies powering any domain — CMS, frameworks, analytics, CDN, eCommerce, payment processors, hosting, databases, search, security, AI assistants, Web3 wallets, customer-support tools, and more. Bulk-profile a list of domains in parallel using open-source web-tech signature patterns. Each detection surfaces the signal types that matched (headers / cookies / scripts / meta / html) so you can rank results by evidence strength. Free BuiltWith alternative — HTTP-only, no proxy, no cookies, no API key.
What it does
You provide a list of domains; the actor:
- Fetches `https://<domain>/` for each (in parallel, with configurable concurrency).
- Matches every fetched page against a curated signature database covering CMS / Framework / Analytics / Tag Manager / Advertising / CDN / Hosting / Web Server / Programming Language / eCommerce / Payment / Customer Support / Marketing / A/B Testing / Font Script / Search / Security / Performance / Scheduling / Forms / Comments / Cookie Compliance.
- Surfaces:
  - The full `technologies` list (`name`, `category`, `version` when detectable, `confidence`, `website`).
  - Aggregated category labels (`categoriesDetected`).
  - Top-of-stack shortcuts: `cmsName`, `frameworkName`, `cdnProvider`, `webServer`, `hostingProvider`, `analyticsTools`.
  - Optional meta tags (title, description, Open Graph, Twitter Cards).

Empty fields are omitted (no nulls). Failed fetches surface an `error` field with the exception type instead of crashing the run.
Input
| Field | Type | Default | Description |
|---|---|---|---|
| `domains` | array of strings (required) | `["apify.com"]` | List of domains. Accepts bare hostnames (`apify.com`) or full URLs (`https://apify.com/path`). |
| `concurrency` | integer | 5 (1–20) | Number of domains fetched in parallel. |
| `extractMetaTags` | boolean | true | Also pull page title, meta description, Open Graph + Twitter Card tags. |
| `categoryFilter` | array of enums | `[]` | Restrict detection to specific categories (CMS, Framework, Analytics, CDN, Hosting, eCommerce, Payment Processor, etc.). Empty = all. |
| `userAgent` | string (optional) | (Chrome 131) | Override the default User-Agent. Only needed if a target site filters by UA. |
Example input
    {
      "domains": ["apify.com", "shopify.com", "github.com"],
      "concurrency": 5,
      "extractMetaTags": true,
      "categoryFilter": []
    }
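If you assemble the input programmatically, a small client-side check can catch out-of-range values before a run starts. The sketch below is a hypothetical helper (`build_input` is not part of the actor); it only mirrors the field names, defaults, and the 1–20 concurrency range from the Input table, and the actor's own validation may differ.

```python
import json


def build_input(domains, concurrency=5, extract_meta_tags=True,
                category_filter=None, user_agent=None):
    """Build an input dict using the field names from the Input table.

    Hypothetical helper for illustration; not part of the actor itself.
    """
    if not domains:
        raise ValueError("domains is required and must not be empty")
    if not 1 <= concurrency <= 20:
        raise ValueError("concurrency must be between 1 and 20")
    run_input = {
        "domains": list(domains),
        "concurrency": concurrency,
        "extractMetaTags": extract_meta_tags,
        "categoryFilter": list(category_filter or []),
    }
    # userAgent is optional, so it is only included when overridden.
    if user_agent:
        run_input["userAgent"] = user_agent
    return run_input


if __name__ == "__main__":
    example = build_input(["apify.com", "shopify.com", "github.com"])
    print(json.dumps(example, indent=2))
```

The resulting dict can be serialised with `json.dumps` and pasted into the actor's input editor, or passed to whichever client you use to start runs.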
Output
One record per domain. Empty fields are omitted (no nulls).
    {
      "domain": "apify.com",
      "url": "https://apify.com/",
      "httpStatus": 200,
      "responseTimeMs": 425,
      "title": "Apify: Full-stack web scraping and data extraction platform",
      "description": "Cloud platform for web scraping…",
      "metaTags": {
        "title": "Apify: Full-stack web scraping and data extraction platform",
        "description": "Cloud platform for web scraping…",
        "ogTags": {"image": "https://apify.com/og-image.jpg", "type": "website"},
        "twitterTags": {"card": "summary_large_image"}
      },
      "technologies": [
        {"name": "Amazon CloudFront", "category": "CDN", "categories": ["CDN"], "confidence": 100, "signals": ["headers"], "website": "https://aws.amazon.com/cloudfront"},
        {"name": "Google Tag Manager", "category": "Tag Manager", "categories": ["Tag Manager"], "confidence": 100, "signals": ["html"], "website": "https://tagmanager.google.com"},
        {"name": "Next.js", "category": "JavaScript Framework", "categories": ["JavaScript Framework", "Framework"], "confidence": 100, "signals": ["headers", "html"], "website": "https://nextjs.org"}
      ],
      "technologyCount": 3,
      "categoriesDetected": ["Analytics", "CDN", "Framework"],
      "categoryCount": 3,
      "frameworkName": "Next.js",
      "cdnProvider": "Amazon CloudFront",
      "scrapedAt": "2024-12-16T14:23:11+00:00"
    }
Output fields
- `domain`: normalised hostname (lowercase, no scheme, no www).
- `url`: final URL after the redirect chain (e.g. `https://www.apify.com/` → `https://apify.com/`).
- `httpStatus`: HTTP status code returned (200 = success).
- `responseTimeMs`: round-trip fetch time in milliseconds.
- `title` / `description`: `<title>` and `<meta name="description">` (when `extractMetaTags: true`).
- `metaTags`: nested block with `title`, `description`, `ogTags`, `twitterTags` (when `extractMetaTags: true`).
- `technologies`: array of `{name, category, categories[], confidence, signals[], website, version?}`. Sorted alphabetically.
- `signals[]`: which evidence types matched the signature: any combination of `"headers"`, `"cookies"`, `"scripts"`, `"meta"`, `"html"`. A 2- or 3-signal hit is dramatically more reliable than a 1-signal hit.
- `technologyCount`: count of detected technologies.
- `categoriesDetected`: sorted list of grouped category labels (e.g. CMS, Framework, Analytics, CDN, Hosting, eCommerce, Payment).
- `categoryCount`: number of distinct categories.
- `cmsName`: first detected technology in the CMS category (e.g. `WordPress`, `Shopify`, `Webflow`).
- `frameworkName`: first detected technology in the Framework / JavaScript Framework category (e.g. `Next.js`, `Nuxt.js`, `Astro`, `Gatsby`).
- `cdnProvider`: first detected technology in the CDN category (`Cloudflare`, `Amazon CloudFront`, `Fastly`, `Akamai`).
- `webServer`: first detected technology in the Web Server category (`nginx`, `Apache`, `LiteSpeed`).
- `hostingProvider`: first detected technology in the Hosting / PaaS category (`Vercel`, `Netlify`, `Heroku`).
- `analyticsTools`: every analytics tool detected (e.g. `["Google Analytics", "Hotjar"]`).
- `error`: fetch-failure reason (only when the request errored: DNS, timeout, TLS, etc.).
- `scrapedAt`: ISO-8601 UTC timestamp.
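Because each detection lists the signal types that matched, results can be ranked by evidence strength on the consumer side. The following is a minimal sketch of one possible ordering (more signals first, then higher confidence, then name); `rank_by_evidence` is a hypothetical helper, and the sample record is trimmed from the example output above.

```python
def rank_by_evidence(record):
    """Return a record's technologies, strongest evidence first."""
    # Failed fetches omit "technologies", so default to an empty list.
    techs = record.get("technologies", [])
    return sorted(
        techs,
        key=lambda t: (
            -len(t.get("signals", [])),   # more matching signal types first
            -t.get("confidence", 0),      # then higher confidence
            t["name"],                    # then alphabetical for stable output
        ),
    )


if __name__ == "__main__":
    record = {
        "domain": "apify.com",
        "technologies": [
            {"name": "Amazon CloudFront", "confidence": 100, "signals": ["headers"]},
            {"name": "Google Tag Manager", "confidence": 100, "signals": ["html"]},
            {"name": "Next.js", "confidence": 100, "signals": ["headers", "html"]},
        ],
    }
    for tech in rank_by_evidence(record):
        print(tech["name"], tech["signals"])
```

With this sample, Next.js sorts first because it matched on two signal types; the two single-signal detections follow in alphabetical order.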
Use cases
- Lead enrichment — fingerprint a list of company websites for CRM enrichment (which CMS / hosting / payment processor each uses).
- Competitive research — pull tech stacks for every competitor in a market in one run.
- Sales targeting — find every Shopify / WordPress / Webflow site in a domain list.
- Migration audits — confirm a site is still running the expected stack after a deploy.
- Tech-stack market share — sample thousands of domains to compute CMS/framework adoption rates.
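As an illustration of the market-share use case, the sketch below computes CMS adoption rates from a list of output records. It assumes the records have already been loaded into Python dicts (for example from an exported dataset file); `cms_share` is a hypothetical helper, the sample data is made up, and only the `cmsName` and `error` fields from the output schema are used.

```python
from collections import Counter


def cms_share(records):
    """Share of successfully fetched domains per detected CMS."""
    # Records with an "error" field were never profiled, so exclude them.
    ok = [r for r in records if "error" not in r]
    if not ok:
        return {}
    # cmsName is omitted when no CMS was detected.
    counts = Counter(r.get("cmsName", "(none detected)") for r in ok)
    return {name: count / len(ok) for name, count in counts.most_common()}


if __name__ == "__main__":
    records = [
        {"domain": "a.example", "cmsName": "WordPress"},
        {"domain": "b.example", "cmsName": "WordPress"},
        {"domain": "c.example", "cmsName": "Shopify"},
        {"domain": "d.example"},
        {"domain": "e.example", "error": "ConnectionError"},
    ]
    for name, share in cms_share(records).items():
        print(f"{name}: {share:.0%}")
```

Excluding failed fetches from the denominator is a design choice; include them if you want shares relative to the whole input list.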
FAQ
Does it need a proxy or cookies?
No. The actor impersonates the Chrome 131 TLS fingerprint and connects directly from datacenter IPs. Sites behind aggressive bot protection (Cloudflare bot-fight on hard mode, etc.) may block the request; those records have error set and detection is skipped for them.
Is this BuiltWith?
No, it's a free alternative. The signature set is curated from open-source web-tech patterns (originating from the Wappalyzer project) and shipped inline, with no external API calls. BuiltWith covers ~50,000 technologies in its paid product; this actor focuses on ~330 high-confidence common ones (the ones that matter for most use cases).
Why are some technologies not detected?
The actor only sees the server-rendered HTML and the response headers. A purely client-side React app may render its content via JavaScript after load, which the actor doesn't observe. For detection that depends on JavaScript execution, run a Playwright-based actor instead.
How accurate is the version detection?
Version strings are extracted opportunistically from Server headers, meta generator tags, and a few signature regexes. About 30–40% of detections include a version; the rest only have a name.
Can I detect more technologies?
The current signature set ships ~330 high-confidence patterns. Adding more is easy — extend src/signatures.py. The infrastructure already supports headers / cookies / scripts / meta / HTML matchers, version groups, and version fallback parsed from common script-URL conventions (e.g. jquery-3.6.0.min.js → 3.6.0).
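To make the script matcher and the script-URL version fallback concrete, here is a minimal sketch. The signature layout, the `JavaScript Library` category label, and both function names are assumptions for illustration; the real structure of src/signatures.py may differ.

```python
import re

# Hypothetical signature entry; not copied from src/signatures.py.
JQUERY_SIGNATURE = {
    "name": "jQuery",
    "category": "JavaScript Library",
    "scripts": [r"jquery[.-]"],  # patterns tried against <script src> URLs
}

# Assumed convention: "<name>-<version>[.min].js", e.g. jquery-3.6.0.min.js
SCRIPT_VERSION = re.compile(r"-(\d+(?:\.\d+)+)(?:\.min)?\.js\b")


def version_from_script_url(url):
    """Return the version embedded in a script URL, or None."""
    match = SCRIPT_VERSION.search(url)
    return match.group(1) if match else None


def match_scripts(signature, script_urls):
    """Return a detection dict for the first matching script URL, or None."""
    for url in script_urls:
        if any(re.search(p, url, re.IGNORECASE) for p in signature["scripts"]):
            detection = {
                "name": signature["name"],
                "category": signature["category"],
                "signals": ["scripts"],
            }
            version = version_from_script_url(url)
            if version:  # empty fields are omitted, as in the actor's output
                detection["version"] = version
            return detection
    return None


if __name__ == "__main__":
    urls = ["https://example.com/app.js",
            "https://code.jquery.com/jquery-3.6.0.min.js"]
    print(match_scripts(JQUERY_SIGNATURE, urls))
```

A fuller matcher would run the header, cookie, meta, and HTML patterns the same way and merge the matching signal names into one `signals` list.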
What happens when a domain is unreachable?
The actor emits one record with error populated (e.g. error: "ConnectionError") and technologies omitted. The run keeps going for the rest of the domains.
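When post-processing a run, it can help to group the failed records by error type, for example to decide which domains are worth retrying. This sketch assumes failed records still carry the `domain` field alongside `error`; `failures_by_type` is a hypothetical helper and the sample records are invented.

```python
from collections import defaultdict


def failures_by_type(records):
    """Group the domains of failed records by their error value."""
    grouped = defaultdict(list)
    for record in records:
        if "error" in record:
            # "domain" is assumed to be present on failed records.
            grouped[record["error"]].append(record.get("domain", "(unknown)"))
    return dict(grouped)


if __name__ == "__main__":
    records = [
        {"domain": "apify.com", "technologyCount": 3},
        {"domain": "down.example", "error": "ConnectionError"},
        {"domain": "slow.example", "error": "Timeout"},
        {"domain": "gone.example", "error": "ConnectionError"},
    ]
    print(failures_by_type(records))
```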
How is the input deduplicated?
By normalised hostname (lowercase, no scheme/www/path/port). https://apify.com/about and WWW.APIFY.COM collapse to the same apify.com record.
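The normalisation rule can be approximated with the standard library, which is handy if you want to pre-deduplicate a list before submitting it. This is a sketch of the described behaviour (lowercase, no scheme, www, path, or port), not the actor's actual code; `normalise_domain` and `dedupe` are hypothetical names.

```python
from urllib.parse import urlsplit


def normalise_domain(value):
    """Reduce a bare hostname or full URL to a lowercase host without www."""
    value = value.strip()
    if "://" not in value:
        # Prefix "//" so urlsplit treats a bare hostname as the network location.
        value = "//" + value
    # .hostname is already lowercased and excludes port and credentials.
    host = urlsplit(value).hostname or ""
    if host.startswith("www."):
        host = host[len("www."):]
    return host


def dedupe(domains):
    """Return unique normalised hostnames, keeping first-seen order."""
    hosts = (normalise_domain(d) for d in domains)
    return list(dict.fromkeys(h for h in hosts if h))


if __name__ == "__main__":
    print(dedupe(["https://apify.com/about", "WWW.APIFY.COM", "github.com"]))
```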
Why does my domain show only 1–2 technologies?
Common causes: aggressive bot blocking (the actor only received a challenge page), a client-rendered single-page app (the actor only sees the empty shell), or a static page that genuinely uses very few traceable technologies.