Pricing

from $3.50 / 1,000 results

CMS Detector

CMS detector tool: check what CMS any site runs (WordPress, Shopify, Wix…) + its full tech stack, in bulk as CSV/JSON. BuiltWith / Wappalyzer alternative.

Developer

Thodor

Wappalyzer alternative for bulk CMS detection

Each domain gets a single type label, so you can filter or segment a list of 100 or 100,000 prospects in one column:

| `type` | What it means | Examples |
| --- | --- | --- |
| CMS | Traditional content management system | WordPress, Drupal, Joomla, Sitecore, Adobe Experience Manager, Ghost, Strapi, Contentful |
| Ecommerce | Online-store platform | Shopify, WooCommerce, Magento, BigCommerce, PrestaShop, Salesforce Commerce Cloud |
| Website builder | No-code drag-and-drop builder | Wix, Squarespace, Webflow, Carrd, Framer, Bubble, Duda |
| Blog | Dedicated blogging platform | Medium, Substack, Tumblr, Hashnode, Beehiiv, Bear Blog |
| Framework | No CMS, but a known web/JS framework | Next.js, Nuxt.js, React, Vue.js, Svelte, Gatsby, Astro, Remix |
| Unknown | We reached the site, but nothing in the HTML matched | Hand-built static sites, heavily-stripped custom builds, JS-rendered SPAs we can't see |
| `null` | We couldn't reach the site (4xx / 5xx / DNS error) | No row is pushed; you are not charged |

Alongside type, every row carries:

cms — name and version of the specific platform, or null if no CMS was found.
framework — the underlying tech as a plain name (Next.js, Webflow, WordPress, …).
cdn, analytics, marketing — three ready-to-filter lists of the tools the site uses.
breakdown — the complete list of every technology found, including the ones already shown above, so you have the full picture in one place. Each entry also carries a pricing array (when known): low / mid / high / poa for the cost tier, plus freemium / recurring / onetime / payg for the billing model.

What this CMS detector is good for

Six concrete plays. If yours isn't here, the data fits any CRM or spreadsheet.

Competitor-displacement campaigns. Filter your prospect list for every site running a specific marketing tool or CMS, then feed those URLs into Apollo or Hunter for verified emails. "Find every Klaviyo store missing a loyalty app" is two filters: keep the rows where marketing contains "Klaviyo" and the full tech list has no loyalty tool.
Agency migration prospecting. Pull every site still on outdated platforms (Joomla, Drupal 7, classic ASP, Magento 1) in your geo and pitch a re-platform. Filter by CMS name and version in your spreadsheet, done in one column.
Shopify-app and WordPress-plugin sales. Find every Shopify store in your region by filtering for type "Ecommerce" + CMS "Shopify", then scan the full tech list to see which apps they already run (or don't).
Clay / n8n / Make enrichment column. Feed a "Company Domain" column through this Actor's quick-response endpoint and get CMS, framework, CDN, analytics, and marketing tools appended to every row, without paying for Clay's Explorer tier or a monthly BuiltWith plan.
ABM TAM sizing. Measure Next.js / Shopify Plus / Webflow adoption across 10,000 sites in your vertical to validate ICP before spinning up an outbound team.
Enterprise vs SMB segmentation. Each detected technology carries a pricing tag when known. Filter breakdown for entries with pricing containing high or poa to flag enterprise-priced stacks (HubSpot Enterprise, Marketo, …), or freemium for free-plan adopters you could upsell. About 60% of detected technologies carry pricing data.
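
The two-filter play and the pricing segmentation above can be sketched in Python over exported rows. This is a minimal sketch, assuming rows shaped like the output documented on this page; the helper names, the sample rows, and the "Loyalty & rewards" category label are illustrative assumptions, so check the category labels in your own export before relying on them.

```python
from typing import Any

Row = dict[str, Any]

# Assumed category label for loyalty tools; verify it against your own export.
LOYALTY_CATEGORY = "Loyalty & rewards"


def uses_marketing_tool(row: Row, tool: str) -> bool:
    """True when the curated `marketing` list contains the tool."""
    return tool in (row.get("marketing") or [])


def has_category(row: Row, category: str) -> bool:
    """True when any `breakdown` entry carries the given category."""
    return any(category in (tech.get("categories") or [])
               for tech in row.get("breakdown") or [])


def is_enterprise_priced(row: Row) -> bool:
    """True when any detected technology is tagged `high` or `poa`."""
    return any({"high", "poa"} & set(tech.get("pricing") or [])
               for tech in row.get("breakdown") or [])


def klaviyo_without_loyalty(rows: list[Row]) -> list[str]:
    """Domains running Klaviyo with no loyalty tool in the full tech list."""
    return [r["domain"] for r in rows
            if uses_marketing_tool(r, "Klaviyo")
            and not has_category(r, LOYALTY_CATEGORY)]


# Made-up sample rows, trimmed to the fields the filters read.
rows = [
    {"domain": "a.example", "marketing": ["Klaviyo"],
     "breakdown": [{"name": "Klaviyo", "categories": ["Email"], "pricing": ["mid"]}]},
    {"domain": "b.example", "marketing": ["Klaviyo"],
     "breakdown": [{"name": "SampleLoyaltyApp", "categories": [LOYALTY_CATEGORY],
                    "pricing": []}]},
    {"domain": "c.example", "marketing": [],
     "breakdown": [{"name": "Sailthru", "categories": ["Marketing automation"],
                    "pricing": ["poa"]}]},
]

print(klaviyo_without_loyalty(rows))
print([r["domain"] for r in rows if is_enterprise_priced(r)])
```

The same predicates translate directly into spreadsheet or CRM filters; the code only makes the logic explicit.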

A common adjacent need: this Actor returns the tech stack, not contact data. Feed the URL → CMS output into the Email Scraper to pull emails off the same domains, or pair with the Apollo Scraper, Contact Info Scraper, or Clay's email-enrichment column for decision-maker enrichment.

How it works

Here's what the CMS detector does for every domain you pass in:

Clean up the URL. Bare domains, www.-prefixed hosts, and deep URLs all work; we always check the homepage. www. is dropped, so www.x.com and x.com end up as one row.
Fetch the homepage looking like a Chrome browser. We mimic Chrome so the site doesn't realise it's being scraped, but we don't run a full browser. That's enough to get past most Cloudflare and bot-check pages.
Try again as a search-engine crawler if the first fetch looks blocked. If the page looks paywalled or cookie-walled, we fetch it a second time with the DuckDuckGo bot's user agent. Many sites serve crawlers a cleaner version of the page than they serve browsers, and that cleaner version is where the platform clues live.
Match against ~7,500 known technologies from the open-source enthec/webappanalyzer project (the community successor to the original Wappalyzer database). We add extra checks for modern frameworks (Next.js, Nuxt, React, Vue, Svelte, Astro, Remix, Gatsby) so they show up even when the live site hides the usual clues.
Clean up and categorise. One type label plus the curated cms / framework / cdn / analytics / marketing lists. We also fix common upstream quirks. For example, Amazon S3 is object storage and not a CDN (we move it), and Leadfeeder is a marketing tool and not analytics (we move it).
Save one row per domain. Failed fetches (DNS error, 4xx, 5xx, timeout) do not create a row, so you're only billed for sites we actually returned a result for.
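
The URL clean-up step above can be sketched as follows. This is a minimal sketch of the described behaviour (lowercase the host, drop a leading www., always check the homepage), not the Actor's actual code; the function name `normalize` and the choice of https for the homepage URL are assumptions.

```python
from urllib.parse import urlsplit


def normalize(raw: str) -> tuple[str, str]:
    """Return (domain, homepage_url) for a bare domain, www. host, or deep URL."""
    raw = raw.strip()
    # urlsplit only fills in the host when a scheme is present, so add one.
    if "://" not in raw:
        raw = "https://" + raw
    host = (urlsplit(raw).hostname or "").lower()
    # Only a leading "www." is dropped; other subdomains stay distinct.
    if host.startswith("www."):
        host = host[4:]
    return host, f"https://{host}/"


for candidate in ["www.x.com", "x.com", "https://example.com/blog/post", "docs.example.com"]:
    print(candidate, "->", normalize(candidate))
```

Because www.x.com and x.com normalize to the same pair, deduplicating on the first element reproduces the "one row per domain" behaviour.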

Input

Two ways to give it a list of domains:

Apify Console (spreadsheet-friendly). Open the Actor, paste one URL per line into the Start URLs or domains box, hit Start. Bare domains, www.-prefixed hosts, and deep URLs are all accepted.

API / programmatic. Pass an array under start_urls:

{
  "start_urls": [
    { "url": "shopify.com" },
    { "url": "https://www.nytimes.com" },
    { "url": "techcrunch.com" }
  ]
}

You can pass 1 URL or 100,000 in one call. 10,000 finishes in ~50–100 minutes, 100,000 runs overnight. Subdomains are distinct rows; docs.example.com and example.com produce two separate detections.
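
If your domains live in a text file or a spreadsheet column, the start_urls payload can be generated rather than typed. A minimal sketch, assuming one URL or domain per line; `build_input` is a hypothetical helper name, and only the start_urls / url shape comes from the input example above.

```python
import json


def build_input(lines: list[str]) -> dict:
    """Turn pasted lines (one URL or domain each) into the `start_urls` payload."""
    seen: set[str] = set()
    start_urls = []
    for line in lines:
        url = line.strip()
        if not url or url in seen:  # skip blanks and exact duplicates
            continue
        seen.add(url)
        start_urls.append({"url": url})
    return {"start_urls": start_urls}


payload = build_input(["shopify.com", "", "https://www.nytimes.com", "shopify.com"])
print(json.dumps(payload, indent=2))
```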

Output

Three real examples, all from a live run.

A WordPress site (TechCrunch)

{
    "domain": "techcrunch.com",
    "url_checked": "https://techcrunch.com/",
    "type": "CMS",
    "cms": { "name": "WordPress", "version": "6.9.4" },
    "framework": "WordPress",
    "cdn": [],
    "analytics": [],
    "marketing": ["Google Tag Manager", "Sailthru"],
    "breakdown": [
        { "name": "MySQL",     "version": null,    "categories": ["Databases"],             "pricing": [] },
        { "name": "Nginx",     "version": null,    "categories": ["Web servers"],           "pricing": [] },
        { "name": "PHP",       "version": null,    "categories": ["Programming languages"], "pricing": [] },
        { "name": "React",     "version": null,    "categories": ["JavaScript frameworks"], "pricing": [] },
        { "name": "Sailthru",  "version": null,    "categories": ["Marketing automation"],  "pricing": ["poa"] },
        { "name": "WordPress", "version": "6.9.4", "categories": ["CMS", "Blogs"],          "pricing": ["low", "recurring", "freemium"] },
        { "name": "Yoast SEO Premium", "version": "25.1", "categories": ["SEO"],            "pricing": ["low", "freemium", "recurring"] }
    ],
    "tech_count": 18
}

(The breakdown above is trimmed for readability, which is why it lists fewer entries than tech_count; the live row contains every detected technology. The Shopify example below is trimmed the same way. Sites change: when this Actor was first written TechCrunch was on Amazon CloudFront with HubSpot installed; today the page leaks Sailthru instead.)

A Shopify store

{
    "domain": "shopify.com",
    "url_checked": "https://shopify.com/",
    "type": "Ecommerce",
    "cms": { "name": "Shopify", "version": null },
    "framework": "Shopify",
    "cdn": ["Cloudflare"],
    "analytics": [],
    "marketing": [],
    "breakdown": [
        { "name": "Cloudflare", "version": null, "categories": ["CDN"],                   "pricing": [] },
        { "name": "FedEx",      "version": null, "categories": ["Shipping carriers"],     "pricing": [] },
        { "name": "React",      "version": null, "categories": ["JavaScript frameworks"], "pricing": [] },
        { "name": "Shopify",    "version": null, "categories": ["Ecommerce"],             "pricing": ["low", "recurring"] }
    ],
    "tech_count": 10
}

An Unknown result (heavily-stripped custom build). You'll see this for hand-rolled static sites and some JS-rendered SPAs where the production build hides every platform marker:

{
    "domain": "example.com",
    "url_checked": "https://example.com/",
    "type": "Unknown",
    "cms": null,
    "framework": null,
    "cdn": [],
    "analytics": [],
    "marketing": [],
    "breakdown": [],
    "tech_count": 0
}

| Field | Meaning |
| --- | --- |
| `domain` | Canonical hostname (lowercased, leading `www.` stripped). |
| `url_checked` | The exact homepage URL fetched. |
| `type` | One of the six labels in the table above (`null` never appears in a row, because unreachable sites produce no row). Always set when the fetch succeeded. |
| `cms` | `{ name, version }` of the platform, or `null` if no CMS-tier match. `version` may be `null` if the site doesn't expose it. |
| `framework` | Plain-string "what runs this site": Next.js / Webflow / WordPress / etc. |
| `cdn` | CDN providers (Cloudflare, CloudFront, Fastly, Akamai, BunnyCDN, jsDelivr, …). |
| `analytics` | Web-analytics tools (Google Analytics, Plausible, Mixpanel, Heap, Fathom, Amplitude, Matomo, …). |
| `marketing` | Marketing automation, email marketing, CRM, tag managers, live chat, A/B testing, retargeting, CDP. |
| `breakdown` | Full detection list, sorted alphabetically. Each item: `name`, optional `version`, list of `categories`, and a `pricing` array (when upstream data is available; see below). |
| `breakdown[].pricing` | Cost tier (`low` <$100/mo, `mid` $100–$1k/mo, `high` >$1k/mo, `poa` price-on-asking) and/or billing model (`freemium`, `recurring`, `onetime`, `payg`). Empty array when upstream has no pricing data, about 40% of techs (mostly open-source projects, browser APIs, and infrastructure primitives). |
| `tech_count` | Unique technologies detected; useful for sorting "most stack-rich" domains. |

Download the dataset as JSON, CSV, Excel, HTML, or XML from the Dataset tab. Tools like Clay, Make, n8n, and Zapier can stream rows out as they're produced via webhook.

BuiltWith alternative: pay per result, no monthly minimum

Billed per successful detection: one row pushed to the dataset per domain we returned a result for. Failed fetches (DNS error, 4xx, 5xx, timeout) do not push a row and are free. The maxItems setting on a run is respected: you won't get charged for more than you asked for.

Throughput: ~5–10 minutes for 1,000 domains, ~50–100 minutes for 10,000, overnight for 100,000 (concurrency capped at 10 simultaneous fetches to keep memory predictable).
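
For planning a run, the pricing and throughput figures above reduce to simple arithmetic. A rough sketch, assuming the listed starting price of $3.50 per 1,000 results and the quoted 5–10 minutes per 1,000 domains; your actual price may differ, and failed fetches are free, so the cost shown is what you would pay only if every domain returned a row.

```python
def estimate(domains: int, price_per_1000: float = 3.50,
             minutes_per_1000: tuple[float, float] = (5.0, 10.0)) -> dict:
    """Cost if every domain succeeds, plus a rough runtime range in minutes."""
    batches = domains / 1000
    return {
        "cost_usd_if_all_succeed": round(batches * price_per_1000, 2),
        "minutes_low": batches * minutes_per_1000[0],
        "minutes_high": batches * minutes_per_1000[1],
    }


for size in (1_000, 10_000, 100_000):
    print(size, estimate(size))
```

For 10,000 domains this gives 50 to 100 minutes, matching the figure quoted above; 100,000 domains works out to roughly 8 to 17 hours, which is why that size is described as an overnight run.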

Compare:

| Feature | This Actor | Wappalyzer Pro | BuiltWith Basic | WhatCMS |
| --- | --- | --- | --- | --- |
| Billing model | Pay per result | Monthly subscription | Monthly subscription | Per-lookup or subscription |
| Monthly minimum | None | Yes | Yes | None |
| Credit expiration | None | 60 days | n/a | n/a |
| Technologies you can filter on | Unlimited | Unlimited | Capped at 2 | CMS only |
| Bulk lookup via API | Yes | Yes | Yes | Yes (paid tier) |
| Single `type` label per site | Yes | No | Partial | Yes (CMS only) |
| Bot-UA cloaking workaround | Yes | No | n/a | Unknown |
| Modern-framework heuristic layer (Next.js / Vue / Svelte / Astro) | Yes | Partial | Partial | n/a (CMS only) |

Where the technology database comes from

In August 2023, Wappalyzer closed its open-source rules and moved everything behind a paid subscription. The ~7,500-technology database that powered every "what is this site running" tool for a decade was suddenly frozen.

A few weeks later (September 2023), the enthec/webappanalyzer project picked up where Wappalyzer left off, keeping the database open, public, and actively maintained. It now covers ~7,500 technologies across 108 categories and is updated roughly weekly (most recent update: April 2026; 500+ stars, 119 forks on GitHub).

Almost no shipped tool uses it. Most "Wappalyzer alternative" libraries you'll find online are still using the frozen pre-2023 database, which means they don't recognise Next.js App Router, modern Shopify themes, recent Cloudflare products, the post-2024 wave of headless CMSes, or anything added in the last two years. This Actor reads the up-to-date enthec database directly, and on top of that adds a clean-up pass to fix the quirks that the upstream data still has (Amazon S3 wrongly tagged as a CDN, B2B retargeting tools tagged as analytics, and so on).

Custom changes on top of the upstream

Source for the curious: github.com/Polluxs/apify/tree/master/apify-cms-detector.

A naive "load enthec JSON + match regex" loop produces output most users would consider broken, both because the database has known false positives that any pre-2023 Wappalyzer fork would hit, and because enthec's category IDs aren't the same as the old database's. Everything I had to fix to get clean output:

Of the ~7,500 fingerprints, exactly 884 are JS-only. They fire only on window.X globals, so any HTTP-only matcher (no headless browser) has a hard ceiling at ~6,634 detectable techs. Useful number if you're planning capacity or comparing alternatives.
Category-ID remapping for the post-2023 database. Pre-2023 Wappalyzer numbered "Email marketing" as cat 95 and "Personalisation" as cat 70. enthec renumbered: cat 95 is now "Digital asset management", cat 70 is now "SSL/TLS certificate authorities". Every Wappalyzer port written before September 2023 silently mis-buckets technologies if its MARKETING_CATEGORY_IDS constants haven't been audited against the new categories.json. I rebuilt the bucket constants from scratch.
"Cart Functionality" tie-break fix. Upstream ships a generic cats=[6] detector called "Cart Functionality" that triggers on any page with shopping-cart markup. In a naive port it beats out the real platform alphabetically, so stripe.com, shopify.com, and most ecommerce sites end up reporting "Cart Functionality" as the CMS instead of Shopify / Stripe. I exclude it from the CMS / framework picker (SKIP_CMS_FRAMEWORK_PICK); it still appears in breakdown so the match is visible.
Bucket priority flip + ~25 curation overrides. The upstream double-tags hundreds of marketing-automation tools (Braze, CleverTap, Airship, …) as both Marketing automation AND Analytics. I flipped the bucket priority from analytics-first to marketing-first so those naturally land in marketing, then added explicit overrides for the edge cases the upstream gets wrong: Amazon S3 excluded from CDN (object storage, not CDN), styled-components / Emotion / JSS excluded from frameworks (CSS-in-JS), Leadfeeder / LinkedIn Insight Tag moved analytics → marketing (B2B retargeting), Datadog / BugSnag / etc. dropped from analytics (APM, not web analytics). All deltas live in one BUCKET_OVERRIDES dict so the curation is auditable.
Label fixes on the breakdown rows so they match the top-level buckets. Amazon S3 reads "Object storage" (not "CDN"), styled-components reads "CSS-in-JS" (not "JavaScript frameworks"), Leadfeeder reads "B2B retargeting" (not "Analytics"), Datadog reads "APM", Ahrefs drops the "Analytics" tag, Imperva drops the "CDN" tag, etc.
Modern-framework heuristic layer. Wappalyzer's strict rules occasionally miss Next.js / Nuxt / React / Vue / Svelte / Astro / Remix / Gatsby on production builds that strip the obvious markers. I look explicitly for /_next/, __NEXT_DATA__, window.__NUXT__, data-reactroot, data-svelte-h, data-astro-, /_remix/, gatsby-image, etc. as secondary markers.
DuckDuckGo bot-UA fallback. Some sites cloak: paywalled or cookie-walled to real browsers, clean SEO version to crawlers. If the Chrome response looks materially different from the DuckDuckGo-UA response, I run detection on the bot version. Recovers a meaningful chunk of otherwise-Unknown sites for cheap.
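
The bucket-priority flip, the overrides, and the "Cart Functionality" exclusion described above can be sketched as follows. This is an illustrative reconstruction, not the Actor's source: BUCKET_OVERRIDES and SKIP_CMS_FRAMEWORK_PICK are the names mentioned above, but their shapes here, the BUCKET_PRIORITY and BUCKET_CATEGORIES tables, the use of category names instead of category IDs, and both function names are assumptions.

```python
from typing import Optional

# Marketing is checked before analytics, so double-tagged tools land in marketing.
BUCKET_PRIORITY = ["marketing", "analytics", "cdn"]

# Illustrative category names per bucket (assumption: the real matcher uses IDs).
BUCKET_CATEGORIES = {
    "marketing": {"Marketing automation", "Email marketing", "Tag managers"},
    "analytics": {"Analytics"},
    "cdn": {"CDN"},
}

# Per-technology overrides; None means "keep out of every curated bucket".
BUCKET_OVERRIDES = {
    "Amazon S3": None,          # object storage, not a CDN
    "Leadfeeder": "marketing",  # B2B retargeting, not analytics
    "Datadog": None,            # APM, not web analytics
}

# Generic detectors that must never win the CMS / framework slot.
SKIP_CMS_FRAMEWORK_PICK = {"Cart Functionality"}


def pick_bucket(name: str, categories: list[str]) -> Optional[str]:
    """Apply explicit overrides first, then the first bucket that matches by priority."""
    if name in BUCKET_OVERRIDES:
        return BUCKET_OVERRIDES[name]
    for bucket in BUCKET_PRIORITY:
        if BUCKET_CATEGORIES[bucket] & set(categories):
            return bucket
    return None


def pick_cms(platform_matches: list[str]) -> Optional[str]:
    """Alphabetical tie-break, after removing the generic detectors."""
    candidates = sorted(m for m in platform_matches if m not in SKIP_CMS_FRAMEWORK_PICK)
    return candidates[0] if candidates else None


print(pick_bucket("Braze", ["Marketing automation", "Analytics"]))
print(pick_bucket("Amazon S3", ["CDN"]))
print(pick_cms(["Shopify", "Cart Functionality"]))
```

Without the skip set, the alphabetical tie-break would pick "Cart Functionality" ahead of "Shopify", which is the failure mode described above.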

Open an issue if you spot a curation case I've missed. BUCKET_OVERRIDES and SKIP_CMS_FRAMEWORK_PICK are one-line additions.

Why open source matters here

The detection ruleset, the curation overrides, and the matcher source are all public. When a detection is wrong on a real site, you have actual leverage: open an issue on this Actor (curation overrides are one-line additions) or PR the fingerprint upstream at enthec/webappanalyzer where the fix lands for every tool that consumes the database. Many eyes on one shared ruleset means edge cases get found and patched faster than any single team could ship them. Closed-data tools (BuiltWith, the post-2023 Wappalyzer SaaS) are fine for what they are; the update loop just runs on their schedule, not the community's.

What we don't detect

We don't run a full browser. This Actor mimics a real Chrome browser so it doesn't get blocked the way ordinary scrapers do, but it doesn't run the page's JavaScript. Skipping JavaScript is what keeps it fast, and on ~95% of sites the platform still shows up in the page source so a real browser would add nothing. Every fetch is real-time; nothing is cached.

The 5% that does need a browser:

Tools that only show up after the page's JavaScript has run. Most of the popular ones (HubSpot, Salesforce, Segment, Mixpanel, Amplitude, FullStory, …) still have static markers so we catch them. The notable misses are the Adobe enterprise stack (Adobe Analytics, Target, DTM, Launch), Drift, Cloudflare Turnstile / Zaraz / Rocket Loader, Microsoft Application Insights, and Amazon CloudWatch RUM; these are 100% JavaScript-only with no static fingerprint to fall back on.
Sites that build their entire homepage with JavaScript and leave almost nothing in the page source, returning type: "Unknown".
The underlying framework (Next.js, Nuxt, React) usually still leaves traces in the page source (like /_next/ paths or __NEXT_DATA__ blocks) and we explicitly look for those, so even a JavaScript-heavy site typically returns type: "Framework" with the correct name.
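
The page-source traces mentioned above lend themselves to a simple substring check. A minimal sketch, assuming plain substring matching on the raw HTML; the marker strings are the ones named on this page, while the marker-to-framework mapping, the ordering (specific frameworks before plain React), and the function name are assumptions.

```python
from typing import Optional

# Checked in order: frameworks built on React come before the generic React marker.
FRAMEWORK_MARKERS = [
    ("Next.js", ("/_next/", "__NEXT_DATA__")),
    ("Nuxt.js", ("window.__NUXT__",)),
    ("Gatsby", ("gatsby-image",)),
    ("Remix", ("/_remix/",)),
    ("Astro", ("data-astro-",)),
    ("Svelte", ("data-svelte-h",)),
    ("React", ("data-reactroot",)),
]


def guess_framework(html: str) -> Optional[str]:
    """Return the first framework whose secondary marker appears in the page source."""
    for name, markers in FRAMEWORK_MARKERS:
        if any(marker in html for marker in markers):
            return name
    return None


sample = '<div data-reactroot><script id="__NEXT_DATA__" type="application/json">{}</script></div>'
print(guess_framework(sample))
print(guess_framework("<html><body>Hello</body></html>"))
```

A page with no marker at all returns None, which corresponds to the type: "Unknown" case when nothing else matched either.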

Want browser-rendered detection? Open an issue on the Issues tab with the URL(s) where you're seeing Unknown. If demand is there I'll add an opt-in browser-rendered mode (the default stays cheap and fast for the 95%).

How accurate is it? Independent testing across ~2,000 sites (Nick Sawinyh, SEOmator) puts CMS detection accuracy at 87–93% across all major tools, and backend-tech detection at ~27%. We sit in the same band, and the freshness of the enthec database is what tips the modern-tech edge cases (Next.js App Router, Astro, recent Shopify themes, post-2024 CDN products) where libraries still using the pre-2023 frozen database quietly miss things.

Integrations

Clay (HTTP Enrichment column). Skip the Explorer tier. Paste the run-sync URL into a Clay HTTP column and append cms / framework / cdn to every row:

GET https://api.apify.com/v2/acts/<username>~apify-cms-detector/run-sync-get-dataset-items
  ?token=<APIFY_TOKEN>
  &method=POST
  &start_urls[0][url]={{Domain}}

The ?method=POST query parameter is the standard Apify trick for tools that only support GET webhooks (Clay, certain Zapier paths, browser bookmarks).

n8n. Add an HTTP Request node, method POST, URL:

https://api.apify.com/v2/acts/<username>~apify-cms-detector/run-sync-get-dataset-items?token=<APIFY_TOKEN>

Body (JSON): { "start_urls": [{ "url": "{{$json.domain}}" }] }. Wire a CRM trigger in, filter on {{$json.cms.name === "Shopify"}}, send to Apollo or Outreach.

Make / Zapier / Google Sheets. Same pattern: a single HTTP module pointed at run-sync-get-dataset-items, returning JSON in one round-trip. Combine with Apify Schedules to refresh a Google Sheet of prospects nightly and Slack-alert on CMS changes.

curl (any other workflow).

curl -X POST "https://api.apify.com/v2/acts/<username>~apify-cms-detector/run-sync-get-dataset-items?token=$APIFY_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{ "start_urls": [{ "url": "shopify.com" }] }'
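
The same call from Python, using only the standard library. A sketch under the same assumptions as the curl example: the run-sync-get-dataset-items endpoint, the token query parameter, and the start_urls body come from this page, while the function names are illustrative and the actor ID is passed in by the caller.

```python
import json
import urllib.request
from urllib.parse import urlencode

API_BASE = "https://api.apify.com/v2/acts"


def build_request(actor_id: str, token: str, domains: list[str]) -> urllib.request.Request:
    """Build (but do not send) the POST request for the run-sync endpoint."""
    query = urlencode({"token": token})
    url = f"{API_BASE}/{actor_id}/run-sync-get-dataset-items?{query}"
    body = json.dumps({"start_urls": [{"url": d} for d in domains]}).encode("utf-8")
    return urllib.request.Request(
        url, data=body, method="POST",
        headers={"Content-Type": "application/json"},
    )


def detect(actor_id: str, token: str, domains: list[str]) -> list[dict]:
    """Send the request and return the dataset rows as parsed JSON."""
    with urllib.request.urlopen(build_request(actor_id, token, domains)) as resp:
        return json.load(resp)
```

To use it, call detect with the same actor ID you would put in the curl URL, your API token, and a list of domains. Splitting request construction from sending keeps the payload easy to inspect before any paid run.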

FAQ

What CMS is this site built with? Paste the domain into this CMS detector and it returns the exact platform (WordPress, Shopify, Wix, Drupal, …) plus the rest of the stack — one site in seconds, or one row per domain across a whole list.

Does this work on JavaScript-heavy sites (single-page apps)? Partly. We don't run JavaScript; we read the page source like a search engine does. Any tool that only appears after JavaScript runs in your browser is invisible to us. The good news: most modern sites still leave framework traces in the page source (Next.js leaves /_next/ paths, React leaves data-reactroot markers, etc.) and we explicitly look for those, so you'll usually still see the framework name even when the rest is hidden. About 5% of sites return Unknown. If you need full browser-mode detection, open an issue with the URL.

Why not just run a real browser then? Speed and cost. Running a full browser per site is roughly 10× more expensive per URL and takes seconds instead of milliseconds. It also gets blocked more often by Cloudflare and Akamai bot-detection products, because their detectors specifically look for headless browsers. For the 95% of sites where the platform shows up in the page source, a browser adds nothing. The 5% where it'd genuinely help is exactly what an opt-in browser mode would solve; let me know via Issues if you're in that 5%.

What's the difference vs. BuiltWith / Wappalyzer / WhatCMS? No monthly subscription and no lock-in. BuiltWith Basic caps you at 2 technologies you can filter on; Wappalyzer Pro expires unused API credits after 60 days. This Actor bills per result with no minimum and no expiration. The detection engine uses the enthec/webappanalyzer fingerprint database (~7,500 technologies), the actively maintained open-source successor to the rules Wappalyzer published before it went closed-source.

How many technologies do you detect? Why not all 7,500? All 7,500 are loaded. Detection rate depends on how well a tech leaks into the HTML. Popular CMSes and ecommerce platforms (WordPress, Shopify, Webflow, Wix, Magento, Drupal) hit near 100%; obscure plugins and JS-only libraries hit much lower. We don't artificially cap.

Do you return version numbers? Confidence scores? Versions: yes, when the site exposes them (e.g. WordPress's <meta name="generator"> tag). Many sites strip these in production; version: null means matched-but-no-version. Confidence: not surfaced today; under consideration.

What about Cloudflare-protected sites? Most of them work. Mimicking a real Chrome browser gets us through Cloudflare's standard bot challenges, and the DuckDuckGo bot fallback recovers a chunk of what's left. Sites with the hardest JavaScript challenges (Cloudflare Turnstile, Akamai Bot Manager) will sometimes return Unknown. Open an issue with the URL if you hit one.

Do you detect headless CMSes (Contentful, Sanity, Strapi)? When they leak. Contentful's CDN domain and Sanity's API endpoints are in the fingerprint database. If the site fully proxies them, we'll see the framework (Next.js, Astro, etc.) but not the headless backend.

Why is cms null for this site? Either (a) the site isn't built on a CMS (check the framework field instead); (b) it's a JavaScript-heavy site where platform clues never appear in the page source; or (c) the site explicitly stripped all generator tags. Cross-check by running the URL through Wappalyzer's browser extension. If Wappalyzer sees it too, we should. Open an issue with the URL.

Why wasn't X detected even though I can see it on the site? Each detection has a confidence score, and a tool is only reported once the total confidence reaches 100. Strong signals (like a unique header or generator tag) count as 100 on their own, so they fire by themselves. Weaker signals are worth 50, which means one weak signal alone isn't enough — you need two. This is the upstream Wappalyzer model and we follow it. It cuts down on noisy false positives, but it does mean obscure tools sometimes hide behind a single weak marker.
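
The threshold rule described in this answer can be sketched in a few lines. It illustrates the stated model only (strong signals worth 100, weak signals worth 50, report at a total of 100); the function name, the input shape, and the sample tool names are assumptions, not the Actor's internals.

```python
THRESHOLD = 100


def reported(signals: dict[str, list[int]]) -> list[str]:
    """Names of technologies whose summed signal confidence reaches the threshold."""
    return sorted(name for name, weights in signals.items()
                  if sum(weights) >= THRESHOLD)


matches = {
    "WordPress": [100],        # one strong signal, e.g. a generator tag
    "ObscureToolA": [50],      # a single weak marker: not reported
    "ObscureToolB": [50, 50],  # two weak markers: reported
}
print(reported(matches))
```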

Why does a site that re-platformed still show its old platform? Migrations rarely strip every legacy marker. Old /wp-content/ paths in image references, <meta name="generator"> tags in the homepage source, robots.txt entries, sitemap structures — all linger for months or years. If a detection seems wrong on a site you know switched stacks, check the Wayback Machine to see when the change happened; the lingering markers typically fade over 6–18 months as the site is rebuilt.

Can a site lie to detection? Yes. Tech-stack detection reads public signals (HTML, headers, cookies, script src URLs) and any of those can be added or removed. Julien Verneaut demonstrated that Wappalyzer can be tricked into reporting 1,929 technologies on a single page by stuffing it with matchable scripts and cookies. Practical implication: treat tech-stack data as a signal, not gospel. For high-stakes calls (a million-dollar account, a security audit), cross-reference with another source before committing.

Why does the result sometimes show a strange category label inside breakdown? The upstream Wappalyzer database has some legacy and overlapping categories (you'll occasionally see "Captchas" tagged on unrelated tech). We pass most of those through as-is. For the handful of technologies the upstream consistently mislabels (Amazon S3 tagged as CDN, styled-components tagged as a JS framework, Leadfeeder tagged as analytics, …), we apply a small relabel, so Amazon S3 reads "Object storage", styled-components reads "CSS-in-JS", Leadfeeder reads "B2B retargeting", etc. That keeps breakdown[].categories consistent with the top-level cdn / analytics / marketing / framework arrays.

Do you provide contact data? No. This Actor returns the tech stack. Pipe the URL into the Email Scraper to pull addresses off the same domains, or pair with the Apollo Scraper, Contact Info Scraper, or Clay's email-finder for decision-maker enrichment.

How is this different from running Wappalyzer's open-source rules myself? Three things. (1) Fallbacks to actually get the page (cookie-accept walls, redirects, sites that show browsers a wall and search-engine crawlers the real page), all handled without a full headless browser. (2) An extra detection layer for modern frameworks (Next.js, Nuxt, React, Vue, Svelte, Astro, Remix, Gatsby) that fires when the strict Wappalyzer rules don't. (3) A clean-up pass on the categorisation: Amazon S3 doesn't end up in cdn (it's object storage), Leadfeeder ends up in marketing instead of analytics (it's a B2B retargeting tool), the generic "Cart Functionality" entry doesn't steal the CMS slot from Shopify, and so on.

What if I have a CSV of domains, not a JSON file? Open the Actor in the Apify Console, paste the column directly into the Start URLs or domains box (one per line). No JSON formatting needed.

What format do my URLs need to be in? Anything reasonable. Bare domain (example.com), www.example.com, https://example.com, deep URLs (https://example.com/blog/post); we strip everything to the homepage and lowercase the host. www. is dropped so www.x.com and x.com collapse into one row. Other subdomains are kept as separate rows, so pass them explicitly if you want them checked (e.g. docs.example.com).

What if a detection is wrong? Open an issue on the Apify Console Issues tab with the URL and what you expected to see. Both false positives and false negatives are useful: they help us tune the curation rules.

Is scraping public site metadata legal? Detecting the platform from public HTML is generally permitted: you're reading already-published metadata, the same data Wappalyzer's browser extension reads. You remain responsible for following each target site's Terms of Service and applicable law.

Support

Open an issue on this Actor's Issues tab on the Apify Console. Include the URL, the field that's wrong, and what you expected. Detection bugs are usually fixable by adding a fingerprint or a curation override; those land in the next build.
