Website Tech Stack Detector — Technographics by Domain

Deprecated

Pricing

from $8.50 / 1,000 domains profiled

Detect the technologies a website runs — CMS, ecommerce platform, analytics, tag managers, JS frameworks, CDN, payment, and marketing tools. Give a list of company domains; get a normalized JSON tech profile per site for B2B sales targeting and competitive technographic research.

Rating

0.0

(0)

Developer

Scott Helvick

Maintained by Community

Actor stats

Bookmarked: 0
Total users: 2
Monthly active users: 1
Last modified: 2 hours ago

Find out what technology any website runs. Give this Actor a list of company domains and it returns a normalized JSON technology profile per site — the CMS, ecommerce platform, analytics and tag managers, JavaScript frameworks, CDN, payment providers, marketing and support tools, and more — each with the evidence that identified it. Built for B2B sales targeting, competitive research, and technographic datasets, and callable directly by an AI agent.

What this does

  • Takes a list of domains or URLs (1–100 per run) and profiles each site's homepage.
  • Detects technologies across ~20 categories, including CMS (WordPress, Drupal, Webflow, Wix, …), ecommerce platform (Shopify, WooCommerce, Magento, BigCommerce, …), analytics (Google Analytics, Hotjar, Segment, Mixpanel, …), tag managers, JavaScript frameworks (React, Next.js, Vue, Angular, …), CDN / hosting, web server / language, marketing automation (HubSpot, Marketo, Klaviyo, …), customer support / chat, advertising pixels, A/B testing, payment (Stripe, PayPal, Braintree, …), cookie consent, search, and video.
  • Returns each detected technology with a category, a confidence level (high for a specific signature such as a script URL or generator tag; medium for a generic HTML pattern), and the concrete evidence that matched.
  • Groups results by category and gives a per-site technology count for fast scanning.
  • Profiles are deterministic — the same page yields the same result every time. No model guesses what a site "probably" runs.

Use it to:

  • Build B2B prospect lists filtered by technology (e.g. "every site running Shopify and Klaviyo").
  • Score sales leads by the tools they already use.
  • Run competitive research on what a set of competitors' sites are built with.
  • Track migrations — re-run a domain list over time to see platform changes.
  • Feed an AI sales-research agent structured technographic data without maintaining your own signature database.
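
The first use case above, filtering prospects by technology, might look like this once the dataset items are in hand. It is a minimal sketch that assumes the record shape documented in the Output section; `sites_running` and the sample records are hypothetical and not part of the Actor.

```python
from typing import Iterable


def sites_running(records: Iterable[dict], required: set[str]) -> list[str]:
    """Return identifiers of completed records that report every technology in `required`."""
    matches = []
    for record in records:
        if record.get("status") != "completed":
            continue  # failed records carry an empty technologies list
        found = {tech["name"] for tech in record.get("technologies") or []}
        if required <= found:  # subset test: all required names were detected
            matches.append(record["identifier"])
    return matches


# Hypothetical sample data in the documented record shape.
sample = [
    {"identifier": "a.example", "status": "completed",
     "technologies": [{"name": "Shopify"}, {"name": "Klaviyo"}]},
    {"identifier": "b.example", "status": "completed",
     "technologies": [{"name": "Shopify"}]},
    {"identifier": "c.example", "status": "failed", "technologies": []},
]
print(sites_running(sample, {"Shopify", "Klaviyo"}))  # ['a.example']
```

Matching on the `name` field keeps the filter independent of how categories are grouped.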

Why deterministic, signature-based detection matters

Technographic data drives outreach and spend decisions, so a wrong answer is worse than no answer. Every technology this Actor reports is a verbatim signature match against the page's own markup or response (a script URL, a generator meta tag, a framework marker, a response header), and the matching evidence travels with each result so you can audit it. There is no language model in the detection path inventing a plausible-but-wrong stack. A useful side effect: because the work is a fetch plus pattern matching, with no model inference to pay for, the cost per lookup stays low and the price reflects the lookup, not a model call.
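
To make the idea concrete, signature-based detection can be sketched as a fixed list of patterns run over the fetched HTML, with the matched text kept as evidence. This is only an illustration: the two signatures, the `Signature` type, and `detect` are hypothetical, and the Actor's real signature set and matching logic are not shown in this listing.

```python
import re
from typing import NamedTuple


class Signature(NamedTuple):
    name: str
    category: str
    confidence: str  # "high" or "medium"
    pattern: re.Pattern


# Hypothetical signatures for illustration only.
SIGNATURES = [
    Signature("Shopify", "Ecommerce", "high",
              re.compile(r"https://cdn\.shopify\.com/[^\s\"'<>]*")),
    Signature("WordPress", "CMS", "high",
              re.compile(r"<meta[^>]+name=[\"']generator[\"'][^>]+content=[\"']WordPress[^\"']*[\"']",
                         re.IGNORECASE)),
]


def detect(html: str) -> list[dict]:
    """Return one result per matching signature, carrying the matched text as evidence."""
    results = []
    for sig in SIGNATURES:
        match = sig.pattern.search(html)
        if match:
            results.append({
                "name": sig.name,
                "category": sig.category,
                "confidence": sig.confidence,
                "evidence": match.group(0),
            })
    return results
```

Because the output depends only on the input string and a fixed pattern list, the same page always yields the same profile, which is the property described above.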

How it compares to the alternatives

| Approach | Normalized categories | Evidence per match | Bulk by domain | Agent-callable |
| --- | --- | --- | --- | --- |
| Subscription technology-lookup services | yes | rarely | yes | via their own API/plan |
| Roll-your-own page parsing | you build it | you build it | you build it | you build it |
| Website Tech Stack Detector | yes | yes | yes (1–100/run) | yes |

The honest framing: you can parse pages and maintain a signature set yourself — this Actor is for when you'd rather not own that, and want a stable JSON contract you can point an agent or a pipeline at. Subscription technology databases are the alternative when you need a multi-year history or a firmographic overlay and don't mind a per-seat plan; this Actor is the better fit for on-demand, pay-per-domain lookups wired into your own workflow.

Input

| Field | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| domains | array of strings | yes | | Company domains or full URLs to profile (1–100). Accepts example.com, www.example.com, or https://example.com/path; each is normalized to an https homepage fetch. |
| deepRender | boolean | no | false | Force a full JavaScript render on every domain to catch technologies injected by client-side scripts. Off by default: raw HTML already contains the loader tags for the large majority of technologies, and the fetch escalates to a render automatically when a site blocks the plain request. |

One dataset record is produced per input domain.
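
If you want to pre-check a list before submitting it, the normalization and the 1–100 limit described above can be approximated on the client side. This is a sketch under assumptions: `normalize_to_homepage` and `validate_input` are hypothetical helpers, and details such as keeping the `www.` prefix or rejecting hosts without a dot are guesses, not the Actor's documented behavior.

```python
from typing import Optional
from urllib.parse import urlsplit

MAX_DOMAINS = 100  # per-run limit stated in the Input table


def normalize_to_homepage(entry: str) -> Optional[str]:
    """Turn a bare domain or full URL into an https homepage URL, or None if no hostname is usable."""
    text = entry.strip()
    if not text:
        return None
    if "://" not in text:
        text = "https://" + text  # urlsplit needs a scheme to recognize the host part
    host = urlsplit(text).hostname
    if not host or "." not in host:  # assumption: single-label hosts are not valid company domains
        return None
    return f"https://{host}/"


def validate_input(domains: list[str]) -> None:
    """Raise ValueError when the list is empty or longer than the documented limit."""
    if not 1 <= len(domains) <= MAX_DOMAINS:
        raise ValueError(f"expected 1-{MAX_DOMAINS} domains, got {len(domains)}")
```

For example, `normalize_to_homepage("https://example.com/path")` gives `https://example.com/`, and an entry with no hostname gives `None`.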

Output

One record per domain. Nullable fields are null on a failed record.

{
  "identifier": "shopify.com",
  "status": "completed",
  "url": "https://www.shopify.com/",
  "domain": "shopify.com",
  "technologies": [
    {
      "name": "Shopify",
      "category": "Ecommerce",
      "confidence": "high",
      "evidence": "URL: https://cdn.shopify.com/s/..."
    },
    {
      "name": "Google Analytics",
      "category": "Analytics",
      "confidence": "high",
      "evidence": "URL: https://www.googletagmanager.com/gtag/js?id=G-..."
    }
  ],
  "categories": {
    "Ecommerce": ["Shopify"],
    "Analytics": ["Google Analytics"]
  },
  "technologyCount": 2,
  "realizedTier": "basic",
  "error": null,
  "notice": "Technologies are inferred from publicly served page markup via signature matching; detection is best-effort and provided as-is, not a guarantee of what a site runs."
}

A failed record carries status: "failed", an error tag (e.g. fetch-failed, empty-response, invalid-domain), and an empty technologies list.
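
In the sample record, `categories` and `technologyCount` look like they are derived from the `technologies` list. The sketch below reproduces that grouping on the client side, which can be handy after filtering technologies yourself. `group_by_category` is a hypothetical helper, and the assumption that names keep their order from `technologies` is mine.

```python
from collections import defaultdict


def group_by_category(technologies: list[dict]) -> dict[str, list[str]]:
    """Group technology names under their category, keeping first-seen order."""
    grouped = defaultdict(list)
    for tech in technologies:
        grouped[tech["category"]].append(tech["name"])
    return dict(grouped)


# Technologies from the sample record above (evidence omitted for brevity).
techs = [
    {"name": "Shopify", "category": "Ecommerce", "confidence": "high"},
    {"name": "Google Analytics", "category": "Analytics", "confidence": "high"},
]
categories = group_by_category(techs)
technology_count = len(techs)
```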

Example

Profile three companies' sites:

Input:

{
  "domains": ["shopify.com", "stripe.com", "wordpress.org"]
}

With curl:

curl -X POST "https://api.apify.com/v2/acts/shelvick~website-tech-stack-detector/run-sync-get-dataset-items?token=YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"domains":["shopify.com","stripe.com","wordpress.org"]}'

With the Python client:

from apify_client import ApifyClient

client = ApifyClient("YOUR_TOKEN")
run = client.actor("shelvick/website-tech-stack-detector").call(
    run_input={"domains": ["shopify.com", "stripe.com", "wordpress.org"]}
)
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item["identifier"], item["technologyCount"], list((item.get("categories") or {}).keys()))

Calling from an AI agent

Apify MCP server (mcp.apify.com): the Actor is exposed as a callable tool whose input schema is self-documenting, so an LLM can construct a valid call from the tool description alone (domains in; technology profiles out). Pay per call via x402 (USDC on Base) or Skyfire managed tokens.

Apify SDK (Python): from apify_client import ApifyClient, then client.actor("shelvick/website-tech-stack-detector").call(run_input=...) and iterate the dataset (see above).

REST API: POST /v2/acts/shelvick~website-tech-stack-detector/run-sync-get-dataset-items?token=... for synchronous runs; use the async /runs endpoint for large domain lists that may exceed the 5-minute sync window.
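
For larger lists, the asynchronous path can also be driven from the Python client instead of raw REST calls. The sketch below assumes `client` is an `apify_client.ApifyClient` instance and uses its `start`, `wait_for_finish`, and `iterate_items` methods as I understand them from the client library; `run_async` itself is a hypothetical wrapper. The client is passed in so the function stays free of credentials.

```python
ACTOR_ID = "shelvick/website-tech-stack-detector"


def run_async(client, domains: list[str], actor_id: str = ACTOR_ID) -> list[dict]:
    """Start a run without blocking on the sync window, wait for it, and return its dataset items."""
    # start() returns as soon as the run is created, unlike call().
    run = client.actor(actor_id).start(run_input={"domains": domains})
    # Poll until the run reaches a terminal state.
    finished = client.run(run["id"]).wait_for_finish()
    status = finished.get("status") if finished else None
    if status != "SUCCEEDED":
        raise RuntimeError(f"run did not succeed (status: {status})")
    return list(client.dataset(finished["defaultDatasetId"]).iterate_items())
```

Usage would be along the lines of `run_async(ApifyClient("YOUR_TOKEN"), domains)` after `from apify_client import ApifyClient`.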

Pricing

Pay-per-event, billed only on success: one charge per domain that is fetched and analyzed, after its record is pushed to the dataset. Domains that fail to fetch — or that are too heavily bot-walled to reach — are free. Because billing is per domain, your domain-list length is your spend cap.

See the Pricing tab on this Store page for the current per-domain rate and any active subscriber discounts.
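
As a rough illustration of the billing rule above, the charge for a run can be estimated from its records: only completed records count, and the list length bounds the worst case. The default rate of $8.50 per 1,000 domains is the "from" price shown on this listing and is used here only as a placeholder; `estimate_cost` and `spend_cap` are hypothetical helpers.

```python
def estimate_cost(records: list[dict], rate_per_1000: float = 8.50) -> float:
    """Estimate the charge in dollars: one billable event per completed record."""
    charged = sum(1 for record in records if record.get("status") == "completed")
    return charged * rate_per_1000 / 1000


def spend_cap(domain_count: int, rate_per_1000: float = 8.50) -> float:
    """Worst-case charge in dollars if every submitted domain is profiled successfully."""
    return domain_count * rate_per_1000 / 1000
```

With the placeholder rate, a full 100-domain run is capped at about $0.85, and failed domains only lower the total.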

Design notes: www.scotthelvick.com/tools/website-tech-stack-detector

Behavior

Run-level failures (rare) — input validation only: an empty domains list or more than 100 entries is rejected before any work.
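
Since a run with more than 100 entries is rejected outright, longer lists need to be split before submission. A minimal sketch follows; `batches` is a hypothetical helper, and dropping blank and duplicate entries is my own choice to avoid paying twice for the same domain, not something the Actor requires.

```python
def batches(domains: list[str], size: int = 100) -> list[list[str]]:
    """Split a domain list into runs of at most `size` entries, skipping blanks and exact duplicates."""
    if size < 1:
        raise ValueError("size must be at least 1")
    cleaned = (d.strip() for d in domains)
    unique = list(dict.fromkeys(d for d in cleaned if d))  # dict keys preserve first-seen order
    return [unique[i:i + size] for i in range(0, len(unique), size)]
```

An empty input yields no batches at all, which also avoids submitting the empty list that the Actor would reject.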

Per-domain outcomes (common) — each domain yields a record; failures are isolated and never charged:

  • invalid-domain — the input had no usable hostname.
  • fetch-failed — the site could not be reached (or is bot-walled beyond the rendered-fetch tier).
  • challenge-blocked — the fetch returned a bot-wall / CAPTCHA challenge page rather than the real site (only CDN/security markers were present), so it is reported as blocked instead of a misleading thin profile.
  • empty-response — the fetch returned no usable HTML.

A domain that is reached successfully but matches no known signature returns status: "completed" with an empty technologies list. That is a valid answer (none of the technologies in the signature set were found on the homepage), and it is charged like any successful profile.
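
When post-processing a run, it can help to tally these outcomes before deciding what to retry. The sketch below counts records by outcome, using the `status` and `error` fields described above; `summarize_outcomes` and the `completed-empty` and `unknown-error` labels are hypothetical names of my own.

```python
from collections import Counter


def summarize_outcomes(records: list[dict]) -> dict[str, int]:
    """Count records per outcome: completed, completed-empty, or the failure's error tag."""
    summary = Counter()
    for record in records:
        if record.get("status") == "completed":
            # A completed record with no technologies is still a valid, charged result.
            key = "completed" if record.get("technologies") else "completed-empty"
        else:
            key = record.get("error") or "unknown-error"
        summary[key] += 1
    return dict(summary)
```

A high `fetch-failed` or `challenge-blocked` count points at unreachable or bot-walled sites rather than at gaps in the signature set.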

Performance — one homepage fetch per domain, raw HTML first with an automatic escalation to a rendered fetch when a site needs it. Domains are processed concurrently, so a small list finishes in seconds; a 100-domain list runs longer and may need the async endpoint rather than the 5-minute sync window.

FAQ

Which technologies can it detect? Around 95 technologies across ~20 categories — CMS, ecommerce platforms, analytics, tag managers, JavaScript frameworks, CDN/hosting, web servers and languages, marketing automation, support/chat, ad pixels, A/B testing, payment, cookie consent, search, and video. The set favors the highest-signal, most common technologies and grows over time.

Why did a site I know uses tool X not show it? Some technologies are injected only after client-side scripts run; enable deepRender to force a full render. Others leave no detectable public signature, and a few sites are bot-walled beyond the rendered-fetch tier (those return a failed record and aren't charged).

What does "confidence" mean? high = a specific signature matched (a third-party script URL, a generator meta tag, a response header, or a cookie). medium = a generic HTML pattern matched. Every result includes the exact evidence so you can verify it.
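
If your workflow should act only on specific-signature matches, the confidence field can be used as a filter. A minimal sketch, assuming the record shape from the Output section; `high_confidence` is a hypothetical helper.

```python
def high_confidence(record: dict) -> list[dict]:
    """Return only the technologies whose confidence is "high", evidence included."""
    return [
        tech for tech in record.get("technologies") or []
        if tech.get("confidence") == "high"
    ]
```

Keeping the whole technology entry, rather than just the name, preserves the evidence string for later verification.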

Can I pass full URLs, not just domains? Yes — bare domains, www. hostnames, and full URLs are all accepted; each is normalized to an https homepage fetch.

Is this only the homepage? Yes. One homepage fetch per domain covers site-wide technologies (most tags load on every page) and keeps the cost predictable. Per-page crawling is out of scope.

What this doesn't do

  • No deep crawl. It profiles the homepage, not every page of a site.
  • No technology history or version timelines. It reports the current state, not when a site adopted or dropped a tool.
  • No firmographics or contact data. It returns the tech stack, not company size, revenue, or email addresses.
  • No market-share rankings. It profiles the domains you give it; it doesn't tell you how popular a technology is overall.
  • No authenticated or paywalled pages. Public homepage markup only.
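
Because the Actor reports only the current state, any history has to be kept on the caller's side, for instance by storing each run's records and comparing them. The sketch below diffs two records for the same domain from different runs; `diff_stack` is a hypothetical helper and compares technologies by name only.

```python
def diff_stack(before: dict, after: dict) -> dict[str, list[str]]:
    """Compare two records for one domain and list technologies that appeared or disappeared."""
    old = {tech["name"] for tech in before.get("technologies") or []}
    new = {tech["name"] for tech in after.get("technologies") or []}
    return {"added": sorted(new - old), "removed": sorted(old - new)}
```

It is worth comparing only records with status "completed" on both sides, since a failed record has an empty list and would otherwise look like every tool was dropped.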

For an aggregated competitive landscape of local businesses (counts, ratings, saturation) use a local-market analysis Actor instead. For turning a list of URLs into arbitrary structured fields against your own schema, use a structured web-extraction Actor. For fetching the raw page content itself in multiple formats, use an adaptive page-fetching Actor.