Web Company Intelligence (MCP)

Pricing

from $5.00 / 1,000 company cards

MCP-friendly Apify Actor returning a structured intelligence card for any company worldwide. Input: name or domain. Output: identity, web presence, tech stack fingerprint. For French-registry data (SIREN, BODACC), use fr-company-intelligence-mcp instead.

Developer

scrap_them_all

Maintained by Community

What does Web Company Intelligence do?

Pass a company name, domain, or URL and get back a single structured JSON object with the company's identity, web presence, and tech stack fingerprint (frameworks, CMS, hosting, CDN, analytics, payments, auth). It works for any company worldwide. It is designed to be called from an MCP agent loop (Claude, Cursor, GPT, custom agents), where input is minimal, output is small, and runs complete in under a second on most sites.

For French registry data (SIREN, BODACC, dirigeants, finances), use the companion Actor fr-company-intelligence-mcp instead. The two are complementary: one returns official legal data, the other returns the company's web fingerprint.

Why use it from an AI agent?

  • Sub-second runs for most sites (HTTP fetch + regex extraction, no browser).
  • Universal input - the agent can pass Stripe, stripe.com, or https://stripe.com/pricing and get the same canonical card.
  • Tight schema - agents don't waste tokens parsing tables.
  • Tech intel out of the box - 50+ technologies detected (Next.js, WordPress, Shopify, Cloudflare, Stripe, HubSpot, Sentry, etc.) with category and confidence.
  • Structured failure - even on Actor.fail, the run's OUTPUT key carries { error, message, query } so the agent can branch instead of parsing logs.
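The structured-failure point can be illustrated with a small branching helper. This is a minimal sketch, not the Actor's own code: the function name `interpret_output` and the returned status labels are hypothetical, and treating a missing `error` key as success is an assumption. Only the `{ error, message, query }` shape and the `name_not_resolved` code come from this page.

```python
from typing import Any


def interpret_output(output: dict[str, Any]) -> tuple[str, str]:
    """Classify an OUTPUT record as a success card or a structured failure.

    Returns (status, detail). Assumption: a record without an "error" key
    is a normal company card.
    """
    error = output.get("error")
    if error is None:
        return "ok", output.get("domain", "")
    if error == "name_not_resolved":
        # The agent can ask the user for a domain instead of retrying blindly.
        return "ask_for_domain", f"Could not resolve {output.get('query')!r}"
    # Any other error code: surface the human-readable message.
    return "failed", str(output.get("message", error))
```

An agent loop could call this on the OUTPUT record after each run and pick its next step from the returned status, without reading run logs.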

Input

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| query | string | - | Required. Company name, domain (anthropic.com), or URL |
| includeTechStack | boolean | true | Run the Wappalyzer-style fingerprint over headers + HTML |
| includeRawHtml | boolean | false | Add the first 4 KB of HTML to the output (debug only) |
{
  "query": "anthropic.com"
}

{
  "query": "Stripe",
  "includeTechStack": true
}
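For callers that assemble the input programmatically, a small builder keeps the payload minimal. This is a sketch under assumptions: `build_input` is a hypothetical helper, and omitting flags that equal their documented defaults is a design choice of this example, not a requirement of the Actor.

```python
import json


def build_input(query: str, include_tech_stack: bool = True,
                include_raw_html: bool = False) -> dict:
    """Build the Actor input, sending only fields that differ from defaults."""
    query = query.strip()
    if not query:
        raise ValueError("query is required")
    run_input: dict = {"query": query}
    if not include_tech_stack:
        run_input["includeTechStack"] = False
    if include_raw_html:
        run_input["includeRawHtml"] = True
    return run_input


print(json.dumps(build_input("anthropic.com")))  # {"query": "anthropic.com"}
```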

Output

Single object, written to both the run's dataset (1 row) and the default key-value store under the key OUTPUT:

{
  "name": "Stripe",
  "domain": "stripe.com",
  "homepageUrl": "https://stripe.com/",
  "finalUrl": "https://stripe.com/",
  "statusCode": 200,
  "title": "Stripe | Financial Infrastructure to Grow Your Revenue",
  "description": "Stripe is a financial services platform...",
  "ogTitle": "Stripe | Financial Infrastructure to Grow Your Revenue",
  "ogDescription": "Stripe is a financial services platform...",
  "ogImage": "https://images.stripeassets.com/.../Stripe.jpg",
  "language": "en-us",
  "country": "US",
  "techStack": [
    { "name": "Next.js", "category": "framework", "confidence": "high" },
    { "name": "Nginx", "category": "webserver", "confidence": "high" }
  ],
  "techStackCount": 2,
  "techByCategory": {
    "framework": ["Next.js"],
    "webserver": ["Nginx"]
  },
  "rawHtmlExcerpt": null,
  "sources": ["homepage", "tech-fingerprint"],
  "fetchElapsedMs": 47,
  "scrapedAt": "2026-05-08T13:33:38.280Z"
}
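In the sample above, techStackCount and techByCategory are derivable from techStack. The sketch below shows that relationship and could serve as a sanity check on a received card. The helper names are hypothetical, and it assumes the card was produced with includeTechStack enabled.

```python
from collections import defaultdict
from typing import Any


def group_by_category(tech_stack: list[dict[str, Any]]) -> dict[str, list[str]]:
    """Group technology names by their category, preserving order."""
    grouped: dict[str, list[str]] = defaultdict(list)
    for tech in tech_stack:
        grouped[tech["category"]].append(tech["name"])
    return dict(grouped)


def check_card(card: dict[str, Any]) -> list[str]:
    """Return a list of inconsistencies found in a card (empty if none)."""
    problems = []
    stack = card.get("techStack") or []
    if card.get("techStackCount") != len(stack):
        problems.append("techStackCount does not match techStack length")
    if card.get("techByCategory") != group_by_category(stack):
        problems.append("techByCategory does not match techStack")
    return problems
```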

Name resolution

When query is a brand name (not a domain), the Actor:

  1. Tries direct TLD guesses in order: .com, .io, .ai, .co, .net, .org, .app. This resolves the majority of well-known companies in one HTTP probe.
  2. Falls back to a DuckDuckGo HTML lookup when no TLD guess responds, with a hostname-overlap check to reject unrelated results.

If neither lands on a valid homepage, the Actor fails with error: name_not_resolved and the agent can ask the user to provide the domain directly.
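The two-step resolution can be sketched as follows. This is an illustration under stated assumptions, not the Actor's implementation: the slug rule (lowercase, strip everything except letters and digits), the substring-based overlap check, and the injected `probe` and `search` callables are all hypothetical. Only the TLD order and the overall fallback sequence come from the description above.

```python
import re
from typing import Callable, Iterable, Optional

# TLD guesses in the order listed above.
TLD_ORDER = (".com", ".io", ".ai", ".co", ".net", ".org", ".app")


def slugify(brand: str) -> str:
    """Assumed slug rule: lowercase and keep only letters and digits."""
    return re.sub(r"[^a-z0-9]", "", brand.lower())


def tld_candidates(brand: str) -> list[str]:
    slug = slugify(brand)
    return [slug + tld for tld in TLD_ORDER] if slug else []


def hostname_overlaps(brand: str, hostname: str) -> bool:
    """Assumed overlap rule: the brand slug appears in a non-TLD label."""
    slug = slugify(brand)
    labels = hostname.lower().split(".")[:-1]  # drop the final (TLD) label
    return bool(slug) and any(slug in slugify(label) for label in labels)


def resolve(brand: str,
            probe: Callable[[str], bool],
            search: Callable[[str], Iterable[str]]) -> Optional[str]:
    """Try TLD guesses first, then search results that pass the overlap check."""
    for domain in tld_candidates(brand):
        if probe(domain):
            return domain
    for host in search(brand):
        if hostname_overlaps(brand, host):
            return host
    return None  # corresponds to the name_not_resolved failure
```

Passing `probe` and `search` as callables keeps the sketch free of network code; a real caller would plug in an HTTP check and a search lookup.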

Pricing (PPE)

| Mode | Price per call |
| --- | --- |
| Default (domain or URL input) | $0.006 |
| Name resolution (brand name input) | $0.006 |
| includeRawHtml: true | $0.006 |

Pricing is per call, billed via Apify Pay-Per-Event (apify-actor-start + apify-default-dataset-item). No BYOK required - the Actor only hits public homepages.

Compare: Apollo and ZoomInfo charge $0.50-$1+ per company lookup. This Actor delivers identity + tech stack at a fraction of that price, designed for high-volume agent loops.
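A quick way to budget an agent loop is to multiply the expected number of calls by the per-call price. The sketch below uses the $0.006 figure from the pricing table and, for comparison, the low end of the $0.50 range quoted above. The function name is hypothetical, and Decimal is used only to avoid floating-point rounding in currency arithmetic.

```python
from decimal import Decimal

PRICE_PER_CALL = Decimal("0.006")       # from the pricing table above
COMPARISON_PRICE = Decimal("0.50")      # low end of the quoted $0.50-$1+ range


def estimate_cost(calls: int, price_per_call: Decimal = PRICE_PER_CALL) -> Decimal:
    """Total cost in USD for a number of calls at a flat per-call price."""
    if calls < 0:
        raise ValueError("calls must be non-negative")
    return price_per_call * calls


print(f"${estimate_cost(1000):.2f}")                    # $6.00
print(f"${estimate_cost(1000, COMPARISON_PRICE):.2f}")  # $500.00
```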

Tech stack coverage

50+ technologies across these categories:

  • Framework: Next.js, Nuxt, Remix, Astro, SvelteKit, Gatsby, React, Vue, Angular, Express
  • CMS / E-commerce: WordPress, Shopify, Wix, Squarespace, Webflow, Drupal, Ghost
  • Hosting / CDN: Vercel, Netlify, Cloudflare, Fastly, AWS CloudFront, Akamai, GitHub Pages
  • Web server: Nginx, Apache, Caddy
  • Languages: PHP, ASP.NET, Express
  • Analytics: Google Analytics, GTM, Plausible, Mixpanel, Segment, Amplitude, Hotjar, PostHog
  • CRM / Live chat: HubSpot, Salesforce, Intercom, Zendesk, Drift, Crisp
  • Payments: Stripe, PayPal
  • Auth: Auth0, Clerk, Firebase Auth
  • A/B testing: Optimizely
  • Monitoring: Sentry, Datadog RUM, New Relic
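To make "fingerprint-based" concrete, here is a minimal sketch of regex matching over response headers and static HTML. The signature table is illustrative only: these five patterns are commonly seen markers chosen for this example, not the Actor's actual rule set, and the `cms` and `cdn` category labels are assumptions. The sketch also omits the confidence field, since this page does not say how confidence is assigned.

```python
import re

# (name, category, where to look, pattern). Illustrative signatures only.
SIGNATURES = [
    ("Next.js", "framework", "html", re.compile(r"/_next/static/")),
    ("WordPress", "cms", "html", re.compile(r"/wp-content/")),
    ("Shopify", "cms", "html", re.compile(r"cdn\.shopify\.com")),
    ("Nginx", "webserver", "header:server", re.compile(r"nginx", re.I)),
    ("Cloudflare", "cdn", "header:server", re.compile(r"cloudflare", re.I)),
]


def detect(headers: dict[str, str], html: str) -> list[dict[str, str]]:
    """Match each signature against the static HTML or a named header."""
    lowered = {k.lower(): v for k, v in headers.items()}  # header names are case-insensitive
    found = []
    for name, category, where, pattern in SIGNATURES:
        if where == "html":
            haystack = html
        else:
            haystack = lowered.get(where.split(":", 1)[1], "")
        if pattern.search(haystack):
            found.append({"name": name, "category": category})
    return found
```

Because matching runs on headers and markup only, a technology that is injected after user interaction would not appear in `html` and would be missed, which matches the limitation noted on this page.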

Calling from an MCP agent

The Apify MCP server (mcp.apify.com) exposes this Actor as the tool call-actor with name web-company-intelligence-mcp. Typical prompt: "Use call-actor with web-company-intelligence-mcp to fetch the tech stack of stripe.com." The agent receives the JSON above and can answer in a single turn.
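Outside an MCP loop, the same card can be read from the run's key-value store with the Apify Python client. The sketch below is hedged in two ways: the Actor ID is an assumption inferred from the developer and Actor names on this page, and the client is passed in as a parameter so the function can be exercised without network access. It relies on the client's `actor(...).call(run_input=...)` and `key_value_store(...).get_record(...)` methods.

```python
from typing import Any, Optional

# Assumed Actor ID, inferred from the developer and Actor names on this page.
ACTOR_ID = "scrap_them_all/web-company-intelligence-mcp"


def fetch_card(client: Any, query: str, actor_id: str = ACTOR_ID) -> Optional[dict]:
    """Run the Actor once and return the value stored under OUTPUT, or None."""
    run = client.actor(actor_id).call(run_input={"query": query})
    if run is None:
        return None
    store = client.key_value_store(run["defaultKeyValueStoreId"])
    record = store.get_record("OUTPUT")
    # On failure the same key is expected to hold { error, message, query }.
    return record["value"] if record else None


# Usage (requires `pip install apify-client` and an API token):
#   from apify_client import ApifyClient
#   card = fetch_card(ApifyClient("<YOUR_APIFY_TOKEN>"), "stripe.com")
```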

Limits

  • Sites behind aggressive bot protection (Cloudflare Bot Management on certain plans, PerimeterX, DataDome) return 403 to the HTTP fetcher. The Actor still returns headers-only tech detection in that case.
  • country is best-effort from og:locale / <html lang>. It does not reflect legal jurisdiction (use fr-company-intelligence-mcp for registered French entities).
  • Tech detection is fingerprint-based, not behavioral - it can miss technologies that load only after user interaction.

Sources, freshness, legality

  • All data is fetched from the company's own public homepage with a standard browser User-Agent. No login, no API key.
  • Tech detection runs against headers + the static HTML markup. It does not execute JavaScript.
  • This Actor exposes only data the company makes publicly available on their own website. Respect the target site's robots.txt if you scale this beyond ad-hoc agent calls.