Web Company Intelligence (MCP)
Pricing
from $5.00 / 1,000 company cards
MCP-friendly Apify Actor returning a structured intelligence card for any company worldwide. Input: name or domain. Output: identity, web presence, tech stack fingerprint. For French-registry data (SIREN, BODACC), use fr-company-intelligence-mcp instead.
Developer
scrap_them_all
What does Web Company Intelligence do?
Pass a company name, domain, or URL and get back a single structured JSON object with the company's identity, web presence, and tech stack fingerprint (frameworks, CMS, hosting, CDN, analytics, payments, auth). Universal: works for any company worldwide. Designed to be called from an MCP agent loop (Claude, Cursor, GPT, custom agents) where input is minimal, output is small, and runs complete in under a second on most sites.
For French registry data (SIREN, BODACC, directors, finances), use the companion Actor fr-company-intelligence-mcp instead. The two are complementary: one returns official legal data, the other returns the company's web fingerprint.
Why use it from an AI agent?
- Sub-second runs for most sites (HTTP fetch + regex extraction, no browser).
- Universal input - the agent can pass Stripe, stripe.com, or https://stripe.com/pricing and get the same canonical card.
- Tight schema - agents don't waste tokens parsing tables.
- Tech intel out of the box - 50+ technologies detected (Next.js, WordPress, Shopify, Cloudflare, Stripe, HubSpot, Sentry, etc.) with category and confidence.
- Structured failure - even on Actor.fail, the run's OUTPUT key carries { error, message, query } so the agent can branch instead of parsing logs.
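The structured-failure point can be illustrated with a small branching helper. This is a minimal sketch, assuming a failed run's OUTPUT record contains a non-empty error field and a successful card does not; the function name interpret_output and the sample message text are hypothetical.

```python
from typing import Any


def interpret_output(output: dict[str, Any]) -> tuple[str, str]:
    """Classify an OUTPUT record as ('error', code) or ('ok', domain)."""
    error = output.get("error")
    if error:
        # Failure records carry { error, message, query } per the list above.
        return "error", str(error)
    return "ok", str(output.get("domain", ""))


# Illustrative records; the message text is made up for this example.
failed = {"error": "name_not_resolved", "message": "no homepage found", "query": "Acme Widgets"}
card = {"name": "Stripe", "domain": "stripe.com", "statusCode": 200}

print(interpret_output(failed))
print(interpret_output(card))
```

An agent can use the first tuple element to decide whether to answer from the card or to ask the user for a domain.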
Input
| Field | Type | Default | Description |
|---|---|---|---|
| query | string | - | Required. Company name, domain (anthropic.com), or URL |
| includeTechStack | boolean | true | Run the Wappalyzer-style fingerprint over headers + HTML |
| includeRawHtml | boolean | false | Add the first 4 KB of HTML to the output (debug only) |
{"query": "anthropic.com"}
{"query": "Stripe","includeTechStack": true}
Output
Single object, written to both the run's dataset (1 row) and the default
key-value store under the key OUTPUT:
{"name": "Stripe","domain": "stripe.com","homepageUrl": "https://stripe.com/","finalUrl": "https://stripe.com/","statusCode": 200,"title": "Stripe | Financial Infrastructure to Grow Your Revenue","description": "Stripe is a financial services platform...","ogTitle": "Stripe | Financial Infrastructure to Grow Your Revenue","ogDescription": "Stripe is a financial services platform...","ogImage": "https://images.stripeassets.com/.../Stripe.jpg","language": "en-us","country": "US","techStack": [{ "name": "Next.js", "category": "framework", "confidence": "high" },{ "name": "Nginx", "category": "webserver", "confidence": "high" }],"techStackCount": 2,"techByCategory": {"framework": ["Next.js"],"webserver": ["Nginx"]},"rawHtmlExcerpt": null,"sources": ["homepage", "tech-fingerprint"],"fetchElapsedMs": 47,"scrapedAt": "2026-05-08T13:33:38.280Z"}
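Reading the card back from the key-value store could look roughly like the sketch below. It assumes a client object that behaves like ApifyClient from the apify-client Python package (actor(...).call(run_input=...) and key_value_store(...).get_record(...)). The Actor ID is a guess built from the developer and Actor names and is not stated in this README; fetch_company_card is a hypothetical helper name.

```python
from typing import Any, Optional

# Assumption: guessed Actor ID, not confirmed by the README.
ACTOR_ID = "scrap_them_all/web-company-intelligence-mcp"


def fetch_company_card(client: Any, query: str, actor_id: str = ACTOR_ID) -> Optional[dict]:
    """Run the Actor and return the value stored under the OUTPUT key, or None."""
    # Start the run and wait for it to finish.
    run = client.actor(actor_id).call(run_input={"query": query})
    if run is None:
        return None
    # The card (or the structured error) lives in the run's default key-value store.
    store = client.key_value_store(run["defaultKeyValueStoreId"])
    record = store.get_record("OUTPUT")
    return record["value"] if record else None
```

With apify-client installed, a call such as fetch_company_card(ApifyClient("<APIFY_TOKEN>"), "stripe.com") would be the expected usage. Passing the client in as a parameter keeps the helper easy to exercise with a stub.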
Name resolution
When query is a brand name (not a domain), the Actor:
- Tries direct TLD guesses in order: .com, .io, .ai, .co, .net, .org, .app. This resolves the majority of well-known companies in one HTTP probe.
- Falls back to a DuckDuckGo HTML lookup when no TLD guess responds, with a hostname-overlap check to reject unrelated results.
If neither lands on a valid homepage, the Actor fails with
error: name_not_resolved and the agent can ask the user to provide the
domain directly.
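The two resolution steps can be sketched as pure functions. This is not the Actor's code: the slug rule (lowercase, letters and digits only) and the overlap rule (slug must appear in the result's hostname) are assumptions made for illustration, and the HTTP probing and DuckDuckGo request are left out.

```python
import re
from urllib.parse import urlparse

# TLD order as listed in the README.
TLD_ORDER = [".com", ".io", ".ai", ".co", ".net", ".org", ".app"]


def slugify(brand: str) -> str:
    """Lowercase the brand and keep only letters and digits (assumed rule)."""
    return re.sub(r"[^a-z0-9]", "", brand.lower())


def tld_candidates(brand: str) -> list[str]:
    """Return homepage URLs to probe, in TLD order."""
    slug = slugify(brand)
    if not slug:
        return []
    return [f"https://{slug}{tld}/" for tld in TLD_ORDER]


def hostname_overlaps(brand: str, url: str) -> bool:
    """Accept a search result only if the brand slug appears in its hostname."""
    host = (urlparse(url).hostname or "").lower().replace("-", "")
    slug = slugify(brand)
    return bool(slug) and slug in host


print(tld_candidates("Stripe")[:2])
print(hostname_overlaps("Stripe", "https://en.wikipedia.org/wiki/Stripe"))
```

A caller would probe each candidate in turn, stop at the first valid homepage, and only then fall back to filtering search results with hostname_overlaps.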
Pricing (PPE)
| Mode | Price per call |
|---|---|
| Default (domain or URL input) | $0.006 |
| Name resolution (brand name input) | $0.006 |
| includeRawHtml: true | $0.006 |
Pricing is per call, billed via Apify Pay-Per-Event (events apify-actor-start and apify-default-dataset-item). No BYOK (bring your own key) is required - the Actor only hits public homepages.
Compare: Apollo and ZoomInfo charge $0.50-$1+ per company lookup. This Actor delivers identity + tech stack at a fraction of that price, designed for high-volume agent loops.
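As a rough worked example of the comparison, the sketch below multiplies call volume by the per-call price. It assumes the table's $0.006 is the all-in price per call and uses $0.50, the low end of the quoted range, for the alternative; estimate_cost is a hypothetical helper.

```python
PRICE_PER_CALL_USD = 0.006        # per-call price from the pricing table
COMPARISON_PER_LOOKUP_USD = 0.50  # low end of the quoted $0.50-$1+ range


def estimate_cost(calls: int, price_per_call: float = PRICE_PER_CALL_USD) -> float:
    """Return the total cost in USD, rounded to cents."""
    return round(calls * price_per_call, 2)


calls = 10_000
print(estimate_cost(calls))                             # 60.0
print(estimate_cost(calls, COMPARISON_PER_LOOKUP_USD))  # 5000.0
```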
Tech stack coverage
50+ technologies across these categories:
- Framework: Next.js, Nuxt, Remix, Astro, SvelteKit, Gatsby, React, Vue, Angular, Express
- CMS / E-commerce: WordPress, Shopify, Wix, Squarespace, Webflow, Drupal, Ghost
- Hosting / CDN: Vercel, Netlify, Cloudflare, Fastly, AWS CloudFront, Akamai, GitHub Pages
- Web server: Nginx, Apache, Caddy
- Languages: PHP, ASP.NET, Express
- Analytics: Google Analytics, GTM, Plausible, Mixpanel, Segment, Amplitude, Hotjar, PostHog
- CRM / Live chat: HubSpot, Salesforce, Intercom, Zendesk, Drift, Crisp
- Payments: Stripe, PayPal
- Auth: Auth0, Clerk, Firebase Auth
- A/B testing: Optimizely
- Monitoring: Sentry, Datadog RUM, New Relic
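Detection is described as regex extraction over response headers and static HTML. A minimal sketch of that idea follows, with a handful of hypothetical rules. The Actor's real rule set and confidence scoring are not published in this README, and the category labels cdn and cms are assumptions (only framework and webserver appear in the sample output).

```python
import re

# Hypothetical rules: (name, category, target, pattern).
# target is "html" or "header:<lowercase header name>".
RULES = [
    ("Nginx", "webserver", "header:server", re.compile(r"nginx", re.I)),
    ("Cloudflare", "cdn", "header:server", re.compile(r"cloudflare", re.I)),
    ("Next.js", "framework", "html", re.compile(r"/_next/static/")),
    ("WordPress", "cms", "html", re.compile(r"/wp-content/")),
]


def fingerprint(headers: dict[str, str], html: str) -> list[dict[str, str]]:
    """Return the technologies whose pattern matches the headers or HTML."""
    lowered = {key.lower(): value for key, value in headers.items()}
    hits = []
    for name, category, target, pattern in RULES:
        if target == "html":
            haystack = html
        else:
            # Look up the header named after "header:".
            haystack = lowered.get(target.split(":", 1)[1], "")
        if pattern.search(haystack):
            hits.append({"name": name, "category": category})
    return hits


sample_html = '<script src="/_next/static/chunks/main.js"></script>'
print(fingerprint({"Server": "nginx/1.25"}, sample_html))
```

Because nothing here executes JavaScript, a rule can only match markers present in the initial response, which is the limitation noted under Limits.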
Calling from an MCP agent
The Apify MCP server (mcp.apify.com) exposes this Actor as the tool
call-actor with name web-company-intelligence-mcp. Typical prompt:
"Use call-actor with web-company-intelligence-mcp to fetch the tech
stack of stripe.com." The agent receives the JSON above and can answer
in a single turn.
Limits
- Sites behind aggressive bot protection (Cloudflare Bot Management on certain plans, PerimeterX, DataDome) return 403 to the HTTP fetcher. The Actor still returns headers-only tech detection in that case.
- country is best-effort from og:locale / <html lang>. It does not reflect legal jurisdiction (use fr-company-intelligence-mcp for registered French entities).
- Tech detection is fingerprint-based, not behavioral - it can miss technologies that load only after user interaction.
Sources, freshness, legality
- All data is fetched from the company's own public homepage with a standard browser User-Agent. No login, no API key.
- Tech detection runs against headers + the static HTML markup. It does not execute JavaScript.
- This Actor exposes only data the company makes publicly available on their own website. Respect the target site's robots.txt if you scale this beyond ad-hoc agent calls.
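For callers who want to check robots.txt before queuing many lookups, here is a minimal sketch using Python's standard urllib.robotparser. It parses robots.txt text you have already downloaded; the helper name allowed_by_robots and the sample rules are illustrative, and this check is something the caller would run, not a documented feature of the Actor.

```python
from urllib.robotparser import RobotFileParser


def allowed_by_robots(robots_txt: str, url: str, user_agent: str = "*") -> bool:
    """Return True if the given robots.txt text permits fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Illustrative robots.txt content.
sample = "User-agent: *\nDisallow: /private/\n"

print(allowed_by_robots(sample, "https://example.com/"))
print(allowed_by_robots(sample, "https://example.com/private/report"))
```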