Company Profile Enrichment MCP — Domain to Data

Pricing

from $10.00 / 1,000 results

Give an AI agent a domain and get a clean structured company profile back — description, contact emails, social links, and technology signals. Built MCP-first for agents, with a hard cost cap.

Developer

Hamza Tariq

Maintained by Community

Company Profile Enrichment MCP — domain to structured company data

Quick Start: click Start with the default input, no changes needed. It enriches a demo domain and writes one dense structured company profile. Then put your own domains into the domains field and run again.

Give it a company domain, get back a dense, clean structured company profile: the company name (and legal name), a normalized industry and a one-line AI summary of what they do, concrete product names, the company's own canonical social links, a logo, a rich technology stack, a structured location (country / region / city / address), founding year, company type, and a generic contact email/phone where the site exposes them. Built MCP-first for AI agents — an agent can call "enrich [domain]" and get JSON back in one step — but it works just as well as a plain company-enrichment API, scraper, extractor, or data feed you call from a script, a sheet, or a CRM.

It reads the homepage plus the /contact and /about pages and prefers schema.org JSON-LD structured data (the Organization block most modern sites publish) over loose meta tags — so socials, address, phone, founding year, and ticker come from the most reliable source on the page.
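The JSON-LD-first preference described above can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the actor's actual code; the names JsonLdCollector, find_organization, and pick_name are hypothetical, and the sketch only matches the plain Organization type (subtypes such as Corporation are ignored for brevity).

```python
import json
from html.parser import HTMLParser


class JsonLdCollector(HTMLParser):
    """Collect the raw text of every <script type="application/ld+json"> tag."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append("".join(self._buffer))
            self._in_jsonld = False


def _is_organization(type_value):
    # @type may be a single string or a list of strings.
    types = type_value if isinstance(type_value, list) else [type_value]
    return "Organization" in types


def find_organization(html):
    """Return the first schema.org Organization node found in the page, or None."""
    collector = JsonLdCollector()
    collector.feed(html)
    for raw in collector.blocks:
        try:
            data = json.loads(raw)
        except json.JSONDecodeError:
            continue  # malformed block: skip it rather than fail the page
        if isinstance(data, dict):
            candidates = data.get("@graph", [data])
        elif isinstance(data, list):
            candidates = data
        else:
            continue
        for node in candidates:
            if isinstance(node, dict) and _is_organization(node.get("@type")):
                return node
    return None


def pick_name(org, title_fallback):
    """Prefer the JSON-LD name; fall back to the <title>/meta value."""
    if org and org.get("name"):
        return org["name"]
    return title_fallback
```

The same fallback shape would apply to the other fields the paragraph lists (socials, address, phone, founding year, ticker): read the Organization node first and only consult meta tags when the node lacks the value.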

Natural ways people search for this: company enrichment API, domain-to-company-data, MCP server for AI agents, RAG enrichment, sales enrichment, website data extractor.

What you get per company

Deterministic, website-sourced (present on most marketing homepages):

  • name, legalName, description — from JSON-LD first, then <title> / meta / og:* tags
  • socials — the company's own Twitter/X, LinkedIn, Facebook, Instagram, YouTube, GitHub profile/channel links. Recovered from JSON-LD sameAs and page links, canonical pages only: LinkedIn keeps /company/, /school/, /showcase/ and rejects personal /in/ profiles; tracking redirect wrappers (e.g. Facebook l.php?u=) are decoded to the real profile; video/post/feature deep links are dropped. When in doubt the field stays null.
  • techStack — a Wappalyzer-style detected stack (~45 fingerprints across analytics, CMS, frameworks, ecommerce, CDN/hosting, marketing/CRM, payments, fonts)
  • logoUrl — the brand logo (JSON-LD logo or favicon / apple-touch-icon), distinct from imageUrl
  • imageUrl — the company's og:image (marketing/hero image)
  • location{ country, region, city, address } parsed from JSON-LD PostalAddress or an <address> block (each part null when absent)
  • phone, email — a corporate/generic phone and one generic, trustworthy mailbox (info@, contact@, …) on the company's own domain; placeholder/demo addresses are dropped
  • foundedYear, companyType, ticker — when the page states them
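To make the socials rules in the list above concrete, here is a hedged sketch of how the LinkedIn filter and the Facebook redirect unwrapping could work. It is an illustration under stated assumptions, not the actor's implementation: the function names and the exact canonical output format (https://www.linkedin.com/<kind>/<slug>) are hypothetical.

```python
from urllib.parse import parse_qs, urlparse

# Path prefixes the description says are kept for LinkedIn.
LINKEDIN_OK_PREFIXES = ("/company/", "/school/", "/showcase/")


def _host_matches(host, domain):
    host = host.lower()
    return host == domain or host.endswith("." + domain)


def unwrap_redirect(url):
    """Decode a Facebook l.php?u=<target> wrapper to the real target URL."""
    parsed = urlparse(url)
    if _host_matches(parsed.netloc, "facebook.com") and parsed.path == "/l.php":
        target = parse_qs(parsed.query).get("u")
        if target:
            return target[0]  # parse_qs already percent-decodes the value
    return url


def canonical_linkedin(url):
    """Return a canonical company/school/showcase URL, or None when in doubt."""
    parsed = urlparse(unwrap_redirect(url))
    if not _host_matches(parsed.netloc, "linkedin.com"):
        return None
    if not parsed.path.startswith(LINKEDIN_OK_PREFIXES):
        return None  # rejects personal /in/ profiles and anything else
    parts = [p for p in parsed.path.split("/") if p]
    if len(parts) < 2:
        return None  # prefix without a slug
    # Drop deep-link segments, query strings, and tracking parameters.
    return f"https://www.linkedin.com/{parts[0]}/{parts[1]}"
```

Returning None for anything ambiguous mirrors the stated policy that the field stays null when in doubt.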

AI-enriched (optional — see the API key note below; null when the AI pass didn't run):

  • industry / subIndustry — a normalized, common-industry label
  • summary — a neutral 1–2 sentence "what they do"
  • products — concrete product/service names (≤10)
  • tags — short topical keywords (≤8)

Always present:

  • confidence — a transparent 0–1 score for how complete the (website-winnable) profile is
  • aiEnriched: true only when the AI pass ran and returned fields
  • status / error: "ok" for an enriched domain, or "failed" with a short reason ("HTTP 404", "timeout", …) when a domain couldn't be reached. One row is emitted per input domain, so a batch never silently under-delivers.
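Because one row is emitted per input domain, a consumer can split a batch into enriched and failed rows and verify that nothing is missing. A minimal sketch, assuming each row is a dict carrying the domain, status, and error fields named above (check the dataset schema for the exact shape); partition_rows is a hypothetical helper.

```python
def partition_rows(rows, requested_domains):
    """Split dataset rows by status and report any requested domain with no row."""
    ok = [r for r in rows if r.get("status") == "ok"]
    failed = [r for r in rows if r.get("status") == "failed"]
    seen = {r.get("domain") for r in rows}
    missing = [d for d in requested_domains if d not in seen]
    return ok, failed, missing


# Usage with made-up sample rows:
rows = [
    {"domain": "example.com", "status": "ok", "confidence": 0.8},
    {"domain": "gone.example", "status": "failed", "error": "HTTP 404"},
]
ok, failed, missing = partition_rows(rows, ["example.com", "gone.example"])
for row in failed:
    print(f'{row["domain"]}: {row["error"]}')
```

If the one-row-per-domain guarantee holds, missing should always be empty; a non-empty list is a signal to inspect the run log.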

What this actor does NOT do (honest expectations)

It enriches strictly from a company's own website, so it deliberately does not return proprietary fields a website can't source: verified employee count, revenue, funding rounds, or per-person/personal emails. Those need paid B2B databases — guessing them would be hallucination. This actor nails the website-winnable fields densely instead.

The optional AI key

The AI fields (industry, subIndustry, products, tags, summary) are produced by a small, cheap LLM via any OpenAI-compatible Chat Completions API. Set an AI_API_KEY secret to turn them on — by default it calls OpenRouter (https://openrouter.ai/api/v1) with model google/gemini-2.5-flash-lite. Point it anywhere by also setting AI_BASE_URL and AI_MODEL (e.g. OpenAI https://api.openai.com/v1 + gpt-4.1-nano, or DeepSeek https://api.deepseek.com + deepseek-chat). Without a key the actor still runs and returns the full deterministic record — the AI fields simply stay null and aiEnriched is false. The AI pass also degrades gracefully on any error or timeout, never failing a run.
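The key and override behaviour described above amounts to a small configuration lookup. The sketch below shows one way to express it; it uses the three variable names and the two defaults stated in the paragraph, while resolve_ai_config itself and the returned dict shape are hypothetical and not taken from the actor's source.

```python
import os

# Defaults as stated in the paragraph above.
DEFAULT_BASE_URL = "https://openrouter.ai/api/v1"
DEFAULT_MODEL = "google/gemini-2.5-flash-lite"


def resolve_ai_config(env=None):
    """Return AI settings, or None when no key is set (AI pass is skipped)."""
    env = os.environ if env is None else env
    api_key = env.get("AI_API_KEY")
    if not api_key:
        return None  # deterministic fields only; aiEnriched stays false
    return {
        "api_key": api_key,
        "base_url": env.get("AI_BASE_URL") or DEFAULT_BASE_URL,
        "model": env.get("AI_MODEL") or DEFAULT_MODEL,
    }
```

For example, passing AI_BASE_URL=https://api.openai.com/v1 and AI_MODEL=gpt-4.1-nano alongside the key would route the AI pass to OpenAI instead of OpenRouter.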

Input

Runs zero-config. Optional fields:

  • domains — one or more company domains to enrich (bare host or full URL). Empty = a demo run.
  • maxItems — cap on companies per run.
  • maxCostPerRunUsd — hard cost cap, default $5; the run stops before exceeding it.
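Calling the actor from a script could look roughly like this, using the official apify-client package for Python. Treat it as a sketch: the actor ID is a placeholder you would copy from the store page, domains is assumed to be a list of strings, and build_run_input and enrich are hypothetical helper names.

```python
def build_run_input(domains, max_items=None, max_cost_usd=None):
    """Build the actor input; optional fields are omitted so defaults apply."""
    run_input = {"domains": list(domains)}
    if max_items is not None:
        run_input["maxItems"] = max_items
    if max_cost_usd is not None:
        run_input["maxCostPerRunUsd"] = max_cost_usd
    return run_input


def enrich(token, actor_id, domains, **limits):
    """Run the actor and return its dataset items (requires `pip install apify-client`)."""
    # Imported lazily so build_run_input works without the client installed.
    from apify_client import ApifyClient

    client = ApifyClient(token)
    run = client.actor(actor_id).call(run_input=build_run_input(domains, **limits))
    if run is None:
        return []
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


# Usage (placeholder actor ID and token):
# profiles = enrich("<APIFY_TOKEN>", "<username>/<actor-name>", ["example.com"], max_cost_usd=2)
```

Leaving maxCostPerRunUsd out keeps the default $5 cap described above.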

Output

One canonical CompanyProfile record per domain (keyed by domain, so re-running updates the same company). On repeat runs only profiles that changed are re-emitted — a company-profile monitor. See .actor/dataset_schema.json.
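How the actor detects a changed profile is not documented here. One common approach, shown purely as an assumption, is to keep a content fingerprint per domain and re-emit a profile only when its fingerprint differs from the stored one. The fingerprint and changed_profiles names are hypothetical.

```python
import hashlib
import json


def fingerprint(profile):
    """Stable hash of a profile: key order does not affect the result."""
    payload = json.dumps(profile, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def changed_profiles(profiles, previous):
    """Return (profiles that changed, updated domain -> fingerprint state)."""
    state = dict(previous)
    changed = []
    for profile in profiles:
        fp = fingerprint(profile)
        if state.get(profile["domain"]) != fp:
            changed.append(profile)
            state[profile["domain"]] = fp
    return changed, state


# First run: everything is new. Second run: only the edited profile comes back.
run1 = [{"domain": "example.com", "name": "Example"}]
changed, state = changed_profiles(run1, {})
run2 = [{"domain": "example.com", "name": "Example Inc"}]
changed_again, state = changed_profiles(run2, state)
```

A downstream consumer can use the same idea to upsert records by domain instead of appending duplicates.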

Use as an MCP tool

Apify's MCP server exposes this actor to agents (Claude, Cursor) automatically. The agent intent it matches: "enrich this company / give me structured data for [domain]".

Pricing

Pay-per-event: a small start fee plus a per-enrichment charge, with a hard per-run cost cap on by default so you are never surprised by the bill. Actively maintained — fixes within 24h.
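As a rough worked example of how a start fee, a per-enrichment charge, and the cost cap interact: the per-result price below is derived from the listed "from $10.00 / 1,000 results", while the start fee of $0.05 is a made-up figure for illustration only (the actual fee is on the store page). max_enrichments is a hypothetical helper.

```python
from decimal import Decimal


def max_enrichments(cap_usd, start_fee_usd, per_result_usd):
    """Largest number of enrichments that fits under the cap after the start fee."""
    budget = Decimal(cap_usd) - Decimal(start_fee_usd)
    if budget <= 0:
        return 0
    # Floor division on Decimal avoids binary floating-point rounding surprises.
    return int(budget // Decimal(per_result_usd))


# $10.00 per 1,000 results is $0.01 per result; the $0.05 start fee is hypothetical.
print(max_enrichments("5", "0.05", "0.01"))
```

Under these assumed numbers the default $5 cap leaves room for 495 enrichments in a single run; a different start fee or price tier changes the result accordingly.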