Company Profile Enrichment MCP — Domain to Data
Pricing
from $10.00 / 1,000 results
Give an AI agent a domain and get a clean structured company profile back — description, contact emails, social links, and technology signals. Built MCP-first for agents, with a hard cost cap.
Developer
Hamza Tariq
Maintained by Community
Company Profile Enrichment MCP — domain to structured company data
Quick Start: click Start with the default input. It enriches a demo domain and writes one dense structured company profile with zero changes. Then drop your own domains into the domains field and run again.
Give it a company domain, get back a dense, clean structured company profile: the company
name (and legal name), a normalized industry and a one-line AI summary of what they do,
concrete product names, the company's own canonical social links, a logo, a rich
technology stack, a structured location (country / region / city / address), founding
year, company type, and a generic contact email/phone where the site exposes them. Built MCP-first
for AI agents — an agent can call "enrich [domain]" and get JSON back in one step — but it works
just as well as a plain company-enrichment API, scraper, extractor, or data feed
you call from a script, a sheet, or a CRM.
It reads the homepage plus the /contact and /about pages and prefers schema.org JSON-LD
structured data (the Organization block most modern sites publish) over loose meta tags — so
socials, address, phone, founding year, and ticker come from the most reliable source on the page.
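The JSON-LD-first preference described above can be sketched in a few lines of Python. This is only an illustration under assumptions, not the actor's actual code: the helper names (`extract_basics`, `_first_organization`) and the exact fallback order are hypothetical, and only the standard library is used.

```python
import json
from html.parser import HTMLParser

ORG_TYPES = {"Organization", "Corporation", "LocalBusiness"}  # schema.org types


class _PageCollector(HTMLParser):
    """Collects JSON-LD script bodies and <meta> name/property values."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "script" and (attrs.get("type") or "").lower() == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []
        elif tag == "meta":
            key = attrs.get("property") or attrs.get("name")
            if key and attrs.get("content"):
                self.meta[key.lower()] = attrs["content"]

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append("".join(self._buffer))
            self._in_jsonld = False


def _first_organization(blocks):
    """Return the first Organization-like JSON-LD node, or None."""
    for raw in blocks:
        try:
            doc = json.loads(raw)
        except json.JSONDecodeError:
            continue  # malformed block: skip it rather than fail the page
        if isinstance(doc, dict):
            nodes = doc.get("@graph", [doc])
        elif isinstance(doc, list):
            nodes = doc
        else:
            continue
        for node in nodes:
            if not isinstance(node, dict):
                continue
            types = node.get("@type") or []
            types = [types] if isinstance(types, str) else types
            if ORG_TYPES.intersection(types):
                return node
    return None


def extract_basics(html):
    """Prefer JSON-LD values, fall back to meta tags (hypothetical helper)."""
    page = _PageCollector()
    page.feed(html)
    org = _first_organization(page.blocks) or {}
    same_as = org.get("sameAs") or []
    if isinstance(same_as, str):
        same_as = [same_as]
    return {
        "name": org.get("name") or page.meta.get("og:site_name"),
        "description": org.get("description")
        or page.meta.get("og:description")
        or page.meta.get("description"),
        "foundedYear": str(org.get("foundingDate") or "")[:4] or None,
        "sameAs": same_as,
    }
```

Fields missing from both sources stay `None`, which mirrors the "when in doubt the field stays null" behavior the README describes.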
Natural ways people search for this: company enrichment API, domain-to-company-data, MCP server for AI agents, RAG enrichment, sales enrichment, website data extractor.
What you get per company
Deterministic, website-sourced (present on most marketing homepages):
- name, legalName, description: from JSON-LD first, then <title> / meta / og:* tags
- socials: the company's own Twitter/X, LinkedIn, Facebook, Instagram, YouTube, GitHub profile/channel links. Recovered from JSON-LD sameAs and page links, canonical pages only: LinkedIn keeps /company/, /school/, /showcase/ and rejects personal /in/ profiles; tracking redirect wrappers (e.g. Facebook l.php?u=) are decoded to the real profile; video/post/feature deep links are dropped. When in doubt the field stays null.
- techStack: a Wappalyzer-style detected stack (~45 fingerprints across analytics, CMS, frameworks, ecommerce, CDN/hosting, marketing/CRM, payments, fonts)
- logoUrl: the brand logo (JSON-LD logo or favicon / apple-touch-icon), distinct from imageUrl
- imageUrl: the company's og:image (marketing/hero image)
- location: { country, region, city, address } parsed from JSON-LD PostalAddress or an <address> block (each part null when absent)
- phone, email: a corporate/generic phone and one generic, trustworthy mailbox (info@, contact@, …) on the company's own domain; placeholder/demo addresses are dropped
- foundedYear, companyType, ticker: when the page states them
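To make the social-link rules concrete, here is a minimal sketch of the LinkedIn case: unwrap a Facebook l.php?u= redirect, keep company-style paths, reject personal /in/ profiles, and drop deep links. The function names are hypothetical, and treating any extra path segment after the slug as a "deep link" is an assumption about how the actor draws that line.

```python
from urllib.parse import parse_qs, urlparse

LINKEDIN_KEEP_PREFIXES = ("/company/", "/school/", "/showcase/")


def _host_is(host, domain):
    """True when host is the domain itself or one of its subdomains."""
    host = host.lower()
    return host == domain or host.endswith("." + domain)


def unwrap_redirect(url):
    """Decode a Facebook l.php?u=<target> wrapper; other URLs pass through."""
    parsed = urlparse(url)
    if _host_is(parsed.netloc, "facebook.com") and parsed.path == "/l.php":
        target = parse_qs(parsed.query).get("u")
        if target:
            return target[0]  # parse_qs already percent-decodes the value
    return url


def canonical_linkedin(url):
    """Return a canonical company-style LinkedIn URL, or None when in doubt."""
    parsed = urlparse(unwrap_redirect(url))
    if not _host_is(parsed.netloc, "linkedin.com"):
        return None
    for prefix in LINKEDIN_KEEP_PREFIXES:
        if parsed.path.startswith(prefix):
            segments = [s for s in parsed.path[len(prefix):].split("/") if s]
            # Assumption: exactly one segment (the slug) is a canonical page;
            # anything deeper (posts, videos, ...) is dropped.
            if len(segments) == 1:
                return f"https://www.linkedin.com{prefix}{segments[0]}"
            return None
    return None  # includes personal /in/ profiles
```

Returning `None` instead of a best guess keeps the output consistent with the rule that an uncertain field stays null.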
AI-enriched (optional — see the API key note below; null when the AI pass didn't run):
- industry / subIndustry: a normalized, common-industry label
- summary: a neutral 1–2 sentence "what they do"
- products: concrete product/service names (≤10)
- tags: short topical keywords (≤8)
Always present:
- confidence: a transparent 0–1 score for how complete the (website-winnable) profile is
- aiEnriched: true only when the AI pass ran and returned fields
- status / error: "ok" for an enriched domain, or "failed" with a short reason ("HTTP 404", "timeout", …) when a domain couldn't be reached. One row is emitted per input domain, so a batch never silently under-delivers.
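Because every input domain yields exactly one row, a consumer can split a batch on status and verify that nothing went missing. The sketch below assumes each row carries a `domain` key alongside `status` and `error`; the helper name and the sample rows are illustrative only.

```python
def split_results(input_domains, rows):
    """Partition dataset rows into enriched profiles and failures.

    Returns (ok_rows, failures, missing), where failures maps a domain to
    its short error reason and missing lists input domains with no row.
    """
    ok_rows = [row for row in rows if row.get("status") == "ok"]
    failures = {
        row["domain"]: row.get("error")
        for row in rows
        if row.get("status") == "failed"
    }
    seen = {row["domain"] for row in rows}
    missing = [d for d in input_domains if d not in seen]
    return ok_rows, failures, missing


# Illustrative rows, not real actor output.
rows = [
    {"domain": "example.com", "status": "ok", "confidence": 0.8, "aiEnriched": False},
    {"domain": "example.org", "status": "failed", "error": "HTTP 404"},
]
ok_rows, failures, missing = split_results(["example.com", "example.org"], rows)
```

An empty `missing` list is the property the README promises; a non-empty one would indicate a problem worth reporting.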
What this actor does NOT do (honest expectations)
It enriches strictly from a company's own website, so it deliberately does not return proprietary fields a website can't source: verified employee count, revenue, funding rounds, or per-person/personal emails. Those need paid B2B databases — guessing them would be hallucination. This actor nails the website-winnable fields densely instead.
The optional AI key
The AI fields (industry, subIndustry, products, tags, summary) are produced by a small,
cheap LLM via any OpenAI-compatible Chat Completions API. Set an AI_API_KEY secret to
turn them on — by default it calls OpenRouter (https://openrouter.ai/api/v1) with model
google/gemini-2.5-flash-lite. Point it anywhere by also setting AI_BASE_URL and
AI_MODEL (e.g. OpenAI https://api.openai.com/v1 + gpt-4.1-nano, or DeepSeek
https://api.deepseek.com + deepseek-chat). Without a key the actor still runs and returns
the full deterministic record — the AI fields simply stay null and aiEnriched is false.
The AI pass also degrades gracefully on any error or timeout, never failing a run.
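The configuration and fallback behavior can be pictured roughly as follows. The environment variable names and defaults come from the text above; everything else (the function names and the `call_llm` callable that stands in for the actual Chat Completions request) is a hypothetical sketch, not the actor's implementation.

```python
import os

DEFAULT_BASE_URL = "https://openrouter.ai/api/v1"
DEFAULT_MODEL = "google/gemini-2.5-flash-lite"
AI_FIELDS = ("industry", "subIndustry", "products", "tags", "summary")


def resolve_ai_config(env=None):
    """Return the AI settings, or None when no AI_API_KEY is set."""
    env = os.environ if env is None else env
    api_key = env.get("AI_API_KEY")
    if not api_key:
        return None
    return {
        "api_key": api_key,
        "base_url": env.get("AI_BASE_URL") or DEFAULT_BASE_URL,
        "model": env.get("AI_MODEL") or DEFAULT_MODEL,
    }


def apply_ai_pass(profile, config, call_llm):
    """Merge AI fields into a profile without ever raising.

    call_llm is a hypothetical callable (config, profile) -> dict of AI
    fields; any exception or unusable result leaves the AI fields null.
    """
    result = dict(profile)
    for field in AI_FIELDS:
        result[field] = None
    result["aiEnriched"] = False
    if config is None:
        return result  # no key: deterministic record only
    try:
        fields = call_llm(config, profile)
    except Exception:
        return result  # error or timeout: degrade, don't fail the run
    if isinstance(fields, dict):
        returned = {k: v for k, v in fields.items() if k in AI_FIELDS and v is not None}
        if returned:
            result.update(returned)
            result["aiEnriched"] = True
    return result
```

Passing the environment as a plain dict keeps the resolution logic easy to test without touching real secrets.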
Input
Runs zero-config. Optional fields:
- domains: one or more company domains to enrich (bare host or full URL). Empty = a demo run.
- maxItems: cap on companies per run.
- maxCostPerRunUsd: hard cost cap, default $5; the run stops before exceeding it.
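As an illustration of the input shape, the sketch below builds a run input from a mixed list of bare hosts and full URLs. The three field names come from the list above; the normalization itself (lowercasing, dropping the path, de-duplicating) is an assumption about what "bare host or full URL" implies, and the helper names are hypothetical.

```python
from urllib.parse import urlparse


def normalize_domain(value):
    """Reduce a bare host or a full URL to a lowercase hostname, or None."""
    value = value.strip()
    if not value:
        return None
    if "://" not in value:
        value = "//" + value  # lets urlparse treat a bare host as a netloc
    return urlparse(value).hostname or None


def build_input(domains, max_items=None, max_cost_usd=5.0):
    """Assemble the actor input; duplicates are dropped, order is kept."""
    hosts = []
    for raw in domains:
        host = normalize_domain(raw)
        if host and host not in hosts:
            hosts.append(host)
    run_input = {"domains": hosts, "maxCostPerRunUsd": max_cost_usd}
    if max_items is not None:
        run_input["maxItems"] = max_items
    return run_input


run_input = build_input(
    ["https://Example.com/about", "example.com", " example.org "],
    max_items=10,
)
```

The resulting dictionary can be pasted into the actor's input editor as JSON or passed to whichever client starts the run.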
Output
One canonical CompanyProfile record per domain (keyed by domain, so re-running updates the same
company). On repeat runs only profiles that changed are re-emitted — a company-profile monitor.
See .actor/dataset_schema.json.
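One way to picture the "only changed profiles are re-emitted" behavior is a per-domain fingerprint compared across runs. This is a guess at the general technique rather than the actor's documented mechanism: the hashing scheme, the `seen` store, and the function names are all assumptions.

```python
import hashlib
import json


def profile_fingerprint(profile):
    """Stable hash of a profile: key order and whitespace don't matter."""
    canonical = json.dumps(
        profile, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def changed_profiles(profiles, seen):
    """Return profiles whose fingerprint differs from the stored one.

    seen maps domain -> fingerprint from earlier runs and is updated in
    place, so it would need to be persisted between runs.
    """
    changed = []
    for profile in profiles:
        fingerprint = profile_fingerprint(profile)
        if seen.get(profile["domain"]) != fingerprint:
            seen[profile["domain"]] = fingerprint
            changed.append(profile)
    return changed
```

Keying the store by domain matches the record key described above, so a changed company overwrites its previous entry instead of creating a second one.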
Use as an MCP tool
Apify's MCP server exposes this actor to agents (Claude, Cursor) automatically. The agent intent
it matches: "enrich this company / give me structured data for [domain]".
Pricing
Pay-per-event: a small start fee plus a per-enrichment charge, with a hard per-run cost cap on by default so you are never surprised by the bill. Actively maintained — fixes within 24h.
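The arithmetic behind the cap is simple enough to sketch. The per-enrichment price of $0.01 follows from the listed "from $10.00 / 1,000 results" and the $5 cap is the stated default; the start fee of $0.05 is a made-up placeholder, since the README does not state it, and the function name is hypothetical.

```python
from decimal import Decimal


def max_enrichments(cap_usd, start_fee_usd, per_item_usd):
    """How many enrichments fit under the cap after the start fee.

    Values are passed as strings (or numbers) and converted to Decimal so
    that cent amounts are not distorted by binary floating point.
    """
    cap = Decimal(str(cap_usd))
    fee = Decimal(str(start_fee_usd))
    per_item = Decimal(str(per_item_usd))
    if per_item <= 0:
        raise ValueError("per_item_usd must be positive")
    budget = cap - fee
    if budget < 0:
        return 0
    return int(budget // per_item)  # floor: stop before exceeding the cap


# $5 cap, hypothetical $0.05 start fee, $0.01 per enrichment.
limit = max_enrichments("5", "0.05", "0.01")
```

Under those assumed numbers the run would stop after 495 enrichments, since a 496th would push the total past $5.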