Company Tech-Stack & Domain Intelligence API
Pricing
from $4.00 / 1,000 company profile results
Domain to company tech stack, founded year, headcount, industry, funding, socials & logo. Open-data company enrichment API. A BuiltWith + Clearbit alternative.
Developer
Technical Dost Solutions
Maintained by Community
Turn any company domain into a complete intelligence profile in one API call: detected technology stack, founded year, headcount range, industry, funding signals, social links, logo, DNS and contact emails. This Apify Actor is a fast, stateless, developer-friendly company enrichment API built entirely on open and public data sources — no scraping of gated platforms, no login walls, no fragile cookies. Send a list of domains, get back clean structured JSON or CSV.
This Actor is the superset edition of the popular Company Firmographic & Logo API. It keeps every firmographic field you already rely on (company name, description, industry, logo, social profiles and emails) and layers on BuiltWith-style technology detection, domain-age / founded-year resolution via RDAP, headcount estimation, and funding-signal extraction. If you need one endpoint that answers "what is this company, what do they build with, how big are they, and how do I reach them?", this is it.
Table of contents
- What this company intelligence API does
- Why use this Actor for domain enrichment
- Key features
- Use cases
- How it works
- Input
- Output
- Output field reference
- Pricing
- How to use the Actor (step by step)
- Calling the Actor via API
- Integrations: Make, Zapier, n8n, sheets
- Data sources and methodology
- Accuracy, confidence and limitations
- Comparison with BuiltWith, Clearbit and Wappalyzer
- Frequently asked questions
- Legal and responsible use
- Changelog
- Support
What this company intelligence API does
The Company Tech-Stack & Domain Intelligence API accepts one or many company domains — for example stripe.com, notion.so, vercel.com — and returns a single, flat, easy-to-consume JSON record per domain. Each record is a company profile assembled in real time from the company's own public website, its public DNS records, and the public RDAP (Registration Data Access Protocol) registry that has replaced classic WHOIS.
For every domain you submit, the Actor attempts to resolve:
- Technology stack: the frameworks, CMS, e-commerce platform, analytics tools, marketing and CRM tools, payment processors, hosting / CDN providers, UI libraries and security tooling a site runs, detected from HTTP response headers, cookies, and HTML fingerprints (BuiltWith-style / Wappalyzer-style detection).
- Founded year: the year the company was founded, resolved from schema.org structured data, explicit "Founded in YYYY" / "Est. YYYY" copy, or, as a labelled fallback, the domain registration year from RDAP.
- Headcount range: an employee-count bucket (for example 51-200 or 1001-5000) derived from schema.org numberOfEmployees or explicit "X employees" / "team of X" copy on the site.
- Industry: a normalized industry label such as Fintech / Payments, SaaS / Software, Healthcare / Biotech or E-commerce / Retail.
- Funding signals: best-effort detection of funding stage (Seed, Series A–J, IPO, acquired, bootstrapped), total amount raised, and a Crunchbase profile link when the company surfaces these publicly.
- Social links: Twitter / X, LinkedIn, Facebook, Instagram, YouTube, GitHub, TikTok and Crunchbase profile URLs, de-duplicated and cleaned of share/intent links.
- Logo: a high-quality logo URL resolved from apple-touch-icon, Open Graph image or favicon, with free Clearbit and Google favicon fallbacks.
- DNS and email provider: A and MX records, plus an inferred email provider (Google Workspace, Microsoft 365, Zoho, Proton, Proofpoint).
- Contact emails: public contact emails discovered on the homepage and a /contact page, filtered to remove placeholders and asset filenames.
- Company name and description: the cleaned brand name and meta / schema.org description.
The result is a domain-to-company enrichment API that you can drop straight into a CRM enrichment pipeline, a lead-scoring workflow, a sales-intelligence dashboard, a competitive-analysis tool, or a technographic segmentation job.
Why use this Actor for domain enrichment
Most teams that need company data face the same trade-off. The big commercial providers — Clearbit (now part of HubSpot Breeze), ZoomInfo, Apollo, BuiltWith Pro — are powerful but expensive, often gated behind annual contracts, seat minimums and per-record pricing that punishes experimentation. Open-source libraries like Wappalyzer give you technology detection but nothing firmographic. Stitching together five different tools to answer one question ("tell me everything about this domain") is slow and brittle.
This Actor collapses that stack into one pay-as-you-go endpoint with no subscription, no seat license and no minimum commitment. You pay per result. You can enrich a single domain to test, then scale to fifty thousand domains in a batch when you are ready. Because it runs on the Apify platform, you get retries, proxy support, scheduling, webhooks, and a generated REST API for free — without managing any infrastructure.
Crucially, it is transparent about where every field comes from. Founded year is labelled with its source (schema.org, website-copy or domain-registration) so you always know whether you are looking at a verified fact or an estimate. That honesty is what makes the data safe to act on.
Key features
- ⚡ One call, full profile. Tech stack + firmographics + funding + socials + logo + DNS in a single structured record per domain.
- 🧱 BuiltWith-style technology detection. 70+ technology signatures across frameworks, CMS, e-commerce, payments, analytics, marketing/CRM, hosting/CDN, UI and security — detected from headers, cookies and HTML.
- 🗓️ Founded-year resolution. Schema.org foundingDate, on-page "Founded in YYYY" copy, and RDAP domain-registration fallback, each clearly labelled.
- 👥 Headcount estimation. Employee-count buckets from structured data or site copy.
- 💸 Funding-signal extraction. Stage, amount raised, and Crunchbase profile detection from public copy.
- 🔗 Clean social profiles. LinkedIn, Twitter / X, Facebook, Instagram, YouTube, GitHub, TikTok and Crunchbase, de-duplicated.
- 🖼️ Logo resolution. Brand-quality logo URL with free Clearbit + Google fallbacks (a drop-in Clearbit Logo API alternative).
- 📡 DNS + email-provider detection. A / MX records and inferred mail platform.
- 📧 Public email discovery. Homepage and contact-page emails with placeholder filtering.
- 🚀 Batch-ready. Enrich up to 50,000 domains per run with built-in concurrency.
- 🧩 Zero gated sources. 100% open and public data — the company's own site, public DNS, and the RDAP registry.
- 🛠️ Developer-first output. Flat JSON, CSV, Excel, or HTML table; webhooks; generated REST API.
Use cases
Sales intelligence and lead enrichment. Append firmographics and technographics to inbound leads the moment they hit your CRM. Know a prospect's tech stack, size and industry before the first call.
Technographic segmentation. Build target lists of companies using a specific technology — every prospect on Shopify, every site running HubSpot, every company on AWS — for account-based marketing and competitive displacement campaigns.
Lead scoring. Use headcount range, funding stage and detected tools as features in a lead-scoring model. A Series B fintech on a modern stack scores differently from a bootstrapped hobby site.
Competitive and market analysis. Profile an entire market segment — feed in 5,000 competitor and adjacent domains and get a structured map of who builds with what.
CRM data hygiene. Re-enrich stale CRM records: refresh logos, social links, descriptions and industries on a schedule.
Investor and deal sourcing. Surface funding signals and founded year across a watchlist of domains to flag growth-stage companies.
Recruiting and talent intelligence. Understand a target company's tech stack before sourcing engineers, and gauge company size from headcount range.
Cybersecurity and attack-surface mapping. Inventory the technologies and hosting providers exposed by a portfolio of domains.
Logo and brand assets. Use it purely as a logo API — pass a domain, get a logo URL — a free, keyless Clearbit Logo API alternative.
How it works
For each domain you submit, the Actor:
- Normalizes the domain (strips http(s)://, www., paths and query strings).
- Fetches the homepage over HTTPS with realistic headers, following redirects, with automatic retries on 429 and 5xx responses.
- In parallel, queries RDAP for the domain registration date and resolves DNS A and MX records.
- Parses the HTML and headers: extracts the <title>, meta description, Open Graph tags, and all application/ld+json (schema.org) structured-data blocks.
- Runs technology fingerprinting against the combined HTML + headers + cookies using 70+ signatures, grouped into categories.
- Resolves each firmographic field with a clear source-priority chain (structured data first, then site copy, then registry fallback).
- Extracts socials, logo and emails, with a one-shot /contact page probe if no emails appear on the homepage.
- Pushes one clean record to the dataset.
Everything is stateless: no data is stored between runs, and each domain is processed independently, which makes the Actor fast, predictable and easy to parallelize.
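The normalization step can be sketched in a few lines of Python. This is a minimal illustration, not the Actor's actual code: the function name normalize_domain is hypothetical, and the Actor's real handling of edge cases (ports, subdomains, internationalized names) is not documented here.

```python
from urllib.parse import urlsplit


def normalize_domain(raw: str) -> str:
    """Reduce a bare domain or full URL to a lowercase host without 'www.'."""
    value = raw.strip().lower()
    # urlsplit only fills in the host when a scheme is present,
    # so prepend one for bare inputs such as "stripe.com/pricing".
    if "://" not in value:
        value = "https://" + value
    host = urlsplit(value).hostname or ""
    if host.startswith("www."):
        host = host[len("www."):]
    return host


print(normalize_domain("https://www.stripe.com/pricing?ref=nav"))  # stripe.com
print(normalize_domain("Notion.so"))  # notion.so
```

Normalizing on the client side as well can help de-duplicate a list before submitting it, so the same company is not billed twice under two spellings.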
Input
The Actor takes a JSON input object. The only required field is domains.
| Field | Type | Default | Description |
|---|---|---|---|
| domains | array of strings | none (required) | Company domains to enrich. Accepts bare domains (stripe.com) or full URLs (https://www.stripe.com/pricing). |
| includeTechStack | boolean | true | Detect the technology stack from headers and HTML. |
| includeFunding | boolean | true | Extract funding stage, amount raised and Crunchbase profile. |
| includeWhois | boolean | true | Resolve domain registration year via RDAP (used as a founded-year fallback). |
| includeDns | boolean | true | Resolve A and MX records and infer the email provider. |
| includeEmails | boolean | true | Scan the homepage and /contact for public emails. |
| includeLogo | boolean | true | Resolve the company logo URL. |
| maxItems | integer | 100 | Maximum number of domains to process (controls your cost). |
Example input

    {
      "domains": ["stripe.com", "notion.so", "vercel.com", "gitlab.com"],
      "includeTechStack": true,
      "includeFunding": true,
      "includeWhois": true,
      "includeDns": true,
      "includeEmails": true,
      "includeLogo": true,
      "maxItems": 1000
    }

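When building the input programmatically, a small helper can keep the module flags consistent. The sketch below uses only the field names from the input table; the helper name build_input and its enabled parameter are hypothetical conveniences, not part of the Actor.

```python
MODULE_FLAGS = (
    "includeTechStack", "includeFunding", "includeWhois",
    "includeDns", "includeEmails", "includeLogo",
)


def build_input(domains, enabled=None, max_items=100):
    """Build an Actor input dict; `enabled` is the set of module flags to turn on.

    Passing enabled=None leaves every module on, matching the documented defaults.
    """
    if not domains:
        raise ValueError("domains is the only required field and must be non-empty")
    if enabled is None:
        enabled = set(MODULE_FLAGS)
    unknown = set(enabled) - set(MODULE_FLAGS)
    if unknown:
        raise ValueError(f"unknown module flags: {sorted(unknown)}")
    run_input = {"domains": list(domains), "maxItems": max_items}
    for flag in MODULE_FLAGS:
        run_input[flag] = flag in enabled
    return run_input


# Logo-only run: every module except includeLogo is switched off.
logo_only = build_input(["stripe.com", "notion.so"], enabled={"includeLogo"}, max_items=2)
print(logo_only)
```

Rejecting unknown flags early catches typos such as includeLogos before a run is started and billed.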
Output
Each domain produces one dataset record. Example (abbreviated) for gitlab.com:

    {
      "domain": "gitlab.com",
      "url": "https://gitlab.com",
      "companyName": "GitLab",
      "description": "GitLab is the most comprehensive AI-powered DevSecOps platform...",
      "industry": "SaaS / Software",
      "foundedYear": 2011,
      "foundedYearSource": "schema.org",
      "domainRegisteredYear": 2003,
      "headcountRange": "1001-5000",
      "headcountSource": "schema.org",
      "approxEmployees": 2000,
      "fundingStage": "Series E",
      "funding": {
        "stage": "Series E",
        "totalRaised": "$436M",
        "crunchbaseUrl": "https://www.crunchbase.com/organization/gitlab",
        "source": "crunchbase-link+copy",
        "snippet": "raised $436 million in funding"
      },
      "technologies": ["Nuxt.js", "Vue.js", "Google Analytics", "Cloudflare", "Bootstrap"],
      "techStack": {
        "framework": ["Nuxt.js", "Vue.js"],
        "analytics": ["Google Analytics", "Google Tag Manager"],
        "hosting_cdn": ["Cloudflare"],
        "ui": ["Bootstrap"]
      },
      "server": "cloudflare",
      "poweredBy": null,
      "logoUrl": "https://logo.clearbit.com/gitlab.com",
      "socials": {
        "twitter": "https://twitter.com/gitlab",
        "linkedin": "https://www.linkedin.com/company/gitlab-com",
        "youtube": "https://www.youtube.com/gitlab"
      },
      "emails": ["press@gitlab.com"],
      "dns": {
        "a": ["172.65.251.78"],
        "mx": ["mxa-00d52e01.gslb.pphosted.com"]
      },
      "emailProvider": "Proofpoint",
      "fetched": true,
      "statusCode": 200,
      "fetchedAt": "2026-06-23T12:00:00.000Z"
    }

You can export the dataset as JSON, CSV, Excel, HTML table, JSONL, RSS or XML from the Apify Console, or pull it through the Dataset API.
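The emailProvider value in the example record is inferred from the MX hosts. One plausible way to do that is a suffix lookup, sketched below. The suffix table is an assumption for illustration (it covers the five providers named in this README) and is not the Actor's actual rule set.

```python
# Assumed suffix table: illustrative only, not the Actor's real rules.
MX_SUFFIXES = {
    "google.com": "Google Workspace",
    "googlemail.com": "Google Workspace",
    "outlook.com": "Microsoft 365",
    "zoho.com": "Zoho",
    "protonmail.ch": "Proton",
    "pphosted.com": "Proofpoint",
}


def infer_email_provider(mx_hosts):
    """Return the first provider whose suffix matches an MX host, else None."""
    for host in mx_hosts:
        host = host.rstrip(".").lower()
        for suffix, provider in MX_SUFFIXES.items():
            if host == suffix or host.endswith("." + suffix):
                return provider
    return None


print(infer_email_provider(["mxa-00d52e01.gslb.pphosted.com"]))  # Proofpoint
```

Matching on a dot-delimited suffix rather than a plain substring avoids treating a host such as notgoogle.com as Google Workspace.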
Output field reference
| Field | Type | Description |
|---|---|---|
| domain | string | The normalized domain that was enriched. |
| url | string | The HTTPS URL fetched. |
| companyName | string | Cleaned company / brand name (schema.org → og:site_name → title heuristic). |
| description | string | Company description (schema.org → meta description → og:description). |
| industry | string or null | Normalized industry label, or null if undetermined. |
| foundedYear | integer or null | Best estimate of the founding year. |
| foundedYearSource | string or null | schema.org, website-copy, or domain-registration. Always check this to gauge confidence. |
| domainRegisteredYear | integer or null | Year the domain was registered (RDAP), shown separately for transparency. |
| headcountRange | string or null | Employee bucket, e.g. 1-10, 51-200, 1001-5000, 10000+. |
| headcountSource | string or null | schema.org or website-copy. |
| approxEmployees | integer or null | The raw employee number behind the bucket, when available. |
| fundingStage | string or null | Detected stage (e.g. Series B, Seed, IPO). |
| funding | object or null | { stage, totalRaised, crunchbaseUrl, source, snippet }. |
| technologies | array | Flat list of all detected technologies. |
| techStack | object | Technologies grouped by category. |
| server | string or null | Server response header. |
| poweredBy | string or null | X-Powered-By response header. |
| logoUrl | string or null | Resolved logo URL. |
| socials | object | Map of platform → profile URL. |
| emails | array | Public contact emails (max 10), company-domain emails preferred. |
| dns | object or null | { a: [...], mx: [...] }. |
| emailProvider | string or null | Inferred mail platform from MX records. |
| title | string or null | Raw page title. |
| fetched | boolean | Whether the homepage was reachable. |
| statusCode | integer | HTTP status of the homepage request. |
| error | string | Present only when a domain could not be enriched. |
| fetchedAt | string | ISO 8601 timestamp of enrichment. |
Technology categories
framework, cms, ecommerce, payments, analytics, marketing, hosting_cdn, ui, security, and other. Human-readable labels for each category are written to the run's STATS key-value record.
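On the consumer side, the categorized techStack object and the flat technologies list lend themselves to simple technographic filters. The helpers below are a hypothetical sketch that operates on records shaped like the example output; the function names are not part of the Actor.

```python
def flatten_tech_stack(tech_stack):
    """Collapse a categorized techStack object into a sorted, de-duplicated list."""
    return sorted({name for names in tech_stack.values() for name in names})


def uses_technology(record, name):
    """Case-insensitive check against a record's flat technologies list."""
    wanted = name.lower()
    return any(t.lower() == wanted for t in record.get("technologies") or [])


record = {
    "domain": "gitlab.com",
    "technologies": ["Nuxt.js", "Vue.js", "Google Analytics", "Cloudflare", "Bootstrap"],
    "techStack": {
        "framework": ["Nuxt.js", "Vue.js"],
        "analytics": ["Google Analytics", "Google Tag Manager"],
        "hosting_cdn": ["Cloudflare"],
        "ui": ["Bootstrap"],
    },
}

print(flatten_tech_stack(record["techStack"]))
print(uses_technology(record, "cloudflare"))  # True
```

Filtering a whole dataset is then a list comprehension, for example every record where uses_technology(r, "Shopify") holds.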
Pricing
This Actor uses Apify's pay-per-event pricing model, so your cost scales exactly with usage and there is no subscription:
- Actor start — a tiny fixed fee per run (per GB of memory).
- Per result — a small fee for each company profile returned.
You can cap spend precisely with the maxItems input. Enriching a handful of domains costs cents; large batches are billed linearly per result. See the Pricing tab on the Actor's Apify Store page for the current per-event rates. Apify's free monthly platform credits are enough to evaluate the Actor at no cost.
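As a rough budgeting aid, the per-result part of the bill can be estimated from the listed "from $4.00 / 1,000 company profile results" rate. The sketch below assumes that rate and ignores the per-run start fee; current per-event rates may differ, so treat the output as an approximation.

```python
from decimal import Decimal

# Assumption: the listed "from $4.00 / 1,000 results" rate. The Actor-start
# fee is not included, and real per-event rates may differ.
RATE_PER_1000 = Decimal("4.00")


def estimate_result_cost(num_domains, max_items, rate_per_1000=RATE_PER_1000):
    """Estimate the per-result charge in USD; maxItems caps the billable count."""
    billable = min(num_domains, max_items)
    return (rate_per_1000 * billable / 1000).quantize(Decimal("0.01"))


print(estimate_result_cost(250, 100))        # 0.40
print(estimate_result_cost(50_000, 50_000))  # 200.00
```

The first call shows the cap at work: 250 domains are submitted, but with maxItems set to 100 only 100 results are billable.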
How to use the Actor (step by step)
- Open the Actor in Apify Console and click Try for free.
- In the Input tab, paste or type your list of domains into the Domains field.
- Toggle the data modules you want (tech stack, funding, WHOIS/RDAP, DNS, emails, logo). Leave them all on for the richest profile.
- Set Max items to cap how many domains are processed.
- Click Start. Watch the live log; progress is reported as domains complete.
- When the run finishes, open the Output / Dataset tab.
- Export as JSON, CSV or Excel, or copy the API endpoint to pull results programmatically.
To enrich domains on a schedule, use the Schedules feature; to push results into your own systems automatically, attach a Webhook to the Actor run.
Calling the Actor via API
Every Apify Actor exposes a REST API. Run the Actor and retrieve dataset items synchronously with a single call (replace <YOUR_TOKEN>):

    curl -X POST "https://api.apify.com/v2/acts/technicaldost~company-intelligence-api/run-sync-get-dataset-items?token=<YOUR_TOKEN>" \
      -H "Content-Type: application/json" \
      -d '{"domains": ["stripe.com", "vercel.com"], "maxItems": 100}'

JavaScript with the Apify client:

    import { ApifyClient } from 'apify-client';

    const client = new ApifyClient({ token: '<YOUR_TOKEN>' });

    // Start the Actor and wait for the run to finish.
    const run = await client.actor('technicaldost/company-intelligence-api').call({
        domains: ['stripe.com', 'notion.so', 'vercel.com'],
        maxItems: 500,
    });

    // Read the enriched records from the run's default dataset.
    const { items } = await client.dataset(run.defaultDatasetId).listItems();
    console.log(items);

Python with the Apify client:

    from apify_client import ApifyClient

    client = ApifyClient("<YOUR_TOKEN>")

    # Start the Actor and wait for the run to finish.
    run = client.actor("technicaldost/company-intelligence-api").call(
        run_input={
            "domains": ["stripe.com", "notion.so", "vercel.com"],
            "maxItems": 500,
        }
    )

    # Stream the enriched records from the run's default dataset.
    for item in client.dataset(run["defaultDatasetId"]).iterate_items():
        print(item["domain"], item["industry"], item["technologies"])

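For lists longer than the 50,000-domain per-run limit stated in this README, one option is to split the list client-side and start one run per batch. The helper below is a hypothetical sketch of the splitting step only; it makes no API calls.

```python
def chunk_domains(domains, batch_size=50_000):
    """Yield de-duplicated domains in batches no larger than batch_size."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    unique = list(dict.fromkeys(domains))  # drops repeats, keeps first-seen order
    for start in range(0, len(unique), batch_size):
        yield unique[start:start + batch_size]


batches = list(chunk_domains(["a.com", "b.com", "a.com", "c.com"], batch_size=2))
print(batches)  # [['a.com', 'b.com'], ['c.com']]
```

Each batch could then be passed as the domains input of a separate run, with maxItems set to the batch length so the cost cap matches the batch.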
Integrations
Because the Actor runs on Apify, it plugs directly into the tools your team already uses:
- Make (Integromat) and Zapier: trigger enrichment when a new lead is created, then write the profile back to your CRM.
- n8n: self-hosted automation that calls the Actor, branches on headcountRange or fundingStage, and routes leads.
- Google Sheets: enrich a column of domains and append firmographics and technographics.
- Webhooks: POST each finished run to your own endpoint.
- Apify API & SDKs: JavaScript and Python clients for full programmatic control.
- Datasets: export to JSON, CSV, Excel, XML, or stream with the Dataset API.
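The branching described for automation tools can also be expressed directly in code. The sketch below routes a record on headcountRange and fundingStage; the queue names and the 1,001-employee threshold are purely illustrative assumptions, not recommendations from the Actor.

```python
def headcount_floor(headcount_range):
    """Lower bound of a bucket such as '51-200' or '10000+', or None if absent."""
    if not headcount_range:
        return None
    return int(headcount_range.rstrip("+").split("-")[0])


def route_lead(record):
    """Hypothetical routing rules; queue names and thresholds are illustrative."""
    if not record.get("fetched"):
        return "retry"
    floor = headcount_floor(record.get("headcountRange"))
    stage = record.get("fundingStage") or ""
    if floor is not None and floor >= 1001:
        return "enterprise"
    if stage.startswith("Series"):
        return "growth"
    return "self-serve"


lead = {"fetched": True, "headcountRange": "1001-5000", "fundingStage": "Series E"}
print(route_lead(lead))  # enterprise
```

Because headcountRange and fundingStage are often null, the rules fall through to a default queue rather than raising.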
Data sources and methodology
This Actor is intentionally built on open, public, and free data sources so that it is reliable, transparent and resilient:
- The company's own website (HTTP). Headers, cookies, HTML, meta tags and schema.org JSON-LD are all served publicly by the company to any visitor. This is the basis for technology detection, company name, description, industry, socials, logo, emails, founded year and headcount.
- RDAP (Registration Data Access Protocol). The modern, structured, free successor to WHOIS, used to obtain the domain registration date as a founded-year fallback.
- Public DNS. Standard A and MX record resolution, used for hosting hints and email-provider inference.
- Free logo fallbacks. Clearbit's public logo endpoint and Google's favicon service provide a logo when the site does not expose a clean brand mark.
The Actor does not scrape LinkedIn, does not require cookies or logins, and does not depend on any single paid third-party API that could disappear or rate-limit you. Field resolution follows an explicit source-priority chain — structured data is trusted first, on-page copy second, and registry/derived data last — and the source of each derived field is returned alongside it.
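The source-priority chain can be illustrated for the founded-year field. This is a minimal sketch under stated assumptions: the function name, its parameters and the regular expression are hypothetical, and the Actor's real patterns and validation rules may differ. Only the three source labels come from this README.

```python
import re
from datetime import date

# Assumed pattern for "Founded in YYYY" / "Est. YYYY" / "Since YYYY" copy.
FOUNDED_COPY = re.compile(
    r"\b(?:founded in|est\.?|since)\s+((?:18|19|20)\d{2})\b", re.IGNORECASE
)


def resolve_founded_year(schema_founding_date, page_text, rdap_registration_year):
    """Return (year, source): structured data first, then copy, then registry."""
    this_year = date.today().year
    if schema_founding_date:
        match = re.match(r"(\d{4})", schema_founding_date)
        if match and int(match.group(1)) <= this_year:
            return int(match.group(1)), "schema.org"
    if page_text:
        match = FOUNDED_COPY.search(page_text)
        if match and int(match.group(1)) <= this_year:
            return int(match.group(1)), "website-copy"
    if rdap_registration_year:
        return rdap_registration_year, "domain-registration"
    return None, None


print(resolve_founded_year(None, "Proudly independent since 1998.", 1995))
# (1998, 'website-copy')
```

Returning the source label together with the value is what lets downstream code treat a domain-registration year differently from a published founding date.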
Accuracy, confidence and limitations
Honesty about data quality is a feature, not an afterthought. Please read this section before you act on the data.
- Founded year is most reliable when foundedYearSource is schema.org or website-copy. When it is domain-registration, treat the year as a rough proxy rather than a bound: a domain may have been registered by a previous owner years before the company existed (e.g. stripe.com was registered in 1995, long before Stripe existed), and an older company may have registered its domain long after it was founded. The separate domainRegisteredYear field is always provided so you can see the raw signal.
- Headcount is only returned when the company publishes it in structured data or site copy. Many companies do not, so headcountRange is frequently null. It is an honest "we found it" / "we didn't", never a fabricated guess.
- Funding signals are extracted from public copy and Crunchbase links. They are best-effort: absence of a signal does not mean a company is unfunded, and a detected amount reflects what the site states, which may be a historical round.
- Technology detection is fingerprint-based. It detects technologies that leave a footprint in the public HTML, headers or cookies. Server-side-only technologies, or tools loaded after consent/interaction, may not be visible. False positives are minimized but possible when third-party snippets are embedded.
- Reachability. If a site blocks automated requests, is down, or returns a non-HTML response, fetched will be false and an error will explain why; firmographic fields will be empty while DNS/RDAP fields may still populate.
- JavaScript-rendered content. The Actor reads the server-delivered HTML. Most sites expose their meta, schema.org and key fingerprints server-side, but a fully client-rendered SPA with no SSR may yield thinner firmographics.
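A consumer of the dataset may want to apply these caveats mechanically. The helpers below are a hypothetical sketch that uses only the documented fetched and foundedYearSource fields to separate unreachable domains and flag estimated founding years.

```python
def partition_records(records):
    """Split records into (usable, failed) using the documented fetched flag."""
    usable, failed = [], []
    for record in records:
        (usable if record.get("fetched") else failed).append(record)
    return usable, failed


def founded_year_is_estimate(record):
    """True when foundedYear came from the RDAP registration fallback."""
    return record.get("foundedYearSource") == "domain-registration"


records = [
    {"domain": "a.example", "fetched": True, "foundedYearSource": "schema.org"},
    {"domain": "b.example", "fetched": True, "foundedYearSource": "domain-registration"},
    {"domain": "c.example", "fetched": False, "error": "blocked"},
]
usable, failed = partition_records(records)
print([r["domain"] for r in failed])  # ['c.example']
print([r["domain"] for r in usable if founded_year_is_estimate(r)])  # ['b.example']
```

Failed records can be queued for a later re-run, while estimated founding years can be excluded from analyses that need published dates.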
Comparison with BuiltWith, Clearbit and Wappalyzer
| Capability | This Actor | BuiltWith | Clearbit (Breeze) | Wappalyzer |
|---|---|---|---|---|
| Technology detection | ✅ | ✅ | ➖ | ✅ |
| Firmographics (name, industry, description) | ✅ | ➖ | ✅ | ➖ |
| Founded year | ✅ | ➖ | ✅ | ➖ |
| Headcount range | ✅ | ➖ | ✅ | ➖ |
| Funding signals | ✅ | ➖ | ✅ | ➖ |
| Logo API | ✅ | ➖ | ✅ | ➖ |
| Social profiles | ✅ | ➖ | ✅ | ➖ |
| DNS / email provider | ✅ | ➖ | ➖ | ➖ |
| Pay-as-you-go, no subscription | ✅ | ➖ | ➖ | ➖ |
| Source labelled per field | ✅ | ➖ | ➖ | ➖ |
This Actor is best understood as a practical superset for the common "enrich this domain" job: it combines the technology-detection value of BuiltWith and Wappalyzer with the firmographic value of Clearbit, on a pay-per-result basis, without a contract. It is not a replacement for a full commercial data platform's proprietary databases — it is a transparent, affordable, open-data alternative that covers the fields most teams actually use.
Frequently asked questions
What is a company tech-stack and domain intelligence API?
It is an API that takes a company's domain name and returns a structured profile of that company: what technologies their website runs, what industry they are in, when they were founded, roughly how many employees they have, any public funding signals, their social profiles, their logo, and how to contact them. It combines technographic data (the tech stack) with firmographic data (company facts) in one call.
How do I detect the technology stack of a website?
Submit the domain with includeTechStack: true. The Actor fetches the site and fingerprints the HTML, HTTP response headers and cookies against 70+ known technology signatures, returning both a flat technologies list and a categorized techStack object (frameworks, CMS, analytics, hosting/CDN, payments, marketing, UI, security).
Is this a free Clearbit Logo API alternative?
Yes. Set includeLogo: true and you receive a logoUrl for each domain, resolved from the site's own brand assets with free Clearbit and Google fallbacks. You can use the Actor purely for logos if that is all you need.
Where does the founded year come from?
In priority order: schema.org foundingDate, an explicit "Founded in YYYY" / "Est. YYYY" / "Since YYYY" statement in the site copy, and finally the RDAP domain-registration year as a labelled fallback. The foundedYearSource field tells you which one was used.
How accurate is the headcount?
Headcount is returned only when the company publishes it in structured data or on-page copy, mapped to a range bucket. It is never fabricated; if no signal exists, the field is null.
Do I need an API key for any third-party service?
No. Everything runs on open, keyless, public sources: the company website, public DNS and RDAP. You only need your Apify token to call the Actor.
Does this scrape LinkedIn or other gated platforms?
No. The Actor does not access any login-gated platform. Social links are discovered from links the company places on its own public website (and its schema.org sameAs).
How many domains can I enrich at once?
Up to 50,000 per run, with built-in concurrency. Use maxItems to control batch size and cost.
What output formats are available?
JSON, JSONL, CSV, Excel, HTML table, XML and RSS, via the Apify Console export or the Dataset API.
Can I run this on a schedule?
Yes. Use Apify Schedules to re-enrich domains daily, weekly or monthly, and attach webhooks to push fresh data into your systems.
How is this different from the Firmographic & Logo API?
This Actor is a superset. It returns every field the Company Firmographic & Logo API returns, and adds technology-stack detection, founded year, headcount range, funding signals, DNS and email-provider detection.
Why are some fields empty?
Either the company does not publish that information anywhere public, or the site was unreachable / blocked automated access. The fetched, statusCode and error fields explain reachability.
Is the data GDPR-compliant to use?
The Actor returns only publicly available business information and publicly listed contact emails. You are responsible for using the data lawfully under the regulations that apply to you. See the next section.
Legal and responsible use
This Actor collects only publicly available information that companies publish about themselves: the content of their public websites, their public DNS records, and public domain-registration data exposed through RDAP. It does not bypass authentication, does not access private or login-gated data, and does not collect personal data beyond business contact emails that companies have chosen to publish publicly.
You are responsible for ensuring your use of the output complies with all applicable laws and regulations — including GDPR, CCPA, the CAN-SPAM Act, and any contractual terms that apply to you — particularly when using contact emails for outreach. Always honor opt-out requests and local marketing-consent rules. Use this data to inform and personalize legitimate business communication, not to spam.
Changelog
v0.1
- Initial release of the Company Tech-Stack & Domain Intelligence API.
- 70+ technology signatures across nine categories.
- Founded-year resolution (schema.org / website copy / RDAP) with per-field source labelling.
- Headcount estimation from structured data and site copy.
- Funding-signal extraction (stage, amount, Crunchbase link).
- Firmographics, social profiles, logo, DNS, email-provider detection and public email discovery.
- Pay-per-event pricing; batch up to 50,000 domains.
Support
Questions, feature requests, or a technology you'd like added to the detection set? Open an issue on the Actor's Issues tab in Apify Console. This Actor is actively maintained — new technology signatures and firmographic sources are added regularly. If you rely on the Company Firmographic & Logo API, this superset Actor is a drop-in upgrade that keeps every existing field and adds technographics, founded year, headcount and funding on top.