Company Firmographics: Employees, Industry, HQ, Revenue from a Domain

Turn a company domain into a structured firmographic record. Give it a domain and get back employee band, industry, HQ location, founded year, a revenue estimate, logo, and description, parsed from the company's own schema.org/Organization JSON-LD and HTML meta tags. Every record carries a source_signals array and a data_completeness score, so you always know where the data came from and how much was found. Flat JSON, one row per domain, ready to drop into a Clay table. Pure HTTP, no browser, no paid data provider, no Crunchbase.

Built for Clay users, RevOps teams, and outbound agencies that need enriched company records without a ZoomInfo or Clearbit dependency. It extends the JSON-LD parsing pattern from the Domain to LinkedIn URL Resolver and is the canonical company record the rest of the Mamba Labs fleet joins on.

Features

  • Structured firmographics from the company's own site. Employee band, industry, HQ, founded year, revenue estimate, logo, and description from schema.org/Organization JSON-LD, with HTML meta tags as a fallback.
  • Transparent provenance. A source_signals array on every record names exactly which sources contributed (JSON-LD, meta tags, proxy fetch). No competitor exposes this.
  • Honest coverage score. data_completeness (0 to 100) tells you how much of the firmographic record was actually found, so you can gate downstream work on real coverage.
  • No paid data dependency. Pure HTTP and public structured data. No ZoomInfo, Clearbit, or Crunchbase. Lower cost, fully auditable.
  • Datacenter proxy fallback. Fetches direct first and only falls back to a datacenter proxy when a domain blocks, keeping runs fast and cheap.
  • Batch and cache. Pass a domains array for bulk runs; results are cached for 7 days to make repeat lookups free.
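
As a rough illustration of the JSON-LD-first, meta-tag-fallback approach described above, here is a minimal Python sketch using only the standard library. The names `OrgParser`, `find_organization`, and `extract_firmographics`, the `meta_tags` signal label, and the choice of `og:site_name` as a name fallback are all hypothetical. It covers only three fields and is not the actor's actual parser.

```python
import json
from html.parser import HTMLParser


class OrgParser(HTMLParser):
    """Collects JSON-LD script bodies and <meta> tags from a page."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buf = []
        self.jsonld_blocks = []
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "script" and (a.get("type") or "").lower() == "application/ld+json":
            self._in_jsonld = True
            self._buf = []
        elif tag == "meta":
            key = a.get("property") or a.get("name")
            if key and a.get("content"):
                self.meta[key.lower()] = a["content"]

    def handle_data(self, data):
        if self._in_jsonld:
            self._buf.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            self.jsonld_blocks.append("".join(self._buf))


def find_organization(blocks):
    """Return the first JSON-LD node typed Organization, or None."""
    for raw in blocks:
        try:
            doc = json.loads(raw)
        except ValueError:
            continue  # skip malformed JSON-LD rather than failing the page
        nodes = doc if isinstance(doc, list) else [doc]
        flat = []
        for node in nodes:
            if isinstance(node, dict):
                flat.append(node)
                graph = node.get("@graph")  # flatten one level of @graph
                if isinstance(graph, list):
                    flat.extend(g for g in graph if isinstance(g, dict))
        for node in flat:
            types = node.get("@type")
            types = types if isinstance(types, list) else [types]
            if "Organization" in types:
                return node
    return None


def extract_firmographics(html):
    parser = OrgParser()
    parser.feed(html)
    record = {"company_name": None, "description": None,
              "logo_url": None, "source_signals": []}
    org = find_organization(parser.jsonld_blocks)
    if org:
        record["company_name"] = org.get("name")
        record["description"] = org.get("description")
        logo = org.get("logo")
        record["logo_url"] = logo.get("url") if isinstance(logo, dict) else logo
        record["source_signals"].append("jsonld_organization")
    # Meta tags only fill fields that JSON-LD left empty.
    fallbacks = (("company_name", "og:site_name"),
                 ("description", "description"),
                 ("logo_url", "og:image"))
    used_meta = False
    for field, key in fallbacks:
        if record[field] is None and parser.meta.get(key):
            record[field] = parser.meta[key]
            used_meta = True
    if used_meta:
        record["source_signals"].append("meta_tags")
    if record["description"]:
        record["description"] = record["description"][:500]  # documented cap
    return record
```

Recording a signal only when a source actually filled a field is what makes a provenance array useful for auditing.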

Input

| Field | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| domain | string | no | stripe.com | Bare domain without https:// or trailing slash. |
| company_name | string | no | none | Optional company name, used as a fallback label when the page does not expose one. |
| domains | array | no | none | List of bare domains for batch processing. Takes precedence over domain. One output row per domain. |
| batchSize | integer | no | 5 | Domains enriched concurrently per wave in batch mode. Maximum 10. |
| skipCache | boolean | no | false | Force a fresh enrichment and ignore the 7 day result cache. |

Provide either domain or domains. If both are set, domains takes precedence.
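
The input rules can be sketched in Python as follows. This is an assumption-laden illustration, not the actor's code: `normalize_domain` and `plan_waves` are hypothetical names, and lowercasing the host and dropping any port or path are assumed normalization steps. Only the precedence of domains over domain, the default batchSize of 5, and the maximum of 10 come from the table above.

```python
from urllib.parse import urlsplit

DEFAULT_BATCH_SIZE = 5   # documented default for batchSize
MAX_BATCH_SIZE = 10      # documented maximum for batchSize


def normalize_domain(value):
    """Return a bare lowercase host, or None if nothing usable remains."""
    text = (value or "").strip().lower()
    if not text:
        return None
    # urlsplit only fills the host when a "//" authority marker is present.
    if "://" not in text:
        text = "//" + text
    try:
        return urlsplit(text).hostname or None
    except ValueError:
        return None  # malformed input such as an unclosed IPv6 bracket


def plan_waves(actor_input):
    """Pick the domains to run and split them into concurrent waves."""
    # domains takes precedence over domain when both are supplied.
    raw = actor_input.get("domains") or [actor_input.get("domain")]
    # Invalid entries stay in the list as None so each input still yields a row.
    domains = [normalize_domain(d) for d in raw]
    size = actor_input.get("batchSize") or DEFAULT_BATCH_SIZE
    size = max(1, min(int(size), MAX_BATCH_SIZE))
    return [domains[i:i + size] for i in range(0, len(domains), size)]
```

For example, `plan_waves({"domains": ["gitlab.com", "stripe.com", "notion.so"], "batchSize": 2})` yields two waves: the first two domains, then the third.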

Output

One flat row per domain. Every field is always present; absent values are null.

| Field | Type | Description | Example |
| --- | --- | --- | --- |
| domain | string | Normalized input domain | gitlab.com |
| company_name | string | From JSON-LD name, meta tags, or the input fallback | GitLab |
| employee_band | string | Bucketed employee range, or null | 1001-5000 |
| employee_count | integer | Raw count from JSON-LD numberOfEmployees, or null | 2500 |
| industry | string | Best-effort industry, often null | Financial Services |
| hq_location | string | "City, Region, Country" from JSON-LD address | San Francisco, CA, US |
| founded_year | string | Four digit year from foundingDate | 2011 |
| revenue_estimate | string | Heuristic band from employee count, not authoritative | $250M-$1B |
| logo_url | string | From JSON-LD logo or og:image | https://.../logo.svg |
| description | string | Company description, capped at 500 characters | GitLab is ... |
| source_signals | array | Which sources populated the record | ["jsonld_organization"] |
| data_completeness | integer | 0 to 100, share of the eight core fields populated | 88 |
| run_date | string | ISO timestamp of the run | 2026-06-19T13:13:15Z |

Heuristics: employee_band buckets the raw count (1-10 through 10001+). revenue_estimate is derived from the employee count using a per-employee proxy and is an estimate, not an authoritative figure. industry is best-effort and is often null because schema.org has no first-class industry field. data_completeness and source_signals let you gate on real coverage rather than assuming a field is present.
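
The two deterministic heuristics can be sketched as below. Several details are assumptions and are marked as such in the code: the source gives only the outer bands (1-10 and 10001+) and one example (1001-5000), so the intermediate edges are guesses, and it gives the number of core fields (eight) but not which ones. revenue_estimate is left out because the per-employee proxy value is not published.

```python
# ASSUMED bucket edges: only "1-10", "1001-5000", and "10001+" appear in the docs.
BANDS = [
    (10, "1-10"), (50, "11-50"), (200, "51-200"), (500, "201-500"),
    (1000, "501-1000"), (5000, "1001-5000"), (10000, "5001-10000"),
]

# ASSUMED list of the eight core fields; the docs state the count only.
CORE_FIELDS = [
    "company_name", "employee_band", "industry", "hq_location",
    "founded_year", "revenue_estimate", "logo_url", "description",
]


def employee_band(count):
    """Map a raw employee count to a band label, or None if unknown."""
    if count is None or count < 1:
        return None
    for upper, label in BANDS:
        if count <= upper:
            return label
    return "10001+"


def data_completeness(record):
    """Percentage (0 to 100) of core fields that hold a value."""
    filled = sum(1 for f in CORE_FIELDS if record.get(f) not in (None, ""))
    total = len(CORE_FIELDS)
    # Integer round-half-up of 100 * filled / total, avoiding float rounding.
    return (200 * filled + total) // (2 * total)
```

Under these assumptions, a record with seven of eight core fields filled scores 88, which matches the example value in the output table.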

Pricing

| Tier | Discount | Per result | Per 1K results |
| --- | --- | --- | --- |
| Free (no plan) | 0% | $0.004 | $4.00 |
| Starter (Bronze) | ~5% | $0.0038 | $3.80 |
| Scale (Silver) | ~10% | $0.0036 | $3.60 |
| Business (Gold) | ~15% | $0.0034 | $3.40 |

Free tier: 50 results per month included, resets monthly. Cached repeat lookups within 7 days are free.
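
To make the pricing arithmetic concrete, here is a small cost estimator. It is a sketch under stated assumptions: `estimate_cost` and its parameters are hypothetical, the 50 included results are assumed to apply only on the free tier, and cached rows are assumed not to consume that allowance. Actual billing may differ.

```python
from decimal import Decimal

# Per-result prices from the pricing table.
PRICE_PER_RESULT = {
    "free": Decimal("0.004"),
    "starter": Decimal("0.0038"),
    "scale": Decimal("0.0036"),
    "business": Decimal("0.0034"),
}
FREE_TIER_INCLUDED = 50  # results per month included on the free tier


def estimate_cost(results, tier="free", cached=0, free_used=0):
    """Estimate the USD cost of a run.

    cached    -- rows served from the 7 day cache (documented as free)
    free_used -- free-tier results already consumed this month (assumed input)
    """
    billable = max(0, results - cached)
    if tier == "free":
        # ASSUMPTION: the included results offset billable rows on the free tier only.
        remaining = max(0, FREE_TIER_INCLUDED - free_used)
        billable = max(0, billable - remaining)
    return billable * PRICE_PER_RESULT[tier]
```

For instance, 1,000 uncached results on the Business tier come to 1000 × $0.0034 = $3.40, and the same run on the Scale tier with 200 cached rows comes to 800 × $0.0036 = $2.88.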

Usage Examples

Apify Console / API

curl -X POST "https://api.apify.com/v2/acts/YlUtLWjfPpqykmB8g/run-sync-get-dataset-items?token=YOUR_TOKEN" \
-H "Content-Type: application/json" \
-d '{"domain":"gitlab.com"}'

Batch:

{ "domains": ["gitlab.com", "stripe.com", "notion.so"], "batchSize": 5 }
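
The same call can be made from Python with only the standard library. This sketch reuses the endpoint and actor ID from the curl example. The function names and the 300 second timeout are assumptions, and the response is assumed to be a JSON array of dataset items, which is what the run-sync-get-dataset-items endpoint name suggests.

```python
import json
import urllib.parse
import urllib.request

ACTOR_ID = "YlUtLWjfPpqykmB8g"  # taken from the curl example above
BASE_URL = "https://api.apify.com/v2/acts"


def build_request(token, payload):
    """Build the POST request without sending it."""
    query = urllib.parse.urlencode({"token": token})
    url = f"{BASE_URL}/{ACTOR_ID}/run-sync-get-dataset-items?{query}"
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def enrich(token, payload, timeout=300):
    """Send the request and return the parsed rows (assumed JSON array)."""
    request = build_request(token, payload)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Usage (performs a real, billable API call):
# rows = enrich("YOUR_TOKEN", {"domains": ["gitlab.com", "stripe.com"], "batchSize": 5})
```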

Clay Integration

  1. Add an Enrichment column of type HTTP API, or use the Apify integration.
  2. Call this actor with domain mapped to your domain column.
  3. Map the returned fields to columns: company_name, employee_band, industry, hq_location, founded_year, revenue_estimate, data_completeness.
  4. Gate downstream enrichment or outreach on a formula like data_completeness >= 50 to skip rows where little firmographic data was found.

The output is flat, with one row per domain, so every field maps directly to a Clay column with no JSON unwrapping.
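
Outside Clay, the same gate from step 4 can be applied to the returned rows in a few lines of Python. `gate_rows` is a hypothetical helper, and the threshold of 50 simply mirrors the example formula.

```python
def gate_rows(rows, min_completeness=50):
    """Split rows into those worth further enrichment and those to skip."""
    ready, skipped = [], []
    for row in rows:
        # A missing or null score is treated as 0 coverage.
        score = row.get("data_completeness") or 0
        (ready if score >= min_completeness else skipped).append(row)
    return ready, skipped
```

Keeping the skipped rows, rather than dropping them, leaves room to retry them later with skipCache set to true.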

MCP Integration

npm install @mambalabsdev/mcp-company-firmographic-enricher

{
  "mcpServers": {
    "company-firmographic-enricher": {
      "command": "npx",
      "args": ["-y", "@mambalabsdev/mcp-company-firmographic-enricher"],
      "env": { "APIFY_TOKEN": "YOUR_TOKEN" }
    }
  }
}

Tool: enrich_company_firmographics with { "domain": "gitlab.com" }.

Error Handling

| Condition | Behavior | Output |
| --- | --- | --- |
| Empty or invalid domain | Empty record pushed, run continues | all fields null, data_completeness:0 |
| Domain unreachable or fetch fails | Empty record, source_signals:[] | row emitted, not a run error |
| Direct fetch blocked (403 or 429) | Retry via datacenter proxy; if still blocked, empty record | source_signals notes the proxy attempt |
| No JSON-LD on the page | Fall back to HTML meta tags | partial record from meta only |
| One domain throws in a batch | Caught per domain, empty record pushed | other rows unaffected |
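
The behaviors in the table can be sketched as a direct-then-proxy fetch wrapped in per-domain error isolation. Everything named here is hypothetical: the fetchers and parser are injected callables so the control flow can be shown without real network code, `proxy_fetch` is an assumed signal label, and the loop runs sequentially rather than in concurrent waves.

```python
FIELDS = ["company_name", "employee_band", "employee_count", "industry",
          "hq_location", "founded_year", "revenue_estimate", "logo_url",
          "description"]
BLOCKED_STATUSES = {403, 429}  # statuses documented as triggering the proxy retry


def empty_record(domain, signals=None):
    """Row with every firmographic field null and zero completeness."""
    record = {field: None for field in FIELDS}
    record.update(domain=domain, source_signals=list(signals or []),
                  data_completeness=0)
    return record


def fetch_html(domain, fetch_direct, fetch_proxy):
    """Try a direct fetch, retrying via proxy only when blocked.

    Both fetchers are assumed to return a (status, body) tuple.
    """
    signals = []
    status, body = fetch_direct(domain)
    if status in BLOCKED_STATUSES:
        signals.append("proxy_fetch")  # hypothetical label for the proxy attempt
        status, body = fetch_proxy(domain)
    if status == 200 and body:
        return body, signals
    return None, signals


def enrich_batch(domains, fetch_direct, fetch_proxy, parse):
    """Emit exactly one row per input domain, whatever goes wrong."""
    rows = []
    for domain in domains:
        try:
            if not domain:
                rows.append(empty_record(domain))
                continue
            html, signals = fetch_html(domain, fetch_direct, fetch_proxy)
            if html is None:
                rows.append(empty_record(domain, signals))
                continue
            row = parse(domain, html)
            row["source_signals"] = signals + row.get("source_signals", [])
            rows.append(row)
        except Exception:
            # One failing domain must not affect the other rows.
            rows.append(empty_record(domain))
    return rows
```

Emitting an empty row instead of raising keeps the output aligned one-to-one with the input, which is what lets a flat table join back on domain.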

Limitations

  • Coverage varies by what the company publishes. Firmographics come from the company's own structured data. Sites with rich schema.org/Organization JSON-LD return a near-complete record; sites with only basic meta tags return name, description, and logo. data_completeness and source_signals make this transparent on every row.
  • revenue_estimate is a heuristic, not a figure. It is derived from the employee count using a per-employee proxy. Treat it as a rough band, not an authoritative revenue number.
  • industry is best-effort. schema.org has no first-class industry field, so this is often null. Do not rely on it being present.
  • No paid data provider. This actor does not call ZoomInfo, Clearbit, or Crunchbase, so it will not return data those providers hold but the company does not publish. That is the tradeoff for fully auditable, low-cost, ToS-clean enrichment.
  • Data freshness. Results are cached for 7 days. Pass skipCache: true for a live enrichment.

Part of the Mamba Labs GTM Intelligence Suite

| Actor | Actor ID |
| --- | --- |
| GTM Hiring Signal Scraper | D7O1SA2EqwHGsGr1P |
| GTM Tech Stack Signal Enrichment | qyd7nNyqFPelQViBx |
| GTM Signals Aggregator | xKdRfnfFNkdMpFuNs |
| Job Board Keyword Signal Scanner | 4DvqpvhMR74NLcDDY |
| Domain to LinkedIn URL Resolver | 3HtnSaqPHOg1Qg5gx |
| ICP Fit Scorer | W161DT8W4kW55dMFh |
| Domain Deliverability Checker | 0tVgxI7A6o9jMlxmc |
| Company Firmographic Enricher | YlUtLWjfPpqykmB8g |

npm: @mambalabsdev/ats-scrapers

Built by Mamba Labs.