# Company Firmographics: Employees Industry HQ Revenue Clay (`mambalabs/company-firmographic-enricher`) Actor

Domain to structured company firmographics: employee band, industry, HQ, founded year, revenue estimate, logo, and description from schema.org JSON-LD and meta tags. Flat JSON, Clay ready, with source provenance.

- **URL**: https://apify.com/mambalabs/company-firmographic-enricher.md
- **Developed by:** [Mamba Labs](https://apify.com/mambalabs) (community)
- **Categories:** Lead generation, SEO tools, Automation
- **Stats:** 2 total users, 1 monthly users, 100.0% runs succeeded, 0 bookmarks
- **User rating**: No ratings yet

## Pricing

from $3.40 / 1,000 results

This Actor is paid per event. You are not charged for Apify platform usage; you pay only a fixed price for specific events.
Because this Actor supports Apify Store discounts, the price drops as your subscription plan tier rises.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## Company Firmographics: Employees, Industry, HQ, Revenue from a Domain

Turn a company domain into a structured firmographic record. Give it a domain and get back employee band, industry, HQ location, founded year, a revenue estimate, logo, and description, parsed from the company's own schema.org/Organization JSON-LD and HTML meta tags. Every record carries a `source_signals` array and a `data_completeness` score, so you always know where the data came from and how much was found. Flat JSON, one row per domain, ready to drop into a Clay table. Pure HTTP, no browser, no paid data provider, no Crunchbase.

Built for Clay users, RevOps teams, and outbound agencies that need enriched company records without a ZoomInfo or Clearbit dependency. It extends the JSON-LD parsing pattern from the Domain to LinkedIn URL Resolver and is the canonical company record the rest of the Mamba Labs fleet joins on.

### Features

- **Structured firmographics from the company's own site.** Employee band, industry, HQ, founded year, revenue estimate, logo, and description from schema.org/Organization JSON-LD, with HTML meta tags as a fallback.
- **Transparent provenance.** A `source_signals` array on every record names exactly which sources contributed (JSON-LD, meta tags, proxy fetch). No competitor exposes this.
- **Honest coverage score.** `data_completeness` (0 to 100) tells you how much of the firmographic record was actually found, so you can gate downstream work on real coverage.
- **No paid data dependency.** Pure HTTP and public structured data. No ZoomInfo, Clearbit, or Crunchbase. Lower cost, fully auditable.
- **Datacenter proxy fallback.** Fetches direct first and only falls back to a datacenter proxy when a domain blocks, keeping runs fast and cheap.
- **Batch and cache.** Pass a `domains` array for bulk runs; results are cached for 7 days to make repeat lookups free.
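
To make the JSON-LD step concrete, here is a minimal sketch of how a schema.org/Organization node could be pulled out of a page's HTML using only the Python standard library. It is not the Actor's actual implementation; the names `JsonLdCollector` and `find_organization` are hypothetical, and the sketch matches only nodes whose `@type` is literally `Organization` (subtypes such as `Corporation` would need extra handling).

```python
import json
from html.parser import HTMLParser


class JsonLdCollector(HTMLParser):
    """Collects the raw text of every <script type="application/ld+json"> tag."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append("".join(self._buffer))
            self._in_jsonld = False


def find_organization(html):
    """Return the first Organization node found in the page, or None."""
    collector = JsonLdCollector()
    collector.feed(html)
    for raw in collector.blocks:
        try:
            data = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip malformed JSON-LD instead of failing the whole page
        # A block may hold one node, a list of nodes, or an @graph wrapper.
        nodes = data if isinstance(data, list) else [data]
        expanded = []
        for node in nodes:
            if isinstance(node, dict):
                graph = node.get("@graph")
                expanded.extend(graph if isinstance(graph, list) else [node])
        for node in expanded:
            if not isinstance(node, dict):
                continue
            types = node.get("@type")
            types = types if isinstance(types, list) else [types]
            if "Organization" in types:
                return node
    return None
```

Fields such as `name`, `foundingDate`, `numberOfEmployees`, `address`, and `logo` can then be read from the returned dictionary, with meta tags consulted when `find_organization` returns `None`.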

### Input

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `domain` | string | no | stripe.com | Bare domain without https:// or trailing slash. |
| `company_name` | string | no | none | Optional company name, used as a fallback label when the page does not expose one. |
| `domains` | array | no | none | List of bare domains for batch processing. Takes precedence over `domain`. One output row per domain. |
| `batchSize` | integer | no | 5 | Domains enriched concurrently per wave in batch mode. Maximum 10. |
| `skipCache` | boolean | no | false | Force a fresh enrichment and ignore the 7 day result cache. |

Provide either `domain` or `domains`.
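
Because the input expects bare domains, it can help to normalize values before sending them. The sketch below is one possible client-side helper, not part of the Actor: `to_bare_domain` and `build_input` are hypothetical names, and stripping a leading `www.` is an assumption about what counts as "bare". The `batchSize` clamp follows the 1 to 10 range in the input schema.

```python
from urllib.parse import urlsplit


def to_bare_domain(value):
    """Reduce a URL or host string to a bare, lowercase domain, or None."""
    text = (value or "").strip().lower()
    if not text:
        return None
    # urlsplit only fills in the host when a scheme separator is present.
    if "://" not in text:
        text = "//" + text
    try:
        host = urlsplit(text).hostname or ""
    except ValueError:
        return None  # e.g. a malformed bracketed host
    # Assumption: a leading "www." is not part of the bare domain.
    if host.startswith("www."):
        host = host[4:]
    return host or None


def build_input(raw_values, batch_size=5):
    """Build a batch input object with de-duplicated bare domains."""
    domains = []
    for raw in raw_values:
        domain = to_bare_domain(raw)
        if domain and domain not in domains:
            domains.append(domain)
    # The input schema allows batchSize between 1 and 10.
    return {"domains": domains, "batchSize": max(1, min(batch_size, 10))}
```

For example, `build_input(["https://gitlab.com/", "gitlab.com", "notion.so"])` yields a `domains` list with two entries and the default `batchSize` of 5.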

### Output

One flat row per domain. Every field is always present; absent values are null.

| Field | Type | Description | Example |
|-------|------|-------------|---------|
| `domain` | string | Normalized input domain | `gitlab.com` |
| `company_name` | string | From JSON-LD name, meta tags, or the input fallback | `GitLab` |
| `employee_band` | string | Bucketed employee range, or null | `1001-5000` |
| `employee_count` | integer | Raw count from JSON-LD numberOfEmployees, or null | `2500` |
| `industry` | string | Best-effort industry, often null | `Financial Services` |
| `hq_location` | string | "City, Region, Country" from JSON-LD address | `San Francisco, CA, US` |
| `founded_year` | string | Four digit year from foundingDate | `2011` |
| `revenue_estimate` | string | Heuristic band from employee count, not authoritative | `$250M-$1B` |
| `logo_url` | string | From JSON-LD logo or og:image | `https://.../logo.svg` |
| `description` | string | Company description, capped at 500 characters | `GitLab is ...` |
| `source_signals` | array | Which sources populated the record | `["jsonld_organization"]` |
| `data_completeness` | integer | 0 to 100, share of the eight core fields populated | `88` |
| `run_date` | string | ISO timestamp of the run | `2026-06-19T13:13:15Z` |

**Heuristics:** `employee_band` buckets the raw count (1-10 through 10001+). `revenue_estimate` is derived from the employee count using a per-employee proxy and is an estimate, not an authoritative figure. `industry` is best-effort and is often null because schema.org has no first-class industry field. `data_completeness` and `source_signals` let you gate on real coverage rather than assuming a field is present.
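
The two simplest heuristics can be sketched as follows. This is illustrative only: the README confirms the first (`1-10`) and last (`10001+`) bands and shows `1001-5000` as an example, but the intermediate band edges, the list of eight core fields, and the half-up rounding used here are all assumptions rather than the Actor's documented behavior.

```python
# Assumed band edges (upper bound inclusive); only "1-10", "1001-5000",
# and "10001+" appear in the README.
BANDS = [
    (10, "1-10"),
    (50, "11-50"),
    (200, "51-200"),
    (500, "201-500"),
    (1000, "501-1000"),
    (5000, "1001-5000"),
    (10000, "5001-10000"),
]

# Assumed set of eight core fields; the README does not enumerate them.
CORE_FIELDS = [
    "company_name",
    "employee_band",
    "industry",
    "hq_location",
    "founded_year",
    "revenue_estimate",
    "logo_url",
    "description",
]


def employee_band(count):
    """Bucket a raw employee count into a band label, or None if unknown."""
    if count is None or count < 1:
        return None
    for upper, label in BANDS:
        if count <= upper:
            return label
    return "10001+"


def data_completeness(record):
    """Percentage (0 to 100) of core fields that are non-null and non-empty."""
    filled = sum(1 for field in CORE_FIELDS if record.get(field) not in (None, ""))
    total = len(CORE_FIELDS)
    # Integer half-up rounding; the Actor's rounding mode is not documented.
    return (100 * filled + total // 2) // total
```

Under these assumptions, a record with seven of the eight fields populated scores 88, and a record with none scores 0.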

### Pricing

| Tier | Discount | Per result | Per 1K results |
|------|----------|-----------|----------------|
| Free (no plan) | 0% | $0.004 | $4.00 |
| Starter (Bronze) | ~5% | $0.0038 | $3.80 |
| Scale (Silver) | ~10% | $0.0036 | $3.60 |
| Business (Gold) | ~15% | $0.0034 | $3.40 |

Free tier: 50 results per month included, resets monthly. Cached repeat lookups within 7 days are free.
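
As a rough budgeting aid, the table above reduces to a per-result price multiplied by the number of billable results. The sketch below assumes you already know how many results are billable (that is, after subtracting any free allowance and cached lookups); the tier keys and the `estimate_cost` helper are hypothetical names, and `Decimal` is used to avoid floating-point drift on small prices.

```python
from decimal import Decimal

# Per-result prices copied from the pricing table above.
PRICE_PER_RESULT = {
    "free": Decimal("0.004"),
    "starter": Decimal("0.0038"),
    "scale": Decimal("0.0036"),
    "business": Decimal("0.0034"),
}


def estimate_cost(billable_results, tier="free"):
    """Estimated USD cost for a number of billable results on a given tier."""
    if billable_results < 0:
        raise ValueError("billable_results must be non-negative")
    return PRICE_PER_RESULT[tier] * billable_results
```

For instance, `estimate_cost(1000, "business")` evaluates to 3.40 USD, matching the "Per 1K results" column.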

### Usage Examples

#### Apify Console / API

```bash
curl -X POST "https://api.apify.com/v2/acts/YlUtLWjfPpqykmB8g/run-sync-get-dataset-items?token=YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"domain":"gitlab.com"}'
````

Batch:

```json
{ "domains": ["gitlab.com", "stripe.com", "notion.so"], "batchSize": 5 }
```

#### Clay Integration

1. Add an Enrichment column of type HTTP API, or use the Apify integration.
2. Call this Actor with `domain` mapped to your domain column.
3. Map the returned fields to columns: `company_name`, `employee_band`, `industry`, `hq_location`, `founded_year`, `revenue_estimate`, `data_completeness`.
4. Gate downstream enrichment or outreach on a formula like `data_completeness >= 50` to skip rows where little firmographic data was found.

The output is flat and one row per domain, so every field maps directly to a Clay column with no JSON unwrapping.
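
Outside Clay, the same gating rule from step 4 can be applied to dataset rows in code. The following is a small sketch; `split_by_coverage` is a hypothetical helper, and the default threshold of 50 simply mirrors the example formula above.

```python
def split_by_coverage(rows, threshold=50):
    """Split rows into (ready, skipped) by their data_completeness score."""
    ready, skipped = [], []
    for row in rows:
        # Treat a missing or null score as zero coverage.
        score = row.get("data_completeness") or 0
        if score >= threshold:
            ready.append(row)
        else:
            skipped.append(row)
    return ready, skipped
```

Rows in `ready` can flow on to scoring or outreach, while rows in `skipped` can be routed to a different enrichment source or dropped.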

#### MCP Integration

```bash
npm install @mambalabsdev/mcp-company-firmographic-enricher
```

```json
{
  "mcpServers": {
    "company-firmographic-enricher": {
      "command": "npx",
      "args": ["-y", "@mambalabsdev/mcp-company-firmographic-enricher"],
      "env": { "APIFY_TOKEN": "YOUR_TOKEN" }
    }
  }
}
```

Tool: `enrich_company_firmographics` with `{ "domain": "gitlab.com" }`.

### Error Handling

| Condition | Behavior | Output |
|-----------|----------|--------|
| Empty or invalid domain | Empty record pushed, run continues | all fields null, `data_completeness:0` |
| Domain unreachable or fetch fails | Empty record, `source_signals:[]` | row emitted, not a run error |
| Direct fetch blocked (403 or 429) | Retry via datacenter proxy; if still blocked, empty record | `source_signals` notes the proxy attempt |
| No JSON-LD on the page | Fall back to HTML meta tags | partial record from meta only |
| One domain throws in a batch | Caught per domain, empty record pushed | other rows unaffected |
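
The direct-then-proxy behavior in the table can be sketched as a small control-flow function. This is an illustration under stated assumptions, not the Actor's code: `fetch_with_fallback` is a hypothetical name, the two fetchers are injected callables that return `(status_code, html)`, and the `"proxy_fetch"` signal label is invented for the example.

```python
BLOCKED_STATUSES = {403, 429}


def fetch_with_fallback(domain, fetch_direct, fetch_via_proxy):
    """Try a direct fetch, fall back to a proxy on 403/429.

    Returns (html_or_None, signals). A None result means the caller should
    emit an empty record instead of raising, so one bad domain never fails
    the rest of a batch.
    """
    signals = []
    try:
        status, html = fetch_direct(domain)
        if status in BLOCKED_STATUSES:
            # Hypothetical signal name recording that the proxy was tried.
            signals.append("proxy_fetch")
            status, html = fetch_via_proxy(domain)
    except Exception:
        # Network errors are swallowed per domain; the row is still emitted.
        return None, signals
    if not (200 <= status < 300) or not html:
        return None, signals
    return html, signals
```

Passing the fetchers in as arguments keeps the fallback logic testable without network access; in practice they would wrap whatever HTTP client and proxy configuration you use.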

### Limitations

- **Coverage varies by what the company publishes.** Firmographics come from the company's own structured data. Sites with rich schema.org/Organization JSON-LD return a near-complete record; sites with only basic meta tags return name, description, and logo. `data_completeness` and `source_signals` make this transparent on every row.
- **`revenue_estimate` is a heuristic, not a figure.** It is derived from the employee count using a per-employee proxy. Treat it as a rough band, not an authoritative revenue number.
- **`industry` is best-effort.** schema.org has no first-class industry field, so this is often null. Do not rely on it being present.
- **No paid data provider.** This Actor does not call ZoomInfo, Clearbit, or Crunchbase, so it will not return data those providers hold but the company does not publish. That is the tradeoff for fully auditable, low-cost, ToS-clean enrichment.
- **Data freshness.** Results are cached for 7 days. Pass `skipCache: true` for a live enrichment.

***

**Part of the [Mamba Labs GTM Intelligence Suite](https://apify.com/mambalabs)**

| Actor | Actor ID |
|-------|----------|
| [GTM Hiring Signal Scraper](https://apify.com/mambalabs/gtm-hiring-signal-scraper) | D7O1SA2EqwHGsGr1P |
| [GTM Tech Stack Signal Enrichment](https://apify.com/mambalabs/gtm-tech-stack-signal-scraper) | qyd7nNyqFPelQViBx |
| [GTM Signals Aggregator](https://apify.com/mambalabs/gtm-signals-aggregator) | xKdRfnfFNkdMpFuNs |
| [Job Board Keyword Signal Scanner](https://apify.com/mambalabs/job-board-keyword-signal-scanner) | 4DvqpvhMR74NLcDDY |
| [Domain to LinkedIn URL Resolver](https://apify.com/mambalabs/domain-to-linkedin-url-resolver) | 3HtnSaqPHOg1Qg5gx |
| [ICP Fit Scorer](https://apify.com/mambalabs/icp-fit-scorer) | W161DT8W4kW55dMFh |
| [Domain Deliverability Checker](https://apify.com/mambalabs/domain-deliverability-checker) | 0tVgxI7A6o9jMlxmc |
| [Company Firmographic Enricher](https://apify.com/mambalabs/company-firmographic-enricher) | YlUtLWjfPpqykmB8g |

npm: [@mambalabsdev/ats-scrapers](https://www.npmjs.com/package/@mambalabsdev/ats-scrapers)

Built by [Mamba Labs](https://apify.com/mambalabs).

# Actor input Schema

## `domain` (type: `string`):

Bare domain without https:// or trailing slash. Example: stripe.com

## `company_name` (type: `string`):

Optional company name, used as a fallback label when the page does not expose one.

## `domains` (type: `array`):

Optional list of bare domains for batch processing. Takes precedence over the single domain field when provided. Each domain produces its own output row.

## `batchSize` (type: `integer`):

How many domains to enrich concurrently per wave in batch mode. Default 5, maximum 10.

## `skipCache` (type: `boolean`):

By default a result is cached for 7 days and reused on repeat lookups. Set true to force a fresh enrichment and ignore any cached result.

## Actor input object example

```json
{
  "domain": "stripe.com",
  "batchSize": 5,
  "skipCache": false
}
```

# Actor output Schema

## `results` (type: `string`):

No description

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "domain": "stripe.com"
};

// Run the Actor and wait for it to finish
const run = await client.actor("mambalabs/company-firmographic-enricher").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = { "domain": "stripe.com" }

# Run the Actor and wait for it to finish
run = client.actor("mambalabs/company-firmographic-enricher").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "domain": "stripe.com"
}' |
apify call mambalabs/company-firmographic-enricher --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=mambalabs/company-firmographic-enricher",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Company Firmographics: Employees Industry HQ Revenue Clay",
        "description": "Domain to structured company firmographics: employee band, industry, HQ, founded year, revenue estimate, logo, and description from schema.org JSON-LD and meta tags. Flat JSON, Clay ready, with source provenance.",
        "version": "0.0",
        "x-build-id": "CSCU3DXMkJDll4vKp"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/mambalabs~company-firmographic-enricher/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-mambalabs-company-firmographic-enricher",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/mambalabs~company-firmographic-enricher/runs": {
            "post": {
                "operationId": "runs-sync-mambalabs-company-firmographic-enricher",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/mambalabs~company-firmographic-enricher/run-sync": {
            "post": {
                "operationId": "run-sync-mambalabs-company-firmographic-enricher",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "properties": {
                    "domain": {
                        "title": "Company Domain",
                        "type": "string",
                        "description": "Bare domain without https:// or trailing slash. Example: stripe.com"
                    },
                    "company_name": {
                        "title": "Company Name (optional)",
                        "type": "string",
                        "description": "Optional company name, used as a fallback label when the page does not expose one."
                    },
                    "domains": {
                        "title": "Company Domains (batch)",
                        "type": "array",
                        "description": "Optional list of bare domains for batch processing. Takes precedence over the single domain field when provided. Each domain produces its own output row.",
                        "items": {
                            "type": "string"
                        }
                    },
                    "batchSize": {
                        "title": "Batch Size",
                        "minimum": 1,
                        "maximum": 10,
                        "type": "integer",
                        "description": "How many domains to enrich concurrently per wave in batch mode. Default 5, maximum 10.",
                        "default": 5
                    },
                    "skipCache": {
                        "title": "Skip Cache",
                        "type": "boolean",
                        "description": "By default a result is cached for 7 days and reused on repeat lookups. Set true to force a fresh enrichment and ignore any cached result.",
                        "default": false
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
