# Business Registry & Ownership Intel - KYB, Officers (`seibs.co/business-registry-intel`) Actor

Unifies US Secretary-of-State business registries into one normalized entity schema: status, type, formation date, registered agent, officers, addresses. KYB value layer: cross-state entity resolution, officer linking, and an ownership graph. Logged-out public records. For KYB/AML and PE/M\&A.

- **URL**: https://apify.com/seibs.co/business-registry-intel.md
- **Developed by:** [Seibs.co](https://apify.com/seibs.co) (community)
- **Categories:** Business, Lead generation, Developer tools
- **Stats:** 1 total users, 0 monthly users, 100.0% runs succeeded, 0 bookmarks
- **User rating**: No ratings yet

## Pricing

from $8.00 / 1,000 company records

This Actor is paid per event. You are not charged for Apify platform usage; you pay only a fixed price for specific events.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## Business Registry & Corporate Ownership Intel (KYB)

> **TL;DR for KYB/AML, compliance, PE/M&A, and B2B firmographics teams:** Unifies the fragmented US Secretary-of-State business registries (California, New York, Florida, Texas - plus opt-in Delaware) into one normalized legal-entity schema: name, file number, status, type, formation date, registered agent, officers/directors, and addresses. On top of the raw records it adds the KYB value layer competitors charge for: **cross-state entity resolution** (the same company in CA and NY is one entity, not two rows), **officer-to-company linking** (one person's full registry footprint), and an **inferred ownership/association graph** (parent/subsidiary/sister entities via shared officers, agents, addresses, and name roots). The underlying data is public-by-law, but it lives behind 50 separate portals with no unified free API - the exact fragmentation OpenCorporates (GBP 2,250-12,000/yr) and D&B Direct+ ($25,000+/yr) monetize. Government public records, logged-out, PII-minimized. Free Apify plan covers exploration runs on your $5 platform credit.

### Run it in 30 seconds

```python
# Via the Apify Python API client (apify-client)
from apify_client import ApifyClient

client = ApifyClient("<YOUR_APIFY_TOKEN>")
run = client.actor("seibs.co/business-registry-intel").call(run_input={
    "mode": "entity_search",
    "companies": ["Acme Holdings"],
    "states": ["CA", "NY", "FL", "TX"]
})
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)
````

Or via curl:

```bash
curl -X POST "https://api.apify.com/v2/acts/seibs.co~business-registry-intel/run-sync-get-dataset-items?token=<YOUR_APIFY_TOKEN>" \
  -H "Content-Type: application/json" \
  -d '{"mode": "entity_search", "companies": ["Acme Holdings"], "states": ["CA","NY","FL","TX"]}'
```

Or click "Try for free" on this page if you prefer the no-code UI.

### What you get

Each run produces:

- A clean dataset, filterable in the Apify console and downloadable as CSV or JSON
- An OUTPUT.html dashboard preview of your top records
- A sample-output preview at [`.actor/sample-output.json`](./.actor/sample-output.json)
- An `access_notes` record up top documenting each state's access method, proxy needs, and any cost/gating

### What does Business Registry Intel do?

It queries each selected state's public business-entity registry and normalizes every result into one schema: `entity_name`, `file_number`, `jurisdiction`, `entity_type` (normalized to `llc` / `corporation` / `lp` / ...), `status` (normalized to `active` / `dissolved` / `suspended` / ...), `formation_date`, `registered_agent`, `officers`, `principal_address`, and `source_url`. Then it runs the **value layer**:

- **Cross-state entity resolution** - clusters records that refer to the same legal entity across jurisdictions (canonical-name + fuzzy match), so a company registered in CA and foreign-qualified in NY shows up as one `entity_cluster` rollup, not two disconnected rows.
- **Officer-to-company linking** - groups officers and registered agents by a normalized person key and flags **multi-entity individuals** (the AML/investigator signal).
- **Ownership/association graph** - infers parent/subsidiary/sister-entity edges from shared officers, shared registered agents, shared addresses, and common name roots, then reports the connected components.
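
The canonical-name + fuzzy-match step can be pictured with a minimal sketch like the one below. It is an illustration only, not the Actor's actual resolver: the suffix list, the `canonical_name` and `cluster_entities` helpers, and the 0.9 threshold are all assumptions made for this example, and it uses the stdlib `difflib` rather than rapidfuzz.

```python
import re
from difflib import SequenceMatcher

# Hypothetical suffix list; a production resolver would need a much fuller one.
LEGAL_SUFFIXES = {"llc", "inc", "incorporated", "corp", "corporation", "co", "company", "ltd", "lp"}


def canonical_name(name: str) -> str:
    """Lowercase, drop punctuation, and strip trailing legal-form tokens."""
    tokens = re.sub(r"[^a-z0-9 ]", " ", name.lower()).split()
    while tokens and tokens[-1] in LEGAL_SUFFIXES:
        tokens.pop()
    return " ".join(tokens)


def cluster_entities(records, threshold=0.9):
    """Greedy clustering: attach each record to the first cluster whose
    canonical name is similar enough, otherwise start a new cluster."""
    clusters = []
    for rec in records:
        key = canonical_name(rec["entity_name"])
        for cluster in clusters:
            if SequenceMatcher(None, key, cluster["canonical"]).ratio() >= threshold:
                cluster["members"].append(rec)
                break
        else:
            clusters.append({"canonical": key, "members": [rec]})
    return clusters


records = [
    {"entity_name": "Acme Holdings, LLC", "jurisdiction": "CA"},
    {"entity_name": "ACME HOLDINGS LLC", "jurisdiction": "NY"},
    {"entity_name": "Blue Ridge Capital Inc.", "jurisdiction": "FL"},
]
clusters = cluster_entities(records)
for c in clusters:
    print(c["canonical"], [m["jurisdiction"] for m in c["members"]])
```

Here the CA and NY filings collapse into one cluster while the FL entity stays separate. Name similarity alone can merge unrelated companies, so a real resolver would likely weigh other fields (agent, address, formation date) as well.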

### Modes

| Mode | What it returns |
|---|---|
| `entity_search` (default) | Every matching entity per state for your name queries, plus cross-state `entity_cluster` resolution rollups. |
| `entity_profile` | Deep profile (registered agent, officers/directors, addresses, filing history where exposed) for the matched entities - charges `officer_director_enrichment` per entity that yields officer data. |
| `officer_lookup` | One `officer_link` record per person, with their resolved company footprint and a `multi_entity` flag. |
| `ownership_graph` | The full association graph (`nodes`, `edges`, `components`) plus the entities and clusters it was built from. Charges `ownership_graph_enrichment` once. |
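
To make the `nodes` / `edges` / `components` vocabulary concrete, here is a small sketch of how shared-officer and shared-agent edges could be turned into connected components with union-find. The record shape (`officers` as a list of `{"name": ...}` dicts, `registered_agent` as a string) and the helper names are assumptions for illustration; the Actor's real graph builder and output fields may differ.

```python
from collections import defaultdict
from itertools import combinations


def build_edges(entities):
    """Link entities that share an officer name or a registered agent."""
    by_key = defaultdict(list)
    for ent in entities:
        for officer in ent.get("officers", []):
            by_key[("officer", officer["name"].lower())].append(ent["file_number"])
        agent = ent.get("registered_agent")
        if agent:
            by_key[("agent", agent.lower())].append(ent["file_number"])
    edges = set()
    for (kind, _), ids in by_key.items():
        for a, b in combinations(sorted(set(ids)), 2):
            edges.add((a, b, kind))
    return edges


def connected_components(nodes, edges):
    """Union-find over the edge list; returns sorted lists of node ids."""
    parent = {n: n for n in nodes}

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # path halving
            n = parent[n]
        return n

    for a, b, _ in edges:
        parent[find(a)] = find(b)
    groups = defaultdict(set)
    for n in nodes:
        groups[find(n)].add(n)
    return sorted(sorted(g) for g in groups.values())


entities = [
    {"file_number": "A1", "officers": [{"name": "Jane Doe"}], "registered_agent": "First Agent Co"},
    {"file_number": "B2", "officers": [{"name": "JANE DOE"}], "registered_agent": "Second Agent Co"},
    {"file_number": "C3", "officers": [{"name": "John Roe"}], "registered_agent": "Third Agent Co"},
]
nodes = [e["file_number"] for e in entities]
edges = build_edges(entities)
components = connected_components(nodes, edges)
print(components)
```

In this toy data, A1 and B2 end up in one component through the shared officer, and C3 stands alone. As the README notes, such edges are investigative leads rather than verified ownership.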

### National coverage: all 50 states + DC

The 50 US registries share no schema and no unified API - that fragmentation is the moat we unify. Every state + DC is registered as a connector with a documented access method, anti-bot level, and proxy tier. Coverage is honestly tiered:

**Fully-parsed registries (13)** - request + parser verified against the live portal, returns real normalized entity data (11 over http; PA + MI over the browser tier):

| State | Registry | Access method | Anti-bot | Proxy | Notes |
|---|---|---|---|---|---|
| **PA** | Pennsylvania DOS (PennFile) | `browser` (bizfile API behind Cloudflare) | high | RESIDENTIAL | patchright clears Cloudflare; bizfile API called from inside the page. ~3M entities. |
| **MI** | Michigan LARA (MiBusiness Registry) | `browser` (webSearch JSON API behind Cloudflare) | high | RESIDENTIAL | patchright clears Cloudflare; agent + status + type in the search row. |
| **CA** | California SOS (bizfileOnline) | `http_json` (POST API) | low | DATACENTER | Bulk data available. Officers via detail page. |
| **NY** | NY Dept. of State Public Inquiry | `http_json` (token bootstrap + POST) | moderate | DATACENTER | Officers via detail page. |
| **FL** | Florida Sunbiz | `http_html` | moderate (edge WAF) | DATACENTER -> RESIDENTIAL | Edge WAF 403s plain HTTP; cleared by the curl\_cffi tier. Officers via detail page (drift-guarded). |
| **TX** | Texas Comptroller Taxable Entity Search | `http_html` | low | DATACENTER | SOSDirect officer data is paid -> out of scope. |
| **CO** | Colorado SOS (Business Entities open data) | `http_json` (Socrata) | none | DATACENTER | Full dataset downloadable. Agent + principal address in search. |
| **CT** | Connecticut SOS (Business Registry open data) | `http_json` (Socrata) | none | DATACENTER | Open data exposes name/status/date; no entity type. |
| **OR** | Oregon SOS (Business Registry open data) | `http_json` (Socrata) | none | DATACENTER | Multi-row-per-entity (grouped); agent + principal + authorized reps. Active entities only. |
| **WI** | Wisconsin DFI (Corporate Records) | `http_html` (plain GET) | none | DATACENTER | Name/type/status/formation date in results; officers not online (agent free on detail). |
| **NJ** | New Jersey DORES (Business Name Search) | `http_html` (GET token -> POST) | low | DATACENTER | Free path: name/id/city/type/formation date. Status/agent/officers are paid -> out of scope (like TX/DE). |
| **ID** | Idaho SOS (SOSBiz) | `http_json` (bizfile platform) | low | DATACENTER | Same vendor API family as CA. Status/type/agent in search. |
| **ND** | North Dakota SOS (FirstStop) | `http_json` (bizfile platform) | low | DATACENTER | Same vendor API family as CA. Status/type in search. |

**Catalog-registered (38)** - correct portal + access method + anti-bot + proxy tier recorded, escalation pipeline wired, per-state parser pending (these return a documented `state_pending` note). Grouped by *why* they're pending:

- **`browser_required` (21)** - the browser tier now **defeats Cloudflare/Imperva** (patchright headful - same approach that made PA + MI fully-parsed), so for **12 of these the only remaining work is capturing each portal's search API/selectors** (a routine ~15-min task per state): **AK, AR, IL, MA, MD, MN, NC, NM, NV, OH, OK, WA** (Cloudflare/SPA, no CAPTCHA) - drop-in candidates for the next round. The other **9 cross the no-CAPTCHA / no-login line and stay fail-soft**: **AZ, GA, NE, SC, WY** (per-search CAPTCHA - need the opt-in solver), **VA** (reCAPTCHA v3 Enterprise -> use its bulk file), **DC** (login), **DE** (CAPTCHA + opt-in fee), **HI** (login/migration).
- **`http_html` ViewState-pending (14)** - reachable via the curl\_cffi tier, but the search is an ASP.NET WebForms/MVC form needing a POST with per-page ViewState/anti-forgery tokens + a live result-row capture: **AL, IN, KS, KY, LA, ME, MO, NH, RI, SD, TN, UT, VT, WV**.
- **`http_json` pending (3)** - **IA** (Socrata dataset 404s after a portal migration; REST API has no name-search endpoint), **MS** (corpreporting JSON returns the full DB - its Kendo filter does not narrow server-side), **MT** (same bizfile platform as CA/ID/ND, but its API 500s to logged-out queries).

Pass `states: ["ALL"]` (or `all_states: true`) to query every jurisdiction. The live `access_matrix` (all 51, with `coverage` per state) is emitted in the `access_notes` record on every run. Upgrading a catalog state to fully-parsed is a single per-state parser (and, for the browser\_required group, the Playwright image) - the orchestrator, entity-resolution, escalation, and monitor layers are all state-agnostic.

### Anti-bot escalation (residential + browser)

Many registries sit behind an edge WAF that fingerprints the TLS/JA3 of the caller and 403s a plain request even from a clean IP (FL Sunbiz does exactly this). The client escalates automatically instead of giving up:

1. **httpx** over the DATACENTER proxy - cheapest, tried first.
2. On a Cloudflare/CAPTCHA/403 challenge -> **curl\_cffi** with real Chrome TLS impersonation over the **RESIDENTIAL** proxy. This defeats JA3/TLS-fingerprint WAFs (it turns FL Sunbiz's 403 into a 200 with real data) and is the portfolio's proven anti-bot tool.
3. On a true JS/CAPTCHA challenge (`browser_required` states) -> the **browser tier** (stealth-patched `patchright`, headful by default) over the RESIDENTIAL proxy. This tier is optional: it runs on a Playwright-capable image and is skipped cleanly otherwise.
4. **Fail-soft** - if every tier is blocked or unavailable, the connector emits a documented `fetch_error` and the run still finishes SUCCEEDED with whatever other states returned.

The proxy tier auto-selects: DATACENTER for the first pass, RESIDENTIAL for the escalation legs (provisioned up front). Set `use_browser_fallback: false` to use plain httpx only. Per-run escalation counts are reported in the `access_notes.anti_bot_escalation` block. Delaware is **off by default** and never spends its per-search fee silently.
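
The escalation order above can be sketched as a simple loop over tiers. This is a hedged illustration, not the Actor's client: `fetch_with_escalation`, the `BLOCK_STATUSES` set, and the stub fetchers are hypothetical, and real tiers would wrap httpx, curl\_cffi, and the browser rather than lambdas.

```python
from typing import Callable

# A fetcher takes a URL and returns (status_code, body).
Fetcher = Callable[[str], tuple[int, str]]

# Assumed set of status codes treated as "challenged"; the real client
# may also inspect the response body for Cloudflare/CAPTCHA markers.
BLOCK_STATUSES = {403, 429, 503}


def fetch_with_escalation(url: str, tiers: list[tuple[str, Fetcher]]) -> dict:
    """Try each tier in order; fall through on a block or an exception."""
    attempts = []
    for name, fetch in tiers:
        try:
            status, body = fetch(url)
        except Exception as exc:  # tier unavailable (e.g. no browser image)
            attempts.append((name, f"error: {exc}"))
            continue
        attempts.append((name, status))
        if status not in BLOCK_STATUSES:
            return {"ok": True, "tier": name, "body": body, "attempts": attempts}
    # Fail-soft: report the problem instead of raising.
    return {"ok": False, "fetch_error": "all tiers blocked or unavailable", "attempts": attempts}


# Stub tiers standing in for the real HTTP clients.
tiers = [
    ("httpx", lambda url: (403, "")),
    ("curl_cffi", lambda url: (200, "<html>results</html>")),
]
result = fetch_with_escalation("https://example.invalid/search", tiers)
print(result["tier"], result["attempts"])
```

Returning a `fetch_error` dict instead of raising mirrors the fail-soft behavior described in step 4: one blocked state does not abort the rest of the run.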

### The browser tier (how PA + MI work, and how to extend it)

The browser tier opens a **stealth-patched browser (`patchright`, bundled) in headful mode**, which **defeats the Cloudflare/Imperva managed challenges** that block a plain headless browser (verified live: PA + MI return real data this way). For Cloudflare-fronted JSON APIs it calls the API **from inside the warmed page** via `fetch()`, so the request carries the `cf_clearance` cookie + the real browser TLS (a Playwright APIRequestContext does not, and 403s).

- **PA** warms the page, then POSTs the bizfile API in-page (same shape as CA/ID/ND). **MI** GETs its `webSearch` API in-page. Both are `coverage: full`.
- Runs headful by default; set **`BROWSER_HEADLESS=1`** to force headless (e.g. a server with no display and no Xvfb - but Cloudflare will then block). On the `apify/actor-python-playwright` image headful runs under Xvfb automatically.
- Optionally point at an **already-running anti-detect browser** via **`browser_cdp_url`** (input) or **`BROWSER_CDP_URL`** (env) to inherit its session/IP over CDP - useful for a residential-IP browser you already trust.

**Adding the remaining Cloudflare/SPA states (AK, AR, IL, MA, MD, MN, NC, NM, NV, OH, OK, WA)** is now routine: patchright already passes their challenge, so each just needs its search API/selectors captured once and a ~20-line connector (a `browser_recipe` returning either an `api` call or a `fill`/`submit`/`capture` flow, plus a parser - copy PA/MI). They currently make a generic best-effort attempt (rows tagged `parse_confidence: "generic"`).

**CAPTCHA / login states (AZ, GA, NE, SC, WY, DC, DE, HI, VA)** gate the *search action* behind a real CAPTCHA or a login - off by default:

- Set **`CAPTCHA_SOLVER_PROVIDER`** (`2captcha` | `capsolver`) + **`CAPTCHA_SOLVER_KEY`** as actor env secrets to enable the solver (AZ has a worked fill -> solve -> capture recipe; the others follow once their selectors are captured).
- VA's reCAPTCHA v3 Enterprise is unsolvable (use its bulk file); DC/DE need a login/fee and stay opt-in.

The `access_notes.browser_tier` record reports whether the CDP endpoint + solver are configured each run. **Responsibility note:** these are government public records, but solving a CAPTCHA or logging in circumvents an access control - the actor only does so when you explicitly enable it, and never pays per-record fees.

### Miami (and any city) -> covered by the state connector

US business registration is at the **state** level, not the city level. A Miami-based company is registered with the **Florida** Division of Corporations (Sunbiz), so it resolves through the **FL** connector - there is no separate "Miami" registry to add. The same holds for every city: search the state (e.g. `states: ["FL"]` for Miami/Orlando/Tampa, `states: ["NY"]` for NYC, `states: ["IL"]` for Chicago). Confirmed in testing: a Walt-Disney query against FL returns "THE WALT DISNEY COMPANY" and other Florida-registered Disney entities via Sunbiz.

### Responsible use / data scope

This actor is a **public-data tool** that reads **government public records** - business-entity registries that are public by law, with no adverse platform owner. It stays logged-out: no accounts, no cookies pasted by a user, no paywalls bypassed (Delaware's fee-gated detail report is opt-in and never auto-charged). It minimizes PII: officer and registered-agent **names are themselves public record** on these filings, and we keep only name + title + business address - we never enrich into private/personal contact data. The ownership graph is labeled an **association graph** (an investigative lead from publicly-filed linkages), not a verified beneficial-ownership determination. You are responsible for lawful use of the outputs - GDPR (EU) and CCPA (CA) apply to personal data even when it is public.

### AI / RAG / Agent

A turn-key KYB feed for compliance copilots, diligence agents, and B2B-enrichment bots. Entities arrive pre-normalized with `status`, `entity_type`, `entity_cluster_id`, and resolved `officers` so an agent can answer "is this counterparty active, and what other entities does its CEO control?" without parsing 50 different portals. Compatible with **LangChain**, **LlamaIndex**, **Pinecone**, **Weaviate**, **Chroma**, and any **MCP**-aware agent runtime (see the sibling `mcp-business-registry-intel` actor for direct tool-call wiring with x402 / Skyfire agentic payments).

### Features

- **National coverage** - all 50 states + DC catalogued; 13 fully parsed today (CA, NY, FL, TX, CO, CT, OR, WI, NJ, ID, ND + PA, MI via the browser tier), the rest catalog-registered with correct access metadata and the escalation pipeline wired. Pass `states: ["ALL"]` for every jurisdiction.
- **Automatic anti-bot escalation** - httpx -> curl\_cffi Chrome TLS impersonation (residential) -> optional Playwright browser (residential) -> fail-soft. Defeats the TLS-fingerprint WAFs registries use, with the proxy tier auto-selected per challenge.
- **Normalized entity schema** - one shape across every state, with `entity_type` and `status` mapped onto a canonical vocabulary.
- **Cross-state entity resolution** - canonical-name + fuzzy clustering (rapidfuzz when present, stdlib `difflib` fallback) with `entity_cluster_id` + confidence.
- **Officer-to-company linking** - per-person footprint with a `multi_entity` flag.
- **Ownership/association graph** - shared officer / agent / address / name-root edges with connected components.
- **Monitor mode** - run under an Apify Schedule and get only the change-delta (new filings, status changes, dissolutions) plus an optional Slack digest.
- **Cost-control** - pre-flight caps, per-run budget guard, and demo-mode soft-fail so runs finish SUCCEEDED.
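
For readers wondering what a monitor-mode change-delta amounts to, the sketch below diffs two snapshots of normalized records keyed by `(jurisdiction, file_number)`. The `diff_snapshots` helper and the shape of the returned delta are assumptions for illustration; the Actor's own digest format is not documented here.

```python
def diff_snapshots(previous, current):
    """Compare two lists of entity records and bucket the changes."""

    def index(records):
        return {(r["jurisdiction"], r["file_number"]): r for r in records}

    prev, curr = index(previous), index(current)
    delta = {"new_filings": [], "status_changes": [], "dissolutions": []}
    for key, rec in curr.items():
        old = prev.get(key)
        if old is None:
            delta["new_filings"].append(rec)
        elif old["status"] != rec["status"]:
            bucket = "dissolutions" if rec["status"] == "dissolved" else "status_changes"
            delta[bucket].append({"entity": rec, "previous_status": old["status"]})
    return delta


previous = [
    {"jurisdiction": "CA", "file_number": "C1", "status": "active"},
    {"jurisdiction": "NY", "file_number": "N2", "status": "active"},
]
current = [
    {"jurisdiction": "CA", "file_number": "C1", "status": "dissolved"},
    {"jurisdiction": "NY", "file_number": "N2", "status": "suspended"},
    {"jurisdiction": "FL", "file_number": "F3", "status": "active"},
]
delta = diff_snapshots(previous, current)
print({name: len(items) for name, items in delta.items()})
```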

### Use cases

- **KYB / AML onboarding** - verify a counterparty's legal existence, status, and registered agent across states in one call; flag dissolved or suspended entities.
- **PE / M\&A entity mapping** - resolve a target's entities across jurisdictions and map the subsidiary/sister-entity graph from shared officers and agents.
- **B2B firmographics** - attach the verified legal entity (and its officers) behind a Maps/website lead. Pairs with every vertical lead-finder in this portfolio.
- **Investigations / journalism** - find every company a person is an officer of, and the cluster of entities sharing their address or registered agent.
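
The "every company a person is an officer of" lookup rests on grouping officers by a normalized person key. A minimal sketch of that idea follows; the `person_key` normalization (lowercase, strip punctuation, drop single-letter middle initials) and the record shape are assumptions, not the Actor's documented logic.

```python
import re
from collections import defaultdict


def person_key(name: str) -> str:
    """Hypothetical normalization: lowercase, drop punctuation and
    single-letter middle initials, keep first and last tokens."""
    tokens = re.sub(r"[^a-z ]", " ", name.lower()).split()
    if len(tokens) > 2:
        middle = [t for t in tokens[1:-1] if len(t) > 1]
        tokens = [tokens[0], *middle, tokens[-1]]
    return " ".join(tokens)


def link_officers(entities):
    """Group entities by officer and flag people tied to more than one."""
    footprint = defaultdict(set)
    for ent in entities:
        for officer in ent.get("officers", []):
            footprint[person_key(officer["name"])].add((ent["jurisdiction"], ent["file_number"]))
    return [
        {"person_key": key, "entities": sorted(ents), "multi_entity": len(ents) > 1}
        for key, ents in sorted(footprint.items())
    ]


entities = [
    {"jurisdiction": "CA", "file_number": "C1", "officers": [{"name": "Jane A. Doe"}]},
    {"jurisdiction": "NY", "file_number": "N2", "officers": [{"name": "JANE DOE"}, {"name": "John Roe"}]},
]
links = link_officers(entities)
for link in links:
    print(link)
```

Two different people can share a name, so a key like this produces candidates to review rather than confirmed identities.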

### Pricing (Pay Per Event)

| Event | Price | What it is |
|---|---|---|
| `company_record` | $0.008 | One normalized, cross-state-resolved legal entity. |
| `officer_director_enrichment` | $0.012 | Officers/directors, agent, and addresses from the detail page. |
| `ownership_graph_enrichment` | $0.020 | The association graph (premium KYB layer), once per `ownership_graph` run. |
| `scheduled_delta_run` | $0.050 | One monitor-mode change digest. |

A run that returns nothing costs nothing. Far below the gated alternatives (OpenCorporates GBP 2,250-12,000/yr; D\&B Direct+ $25,000+/yr).
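
As a worked example of the event pricing, the sketch below totals a run from hypothetical event counts using the prices in the table above. The `estimate_cost` helper is an illustration only; actual charges are whatever the platform bills for the events a run emits.

```python
# Prices copied from the Pay Per Event table (USD per event).
PRICES_USD = {
    "company_record": 0.008,
    "officer_director_enrichment": 0.012,
    "ownership_graph_enrichment": 0.020,
    "scheduled_delta_run": 0.050,
}


def estimate_cost(event_counts: dict[str, int]) -> float:
    """Sum price x count for each event; reject unknown event names."""
    unknown = set(event_counts) - set(PRICES_USD)
    if unknown:
        raise ValueError(f"unknown events: {sorted(unknown)}")
    total = sum(PRICES_USD[name] * count for name, count in event_counts.items())
    return round(total, 4)


# Hypothetical ownership_graph run: 100 entities, 40 of which yielded officers.
cost = estimate_cost({
    "company_record": 100,
    "officer_director_enrichment": 40,
    "ownership_graph_enrichment": 1,
})
print(f"${cost:.2f}")
```

For these counts the arithmetic is 100 x $0.008 + 40 x $0.012 + 1 x $0.020 = $0.80 + $0.48 + $0.02 = $1.30.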

### Related actors

- [sec-edgar-intel](https://apify.com/seibs.co/sec-edgar-intel) - federal SEC filings (Form D issuer <-> legal entity).
- [hiring-signal-intel](https://apify.com/seibs.co/hiring-signal-intel) - hiring surges per company.
- [b2b-sales-triggers](https://apify.com/seibs.co/b2b-sales-triggers) - buying-signal triggers.
- `mcp-business-registry-intel` - the MCP twin for AI agents (x402 / Skyfire ready).

# Actor input Schema

## `mode` (type: `string`):

entity\_search = search each registry by company name and resolve duplicates across states. entity\_profile = deep profile (officers, agent, addresses, filing history) for specific entities. officer\_lookup = group officers/agents by person and surface multi-entity individuals. ownership\_graph = build the parent/subsidiary/sister-entity association graph.

## `companies` (type: `array`):

Company names (or name fragments) to search each selected registry for, e.g. \['Acme Holdings', 'Blue Ridge Capital']. Hard cap of 50.

## `company_numbers` (type: `array`):

Profile specific entities directly without searching. Each item: {"state": "FL", "file\_number": "L21000123456"}. Use for entity\_profile / ownership\_graph when you already have the registry id. Hard cap of 50.

## `officer_names` (type: `array`):

Person names to resolve to their company footprint in officer\_lookup mode, e.g. \['Jane A Doe']. Officer names are public record on these filings; we keep name + title + business address only. Hard cap of 25.
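
Because `companies`, `company_numbers`, and `officer_names` each carry a hard cap, it can help to check an input client-side before starting a run. The `build_run_input` helper below is a hypothetical convenience, not part of the Actor or the Apify client; how the Actor itself reacts to an over-cap list is not stated here.

```python
# Caps taken from the field descriptions above.
HARD_CAPS = {"companies": 50, "company_numbers": 50, "officer_names": 25}


def build_run_input(mode: str, **fields) -> dict:
    """Assemble a run input and fail early if a list exceeds its cap."""
    for name, cap in HARD_CAPS.items():
        values = fields.get(name, [])
        if len(values) > cap:
            raise ValueError(f"{name} has {len(values)} items; hard cap is {cap}")
    return {"mode": mode, **fields}


# Profile a known entity directly by registry id, skipping the name search.
profile_input = build_run_input(
    "entity_profile",
    company_numbers=[{"state": "FL", "file_number": "L21000123456"}],
    include_officers=True,
)
print(profile_input)
```

The resulting dict can be passed as `run_input` to `client.actor("seibs.co/business-registry-intel").call(...)` in the same way as the examples elsewhere on this page.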

## `states` (type: `array`):

Two-letter state codes. All 50 states + DC are recognized. 13 fully-parsed registries return real data: CA, NY, FL, TX, CO, CT, OR, WI, NJ, ID, ND, plus PA + MI via the browser tier (CA/NY/FL/TX/CO are the default set). The other 38 jurisdictions are catalog-registered with the correct access method and the anti-bot escalation pipeline wired (a per-state parser is pending; they return a documented state\_pending note). Delaware (DE) is opt-in. Pass \['ALL'] to query every jurisdiction. Leave empty for the default five.

## `all_states` (type: `boolean`):

Shortcut to query every US jurisdiction (overrides the states list). Fully-parsed states return entities; catalog states return a documented state\_pending note. Delaware still needs enable\_delaware=true.

## `enable_delaware` (type: `boolean`):

Delaware is OFF by default: its entity search is CAPTCHA-gated and the detailed report carries a per-search fee (~$10). Turn this on to attempt DE. Even when on, the actor never silently spends the DE fee - it fails soft with a documented note if the portal blocks a logged-out request.

## `include_officers` (type: `boolean`):

Fetch each entity's detail page to add registered agent, officers/directors, and addresses. Charges officer\_director\_enrichment per entity that yields officer/director data. Defaults on for entity\_profile / ownership\_graph, off for a fast name-only entity\_search.

## `include_filing_history` (type: `boolean`):

Include the entity's filing-history list where the portal exposes it. Records grow larger; no extra charge.

## `max_results_per_state` (type: `integer`):

Hard cap on entities returned per state per name query. Default 25.

## `contact_email` (type: `string`):

Optional operator contact added to request headers. Good-citizen practice; not required.

## `monitor_webhook_url` (type: `string`):

When this actor runs under an Apify Schedule (monitor mode), post the change digest (new filings, status changes, dissolved entities) to this Slack-compatible webhook URL.

## `use_apify_proxy` (type: `boolean`):

Route registry requests through Apify Proxy. The DATACENTER tier handles the first (httpx) pass; a RESIDENTIAL tier is provisioned for the anti-bot escalation legs.

## `use_browser_fallback` (type: `boolean`):

When a portal serves a Cloudflare/CAPTCHA/403 challenge (e.g. FL Sunbiz from a datacenter IP), automatically escalate: switch to the RESIDENTIAL proxy and retry with curl\_cffi Chrome TLS impersonation, then (if available) a headless browser. Turn off to use plain httpx only.

## `browser_cdp_url` (type: `string`):

Optional. CDP/WebSocket endpoint of an already-running, anti-detect (UC-mode / real Chrome) browser. When set, the browser tier connects to it (inheriting its session + fingerprint) so it passes the Cloudflare/Imperva states (PA, OH, NC, MI, NV, ...). Without it a plain headless Chromium is launched (blocked by Cloudflare managed challenges). Can also be set as the BROWSER\_CDP\_URL env var. The per-search CAPTCHA states (AZ, GA, ...) additionally need an opt-in solver: set CAPTCHA\_SOLVER\_PROVIDER (2captcha|capsolver) + CAPTCHA\_SOLVER\_KEY as actor env secrets - off by default.

## `apify_proxy_groups` (type: `array`):

Override the auto-selected proxy group. Leave empty to let the actor pick DATACENTER (or RESIDENTIAL when Delaware is active).

## `concurrency` (type: `integer`):

Parallel registry fetches. Gov portals are rate-sensitive; default 4.

## Actor input object example

```json
{
  "mode": "entity_search",
  "companies": [
    "Acme Holdings"
  ],
  "company_numbers": [],
  "officer_names": [],
  "states": [
    "CA",
    "NY",
    "FL",
    "TX",
    "CO"
  ],
  "all_states": false,
  "enable_delaware": false,
  "include_officers": false,
  "include_filing_history": false,
  "max_results_per_state": 10,
  "contact_email": "",
  "monitor_webhook_url": "",
  "use_apify_proxy": true,
  "use_browser_fallback": true,
  "browser_cdp_url": "",
  "apify_proxy_groups": [],
  "concurrency": 4
}
```

# Actor output Schema

## `datasetItems` (type: `string`):

Narrow, token-efficient slice of every record. Consumer: LLM agents (Claude, GPT, LangChain tools), MCP hosts, KYB dashboards.

## `datasetItemsEntities` (type: `string`):

Full normalized legal entities with agent, officers, and addresses. Consumer: KYB/AML onboarding, CRM enrichment, RAG ingest.

## `datasetItemsClusters` (type: `string`):

One row per legal entity resolved across >= 2 jurisdictions. Consumer: PE/M\&A entity mapping, dedup pipelines.

## `datasetItemsMcp` (type: `string`):

First 50 overview records as a clean JSON array. Wrap on the agent side in an MCP tool-call envelope. Consumer: MCP servers, Claude Desktop, Cursor, OpenAI Assistants tool calls.

## `datasetItemsCsv` (type: `string`):

Spreadsheet-friendly export of the overview view. Consumer: compliance analysts, Excel / Google Sheets users.

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "mode": "entity_search",
    "companies": [
        "Acme Holdings"
    ],
    "states": [
        "CA",
        "NY",
        "FL",
        "TX",
        "CO"
    ],
    "max_results_per_state": 10
};

// Run the Actor and wait for it to finish
const run = await client.actor("seibs.co/business-registry-intel").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "mode": "entity_search",
    "companies": ["Acme Holdings"],
    "states": [
        "CA",
        "NY",
        "FL",
        "TX",
        "CO",
    ],
    "max_results_per_state": 10,
}

# Run the Actor and wait for it to finish
run = client.actor("seibs.co/business-registry-intel").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "mode": "entity_search",
  "companies": [
    "Acme Holdings"
  ],
  "states": [
    "CA",
    "NY",
    "FL",
    "TX",
    "CO"
  ],
  "max_results_per_state": 10
}' |
apify call seibs.co/business-registry-intel --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=seibs.co/business-registry-intel",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Business Registry & Ownership Intel - KYB, Officers",
        "description": "Unifies US Secretary-of-State business registries into one normalized entity schema: status, type, formation date, registered agent, officers, addresses. KYB value layer: cross-state entity resolution, officer linking, and an ownership graph. Logged-out public records. For KYB/AML and PE/M&A.",
        "version": "0.7",
        "x-build-id": "YoIQetYX5R3paoD2u"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/seibs.co~business-registry-intel/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-seibs.co-business-registry-intel",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/seibs.co~business-registry-intel/runs": {
            "post": {
                "operationId": "runs-sync-seibs.co-business-registry-intel",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/seibs.co~business-registry-intel/run-sync": {
            "post": {
                "operationId": "run-sync-seibs.co-business-registry-intel",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "required": [
                    "mode"
                ],
                "properties": {
                    "mode": {
                        "title": "Mode",
                        "enum": [
                            "entity_search",
                            "entity_profile",
                            "officer_lookup",
                            "ownership_graph"
                        ],
                        "type": "string",
                        "description": "entity_search = search each registry by company name and resolve duplicates across states. entity_profile = deep profile (officers, agent, addresses, filing history) for specific entities. officer_lookup = group officers/agents by person and surface multi-entity individuals. ownership_graph = build the parent/subsidiary/sister-entity association graph.",
                        "default": "entity_search"
                    },
                    "companies": {
                        "title": "Company name queries",
                        "maxItems": 50,
                        "type": "array",
                        "description": "Company names (or name fragments) to search each selected registry for, e.g. ['Acme Holdings', 'Blue Ridge Capital']. Hard cap of 50.",
                        "default": [],
                        "items": {
                            "type": "string"
                        }
                    },
                    "company_numbers": {
                        "title": "Direct entity references (state + file number)",
                        "maxItems": 50,
                        "type": "array",
                        "description": "Profile specific entities directly without searching. Each item: {\"state\": \"FL\", \"file_number\": \"L21000123456\"}. Use for entity_profile / ownership_graph when you already have the registry id. Hard cap of 50.",
                        "default": []
                    },
                    "officer_names": {
                        "title": "Officer / person names (officer_lookup)",
                        "maxItems": 25,
                        "type": "array",
                        "description": "Person names to resolve to their company footprint in officer_lookup mode, e.g. ['Jane A Doe']. Officer names are public record on these filings; we keep name + title + business address only. Hard cap of 25.",
                        "default": [],
                        "items": {
                            "type": "string"
                        }
                    },
                    "states": {
                        "title": "States to query",
                        "maxItems": 51,
                        "type": "array",
                        "description": "Two-letter state codes. All 50 states + DC are recognized. 13 fully-parsed registries return real data: CA, NY, FL, TX, CO, CT, OR, WI, NJ, ID, ND, plus PA + MI via the browser tier (CA/NY/FL/TX/CO are the default set). The remaining jurisdictions are catalog-registered with the correct access method and the anti-bot escalation pipeline wired; their per-state parsers are pending, so they return a documented state_pending note. Delaware (DE) is opt-in. Pass ['ALL'] to query every jurisdiction. Leave empty for the default five.",
                        "default": [
                            "CA",
                            "NY",
                            "FL",
                            "TX",
                            "CO"
                        ],
                        "items": {
                            "type": "string"
                        }
                    },
                    "all_states": {
                        "title": "Query all 50 states + DC",
                        "type": "boolean",
                        "description": "Shortcut to query every US jurisdiction (overrides the states list). Fully-parsed states return entities; catalog states return a documented state_pending note. Delaware still needs enable_delaware=true.",
                        "default": false
                    },
                    "enable_delaware": {
                        "title": "Enable Delaware (per-search fee)",
                        "type": "boolean",
                        "description": "Delaware is OFF by default: its entity search is CAPTCHA-gated and the detailed report carries a per-search fee (~$10). Turn this on to attempt DE. Even when on, the Actor never silently spends the DE fee; it fails soft with a documented note if the portal blocks a logged-out request.",
                        "default": false
                    },
                    "include_officers": {
                        "title": "Fetch officers / agent / addresses (detail page)",
                        "type": "boolean",
                        "description": "Fetch each entity's detail page to add registered agent, officers/directors, and addresses. Charges officer_director_enrichment per entity that yields officer/director data. Defaults on for entity_profile / ownership_graph, off for a fast name-only entity_search.",
                        "default": false
                    },
                    "include_filing_history": {
                        "title": "Include filing history",
                        "type": "boolean",
                        "description": "Include the entity's filing-history list where the portal exposes it. Records grow larger; no extra charge.",
                        "default": false
                    },
                    "max_results_per_state": {
                        "title": "Max entities per state per query",
                        "minimum": 1,
                        "maximum": 100,
                        "type": "integer",
                        "description": "Hard cap on entities returned per state per name query. Default 25.",
                        "default": 25
                    },
                    "contact_email": {
                        "title": "Contact email (optional)",
                        "type": "string",
                        "description": "Optional operator contact added to request headers. Good-citizen practice; not required.",
                        "default": ""
                    },
                    "monitor_webhook_url": {
                        "title": "Monitor webhook URL (Slack / email, optional)",
                        "type": "string",
                        "description": "When this Actor runs under an Apify Schedule (monitor mode), post the change digest (new filings, status changes, dissolved entities) to this Slack-compatible webhook URL.",
                        "default": ""
                    },
                    "use_apify_proxy": {
                        "title": "Use Apify Proxy",
                        "type": "boolean",
                        "description": "Route registry requests through Apify Proxy. The DATACENTER tier handles the first (httpx) pass; a RESIDENTIAL tier is provisioned for the anti-bot escalation legs.",
                        "default": true
                    },
                    "use_browser_fallback": {
                        "title": "Anti-bot escalation (curl_cffi + browser)",
                        "type": "boolean",
                        "description": "When a portal serves a Cloudflare/CAPTCHA/403 challenge (e.g. FL Sunbiz from a datacenter IP), automatically escalate: switch to the RESIDENTIAL proxy and retry with curl_cffi Chrome TLS impersonation, then (if available) a headless browser. Turn off to use plain httpx only.",
                        "default": true
                    },
                    "browser_cdp_url": {
                        "title": "Warm browser CDP endpoint (for browser-required states)",
                        "type": "string",
                        "description": "Optional. CDP/WebSocket endpoint of an already-running, anti-detect (UC-mode / real Chrome) browser. When set, the browser tier connects to it and inherits its session + fingerprint, which lets it pass the Cloudflare/Imperva states (PA, OH, NC, MI, NV, ...). Without it, a plain headless Chromium is launched, which Cloudflare managed challenges block. Can also be set as the BROWSER_CDP_URL env var. The per-search CAPTCHA states (AZ, GA, ...) additionally need an opt-in solver, off by default: set CAPTCHA_SOLVER_PROVIDER (2captcha|capsolver) + CAPTCHA_SOLVER_KEY as Actor env secrets.",
                        "default": ""
                    },
                    "apify_proxy_groups": {
                        "title": "Proxy groups (optional override)",
                        "type": "array",
                        "description": "Override the auto-selected proxy group. Leave empty to let the Actor pick DATACENTER (or RESIDENTIAL when Delaware is active).",
                        "default": [],
                        "items": {
                            "type": "string"
                        }
                    },
                    "concurrency": {
                        "title": "Max concurrent requests",
                        "minimum": 1,
                        "maximum": 8,
                        "type": "integer",
                        "description": "Parallel registry fetches. Gov portals are rate-sensitive; default 4.",
                        "default": 4
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
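
The `inputSchema` above sets hard limits that can be checked before a run is started. The following is a minimal Python sketch of such a check, not part of the Actor itself: `build_input` is a hypothetical helper name, and it only covers the fields and limits that appear in the schema (the `mode` enum, the item caps on `companies`, `officer_names`, and `states`, and the integer ranges on `max_results_per_state` and `concurrency`).

```python
from typing import Any, Dict, List, Optional

# Values copied from the inputSchema above.
MODES = {"entity_search", "entity_profile", "officer_lookup", "ownership_graph"}
DEFAULT_STATES = ["CA", "NY", "FL", "TX", "CO"]


def build_input(
    mode: str = "entity_search",
    companies: Optional[List[str]] = None,
    officer_names: Optional[List[str]] = None,
    states: Optional[List[str]] = None,
    max_results_per_state: int = 25,
    concurrency: int = 4,
) -> Dict[str, Any]:
    """Hypothetical helper: build an input payload and enforce the schema limits."""
    companies = list(companies or [])
    officer_names = list(officer_names or [])
    # An empty states list means "use the default five" per the schema description.
    states = list(states or DEFAULT_STATES)

    if mode not in MODES:
        raise ValueError(f"mode must be one of {sorted(MODES)}")
    if len(companies) > 50:
        raise ValueError("companies accepts at most 50 items")
    if len(officer_names) > 25:
        raise ValueError("officer_names accepts at most 25 items")
    if len(states) > 51:
        raise ValueError("states accepts at most 51 items")
    if not 1 <= max_results_per_state <= 100:
        raise ValueError("max_results_per_state must be between 1 and 100")
    if not 1 <= concurrency <= 8:
        raise ValueError("concurrency must be between 1 and 8")

    return {
        "mode": mode,
        "companies": companies,
        "officer_names": officer_names,
        "states": states,
        "max_results_per_state": max_results_per_state,
        "concurrency": concurrency,
    }


payload = build_input(companies=["Acme Holdings", "Blue Ridge Capital"])
```

Fields left out of the payload (for example `include_officers` or `enable_delaware`) fall back to the defaults listed in the schema.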

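The two run endpoints shown in the definition above (`/runs` and `/run-sync`) both take a POST with the JSON input as the body and the Apify token as a `token` query parameter. The sketch below prepares such a request with the Python standard library without sending it. It assumes the API base URL is `https://api.apify.com/v2`, since the server entry is not part of this section, and `build_run_request` is a hypothetical helper name.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the definition's server URL is the public Apify API v2 base.
API_BASE = "https://api.apify.com/v2"
ACTOR_PATH = "/acts/seibs.co~business-registry-intel"


def build_run_request(token, actor_input, wait_for_output=False):
    """Hypothetical helper: prepare (but do not send) a POST to a run endpoint.

    wait_for_output=False targets /runs, which returns run metadata.
    wait_for_output=True targets /run-sync, which waits for completion.
    """
    endpoint = "/run-sync" if wait_for_output else "/runs"
    query = urllib.parse.urlencode({"token": token})
    url = f"{API_BASE}{ACTOR_PATH}{endpoint}?{query}"
    body = json.dumps(actor_input).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_run_request(
    "<YOUR_APIFY_TOKEN>",
    {"mode": "entity_search", "companies": ["Acme Holdings"]},
)
```

Passing the prepared request to `urllib.request.urlopen` would send it. For `/runs`, the response body follows `runsResponseSchema`, so fields such as `data.id` and `data.defaultDatasetId` identify the run and the dataset that holds its results.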