# Dun & Bradstreet Scraper (`crawlerbros/dnb-scraper`) Actor

Extract company profile data from Dun & Bradstreet (D\&B) business directory with fields like company name, industry, location, description, and more.

- **URL**: https://apify.com/crawlerbros/dnb-scraper.md
- **Developed by:** [Crawler Bros](https://apify.com/crawlerbros) (community)
- **Categories:** Lead generation, Other, Developer tools
- **Stats:** 2 total users, 1 monthly users, 100.0% runs succeeded, 17 bookmarks
- **User rating**: 5.00 out of 5 stars

## Pricing

from $3.00 / 1,000 results

This Actor is paid per event and usage. You are charged both the fixed price for specific events and for Apify platform usage.
This Actor supports Apify Store discounts, so the higher your subscription plan, the lower the price.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event
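
As a rough illustration of the per-result part of the bill, the sketch below multiplies a result count by the listed rate. It assumes the listed "from $3.00 / 1,000 results" rate applies to your plan, and it does not model Apify platform usage, which is charged separately. The function name `estimate_event_cost` is hypothetical.

```python
from decimal import Decimal


def estimate_event_cost(results: int, price_per_1000: Decimal = Decimal("3.00")) -> Decimal:
    """Estimate the per-result event charge in USD (platform usage not included)."""
    if results < 0:
        raise ValueError("results must be non-negative")
    # Scale the per-1,000 rate down to the requested number of results.
    cost = Decimal(results) * price_per_1000 / Decimal(1000)
    return cost.quantize(Decimal("0.0001"))


# Example: 200 results at the listed rate.
print(estimate_event_cost(200))
```

Pass a different `price_per_1000` if your subscription plan gives you a discounted rate.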

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with a capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## D&B + Wikipedia + Wikidata Company Scraper

Extract **rich company intelligence** on any business — combining the **Dun & Bradstreet profile URL** with **Wikipedia descriptions** and **Wikidata structured facts** (CEO, founders, revenue, employees, industries, HQ, stock info, and more).

### Features

- **29 output fields** per company — the most complete free company data available
- **D&B profile URL** + name + location (via Google SERP)
- **Wikipedia summary** — canonical description, full paragraph, thumbnail
- **Wikidata structured facts** — machine-readable properties from the world's largest open knowledge base:
  - Founded year, official legal name, website
  - **Current CEO** (filtered via Wikidata rank + end-time)
  - Founders, headquarters, industries
  - **Latest employee count** with reporting year
  - **Latest revenue** with currency and year
  - Stock ticker + exchanges
  - ISIN, company type, country, logo
- **100% reliable** — free data sources, no Cloudflare/geo blocks
- **No nulls** — every field has a typed default
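
To illustrate what "typed defaults" can look like for a consumer, here is a minimal sketch that fills missing values by field type. The concrete defaults (empty string, empty list, `0`) are assumptions, since the Actor's actual default values are not documented here. `FIELD_DEFAULTS` and `with_typed_defaults` are hypothetical names, and only a few of the 29 fields are shown.

```python
from typing import Any

# Assumed defaults by type; the Actor's real defaults may differ.
FIELD_DEFAULTS: dict[str, Any] = {
    "name": "",
    "ceo": "",
    "founders": [],
    "industries": [],
    "employees": 0,
    "revenue": "",
}


def with_typed_defaults(raw: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of `raw` where missing or None fields get a typed default."""
    item: dict[str, Any] = {}
    for field, default in FIELD_DEFAULTS.items():
        value = raw.get(field)
        if value is None:
            # Copy list defaults so items never share one mutable list.
            value = list(default) if isinstance(default, list) else default
        item[field] = value
    return item


print(with_typed_defaults({"name": "Acme", "employees": None}))
```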

### How It Works

D&B's own pages are geo-restricted and heavily Cloudflare-protected, so direct scraping is unreliable. This scraper fixes that by combining **three complementary sources**:

1. **Google SERP** — finds the D&B profile URL + headquarters location
2. **Wikipedia REST API** — fetches the canonical summary, description, and thumbnail
3. **Wikidata API** — pulls 20+ structured facts about the company, including current CEO, latest revenue, employees, founders, stock info, and more

All three sources are free, reliable, and not blocked — delivering **richer data than D&B itself shows publicly**.
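
The second and third steps can be sketched as plain URL construction. This is an illustration only: it assumes the public Wikipedia REST summary endpoint and the Wikidata `Special:EntityData` endpoint, and the README does not state which exact endpoints the Actor calls. The Google SERP step is not shown, and the helper names are hypothetical.

```python
from urllib.parse import quote

# Public endpoints assumed for illustration; the Actor may use others.
WIKIPEDIA_SUMMARY = "https://en.wikipedia.org/api/rest_v1/page/summary/{title}"
WIKIDATA_ENTITY = "https://www.wikidata.org/wiki/Special:EntityData/{qid}.json"


def wikipedia_summary_url(title: str) -> str:
    """Build the summary URL for an article title such as 'Apple Inc.'."""
    # Wikipedia titles use underscores instead of spaces.
    return WIKIPEDIA_SUMMARY.format(title=quote(title.replace(" ", "_"), safe=""))


def wikidata_entity_url(qid: str) -> str:
    """Build the entity JSON URL for a Wikidata Q-ID such as 'Q312'."""
    if not (qid.startswith("Q") and qid[1:].isdigit()):
        raise ValueError(f"not a Wikidata Q-ID: {qid!r}")
    return WIKIDATA_ENTITY.format(qid=qid)


print(wikipedia_summary_url("Apple Inc."))
print(wikidata_entity_url("Q312"))
```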

### Input

| Field | Type | Description |
|-------|------|-------------|
| `companyNames` | Array | Company names to look up (required) |
| `maxItems` | Integer | Maximum number of companies to scrape (1 to 200, default 20) |

#### Example Input

```json
{
    "companyNames": ["Apple Inc", "Tesla Inc", "Microsoft Corporation"],
    "maxItems": 10
}
````
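
Before starting a run, it can help to check the input locally. The sketch below mirrors the constraints in the OpenAPI specification later in this document (`companyNames` required, `maxItems` between 1 and 200 with a default of 20). `normalize_input` is a hypothetical helper, not part of the Actor or the Apify client.

```python
def normalize_input(raw: dict) -> dict:
    """Validate an input object and apply the documented maxItems default."""
    names = raw.get("companyNames")
    if not isinstance(names, list) or not all(isinstance(n, str) for n in names):
        raise ValueError("companyNames must be an array of strings")

    max_items = raw.get("maxItems", 20)
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(max_items, bool) or not isinstance(max_items, int):
        raise ValueError("maxItems must be an integer")
    if not 1 <= max_items <= 200:
        raise ValueError("maxItems must be between 1 and 200")

    return {"companyNames": names, "maxItems": max_items}


print(normalize_input({"companyNames": ["Apple Inc", "Tesla Inc"]}))
```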

### Output

Each company is saved as a dataset item with **29 fields**:

#### Identity & Sources

| Field | Type | Description |
|-------|------|-------------|
| `name` | String | Company name |
| `query` | String | Original query |
| `dnbUrl` | String | Dun & Bradstreet profile URL |
| `dnbLocation` | String | HQ location from D\&B |
| `wikipediaUrl` | String | Wikipedia article URL |
| `wikipediaTitle` | String | Wikipedia article title |
| `wikidataId` | String | Wikidata Q-ID |

#### Descriptions

| Field | Type | Description |
|-------|------|-------------|
| `description` | String | Short description |
| `summary` | String | Full Wikipedia summary paragraph |
| `thumbnail` | String | Logo/thumbnail image URL |
| `logo` | String | Company logo from Wikidata |

#### Key Facts

| Field | Type | Description |
|-------|------|-------------|
| `officialName` | String | Official legal name |
| `founded` | String | Year founded |
| `website` | String | Canonical website URL |
| `country` | String | Company country |
| `headquarters` | Array | HQ locations |
| `industries` | Array | Industries the company operates in |
| `companyType` | String | Type (public company, corporation, etc.) |

#### People

| Field | Type | Description |
|-------|------|-------------|
| `ceo` | String | Current CEO (filtered by end-time/rank) |
| `founders` | Array | Company founders |

#### Financials

| Field | Type | Description |
|-------|------|-------------|
| `employees` | Integer | Latest employee count |
| `employeesYear` | String | Year of employee count |
| `revenue` | String | Latest revenue |
| `revenueCurrency` | String | Revenue currency |
| `revenueYear` | String | Year of revenue |

#### Market Info

| Field | Type | Description |
|-------|------|-------------|
| `stockTicker` | String | Stock ticker symbol (e.g., `AAPL`) |
| `stockExchanges` | Array | Exchanges the company is listed on |
| `isin` | String | ISIN number |

#### Metadata

| Field | Type | Description |
|-------|------|-------------|
| `scrapedAt` | String | ISO 8601 scrape timestamp |

#### Example Output

```json
{
    "name": "Apple Inc.",
    "query": "Apple Inc",
    "dnbUrl": "https://www.dnb.com/business-directory/company-profiles.apple_inc.ec7f550b3a97b94d919d837672573959.html",
    "dnbLocation": "Cupertino, California",
    "wikipediaUrl": "https://en.wikipedia.org/wiki/Apple_Inc.",
    "wikipediaTitle": "Apple Inc.",
    "wikidataId": "Q312",
    "description": "American multinational technology company",
    "summary": "Apple Inc. is an American multinational technology company headquartered in Cupertino, California...",
    "officialName": "Apple Inc.",
    "founded": "1976",
    "website": "https://apple.com",
    "industries": ["software industry", "consumer electronics industry", "digital distribution"],
    "ceo": "Tim Cook",
    "headquarters": ["Apple Park", "Cupertino"],
    "employees": 164000,
    "employeesYear": "2022",
    "revenue": "416161000000",
    "revenueCurrency": "United States dollar",
    "revenueYear": "2025",
    "founders": ["Steve Wozniak", "Ronald Wayne", "Steve Jobs"],
    "stockTicker": "AAPL",
    "stockExchanges": ["Nasdaq", "Tokyo Stock Exchange"],
    "isin": "US0378331005",
    "companyType": "enterprise, business, public company, corporation, technology company",
    "country": "United States",
    "scrapedAt": "2026-04-10T12:00:00+00:00"
}
```
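
Some fields arrive as strings even when they carry numbers or lists: in the example above, `revenue` is a digit string and `companyType` is a comma-separated string. A minimal post-processing sketch follows, assuming those formats hold for other items too. `parse_financials` is a hypothetical helper.

```python
def parse_financials(item: dict) -> dict:
    """Convert string-typed output fields into numbers and lists."""
    revenue_text = item.get("revenue", "")
    company_type = item.get("companyType", "")
    return {
        # Only convert plain digit strings; anything else is treated as unknown.
        "revenue": int(revenue_text) if revenue_text.isdigit() else None,
        "currency": item.get("revenueCurrency", ""),
        "company_types": [t.strip() for t in company_type.split(",") if t.strip()],
    }


example = {
    "revenue": "416161000000",
    "revenueCurrency": "United States dollar",
    "companyType": "enterprise, business, public company, corporation, technology company",
}
print(parse_financials(example))
```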

### FAQ

**Q: Why is the data from Wikipedia/Wikidata instead of D\&B directly?**
D\&B's business directory is heavily Cloudflare-protected and geo-restricted. Meanwhile, Wikipedia/Wikidata contain the same (and often richer) company data — curated, structured, and freely accessible. This scraper combines both to give you the D\&B profile URL alongside rich structured data.

**Q: How fresh is the data?**
Wikidata is updated continuously — revenue, employee counts, and CEO changes typically reflect within days of announcements. Each value includes its reporting year so you can judge freshness.
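
One way to judge freshness from an output item is to compare each reporting year with the scrape timestamp. The sketch below assumes `scrapedAt` is an ISO 8601 string and the year fields are four-digit strings, as in the example output. `data_age_years` is a hypothetical helper.

```python
from datetime import datetime


def data_age_years(item: dict) -> dict:
    """Return how many years old each reported figure was at scrape time."""
    scraped_year = datetime.fromisoformat(item["scrapedAt"]).year
    ages = {}
    for field in ("revenueYear", "employeesYear"):
        year = item.get(field, "")
        # An empty or non-numeric year means the figure is missing.
        ages[field] = scraped_year - int(year) if year.isdigit() else None
    return ages


item = {
    "scrapedAt": "2026-04-10T12:00:00+00:00",
    "revenueYear": "2025",
    "employeesYear": "2022",
}
print(data_age_years(item))
```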

**Q: How does the "current CEO" filter work?**
Wikidata stores all past CEOs with start and end dates. This scraper selects the entry with `rank=preferred` (explicitly marked current), or falls back to the entry with no `end time` qualifier.
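
The selection rule described above can be sketched as follows. It assumes statements in the shape of Wikidata's entity JSON (a `rank` string and an optional `qualifiers` mapping keyed by property ID, where `P582` is "end time"). The sample statements and their IDs are invented placeholders, and `pick_current_ceo` is a hypothetical name, not the Actor's code.

```python
from typing import Optional

END_TIME = "P582"  # Wikidata property ID for the "end time" qualifier


def pick_current_ceo(statements: list) -> Optional[dict]:
    """Prefer a statement ranked 'preferred', else one without an end time."""
    for statement in statements:
        if statement.get("rank") == "preferred":
            return statement
    for statement in statements:
        if END_TIME not in statement.get("qualifiers", {}):
            return statement
    return None


# Invented sample data; real statements carry more fields.
statements = [
    {"id": "former-ceo", "rank": "normal", "qualifiers": {"P582": [{"value": "2011"}]}},
    {"id": "current-ceo", "rank": "normal", "qualifiers": {}},
]
print(pick_current_ceo(statements)["id"])
```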

**Q: Does this work for private/small companies?**
Large public companies have comprehensive Wikidata entries. Smaller companies may only have the D\&B URL + location, with empty Wikipedia/Wikidata fields.
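
When processing results, it may be useful to separate enriched items from those that only carry D&B data. The sketch below assumes an item without a Wikidata match has an empty `wikidataId`. That follows from the "typed default" behavior described above but is not confirmed per field. `split_by_enrichment` is a hypothetical helper.

```python
def split_by_enrichment(items: list) -> tuple:
    """Split items into (enriched, dnb_only) based on wikidataId."""
    enriched = [item for item in items if item.get("wikidataId")]
    dnb_only = [item for item in items if not item.get("wikidataId")]
    return enriched, dnb_only


items = [
    {"name": "Apple Inc.", "wikidataId": "Q312"},
    {"name": "Small Local Co", "wikidataId": ""},
]
enriched, dnb_only = split_by_enrichment(items)
print(len(enriched), len(dnb_only))
```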

### Use Cases

- **Lead generation & sales intelligence** — qualify leads with revenue, employees, and CEO data
- **Competitive intelligence** — track CEO changes, financial performance
- **Due diligence** — verify company age, HQ, stock listing
- **M\&A research** — identify subsidiaries, parent companies, founders
- **Investment research** — stock tickers, ISIN, exchanges, market info

# Actor input Schema

## `companyNames` (type: `array`):

List of company names to look up on Dun & Bradstreet.

## `maxItems` (type: `integer`):

Maximum number of companies to scrape.

## Actor input object example

```json
{
  "companyNames": [
    "Apple Inc"
  ],
  "maxItems": 1
}
```

# Actor output Schema

## `companies` (type: `string`):

No description

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "companyNames": [
        "Apple Inc"
    ],
    "maxItems": 1
};

// Run the Actor and wait for it to finish
const run = await client.actor("crawlerbros/dnb-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "companyNames": ["Apple Inc"],
    "maxItems": 1,
}

# Run the Actor and wait for it to finish
run = client.actor("crawlerbros/dnb-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "companyNames": [
    "Apple Inc"
  ],
  "maxItems": 1
}' |
apify call crawlerbros/dnb-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=crawlerbros/dnb-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Dun & Bradstreet Scraper",
        "description": "Extract company profile data from Dun & Bradstreet (D&B) business directory with fields like company name, industry, location, description, and more.",
        "version": "1.0",
        "x-build-id": "H6dZYche8aoJbaM0g"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/crawlerbros~dnb-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-crawlerbros-dnb-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/crawlerbros~dnb-scraper/runs": {
            "post": {
                "operationId": "runs-sync-crawlerbros-dnb-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/crawlerbros~dnb-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-crawlerbros-dnb-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "required": [
                    "companyNames"
                ],
                "properties": {
                    "companyNames": {
                        "title": "Company Names",
                        "type": "array",
                        "description": "List of company names to look up on Dun & Bradstreet.",
                        "items": {
                            "type": "string"
                        }
                    },
                    "maxItems": {
                        "title": "Max Items",
                        "minimum": 1,
                        "maximum": 200,
                        "type": "integer",
                        "description": "Maximum number of companies to scrape.",
                        "default": 20
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
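
For projects without a client library, the `run-sync-get-dataset-items` path from the specification above can be called directly. The following sketch only builds the request with the Python standard library and does not send it. `build_sync_request` is a hypothetical helper, and error handling, timeouts, and response parsing are left out.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

API_BASE = "https://api.apify.com/v2"
ACTOR_PATH = "/acts/crawlerbros~dnb-scraper/run-sync-get-dataset-items"


def build_sync_request(token: str, company_names: list, max_items: int = 20) -> Request:
    """Build a POST request that runs the Actor and returns its dataset items."""
    # The spec declares `token` as a required query parameter.
    url = f"{API_BASE}{ACTOR_PATH}?{urlencode({'token': token})}"
    body = json.dumps({"companyNames": company_names, "maxItems": max_items})
    return Request(
        url,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_sync_request("<YOUR_API_TOKEN>", ["Apple Inc"], max_items=1)
print(request.get_method(), request.full_url)
```

To execute the run, pass the request to `urllib.request.urlopen` and decode the response body. A synchronous run can take a while, so set a generous timeout.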
