# ECHA Europe Chemicals Scraper (`parseforge/echa-europe-chemicals-scraper`) Actor

Search the European Chemicals Agency registry by substance, CAS, or EC number and return substance\_name, cas\_number, ec\_number, classification, hazard\_statements, registrants, and tonnage. Useful for REACH compliance, EHS workflows, and chemical safety research across EU industries.

- **URL**: https://apify.com/parseforge/echa-europe-chemicals-scraper.md
- **Developed by:** [ParseForge](https://apify.com/parseforge) (community)
- **Categories:** Automation, Integrations
- **Stats:** 2 total users, 1 monthly users, 100.0% runs succeeded, NaN bookmarks
- **User rating**: No ratings yet

## Pricing

from $7.50 / 1,000 results

This Actor is paid per event. You are not charged for Apify platform usage; you pay only a fixed price for specific events.
Because this Actor supports Apify Store discounts, the price decreases on higher subscription plans.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event
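
As a rough budgeting aid, the per-result price can be turned into a run estimate. The sketch below is illustrative only: it uses the listed $7.50 per 1,000 results as a placeholder rate, assumes that only result events are charged, and `estimate_cost` is a hypothetical helper rather than part of any Apify library. Substitute the rate shown for your own plan.

```python
from decimal import Decimal

# Listed "from" price; the rate on your plan may differ (assumption).
PRICE_PER_1000_RESULTS = Decimal("7.50")


def estimate_cost(result_count: int,
                  price_per_1000: Decimal = PRICE_PER_1000_RESULTS) -> Decimal:
    """Return an estimated charge in USD for a given number of results."""
    if result_count < 0:
        raise ValueError("result_count must be non-negative")
    cost = Decimal(result_count) * price_per_1000 / Decimal(1000)
    return cost.quantize(Decimal("0.0001"))


print(estimate_cost(10))    # 0.0750
print(estimate_cost(2500))  # 18.7500
```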

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows (PowerShell)
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

![ParseForge Banner](https://github.com/ParseForge/apify-assets/blob/ad35ccc13ddd068b9d6cba33f323962e39aed5b2/banner.jpg?raw=true)

## 🧪 ECHA Europe Chemicals Registry Scraper

> 🚀 **Export the European Chemicals Agency public registry in seconds. Substance names, CAS numbers, EC numbers, hazard classifications, and registrant counts.**

> 🕒 **Last updated:** 2026-05-29 · **📊 9 fields** per record · 100,000+ substances · CLP and REACH coverage

The ECHA Europe Chemicals Registry Scraper turns the [echa.europa.eu](https://echa.europa.eu/information-on-chemicals) public substance search into a structured dataset. It queries ECHA's substance API and returns one row per matching substance with identifiers, classification, and registration metadata.

Coverage spans the substances published in ECHA's Information on Chemicals portal, including REACH-registered and CLP-classified substances.

| 🎯 Target Audience | 💡 Primary Use Cases |
|---|---|
| 🧪 Chemical regulatory teams | Look up CLP hazard classifications by CAS |
| 🏭 EHS managers | Audit substance inventories against REACH |
| 📰 Journalists | Verify hazard statements for stories |
| 🤖 Data engineers | Mirror ECHA into a compliance warehouse |
| 🎓 Researchers | Build substance cohorts for studies |
| 💼 Supply chain leads | Screen incoming materials against ECHA listings |

### 📋 What the ECHA Europe Chemicals Registry Scraper does

- Queries the public ECHA substance search API with a free-text term.
- Flattens each substance result into a normalized record.
- Joins hazard statements into a single string for easy spreadsheet use.
- Surfaces upstream errors as a single diagnostic row.

> 💡 **Why it matters:** ECHA's public viewer is paginated and requires multiple clicks per substance. This Actor returns the underlying API data directly.

### 🎬 Full Demo

_🚧 Coming soon._

### ⚙️ Input

<table>
<tr><th>Field</th><th>Type</th><th>Required</th><th>Description</th></tr>
<tr><td><code>search</code></td><td>string</td><td>No</td><td>Substance name, CAS number, or EC number. Prefill: benzene.</td></tr>
<tr><td><code>maxItems</code></td><td>integer</td><td>No</td><td>Free 10, paid up to 1,000,000.</td></tr>
</table>

**Example 1 - Benzene lookup:**
```json
{ "search": "benzene", "maxItems": 5 }
````

**Example 2 - Specific CAS number:**

```json
{ "search": "71-43-2", "maxItems": 1 }
```
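
Before spending a run on a CAS lookup, it can help to catch typos locally. CAS Registry Numbers carry a check digit: the final digit equals the weighted sum of the preceding digits (weights 1, 2, 3, ... counted from the right) modulo 10. The sketch below applies that rule; `is_valid_cas` is a hypothetical helper, and it only checks the number's format and checksum, not whether ECHA lists the substance.

```python
import re

# Two to seven digits, two digits, one check digit, e.g. "71-43-2".
CAS_PATTERN = re.compile(r"^(\d{2,7})-(\d{2})-(\d)$")


def is_valid_cas(cas: str) -> bool:
    """Return True if the string is a well-formed CAS number with a valid check digit."""
    match = CAS_PATTERN.match(cas.strip())
    if not match:
        return False
    body = match.group(1) + match.group(2)
    check_digit = int(match.group(3))
    # Weight the digits 1, 2, 3, ... starting from the rightmost body digit.
    total = sum(weight * int(digit)
                for weight, digit in enumerate(reversed(body), start=1))
    return total % 10 == check_digit


print(is_valid_cas("71-43-2"))  # True  (benzene: 3*1 + 4*2 + 1*3 + 7*4 = 42)
print(is_valid_cas("71-43-3"))  # False (wrong check digit)
```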

> ⚠️ **Good to Know:** ECHA's API throttles aggressive callers. This Actor rotates residential proxies and retries to keep runs reliable.

### 📊 Output

| Field | Type | Description |
|---|---|---|
| 🧪 `substance_name` | string | Official substance name. |
| 🔬 `cas_number` | string | CAS registry number. |
| 🧬 `ec_number` | string | EC (EINECS) number. |
| ⚠️ `classification` | string | Harmonised CLP classification. |
| ☣️ `hazard_statements` | string | H-statements joined by semicolon. |
| 🏭 `registrants` | number | Number of REACH registrants. |
| ⚖️ `tonnage` | string | Tonnage band reported under REACH. |
| 🕒 `scrapedAt` | string | When this row was fetched. |
| ❌ `error` | string | Set if upstream response was an error. |

**Sample record:**

```json
{
  "substance_name": "Benzene",
  "cas_number": "71-43-2",
  "ec_number": "200-753-7",
  "classification": "Carc. 1A; Muta. 1B; Asp. Tox. 1",
  "hazard_statements": "H350; H340; H304; H225",
  "registrants": 312,
  "tonnage": "1 000 000 - 10 000 000 tonnes per annum",
  "scrapedAt": "2026-05-29T13:00:00.000Z",
  "error": null
}
```
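
Because `classification` and `hazard_statements` arrive as semicolon-joined strings, downstream code usually needs to split them again. The sketch below shows one way to turn a row like the sample record into a typed object. The `Substance` class and both helper functions are hypothetical names, and treating any non-empty `error` value as a diagnostic row is an assumption based on the field description above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Substance:
    name: str
    cas_number: str
    ec_number: str
    classification: List[str]
    hazard_statements: List[str]
    registrants: Optional[int]
    tonnage: Optional[str]


def split_joined(value: Optional[str]) -> List[str]:
    """Split a semicolon-joined field into trimmed, non-empty parts."""
    if not value:
        return []
    return [part.strip() for part in value.split(";") if part.strip()]


def parse_record(record: dict) -> Substance:
    """Convert one dataset row into a Substance, rejecting diagnostic rows."""
    if record.get("error"):  # assumption: diagnostic rows carry a non-empty error
        raise ValueError(f"diagnostic row: {record['error']}")
    return Substance(
        name=record.get("substance_name") or "",
        cas_number=record.get("cas_number") or "",
        ec_number=record.get("ec_number") or "",
        classification=split_joined(record.get("classification")),
        hazard_statements=split_joined(record.get("hazard_statements")),
        registrants=record.get("registrants"),
        tonnage=record.get("tonnage"),
    )


sample = {
    "substance_name": "Benzene",
    "cas_number": "71-43-2",
    "ec_number": "200-753-7",
    "classification": "Carc. 1A; Muta. 1B; Asp. Tox. 1",
    "hazard_statements": "H350; H340; H304; H225",
    "registrants": 312,
    "tonnage": "1 000 000 - 10 000 000 tonnes per annum",
    "scrapedAt": "2026-05-29T13:00:00.000Z",
    "error": None,
}
print(parse_record(sample).hazard_statements)  # ['H350', 'H340', 'H304', 'H225']
```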

### ✨ Why choose this Actor

| | |
|---|---|
| 🆓 | Works with no API key. |
| 🇪🇺 | Covers ECHA's full public substance index. |
| 🧹 | Hazard statements joined into a single spreadsheet-friendly column. |
| 🛟 | Surfaces upstream errors as a clean diagnostic row. |
| 💾 | Push to dataset and export CSV, Excel, JSON, or XML. |

### 📈 How it compares to alternatives

| Approach | Setup time | Clean rows | Maintained |
|---|---|---|---|
| Manual ECHA portal lookup | minutes per substance | ❌ | manual |
| ECHA bulk dumps | hours to parse | partial | quarterly |
| **This Actor** | 5 sec, no install | ✅ | live |

### 🚀 How to use

1. Click **Try for free**.
2. Type a substance name, CAS, or EC number.
3. Click **Start**.

### 💼 Business use cases

**🧪 Regulatory screening.** Run incoming SKUs against ECHA's hazard classifications.

**🏭 EHS compliance.** Audit warehouse inventories against REACH registration tonnage bands.

**📰 Newsroom.** Verify hazard statements quoted in industrial accident reporting.

**🤖 Compliance pipelines.** Pipe results into an internal MDM or SAP system.
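
As an illustration of the screening use case, the sketch below checks an inventory of CAS numbers against rows already fetched from this Actor and reports which watched H-statements each substance carries. The function name, the watch list, and the inventory are all hypothetical; a real screening policy would define its own codes and handle substances that returned no match.

```python
from typing import Dict, Iterable, List, Optional, Set

# Illustrative watch list: H340 (may cause genetic defects), H350 (may cause cancer).
WATCH_CODES: Set[str] = {"H340", "H350"}


def flag_inventory(inventory_cas: Iterable[str],
                   records: List[dict],
                   watch_codes: Set[str] = WATCH_CODES) -> Dict[str, Optional[List[str]]]:
    """Map each CAS number to its watched H-codes, or None if no row was found."""
    # Index usable rows by CAS number, skipping diagnostic rows.
    by_cas = {r["cas_number"]: r for r in records
              if not r.get("error") and r.get("cas_number")}
    findings: Dict[str, Optional[List[str]]] = {}
    for cas in inventory_cas:
        row = by_cas.get(cas)
        if row is None:
            findings[cas] = None  # not present in the scraped rows
            continue
        joined = row.get("hazard_statements") or ""
        codes = {code.strip() for code in joined.split(";") if code.strip()}
        findings[cas] = sorted(codes & watch_codes)
    return findings


rows = [{"cas_number": "71-43-2",
         "hazard_statements": "H350; H340; H304; H225",
         "error": None}]
print(flag_inventory(["71-43-2", "64-17-5"], rows))
# {'71-43-2': ['H340', 'H350'], '64-17-5': None}
```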

### 🔌 Automating ECHA Europe Chemicals Registry Scraper

- **Make / Zapier**: trigger and push to Airtable or Google Sheets.
- **Cron schedule**: weekly hazards refresh.
- **Webhooks**: POST to your endpoint after each run.
- **Pipe to BigQuery / Snowflake / Postgres**: native integrations.

### 🌟 Beyond business use cases

**🎓 Education.** Teach chemical safety with real CLP data.

**🧪 Personal research.** Look up household chemical hazards before buying.

**🤝 Non-profit.** Power consumer advocacy and right-to-know apps.

**🧰 Prototyping.** Mock up a compliance dashboard quickly.

### 🤖 Ask an AI assistant about this scraper

Paste this README into ChatGPT or Claude.

### ❓ Frequently Asked Questions

**❓ Do I need an API key?** No.

**❓ How fresh is the data?** Each run queries ECHA live, so results reflect the portal's current data.

**❓ Can I search by CAS?** Yes, paste a CAS number into the search field.

**❓ Are hazard statements included?** Yes, joined into one string.

**❓ Can I schedule runs?** Yes via the Apify scheduler.

**❓ Is this scraping or an API?** It queries the public ECHA portal API.

**❓ What format can I download?** CSV, Excel, JSON, JSONL, XML, RSS, or HTML.

**❓ What if nothing matches?** A diagnostic record with `error` is pushed.

**❓ Will the schema change?** Stable.

**❓ Does it follow robots and ToS?** Yes, only public data is fetched.

### 🔌 Integrate with any app

Apify ships native integrations with Make, Zapier, Slack, Discord, Google Drive, Google Sheets, Gmail, Airbyte, Keboola, Telegram, GitHub, and any REST API or webhook.

### 🔗 Recommended Actors

| Actor | What it does |
|---|---|
| [ParseForge CNES Brazil Health Scraper](https://apify.com/parseforge/cnes-brazil-health-establishments-scraper) | Brazil health establishments. |
| [ParseForge ONPE Peru Elections Scraper](https://apify.com/parseforge/onpe-peru-elections-scraper) | Peru election results. |
| [ParseForge collection](https://apify.com/parseforge) | 900+ production scrapers. |

> 💡 **Pro Tip:** browse the complete [ParseForge collection](https://apify.com/parseforge) for 900+ production-grade scrapers across business intelligence, real estate, e-commerce, sports, finance, and public records.

***

**Disclaimer:** This Actor scrapes only publicly available data. ParseForge is not affiliated with, endorsed by, or sponsored by any of the third-party services referenced. Users are responsible for complying with the target site's terms of service and applicable law. [Create a free account w/ $5 credit](https://console.apify.com/sign-up?fpr=vmoqkp).

# Actor input Schema

## `search` (type: `string`):

Substance name, CAS number, or EC number.

## `maxItems` (type: `integer`):

Free users: Limited to 10 items (preview). Paid users: Optional, max 1,000,000

## Actor input object example

```json
{
  "search": "benzene",
  "maxItems": 10
}
```
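
Both fields are optional, and `maxItems` is bounded between 1 and 1,000,000 in the input schema. A small client-side check can reject bad values before a run is started. The sketch below is one possible approach: `build_input` is a hypothetical helper, and rejecting a blank `search` string is an extra assumption that the schema itself does not state.

```python
from typing import Optional

MAX_ITEMS_LIMIT = 1_000_000  # "maximum" for maxItems in the input schema


def build_input(search: Optional[str] = None,
                max_items: Optional[int] = None) -> dict:
    """Build an Actor input object, validating maxItems against the schema bounds."""
    actor_input: dict = {}
    if search is not None:
        term = search.strip()
        if not term:
            # Assumption: a blank search term is a caller mistake.
            raise ValueError("search must not be blank when provided")
        actor_input["search"] = term
    if max_items is not None:
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(max_items, bool) or not isinstance(max_items, int):
            raise TypeError("maxItems must be an integer")
        if not 1 <= max_items <= MAX_ITEMS_LIMIT:
            raise ValueError(f"maxItems must be between 1 and {MAX_ITEMS_LIMIT}")
        actor_input["maxItems"] = max_items
    return actor_input


print(build_input("benzene", 10))  # {'search': 'benzene', 'maxItems': 10}
```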

# Actor output Schema

## `results` (type: `string`):

No description

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "search": "benzene",
    "maxItems": 10
};

// Run the Actor and wait for it to finish
const run = await client.actor("parseforge/echa-europe-chemicals-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "search": "benzene",
    "maxItems": 10,
}

# Run the Actor and wait for it to finish
run = client.actor("parseforge/echa-europe-chemicals-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "search": "benzene",
  "maxItems": 10
}' |
apify call parseforge/echa-europe-chemicals-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=parseforge/echa-europe-chemicals-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "ECHA Europe Chemicals Scraper",
        "description": "Search the European Chemicals Agency registry by substance, CAS, or EC number and return substance_name, cas_number, ec_number, classification, hazard_statements, registrants, and tonnage. Useful for REACH compliance, EHS workflows, and chemical safety research across EU industries.",
        "version": "0.1",
        "x-build-id": "8yIo9DazSefQIcvAe"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/parseforge~echa-europe-chemicals-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-parseforge-echa-europe-chemicals-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/parseforge~echa-europe-chemicals-scraper/runs": {
            "post": {
                "operationId": "runs-sync-parseforge-echa-europe-chemicals-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/parseforge~echa-europe-chemicals-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-parseforge-echa-europe-chemicals-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "properties": {
                    "search": {
                        "title": "Search query",
                        "type": "string",
                        "description": "Substance name, CAS number, or EC number."
                    },
                    "maxItems": {
                        "title": "Max Items",
                        "minimum": 1,
                        "maximum": 1000000,
                        "type": "integer",
                        "description": "Free users: Limited to 10 items (preview). Paid users: Optional, max 1,000,000"
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
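
For projects that cannot use a client library, the `run-sync-get-dataset-items` path in the specification above can be called directly. The sketch below uses only the Python standard library. The function names and the 300-second timeout are hypothetical choices, and since the specification does not describe the response body, parsing it as JSON is an assumption.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.apify.com/v2"
ACTOR_PATH = "parseforge~echa-europe-chemicals-scraper"


def build_sync_request(token: str, actor_input: dict) -> urllib.request.Request:
    """Build the POST request for the run-sync-get-dataset-items endpoint."""
    query = urllib.parse.urlencode({"token": token})  # token is a query parameter in the spec
    url = f"{API_BASE}/acts/{ACTOR_PATH}/run-sync-get-dataset-items?{query}"
    body = json.dumps(actor_input).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_sync(token: str, actor_input: dict, timeout: float = 300.0):
    """Run the Actor synchronously and return the decoded response body."""
    request = build_sync_request(token, actor_input)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        # Assumption: the endpoint returns the dataset items as JSON.
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    items = run_sync("<YOUR_API_TOKEN>", {"search": "benzene", "maxItems": 10})
    print(items)
```

Keeping request construction separate from the network call makes the URL and payload easy to inspect or unit test without starting a run.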
