ECHA Scraper: EU Chemical Substance Data & Hazard Info
Pricing
Pay per event
Scrape chemical substance records from the European Chemicals Agency. Get CAS numbers, EC numbers, molecular formulas, hazard classifications, and REACH data.
Developer
Studio Amba
ECHA Scraper: Extract Chemical Substance Data from the EU Chemicals Database
Automatically extract chemical substance information, CAS numbers, EC numbers, hazard classifications, and REACH registration data from the European Chemicals Agency (ECHA) database at echa.europa.eu.
What is ECHA Scraper?
The ECHA database holds detailed regulatory information on over 23,000 registered chemical substances in the European market. Manually searching through it is slow, frustrating, and impractical if you need data on more than a handful of substances.
ECHA Scraper gives you structured, machine-readable data from ECHA in minutes instead of days. Here is how teams use it:
- Chemical compliance officers monitor newly registered substances and track hazard classification changes across their product portfolio
- REACH consultants pull registration data for hundreds of substances at once to prepare dossier assessments
- Supply chain managers verify that raw materials and intermediates have valid REACH registrations before signing procurement contracts
- Environmental researchers build datasets of hazardous substances filtered by molecular formula, CAS number, or classification for academic studies
- Product safety teams cross-reference CAS numbers against their ingredient lists to flag substances requiring Safety Data Sheet updates
What data does ECHA Scraper extract?
Every scraped substance includes the following fields:
- `name`: Official IUPAC or common substance name (e.g., "Benzene", "Bisphenol A")
- `casNumber`: CAS Registry Number for unambiguous identification (e.g., "71-43-2")
- `ecNumber`: European Community number from the EINECS/ELINCS inventory (e.g., "200-753-7")
- `molecularFormula`: Chemical formula (e.g., "C6H6")
- `registrationType`: REACH registration status (e.g., "Full registration", "Intermediate registration")
- `hazardClassification`: CLP hazard classification information
- `substanceType`: Substance category
- `description`: ECHA's summary description of the substance
- `url`: Direct link to the substance infocard on ECHA
- `scrapedAt`: ISO timestamp of when the data was extracted
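The field list above can be mirrored in a small typed structure so downstream code does not rely on bare dictionary keys. The following is a minimal sketch: the names `SubstanceRecord`, `FIELDS`, and `normalize_record` are hypothetical, and treating every field as an optional string is an assumption based on the example values, not a documented schema.

```python
from typing import Optional, TypedDict

# Field names as listed above; the optional-string typing is an assumption.
FIELDS = (
    "name", "casNumber", "ecNumber", "molecularFormula", "registrationType",
    "hazardClassification", "substanceType", "description", "url", "scrapedAt",
)


class SubstanceRecord(TypedDict):
    """One scraped substance (hypothetical type, not part of the scraper)."""
    name: Optional[str]
    casNumber: Optional[str]
    ecNumber: Optional[str]
    molecularFormula: Optional[str]
    registrationType: Optional[str]
    hazardClassification: Optional[str]
    substanceType: Optional[str]
    description: Optional[str]
    url: Optional[str]
    scrapedAt: Optional[str]


def normalize_record(raw: dict) -> SubstanceRecord:
    """Keep only the known fields, trimming strings and mapping blanks to None."""
    record = {}
    for field in FIELDS:
        value = raw.get(field)
        if isinstance(value, str):
            value = value.strip() or None
        record[field] = value
    return record  # a plain dict at runtime, shaped like SubstanceRecord
```

Normalizing blanks to `None` up front means later checks only need to test for one "missing" value.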
How to scrape ECHA
Configure your scrape with these input parameters:
| Parameter | Type | Description |
|---|---|---|
| Start URLs | Array | Specific ECHA pages to scrape. Paste substance infocard URLs or listing pages directly. |
| Search Query | String | Search by substance name or CAS number. Examples: benzene, 71-43-2, bisphenol. |
| Max Results | Integer | Maximum substances to return (1-10,000). Default: 100. |
| Proxy Configuration | Object | Proxy settings for improved reliability on large scrapes. |
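When starting runs from code, it can help to validate the input before sending it. The sketch below builds an input dictionary from the parameters in the table. The keys `searchQuery` and `maxResults` appear in this README's API examples; `startUrls` (as a list of `{"url": ...}` objects) and `proxyConfiguration` (with `useApifyProxy`) follow common Apify conventions and are assumptions here, so check the Actor's input schema before relying on them. `build_run_input` is a hypothetical helper name.

```python
from typing import Optional, Sequence


def build_run_input(
    search_query: Optional[str] = None,
    start_urls: Optional[Sequence[str]] = None,
    max_results: int = 100,
    use_proxy: bool = False,
) -> dict:
    """Assemble an Actor input dict, enforcing the documented 1-10,000 range."""
    if not 1 <= max_results <= 10_000:
        raise ValueError("max_results must be between 1 and 10,000")

    run_input: dict = {"maxResults": max_results}
    if search_query:
        run_input["searchQuery"] = search_query
    if start_urls:
        # Assumed key name and shape (common Apify convention).
        run_input["startUrls"] = [{"url": url} for url in start_urls]
    if use_proxy:
        # Assumed key name and shape (common Apify convention).
        run_input["proxyConfiguration"] = {"useApifyProxy": True}
    return run_input
```

For example, `build_run_input("benzene", max_results=10)` yields a small preview run, in line with the tip to start with a low Max Results.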
Tips for best results
- Search by CAS number for precise lookups: enter `71-43-2` to find exactly Benzene
- Use partial names to cast a wider net: searching `phthalate` returns all phthalate-related substances
- Paste individual infocard URLs into Start URLs when you need specific substances from a known list
- Start with a low Max Results (10-20) to preview the data format before running large extractions
- Enable proxy for scrapes above 500 substances to avoid rate limiting
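Since CAS lookups are the most precise search mode, it can be worth catching typos before spending a run on them. CAS Registry Numbers carry a check digit: the last digit equals the weighted sum of the preceding digits (weights 1, 2, 3, ... counted from the right) modulo 10. The sketch below applies that rule locally; `is_valid_cas` is a hypothetical helper and is not part of the scraper.

```python
import re

# Two to seven digits, two digits, one check digit, separated by hyphens.
_CAS_PATTERN = re.compile(r"^(\d{2,7})-(\d{2})-(\d)$")


def is_valid_cas(cas: str) -> bool:
    """Return True if the string is well-formed and its check digit matches."""
    match = _CAS_PATTERN.match(cas.strip())
    if not match:
        return False
    digits = match.group(1) + match.group(2)
    check_digit = int(match.group(3))
    # Weight the digits 1, 2, 3, ... starting from the rightmost one.
    total = sum(
        weight * int(digit)
        for weight, digit in enumerate(reversed(digits), start=1)
    )
    return total % 10 == check_digit
```

For `71-43-2` the weighted sum is 3·1 + 4·2 + 1·3 + 7·4 = 42, and 42 mod 10 = 2, which matches the check digit.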
Output
Each substance is returned as a JSON object. Here is a realistic example:
    [
      {
        "name": "Benzene",
        "casNumber": "71-43-2",
        "ecNumber": "200-753-7",
        "molecularFormula": "C6H6",
        "registrationType": "Full registration",
        "hazardClassification": "Flam. Liq. 2, Carc. 1A, Muta. 1B, STOT RE 1, Asp. Tox. 1",
        "substanceType": "Mono-constituent substance",
        "description": "Benzene is a registered substance under REACH with a full registration tonnage band of over 1,000,000 tonnes per year.",
        "url": "https://echa.europa.eu/substance-information/-/substanceinfo/100.000.685",
        "scrapedAt": "2026-04-03T08:15:32.000Z"
      },
      {
        "name": "Bisphenol A",
        "casNumber": "80-05-7",
        "ecNumber": "201-245-8",
        "molecularFormula": "C15H16O2",
        "registrationType": "Full registration",
        "hazardClassification": "Repr. 1B, STOT SE 3, Eye Dam. 1, Skin Sens. 1",
        "substanceType": "Mono-constituent substance",
        "description": "Bisphenol A is manufactured and imported in the European Economic Area at volumes of 1,000,000 or more tonnes per year.",
        "url": "https://echa.europa.eu/substance-information/-/substanceinfo/100.001.133",
        "scrapedAt": "2026-04-03T08:15:34.000Z"
      },
      {
        "name": "Di(2-ethylhexyl) phthalate",
        "casNumber": "117-81-7",
        "ecNumber": "204-211-0",
        "molecularFormula": "C24H38O4",
        "registrationType": "Full registration",
        "hazardClassification": "Repr. 1B",
        "substanceType": "Mono-constituent substance",
        "description": "DEHP is a substance of very high concern (SVHC) and is included in the Authorisation List (Annex XIV of REACH).",
        "url": "https://echa.europa.eu/substance-information/-/substanceinfo/100.003.829",
        "scrapedAt": "2026-04-03T08:15:36.000Z"
      }
    ]
Results can be downloaded as JSON, CSV, Excel, XML, or accessed via the Apify API.
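In the example output, `hazardClassification` is a single comma-separated string. A small post-processing step can split it into individual classes and flag carcinogenic, mutagenic, or reprotoxic (CMR) entries. This is a sketch that assumes the comma-separated format shown in the example holds for all records; `split_hazard_classes` and `is_cmr` are hypothetical helper names.

```python
from typing import List, Optional

# CLP abbreviations for carcinogenicity, germ cell mutagenicity, reproductive toxicity.
CMR_PREFIXES = ("Carc.", "Muta.", "Repr.")


def split_hazard_classes(classification: Optional[str]) -> List[str]:
    """Split a comma-separated classification string into trimmed entries."""
    if not classification:
        return []
    return [part.strip() for part in classification.split(",") if part.strip()]


def is_cmr(record: dict) -> bool:
    """Return True if any hazard class of the record starts with a CMR prefix."""
    classes = split_hazard_classes(record.get("hazardClassification"))
    return any(entry.startswith(CMR_PREFIXES) for entry in classes)
```

Applied to the Benzene record above, `split_hazard_classes` yields five entries and `is_cmr` returns `True` because of "Carc. 1A" and "Muta. 1B".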
How much does it cost?
ECHA Scraper runs on the Apify platform. You only pay for the compute resources consumed during the scrape.
| Scrape size | Estimated time | Estimated cost |
|---|---|---|
| 10 substances | ~30 seconds | ~$0.01 |
| 100 substances | ~3 minutes | ~$0.05 |
| 1,000 substances | ~25 minutes | ~$0.40 |
| 5,000 substances | ~2 hours | ~$1.80 |
Apify's free tier includes $5 of monthly compute, enough for hundreds of substance lookups at no cost.
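For scrape sizes between the rows of the table, a rough figure can be obtained by linear interpolation. The sketch below treats the table values as data points and nothing more: actual charges depend on the run, and extending the last segment beyond 5,000 substances is an assumption of this example, not a published rate. `COST_TABLE` and `estimate_cost` are hypothetical names.

```python
# (substances, estimated cost in USD) taken from the table above.
COST_TABLE = [(10, 0.01), (100, 0.05), (1_000, 0.40), (5_000, 1.80)]


def estimate_cost(substances: int) -> float:
    """Linearly interpolate the estimated cost for a given scrape size."""
    if substances <= 0:
        return 0.0

    first_size, first_cost = COST_TABLE[0]
    if substances <= first_size:
        # Below the smallest row: scale proportionally from zero.
        return first_cost * substances / first_size

    # Inside the table: interpolate within the matching segment.
    for (n0, c0), (n1, c1) in zip(COST_TABLE, COST_TABLE[1:]):
        if substances <= n1:
            return c0 + (c1 - c0) * (substances - n0) / (n1 - n0)

    # Beyond the table: extend the slope of the last segment (assumption).
    (n0, c0), (n1, c1) = COST_TABLE[-2], COST_TABLE[-1]
    return c1 + (c1 - c0) * (substances - n1) / (n1 - n0)
```

For instance, 550 substances sits halfway between the 100 and 1,000 rows, so the estimate is halfway between $0.05 and $0.40.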
Can I integrate?
Yes. Connect ECHA Scraper to your existing tools without writing code:
- Google Sheets: Automatically push new substance data into a spreadsheet for your compliance team
- Slack: Get notified in a channel whenever new SVHC substances appear in your search results
- Zapier: Trigger downstream workflows when new hazardous substances are detected
- Make (Integromat): Build complex compliance monitoring pipelines with conditional logic
- Webhooks: Send results to any HTTP endpoint for custom processing
Set up integrations directly from the Apify console under the "Integrations" tab of your Actor run.
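If you use the webhook option, your endpoint needs to work out which dataset holds the results. The sketch below assumes Apify's default webhook payload, where `eventType` names the event and `resource` is the run object containing `defaultDatasetId`; if you customize the payload template, adjust the keys accordingly. `dataset_id_from_webhook` is a hypothetical helper name.

```python
from typing import Optional


def dataset_id_from_webhook(payload: dict) -> Optional[str]:
    """Return the dataset ID of a successful run, or None for any other event.

    Assumes the default payload shape (eventType + resource); verify this
    against your own webhook configuration.
    """
    if payload.get("eventType") != "ACTOR.RUN.SUCCEEDED":
        return None
    resource = payload.get("resource") or {}
    return resource.get("defaultDatasetId")
```

The returned ID can then be passed to `client.dataset(...)` in the same way as in the API examples in this README.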
Can I use it as an API?
Absolutely. Call ECHA Scraper programmatically from any language. Here are examples using the Apify client libraries:
Python
    from apify_client import ApifyClient

    # Authenticate with your Apify API token
    client = ApifyClient("YOUR_API_TOKEN")

    # Start the Actor and wait for the run to finish
    run = client.actor("studio-amba/echa-scraper").call(run_input={
        "searchQuery": "phthalate",
        "maxResults": 50,
    })

    # Iterate over the substances stored in the run's default dataset
    for substance in client.dataset(run["defaultDatasetId"]).iterate_items():
        print(f"{substance['name']} | CAS: {substance['casNumber']} | {substance['hazardClassification']}")
JavaScript
    import { ApifyClient } from 'apify-client';

    // Authenticate with your Apify API token
    const client = new ApifyClient({ token: 'YOUR_API_TOKEN' });

    // Start the Actor and wait for the run to finish
    const run = await client.actor('studio-amba/echa-scraper').call({
        searchQuery: 'phthalate',
        maxResults: 50,
    });

    // Fetch the substances stored in the run's default dataset
    const { items } = await client.dataset(run.defaultDatasetId).listItems();
    items.forEach(s => console.log(`${s.name} | CAS: ${s.casNumber} | ${s.hazardClassification}`));
FAQ
What is ECHA?
The European Chemicals Agency (ECHA) is the EU agency responsible for implementing the REACH, CLP, and Biocidal Products regulations. Its database at echa.europa.eu is the authoritative source for information on chemical substances manufactured or imported in the European Economic Area, including registration dossiers, hazard classifications, and listings of substances of very high concern (SVHC).
How does ECHA Scraper work?
The scraper navigates ECHA's substance search and infocard pages, extracting structured data from each substance profile. It handles ECHA's Liferay-based pagination automatically and can process search results or individual substance URLs you provide.
Is it legal to scrape ECHA?
ECHA publishes chemical substance data as part of its regulatory transparency mandate. The data is publicly accessible without login requirements. This scraper accesses only publicly available information and respects the site's rate limits through built-in concurrency controls.
How to scrape the ECHA chemicals database?
Set the Search Query field to a substance name (e.g., "formaldehyde") or CAS number (e.g., "50-00-0"), configure your desired Max Results, and click Start. The scraper returns structured JSON with all available substance fields. For bulk lookups, paste multiple substance URLs into the Start URLs field.
Is this an ECHA API alternative?
Yes. ECHA does not offer a comprehensive public API for bulk substance data retrieval. This scraper fills that gap by providing structured, programmatic access to the same data available on the ECHA website, with output in JSON, CSV, or Excel format.
How fresh is the data?
Each run scrapes live data directly from echa.europa.eu. The data is as current as what ECHA has published. Schedule recurring runs on Apify to keep your substance database automatically updated.
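With scheduled runs, the useful signal is usually what changed between two runs. The sketch below compares two lists of records keyed by `casNumber` and reports added, removed, and changed substances. The choice of key and of watched fields is an assumption of this example, and `diff_snapshots` is a hypothetical helper, not something the scraper provides.

```python
from typing import Dict, Iterable, List, Sequence


def diff_snapshots(
    previous: Iterable[dict],
    current: Iterable[dict],
    key: str = "casNumber",
    watched: Sequence[str] = ("hazardClassification", "registrationType"),
) -> Dict[str, object]:
    """Compare two scrape results and summarize the differences."""
    # Index both snapshots by the key, skipping records without one.
    old = {r[key]: r for r in previous if r.get(key)}
    new = {r[key]: r for r in current if r.get(key)}

    added: List[str] = sorted(new.keys() - old.keys())
    removed: List[str] = sorted(old.keys() - new.keys())

    changed: Dict[str, dict] = {}
    for identifier in sorted(new.keys() & old.keys()):
        delta = {
            field: (old[identifier].get(field), new[identifier].get(field))
            for field in watched
            if old[identifier].get(field) != new[identifier].get(field)
        }
        if delta:
            changed[identifier] = delta

    return {"added": added, "removed": removed, "changed": changed}
```

Each entry in `changed` maps a field name to an `(old, new)` pair, which is convenient for building a notification message.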
What is REACH and how does it relate to ECHA?
REACH (Registration, Evaluation, Authorisation and Restriction of Chemicals) is the EU regulation governing chemical substances, and ECHA is the agency that manages REACH registrations. Every chemical substance manufactured in or imported into the EU in quantities of one tonne or more per year must be registered with ECHA, together with safety data. This scraper extracts that registration data, including whether a substance has a full or intermediate registration.
Can I search by CAS number?
Yes. Enter a CAS number (e.g., "71-43-2" for Benzene or "80-05-7" for Bisphenol A) directly in the Search Query field. The scraper will find the matching substance and extract all available data from its ECHA infocard.
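Because CAS numbers identify substances unambiguously, they also make a convenient join key once the data is extracted, for example to cross-reference an internal ingredient list against scraped records. The following is a minimal sketch with a hypothetical helper name, `match_ingredients`, assuming both sides use the same hyphenated CAS format.

```python
from typing import Dict, Iterable, List, Tuple


def match_ingredients(
    ingredient_cas: Iterable[str],
    records: Iterable[dict],
) -> Tuple[Dict[str, dict], List[str]]:
    """Split an ingredient list into CAS numbers found in the records and not found."""
    # Index scraped records by CAS number, skipping records without one.
    by_cas = {r["casNumber"]: r for r in records if r.get("casNumber")}

    matched: Dict[str, dict] = {}
    missing: List[str] = []
    for cas in ingredient_cas:
        if cas in by_cas:
            matched[cas] = by_cas[cas]
        else:
            missing.append(cas)
    return matched, missing
```

A CAS number in `missing` only means it was absent from the supplied records; it may still exist on ECHA if the run did not cover it.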
Limitations
- ECHA pages use dynamic Liferay widgets; some fields may be empty if the substance infocard has limited data registered
- Very broad searches (e.g., searching `*`) may return thousands of results; use a reasonable `maxResults` to control run time and cost
- Detailed hazard classification text depends on how extensively the registrant populated the CLP section
- The scraper extracts data from the substance infocard page; full registration dossier PDFs are not downloaded
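Since some fields may come back empty, a quick completeness check after each run shows how much of the data is usable. The sketch below counts missing or blank values per field; `EXPECTED_FIELDS` repeats the field names documented in this README, and `missing_field_counts` is a hypothetical helper name.

```python
from collections import Counter
from typing import Dict, Iterable

# Field names as documented in this README.
EXPECTED_FIELDS = (
    "name", "casNumber", "ecNumber", "molecularFormula", "registrationType",
    "hazardClassification", "substanceType", "description", "url", "scrapedAt",
)


def missing_field_counts(records: Iterable[dict]) -> Dict[str, int]:
    """Count, per field, how many records have it absent, None, or blank."""
    counts: Counter = Counter()
    for record in records:
        for field in EXPECTED_FIELDS:
            value = record.get(field)
            if value is None or (isinstance(value, str) and not value.strip()):
                counts[field] += 1
    return dict(counts)
```

Fields that never appear in the result were populated in every record.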
Other regulatory data scrapers
Building a compliance monitoring pipeline? These scrapers cover adjacent regulatory databases:
- Staatsblad Scraper: Belgian Official Gazette: laws, royal decrees, and ministerial orders
- Safety Gate Scraper: EU product safety alerts and RAPEX recall notifications
- EFSA Scraper: EU food safety alerts, scientific opinions, and risk assessments
- EUR-Lex Scraper: EU regulations, directives, and legal acts from the Official Journal
Your feedback
Found a bug or have a feature request? Open an issue on the Issues tab and we will address it promptly. Your feedback helps us improve the scraper for everyone.