ECHA Scraper — EU Chemicals Database & Substance Info

Pricing

Pay per usage

Scrape chemical substance data, registrations, CAS numbers, hazard classifications, and regulatory info from ECHA (European Chemicals Agency) at echa.europa.eu.

Developer

Jelle Desramaults

Maintained by Community

ECHA Scraper

Scrape chemical substance records from the European Chemicals Agency (ECHA). Get CAS numbers, EC numbers, molecular formulas, hazard classifications, and registration details as structured records, instead of clicking through ECHA's web interface one substance at a time.

What it does

The scraper crawls ECHA's substance information pages. It starts from one of three entry points: a search query (substance name or CAS number), a direct URL, or the registered substances index. For each substance it finds, it extracts identification numbers, classification data, and descriptions from the detail page.

Pagination is handled automatically: when a search returns hundreds of substances, the scraper follows every result page.
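The pagination behaviour described above can be pictured with a small loop. This is only a sketch of the general technique, not the actor's actual code: `fetch_page`, the page tokens, and the fake pages are hypothetical stand-ins for real HTTP requests, and applying the `maxResults` cap inside the loop is an assumption.

```python
from typing import Callable, Iterator, List, Optional, Tuple

# Hypothetical page shape: (items on this page, token of the next page or None).
Page = Tuple[List[dict], Optional[str]]


def iter_results(fetch_page: Callable[[Optional[str]], Page],
                 max_results: int = 100) -> Iterator[dict]:
    """Yield items page by page until pages run out or max_results is reached."""
    token: Optional[str] = None
    yielded = 0
    while yielded < max_results:
        items, token = fetch_page(token)
        for item in items:
            if yielded >= max_results:
                return
            yield item
            yielded += 1
        if token is None:  # no further result page to follow
            return


# Stand-in for real requests: three fake result pages keyed by page token.
FAKE_PAGES = {
    None: ([{"name": "A"}, {"name": "B"}], "p2"),
    "p2": ([{"name": "C"}, {"name": "D"}], "p3"),
    "p3": ([{"name": "E"}], None),
}


def fake_fetch(token: Optional[str]) -> Page:
    return FAKE_PAGES[token]


all_names = [r["name"] for r in iter_results(fake_fetch)]
capped_names = [r["name"] for r in iter_results(fake_fetch, max_results=3)]
```

With the default cap, the loop walks all three fake pages. With `max_results=3` it stops partway through the second page and never requests the third.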

Input

| Field | Type | Description |
| --- | --- | --- |
| searchQuery | String | Substance name or CAS number. Examples: "benzene", "71-43-2", "bisphenol A" |
| startUrls | Array | Direct ECHA page URLs. Example: https://echa.europa.eu/substance-information/-/substanceinfo/100.000.685 |
| maxResults | Integer | Maximum number of substances to return. Default: 100 |
| proxyConfiguration | Object | Proxy settings. Recommended for larger runs. |

If you provide neither searchQuery nor startUrls, the scraper starts from the registered substances index at https://echa.europa.eu/information-on-chemicals/registered-substances.
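As a rough illustration of how the input fields combine, the sketch below assembles an input object and lists the entry points it implies. The helper names `build_run_input` and `entry_points` are hypothetical, not part of the actor. Two assumptions go beyond what this page states: `startUrls` is treated as a list of plain URL strings (the actor may expect a different item shape), and nothing is assumed about precedence when both `searchQuery` and `startUrls` are given.

```python
from typing import List, Optional

# Default start page named in the Input section.
REGISTERED_INDEX = "https://echa.europa.eu/information-on-chemicals/registered-substances"


def build_run_input(search_query: Optional[str] = None,
                    start_urls: Optional[List[str]] = None,
                    max_results: int = 100) -> dict:
    """Assemble an input object using only the documented field names."""
    if max_results < 1:
        raise ValueError("max_results must be a positive integer")
    run_input: dict = {"maxResults": max_results}
    if search_query:
        run_input["searchQuery"] = search_query
    if start_urls:
        # Assumption: plain URL strings; the real item shape is not documented here.
        run_input["startUrls"] = list(start_urls)
    return run_input


def entry_points(run_input: dict) -> List[str]:
    """List where a crawl would begin, falling back to the registered index."""
    points = []
    if "searchQuery" in run_input:
        points.append("search: " + run_input["searchQuery"])
    for url in run_input.get("startUrls", []):
        points.append("url: " + url)
    if not points:
        points.append("index: " + REGISTERED_INDEX)
    return points


by_cas = build_run_input(search_query="71-43-2")
by_default = build_run_input()
```

Here `entry_points(by_cas)` contains a single search entry, while `entry_points(by_default)` falls back to the registered substances index.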

Output

{
  "name": "Benzene",
  "casNumber": "71-43-2",
  "ecNumber": "200-753-7",
  "molecularFormula": "C6H6",
  "registrationType": "Full",
  "hazardClassification": "Flam. Liq. 2, Carc. 1A, Muta. 1B, STOT RE 1, Asp. Tox. 1, ...",
  "substanceType": "Mono-constituent substance",
  "description": "Substance information page for Benzene",
  "url": "https://echa.europa.eu/substance-information/-/substanceinfo/100.000.685",
  "scrapedAt": "2026-03-16T08:00:00.000Z"
}

Fields: name, casNumber, ecNumber, molecularFormula, registrationType, hazardClassification, substanceType, description, url, scrapedAt.
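Downstream code often needs to sanity-check and normalize these records. The sketch below validates `casNumber` with the standard CAS check-digit rule and splits `hazardClassification` into a list. The helpers `is_valid_cas` and `split_hazards` are illustrative names, not part of the actor. The code assumes the classification field is always a comma-separated string like the sample above, which may not hold for every substance.

```python
import re
from typing import List

# CAS Registry Number layout: 2-7 digits, 2 digits, 1 check digit.
CAS_PATTERN = re.compile(r"^(\d{2,7})-(\d{2})-(\d)$")


def is_valid_cas(cas: str) -> bool:
    """Check the format and the check digit of a CAS number."""
    match = CAS_PATTERN.match(cas.strip())
    if not match:
        return False
    digits = match.group(1) + match.group(2)
    check = int(match.group(3))
    # Weight digits 1, 2, 3, ... from right to left; the sum mod 10 is the check digit.
    total = sum(int(d) * weight
                for weight, d in enumerate(reversed(digits), start=1))
    return total % 10 == check


def split_hazards(text: str) -> List[str]:
    """Split a comma-separated classification string, dropping a trailing '...'."""
    parts = [p.strip() for p in text.split(",")]
    return [p for p in parts if p and p != "..."]


record = {
    "name": "Benzene",
    "casNumber": "71-43-2",
    "hazardClassification": "Flam. Liq. 2, Carc. 1A, Muta. 1B, STOT RE 1, Asp. Tox. 1, ...",
}

cas_ok = is_valid_cas(record["casNumber"])
hazards = split_hazards(record["hazardClassification"])
```

For the benzene sample, the weighted sum is 3·1 + 4·2 + 1·3 + 7·4 = 42, whose last digit matches the check digit 2.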

Who uses this

  • REACH compliance teams tracking registered substances
  • Chemical importers/distributors who need to check regulatory status before sourcing
  • EHS (Environment, Health & Safety) departments building internal substance databases
  • Researchers doing bulk lookups instead of manual ECHA searches

Limitations

  • ECHA's HTML structure varies between substance pages, so some fields may be empty when the data is absent or structured differently.
  • The site can be slow to respond. The scraper retries failed requests up to 3 times.
  • Very large extractions (1000+ substances) should use a proxy to avoid rate limiting.
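The retry behaviour mentioned above can be approximated as follows. This is a generic sketch, not the actor's implementation. It reads "up to 3 times" as one initial attempt plus at most three retries, and the exponential backoff delays are an assumption the source does not mention. `with_retries` and `flaky` are hypothetical names.

```python
import time
from typing import Callable, List, TypeVar

T = TypeVar("T")


def with_retries(request: Callable[[], T],
                 max_retries: int = 3,
                 base_delay: float = 1.0,
                 sleep: Callable[[float], None] = time.sleep) -> T:
    """Call request(); on failure, retry up to max_retries times with backoff."""
    attempt = 0
    while True:
        try:
            return request()
        except Exception:
            if attempt >= max_retries:
                raise  # retries exhausted: surface the last error
            sleep(base_delay * 2 ** attempt)  # assumed backoff: 1s, 2s, 4s
            attempt += 1


# Simulated slow site: fails twice, then responds.
calls = {"count": 0}


def flaky() -> str:
    calls["count"] += 1
    if calls["count"] < 3:
        raise TimeoutError("simulated slow response")
    return "ok"


delays: List[float] = []
result = with_retries(flaky, sleep=delays.append)  # record delays instead of sleeping
```

Passing `sleep` as a parameter keeps the example fast to run. In real use the default `time.sleep` would pause between attempts.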