PubChem Chemical Compound Scraper

Pricing

from $3.00 / 1,000 results

Search PubChem - the world's largest free chemistry database with 100M+ compounds. Search by name, get by CID, or fetch synonyms. Returns molecular formula, weight, SMILES, InChI, logP, H-bond counts, and more. No API key required.

Rating

0.0

(0)

Developer

Crawler Bros

Maintained by Community

Actor stats

Bookmarked: 0

Total users: 2

Monthly active users: 1

Last modified: 2 days ago

Scrape PubChem, the world's largest free chemistry database, with 100M+ compounds, maintained by the NCBI. Search by compound name or PubChem CID, fetch detailed molecular properties, or retrieve all known synonyms. The actor makes plain HTTP requests to the public PubChem REST API (PUG REST); no API key or proxy is required.

What this actor does

  • Four modes: searchCompounds, getByName, getByCID, getSynonyms
  • Name-based search: returns multiple matching compounds for a query term
  • Exact lookup: get full detail for a specific compound name or CID
  • Synonyms: retrieve all known names and identifiers for any compound
  • Rich properties: molecular formula, weight, SMILES, InChI, InChIKey, XLogP, H-bond counts, rotatable bond count, heavy atom count, complexity, charge
  • Empty fields are omitted — no nulls in output

Modes

| Mode | Description |
| --- | --- |
| searchCompounds | Search by keyword; returns multiple matching compounds |
| getByName | Get detailed info for an exact compound name |
| getByCID | Get compound by PubChem CID number |
| getSynonyms | Get all known synonyms for a compound |
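
As a rough illustration of how the four modes could translate into PUG REST requests, the sketch below builds one URL per mode. The mapping itself, the `build_url` helper, and the use of `name_type=word` for keyword search are assumptions made for illustration; the actor's actual requests are not documented on this page. The property names are the commonly documented PUG REST ones, and the exact SMILES property name should be checked against the current PubChem documentation.

```python
from urllib.parse import quote

BASE = "https://pubchem.ncbi.nlm.nih.gov/rest/pug/compound"

# PUG REST property names corresponding to the output fields in this README.
# Assumption: verify the SMILES property name against current PubChem docs.
PROPERTIES = ",".join([
    "IUPACName", "MolecularFormula", "MolecularWeight", "CanonicalSMILES",
    "InChIKey", "XLogP", "HBondDonorCount", "HBondAcceptorCount",
    "RotatableBondCount", "HeavyAtomCount", "Complexity", "Charge",
])


def build_url(mode, compound_name=None, cid=None):
    """Return a PUG REST URL for one mode (hypothetical mapping)."""
    if mode == "searchCompounds":
        # Assumed: keyword search lists CIDs using word-level name matching.
        name = quote(compound_name, safe="")
        return f"{BASE}/name/{name}/cids/JSON?name_type=word"
    if mode == "getByName":
        name = quote(compound_name, safe="")
        return f"{BASE}/name/{name}/property/{PROPERTIES}/JSON"
    if mode == "getByCID":
        return f"{BASE}/cid/{int(cid)}/property/{PROPERTIES}/JSON"
    if mode == "getSynonyms":
        # Assumed: use the CID when given, otherwise fall back to the name.
        if cid is not None:
            return f"{BASE}/cid/{int(cid)}/synonyms/JSON"
        name = quote(compound_name, safe="")
        return f"{BASE}/name/{name}/synonyms/JSON"
    raise ValueError(f"unknown mode: {mode}")
```

Names are percent-encoded with `quote(..., safe="")` so that spaces and slashes in a compound name cannot alter the URL path.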

Input

| Field | Type | Description |
| --- | --- | --- |
| mode | select | Which mode to use (default: searchCompounds) |
| compoundName | string | Compound name to search or look up (e.g. aspirin, caffeine) |
| cid | integer | PubChem CID for getByCID or getSynonyms mode |
| maxItems | integer | Maximum records to return, 1–200 (default: 20) |
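
The input rules above can be sketched as a small validation step. This is a minimal illustration, not the actor's code: `normalize_input` is a hypothetical helper, and both the per-mode requirements and the choice to clamp an out-of-range `maxItems` (rather than reject it) are assumptions.

```python
VALID_MODES = {"searchCompounds", "getByName", "getByCID", "getSynonyms"}


def normalize_input(raw):
    """Validate a raw input dict and apply the documented defaults."""
    mode = raw.get("mode", "searchCompounds")
    if mode not in VALID_MODES:
        raise ValueError(f"unsupported mode: {mode!r}")

    # Treat a blank or missing name as absent.
    name = (raw.get("compoundName") or "").strip() or None
    cid = raw.get("cid")

    # Assumed requirements: name-based modes need a name, getByCID needs a
    # CID, and getSynonyms accepts either one.
    if mode in ("searchCompounds", "getByName") and name is None:
        raise ValueError(f"{mode} requires compoundName")
    if mode == "getByCID" and cid is None:
        raise ValueError("getByCID requires cid")
    if mode == "getSynonyms" and cid is None and name is None:
        raise ValueError("getSynonyms requires cid or compoundName")

    # Clamp maxItems into the documented 1-200 range (default 20).
    max_items = min(200, max(1, int(raw.get("maxItems", 20))))
    return {"mode": mode, "compoundName": name, "cid": cid, "maxItems": max_items}
```

For example, `normalize_input({"compoundName": "caffeine", "maxItems": 500})` falls back to `searchCompounds` mode and caps `maxItems` at 200.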

Output per compound

| Field | Type | Description |
| --- | --- | --- |
| cid | integer | PubChem Compound ID |
| iupacName | string | IUPAC systematic name |
| commonName | string | Common/trade name (first synonym or provided name) |
| molecularFormula | string | Molecular formula (e.g. C9H8O4) |
| molecularWeight | float | Molecular weight in g/mol |
| canonicalSmiles | string | Canonical SMILES notation |
| inchiKey | string | Standard InChIKey hash |
| xLogP | float | Computed XLogP3 lipophilicity |
| hBondDonorCount | integer | Number of hydrogen bond donors |
| hBondAcceptorCount | integer | Number of hydrogen bond acceptors |
| rotatableBondCount | integer | Number of rotatable bonds |
| heavyAtomCount | integer | Number of heavy (non-hydrogen) atoms |
| complexity | float | Molecular complexity score |
| charge | integer | Formal charge of the compound |
| synonyms | array | Top 5 known synonyms |
| pubchemUrl | string | Direct link to PubChem compound page |
| scrapedAt | string | ISO 8601 timestamp of when the record was scraped |
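
To make the field list concrete, here is a minimal sketch of how a PubChem property record might be turned into one output record, with empty fields dropped. `FIELD_MAP`, `FLOAT_FIELDS`, and `to_record` are hypothetical names, and the float conversion assumes PubChem may return some numeric values as strings; none of this is confirmed as the actor's implementation.

```python
from datetime import datetime, timezone

# PUG REST property name -> output field name used in this README.
FIELD_MAP = {
    "CID": "cid",
    "IUPACName": "iupacName",
    "MolecularFormula": "molecularFormula",
    "MolecularWeight": "molecularWeight",
    "CanonicalSMILES": "canonicalSmiles",
    "InChIKey": "inchiKey",
    "XLogP": "xLogP",
    "HBondDonorCount": "hBondDonorCount",
    "HBondAcceptorCount": "hBondAcceptorCount",
    "RotatableBondCount": "rotatableBondCount",
    "HeavyAtomCount": "heavyAtomCount",
    "Complexity": "complexity",
    "Charge": "charge",
}
FLOAT_FIELDS = {"molecularWeight", "xLogP", "complexity"}


def to_record(props, synonyms=(), provided_name=None):
    """Build an output record, omitting fields that have no data."""
    record = {}
    for src, dst in FIELD_MAP.items():
        value = props.get(src)
        # Skip only missing or blank values; a legitimate 0 (e.g. charge) stays.
        if value is None or value == "":
            continue
        record[dst] = float(value) if dst in FLOAT_FIELDS else value

    top = list(synonyms)[:5]  # keep the top 5 synonyms
    if top:
        record["synonyms"] = top
    common = top[0] if top else provided_name
    if common:
        record["commonName"] = common
    if "cid" in record:
        record["pubchemUrl"] = f"https://pubchem.ncbi.nlm.nih.gov/compound/{record['cid']}"
    now = datetime.now(timezone.utc).replace(microsecond=0)
    record["scrapedAt"] = now.isoformat()
    return record
```

Because absent fields are left out rather than set to null, downstream code should read optional fields with `record.get("xLogP")` instead of indexing directly.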

Data source

PubChem is a free chemistry database maintained by the National Center for Biotechnology Information (NCBI), part of the US National Institutes of Health. The PubChem REST API is free to use and requires no registration. Clients are expected to stay at or below 5 requests per second.
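
One simple way to stay under a 5 requests/second limit is to space calls at least 0.2 seconds apart. The sketch below shows that fixed-interval approach; `RateLimiter` is a hypothetical class, and how the actor itself throttles requests is not documented here. The clock and sleep functions are injectable so the behaviour can be checked without real delays.

```python
import time


class RateLimiter:
    """Space calls at least 1/rate seconds apart (fixed-interval throttle)."""

    def __init__(self, rate_per_sec=5, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / rate_per_sec
        self.clock = clock
        self.sleep = sleep
        self.next_allowed = 0.0  # earliest time the next call may proceed

    def wait(self):
        """Block until the next call is allowed; return the seconds waited."""
        now = self.clock()
        delay = self.next_allowed - now
        if delay > 0:
            self.sleep(delay)
            now += delay
        self.next_allowed = now + self.min_interval
        return max(delay, 0.0)
```

Calling `limiter.wait()` immediately before each HTTP request keeps consecutive requests at least `min_interval` apart. A token bucket would allow short bursts, but fixed spacing is easier to reason about for a single sequential client.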

Example output

{
  "cid": 2244,
  "iupacName": "2-(acetyloxy)benzoic acid",
  "commonName": "aspirin",
  "molecularFormula": "C9H8O4",
  "molecularWeight": 180.16,
  "canonicalSmiles": "CC(=O)Oc1ccccc1C(=O)O",
  "inchiKey": "BSYNRYMUTXBXSQ-UHFFFAOYSA-N",
  "xLogP": 1.2,
  "hBondDonorCount": 1,
  "hBondAcceptorCount": 4,
  "rotatableBondCount": 3,
  "heavyAtomCount": 13,
  "complexity": 212,
  "charge": 0,
  "synonyms": ["aspirin", "acetylsalicylic acid", "2-acetoxybenzoic acid", "ASA", "Ecotrin"],
  "pubchemUrl": "https://pubchem.ncbi.nlm.nih.gov/compound/2244",
  "scrapedAt": "2026-06-03T10:00:00+00:00"
}

FAQs

Do I need an API key? No. The PubChem REST API is completely free and requires no registration.

How many results can I get? Up to 200 compounds per run. The PubChem database contains over 100 million compounds.

What is the rate limit? PubChem allows up to 5 requests per second. This actor respects that limit automatically.

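Can I look up by SMILES or InChI? There is no dedicated mode for structure identifiers. Use getByName mode and pass the SMILES string or InChI identifier as the compound name; whether it resolves depends on how PubChem's name endpoint matches that input, so verify the returned CID.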

What if a compound has no IUPAC name? Fields are only included when data is available. If an IUPAC name is missing, only the commonName (from synonyms) will appear.

Is the data current? PubChem is updated continuously by the NCBI. Data accuracy reflects the current PubChem database state.