PubChem Compound - Chemical & Drug Data

Pricing

from $2.00 / 1,000 results

Search PubChem for chemical compound data. Find compounds by name, formula, or structure. For chemists, pharma researchers, toxicologists, and materials scientists. Pay per result.

Developer

Ava Torres

Maintained by Community

PubChem Compound Search

Search 115M+ chemical compounds from NIH PubChem. Look up compounds by name, molecular formula, SMILES notation, or PubChem CID. Returns molecular properties including weight, IUPAC name, canonical SMILES, InChI, XLogP, TPSA, hydrogen bond counts, and more. No API key required.


Data Source

NIH PubChem PUG REST API (pubchem.ncbi.nlm.nih.gov). PubChem is the world's largest publicly accessible chemical database, maintained by the National Center for Biotechnology Information (NCBI) as part of the National Institutes of Health.
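
For orientation, PUG REST property lookups follow a `/compound/<input type>/<identifier>/property/<property list>/JSON` URL pattern. The sketch below only builds such a URL. The helper name `build_pug_url` and the mapping from this Actor's `searchType` values to PUG REST input types are assumptions for illustration; the Actor's internal requests are not documented here.

```python
from urllib.parse import quote

PUG_BASE = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"

# Assumed mapping from the Actor's searchType values to PUG REST input types.
# "fastformula" is PUG REST's synchronous molecular-formula search.
INPUT_TYPES = {
    "name": "name",
    "formula": "fastformula",
    "smiles": "smiles",
    "cid": "cid",
}


def build_pug_url(search_type: str, query: str, properties: list[str]) -> str:
    """Return a PUG REST property-lookup URL for one query (hypothetical helper)."""
    if search_type not in INPUT_TYPES:
        raise ValueError(f"unsupported search type: {search_type!r}")
    # Percent-encode the identifier so names with spaces and SMILES strings
    # containing characters such as '(' or '=' are safe in the URL path.
    identifier = quote(query, safe="")
    return (
        f"{PUG_BASE}/compound/{INPUT_TYPES[search_type]}/{identifier}"
        f"/property/{','.join(properties)}/JSON"
    )


print(build_pug_url("name", "aspirin", ["MolecularFormula", "MolecularWeight"]))
```

The property names in the URL (`MolecularFormula`, `MolecularWeight`, and so on) are PubChem's own; the camelCase field names in the tables below are this Actor's output names.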


Output Fields

Output fields depend on the selected Property Set.

Basic Properties

| Field | Type | Description |
|---|---|---|
| cid | integer | PubChem Compound ID |
| molecularFormula | string | Molecular formula (e.g., C9H8O4) |
| molecularWeight | number | Molecular weight in g/mol |
| canonicalSMILES | string | Canonical SMILES notation |
| isomericSMILES | string | Isomeric SMILES notation |
| iupacName | string | IUPAC systematic name |
| inchi | string | InChI identifier |
| inchiKey | string | Hashed InChIKey (27 characters) |

Physical Properties (adds to Basic)

| Field | Type | Description |
|---|---|---|
| xLogP | number | Computed octanol-water partition coefficient (log P) |
| exactMass | number | Exact mass of the most likely isotopic composition |
| tpsa | number | Topological polar surface area (Å²) |
| hBondDonorCount | integer | Number of hydrogen bond donors |
| hBondAcceptorCount | integer | Number of hydrogen bond acceptors |
| rotatableBondCount | integer | Number of rotatable bonds |

All Properties (adds to Physical)

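| Field | Type | Description |
|---|---|---|
| monoisotopicMass | number | Monoisotopic mass (computed from the most abundant isotope of each element) |
| heavyAtomCount | integer | Number of non-hydrogen atoms |
| complexity | number | Structural complexity score |
| charge | integer | Net formal charge of the molecule |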
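
Because each property set adds to the previous one, the expected fields for a record can be derived cumulatively. The following is a minimal sketch using the field names from the tables above; the helper `missing_fields` is hypothetical, and whether the Actor omits a field or returns null when PubChem has no value is not stated here.

```python
BASIC_FIELDS = [
    "cid", "molecularFormula", "molecularWeight", "canonicalSMILES",
    "isomericSMILES", "iupacName", "inchi", "inchiKey",
]
PHYSICAL_EXTRA = [
    "xLogP", "exactMass", "tpsa",
    "hBondDonorCount", "hBondAcceptorCount", "rotatableBondCount",
]
ALL_EXTRA = ["monoisotopicMass", "heavyAtomCount", "complexity", "charge"]

# Each set extends the one before it, mirroring "adds to Basic" / "adds to Physical".
PROPERTY_SETS = {
    "basic": BASIC_FIELDS,
    "physical": BASIC_FIELDS + PHYSICAL_EXTRA,
    "all": BASIC_FIELDS + PHYSICAL_EXTRA + ALL_EXTRA,
}


def missing_fields(record: dict, property_set: str) -> list[str]:
    """List the fields expected for a property set that are absent from a record."""
    return [name for name in PROPERTY_SETS[property_set] if name not in record]


# Count of expected fields per property set
for name, fields in PROPERTY_SETS.items():
    print(name, len(fields))
```

A check like this can be useful when loading exported results into a fixed-schema table, where absent columns need explicit handling.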

Use Cases

  • Drug discovery and cheminformatics -- retrieve molecular properties for compound screening, ADMET analysis, or structure-activity relationship (SAR) research.
  • Chemical database integration -- pull structured compound data into internal databases, ELN systems, or research workflows.
  • Regulatory and safety documentation -- retrieve InChI, InChIKey, SMILES, and molecular formula for compound identification in regulatory filings.
  • Academic research -- access compound properties for computational chemistry, machine learning training data, or literature-related compound lookups.
  • Pharmaceutical market intelligence -- look up drug compound structures and properties for competitive analysis or formulation research.
  • Educational tools -- build chemistry reference tools that surface structured property data for any compound by name.

How to Use

Set the input fields and run the Actor. Results are pushed to the Apify dataset and can be exported as JSON, CSV, or Excel.

Input Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| searchType | string | name | One of: name (compound name), formula (molecular formula), smiles (SMILES notation), or cid (PubChem CID) |
| query | string | (none) | Search term. Required for all search types |
| properties | string | all | Property set to retrieve: basic, physical, or all |
| maxResults | integer | 10 | Maximum compounds to return (1-200) |
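
The constraints in this table can be checked before starting a run. The sketch below is a hypothetical client-side helper (`normalize_input` is not part of the Actor) that applies the documented defaults and ranges; the digits-only check for CID queries is an added assumption based on CIDs being positive integers.

```python
SEARCH_TYPES = {"name", "formula", "smiles", "cid"}
PROPERTY_SET_NAMES = {"basic", "physical", "all"}


def normalize_input(raw: dict) -> dict:
    """Apply documented defaults and validate an Actor input dict (hypothetical helper)."""
    search_type = raw.get("searchType", "name")
    if search_type not in SEARCH_TYPES:
        raise ValueError(f"searchType must be one of {sorted(SEARCH_TYPES)}")

    query = str(raw.get("query", "")).strip()
    if not query:
        raise ValueError("query is required for all search types")
    # Assumption: a CID query should contain digits only.
    if search_type == "cid" and not query.isdigit():
        raise ValueError("cid queries must be numeric")

    properties = raw.get("properties", "all")
    if properties not in PROPERTY_SET_NAMES:
        raise ValueError(f"properties must be one of {sorted(PROPERTY_SET_NAMES)}")

    max_results = raw.get("maxResults", 10)
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(max_results, bool) or not isinstance(max_results, int):
        raise ValueError("maxResults must be an integer")
    if not 1 <= max_results <= 200:
        raise ValueError("maxResults must be between 1 and 200")

    return {
        "searchType": search_type,
        "query": query,
        "properties": properties,
        "maxResults": max_results,
    }


print(normalize_input({"query": "aspirin"}))
```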

Example -- Look Up a Drug by Name

```json
{
  "searchType": "name",
  "query": "aspirin",
  "properties": "all",
  "maxResults": 1
}
```

Example -- Search by Molecular Formula

```json
{
  "searchType": "formula",
  "query": "C9H8O4",
  "properties": "physical",
  "maxResults": 10
}
```

Example -- Search by SMILES

```json
{
  "searchType": "smiles",
  "query": "CC(=O)OC1=CC=CC=C1C(O)=O",
  "properties": "basic",
  "maxResults": 5
}
```

Example -- Direct CID Lookup

```json
{
  "searchType": "cid",
  "query": "2244",
  "properties": "all",
  "maxResults": 1
}
```
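
Inputs like the ones above can also be submitted programmatically. The sketch below assumes the `apify-client` Python package and its `actor(...).call(run_input=...)` and `dataset(...).iterate_items()` methods, with `call` returning the run details as a dict; check this against the client version you have installed. The function `run_lookup` is a hypothetical helper, and the Actor ID and token in the usage comment are placeholders.

```python
def run_lookup(client, actor_id: str, run_input: dict) -> list[dict]:
    """Start one Actor run, wait for it to finish, and return its dataset items.

    `client` is expected to behave like apify_client.ApifyClient.
    """
    run = client.actor(actor_id).call(run_input=run_input)
    if run is None:
        raise RuntimeError("the Actor run did not return run details")
    # Results are read from the run's default dataset.
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


# Usage (placeholders, not real identifiers):
#
#   from apify_client import ApifyClient
#   client = ApifyClient("<YOUR_APIFY_TOKEN>")
#   items = run_lookup(
#       client,
#       "<username>/<actor-name>",
#       {"searchType": "cid", "query": "2244", "properties": "all", "maxResults": 1},
#   )
```

Passing the client in as an argument keeps the helper easy to exercise with a stub, without network access or credentials.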

Cost

  • Actor start fee: ~$0.10 per run
  • Compute: minimal -- typical runs complete in seconds
  • Data cost: $0.005 per result

Most lookups cost under $0.15 total.
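
As a rough worked example, the two listed rates combine into a per-run estimate. This sketch uses only the figures in this section (the start fee is itself approximate) and ignores compute charges; the Store listing's current pricing takes precedence over these numbers.

```python
START_FEE_USD = 0.10     # "~$0.10 per run" as listed above (approximate)
PER_RESULT_USD = 0.005   # data cost per result as listed above


def estimate_cost(results: int) -> float:
    """Estimate the cost of one run in USD from the rates listed in this section."""
    if results < 0:
        raise ValueError("results must be non-negative")
    return round(START_FEE_USD + PER_RESULT_USD * results, 4)


# Estimated cost for a single lookup, the default maxResults, and the maximum
for n in (1, 10, 200):
    print(n, estimate_cost(n))
```

By this estimate a single-compound lookup comes to about $0.105, a run returning the default 10 results to about $0.15, and a full 200-result run to about $1.10.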


Output Formats

Results are available in the Apify dataset viewer and can be exported as:

  • JSON
  • CSV
  • Excel (XLSX)
  • XML
  • RSS

FAQ

Do I need an NIH or PubChem account? No. PubChem's PUG REST API is fully public and requires no authentication.

How many compounds does PubChem contain? PubChem contains over 115 million unique compounds as of 2024. It is the largest publicly accessible chemical database in the world.

What is an InChIKey? The InChIKey is a fixed-length, hashed representation of the full InChI identifier. It is widely used as a stable, searchable identifier for chemical compounds across databases and publications.
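
A standard InChIKey has three hyphen-separated blocks of 14, 10, and 1 uppercase letters, which gives the 27 characters. The sketch below checks only that shape; the helper name is hypothetical, and a matching string is not guaranteed to correspond to a real compound.

```python
import re

# 14 letters, hyphen, 10 letters, hyphen, 1 letter = 27 characters in total
INCHIKEY_RE = re.compile(r"[A-Z]{14}-[A-Z]{10}-[A-Z]")


def is_inchikey_shaped(value: str) -> bool:
    """Return True if the string has the layout of a standard InChIKey."""
    return INCHIKEY_RE.fullmatch(value) is not None


# InChIKey of aspirin (CID 2244)
print(is_inchikey_shaped("BSYNRYMUTXBXSQ-UHFFFAOYSA-N"))
```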

What is XLogP? XLogP is a computed estimate of the octanol-water partition coefficient (log P), a key measure of lipophilicity used in drug discovery and ADMET property prediction. Higher values indicate greater lipophilicity.

What is TPSA? Topological Polar Surface Area (TPSA) is the sum of the surface contributions of a molecule's polar atoms, chiefly oxygen and nitrogen and their attached hydrogens, estimated from the 2D structure. It is used to predict intestinal absorption, blood-brain barrier penetration, and other pharmacokinetic properties.
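
As one illustration of how these fields are used in compound screening, the sketch below applies Lipinski's rule of five to a result record. This filter is not part of the Actor; the thresholds are the commonly cited ones (molecular weight 500, log P 5, 5 donors, 10 acceptors), and the sample record uses illustrative values rather than actual Actor output.

```python
# Upper limits from Lipinski's rule of five, keyed by this Actor's field names
RULE_OF_FIVE_LIMITS = {
    "molecularWeight": 500,
    "xLogP": 5,
    "hBondDonorCount": 5,
    "hBondAcceptorCount": 10,
}


def rule_of_five_violations(record: dict) -> list[str]:
    """Return the fields in a record that exceed the rule-of-five limits."""
    violations = []
    for field, limit in RULE_OF_FIVE_LIMITS.items():
        value = record.get(field)
        # Skip absent or null fields rather than guessing a value.
        if value is not None and value > limit:
            violations.append(field)
    return violations


# Illustrative record shaped like a "physical" result (values are examples only)
sample = {
    "molecularWeight": 180.16,
    "xLogP": 1.2,
    "hBondDonorCount": 1,
    "hBondAcceptorCount": 4,
}
print(rule_of_five_violations(sample))
```

An empty list means no limit was exceeded among the fields that were present.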

Can I search for drugs by brand name? Yes. PubChem includes synonyms for many compounds including brand names, generic names, and chemical names. Searching by brand name (e.g., Tylenol, Lipitor) will return the corresponding compound record.

What is the maximum number of results per run? The maximum is 200 compounds per run. For name-based searches PubChem typically returns the closest matching compound first.