PubChem Compound Scraper
Pricing
from $3.00 / 1,000 results
Scrape PubChem - the world's largest free chemistry database with 100M+ compounds. Search by name, CID, SMILES, or full-text. Returns molecular formula, weight, SMILES, InChI, logP, H-bond counts, synonyms, and more.
Developer
Crawler Bros
Scrape PubChem, the world's largest free chemistry database, with 100M+ compounds maintained by the NCBI. Search by compound name, PubChem CID, SMILES string, or free-text query. Returns molecular identifiers, physicochemical properties, structural data, and synonyms. The actor makes plain HTTP requests to the public PubChem REST API; no authentication or proxy is required.
What this actor does
- Four modes: searchByName, searchBySmiles, searchByCid, fullTextSearch
- Compound lookup: by IUPAC name, common name, CID, or SMILES notation
- Rich properties: molecular formula, weight, SMILES, InChI, InChIKey, XLogP, H-bond counts, heavy atom count, complexity
- Synonyms: up to 10 synonyms per compound
- Empty fields are omitted, so the output contains no nulls
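To make the lookup modes concrete, here is a minimal sketch of how a name, CID, or SMILES lookup could map onto a URL of PubChem's public PUG REST service. The mode-to-namespace mapping, the `property_url` helper, and the short property list are assumptions for illustration; the actor's actual requests are not documented here.

```python
from urllib.parse import quote

# Base URL of PubChem's public PUG REST service.
PUG_REST = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"

# A small illustrative subset of PUG REST property names
# (assumption: the actor requests more than these).
PROPERTIES = ["MolecularFormula", "MolecularWeight", "InChIKey", "XLogP"]

# Assumed mapping from this actor's modes to PUG REST input namespaces.
NAMESPACES = {
    "searchByName": "name",
    "searchBySmiles": "smiles",
    "searchByCid": "cid",
}


def property_url(mode: str, identifier: str) -> str:
    """Build a PUG REST property-lookup URL for a single identifier."""
    namespace = NAMESPACES[mode]
    # Percent-encode everything: names and SMILES can contain '=', '(', '#', etc.
    encoded = quote(str(identifier), safe="")
    props = ",".join(PROPERTIES)
    return f"{PUG_REST}/compound/{namespace}/{encoded}/property/{props}/JSON"


if __name__ == "__main__":
    print(property_url("searchByName", "aspirin"))
    print(property_url("searchBySmiles", "CC(=O)Oc1ccccc1C(=O)O"))
```

Some SMILES strings contain characters such as `/` that can clash with URL path syntax even when encoded, so a real client may need to send them as a query parameter or POST body instead; the sketch does not handle that case.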
Output per compound
| Field | Type | Description |
|---|---|---|
| cid | integer | PubChem Compound ID |
| iupacName | string | IUPAC systematic name |
| molecularFormula | string | Molecular formula (e.g. C9H8O4) |
| molecularWeight | float | Molecular weight in g/mol |
| canonicalSmiles | string | Canonical SMILES notation |
| isomericSmiles | string | Isomeric SMILES (with stereochemistry) |
| inchiKey | string | Standard InChIKey hash |
| inchi | string | Standard InChI string |
| xlogp | float | Computed XLogP3 lipophilicity |
| exactMolecularWeight | float | Exact monoisotopic mass |
| hbondDonorCount | integer | Number of hydrogen bond donors |
| hbondAcceptorCount | integer | Number of hydrogen bond acceptors |
| heavyAtomCount | integer | Number of heavy (non-hydrogen) atoms |
| rotatablebondCount | integer | Number of rotatable bonds |
| synonyms | array | Up to 10 common synonyms |
| sourceUrl | string | PubChem compound page URL |
| recordType | string | Always "compound" |
| scrapedAt | string | ISO 8601 timestamp |
Input
| Field | Type | Default | Description |
|---|---|---|---|
| mode | string | searchByName | searchByName / searchBySmiles / searchByCid / fullTextSearch |
| compoundNames | array | [] | Compound names to look up (mode=searchByName) |
| smilesList | array | [] | SMILES strings (mode=searchBySmiles) |
| cids | array | [] | PubChem CIDs (mode=searchByCid) |
| searchQuery | string | aspirin | Free-text query (mode=fullTextSearch) |
| maxItems | integer | 10 | Max compounds to return (1–1000) |
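The input fields above pair each mode with exactly one value field. The following sketch builds and checks an input object on the client side before a run is started. The `build_input` helper is hypothetical and not part of the actor; only the field names, the default of 10, and the 1 to 1000 range come from the table.

```python
from typing import Any, Dict

# Which input field carries the values for each mode (taken from the input table).
MODE_FIELDS = {
    "searchByName": "compoundNames",
    "searchBySmiles": "smilesList",
    "searchByCid": "cids",
    "fullTextSearch": "searchQuery",
}


def build_input(mode: str, values: Any, max_items: int = 10) -> Dict[str, Any]:
    """Return an input dict for the given mode, or raise ValueError if it is invalid."""
    if mode not in MODE_FIELDS:
        raise ValueError(f"unknown mode: {mode!r}")
    if not 1 <= max_items <= 1000:
        raise ValueError("maxItems must be between 1 and 1000")
    if mode == "fullTextSearch":
        # fullTextSearch takes a single free-text string.
        if not isinstance(values, str) or not values.strip():
            raise ValueError("searchQuery must be a non-empty string")
        payload: Any = values
    else:
        # The other three modes take a list of identifiers.
        payload = list(values)
        if not payload:
            raise ValueError(f"{MODE_FIELDS[mode]} must not be empty")
    return {"mode": mode, MODE_FIELDS[mode]: payload, "maxItems": max_items}


if __name__ == "__main__":
    print(build_input("searchByName", ["aspirin", "caffeine"], max_items=2))
```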
Example: look up common drug compounds
{"mode": "searchByName", "compoundNames": ["aspirin", "caffeine", "ibuprofen", "acetaminophen"], "maxItems": 4}
Example: search by SMILES
{"mode": "searchBySmiles", "smilesList": ["CC(=O)Oc1ccccc1C(=O)O", "Cn1cnc2c1c(=O)n(c(=O)n2C)C"], "maxItems": 2}
Example: full-text search
{"mode": "fullTextSearch", "searchQuery": "acetylsalicylic acid", "maxItems": 5}
FAQs
Do I need an API key? No. PubChem's REST API is freely accessible with no authentication required.
Are there rate limits? PubChem allows up to 5 requests per second. To stay within that limit, this actor automatically waits 0.2 s between requests.
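A fixed delay like this can be sketched as a small throttle that guarantees a minimum interval between calls. This is an illustration under assumptions, not the actor's code: the `Throttle` class and its injectable `clock` and `sleep` parameters are hypothetical names, added so the behavior can be tested without real waiting.

```python
import time
from typing import Callable, Optional


class Throttle:
    """Ensure at least `min_interval` seconds pass between consecutive calls to wait()."""

    def __init__(
        self,
        min_interval: float = 0.2,  # 0.2 s spacing keeps traffic at or below 5 requests/second
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last: Optional[float] = None  # time of the previous request, if any

    def wait(self) -> None:
        """Block until the minimum interval since the previous call has elapsed."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


if __name__ == "__main__":
    throttle = Throttle()
    for cid in (2244, 2519, 3672):
        throttle.wait()  # call this right before each HTTP request
        print("would fetch CID", cid)
```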
How many compounds can I scrape? Up to 1000 per run. For fullTextSearch, the actor fetches matching CIDs first, then retrieves full data for each.
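The two-step fullTextSearch flow can be sketched as follows. The `find_cids` and `fetch_compound` callables are hypothetical stand-ins for the actual PubChem requests, passed in as parameters so the control flow can be shown without network access.

```python
from typing import Callable, Dict, Iterable, List, Optional


def full_text_search(
    query: str,
    max_items: int,
    find_cids: Callable[[str], Iterable[int]],
    fetch_compound: Callable[[int], Optional[Dict]],
) -> List[Dict]:
    """Resolve a query to CIDs, then fetch one record per CID, up to max_items."""
    results: List[Dict] = []
    for cid in find_cids(query):  # step 1: matching CIDs
        if len(results) >= max_items:
            break  # stop before issuing requests that would exceed the cap
        record = fetch_compound(cid)  # step 2: full data for this CID
        if record:  # skip CIDs for which nothing came back
            results.append(record)
    return results
```

Checking the cap before each fetch means no detail request is made for a compound that would be discarded anyway.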
What is the difference between canonical and isomeric SMILES? Canonical SMILES is a standardized representation that omits stereochemistry and isotope information. Isomeric SMILES includes stereochemical information (E/Z, R/S) and isotope labels.
Can I search by molecular structure? Yes, use searchBySmiles mode with a valid SMILES string.
Why are some fields missing from certain compounds? Not all compounds in PubChem have complete property sets. The actor omits any field for which PubChem returns no data.
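The omission rule could look like the sketch below, together with the defensive access it implies for anyone consuming the output. The `drop_empty` helper is a hypothetical illustration, not the actor's code; note that it keeps legitimate zero values such as a hydrogen bond donor count of 0.

```python
from typing import Any, Dict


def drop_empty(record: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy without keys whose value is None, an empty string, or an empty list."""
    return {
        key: value
        for key, value in record.items()
        if value is not None and value != "" and value != []
    }


if __name__ == "__main__":
    raw = {"cid": 2244, "molecularFormula": "C9H8O4", "xlogp": None, "synonyms": []}
    item = drop_empty(raw)
    # Consumers should read optional fields with .get() rather than item["xlogp"].
    print(item.get("xlogp", "not available"))
```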
What is XLogP? XLogP3 is a computed estimate of logP, the octanol-water partition coefficient, which measures lipophilicity (fat-solubility). It is a key input for predicting drug absorption, distribution, and bioavailability.