# PubChem Compound Scraper (`parseforge/pubchem-compound-scraper`) Actor

Export chemical compound data from PubChem, the world's largest open chemistry database with 119M+ compounds. Look up by CID, name, SMILES, or InChIKey. Pull molecular formulas, weights, structures, synonyms, IUPAC names, and properties.

- **URL**: https://apify.com/parseforge/pubchem-compound-scraper.md
- **Developed by:** [ParseForge](https://apify.com/parseforge) (community)
- **Categories:** Education, Developer tools, Other
- **Stats:** 2 total users, 1 monthly user, 100.0% runs succeeded
- **User rating**: No ratings yet

## Pricing

from $20.00 / 1,000 result items

This Actor is paid per event. You are not charged for Apify platform usage; you pay only a fixed price for specific events.
Because this Actor supports Apify Store discounts, the price decreases on higher subscription plans.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event
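
As a rough worked example of the arithmetic, the sketch below estimates a run's cost from the listed starting rate of $20.00 per 1,000 result items. The function name `estimate_cost_usd` is hypothetical, and the figure assumes the listed rate applies unchanged; plan discounts may lower the actual charge.

```python
def estimate_cost_usd(result_items: int, price_per_1000: float = 20.00) -> float:
    """Estimate the cost of a run, assuming a flat per-1,000-items rate.

    The default rate is the starting price listed above; actual pricing
    may be lower on higher subscription plans.
    """
    if result_items < 0:
        raise ValueError("result_items must be non-negative")
    return round(result_items / 1000 * price_per_1000, 2)


print(estimate_cost_usd(250))   # 5.0
print(estimate_cost_usd(1000))  # 20.0
```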

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with a capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# MacOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

![ParseForge Banner](https://github.com/ParseForge/apify-assets/blob/ad35ccc13ddd068b9d6cba33f323962e39aed5b2/banner.jpg?raw=true)

## 🧪 PubChem Compound Scraper

> 🚀 **Export chemistry data from PubChem in seconds.** Look up **119M+ compounds** by CID, name, SMILES, or InChIKey. Pull molecular formulas, weights, structures, IUPAC names, synonyms, and 23 computed properties.

> 🕒 **Last updated:** 2026-05-22 · **📊 19 fields** per record · **🧪 119M+ compounds** · **🔬 NIH official source** · **🔍 4 lookup modes**

The **PubChem Compound Scraper** taps PubChem, the world's largest open chemistry database, maintained by the NIH National Library of Medicine. The Actor returns **19 structured fields per record**, including PubChem CID, IUPAC name, molecular formula and weight, canonical and isomeric SMILES, InChI, InChIKey, computed physicochemical properties, and the full synonym list.

The catalog covers **119 million unique chemical compounds, drawn from hundreds of contributing organizations**, including the FDA, EPA, DrugBank, ChEMBL, NIST, and pharma research consortia. This Actor exposes four lookup modes (CID, name, SMILES, InChIKey) and lets you cherry-pick which of 23 PubChem-computed properties to return.

| 🎯 Target Audience | 💡 Primary Use Cases |
|---|---|
| Chemists, pharma R&D, cheminformaticians, materials scientists, drug-discovery teams, regulatory analysts, chemistry educators | Compound lookup and enrichment, SAR/QSAR feature engineering, ADMET screening inputs, regulatory dossiers, synonym normalization, structure-to-property mapping |

---

### 📋 What the PubChem Compound Scraper does

Four lookup workflows in a single Actor:

- 🔢 **CID lookup.** Numeric PubChem identifiers like `2244` (aspirin), `3672` (ibuprofen).
- 📛 **Name lookup.** Common names like `aspirin`, `caffeine`, `paclitaxel`.
- 🧬 **SMILES lookup.** Pass a structure string and resolve to the canonical PubChem record.
- 🔑 **InChIKey lookup.** Hash-based exact-match lookup, ideal for deduplication.

Pick from **23 PubChem-computed properties** (molecular formula, weight, exact mass, SMILES variants, InChI, IUPAC name, XLogP, TPSA, complexity, charge, H-bond donor/acceptor counts, rotatable bonds, heavy atoms, stereocenters, 3D volume, feature count, and more). Toggle synonym fetching to also pull every common name registered for each compound.

> 💡 **Why it matters:** PubChem is the de facto reference for compound metadata in cheminformatics. Building your own client means juggling the PUG REST API, throttling, retries, and per-property batching. This Actor delivers a tidy record per compound, ready for downstream modelling, dashboards, or reports.
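
To make the four lookup modes concrete, here is a minimal sketch of how a lookup could map onto a public PubChem PUG REST property URL. This is not the Actor's source code: the helper name `build_property_url` is hypothetical, and whether the Actor builds its requests this way is an assumption. SMILES strings containing characters such as `/` are usually safer to send as a POST argument than in the URL path, so the example sticks to names and CIDs.

```python
from urllib.parse import quote

PUG_BASE = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"
MODES = {"cid", "name", "smiles", "inchikey"}


def build_property_url(mode: str, identifier: str, properties: list[str]) -> str:
    """Build a PUG REST property URL for one identifier (illustrative helper)."""
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    if not properties:
        raise ValueError("at least one property is required")
    # Percent-encode the identifier so spaces and punctuation survive the path.
    encoded = quote(identifier, safe="")
    return f"{PUG_BASE}/compound/{mode}/{encoded}/property/{','.join(properties)}/JSON"


print(build_property_url("name", "aspirin", ["MolecularFormula", "MolecularWeight"]))
# https://pubchem.ncbi.nlm.nih.gov/rest/pug/compound/name/aspirin/property/MolecularFormula,MolecularWeight/JSON
```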

---

### 🎬 Full Demo

_🚧 Coming soon: a 3-minute walkthrough showing how to go from sign-up to a downloaded dataset._

---

### ⚙️ Input

<table>
<thead>
<tr><th>Input</th><th>Type</th><th>Default</th><th>Behavior</th></tr>
</thead>
<tbody>
<tr><td><code>maxItems</code></td><td>integer</td><td><code>10</code></td><td>Records to return. Free plan caps at 10, paid plan at 1,000,000.</td></tr>
<tr><td><code>mode</code></td><td>enum</td><td><code>"cid"</code></td><td>One of <code>cid</code>, <code>name</code>, <code>smiles</code>, <code>inchikey</code>.</td></tr>
<tr><td><code>identifiers</code></td><td>string[]</td><td>5 example CIDs</td><td>List of identifiers to resolve, in the chosen mode.</td></tr>
<tr><td><code>properties</code></td><td>string[]</td><td>13 core properties</td><td>Subset of 23 PubChem-computed properties.</td></tr>
<tr><td><code>includeSynonyms</code></td><td>boolean</td><td><code>true</code></td><td>Also fetch the list of common names and synonyms per compound.</td></tr>
</tbody>
</table>

**Example: look up 5 common drugs by name with synonyms.**

```json
{
    "maxItems": 5,
    "mode": "name",
    "identifiers": ["aspirin", "ibuprofen", "caffeine", "paracetamol", "metformin"],
    "includeSynonyms": true
}
````

**Example: minimal property pull by CID for a screening library.**

```json
{
    "maxItems": 1000,
    "mode": "cid",
    "identifiers": ["2244", "3672", "1983", "5793", "2519"],
    "properties": ["MolecularFormula", "MolecularWeight", "CanonicalSMILES", "XLogP", "TPSA"],
    "includeSynonyms": false
}
```
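
If you assemble inputs in code, a small validator can catch mistakes before a run starts. The sketch below is an assumption-laden helper, not part of the Actor: `build_input` is a hypothetical name, and the checks (digits only for CIDs, the standard 14-10-1 InChIKey layout, the 1 to 1,000,000 range for `maxItems`) mirror the input table and schema in this README.

```python
import re
from typing import Optional

MODES = ("cid", "name", "smiles", "inchikey")
# Standard InChIKey layout: 14 letters, 10 letters, 1 letter, hyphen-separated.
INCHIKEY_RE = re.compile(r"^[A-Z]{14}-[A-Z]{10}-[A-Z]$")


def build_input(mode: str, identifiers: list[str],
                properties: Optional[list[str]] = None,
                include_synonyms: bool = True, max_items: int = 10) -> dict:
    """Return an input object for the Actor, or raise ValueError (illustrative)."""
    if mode not in MODES:
        raise ValueError(f"mode must be one of {MODES}")
    if not 1 <= max_items <= 1_000_000:
        raise ValueError("maxItems must be between 1 and 1,000,000")
    cleaned = [s.strip() for s in identifiers if s.strip()]
    if not cleaned:
        raise ValueError("identifiers must not be empty")
    if mode == "cid":
        bad = [s for s in cleaned if not s.isdigit()]
    elif mode == "inchikey":
        bad = [s for s in cleaned if not INCHIKEY_RE.match(s)]
    else:
        bad = []  # names and SMILES are free-form; leave them to PubChem
    if bad:
        raise ValueError(f"invalid identifiers for mode {mode!r}: {bad}")
    payload = {"maxItems": max_items, "mode": mode,
               "identifiers": cleaned, "includeSynonyms": include_synonyms}
    if properties is not None:
        payload["properties"] = list(properties)
    return payload


print(build_input("cid", ["2244", " 3672 "], properties=["XLogP", "TPSA"]))
```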

> ⚠️ **Good to Know:** PubChem PUG REST applies rate limits to free public callers. The Actor batches and paces requests automatically so you avoid 503s.
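
For readers curious what batching and pacing look like in practice, here is a generic sketch. It is not the Actor's implementation: `chunked` and `paced` are hypothetical helpers, and the batch size and interval are illustrative values rather than the Actor's real settings (PubChem's published guidance is on the order of five requests per second).

```python
import time
from typing import Callable, Iterable, Iterator, Sequence


def chunked(items: Sequence[str], size: int) -> Iterator[list[str]]:
    """Split identifiers into batches of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def paced(batches: Iterable[list[str]], min_interval_s: float,
          sleep: Callable[[float], None] = time.sleep,
          clock: Callable[[], float] = time.monotonic) -> Iterator[list[str]]:
    """Yield batches no faster than one per `min_interval_s` seconds."""
    last = None
    for batch in batches:
        if last is not None:
            wait = min_interval_s - (clock() - last)
            if wait > 0:
                sleep(wait)  # hold back until the interval has elapsed
        last = clock()
        yield batch


print(list(chunked(["2244", "3672", "1983", "5793", "2519"], 2)))
# [['2244', '3672'], ['1983', '5793'], ['2519']]
```

A caller would wrap the two together, for example `for batch in paced(chunked(ids, 100), 0.25): ...`, issuing one request per batch.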

***

### 📊 Output

Each record contains **19 fields**. Download the dataset as CSV, Excel, JSON, or XML.

#### 🧾 Schema

| Field | Type | Example |
|---|---|---|
| 🆔 `cid` | integer | `2244` |
| 🏷️ `title` | string \| null | `"Aspirin"` |
| 🧬 `iupacName` | string \| null | `"2-acetyloxybenzoic acid"` |
| ⚗️ `molecularFormula` | string \| null | `"C9H8O4"` |
| ⚖️ `molecularWeight` | string \| null | `"180.16"` |
| 📐 `canonicalSMILES` | string \| null | `"CC(=O)OC1=CC=CC=C1C(=O)O"` |
| 🌀 `isomericSMILES` | string \| null | `"CC(=O)OC1=CC=CC=C1C(=O)O"` |
| 🔗 `inchi` | string \| null | `"InChI=1S/C9H8O4/..."` |
| 🔑 `inchiKey` | string \| null | `"BSYNRYMUTXBXSQ-UHFFFAOYSA-N"` |
| 💧 `xLogP` | number \| null | `1.2` |
| 🎯 `exactMass` | string \| null | `"180.04225873"` |
| 🧮 `tpsa` | number \| null | `63.6` |
| 🔋 `hBondDonorCount` | integer \| null | `1` |
| 🔌 `hBondAcceptorCount` | integer \| null | `4` |
| 🔄 `rotatableBondCount` | integer \| null | `3` |
| 📝 `synonyms` | string\[] \| null | `["Aspirin", "Acetylsalicylic acid", "ASA", ...]` |
| 🧱 `properties` | object \| null | `{ "Complexity": 212, "HeavyAtomCount": 13, ... }` |
| 🔗 `url` | string | `"https://pubchem.ncbi.nlm.nih.gov/compound/2244"` |
| 🕓 `scrapedAt` | ISO 8601 | `"2026-05-22T00:00:00.000Z"` |
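
Note that `molecularWeight` and `exactMass` arrive as strings while `xLogP` and `tpsa` are numbers. A minimal sketch of loading a record into a typed structure follows; the `Compound` class and `parse_compound` function are hypothetical names, and using `Decimal` for the string-valued masses is one possible design choice for preserving their precision, not a requirement.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Any, Optional


@dataclass(frozen=True)
class Compound:
    cid: int
    title: Optional[str]
    molecular_formula: Optional[str]
    molecular_weight: Optional[Decimal]
    exact_mass: Optional[Decimal]
    x_log_p: Optional[float]
    inchi_key: Optional[str]


def _to_decimal(value: Any) -> Optional[Decimal]:
    # Nullable string fields become Decimal so no digits are lost.
    return None if value is None else Decimal(str(value))


def parse_compound(item: dict[str, Any]) -> Compound:
    """Convert one dataset item into a typed record (subset of fields)."""
    return Compound(
        cid=int(item["cid"]),
        title=item.get("title"),
        molecular_formula=item.get("molecularFormula"),
        molecular_weight=_to_decimal(item.get("molecularWeight")),
        exact_mass=_to_decimal(item.get("exactMass")),
        x_log_p=item.get("xLogP"),
        inchi_key=item.get("inchiKey"),
    )


aspirin = parse_compound({
    "cid": 2244, "title": "Aspirin", "molecularFormula": "C9H8O4",
    "molecularWeight": "180.16", "exactMass": "180.04225873",
    "xLogP": 1.2, "inchiKey": "BSYNRYMUTXBXSQ-UHFFFAOYSA-N",
})
print(aspirin.molecular_weight)  # 180.16
```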

#### 📦 Sample records

<details>
<summary><strong>💊 Aspirin (CID 2244)</strong></summary>

```json
{
    "cid": 2244,
    "title": "Aspirin",
    "iupacName": "2-acetyloxybenzoic acid",
    "molecularFormula": "C9H8O4",
    "molecularWeight": "180.16",
    "canonicalSMILES": "CC(=O)OC1=CC=CC=C1C(=O)O",
    "isomericSMILES": "CC(=O)OC1=CC=CC=C1C(=O)O",
    "inchi": "InChI=1S/C9H8O4/c1-6(10)13-8-5-3-2-4-7(8)9(11)12/h2-5H,1H3,(H,11,12)",
    "inchiKey": "BSYNRYMUTXBXSQ-UHFFFAOYSA-N",
    "xLogP": 1.2,
    "exactMass": "180.04225873",
    "tpsa": 63.6,
    "hBondDonorCount": 1,
    "hBondAcceptorCount": 4,
    "rotatableBondCount": 3,
    "synonyms": ["Aspirin", "Acetylsalicylic acid", "ASA", "2-Acetoxybenzoic acid", "Acetysal"],
    "url": "https://pubchem.ncbi.nlm.nih.gov/compound/2244",
    "scrapedAt": "2026-05-22T00:00:00.000Z"
}
```

</details>

<details>
<summary><strong>☕ Caffeine (CID 2519)</strong></summary>

```json
{
    "cid": 2519,
    "title": "Caffeine",
    "iupacName": "1,3,7-trimethylpurine-2,6-dione",
    "molecularFormula": "C8H10N4O2",
    "molecularWeight": "194.19",
    "canonicalSMILES": "CN1C=NC2=C1C(=O)N(C(=O)N2C)C",
    "isomericSMILES": "CN1C=NC2=C1C(=O)N(C(=O)N2C)C",
    "inchi": "InChI=1S/C8H10N4O2/c1-10-4-9-6-5(10)7(13)12(3)8(14)11(6)2/h4H,1-3H3",
    "inchiKey": "RYYVLZVUVIJVGH-UHFFFAOYSA-N",
    "xLogP": -0.1,
    "exactMass": "194.08037557",
    "tpsa": 58.4,
    "hBondDonorCount": 0,
    "hBondAcceptorCount": 3,
    "rotatableBondCount": 0,
    "synonyms": ["Caffeine", "1,3,7-Trimethylxanthine", "Theine", "Guaranine"],
    "url": "https://pubchem.ncbi.nlm.nih.gov/compound/2519",
    "scrapedAt": "2026-05-22T00:00:00.000Z"
}
```

</details>
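
Because `inchiKey` is an exact-match hash, it works well as a deduplication key when you merge results from several runs or lookup modes. The sketch below is one way to do that; `dedupe_by_inchikey` is a hypothetical helper, and keeping the first occurrence (and keeping records with no key at all) is an assumed policy you may want to change.

```python
from typing import Any


def dedupe_by_inchikey(records: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Keep the first record seen for each InChIKey, preserving order."""
    seen: set[str] = set()
    unique: list[dict[str, Any]] = []
    for record in records:
        key = record.get("inchiKey")
        if key is not None:
            if key in seen:
                continue  # duplicate structure, skip it
            seen.add(key)
        unique.append(record)  # records without a key are kept as-is
    return unique


records = [
    {"cid": 2244, "inchiKey": "BSYNRYMUTXBXSQ-UHFFFAOYSA-N"},
    {"cid": 2519, "inchiKey": "RYYVLZVUVIJVGH-UHFFFAOYSA-N"},
    {"cid": 2244, "inchiKey": "BSYNRYMUTXBXSQ-UHFFFAOYSA-N"},
]
print([r["cid"] for r in dedupe_by_inchikey(records)])  # [2244, 2519]
```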

***

### ✨ Why choose this Actor

| | Capability |
|---|---|
| 🌐 | **Massive coverage.** 119M+ compounds from the NIH National Library of Medicine. |
| 🔍 | **Four lookup modes.** CID, name, SMILES, and InChIKey resolve to the same canonical record. |
| 🧱 | **23 computed properties.** Pick only the ones your model needs and save downstream cleanup. |
| 📝 | **Synonym lists.** Resolve trade names, salts, generics, and historical spellings in one shot. |
| ⚡ | **Fast.** 100 compounds in under a minute, paced under the public rate limit. |
| 🔁 | **Always fresh.** Every run hits the live PubChem feed. |
| 🚫 | **No API key.** Public PubChem REST needs no registration. |

> 📊 PubChem is the most widely cited chemical reference in modern cheminformatics, drug discovery, and materials research.

***

### 📈 How it compares to alternatives

| Approach | Cost | Coverage | Refresh | Filters | Setup |
|---|---|---|---|---|---|
| **⭐ PubChem Compound Scraper** *(this Actor)* | $5 free credit, then pay-per-use | **119M+ compounds** | **Live per run** | CID, name, SMILES, InChIKey | ⚡ 2 min |
| Manual web download from PubChem | Free | Per-compound | Manual | None | 🐢 Hours |
| Hand-coded PUG REST client | Free | Full | Per-build | Custom | ⏳ Days |
| Commercial cheminformatics suites | $$$$/year | Curated | Vendor schedule | Vendor-defined | 🕒 Sales cycle |

Pick this Actor when you want broad coverage, multi-mode lookup, and zero infrastructure to maintain.

***

### 🚀 How to use

1. 📝 **Sign up.** [Create a free account with $5 credit](https://console.apify.com/sign-up?fpr=vmoqkp) (takes 2 minutes).
2. 🌐 **Open the Actor.** Go to the PubChem Compound Scraper page on the Apify Store.
3. 🎯 **Set input.** Pick a lookup mode, paste identifiers, choose which properties to fetch.
4. 🚀 **Run it.** Click **Start** and let the Actor collect your data.
5. 📥 **Download.** Grab your results in the **Dataset** tab as CSV, Excel, JSON, or XML.

> ⏱️ Total time from signup to downloaded dataset: **3-5 minutes.** No coding required.

***

### 💼 Business use cases

<table>
<tr>
<td width="50%" valign="top">

#### 💊 Pharma R\&D

- Hit triage and library enrichment
- ADMET property pulls for early screening
- Synonym normalization across legacy datasets
- Regulatory dossier reference checks

</td>
<td width="50%" valign="top">

#### 🧪 Cheminformatics and ML

- Build SAR/QSAR feature tables
- Train generative-chemistry models with real properties
- Standardize SMILES/InChI representations
- Benchmark predicted vs PubChem-computed properties

</td>
</tr>
<tr>
<td width="50%" valign="top">

#### 🧱 Materials and chemicals

- Specialty-chemical sourcing reference data
- Polymer monomer property tables
- Catalyst and ligand databases
- Raw-material substitution screens

</td>
<td width="50%" valign="top">

#### 📋 Regulatory and EHS

- Synonym matching for hazardous-substance lists
- Inventory reconciliation across regulatory IDs
- Safety data sheet (SDS) cross-referencing
- Tracking ingredient identifiers across jurisdictions

</td>
</tr>
</table>

***

### 🔌 Automating PubChem Compound Scraper

Control the scraper programmatically for scheduled runs and pipeline integrations:

- 🟢 **Node.js.** Install the `apify-client` NPM package.
- 🐍 **Python.** Use the `apify-client` PyPI package.
- 📚 See the [Apify API documentation](https://docs.apify.com/api/v2) for full details.

The [Apify Schedules feature](https://docs.apify.com/platform/schedules) lets you trigger this Actor on any cron interval. Daily or weekly refreshes keep downstream databases in sync automatically.
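
If you prefer not to install a client library, a run can also be started over plain HTTP. The sketch below only builds the request, using the `run-sync-get-dataset-items` path and `token` query parameter from the OpenAPI specification in this document; `build_run_request` is a hypothetical helper name, and error handling and timeouts are left out.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

API_BASE = "https://api.apify.com/v2"
ACTOR_PATH = "/acts/parseforge~pubchem-compound-scraper/run-sync-get-dataset-items"


def build_run_request(token: str, actor_input: dict) -> Request:
    """Build a POST request that runs the Actor and returns its dataset items."""
    url = f"{API_BASE}{ACTOR_PATH}?{urlencode({'token': token})}"
    body = json.dumps(actor_input).encode("utf-8")
    return Request(url, data=body, method="POST",
                   headers={"Content-Type": "application/json"})


request = build_run_request("<YOUR_API_TOKEN>", {
    "maxItems": 10,
    "mode": "cid",
    "identifiers": ["2244", "2519"],
    "includeSynonyms": False,
})
print(request.get_method())  # POST
```

Sending it is a matter of passing `request` to `urllib.request.urlopen` and decoding the JSON response body.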

***

### 🌟 Beyond business use cases

Data like this powers more than commercial workflows. The same structured records support research, education, civic projects, and personal initiatives.

<table>
<tr>
<td width="50%">

#### 🎓 Research and academia

- Course datasets for medicinal-chemistry and cheminformatics classes
- Reproducible papers with cited, versioned compound pulls
- Open-science notebooks that ground analyses in PubChem
- Thesis projects on structure-property relationships

</td>
<td width="50%">

#### 🎨 Personal and creative

- Hobbyist science blogs and explainers
- Visualization projects on molecular property distributions
- Educational apps that teach chemistry through real compounds
- Side projects exploring natural-product chemistry

</td>
</tr>
<tr>
<td width="50%">

#### 🤝 Non-profit and civic

- Public-health communication around medicines and toxins
- Environmental advocacy with chemical-property evidence
- Citizen-science projects on consumer-product ingredients
- Educational resources for under-served STEM programs

</td>
<td width="50%">

#### 🧪 Experimentation

- Train property-prediction ML models on real labels
- Validate generative-chemistry tools against PubChem ground truth
- Prototype agent pipelines that answer chemistry questions
- Build LLM-grounded chemistry assistants with cited records

</td>
</tr>
</table>

***

### 🤖 Ask an AI assistant about this scraper

Open a ready-to-send prompt about this ParseForge Actor in the AI of your choice:

- 💬 [**ChatGPT**](https://chat.openai.com/?q=How%20do%20I%20use%20the%20PubChem%20Compound%20Scraper%20by%20ParseForge%20on%20Apify%3F%20Show%20me%20input%20examples%2C%20output%20fields%2C%20common%20use%20cases%2C%20and%20how%20to%20integrate%20it%20into%20a%20workflow.)
- 🧠 [**Claude**](https://claude.ai/new?q=How%20do%20I%20use%20the%20PubChem%20Compound%20Scraper%20by%20ParseForge%20on%20Apify%3F%20Show%20me%20input%20examples%2C%20output%20fields%2C%20common%20use%20cases%2C%20and%20how%20to%20integrate%20it%20into%20a%20workflow.)
- 🔍 [**Perplexity**](https://perplexity.ai/search?q=How%20do%20I%20use%20the%20PubChem%20Compound%20Scraper%20by%20ParseForge%20on%20Apify%3F%20Show%20me%20input%20examples%2C%20output%20fields%2C%20common%20use%20cases%2C%20and%20how%20to%20integrate%20it%20into%20a%20workflow.)
- 🅒 [**Copilot**](https://copilot.microsoft.com/?q=How%20do%20I%20use%20the%20PubChem%20Compound%20Scraper%20by%20ParseForge%20on%20Apify%3F%20Show%20me%20input%20examples%2C%20output%20fields%2C%20common%20use%20cases%2C%20and%20how%20to%20integrate%20it%20into%20a%20workflow.)

***

### ❓ Frequently Asked Questions

#### 🧩 How does it work?

Pick a lookup mode, paste your identifiers, choose which PubChem-computed properties to return, and click Start. The Actor calls the public PubChem feed, paces requests to stay within rate limits, and emits one tidy record per compound.

#### 📏 How accurate is the data?

All numeric properties are PubChem-computed values served live from the NIH source. Synonyms are aggregated from PubChem's depositor network and cover trade names, salts, generics, and historical spellings.

#### 🔁 How often is the dataset refreshed?

PubChem updates continuously as depositors submit new compounds and properties. Every Actor run pulls the current state of each compound at run time.

#### 🧬 What's the difference between canonical and isomeric SMILES?

Canonical SMILES is a normalized 2D representation. Isomeric SMILES preserves stereochemistry and isotope information. Use isomeric for accurate structure handling in modelling.
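
One quick way to see the difference in your own data is to compare the two fields: when they differ, the isomeric string carries stereochemistry or isotope detail that the canonical one drops. The helper below is a hypothetical sketch, and the L-alanine strings are illustrative values for a chiral compound rather than output copied from an Actor run.

```python
from typing import Any, Optional


def carries_stereo_or_isotope(record: dict[str, Any]) -> Optional[bool]:
    """True if isomeric SMILES adds detail beyond canonical; None if unknown."""
    canonical = record.get("canonicalSMILES")
    isomeric = record.get("isomericSMILES")
    if canonical is None or isomeric is None:
        return None
    return canonical != isomeric


aspirin = {
    "canonicalSMILES": "CC(=O)OC1=CC=CC=C1C(=O)O",
    "isomericSMILES": "CC(=O)OC1=CC=CC=C1C(=O)O",
}
# Illustrative chiral example (L-alanine); "@@" marks the stereocenter.
l_alanine = {
    "canonicalSMILES": "CC(C(=O)O)N",
    "isomericSMILES": "C[C@@H](C(=O)O)N",
}
print(carries_stereo_or_isotope(aspirin))    # False
print(carries_stereo_or_isotope(l_alanine))  # True
```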

#### ⏰ Can I schedule regular runs?

Yes. Use Apify Schedules to run this Actor on any cron interval and keep a downstream database in sync.

#### ⚖️ Is this data legal to use?

PubChem data is in the public domain in the United States. Many international jurisdictions treat it similarly. Review the downstream terms of your specific use case before redistribution.

#### 💼 Can I use this data commercially?

Yes. PubChem's data policy permits commercial use. You are responsible for complying with any downstream regulatory requirements and the terms of contributing depositors.

#### 💳 Do I need a paid Apify plan to use this Actor?

No. The free Apify plan is enough for testing and small runs (10 records per run). A paid plan lifts the limit and gives you access to scheduling, higher concurrency, and larger datasets.

#### 🔁 What happens if a run fails or gets interrupted?

Apify automatically retries transient errors. If a run still fails, you can inspect the log in the Runs tab, fix the input, and re-run. Partial datasets from failed runs are preserved so you never lose progress.

#### 🆘 What if I need help?

Our support team is here to help. Contact us through the Apify platform or use the Tally form linked below.

***

### 🔌 Integrate with any app

PubChem Compound Scraper connects to any cloud service via [Apify integrations](https://apify.com/integrations):

- [**Make**](https://docs.apify.com/platform/integrations/make) - Automate multi-step workflows
- [**Zapier**](https://docs.apify.com/platform/integrations/zapier) - Connect with 5,000+ apps
- [**Slack**](https://docs.apify.com/platform/integrations/slack) - Get run notifications in your channels
- [**Airbyte**](https://docs.apify.com/platform/integrations/airbyte) - Pipe compound data into your warehouse
- [**GitHub**](https://docs.apify.com/platform/integrations/github) - Trigger runs from commits and releases
- [**Google Drive**](https://docs.apify.com/platform/integrations/drive) - Export datasets straight to Sheets

You can also use webhooks to trigger downstream actions when a run finishes. Push fresh compound data into your product backend, or alert your team in Slack.
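
On the receiving end, a webhook handler typically needs the dataset ID of the finished run. The sketch below assumes Apify's default webhook payload, with an `eventType` field and the run object under `resource`; if you customize the payload template, adjust the keys accordingly. `dataset_id_from_webhook` is a hypothetical helper name and the sample payload values are made up.

```python
from typing import Any, Optional


def dataset_id_from_webhook(payload: dict[str, Any]) -> Optional[str]:
    """Return the default dataset ID for a succeeded run, else None."""
    # Assumption: default payload shape with "eventType" and "resource".
    if payload.get("eventType") != "ACTOR.RUN.SUCCEEDED":
        return None
    resource = payload.get("resource") or {}
    return resource.get("defaultDatasetId")


sample = {
    "eventType": "ACTOR.RUN.SUCCEEDED",
    "resource": {"status": "SUCCEEDED", "defaultDatasetId": "example-dataset-id"},
}
print(dataset_id_from_webhook(sample))  # example-dataset-id
```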

***

### 🔗 Recommended Actors

- [**🧬 KEGG Pathways Scraper**](https://apify.com/parseforge/kegg-pathways-scraper) - Biological pathways, compounds, genes, drugs
- [**🏥 ClinicalTrials.gov Scraper**](https://apify.com/parseforge/clinicaltrials-gov-scraper) - Global clinical research registry
- [**📚 PubMed Scraper**](https://apify.com/parseforge/pubmed-scraper) - Biomedical literature search
- [**🔬 ArXiv Scraper**](https://apify.com/parseforge/arxiv-scraper) - Preprint research papers
- [**📊 GBIF Biodiversity Scraper**](https://apify.com/parseforge/gbif-biodiversity-scraper) - Global species occurrence data

> 💡 **Pro Tip:** browse the complete [ParseForge collection](https://apify.com/parseforge) for more reference-data scrapers.

***

**🆘 Need Help?** [**Open our contact form**](https://tally.so/r/BzdKgA) to request a new scraper, propose a custom data project, or report an issue.

***

> **⚠️ Disclaimer:** this Actor is an independent tool and is not affiliated with, endorsed by, or sponsored by the NIH National Library of Medicine, PubChem, or any government body. All trademarks mentioned are the property of their respective owners. Only publicly available open data is collected.

# Actor input Schema

## `maxItems` (type: `integer`):

Free users: Limited to 10 items (preview). Paid users: Optional, max 1,000,000

## `mode` (type: `string`):

How identifiers are interpreted. CID = numeric PubChem ID, NAME = common chemical name, SMILES = structure string, INCHIKEY = hashed InChI.

## `identifiers` (type: `array`):

List of identifiers to look up. Examples: \['2244','3672','1983'] for CIDs, or \['aspirin','ibuprofen','caffeine'] for names.

## `properties` (type: `array`):

Which compound properties to retrieve. Default selection covers core chemistry. Add more for advanced cheminformatics.

## `includeSynonyms` (type: `boolean`):

Also fetch the list of common names and synonyms for each compound.

## Actor input object example

```json
{
  "maxItems": 10,
  "mode": "cid",
  "identifiers": [
    "2244",
    "3672",
    "1983",
    "5793",
    "2519"
  ],
  "properties": [
    "MolecularFormula",
    "MolecularWeight",
    "CanonicalSMILES",
    "IsomericSMILES",
    "InChI",
    "InChIKey",
    "IUPACName",
    "XLogP",
    "ExactMass",
    "TPSA",
    "HBondDonorCount",
    "HBondAcceptorCount",
    "RotatableBondCount"
  ],
  "includeSynonyms": true
}
```

# Actor output Schema

## `overview` (type: `string`):

No description

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "maxItems": 10,
    "mode": "cid",
    "identifiers": [
        "2244",
        "3672",
        "1983",
        "5793",
        "2519"
    ],
    "properties": [
        "MolecularFormula",
        "MolecularWeight",
        "CanonicalSMILES",
        "IsomericSMILES",
        "InChI",
        "InChIKey",
        "IUPACName",
        "XLogP",
        "ExactMass",
        "TPSA",
        "HBondDonorCount",
        "HBondAcceptorCount",
        "RotatableBondCount"
    ]
};

// Run the Actor and wait for it to finish
const run = await client.actor("parseforge/pubchem-compound-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "maxItems": 10,
    "mode": "cid",
    "identifiers": [
        "2244",
        "3672",
        "1983",
        "5793",
        "2519",
    ],
    "properties": [
        "MolecularFormula",
        "MolecularWeight",
        "CanonicalSMILES",
        "IsomericSMILES",
        "InChI",
        "InChIKey",
        "IUPACName",
        "XLogP",
        "ExactMass",
        "TPSA",
        "HBondDonorCount",
        "HBondAcceptorCount",
        "RotatableBondCount",
    ],
}

# Run the Actor and wait for it to finish
run = client.actor("parseforge/pubchem-compound-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "maxItems": 10,
  "mode": "cid",
  "identifiers": [
    "2244",
    "3672",
    "1983",
    "5793",
    "2519"
  ],
  "properties": [
    "MolecularFormula",
    "MolecularWeight",
    "CanonicalSMILES",
    "IsomericSMILES",
    "InChI",
    "InChIKey",
    "IUPACName",
    "XLogP",
    "ExactMass",
    "TPSA",
    "HBondDonorCount",
    "HBondAcceptorCount",
    "RotatableBondCount"
  ]
}' |
apify call parseforge/pubchem-compound-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=parseforge/pubchem-compound-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "PubChem Compound Scraper",
        "description": "Export chemical compound data from PubChem, the world's largest open chemistry database with 119M+ compounds. Look up by CID, name, SMILES, or InChIKey. Pull molecular formulas, weights, structures, synonyms, IUPAC names, and properties.",
        "version": "1.0",
        "x-build-id": "a2VF6pGLY2e7zELQs"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/parseforge~pubchem-compound-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-parseforge-pubchem-compound-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/parseforge~pubchem-compound-scraper/runs": {
            "post": {
                "operationId": "runs-sync-parseforge-pubchem-compound-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/parseforge~pubchem-compound-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-parseforge-pubchem-compound-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "properties": {
                    "maxItems": {
                        "title": "Max Items",
                        "minimum": 1,
                        "maximum": 1000000,
                        "type": "integer",
                        "description": "Free users: Limited to 10 items (preview). Paid users: Optional, max 1,000,000"
                    },
                    "mode": {
                        "title": "Lookup Mode",
                        "enum": [
                            "cid",
                            "name",
                            "smiles",
                            "inchikey"
                        ],
                        "type": "string",
                        "description": "How identifiers are interpreted. CID = numeric PubChem ID, NAME = common chemical name, SMILES = structure string, INCHIKEY = hashed InChI."
                    },
                    "identifiers": {
                        "title": "Identifiers",
                        "type": "array",
                        "description": "List of identifiers to look up. Examples: ['2244','3672','1983'] for CIDs, or ['aspirin','ibuprofen','caffeine'] for names.",
                        "items": {
                            "type": "string"
                        }
                    },
                    "properties": {
                        "title": "Properties to Fetch",
                        "uniqueItems": true,
                        "type": "array",
                        "description": "Which compound properties to retrieve. Default selection covers core chemistry. Add more for advanced cheminformatics.",
                        "items": {
                            "type": "string",
                            "enum": [
                                "MolecularFormula",
                                "MolecularWeight",
                                "CanonicalSMILES",
                                "IsomericSMILES",
                                "InChI",
                                "InChIKey",
                                "IUPACName",
                                "Title",
                                "XLogP",
                                "ExactMass",
                                "MonoisotopicMass",
                                "TPSA",
                                "Complexity",
                                "Charge",
                                "HBondDonorCount",
                                "HBondAcceptorCount",
                                "RotatableBondCount",
                                "HeavyAtomCount",
                                "AtomStereoCount",
                                "BondStereoCount",
                                "CovalentUnitCount",
                                "Volume3D",
                                "FeatureCount3D"
                            ]
                        }
                    },
                    "includeSynonyms": {
                        "title": "Include Synonyms",
                        "type": "boolean",
                        "description": "Also fetch the list of common names and synonyms for each compound.",
                        "default": true
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
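
The `inputSchema` above can be checked on the client side before a run is started. The following is a minimal sketch, not part of the Actor itself: the helper name `build_run_input` and its parameters are hypothetical, and the checks only mirror the constraints stated in the schema (the `mode` enum, the `properties` enum with `uniqueItems`, and the `maxItems` range).

```python
from typing import Any, Iterable, Optional

# Values copied from the "mode" and "properties" enums in inputSchema.
ALLOWED_MODES = {"cid", "name", "smiles", "inchikey"}
ALLOWED_PROPERTIES = {
    "MolecularFormula", "MolecularWeight", "CanonicalSMILES", "IsomericSMILES",
    "InChI", "InChIKey", "IUPACName", "Title", "XLogP", "ExactMass",
    "MonoisotopicMass", "TPSA", "Complexity", "Charge", "HBondDonorCount",
    "HBondAcceptorCount", "RotatableBondCount", "HeavyAtomCount",
    "AtomStereoCount", "BondStereoCount", "CovalentUnitCount", "Volume3D",
    "FeatureCount3D",
}


def build_run_input(
    mode: str,
    identifiers: Iterable[Any],
    properties: Optional[list] = None,
    max_items: Optional[int] = None,
    include_synonyms: bool = True,
) -> dict:
    """Hypothetical helper: build an input dict that satisfies inputSchema."""
    if mode not in ALLOWED_MODES:
        raise ValueError(f"mode must be one of {sorted(ALLOWED_MODES)}")

    run_input = {
        "mode": mode,
        # The schema declares identifiers as an array of strings,
        # so numeric CIDs are converted to strings here.
        "identifiers": [str(item) for item in identifiers],
        "includeSynonyms": include_synonyms,
    }

    if properties is not None:
        unknown = set(properties) - ALLOWED_PROPERTIES
        if unknown:
            raise ValueError(f"unknown properties: {sorted(unknown)}")
        if len(set(properties)) != len(properties):
            raise ValueError("properties must not contain duplicates")
        run_input["properties"] = list(properties)

    if max_items is not None:
        is_int = isinstance(max_items, int) and not isinstance(max_items, bool)
        if not is_int or not 1 <= max_items <= 1_000_000:
            raise ValueError("maxItems must be an integer from 1 to 1,000,000")
        run_input["maxItems"] = max_items

    return run_input
```

The resulting dictionary could then be passed as the run input, for example through the `run_input` argument of the Python client's `ApifyClient(token).actor("parseforge/pubchem-compound-scraper").call(...)`. Validating locally is a design choice meant to catch typos in property names before a paid run starts. The Actor may apply further checks that the schema does not describe.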
