MyChem.info Drug Annotation Scraper
Pricing
from $2.00 / 1,000 results
Resolve any drug name or InChIKey into a tidy annotation from MyChem.info. Returns DrugBank name and accession, ChEMBL and PubChem ids, UNII, ATC codes, chemical formula, molecular weight, indications, and mechanism classes. Great for drug reference tables and identifier crosswalks.
Developer
ParseForge

💊 MyChem.info Drug Annotation Scraper
🚀 Turn any drug name into a clean annotation record in seconds. Resolve imatinib, aspirin, or a whole therapeutic area into DrugBank, ChEMBL, PubChem, UNII, ATC, formula, weight, indications, and mechanism, all from one keyless source.
🕒 Last updated: 2026-06-05 · 📊 23 fields per record · keyless public API · global drug and compound coverage
MyChem.info is the BioThings drug and chemical knowledge hub that aggregates DrugBank, ChEMBL, PubChem, DrugCentral, UNII, and more behind a single InChIKey-keyed API. This Actor queries MyChem.info, resolves each drug name or InChIKey to its annotation document, and returns a curated subset of the most useful fields instead of the full nested blob.
Coverage spans small molecules, biologics, and investigational compounds that carry a DrugBank annotation. You can look up drugs one by one, paste a batch of names or InChIKeys, or run a free-text query like a therapeutic area to pull back the best-matching annotated compounds.
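As a rough illustration of the lookup described above, the sketch below tells an InChIKey apart from a drug name and builds a matching MyChem.info request URL. The `/v1/chem/<InChIKey>` form matches the `url` field in this Actor's output. The `/v1/query` search with an `_exists_:drugbank` filter is only an assumption about how a DrugBank-restricted search could be expressed, not a description of the Actor's internals, and the helper names are hypothetical.

```python
import re
from urllib.parse import urlencode

MYCHEM_BASE = "https://mychem.info/v1"

# Standard InChIKey layout: 14 letters, 10 letters, 1 letter, hyphen-separated.
INCHIKEY_RE = re.compile(r"^[A-Z]{14}-[A-Z]{10}-[A-Z]$")


def is_inchikey(term: str) -> bool:
    """Return True when the term looks like a standard InChIKey."""
    return bool(INCHIKEY_RE.match(term.strip()))


def lookup_url(term: str, size: int = 1) -> str:
    """Build a MyChem.info URL for one drug name or InChIKey (hypothetical helper)."""
    term = term.strip()
    if is_inchikey(term):
        # Direct annotation lookup, keyed by InChIKey.
        return f"{MYCHEM_BASE}/chem/{term}"
    # Assumption: limit name searches to documents that carry a DrugBank block.
    query = f'"{term}" AND _exists_:drugbank'
    return f"{MYCHEM_BASE}/query?" + urlencode({"q": query, "size": size})


if __name__ == "__main__":
    for term in ["imatinib", "KTUFNOKKBVMGRW-UHFFFAOYSA-N"]:
        print(lookup_url(term))
```

Branching on the InChIKey pattern first keeps exact identifiers on the direct endpoint and reserves the search endpoint for names and free text.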
| 🎯 Target Audience | 💡 Primary Use Cases |
|---|---|
| Pharma and biotech researchers | Build a drug reference table across DrugBank, ChEMBL, and PubChem |
| Data scientists and bioinformaticians | Map drug names to standardized identifiers and ATC codes |
| Clinical and regulatory teams | Pull indications, mechanism classes, and approval status |
| Chemistry and cheminformatics teams | Collect formula, molecular weight, SMILES, and CAS numbers |
📋 What the MyChem.info Drug Annotation Scraper does
- Resolves drug names (for example imatinib, metformin) or InChIKeys to their MyChem.info annotation.
- Runs a free-text search query and returns the best-matching drug-annotated compounds.
- Maps a curated subset of fields from DrugBank, ChEMBL, PubChem, DrugCentral, and UNII.
- Returns standardized identifiers, ATC codes, chemical properties, indications, and mechanism classes.
- Combines a search query and an explicit drug list in a single run when you want both.
🎬 Full Demo (🚧 Coming soon)
⚙️ Input
| Field | Type | Required | Description |
|---|---|---|---|
| `searchQuery` | string | one of two | A free-text term such as a drug name or therapeutic area (for example leukemia, kinase). |
| `drugList` | array | one of two | Drug names or InChIKeys to resolve, one per entry. |
| `maxItems` | integer | no | Cap on how many records are produced. Free plan is limited to 10. |
Provide a searchQuery, a drugList, or both. At least one is required.
```json
{
  "drugList": ["imatinib", "dasatinib", "nilotinib", "aspirin", "metformin"],
  "maxItems": 10
}
```

```json
{
  "searchQuery": "leukemia",
  "maxItems": 25
}
```
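The input rules above can be checked before a run is started. The following is a minimal sketch, assuming the Actor is launched through the `apify-client` Python package. `build_run_input` and `run_actor` are hypothetical helper names, and the Actor id is left as a parameter because this page does not state it.

```python
from typing import Any, Dict, List, Optional


def build_run_input(search_query: Optional[str] = None,
                    drug_list: Optional[List[str]] = None,
                    max_items: Optional[int] = None) -> Dict[str, Any]:
    """Assemble the Actor input, enforcing 'searchQuery, drugList, or both'."""
    run_input: Dict[str, Any] = {}
    if search_query and search_query.strip():
        run_input["searchQuery"] = search_query.strip()
    # Drop blank entries so the list holds one usable name or InChIKey per item.
    cleaned = [d.strip() for d in (drug_list or []) if d and d.strip()]
    if cleaned:
        run_input["drugList"] = cleaned
    if not run_input:
        raise ValueError("Provide a searchQuery, a drugList, or both.")
    if max_items is not None:
        if max_items < 1:
            raise ValueError("maxItems must be a positive integer.")
        run_input["maxItems"] = max_items
    return run_input


def run_actor(token: str, actor_id: str, run_input: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Start a run and return its dataset items (needs `pip install apify-client`)."""
    # Imported lazily so the validation helper works without the package installed.
    from apify_client import ApifyClient

    client = ApifyClient(token)
    run = client.actor(actor_id).call(run_input=run_input)
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


if __name__ == "__main__":
    print(build_run_input(drug_list=["imatinib", "dasatinib"], max_items=10))
```

Validating locally catches an empty input before any run is started.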
⚠️ Good to Know: Only compounds carrying a DrugBank annotation are returned, so very obscure or purely chemical entries may not resolve. A few fields such as ATC codes, PubChem CID, or approval year can be absent for a given drug when the upstream source has no value, and those come back as null rather than being faked.
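Because missing values come back as null, it can help to measure how complete a result set is before joining on a field. The sketch below counts null (or absent) values per field. The function name is hypothetical, and the second record in the demo is invented purely to show the null case.

```python
from collections import Counter
from typing import Any, Dict, Iterable, List


def null_field_counts(records: Iterable[Dict[str, Any]],
                      fields: List[str]) -> Dict[str, int]:
    """Count, per field, how many records have a null or missing value."""
    counts: Counter = Counter()
    for rec in records:
        for field in fields:
            # dict.get returns None for both explicit nulls and absent keys.
            if rec.get(field) is None:
                counts[field] += 1
    return {field: counts[field] for field in fields}


if __name__ == "__main__":
    demo = [
        {"drugName": "Imatinib", "atcCodes": ["L01EA01"], "pubchemCid": "5291"},
        # Invented record, only to illustrate null handling.
        {"drugName": "Hypothetical compound", "atcCodes": None, "pubchemCid": None},
    ]
    print(null_field_counts(demo, ["atcCodes", "pubchemCid"]))
```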
📊 Output
Each record is one drug or compound annotation.
| 🏷 Field | Description |
|---|---|
| 💊 `drugName` | Primary DrugBank name |
| 🧬 `inchikey` | InChIKey, the MyChem.info record id |
| 🆔 `drugbankId` | DrugBank accession (for example DB00619) |
| 📚 `drugbankAccessions` | All DrugBank accession numbers |
| 🧪 `chemblId` | ChEMBL molecule id |
| 🔬 `pubchemCid` | PubChem compound id |
| 🔖 `unii` | FDA UNII code |
| 🧾 `casNumber` | CAS registry number |
| 🏷 `atcCodes` | WHO ATC classification codes |
| 🧫 `moleculeType` | Molecule type, for example Small molecule |
| 📈 `maxPhase` | Highest development phase reached |
| 📅 `firstApprovalYear` | Year of first approval |
| 🗂 `drugGroups` | Status and route flags, for example approved, oral |
| ⚗️ `molecularFormula` | Chemical formula |
| ⚖️ `molecularWeight` | Molecular weight |
| 🧩 `smiles` | SMILES structure string |
| 🩺 `primaryIndication` | Lead indication |
| 📋 `indications` | Indication list |
| 🔬 `mechanism` | Mechanism and pharmacology classes |
| 🌐 `source` | Which input produced the record |
| 🔗 `url` | MyChem.info annotation endpoint |
| 🕒 `scrapedAt` | Timestamp of collection |
| ❌ `error` | Error message, null on success |
Real sample records from a live run:
```json
{"inchikey": "KTUFNOKKBVMGRW-UHFFFAOYSA-N","drugName": "Imatinib","drugbankId": "DB00619","chemblId": "CHEMBL941","pubchemCid": "5291","unii": "BKJ8M8G5HI","casNumber": "152459-95-5","atcCodes": ["L01EA01"],"moleculeType": "Small molecule","maxPhase": 4,"firstApprovalYear": 2001,"drugGroups": ["approved", "oral"],"molecularFormula": "C29H31N7O","molecularWeight": 493.6,"primaryIndication": "Chronic Myelocytic Leukemia Accelerated Phase","mechanism": ["Kinase Inhibitor", "tyrosine kinase inhibitors", "Protein Kinase Inhibitors"],"url": "https://mychem.info/v1/chem/KTUFNOKKBVMGRW-UHFFFAOYSA-N","error": null}
{"inchikey": "ZBNZXTGUTAYRHI-UHFFFAOYSA-N","drugName": "Dasatinib","drugbankId": "DB01254","chemblId": "CHEMBL1421","pubchemCid": "3062316","unii": "X78UG0A0RN","casNumber": "302962-49-8","atcCodes": ["L01EA02"],"moleculeType": "Small molecule","maxPhase": 4,"firstApprovalYear": 2006,"drugGroups": ["approved", "oral"],"molecularFormula": "C22H26ClN7O2S","molecularWeight": 488,"primaryIndication": "Philadelphia Chromosome Positive Chronic Myelocytic Leukemia","mechanism": ["tyrosine kinase inhibitors", "Protein Kinase Inhibitors"],"url": "https://mychem.info/v1/chem/ZBNZXTGUTAYRHI-UHFFFAOYSA-N","error": null}
{"inchikey": "HHZIURLSWUIHRB-UHFFFAOYSA-N","drugName": "Nilotinib","drugbankId": "DB04868","chemblId": "CHEMBL255863","pubchemCid": "644241","unii": "F41401512X","casNumber": "641571-10-0","atcCodes": ["L01EA03"],"moleculeType": "Small molecule","maxPhase": 4,"firstApprovalYear": 2007,"drugGroups": ["approved", "oral"],"molecularFormula": "C28H22F3N7O","molecularWeight": 529.5,"primaryIndication": "Chronic Myelocytic Leukemia Accelerated Phase","mechanism": ["Kinase Inhibitor", "tyrosine kinase inhibitors"],"url": "https://mychem.info/v1/chem/HHZIURLSWUIHRB-UHFFFAOYSA-N","error": null}
```
✨ Why choose this Actor
- One curated record instead of a deeply nested aggregation blob.
- Cross-references DrugBank, ChEMBL, PubChem, DrugCentral, and UNII in a single row.
- Accepts names, InChIKeys, and free-text queries interchangeably.
- Keyless public source, so no API account or token juggling on the source side.
- Null is honest, never invented, so your downstream joins stay clean.
📈 How it compares to alternatives
| Approach | Identifiers | Indications and mechanism | Setup |
|---|---|---|---|
| This Actor | DrugBank, ChEMBL, PubChem, UNII, CAS, ATC in one row | Included | Paste names and run |
| Raw MyChem.info API | Available but deeply nested | Buried in nested blobs | Write your own parser |
| Manual DrugBank lookup | One source at a time | Partial | Slow and manual |
🚀 How to use
- Sign up for a free Apify account.
- Open the MyChem.info Drug Annotation Scraper.
- Enter a `searchQuery`, a `drugList` of names or InChIKeys, or both.
- Set `maxItems` if you want to cap the run, then start the Actor.
- Collect your results from the dataset once the run finishes.
💼 Business use cases
Pharma competitive intelligence
| Goal | How this helps |
|---|---|
| Track approved drug classes | Pull ATC codes and approval years across a target list |
| Benchmark mechanisms | Compare mechanism classes for a therapeutic area |
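One way to track drug classes is to group records by a prefix of their ATC codes. ATC codes are hierarchical, with the five levels ending at characters 1, 3, 4, 5, and 7, so `L01EA01` belongs to `L01EA` at level 4 and `L01` at level 2. The sketch below assumes records shaped like this Actor's output, and the function name is hypothetical.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, List

# Number of characters that make up each ATC level (1 = anatomical main group).
ATC_LEVEL_LENGTHS = {1: 1, 2: 3, 3: 4, 4: 5, 5: 7}


def group_by_atc(records: Iterable[Dict[str, Any]], level: int = 4) -> Dict[str, List[str]]:
    """Map each ATC prefix at the given level to the drug names that carry it."""
    width = ATC_LEVEL_LENGTHS[level]
    groups: Dict[str, List[str]] = defaultdict(list)
    for rec in records:
        # atcCodes may be null when the upstream source has no value.
        for code in rec.get("atcCodes") or []:
            if len(code) < width:
                continue  # code is too short to have this level
            name = rec.get("drugName")
            if name not in groups[code[:width]]:
                groups[code[:width]].append(name)
    return dict(groups)


if __name__ == "__main__":
    sample = [
        {"drugName": "Imatinib", "atcCodes": ["L01EA01"]},
        {"drugName": "Dasatinib", "atcCodes": ["L01EA02"]},
        {"drugName": "Nilotinib", "atcCodes": ["L01EA03"]},
    ]
    print(group_by_atc(sample, level=4))
```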
Data engineering and reference data
| Goal | How this helps |
|---|---|
| Build an identifier crosswalk | Map names to DrugBank, ChEMBL, PubChem, and UNII |
| Enrich an existing catalog | Add formula, weight, and CAS to your records |
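An identifier crosswalk can be produced by projecting each record onto its identifier columns. This is a minimal sketch using only the standard library. The column selection and the function name are illustrative choices, and null identifiers are written as empty cells rather than filled in.

```python
import csv
import io
from typing import Any, Dict, Iterable

CROSSWALK_FIELDS = ["drugName", "inchikey", "drugbankId", "chemblId",
                    "pubchemCid", "unii", "casNumber"]


def crosswalk_csv(records: Iterable[Dict[str, Any]]) -> str:
    """Render the identifier columns of successful records as CSV text."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=CROSSWALK_FIELDS, lineterminator="\n")
    writer.writeheader()
    for rec in records:
        if rec.get("error"):
            continue  # skip records that failed to resolve
        # Keep nulls as empty cells so downstream joins see them as missing.
        writer.writerow({f: "" if rec.get(f) is None else rec[f]
                         for f in CROSSWALK_FIELDS})
    return buf.getvalue()


if __name__ == "__main__":
    imatinib = {
        "inchikey": "KTUFNOKKBVMGRW-UHFFFAOYSA-N", "drugName": "Imatinib",
        "drugbankId": "DB00619", "chemblId": "CHEMBL941", "pubchemCid": "5291",
        "unii": "BKJ8M8G5HI", "casNumber": "152459-95-5", "error": None,
    }
    print(crosswalk_csv([imatinib]))
```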
Clinical and regulatory research
| Goal | How this helps |
|---|---|
| Review indications | Read primary and full indication lists per drug |
| Check development status | Use max phase and approval year as filters |
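Filtering on development status could look like the sketch below, which keeps records at or above a minimum `maxPhase` and first approved in or after a given year. The function name and thresholds are hypothetical. Dropping records whose filter field is null is a deliberate choice here, so missing data is never treated as a match.

```python
from typing import Any, Dict, Iterable, List, Optional


def filter_by_status(records: Iterable[Dict[str, Any]],
                     min_phase: Optional[int] = None,
                     approved_from: Optional[int] = None) -> List[Dict[str, Any]]:
    """Keep records meeting the phase and approval-year thresholds."""
    kept = []
    for rec in records:
        phase = rec.get("maxPhase")
        year = rec.get("firstApprovalYear")
        # A null value fails any filter that is actually applied.
        if min_phase is not None and (phase is None or phase < min_phase):
            continue
        if approved_from is not None and (year is None or year < approved_from):
            continue
        kept.append(rec)
    return kept


if __name__ == "__main__":
    sample = [
        {"drugName": "Imatinib", "maxPhase": 4, "firstApprovalYear": 2001},
        {"drugName": "Dasatinib", "maxPhase": 4, "firstApprovalYear": 2006},
        {"drugName": "Nilotinib", "maxPhase": 4, "firstApprovalYear": 2007},
    ]
    recent = filter_by_status(sample, min_phase=4, approved_from=2006)
    print([r["drugName"] for r in recent])
```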
Cheminformatics
| Goal | How this helps |
|---|---|
| Seed a structure dataset | Collect SMILES and molecular properties |
| Standardize compound names | Resolve free text to canonical InChIKeys |
🔌 Automating MyChem.info Drug Annotation Scraper
Connect runs to the tools your team already uses:
- Make and Zapier to trigger runs and route records into other apps.
- Slack to post a summary when a run finishes.
- Airbyte to load results into a warehouse.
- GitHub Actions to schedule recurring pulls.
- Google Drive to archive each run output for your team.
🌟 Beyond business use cases
- Research: assemble a tidy drug reference table for a literature review.
- Personal: look up the formula, weight, and class of a medication you are curious about.
- Non-profit: support patient education resources with standardized drug facts.
- Experimentation: prototype a chatbot that answers questions about drug identifiers.
🤖 Ask an AI assistant
Paste your results into ChatGPT, Claude, Perplexity, or Copilot and ask it to summarize mechanisms, group drugs by ATC class, or spot gaps in your reference table.
❓ Frequently Asked Questions
Is MyChem.info free to query? Yes, MyChem.info is a keyless public BioThings API. This Actor adds resolution, curation, and clean output on top.
What can I put in the drug list? Drug names such as imatinib or metformin, or InChIKeys such as KTUFNOKKBVMGRW-UHFFFAOYSA-N. Both are accepted in the same list.
What does the search query return? It returns the best-matching compounds that carry a DrugBank annotation, so you get usable drug records rather than packaging entries.
Why is a field sometimes null? The upstream source had no value for that drug. Nulls are kept as null and never invented.
Which sources are combined? DrugBank, ChEMBL, PubChem, DrugCentral, and UNII, all keyed by InChIKey inside MyChem.info.
How is a name resolved to a record? The Actor searches MyChem.info for the name among drug-annotated compounds and selects the top match.
Can I pull a whole therapeutic area? Yes, use a search query like leukemia or kinase and raise maxItems to collect more compounds.
Does it return chemical structure? Yes, each record includes a SMILES string plus molecular formula and weight when available.
What is the InChIKey used for? It is the stable record id in MyChem.info and a portable key for joining across chemical datasets.
How many records can I get? Free plan runs are limited to 10. Paid plans can collect up to 1,000,000.
🔌 Integrate with any app
Every run writes to a structured dataset you can pull through the Apify API or connect to your stack with the integrations above.
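For pulling a finished run's records without a client library, the sketch below builds the Apify API dataset-items URL and fetches it with the standard library. The dataset id and token are placeholders you supply, the helper names are hypothetical, and sending the token as a bearer header is one of the authentication options the Apify API accepts.

```python
import json
from typing import Any, Dict, List, Optional
from urllib.parse import urlencode
from urllib.request import Request, urlopen

APIFY_API_BASE = "https://api.apify.com/v2"


def dataset_items_url(dataset_id: str, fmt: str = "json",
                      clean: bool = True, limit: Optional[int] = None) -> str:
    """Build the URL that lists the items of one dataset."""
    params: Dict[str, str] = {"format": fmt}
    if clean:
        params["clean"] = "true"  # omit empty items and hidden fields
    if limit is not None:
        params["limit"] = str(limit)
    return f"{APIFY_API_BASE}/datasets/{dataset_id}/items?{urlencode(params)}"


def fetch_items(dataset_id: str, token: str,
                limit: Optional[int] = None) -> List[Dict[str, Any]]:
    """Download dataset items as parsed JSON (performs a network request)."""
    request = Request(dataset_items_url(dataset_id, limit=limit),
                      headers={"Authorization": f"Bearer {token}"})
    with urlopen(request, timeout=30) as response:
        return json.load(response)


if __name__ == "__main__":
    # Placeholder id, shown only to illustrate the URL shape.
    print(dataset_items_url("YOUR_DATASET_ID", limit=5))
```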
🔗 Recommended Actors
- deps.dev Package Insights Scraper for open source package metadata.
- Libraries.io Scraper for cross-ecosystem library data.
- CRAN R Packages Scraper for the R package universe.
💡 Pro Tip: browse the complete ParseForge collection.
🆘 Need Help? Open our contact form
⚠️ Disclaimer: independent tool, not affiliated with MyChem.info or BioThings. Only publicly available data collected.