MyChem.info Drug Annotation Scraper

Pricing

from $2.00 / 1,000 results

Resolve any drug name or InChIKey into a tidy annotation from MyChem.info. Returns DrugBank name and accession, ChEMBL and PubChem ids, UNII, ATC codes, chemical formula, molecular weight, indications, and mechanism classes. Great for drug reference tables and identifier crosswalks.

Developer: ParseForge · Maintained by Community

Actor stats: 0 bookmarked · 2 total users · 1 monthly active user · last modified 9 days ago

💊 MyChem.info Drug Annotation Scraper

🚀 Turn any drug name into a clean annotation record in seconds. Resolve imatinib, aspirin, or a whole therapeutic area into DrugBank, ChEMBL, PubChem, UNII, ATC, formula, weight, indications, and mechanism, all from one keyless source.

🕒 Last updated: 2026-06-05 · 📊 23 fields per record · keyless public API · global drug and compound coverage

MyChem.info is the BioThings drug and chemical knowledge hub that aggregates DrugBank, ChEMBL, PubChem, DrugCentral, UNII, and more behind a single InChIKey-keyed API. This Actor queries MyChem.info, resolves each drug name or InChIKey to its annotation document, and returns a curated subset of the most useful fields instead of the full nested blob.

Coverage spans small molecules, biologics, and investigational compounds that carry a DrugBank annotation. You can look up drugs one by one, paste a batch of names or InChIKeys, or run a free-text query like a therapeutic area to pull back the best-matching annotated compounds.

| 🎯 Target Audience | 💡 Primary Use Cases |
| --- | --- |
| Pharma and biotech researchers | Build a drug reference table across DrugBank, ChEMBL, and PubChem |
| Data scientists and bioinformaticians | Map drug names to standardized identifiers and ATC codes |
| Clinical and regulatory teams | Pull indications, mechanism classes, and approval status |
| Chemistry and cheminformatics teams | Collect formula, molecular weight, SMILES, and CAS numbers |

📋 What the MyChem.info Drug Annotation Scraper does

  • Resolves drug names (for example imatinib, metformin) or InChIKeys to their MyChem.info annotation.
  • Runs a free-text search query and returns the best-matching drug-annotated compounds.
  • Maps a curated subset of fields from DrugBank, ChEMBL, PubChem, DrugCentral, and UNII.
  • Returns standardized identifiers, ATC codes, chemical properties, indications, and mechanism classes.
  • Combines a search query and an explicit drug list in a single run when you want both.

🎬 Full Demo (🚧 Coming soon)

⚙️ Input

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| searchQuery | string | one of two | A free-text term such as a drug name or therapeutic area (for example leukemia, kinase). |
| drugList | array | one of two | Drug names or InChIKeys to resolve, one per entry. |
| maxItems | integer | no | Cap on how many records are produced. Free plan is limited to 10. |

Provide a searchQuery, a drugList, or both. At least one is required.

{
  "drugList": ["imatinib", "dasatinib", "nilotinib", "aspirin", "metformin"],
  "maxItems": 10
}

{
  "searchQuery": "leukemia",
  "maxItems": 25
}
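The one-of-two rule can be checked before a run is started. The following is a minimal sketch, not the Actor's own validation: `build_input` is a hypothetical helper, and only the three field names come from the input table above.

```python
from typing import Any, Dict, List, Optional


def build_input(search_query: Optional[str] = None,
                drug_list: Optional[List[str]] = None,
                max_items: Optional[int] = None) -> Dict[str, Any]:
    """Assemble an input object and enforce 'searchQuery, drugList, or both'."""
    # Drop empty entries and surrounding whitespace from the drug list.
    names = [d.strip() for d in (drug_list or []) if d and d.strip()]
    query = (search_query or "").strip()
    if not query and not names:
        raise ValueError("Provide a searchQuery, a drugList, or both.")

    payload: Dict[str, Any] = {}
    if query:
        payload["searchQuery"] = query
    if names:
        payload["drugList"] = names
    if max_items is not None:
        if max_items < 1:
            raise ValueError("maxItems must be a positive integer.")
        payload["maxItems"] = max_items
    return payload
```

Calling `build_input(search_query="leukemia", max_items=25)` reproduces the second example above, while calling it with no arguments raises a `ValueError`.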

⚠️ Good to Know: Only compounds carrying a DrugBank annotation are returned, so very obscure or purely chemical entries may not resolve. A few fields such as ATC codes, PubChem CID, or approval year can be absent for a given drug when the upstream source has no value, and those come back as null rather than being faked.
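Because absent values arrive as null, downstream code should test for `None` explicitly instead of assuming every field is populated. A small sketch of a null-coverage check follows; the helper names are hypothetical, and the default field list simply mirrors the three fields named in the note above.

```python
from collections import Counter

# Fields the note above calls out as sometimes absent upstream.
OPTIONAL_FIELDS = ("atcCodes", "pubchemCid", "firstApprovalYear")


def missing_fields(record, fields=OPTIONAL_FIELDS):
    """Return the names of the given fields that are null (or absent) in one record."""
    return [name for name in fields if record.get(name) is None]


def null_report(records, fields=OPTIONAL_FIELDS):
    """Count, per field, how many records have no value."""
    counts = Counter()
    for rec in records:
        counts.update(missing_fields(rec, fields))
    return dict(counts)
```

Running `null_report` over a dataset before a join shows which identifier columns are safe to use as keys.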

📊 Output

Each record is one drug or compound annotation.

| 🏷 Field | Description |
| --- | --- |
| 💊 drugName | Primary DrugBank name |
| 🧬 inchikey | InChIKey, the MyChem.info record id |
| 🆔 drugbankId | DrugBank accession (for example DB00619) |
| 📚 drugbankAccessions | All DrugBank accession numbers |
| 🧪 chemblId | ChEMBL molecule id |
| 🔬 pubchemCid | PubChem compound id |
| 🔖 unii | FDA UNII code |
| 🧾 casNumber | CAS registry number |
| 🏷 atcCodes | WHO ATC classification codes |
| 🧫 moleculeType | Molecule type, for example Small molecule |
| 📈 maxPhase | Highest development phase reached |
| 📅 firstApprovalYear | Year of first approval |
| 🗂 drugGroups | Status and route flags, for example approved, oral |
| ⚗️ molecularFormula | Chemical formula |
| ⚖️ molecularWeight | Molecular weight |
| 🧩 smiles | SMILES structure string |
| 🩺 primaryIndication | Lead indication |
| 📋 indications | Indication list |
| 🔬 mechanism | Mechanism and pharmacology classes |
| 🌐 source | Which input produced the record |
| 🔗 url | MyChem.info annotation endpoint |
| 🕒 scrapedAt | Timestamp of collection |
| error | Error message, null on success |

Sample records from a live run (only a subset of the fields listed above is shown):

{
  "inchikey": "KTUFNOKKBVMGRW-UHFFFAOYSA-N",
  "drugName": "Imatinib",
  "drugbankId": "DB00619",
  "chemblId": "CHEMBL941",
  "pubchemCid": "5291",
  "unii": "BKJ8M8G5HI",
  "casNumber": "152459-95-5",
  "atcCodes": ["L01EA01"],
  "moleculeType": "Small molecule",
  "maxPhase": 4,
  "firstApprovalYear": 2001,
  "drugGroups": ["approved", "oral"],
  "molecularFormula": "C29H31N7O",
  "molecularWeight": 493.6,
  "primaryIndication": "Chronic Myelocytic Leukemia Accelerated Phase",
  "mechanism": ["Kinase Inhibitor", "tyrosine kinase inhibitors", "Protein Kinase Inhibitors"],
  "url": "https://mychem.info/v1/chem/KTUFNOKKBVMGRW-UHFFFAOYSA-N",
  "error": null
}

{
  "inchikey": "ZBNZXTGUTAYRHI-UHFFFAOYSA-N",
  "drugName": "Dasatinib",
  "drugbankId": "DB01254",
  "chemblId": "CHEMBL1421",
  "pubchemCid": "3062316",
  "unii": "X78UG0A0RN",
  "casNumber": "302962-49-8",
  "atcCodes": ["L01EA02"],
  "moleculeType": "Small molecule",
  "maxPhase": 4,
  "firstApprovalYear": 2006,
  "drugGroups": ["approved", "oral"],
  "molecularFormula": "C22H26ClN7O2S",
  "molecularWeight": 488,
  "primaryIndication": "Philadelphia Chromosome Positive Chronic Myelocytic Leukemia",
  "mechanism": ["tyrosine kinase inhibitors", "Protein Kinase Inhibitors"],
  "url": "https://mychem.info/v1/chem/ZBNZXTGUTAYRHI-UHFFFAOYSA-N",
  "error": null
}

{
  "inchikey": "HHZIURLSWUIHRB-UHFFFAOYSA-N",
  "drugName": "Nilotinib",
  "drugbankId": "DB04868",
  "chemblId": "CHEMBL255863",
  "pubchemCid": "644241",
  "unii": "F41401512X",
  "casNumber": "641571-10-0",
  "atcCodes": ["L01EA03"],
  "moleculeType": "Small molecule",
  "maxPhase": 4,
  "firstApprovalYear": 2007,
  "drugGroups": ["approved", "oral"],
  "molecularFormula": "C28H22F3N7O",
  "molecularWeight": 529.5,
  "primaryIndication": "Chronic Myelocytic Leukemia Accelerated Phase",
  "mechanism": ["Kinase Inhibitor", "tyrosine kinase inhibitors"],
  "url": "https://mychem.info/v1/chem/HHZIURLSWUIHRB-UHFFFAOYSA-N",
  "error": null
}
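Records shaped like the samples above flatten naturally into an identifier crosswalk. The sketch below is one possible approach using only the standard library; `to_crosswalk_csv` and the column selection are illustrative choices, not part of the Actor.

```python
import csv
import io

# Identifier columns taken from the output field list.
CROSSWALK_FIELDS = ["inchikey", "drugName", "drugbankId", "chemblId",
                    "pubchemCid", "unii", "casNumber"]


def to_crosswalk_csv(records):
    """Render successful records as CSV text, one row per compound."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=CROSSWALK_FIELDS, lineterminator="\n")
    writer.writeheader()
    for rec in records:
        if rec.get("error"):
            continue  # skip records that failed to resolve
        # Null identifiers become empty cells rather than the string "None".
        writer.writerow({f: "" if rec.get(f) is None else rec[f]
                         for f in CROSSWALK_FIELDS})
    return buf.getvalue()
```

Keeping `inchikey` as the first column makes it easy to join the result against other InChIKey-keyed chemical datasets.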

✨ Why choose this Actor

  • One curated record instead of a deeply nested aggregation blob.
  • Cross-references DrugBank, ChEMBL, PubChem, DrugCentral, and UNII in a single row.
  • Accepts names, InChIKeys, and free-text queries interchangeably.
  • Keyless public source, so no API account or token juggling on the source side.
  • Missing values stay null and are never invented, so your downstream joins stay clean.

📈 How it compares to alternatives

| Approach | Identifiers | Indications and mechanism | Setup |
| --- | --- | --- | --- |
| This Actor | DrugBank, ChEMBL, PubChem, UNII, CAS, ATC in one row | Included | Paste names and run |
| Raw MyChem.info API | Available but deeply nested | Buried in nested blobs | Write your own parser |
| Manual DrugBank lookup | One source at a time | Partial | Slow and manual |

🚀 How to use

  1. Sign up for a free Apify account.
  2. Open the MyChem.info Drug Annotation Scraper.
  3. Enter a searchQuery, a drugList of names or InChIKeys, or both.
  4. Set maxItems if you want to cap the run, then start the Actor.
  5. Collect your results from the dataset once the run finishes.

💼 Business use cases

Pharma competitive intelligence

| Goal | How this helps |
| --- | --- |
| Track approved drug classes | Pull ATC codes and approval years across a target list |
| Benchmark mechanisms | Compare mechanism classes for a therapeutic area |
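Tracking drug classes usually means grouping records by an ATC prefix. ATC codes are hierarchical: the first character is the anatomical main group, the first 3 characters the therapeutic subgroup, the first 5 the chemical subgroup, and all 7 the substance. A minimal sketch, with `group_by_atc` as a hypothetical helper operating on the `atcCodes` and `drugName` output fields:

```python
from collections import defaultdict


def group_by_atc(records, prefix_len=5):
    """Map each ATC prefix of the given length to the drug names carrying it."""
    groups = defaultdict(list)
    for rec in records:
        # atcCodes may be null when the upstream source has no value.
        for code in rec.get("atcCodes") or []:
            prefix = code[:prefix_len]
            name = rec.get("drugName")
            if name not in groups[prefix]:  # avoid listing a drug twice per class
                groups[prefix].append(name)
    return dict(groups)
```

With the sample records, a prefix length of 5 places imatinib, dasatinib, and nilotinib together under `L01EA`.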

Data engineering and reference data

| Goal | How this helps |
| --- | --- |
| Build an identifier crosswalk | Map names to DrugBank, ChEMBL, PubChem, and UNII |
| Enrich an existing catalog | Add formula, weight, and CAS to your records |

Clinical and regulatory research

| Goal | How this helps |
| --- | --- |
| Review indications | Read primary and full indication lists per drug |
| Check development status | Use max phase and approval year as filters |
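Filtering on development status has to account for null values in either field. The sketch below assumes that a `maxPhase` of 4 denotes an approved drug, which matches the sample records (all three are phase 4 and flagged approved) but is not stated explicitly by the source; `approved_since` is a hypothetical helper.

```python
def approved_since(records, year, min_phase=4):
    """Keep records at or above min_phase whose first approval is in or after year."""
    kept = []
    for rec in records:
        phase = rec.get("maxPhase")
        approved = rec.get("firstApprovalYear")
        # Records with a null phase or approval year cannot be judged; skip them.
        if phase is None or approved is None:
            continue
        if phase >= min_phase and approved >= year:
            kept.append(rec)
    return kept
```

Applied to the three sample records with `year=2005`, this keeps dasatinib (2006) and nilotinib (2007) and drops imatinib (2001).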

Cheminformatics

| Goal | How this helps |
| --- | --- |
| Seed a structure dataset | Collect SMILES and molecular properties |
| Standardize compound names | Resolve free text to canonical InChIKeys |

🔌 Automating MyChem.info Drug Annotation Scraper

Connect runs to the tools your team already uses:

  • Make and Zapier to trigger runs and route records into other apps.
  • Slack to post a summary when a run finishes.
  • Airbyte to load results into a warehouse.
  • GitHub Actions to schedule recurring pulls.
  • Google Drive to archive each run output for your team.

🌟 Beyond business use cases

  • Research: assemble a tidy drug reference table for a literature review.
  • Personal: look up the formula, weight, and class of a medication you are curious about.
  • Non-profit: support patient education resources with standardized drug facts.
  • Experimentation: prototype a chatbot that answers questions about drug identifiers.

🤖 Ask an AI assistant

Paste your results into ChatGPT, Claude, Perplexity, or Copilot and ask it to summarize mechanisms, group drugs by ATC class, or spot gaps in your reference table.

❓ Frequently Asked Questions

Is MyChem.info free to query? Yes, MyChem.info is a keyless public BioThings API. This Actor adds resolution, curation, and clean output on top.

What can I put in the drug list? Drug names such as imatinib or metformin, or InChIKeys such as KTUFNOKKBVMGRW-UHFFFAOYSA-N. Both are accepted in the same list.
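If you want to see which entries in a mixed list are InChIKeys before submitting it, a pattern check is enough. This is a sketch of one way to do it, not a description of how the Actor tells them apart; it relies on the standard InChIKey layout of 14, 10, and 1 uppercase letters separated by hyphens.

```python
import re

# Standard InChIKey shape: 14 letters, hyphen, 10 letters, hyphen, 1 letter.
INCHIKEY_RE = re.compile(r"^[A-Z]{14}-[A-Z]{10}-[A-Z]$")


def split_entries(drug_list):
    """Separate a mixed list into (inchikeys, names), ignoring blank entries."""
    keys, names = [], []
    for entry in drug_list:
        text = entry.strip()
        if not text:
            continue
        if INCHIKEY_RE.match(text.upper()):
            keys.append(text.upper())
        else:
            names.append(text)
    return keys, names
```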

What does the search query return? It returns the best-matching compounds that carry a DrugBank annotation, so you get usable drug records rather than packaging entries.

Why is a field sometimes null? The upstream source had no value for that drug. Nulls are kept as null and never invented.

Which sources are combined? DrugBank, ChEMBL, PubChem, DrugCentral, and UNII, all keyed by InChIKey inside MyChem.info.

How is a name resolved to a record? The Actor searches MyChem.info for the name among drug-annotated compounds and selects the top match.
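For readers who want to reproduce a lookup by hand, the sketch below builds the two kinds of MyChem.info URLs involved. The `/v1/chem/<InChIKey>` form is taken from the `url` field in the sample records. The search URL is an assumption: it uses the public `/v1/query` endpoint with a `drugbank.name` field query, which may differ from the query the Actor actually sends.

```python
from urllib.parse import quote, urlencode

MYCHEM_BASE = "https://mychem.info/v1"


def annotation_url(inchikey):
    """URL of the annotation document for one InChIKey (same form as the url field)."""
    return f"{MYCHEM_BASE}/chem/{quote(inchikey)}"


def name_query_url(name, size=1):
    """Search URL for a drug name; size=1 asks for the top match only.

    Assumption: restricting the query to drugbank.name approximates
    'search among drug-annotated compounds'.
    """
    params = {
        "q": f'drugbank.name:"{name}"',
        "fields": "drugbank",
        "size": size,
    }
    return f"{MYCHEM_BASE}/query?{urlencode(params)}"
```

Fetching the query URL and reading the `_id` of the first hit would give the InChIKey to pass to `annotation_url`.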

Can I pull a whole therapeutic area? Yes, use a search query like leukemia or kinase and raise maxItems to collect more compounds.

Does it return chemical structure? Yes, each record includes a SMILES string plus molecular formula and weight when available.

What is the InChIKey used for? It is the stable record id in MyChem.info and a portable key for joining across chemical datasets.

How many records can I get? Free plan runs are limited to 10 records. Paid plans can collect up to 1,000,000.

🔌 Integrate with any app

Every run writes to a structured dataset you can pull through the Apify API or connect to your stack with the integrations above.
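As a rough sketch of pulling results through the Apify API, the code below starts a run and reads its dataset items in one call using the `run-sync-get-dataset-items` endpoint. The Actor ID and token are placeholders you must supply (this page does not state the Actor's ID), and the helper names are hypothetical.

```python
import json
import urllib.request

APIFY_API = "https://api.apify.com/v2"


def build_run_request(actor_id, token, run_input):
    """Prepare a POST that runs an Actor and returns its dataset items."""
    # In API paths the Actor ID is written "username~actor-name".
    path_id = actor_id.replace("/", "~")
    url = f"{APIFY_API}/acts/{path_id}/run-sync-get-dataset-items"
    return urllib.request.Request(
        url,
        data=json.dumps(run_input).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def run_actor(actor_id, token, run_input, timeout=300):
    """Send the request and return the dataset items as a list of dicts."""
    request = build_run_request(actor_id, token, run_input)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

A call would look like `run_actor("<username>/<actor-name>", "<APIFY_TOKEN>", {"drugList": ["imatinib"], "maxItems": 10})`. For long runs, starting the run asynchronously and reading the dataset afterwards is the safer pattern, since the synchronous endpoint holds the connection open until the run finishes.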

💡 Pro Tip: browse the complete ParseForge collection.

🆘 Need Help? Open our contact form

⚠️ Disclaimer: independent tool, not affiliated with MyChem.info or BioThings. Only publicly available data collected.