PubMed Citation Scraper
Pricing
Pay per event
Automate collection of detailed citation information from the world's largest biomedical literature database. Extract complete citation data including titles, authors, abstracts, publication dates, journals, DOIs, MeSH terms, and more from NCBI's PubMed database.
Developer

ParseForge
🚀 Automatically collect comprehensive biomedical literature citations from PubMed with our powerful data extraction tool.
Designed for researchers, medical professionals, and academics, this tool pulls detailed citation information from PubMed—the world's largest biomedical literature database maintained by the National Center for Biotechnology Information (NCBI). Get critical data like titles, authors, abstracts, publication dates, journals, DOIs, MeSH terms, and more, all with no coding required.
Target Audience: Researchers, medical professionals, academics, pharmaceutical companies, healthcare institutions, students, librarians
Primary Use Cases: Literature reviews, research analysis, citation tracking, systematic reviews, meta-analyses, academic research, medical research
What Does PubMed Citation Scraper Do?
This tool collects comprehensive citation data from PubMed, supporting both search queries and advanced filtering options. It delivers:
- Complete citation information including PMID, title, and URL
- Author details with affiliations and contact information
- Journal information including full name, ISO abbreviation, and ISSN
- Publication dates and full date breakdowns
- Volume, issue, and page numbers
- Complete abstracts with structured sections
- Digital Object Identifiers (DOIs) and PMC IDs
- MeSH terms for medical subject indexing
- Keywords and publication types
- Grant funding information
- Language and country data
- And more
Business Value: Save countless hours of manual literature searching and citation collection. Build comprehensive research databases, track publication trends, identify research gaps, and stay current with the latest biomedical research—all automatically.
Input
To start PubMed web scraping, simply fill in the input form. You can scrape PubMed based on:
- searchTerm - Enter your PubMed search query using PubMed search syntax (e.g., "cancer AND therapy", "Smith J[Author]", "Nature[Journal]")
- startUrl - Provide a direct PubMed search URL if you've already created a search on the PubMed website
- dateFrom - Filter by publication date from (format: YYYY/MM/DD or YYYY)
- dateTo - Filter by publication date to (format: YYYY/MM/DD or YYYY)
- publicationType - Filter by publication type (e.g., Review, Clinical Trial, Meta-Analysis, Case Reports)
- journal - Filter by specific journal name (e.g., Nature, Science, The Lancet)
- author - Filter by author name (format: LastName FirstInitial, e.g., "Smith J")
- sort - Sort results by relevance, publication date, first author, or journal
- maxItems - Set the maximum number of citations to collect (free users: max 50, paid users: up to 1,000,000)
Here is an example of a filled-out input, written in JSON:
{
  "searchTerm": "cancer AND therapy",
  "dateFrom": "2020",
  "dateTo": "2023",
  "publicationType": "Review",
  "sort": "pub_date",
  "maxItems": 100
}
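If you generate inputs from a script rather than the form, a small helper can catch format mistakes before a run starts. The sketch below is only an illustration: the field names come from the list above, while the `build_input` helper, the date check, and the lower bound of 1 on `maxItems` are assumptions, not part of the Actor.

```python
import json
import re

# Accepts "YYYY" or "YYYY/MM/DD", the two date formats the input list mentions.
DATE_PATTERN = re.compile(r"^\d{4}(/\d{2}/\d{2})?$")


def build_input(search_term, date_from=None, date_to=None,
                publication_type=None, sort=None, max_items=50):
    """Assemble an input dict (hypothetical helper, not part of the Actor)."""
    for label, value in (("dateFrom", date_from), ("dateTo", date_to)):
        if value is not None and not DATE_PATTERN.match(value):
            raise ValueError(f"{label} must be YYYY or YYYY/MM/DD, got {value!r}")
    # Upper bound taken from the maxItems description; lower bound is assumed.
    if not 1 <= max_items <= 1_000_000:
        raise ValueError("maxItems must be between 1 and 1,000,000")

    candidate = {
        "searchTerm": search_term,
        "dateFrom": date_from,
        "dateTo": date_to,
        "publicationType": publication_type,
        "sort": sort,
        "maxItems": max_items,
    }
    # Drop unset filters so only the chosen fields are sent.
    return {key: value for key, value in candidate.items() if value is not None}


run_input = build_input(
    "cancer AND therapy",
    date_from="2020",
    date_to="2023",
    publication_type="Review",
    sort="pub_date",
    max_items=100,
)
print(json.dumps(run_input, indent=2))
```

The printed JSON matches the example input above and can be pasted into the input form's JSON view.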
Pro Tip: 💡 Use PubMed's advanced search syntax for more precise results. Combine terms with AND, OR, NOT operators, and use field tags like [Author], [Journal], [Title] for targeted searches.
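For longer queries it can help to assemble the search string in code. This is a minimal sketch with two hypothetical helpers, `tagged` and `combine`, that only do string formatting; the resulting text is what you would put in searchTerm.

```python
def tagged(term, field):
    """Attach a PubMed field tag to a term, e.g. Smith J[Author]."""
    return f"{term}[{field}]"


def combine(operator, *parts):
    """Join sub-queries with AND, OR, or NOT, in parentheses to keep grouping explicit."""
    if operator not in {"AND", "OR", "NOT"}:
        raise ValueError(f"unsupported operator: {operator}")
    return "(" + f" {operator} ".join(parts) + ")"


query = combine(
    "AND",
    combine("OR", tagged("cancer", "Title"), tagged("therapy", "Title")),
    tagged("Smith J", "Author"),
)
print(query)  # ((cancer[Title] OR therapy[Title]) AND Smith J[Author])
```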
Output
After the Actor finishes its run, you'll get a dataset with the output. The size of the dataset depends on the maximum number of results you've set. You can download the results as an Excel, HTML, XML, JSON, or CSV file.
Here's an example of scraped PubMed citation data you'll get if you decide to scrape cancer therapy research:

{
  "pmid": "16988102",
  "title": "Gene therapy for cancer treatment: past, present and future.",
  "url": "https://pubmed.ncbi.nlm.nih.gov/16988102/",
  "authors": ["Cross Deanna", "Burmester James K"],
  "journal": "Clinical medicine & research",
  "journalISOAbbreviation": "Clin Med Res",
  "issn": "1539-4182",
  "publicationDate": "2006/Sep",
  "publicationDateFull": {"year": "2006", "month": "Sep"},
  "volume": "4",
  "issue": "3",
  "pages": "218-27",
  "doi": "10.3121/cmr.4.3.218",
  "pmcId": "PMC2751499",
  "abstract": "The broad field of gene therapy promises a number of innovative treatments...",
  "authorDetails": [
    {
      "lastName": "Cross",
      "foreName": "Deanna",
      "initials": "D",
      "affiliation": "Center for Human Genetics, Marshfield Clinic Research Foundation..."
    }
  ],
  "abstractSections": [{"text": "The broad field of gene therapy promises..."}],
  "meshTerms": ["Clinical Trials as Topic", "Gene Transfer Techniques", "Genetic Therapy", "Neoplasms"],
  "keywords": [],
  "publicationTypes": ["Historical Article", "Journal Article", "Review"],
  "grants": [],
  "language": ["eng"],
  "country": "United States",
  "scrapedTimestamp": "2024-01-15T10:30:00Z"
}
What You Get: Complete citation data with all metadata, author information, abstracts, MeSH terms, and identifiers for comprehensive literature analysis
Download Options: CSV, Excel, or JSON formats for easy analysis in your research tools
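If you download the JSON export, nested fields such as authors and meshTerms usually need flattening before they fit a spreadsheet. The sketch below shows one way to do that with the standard library; the `flatten` and `to_csv` helpers and the choice of columns are assumptions, and the record is an abbreviated copy of the sample output above.

```python
import csv
import io

# Abbreviated copy of the sample record from the Output section.
record = {
    "pmid": "16988102",
    "title": "Gene therapy for cancer treatment: past, present and future.",
    "authors": ["Cross Deanna", "Burmester James K"],
    "journalISOAbbreviation": "Clin Med Res",
    "publicationDateFull": {"year": "2006", "month": "Sep"},
    "doi": "10.3121/cmr.4.3.218",
    "meshTerms": ["Clinical Trials as Topic", "Gene Transfer Techniques",
                  "Genetic Therapy", "Neoplasms"],
}

# Column selection is illustrative; add or remove fields as needed.
FIELDNAMES = ["pmid", "title", "authors", "journal", "year", "doi", "meshTerms"]


def flatten(item):
    """Turn one citation into a flat row; list fields are joined with '; '."""
    return {
        "pmid": item.get("pmid", ""),
        "title": item.get("title", ""),
        "authors": "; ".join(item.get("authors", [])),
        "journal": item.get("journalISOAbbreviation", ""),
        "year": item.get("publicationDateFull", {}).get("year", ""),
        "doi": item.get("doi", ""),
        "meshTerms": "; ".join(item.get("meshTerms", [])),
    }


def to_csv(items):
    """Serialize flattened citations to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDNAMES)
    writer.writeheader()
    writer.writerows(flatten(item) for item in items)
    return buffer.getvalue()


print(to_csv([record]))
```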
Why Choose the PubMed Citation Scraper?
- Comprehensive Data: Get complete citation information including abstracts, MeSH terms, DOIs, and author details in one dataset
- Time Savings: Save 10-20 hours per week compared to manual literature searching and citation collection
- Advanced Filtering: Use PubMed's powerful search syntax with date ranges, publication types, journals, and authors
- Real-time Data: Access the most current biomedical literature available from NCBI's official database
- Structured Output: Get data in organized formats ready for analysis, bibliographic management, and reporting
- No Duplicates: Automatically handles citation deduplication
- Research-Ready: Export to Excel, CSV, or JSON for immediate use in reference managers and research tools
Efficiency: Collect citations in a fraction of the time that manual research takes
How to Use
- Sign Up: Create a free account with $5 credit (takes 2 minutes)
- Find the Scraper: Visit the PubMed Citation Scraper page
- Set Input: Add your search term, filters, and max items
- Run It: Click "Start" and let it collect your data
- Download Data: Get your results in the "Dataset" tab as CSV, Excel, or JSON
Total Time: 5 minutes setup, 15-60 minutes for data collection depending on result size
No Technical Skills Required: Everything is point-and-click
Business Use Cases
Researchers & Academics:
- Conduct comprehensive literature reviews
- Build bibliographic databases for research projects
- Track publication trends in specific fields
- Identify research gaps and opportunities
- Collect citations for systematic reviews and meta-analyses
Medical Professionals:
- Stay current with latest medical research
- Find relevant studies for clinical decision-making
- Track publications in specific medical specialties
- Monitor research from specific institutions or authors
- Build reference libraries for patient care
Pharmaceutical Companies:
- Monitor competitor research and publications
- Track drug development and clinical trial publications
- Identify key opinion leaders and researchers
- Analyze publication trends in therapeutic areas
- Support regulatory submissions with comprehensive literature
Healthcare Institutions:
- Build institutional research databases
- Track faculty and researcher publications
- Monitor research output and impact
- Support grant applications with literature reviews
- Maintain comprehensive medical libraries
Students & Librarians:
- Collect citations for thesis and dissertation research
- Build reference lists for academic papers
- Support library research services
- Create subject-specific bibliographies
- Assist with literature search requests
Using PubMed Citation Scraper with the Apify API
For advanced users who want to automate this process, you can control the scraper programmatically with the Apify API. This allows you to schedule regular data collection and integrate with your existing research tools.
- Node.js: Install the apify-client NPM package
- Python: Use the apify-client PyPI package
- See the Apify API reference for full details
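As a rough illustration of the Python route, the sketch below starts a run with the apify-client package and reads the resulting dataset. The Actor ID is a guess based on the developer's other store URLs and is not confirmed by this page, so check it on the Actor's page. The code reads the API token from an `APIFY_TOKEN` environment variable, which is also an assumption.

```python
import os

# Assumed Actor ID (hypothetical); confirm the real one on the Actor's page.
ACTOR_ID = "parseforge/pubmed-citation-scraper"

RUN_INPUT = {
    "searchTerm": "cancer AND therapy",
    "dateFrom": "2020",
    "dateTo": "2023",
    "publicationType": "Review",
    "sort": "pub_date",
    "maxItems": 100,
}


def run_scraper(token, run_input=RUN_INPUT, actor_id=ACTOR_ID):
    """Start a run, wait for it to finish, and return the dataset items as a list."""
    # Imported lazily so the file loads without the package: pip install apify-client
    from apify_client import ApifyClient

    client = ApifyClient(token)
    run = client.actor(actor_id).call(run_input=run_input)
    if run is None:
        raise RuntimeError("the Actor run did not return a result")
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


token = os.environ.get("APIFY_TOKEN")
if token:
    for citation in run_scraper(token):
        print(citation["pmid"], citation["title"])
else:
    print("Set APIFY_TOKEN to run this example against the Apify platform.")
```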
Frequently Asked Questions
Q: How does it work? A: PubMed Citation Scraper uses NCBI's official E-utilities API to search and retrieve citation data. Simply configure your search parameters and let the tool collect the data automatically.
Q: How accurate is the data? A: We collect data directly from PubMed's official database in real-time, ensuring the most up-to-date and accurate citation information available.
Q: Can I schedule regular runs? A: Yes! Use the Apify API to schedule daily, weekly, or monthly runs automatically. Perfect for ongoing literature monitoring and research tracking.
Q: What if I need help? A: Our support team is available 24/7. Contact us through the Apify platform.
Q: Is my data secure? A: Absolutely. All data is encrypted in transit and at rest. We never share your data with third parties.
Q: Can I filter by specific publication types? A: Yes! You can filter by Review, Clinical Trial, Meta-Analysis, Case Reports, and many other publication types.
Q: How many citations can I scrape? A: Free users can collect up to 50 citations per run. Paid users can collect up to 1,000,000 citations per run.
Q: Does it support PubMed's advanced search syntax? A: Yes! You can use PubMed's full search syntax including Boolean operators (AND, OR, NOT), field tags ([Author], [Journal], [Title]), and complex query combinations.
Q: Can I get full-text articles? A: The scraper collects citation metadata and abstracts. For full-text access, use the provided DOIs or PMC IDs to access articles through your institution's library or open access repositories.
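Turning the identifiers from the last answer into links can be sketched as follows. The `full_text_links` helper is hypothetical, `https://doi.org/` is the standard DOI resolver, and the PMC URL pattern is an assumption about how PubMed Central addresses articles.

```python
def full_text_links(item):
    """Build candidate full-text links from a citation's doi and pmcId fields."""
    links = {}
    if item.get("doi"):
        links["doi"] = f"https://doi.org/{item['doi']}"
    if item.get("pmcId"):
        # Assumed PMC URL pattern; adjust if NCBI changes its article paths.
        links["pmc"] = f"https://pmc.ncbi.nlm.nih.gov/articles/{item['pmcId']}/"
    return links


sample = {"doi": "10.3121/cmr.4.3.218", "pmcId": "PMC2751499"}
for name, url in full_text_links(sample).items():
    print(name, url)
```

Records without a DOI or PMC ID return an empty dictionary, so missing identifiers need no special handling.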
Integrate PubMed Citation Scraper with any app and automate your workflow
Last but not least, PubMed Citation Scraper can be connected with almost any cloud service or web app thanks to integrations on the Apify platform.
Alternatively, you can use webhooks to carry out an action whenever an event occurs, e.g. get a notification whenever PubMed Citation Scraper successfully finishes a run.
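On the receiving end, a webhook handler mostly needs to pull the dataset ID out of the request body. The sketch below assumes Apify's default webhook payload, with an `eventType` field and the run object under `resource`. The `dataset_id_from_webhook` helper is hypothetical and the sample values are made up for illustration.

```python
import json


def dataset_id_from_webhook(body):
    """Return the dataset ID for a succeeded run, or None for any other event."""
    payload = json.loads(body)
    if payload.get("eventType") != "ACTOR.RUN.SUCCEEDED":
        return None
    # Assumes the run object is delivered under "resource".
    return payload.get("resource", {}).get("defaultDatasetId")


# Made-up payload for illustration only.
sample_body = json.dumps({
    "eventType": "ACTOR.RUN.SUCCEEDED",
    "resource": {"defaultDatasetId": "example-dataset-id"},
})
print(dataset_id_from_webhook(sample_body))
```

The returned ID can then be used to download the citations through the Apify API or client.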
🔗 Recommended Actors
Looking for more data collection tools? Check out these related actors:
| Actor | Description | Link |
|---|---|---|
| GSA eLibrary Scraper | Collects government publication data from GSA eLibrary | https://apify.com/parseforge/gsa-elibrary-scraper |
| PR Newswire Scraper | Extracts press release and news data from PR Newswire | https://apify.com/parseforge/pr-newswire-scraper |
| Hubspot Marketplace Scraper | Collects business app data from HubSpot marketplace | https://apify.com/parseforge/hubspot-marketplace-scraper |
| FINRA BrokerCheck Scraper | Extracts financial advisor and broker information from FINRA | https://apify.com/parseforge/finra-brokercheck-scraper |
| Greatschools Scraper | Collects school data and ratings from GreatSchools | https://apify.com/parseforge/greatschools-scraper |
Pro Tip: 💡 Browse our complete collection of data collection actors to find the perfect tool for your business needs.
Need Help? Our support team is here to help you get the most out of this tool.
⚠️ Disclaimer: This Actor is an independent tool and is not affiliated with, endorsed by, or sponsored by PubMed, NCBI, or any of its subsidiaries. All trademarks mentioned are the property of their respective owners.