medRxiv Scraper
Pricing
Pay per event
Extract comprehensive preprint data from medRxiv, including titles, authors, abstracts, full text, DOIs, citations, and metadata. Automate access to health-science preprints with structured outputs, ideal for researchers and analysts who need reliable, large-scale article data without manual work.
Developer: ParseForge
medRxiv Scraper
Collect comprehensive preprint research data from medRxiv without coding. Extract article metadata, abstracts, authors, citations, and full text from thousands of health sciences preprints. Perfect for researchers conducting systematic reviews, academics tracking emerging research, and data analysts building datasets for AI training. Get medRxiv data as CSV without needing technical skills.
The medRxiv Scraper collects detailed preprint article data including titles, authors, abstracts, full text, and metadata from medRxiv.org. It supports up to 1 million articles per run with parallel fetching for speed.
What Does It Do
- Article Title - Extract article titles for literature tracking and research organization
- Authors and Author Details - Collect author names, affiliations, and contact information for collaboration analysis
- Abstract and Full Text - Get complete article abstracts and full text content for deep analysis
- Publication Date - Track when articles were first published to identify trends over time
- DOI and Links - Capture DOI identifiers and PDF download links for citation and access
- Subject Areas and Keywords - Extract research categories and keywords for topic filtering and research categorization
- Corresponding Author - Identify primary contact for each article to request reprints or clarifications
- Supplementary Materials - Collect links to supplementary data, code, and materials included with articles
- Citation Information - Gather citation metadata for bibliometric analysis
- Full Metadata - Access version history, licensing, funding statements, competing interests, and data availability statements
Input
- Start URL - medRxiv search URL to start scraping from. Use this for custom searches or specific page ranges. Cannot be used with search query. Example: https://www.medrxiv.org/search/asd
- Search Query - Search term to find medRxiv articles matching your topic. Cannot be used with start URL. Example: "bacterial infection"
- Sort Order - How to order results. Choose between best match, oldest first, or newest first
- Max Items - Maximum number of articles to collect. Free users can collect up to 100 per run; paid users up to 1,000,000
Example input:
{"searchQuery": "COVID-19 vaccine", "orderBy": "newest", "maxItems": 50}
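The input rules above (start URL and search query cannot be combined, and max items is capped by plan) can be checked locally before a run is submitted. The following is a minimal sketch of a hypothetical helper, not part of the Actor itself: `searchQuery`, `orderBy`, and `maxItems` come from the example input, while the `startUrl` key name and the `build_input` function are assumptions made for illustration.

```python
from typing import Optional

FREE_LIMIT = 100          # free-plan cap stated in the Input section
PAID_LIMIT = 1_000_000    # paid-plan cap stated in the Input section


def build_input(
    search_query: Optional[str] = None,
    start_url: Optional[str] = None,
    order_by: Optional[str] = None,
    max_items: int = 50,
    paid: bool = False,
) -> dict:
    """Build an Actor input dict, enforcing the documented constraints."""
    # Start URL and search query cannot be used together.
    if search_query is not None and start_url is not None:
        raise ValueError("Use either search_query or start_url, not both")

    limit = PAID_LIMIT if paid else FREE_LIMIT
    if not 1 <= max_items <= limit:
        raise ValueError(f"max_items must be between 1 and {limit}")

    run_input: dict = {"maxItems": max_items}
    if search_query is not None:
        run_input["searchQuery"] = search_query
    if start_url is not None:
        # Assumed key name; check the Actor's input schema for the real one.
        run_input["startUrl"] = start_url
    if order_by is not None:
        run_input["orderBy"] = order_by
    return run_input
```

Calling `build_input(search_query="COVID-19 vaccine", order_by="newest", max_items=50)` reproduces the example input shown above, and passing both a query and a URL raises `ValueError`.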
Output
Each article includes up to 24 data fields. Download as JSON, CSV, or Excel.
| Article Title | Authors | Abstract |
|---|---|---|
| DOI | Publication Date | Full Text |
| Subject Areas | Keywords | Corresponding Author |
| Citation Information | Supplementary Materials | Related Articles |
| Version History | License Information | Funding Statement |
| Competing Interests | Author Declarations | Data Availability |
| Data Code URL | Metadata | Scraped Timestamp |
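When only a few of these fields are needed, a JSON export can be trimmed to selected columns locally. The sketch below assumes the export is a JSON array of objects; the key names used in the usage note (`title`, `doi`) are hypothetical placeholders, since the exact output keys are not listed here, and `items_to_csv` is an illustrative helper rather than anything the Actor provides.

```python
import csv
import io
import json


def items_to_csv(json_text: str, columns: list[str]) -> str:
    """Convert a JSON array of article records into CSV text with the given columns."""
    items = json.loads(json_text)
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=columns, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    for item in items:
        # Records carry "up to" 24 fields, so missing ones become empty cells.
        writer.writerow({col: item.get(col, "") for col in columns})
    return buffer.getvalue()
```

For example, `items_to_csv(export_text, ["title", "doi"])` would keep two columns and drop the rest, provided those keys exist in the export.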
Why Choose the medRxiv Scraper?
| Feature | Our Actor | Similar Tools |
|---|---|---|
| Extract full article text and abstracts | Yes | No |
| Parallel article fetching (20 at a time) | Yes | No |
| Collect 24+ metadata fields per article | Yes | Partial |
| Support up to 1 million articles per run | Yes | No |
| Supplementary materials and data links | Yes | No |
| Author details and affiliation data | Yes | Partial |
| Funding and ethics statements included | Yes | No |
| Version history and license tracking | Yes | No |
| Competing interests and declarations | Yes | No |
| Data repository and code links | Yes | No |
| Free tier with 100 article limit | Yes | Yes |
| No coding required | Yes | Yes |
How to Use
No technical skills required. Follow these simple steps:
- Sign Up: Create a free account with $5 credit
- Find the Tool: Search for "medRxiv Scraper" in the Apify Store and configure your input
- Run It: Click "Start" and watch your results appear
That's it. No coding, no setup, no complicated configuration. Now you can export your data in CSV, Excel, or JSON format.
Business Use Cases
- Systematic Review Researcher - Collect 500 articles on a specific disease to create a comprehensive dataset for meta-analysis and identify research gaps
- Academic Library Manager - Monitor new publications in neuroscience daily to update institutional research collections and notify faculty of relevant preprints
- Pharma Data Analyst - Extract competitor research timelines by tracking articles from rival companies and universities to understand drug development progress
FAQ
How does the scraper work? The scraper navigates medRxiv search results, extracts article metadata from search cards, and fetches detailed information from each article page in parallel for speed.
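The parallel step can be pictured with a small sketch. This is not the Actor's actual implementation, only an illustration of bounded parallel fetching under stated assumptions: `fetch_details` and `fake_fetch` are hypothetical names, the default of 20 workers mirrors the "20 at a time" figure in the comparison table, and the fetch function is a stand-in so the example runs without network access.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable


def fetch_details(
    urls: Iterable[str],
    fetch_one: Callable[[str], dict],
    max_workers: int = 20,
) -> list[dict]:
    """Fetch article details concurrently, returning results in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order and runs at most max_workers calls at once.
        return list(pool.map(fetch_one, urls))


def fake_fetch(url: str) -> dict:
    # Stand-in for a real HTTP request plus HTML parsing.
    return {"url": url, "title": f"Title for {url}"}


# Placeholder URLs standing in for article links collected from search cards.
results = fetch_details(["article-1", "article-2", "article-3"], fake_fetch)
```

In a real crawler `fake_fetch` would be replaced by a function that downloads and parses one article page, and the worker count would be tuned to respect the site's rate limits.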
How accurate is the data? Data is extracted directly from medRxiv.org article pages and search results. All major fields (title, authors, abstract, DOI) are captured with high accuracy. Supplementary material links are present if the article author uploaded them.
Can I schedule regular runs? Yes. Use Apify's scheduler to run this scraper daily, weekly, or monthly to monitor new publications in your research area automatically.
Is web scraping medRxiv allowed? medRxiv is a public preprint server and does not prohibit scraping in its robots.txt. However, you are responsible for complying with medRxiv's terms of service. Always respect rate limits and do not overload their servers.
Will medRxiv block me? Unlikely. medRxiv does not have strict anti-bot protection. However, using a residential proxy is recommended for large-scale data collection to avoid any IP-based restrictions.
How long does a run take? Collection time depends on the number of articles. Expect 1-2 seconds per article. Collecting 100 articles typically takes 2-3 minutes. Larger runs (1000+) take 20-40 minutes.
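As a rough check on these figures, the per-article rate can be turned into a run-time range. The helper below is a hypothetical back-of-the-envelope calculation that uses only the 1-2 seconds per article quoted above; it ignores startup and queueing overhead, so real runs may take longer.

```python
def estimate_minutes(
    n_articles: int, sec_low: float = 1.0, sec_high: float = 2.0
) -> tuple[float, float]:
    """Return (low, high) run-time estimates in minutes for n_articles."""
    return (n_articles * sec_low / 60, n_articles * sec_high / 60)


for n in (100, 1000):
    low, high = estimate_minutes(n)
    print(f"{n} articles: about {low:.0f}-{high:.0f} minutes")
```

For 100 articles this gives about 2-3 minutes, matching the figure above. For 1,000 articles it gives about 17-33 minutes, slightly under the quoted 20-40 minutes, which presumably includes some overhead.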
Are there any limits? Free users can collect up to 100 results per run. Paid users can collect up to 1,000,000 results per run.
Integrate medRxiv Scraper with any app
- Make - Automate workflows
- Zapier - Connect 5000+ apps
- GitHub - Version control integration
- Slack - Get notifications
- Airbyte - Data pipelines
- Google Drive - Export to spreadsheets
More ParseForge Actors
- Indeed Scraper - Extract job listings and career data
- Crunchbase Scraper - Collect startup and company information
- Etsy Scraper - Gather e-commerce product data
Browse our complete collection of data extraction tools for more.
Ready to Start?
Create a free account with $5 credit and collect your first 100 results for free. No coding, no setup.
Need Help?
- Check the FAQ section above for common questions
- Visit the Apify support page for documentation and tutorials
- Contact us through the Tally contact form to request a new scraper, propose a custom project, or report an issue
Disclaimer
This Actor is an independent tool and is not affiliated with, endorsed by, or sponsored by medRxiv or any of its subsidiaries. All trademarks mentioned are the property of their respective owners.