medRxiv Scraper

Pricing: Pay per event

Extract comprehensive preprint data from medRxiv, including titles, authors, abstracts, full text, DOIs, citations, and metadata. Automate access to health-science preprints with structured outputs, ideal for researchers and analysts who need reliable, large-scale article data without manual work.

Rating: 5.0 (1)

Developer: ParseForge

Maintained by Community

Actor stats: 0 bookmarked, 4 total users, 0 monthly active users, last modified 3 days ago

📚 medRxiv Scraper

Collect preprint research data from medRxiv without writing code. The medRxiv Scraper extracts titles, authors, abstracts, citations, full text, and metadata from health sciences preprints on medRxiv.org and exports them as CSV, JSON, or Excel. It suits researchers conducting systematic reviews, academics tracking emerging research, and data analysts building datasets for AI training.

A single run supports up to 1 million articles and fetches article pages in parallel for speed.

✨ What Does It Do

  • 📝 Article Title - Extract article titles for literature tracking and research organization
  • 👥 Authors and Author Details - Collect author names, affiliations, and contact information for collaboration analysis
  • 📖 Abstract and Full Text - Get complete article abstracts and full text content for deep analysis
  • 📅 Publication Date - Track when articles were first published to identify trends over time
  • 🔗 DOI and Links - Capture DOI identifiers and PDF download links for citation and access
  • 📊 Subject Areas and Keywords - Extract research categories and keywords for topic filtering and research categorization
  • 💬 Corresponding Author - Identify primary contact for each article to request reprints or clarifications
  • 📎 Supplementary Materials - Collect links to supplementary data, code, and materials included with articles
  • 📚 Citation Information - Gather citation metadata for bibliometric analysis
  • 📄 Full Metadata - Access version history, licensing, funding statements, competing interests, and data availability statements

🔧 Input

  • Start URL - medRxiv search URL to start scraping from. Use this for custom searches or specific page ranges. Cannot be used with search query. Example: https://www.medrxiv.org/search/asd
  • Search Query - Search term to find medRxiv articles matching your topic. Cannot be used with start URL. Example: "bacterial infection"
  • Sort Order - How to order results. Choose between best match, oldest first, or newest first
  • Max Items - Maximum number of articles to collect. Free users up to 100, paid users up to 1,000,000

Example input:

{
    "searchQuery": "COVID-19 vaccine",
    "orderBy": "newest",
    "maxItems": 50
}
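
The constraints listed above (Start URL and Search Query cannot be combined, and Max Items is capped at 100 for free users and 1,000,000 for paid users) can be checked before a run is started. The following is a minimal sketch, not part of the Actor itself: `build_input` is a hypothetical helper, the `startUrl` key name is an assumption (only `searchQuery`, `orderBy`, and `maxItems` appear in the example input), and requiring at least one of the two sources is also an assumption.

```python
from typing import Optional

FREE_LIMIT = 100        # per-run cap for free users, from the Max Items description
PAID_LIMIT = 1_000_000  # per-run cap for paid users


def build_input(search_query: Optional[str] = None,
                start_url: Optional[str] = None,
                order_by: str = "newest",
                max_items: int = 50,
                paid: bool = False) -> dict:
    """Build an input dict while enforcing the documented constraints."""
    # Start URL and Search Query are mutually exclusive; this sketch also
    # assumes that exactly one of them has to be given.
    if (search_query is None) == (start_url is None):
        raise ValueError("provide exactly one of search_query or start_url")

    limit = PAID_LIMIT if paid else FREE_LIMIT
    if not 1 <= max_items <= limit:
        raise ValueError(f"max_items must be between 1 and {limit}")

    run_input = {"orderBy": order_by, "maxItems": max_items}
    if search_query is not None:
        run_input["searchQuery"] = search_query
    else:
        run_input["startUrl"] = start_url  # assumed key name, not confirmed by the listing
    return run_input
```

Calling `build_input(search_query="COVID-19 vaccine")` reproduces the example input above, while passing both a query and a URL raises a `ValueError`.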

📊 Output

Each article includes up to 24 data fields. Download as JSON, CSV, or Excel.

| 📝 Article Title | 👥 Authors | 📖 Abstract |
| 🔗 DOI | 📅 Publication Date | 📄 Full Text |
| 🏥 Subject Areas | 🔑 Keywords | 👤 Corresponding Author |
| 💬 Citation Information | 📎 Supplementary Materials | 📚 Related Articles |
| 🔄 Version History | ⚖️ License Information | 💰 Funding Statement |
| ⚡ Competing Interests | 📋 Author Declarations | 📊 Data Availability |
| 🖇️ Data Code URL | ✅ Metadata | 🕐 Scraped Timestamp |
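
Because each article carries "up to" 24 fields, individual records in a JSON export may not all share the same keys. The sketch below shows one way to flatten such records into CSV after download. The key names used in the usage note (`title`, `authors`, `doi`) are illustrative assumptions, not the Actor's documented output schema, and joining list values with "; " is a design choice of this sketch.

```python
import csv
import io


def records_to_csv(records: list[dict]) -> str:
    """Flatten a list of article dicts into CSV text."""
    # Collect the union of keys in first-seen order so that records
    # with missing fields still line up under one header row.
    fieldnames: list[str] = []
    for rec in records:
        for key in rec:
            if key not in fieldnames:
                fieldnames.append(key)

    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fieldnames, restval="",
                            lineterminator="\n")
    writer.writeheader()
    for rec in records:
        # List values (for example several authors) are joined into one cell.
        row = {k: "; ".join(map(str, v)) if isinstance(v, list) else v
               for k, v in rec.items()}
        writer.writerow(row)
    return buf.getvalue()
```

For example, `records_to_csv([{"title": "A", "authors": ["X", "Y"]}, {"title": "B", "doi": "10.1101/example"}])` yields a header of `title,authors,doi` with empty cells where a record lacks a field.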

💎 Why Choose the medRxiv Scraper?

| Feature | Our Actor | Similar Tools |
| --- | --- | --- |
| Extract full article text and abstracts | ✔️ | ❌ |
| Parallel article fetching (20 at a time) | ✔️ | ❌ |
| Collect 24+ metadata fields per article | ✔️ | Partial |
| Support up to 1 million articles per run | ✔️ | ❌ |
| Supplementary materials and data links | ✔️ | ❌ |
| Author details and affiliation data | ✔️ | Partial |
| Funding and ethics statements included | ✔️ | ❌ |
| Version history and license tracking | ✔️ | ❌ |
| Competing interests and declarations | ✔️ | ❌ |
| Data repository and code links | ✔️ | ❌ |
| Free tier with 100 article limit | ✔️ | ✔️ |
| No coding required | ✔️ | ✔️ |

📋 How to Use

No technical skills required. Follow these simple steps:

  1. Sign Up: Create a free account with $5 credit
  2. Find the Tool: Search for "medRxiv Scraper" in the Apify Store and configure your input
  3. Run It: Click "Start" and watch your results appear

That's it: no coding, no setup, no complicated configuration. When the run finishes, export your data in CSV, Excel, or JSON format.

🎯 Business Use Cases

  • 📊 Systematic Review Researcher - Collect 500 articles on a specific disease to create a comprehensive dataset for meta-analysis and identify research gaps
  • 💼 Academic Library Manager - Monitor new publications in neuroscience daily to update institutional research collections and notify faculty of relevant preprints
  • 🔬 Pharma Data Analyst - Extract competitor research timelines by tracking articles from rival companies and universities to understand drug development progress

❓ FAQ

🔍 How does the scraper work? The scraper navigates medRxiv search results, extracts article metadata from search cards, and fetches detailed information from each article page in parallel for speed.
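
The parallel step described above can be illustrated with a bounded-concurrency pattern. This is a sketch of the general technique under stated assumptions, not the Actor's actual implementation: `fetch_article` is a hypothetical stand-in that performs no real HTTP request, and the limit of 20 is taken from the "20 at a time" figure in the feature comparison.

```python
import asyncio

CONCURRENCY = 20  # "20 at a time", per the feature comparison


async def fetch_article(url: str) -> dict:
    """Hypothetical stand-in for downloading and parsing one article page."""
    await asyncio.sleep(0)  # yield control, as a real network call would
    return {"url": url}


async def fetch_all(urls: list[str], limit: int = CONCURRENCY) -> list[dict]:
    """Fetch every URL while keeping at most `limit` requests in flight."""
    semaphore = asyncio.Semaphore(limit)

    async def bounded(url: str) -> dict:
        # The semaphore blocks here once `limit` fetches are already running.
        async with semaphore:
            return await fetch_article(url)

    # gather() returns results in the same order as the input URLs.
    return await asyncio.gather(*(bounded(u) for u in urls))
```

A caller would run it with `asyncio.run(fetch_all(urls))`, where `urls` is the list of article links collected from the search result pages.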

📊 How accurate is the data? Data is extracted directly from medRxiv.org article pages and search results. All major fields (title, authors, abstract, DOI) are captured with high accuracy. Supplementary material links are present only if the article's authors uploaded them.

📅 Can I schedule regular runs? Yes. Use Apify's scheduler to run this scraper daily, weekly, or monthly to monitor new publications in your research area automatically.

⚖️ Is web scraping medRxiv allowed? medRxiv is a public preprint server and does not prohibit scraping in its robots.txt. However, you are responsible for complying with medRxiv's terms of service. Always respect rate limits and do not overload their servers.

🛡️ Will medRxiv block me? Unlikely. medRxiv does not have strict anti-bot protection. However, using a residential proxy is recommended for large-scale data collection to avoid any IP-based restrictions.

⚡ How long does a run take? Collection time depends on the number of articles. Expect 1-2 seconds per article. Collecting 100 articles typically takes 2-3 minutes. Larger runs (1000+) take 20-40 minutes.
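
As a rough planning aid, the per-article figure above converts to a time range with simple arithmetic. The helper below is a hypothetical sketch that uses only the 1-2 seconds per article estimate and ignores startup time and any other overhead.

```python
def estimate_minutes(n_articles: int,
                     sec_low: float = 1.0,
                     sec_high: float = 2.0) -> tuple[float, float]:
    """Return a (low, high) run-time estimate in minutes."""
    # Multiply the article count by the per-article time, then convert to minutes.
    return (n_articles * sec_low / 60, n_articles * sec_high / 60)
```

For 100 articles this gives roughly 1.7 to 3.3 minutes, and for 1,000 articles roughly 17 to 33 minutes, which is in the same range as the typical times quoted above.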

⚠️ Are there any limits? Free users can collect up to 100 results per run. Paid users can collect up to 1,000,000 results per run.

🔗 Integrate medRxiv Scraper with any app
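
One possible integration path is Apify's Python API client. The sketch below assumes the `apify-client` package and its `actor(...).call(run_input=...)` and `dataset(...).iterate_items()` methods. The helper `run_and_collect` is hypothetical, and the Actor ID and API token are placeholders to be replaced with the values shown in your own Apify account.

```python
from typing import Any


def run_and_collect(client: Any, actor_id: str, run_input: dict) -> list[dict]:
    """Start an Actor run, wait for it to finish, and return its dataset items."""
    # call() starts the run and blocks until it completes.
    run = client.actor(actor_id).call(run_input=run_input)
    # Results are stored in the run's default dataset.
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


# Usage (requires `pip install apify-client`; both values are placeholders):
#   from apify_client import ApifyClient
#   client = ApifyClient("<YOUR_API_TOKEN>")
#   articles = run_and_collect(client, "<username>/<actor-name>",
#                              {"searchQuery": "COVID-19 vaccine", "maxItems": 50})
```

Passing the client in as an argument keeps the helper easy to test with a stub object and avoids hard-coding credentials.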

💡 More ParseForge Actors

Browse our complete collection of data extraction tools for more.

🚀 Ready to Start?

Create a free account with $5 credit and collect your first 100 results for free. No coding, no setup.

🆘 Need Help?

  • Check the FAQ section above for common questions
  • Visit the Apify support page for documentation and tutorials
  • To request a new scraper, propose a custom project, or report an issue, contact us through the Tally contact form

⚠️ Disclaimer

This Actor is an independent tool and is not affiliated with, endorsed by, or sponsored by medRxiv or any of its subsidiaries. All trademarks mentioned are the property of their respective owners.