Pricing

from $10.00 / 1,000 results

Biorxiv Preprint Scraper

Scrapes preprint paper metadata from the bioRxiv API by date range and optional category.

Developer

Donny

Last modified: 15 hours ago

bioRxiv Preprint Paper Scraper

What it does

This actor scrapes preprint paper metadata from the bioRxiv API. Unlike keyword-based search, the bioRxiv API operates on date ranges, allowing you to retrieve all preprints published within a specified period. You can optionally filter results by scientific category (e.g., neuroscience, bioinformatics, genomics). Each record includes DOI, title, authors, corresponding author, publication date, category, abstract, and publication status.
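The date-range approach described above can be sketched as a small URL generator. This is a minimal illustration, not the actor's actual code: the endpoint layout (`/details/biorxiv/<start>/<end>/<cursor>`), the `category` query parameter, and the page size of 100 are assumptions about the public bioRxiv API, and `page_urls` is a hypothetical helper name.

```python
from urllib.parse import urlencode

# Assumed base endpoint and page size of the public bioRxiv API.
API_ROOT = "https://api.biorxiv.org/details/biorxiv"
PAGE_SIZE = 100


def page_urls(start_date, end_date, max_results, category=None):
    """Yield one request URL per page needed to cover max_results records."""
    # The category filter is assumed to be passed as a query-string parameter.
    query = ("?" + urlencode({"category": category})) if category else ""
    for cursor in range(0, max_results, PAGE_SIZE):
        yield f"{API_ROOT}/{start_date}/{end_date}/{cursor}{query}"


if __name__ == "__main__":
    for url in page_urls("2024-01-01", "2024-01-31", 250, category="neuroscience"):
        print(url)
```

Each URL covers the same date window and differs only in its cursor, which is why no search keyword appears anywhere in the request.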

Why use this actor

This actor retrieves bioRxiv preprint metadata without requiring you to write code or manage infrastructure. It handles pagination, rate limiting, error recovery, and data normalization automatically, which saves researchers, analysts, and developers from collecting the data by hand. The structured JSON output can be loaded into spreadsheets, databases, dashboards, and downstream data pipelines. Run it on the Apify platform for scheduling, monitoring, and proxy management.

Input parameters

startDate (string, optional): Start date in YYYY-MM-DD format. Default: "2024-01-01".
endDate (string, optional): End date in YYYY-MM-DD format. Default: "2024-12-31".
category (string, optional): Filter by scientific category (e.g., "neuroscience", "bioinformatics"). Leave empty for all categories.
maxResults (integer, optional): Maximum number of preprints to return. Range: 1-1000. Default: 100.
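The parameters listed above could be checked before a run with a sketch like the following. The defaults and the 1-1000 range come from the list; the `normalize_input` helper and its error messages are hypothetical, and the actor's own validation may differ.

```python
from datetime import date

# Defaults as documented in the parameter list.
DEFAULTS = {
    "startDate": "2024-01-01",
    "endDate": "2024-12-31",
    "category": "",
    "maxResults": 100,
}


def normalize_input(raw):
    """Merge user input with the documented defaults and validate it."""
    cfg = {**DEFAULTS, **{k: v for k, v in raw.items() if v is not None}}

    # date.fromisoformat raises ValueError on a malformed date string.
    start = date.fromisoformat(cfg["startDate"])
    end = date.fromisoformat(cfg["endDate"])
    if start > end:
        raise ValueError("startDate must not be after endDate")

    n = cfg["maxResults"]
    if isinstance(n, bool) or not isinstance(n, int) or not 1 <= n <= 1000:
        raise ValueError("maxResults must be an integer from 1 to 1000")

    # An empty category means "all categories"; represent that as None.
    cfg["category"] = (cfg["category"] or "").strip() or None
    return cfg


if __name__ == "__main__":
    print(normalize_input({"category": "bioinformatics", "maxResults": 50}))
```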

Output data

The actor outputs a dataset with the following fields:

doi: Digital Object Identifier for the preprint
title: Title of the preprint paper
authors: Full author list
correspondingAuthor: Name of the corresponding author
date: Publication date on bioRxiv
category: Scientific category (e.g., neuroscience)
abstract: Full abstract text
published: Journal publication status or name

Each record is validated and null-checked before being pushed to the dataset. Missing or unavailable fields are set to null rather than being omitted, ensuring consistent schema across all records.
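The null-filling behavior described above might look roughly like this. It is a sketch under assumptions: the source-side key `author_corresponding` reflects how the bioRxiv API is believed to name that field, and `normalize_record` is a hypothetical helper, not the actor's code.

```python
# Output field name -> assumed field name in the raw bioRxiv API record.
FIELD_MAP = {
    "doi": "doi",
    "title": "title",
    "authors": "authors",
    "correspondingAuthor": "author_corresponding",
    "date": "date",
    "category": "category",
    "abstract": "abstract",
    "published": "published",
}


def normalize_record(raw):
    """Return a record with every output field present; missing values become None."""
    record = {}
    for out_key, src_key in FIELD_MAP.items():
        value = raw.get(src_key)
        if isinstance(value, str):
            # Treat empty or whitespace-only strings as missing.
            value = value.strip() or None
        record[out_key] = value
    return record


if __name__ == "__main__":
    print(normalize_record({"doi": "10.1101/2024.01.15.575123", "abstract": ""}))
```

Because every key in `FIELD_MAP` is always written, downstream consumers can rely on a fixed set of columns even when the API omits a value.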

Example output

[
    {
        "doi": "10.1101/2024.01.15.575123",
        "title": "Novel Neural Circuit Mechanisms in Mouse Hippocampus",
        "authors": "Smith, J.; Johnson, K.; Williams, L.",
        "correspondingAuthor": "Smith, J.",
        "date": "2024-01-15",
        "category": "neuroscience",
        "abstract": "We describe a novel neural circuit mechanism...",
        "published": "NA"
    }
]
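A downstream consumer could post-process records shaped like the example above as follows. This sketch assumes that authors are separated by semicolons and that "NA" in the published field means no journal version is recorded, both inferred from the example rather than confirmed; the helper names are hypothetical.

```python
import json

# A trimmed copy of the example record shown above.
EXAMPLE = """
[
    {
        "doi": "10.1101/2024.01.15.575123",
        "authors": "Smith, J.; Johnson, K.; Williams, L.",
        "published": "NA"
    }
]
"""


def author_list(record):
    """Split the semicolon-separated authors string into individual names."""
    return [name.strip() for name in (record["authors"] or "").split(";") if name.strip()]


def is_journal_published(record):
    """Assumption: None or "NA" means no journal publication is recorded."""
    return record["published"] not in (None, "NA")


if __name__ == "__main__":
    for rec in json.loads(EXAMPLE):
        print(rec["doi"], author_list(rec), is_journal_published(rec))
```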

Pricing

This actor is priced based on usage:

$0.01 per result returned in the dataset
$0.005 per actor start (flat fee per run)

These costs cover Apify platform compute and proxy resources. For large-scale scraping jobs, set the maxResults parameter to cap the number of billed results. A typical run of 100 results costs about $1.01 (100 × $0.01 + $0.005 = $1.005).
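The pricing arithmetic can be written out as a short estimator. The two rates are taken from the list above; `estimate_run_cost` is a hypothetical helper, and actual billing is determined by the Apify platform.

```python
from decimal import Decimal

PRICE_PER_RESULT = Decimal("0.01")   # USD per result in the dataset
PRICE_PER_START = Decimal("0.005")   # USD flat fee per run


def estimate_run_cost(results):
    """Estimate the cost in USD of one run that returns `results` records."""
    if results < 0:
        raise ValueError("results must not be negative")
    return PRICE_PER_START + PRICE_PER_RESULT * results


if __name__ == "__main__":
    print(estimate_run_cost(100))   # 1.005
    print(estimate_run_cost(1000))  # 10.005
```

Since maxResults is limited to 1000, a single run under these rates would cost at most $10.005.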

More scrapers from brave_paradise

Check out other actors published by brave_paradise on the Apify Store for more data extraction tools covering scientific databases, developer communities, news aggregators, and government open-data APIs. All actors follow the same high-quality patterns with robust error handling, automatic pagination, and clean structured output.
