Under maintenance

Pricing

Pay per usage

VLDB Scraper

Scrape academic papers from VLDB (Very Large Data Bases) proceedings, one of the top-tier venues in database research. Ideal for literature reviews, citation analysis, research trend tracking, and building academic datasets.

Developer

Iyadh Khan

Last modified

3 months ago

VLDB Papers Scraper - Academic Research Data Extractor

An Apify Actor that scrapes academic papers from VLDB (Very Large Data Bases) proceedings — one of the top-tier venues in database research. Ideal for literature reviews, citation analysis, research trend tracking, and building academic datasets.

What it does

For each paper across the specified VLDB volumes, the scraper extracts:

Volume - PVLDB volume number
Year - Corresponding publication year(s)
Title - Paper title
Authors - Author names
Abstract - Paper abstract
PDF Link - Direct link to the PDF
Artifact Available - Whether the paper has associated artifacts
URL - Link to the paper page

Input

Field	Type	Description	Default
`volumes`	`string[]`	Select PVLDB volumes from a dropdown (Volume 1–20, 2008–present)	`["17", "18", "19"]`
`max_workers`	`integer`	Number of parallel Chrome instances (1–4). Use 1–2 when the run has 4 GB of memory and 3–4 when it has 8 GB or more.	`2`

Available volumes

Volume	Year(s)
1	2008
2	2009
3	2010
4	2010–2011
5	2011–2012
6	2012–2013
7	2013–2014
8	2014–2015
9	2015–2016
10	2016–2017
11	2017–2018
12	2018–2019
13	2019–2020
14	2020–2021
15	2021–2022
16	2022–2023
17	2023–2024
18	2024–2025
19	2025–2026
20	2026–2027
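
The table follows a regular pattern, so the year label can be derived from the volume number. The sketch below is illustrative only: `pvldb_years` is a hypothetical helper, not part of the Actor, and it simply reproduces the rows listed above.

```python
def pvldb_years(volume: int) -> str:
    """Return the year label for a PVLDB volume, matching the table above."""
    if not 1 <= volume <= 20:
        raise ValueError(f"volume must be between 1 and 20, got {volume}")
    if volume <= 3:
        # Volumes 1-3 each map to a single calendar year (2008, 2009, 2010).
        return str(2007 + volume)
    # From volume 4 onward each volume spans two years, e.g. 4 -> 2010–2011.
    start = 2006 + volume
    return f"{start}–{start + 1}"


if __name__ == "__main__":
    for v in (1, 4, 18):
        print(v, pvldb_years(v))
```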

Example input

{
    "volumes": ["17", "18", "19"],
    "max_workers": 2
}
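
Before starting a run, it can help to check an input document against the limits in the table above. This is a minimal sketch under stated assumptions: `validate_input` is a hypothetical helper, and the Actor's own validation may differ. Only the field names, defaults, and ranges come from the Input table.

```python
import json

# Volumes are passed as strings "1" through "20", per the Input table.
ALLOWED_VOLUMES = {str(v) for v in range(1, 21)}


def validate_input(raw: str) -> dict:
    """Parse an input JSON string and apply the documented defaults and limits."""
    data = json.loads(raw)
    volumes = data.get("volumes", ["17", "18", "19"])
    max_workers = data.get("max_workers", 2)

    if not isinstance(volumes, list) or not volumes:
        raise ValueError("volumes must be a non-empty list of strings")
    unknown = [v for v in volumes
               if not isinstance(v, str) or v not in ALLOWED_VOLUMES]
    if unknown:
        raise ValueError(f"unsupported volumes: {unknown}")

    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(max_workers, bool) or not isinstance(max_workers, int):
        raise ValueError("max_workers must be an integer")
    if not 1 <= max_workers <= 4:
        raise ValueError("max_workers must be between 1 and 4")

    return {"volumes": volumes, "max_workers": max_workers}
```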

Output

Results are stored in the default dataset. Each item has this structure:

{
    "volume": 18,
    "year": "2024–2025",
    "title": "Example Paper Title",
    "authors": "Author One, Author Two",
    "abstract": "This paper presents...",
    "pdfLink": "https://www.vldb.org/pvldb/vol18/example.pdf",
    "artifactAvailable": true,
    "url": "https://www.vldb.org/pvldb/vol18/paper/example"
}
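
Once the dataset is exported as JSON, the items can be post-processed with the standard library. The sketch below assumes that `authors` is a comma-separated string, as in the example item. Both helper names are hypothetical.

```python
from collections import defaultdict


def split_authors(item: dict) -> list:
    """Split the comma-separated authors string into a list of names."""
    return [name.strip() for name in item["authors"].split(",") if name.strip()]


def titles_by_volume(items: list) -> dict:
    """Group paper titles by their PVLDB volume number."""
    grouped = defaultdict(list)
    for item in items:
        grouped[item["volume"]].append(item["title"])
    return dict(grouped)


# The example item from above, used as sample data.
sample = {
    "volume": 18,
    "year": "2024–2025",
    "title": "Example Paper Title",
    "authors": "Author One, Author Two",
    "abstract": "This paper presents...",
    "pdfLink": "https://www.vldb.org/pvldb/vol18/example.pdf",
    "artifactAvailable": True,
    "url": "https://www.vldb.org/pvldb/vol18/paper/example",
}
```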

How it works

Visits each selected VLDB volume index page and collects all paper links
Splits papers across parallel Chrome workers (configurable, 1–4)
Each worker scrapes its assigned papers using Selenium + BeautifulSoup
Pushes each paper's data to the Apify dataset
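
The splitting step can be pictured roughly as follows. This is a sketch, not the Actor's source: the round-robin partitioning and the thread pool are assumptions, and `scrape_chunk` is a placeholder for the per-worker Selenium + BeautifulSoup logic, which is not reproduced here.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(links: list, workers: int) -> list:
    """Deal links out round-robin so each worker gets a similar share."""
    return [links[i::workers] for i in range(workers)]


def scrape_chunk(chunk: list) -> list:
    # Placeholder: a real worker would open one Chrome instance here,
    # load each paper page, and parse it into a full record.
    return [{"url": url} for url in chunk]


def scrape_all(links: list, max_workers: int = 2) -> list:
    """Scrape all links using up to max_workers parallel workers."""
    chunks = [c for c in partition(links, max_workers) if c]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(scrape_chunk, chunks))
    # Flatten the per-worker lists into a single list of records.
    return [record for chunk_result in results for record in chunk_result]
```

Giving each worker a fixed chunk means one browser instance is started per worker rather than per paper, which is one plausible reason to cap `max_workers` by available memory.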

Getting started

On Apify platform

Go to the Actor's page on the Apify platform
Select the volumes you want to scrape and set `max_workers` to control concurrency
Click Start and wait for the run to finish
Download the results from the Dataset tab (JSON, CSV, Excel, etc.)
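
The same steps can also be scripted with the Apify API client for Python. The sketch below takes the client as a parameter so that it stays independent of any particular setup. The `actor_id` value is a placeholder you would need to fill in, since the Actor's ID is not stated on this page.

```python
def run_vldb_scraper(client, actor_id: str, volumes: list, max_workers: int = 2) -> list:
    """Start the Actor, wait for it to finish, and return the dataset items.

    `client` is expected to behave like apify_client.ApifyClient.
    `actor_id` is a placeholder such as "<username>/<actor-name>".
    """
    run_input = {"volumes": volumes, "max_workers": max_workers}
    # call() starts the run and blocks until it finishes.
    run = client.actor(actor_id).call(run_input=run_input)
    # Results land in the run's default dataset.
    dataset = client.dataset(run["defaultDatasetId"])
    return list(dataset.iterate_items())
```

With the `apify-client` package installed, `client` would typically be created as `ApifyClient("<YOUR_API_TOKEN>")` and passed in together with the Actor's ID.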

Local development

Requires Chrome/Chromium and Chromedriver installed locally.

$ apify run
