Pricing

from $20.00 / 1,000 results

📄 Academic Paper Scraper — Research & Citations

Scrape academic papers, research articles, citations, author profiles, and h-index data from Google Scholar. Extract abstracts, publication dates, journal names, and citation counts for literature reviews.

Developer

NexGenData

Last modified

11 days ago

📚 Academic Paper Scraper — Semantic Scholar, Connected Papers & Scite Alternative

Search and extract structured metadata from academic papers across major open repositories — title, authors, affiliations, abstract, DOI, citation count, publication venue, year, and reference list. Built as a pay-per-result alternative to Semantic Scholar API (rate-limited), Connected Papers (UI-only), Scite ($20-100/mo), Web of Science (institutional license), and Scopus for systematic reviews, citation tracking, and research-trend monitoring.

Why Academic Paper Scraper Beats Semantic Scholar, Connected Papers, Scite & Web of Science

Feature	NexGenData Academic Paper Scraper	Semantic Scholar API	Connected Papers	Scite	Web of Science
Cost	$0.005 / paper, pay-per-result	Free + rate-limited	Free (UI only)	$20-100 / month	Institutional license
Bulk search by keyword / topic	Yes	Yes (rate-limited)	UI only	Yes	Yes
Full reference list per paper	Yes	Yes	Visualization only	Yes	Yes
Citation counts	Yes	Yes	Yes	Yes	Yes
Cross-repository coverage	arXiv, PubMed, DOAJ, OpenAlex, etc.	Multi-source	Multi-source	Multi-source	Multi-source
Bulk export	JSON / CSV / Excel	DIY pagination	None	CSV (plan-gated)	Plan-gated
API access	Apify REST + SDKs	Yes (free + rate-limited)	None	Paid plan	Institutional
Auth required	Apify token	Optional API key	None	Account + plan	Institutional login
Monthly minimum	None	None	None	$20+	Institutional

Academic and R&D teams tend to pick this actor over the Semantic Scholar API when the free tier's rate limits make a systematic review of more than ~5K papers impractical. It is cheaper than Scite for the bulk-metadata use case and an alternative to Web of Science for teams without an institutional license.

What You Get Per Paper

title, abstract, doi, arxiv_id, pubmed_id, openalex_id
authors — array of {name, affiliation, orcid} records
published_date, year, venue, journal, conference
citation_count, reference_count, influential_citation_count
references — array of cited papers {title, doi, year}
topics — MeSH terms / arXiv categories / OpenAlex concepts
is_open_access, pdf_url, landing_page_url
funding_acknowledgements — when extractable
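
The field list above can be written down as a typed record for downstream code. This is a minimal sketch, not the actor's published schema: the field names come from the list, but every type annotation (for example, that topics is a list of strings) and the summarize helper are assumptions made for illustration.

```python
from typing import List, Optional, TypedDict


class Author(TypedDict, total=False):
    name: str
    affiliation: Optional[str]
    orcid: Optional[str]


class Reference(TypedDict, total=False):
    title: str
    doi: Optional[str]
    year: Optional[int]


class Paper(TypedDict, total=False):
    # Field names follow the list above; the types are assumed.
    title: str
    abstract: Optional[str]
    doi: Optional[str]
    arxiv_id: Optional[str]
    pubmed_id: Optional[str]
    openalex_id: Optional[str]
    authors: List[Author]
    published_date: Optional[str]
    year: Optional[int]
    venue: Optional[str]
    citation_count: int
    reference_count: int
    influential_citation_count: int
    references: List[Reference]
    topics: List[str]
    is_open_access: bool
    pdf_url: Optional[str]
    landing_page_url: Optional[str]


def summarize(paper: Paper) -> str:
    """Hypothetical helper: one-line summary that tolerates missing fields."""
    authors = paper.get("authors") or []
    names = ", ".join(a.get("name", "?") for a in authors[:3])
    if len(authors) > 3:
        names += " et al."
    return (
        f"{paper.get('title', '(untitled)')} ({paper.get('year', 'n.d.')}) "
        f"by {names or 'unknown'}; cited {paper.get('citation_count', 0)} times"
    )
```

Reading fields with .get keeps the helper usable when a source repository does not supply a value.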

Use Cases

Systematic reviews — pull all papers on a topic with full metadata for PRISMA workflows
Citation tracking — monitor papers citing your work or competitor work weekly
Grant writing — assemble prior-art bibliographies + recent-literature surveys
Research-trend monitoring — detect rising topics in a field by citation velocity
Pharma / biotech competitive intel — track competitor publications and KOL output
AI / NLP training — bulk-export labeled paper abstracts for scientific-language models
Patent prior art — surface academic publications relevant to a patent application
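
For the research-trend use case, one possible way to compute citation velocity from exported records is sketched below. The definition used here (citations divided by the paper's age in years, counting the publication year as year one) is an assumption, as is treating topics as a list of strings; the actor itself does not return a velocity field.

```python
from collections import defaultdict
from datetime import date
from typing import Dict, Iterable, List, Optional, Tuple


def citation_velocity(paper: Dict, today: Optional[date] = None) -> float:
    """Citations per year since publication (assumed definition)."""
    today = today or date.today()
    year = paper.get("year")
    if not year:
        return 0.0
    # Count the publication year as year one so new papers never divide by zero.
    age_years = max(today.year - int(year) + 1, 1)
    return (paper.get("citation_count") or 0) / age_years


def rank_topics(papers: Iterable[Dict], today: Optional[date] = None) -> List[Tuple[str, float]]:
    """Average velocity per topic, highest first."""
    by_topic = defaultdict(list)
    for paper in papers:
        velocity = citation_velocity(paper, today)
        for topic in paper.get("topics") or []:
            by_topic[topic].append(velocity)
    averages = [(topic, sum(v) / len(v)) for topic, v in by_topic.items()]
    return sorted(averages, key=lambda pair: pair[1], reverse=True)
```

Passing a fixed today makes weekly runs comparable with each other and keeps the function easy to test.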

Quick Start (Python)

from apify_client import ApifyClient

# Authenticate with your Apify API token.
client = ApifyClient("YOUR_APIFY_TOKEN")

# Start the actor and wait for the run to finish.
run = client.actor("nexgendata/academic-paper-scraper").call(run_input={
    "queries": ["large language models", "graph neural networks"],
    "year_from": 2023,
    "max_per_query": 500,
    "include_references": True,
})

# Stream results from the run's default dataset.
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item["title"], item["citation_count"], item["doi"])

Pricing — Pay Per Paper

Actor start: $0.005
Paper: $0.005

A 500-paper literature review = $2.505. A weekly 100-paper citation-tracker = $0.505/run. No monthly minimum.
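
The arithmetic above can be reproduced with a small estimator. This sketch assumes a flat $0.005 start charge per run and $0.005 per paper, as listed in this section; the function name and the runs parameter are illustrative, and the actual start charge may differ if it scales with memory size.

```python
from decimal import Decimal

ACTOR_START_USD = Decimal("0.005")  # per run, as listed above (assumed flat)
PER_PAPER_USD = Decimal("0.005")    # per paper written to the dataset


def estimate_cost(papers_per_run: int, runs: int = 1) -> Decimal:
    """Estimated USD cost: each run pays one start charge plus a per-paper charge."""
    if papers_per_run < 0 or runs < 0:
        raise ValueError("papers_per_run and runs must be non-negative")
    return runs * (ACTOR_START_USD + PER_PAPER_USD * papers_per_run)


print(estimate_cost(500))           # one 500-paper literature review
print(estimate_cost(100, runs=52))  # a weekly 100-paper tracker for a year
```

Decimal is used instead of float so that sub-cent charges add up exactly.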

Related Actors

Use case	Actor
arXiv preprint search + metadata	arxiv-scraper
PubMed biomedical search	pubmed-research-search
Google Scholar search	google-scholar-scraper
Academic research MCP server (AI / Claude)	academic-research-mcp-server
NIH RePORTER grants database	nih-reporter-grants-scraper
IRS 990 nonprofit research funding	irs-990-nonprofit-explorer-scraper
SEC EDGAR filings (corporate R&D)	sec-edgar-scraper
Federal Register rules (regulatory science)	federal-register-rules-scraper
Hacker News scraper (CS / ML discourse)	hacker-news-scraper

FAQ

Q: Coverage scope? A: Multi-source — arXiv, PubMed, DOAJ, OpenAlex, and Semantic Scholar's graph. Each paper is deduped across sources by DOI / arXiv ID.
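
The actor's own deduplication logic is not documented here. If you merge datasets from several runs on the client side, the same idea (match on DOI first, then arXiv ID) might look like the sketch below. The normalization rules, such as lowercasing DOIs and dropping arXiv version suffixes, are assumptions for illustration.

```python
import re
from typing import Dict, Iterable, List, Optional, Tuple

_DOI_PREFIXES = ("https://doi.org/", "http://doi.org/", "doi:")


def dedupe_key(paper: Dict) -> Optional[Tuple[str, str]]:
    """Return a normalized (kind, id) key, or None when no identifier is present."""
    doi = (paper.get("doi") or "").strip().lower()
    if doi:
        for prefix in _DOI_PREFIXES:
            if doi.startswith(prefix):
                doi = doi[len(prefix):]
        return ("doi", doi)
    arxiv_id = (paper.get("arxiv_id") or "").strip().lower()
    if arxiv_id:
        # Treat 2301.00001v1 and 2301.00001v2 as the same paper.
        return ("arxiv", re.sub(r"v\d+$", "", arxiv_id))
    return None


def dedupe(papers: Iterable[Dict]) -> List[Dict]:
    """Keep the first record per key; records without any identifier are all kept."""
    seen = set()
    unique = []
    for paper in papers:
        key = dedupe_key(paper)
        if key is not None:
            if key in seen:
                continue
            seen.add(key)
        unique.append(paper)
    return unique
```

Records with neither identifier are kept rather than guessed at, since matching on title alone can merge distinct papers.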

Q: Citation counts — how fresh? A: Pulled live per run from the source graph (Semantic Scholar / OpenAlex). Typically within 24-48h of the source's own refresh.

Q: Closed-access papers? A: Metadata + abstract are returned. PDF is included only when the paper is open-access (is_open_access: true). For closed-access PDFs, follow the doi URL with your institutional access.

Q: Full-text mining? A: This actor returns metadata + abstract. For full-text PDFs, use the pdf_url field with a separate PDF-parser actor.
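
The two answers above suggest a simple routing step before any full-text work: send open-access records to a PDF parser and resolve the rest through their DOI. A minimal sketch, assuming is_open_access is a boolean and doi holds a bare DOI string; the function name is hypothetical.

```python
from typing import Dict, Iterable, List, Tuple


def split_by_access(papers: Iterable[Dict]) -> Tuple[List[str], List[str]]:
    """Return (pdf_urls, doi_links) for open-access and closed-access papers."""
    pdf_urls = []
    doi_links = []
    for paper in papers:
        if paper.get("is_open_access") and paper.get("pdf_url"):
            # Open access: the PDF can be handed to a separate parser.
            pdf_urls.append(paper["pdf_url"])
        elif paper.get("doi"):
            # Closed access: follow the DOI with institutional access.
            doi_links.append("https://doi.org/" + paper["doi"])
    return pdf_urls, doi_links
```

Papers with neither a usable PDF link nor a DOI are skipped, so the two lists may be shorter than the input.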

Q: How does it differ from pubmed-research-search? A: PubMed search is biomedical-only with deeper MeSH coverage. This actor is cross-discipline (CS, physics, bio, social sciences) via the OpenAlex + Semantic Scholar graph.

How NexGenData Pricing Works

Every NexGenData actor uses pay-per-event pricing — you only pay for results that actually land in your dataset. No monthly minimum, no seat fees, no surprise overage bills.

Actor Start: a single-event charge each time you spin the actor up (scaled to memory size)
Result / item: charged per item written to the default dataset
No charge for retries, internal proxy rotation, or failed sub-requests — those are absorbed by the platform

Apify Platform Bonus

New to Apify? Sign up with the NexGenData referral link — you get free platform credits on signup (enough for several thousand free results) and you help fund the maintenance of this actor fleet.

Integration Surface

Every actor in the NexGenData catalog can be triggered from:

Apify console — point-and-click run
Apify API — REST + webhooks
Apify Python / JS SDKs — programmatic batch
Zapier, Make.com, n8n — official integrations
MCP — many actors are exposed as MCP tools for Claude / ChatGPT / Cursor agents
Schedules — built-in cron for daily / weekly / monthly runs
Webhooks — POST results to any HTTPS endpoint on dataset write
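
For the REST route, a run can be started without the SDK. The sketch below only builds the request so it can be inspected before sending. It assumes Apify's run-actor endpoint (POST /v2/acts/{actorId}/runs with a bearer token) and reuses the input fields from the Quick Start; check the Apify API reference before relying on it.

```python
import json
import urllib.request
from typing import Dict

API_BASE = "https://api.apify.com/v2"


def build_run_request(actor_id: str, token: str, run_input: Dict) -> urllib.request.Request:
    """Build (but do not send) a POST request that starts an actor run."""
    # The REST API addresses actors as "username~actor-name".
    path_id = actor_id.replace("/", "~")
    return urllib.request.Request(
        f"{API_BASE}/acts/{path_id}/runs",
        data=json.dumps(run_input).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


request = build_run_request(
    "nexgendata/academic-paper-scraper",
    "YOUR_APIFY_TOKEN",
    {"queries": ["graph neural networks"], "year_from": 2023, "max_per_query": 100},
)
print(request.get_method(), request.full_url)
```

To send it, pass the request to urllib.request.urlopen and read the JSON response, which describes the started run.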

Support

NexGenData maintains 260+ Apify actors and ships updates regularly. Bug reports via the Apify console issues tab get a response within 24 hours. Roadmap requests are welcome — high-demand features ship in the next version.

Home: thenextgennexus.com
Full catalog: apify.com/nexgendata

Google Scholar Scraper — Academic Papers & Citations

muhammadafzal/google-scholar-scraper

Extract academic paper titles, authors, abstracts, citation counts, publication details, and PDF links from Google Scholar. Fast, reliable, no browser overhead. Search by keyword, topic, or author name. MCP-optimized for AI agents.

Muhammad Afzal

Google Scholar Scraper - Academic Papers & Citations

klondikeking/google-scholar-scraper-v2

Extract academic papers, citations, authors, and PDF links from Google Scholar.

Pierrick McD0nald

Google Scholar Scraper - Low-cost💲🔥📚🎓

delectable_incubator/google-scholar-scraper-low-cost

Scrape Google Scholar academic papers 📚🔍 with a powerful research scraper. Extract paper titles, authors, publication dates, journals/sources, citations, and direct links to full texts. Ideal for academic research, literature reviews, citation analysis, AI/NLP training, and knowledge discovery 🚀

Prime Scrape

Semantic Scholar Scraper

openclawmara/semantic-scholar-scraper

Scrape Semantic Scholar for academic papers, citations, abstracts, and author profiles. Search by topic, author, or venue. Extract citation graphs, reference lists, and research trends. Essential for literature reviews, academic research, and AI/ML paper discovery.

OpenClaw Mara

🎓 Google Scholar Scraper — Papers & Citations

nexgendata/google-scholar-scraper

Scrape Google Scholar for papers, citations, authors & h-index data. Semantic Scholar, Scopus & Web of Science alternative for literature reviews, citation analysis, author clustering and research analytics. Pay per paper.

NexGenData

Google Scholar Article Scraper

agenscrape/google-scholar-article-scraper

Extract academic articles, citations, authors, and publication data from Google Scholar. Perfect for research analysis and literature reviews with fast, reliable scraping.

Agenscrape

Semantic Scholar Paper Scraper

agenscrape/semantic-scholar-paper-scraper

Scrape academic papers from Semantic Scholar. Search by keyword and extract paper titles, abstracts, authors, citation counts, publication dates, DOIs, open access PDFs... Perfect for literature reviews, citation analysis, and research databases. Real time data output with pagination support.

Agenscrape

Google Scholar Scraper

george.the.developer/google-scholar-scraper

Scrape Google Scholar for academic papers, citations, author profiles. No API key needed. Extract titles, authors, abstracts, citation counts, PDF links, h-index, i10-index. Export JSON, CSV, Excel. Anti-bot protection with residential proxies, UA rotation, CAPTCHA detection.

George Kioko

5.0

Semantic Scholar Scraper

parseforge/semantic-scholar-scraper

Extract detailed academic paper data from Semantic Scholar, including abstracts, citations, authors, and publication details. Ideal for researchers, academics, and analysts who need structured scholarly data for literature reviews, research workflows, and large-scale academic analysis.

ParseForge

5.0

Google Scholar Scraper

solidcode/google-scholar-scraper

[💰 $2.0 / 1K] Extract academic papers, author profiles, h-index, i10-index, citation counts, abstracts, and PDF links from Google Scholar. Batch search queries and author IDs, filter by year range, sort by relevance or date.