Pricing

Pay per event

Semantic Scholar Scraper

Extract detailed academic paper data from Semantic Scholar, including abstracts, citations, authors, and publication details. Ideal for researchers, academics, and analysts who need structured scholarly data for literature reviews, research workflows, and large-scale academic analysis.

Rating

5.0

(1)

Developer

ParseForge

Last modified: a month ago

📚 Semantic Scholar Scraper

🚀 Collect academic paper data from Semantic Scholar in minutes. Search by keyword, author, venue, or year range. Export titles, abstracts, citation counts, authors, and PDF links. No coding, no API key required.

🕒 Last updated: 2026-04-23 · 📊 20+ fields per paper · 🔍 6 search filters · 📄 PDF availability · 🚫 No auth required

The Semantic Scholar Scraper collects academic paper data from Semantic Scholar and returns 20+ fields per paper, including title, abstract, authors, citation count, reference count, year, venue, DOI, PDF URL, and paper URL. Filter by keyword, author, venue, year range, and PDF availability. On a paid plan, a single run can return up to 1,000,000 papers.

Semantic Scholar indexes over 200 million academic papers. This Actor queries that index through six search filters and returns structured results ready for literature reviews, citation analysis, or research dashboards.

🎯 Target Audience	💡 Primary Use Cases
Academic researchers, data scientists, R&D teams, librarians, science journalists, bibliometric analysts	Literature reviews, citation analysis, research trend tracking, author profiling, venue benchmarking

📋 What the Semantic Scholar Scraper does

Six search filters:

🔍 Keyword search. Free-text search across titles and abstracts.
🔗 URL mode. Paste a direct Semantic Scholar search URL.
👤 Author filter. Search by author name.
📅 Year range. Min and max publication year.
📄 PDF filter. Only papers with available PDFs.
🏛️ Venue filter. Conference or journal name.

Each paper record includes title, abstract, authors (with IDs), citation count, reference count, year, venue, DOI, fields of study, PDF URL, and Semantic Scholar URL.

💡 Why it matters: searching for papers one at a time on Semantic Scholar or Google Scholar is slow and doesn't support bulk export. This Actor downloads structured academic data at scale for systematic reviews, bibliometric analysis, or research intelligence.

🎬 Full Demo

🚧 Coming soon: a 3-minute walkthrough showing how to go from sign-up to a downloaded dataset.

⚙️ Input

Input	Type	Default	Behavior
searchQuery	string	""	Keyword search across titles and abstracts.
startUrl	string	""	Direct Semantic Scholar search URL.
author	string	""	Author name filter.
yearMin	integer	-	Minimum publication year.
yearMax	integer	-	Maximum publication year.
hasPdf	boolean	false	Only papers with available PDFs.
venues	array	[]	Conference or journal names.
maxItems	integer	10	Maximum number of papers to return. Free plan: limited. Paid plan: up to 1,000,000.

Example: recent AI papers with PDFs available.

{
    "searchQuery": "large language models",
    "yearMin": 2024,
    "hasPdf": true,
    "maxItems": 100
}

Example: papers by a specific author.

{
    "author": "Yoshua Bengio",
    "maxItems": 50
}
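
The input objects above can also be assembled in code. The following is a minimal Python sketch, not part of the Actor itself: `build_input` is a hypothetical helper, and its two validation rules (`yearMin` must not exceed `yearMax`, `maxItems` must be at least 1) are assumptions rather than documented behavior. The field names and defaults come from the Input table.

```python
import json
from typing import Any, Optional


def build_input(
    search_query: str = "",
    author: str = "",
    year_min: Optional[int] = None,
    year_max: Optional[int] = None,
    has_pdf: bool = False,
    venues: Optional[list[str]] = None,
    max_items: int = 10,
) -> dict[str, Any]:
    """Assemble an input object using the field names from the Input table."""
    # Assumed sanity checks; the Actor's own validation may differ.
    if year_min is not None and year_max is not None and year_min > year_max:
        raise ValueError("yearMin must not be greater than yearMax")
    if max_items < 1:
        raise ValueError("maxItems must be at least 1")

    run_input: dict[str, Any] = {"maxItems": max_items}
    # Only include filters that were actually set, so defaults stay implicit.
    if search_query:
        run_input["searchQuery"] = search_query
    if author:
        run_input["author"] = author
    if year_min is not None:
        run_input["yearMin"] = year_min
    if year_max is not None:
        run_input["yearMax"] = year_max
    if has_pdf:
        run_input["hasPdf"] = True
    if venues:
        run_input["venues"] = list(venues)
    return run_input


# Rebuilds the first example above: recent AI papers with PDFs available.
example = build_input("large language models", year_min=2024, has_pdf=True, max_items=100)
print(json.dumps(example, indent=4))
```

The printed JSON can be pasted into the Actor's input editor as is.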

📊 Output

Each paper record contains 20+ fields; the schema below lists the main ones. Download the dataset as CSV, Excel, JSON, or XML.

🧾 Schema

Field	Type	Example
📝 title	string	`"Attention Is All You Need"`
📄 abstract	string	`"We propose a new simple network..."`
👤 authors	array	`[{ name, authorId }]`
📊 citationCount	number	`95000`
📚 referenceCount	number	`38`
📅 year	number	`2017`
🏛️ venue	string	`"NeurIPS"`
🔗 doi	string	`"10.5555/3295222.3295349"`
📂 fieldsOfStudy	array	`["Computer Science"]`
📄 pdfUrl	string | null	`"https://arxiv.org/pdf/1706.03762"`
🔗 url	string	`"https://www.semanticscholar.org/paper/..."`
🕒 scrapedAt	string (ISO 8601)	`"2026-04-16T00:00:00.000Z"`
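
Once a dataset is downloaded as JSON, records with this schema can be processed with a few lines of Python. The sketch below assumes the export is a list of objects using the field names in the table. `papers_with_pdf` is a hypothetical helper, and the second sample record is made up for illustration.

```python
from typing import Any


def papers_with_pdf(records: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Return the records that have a pdfUrl, most cited first."""
    with_pdf = [r for r in records if r.get("pdfUrl")]
    # citationCount could be missing or null in a real export, so fall back to 0.
    return sorted(with_pdf, key=lambda r: r.get("citationCount") or 0, reverse=True)


# Illustrative records: the first reuses the schema examples, the second is invented.
sample = [
    {
        "title": "Attention Is All You Need",
        "citationCount": 95000,
        "year": 2017,
        "pdfUrl": "https://arxiv.org/pdf/1706.03762",
    },
    {
        "title": "Hypothetical paper without a PDF",
        "citationCount": 12,
        "year": 2024,
        "pdfUrl": None,
    },
]

for paper in papers_with_pdf(sample):
    print(paper["year"], paper["citationCount"], paper["title"])
```

Because `pdfUrl` is typed `string | null`, the filter treats both a null value and a missing key as "no PDF".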

✨ Why choose this Actor

	Capability
📚	200M+ papers indexed. Full Semantic Scholar database.
🔍	6 search filters. Keyword, author, year, venue, PDF, and URL.
📊	Citation and reference counts. Quantitative impact metrics.
📄	PDF links. Direct download URLs when available.
👤	Author profiles. Name and Semantic Scholar ID per author.
⚡	Scalable. From single-paper lookups to full topic sweeps.
🚫	No authentication. No API key needed.

📈 How it compares to alternatives

Approach	Cost	Coverage	Refresh	Filters	Setup
⭐ Semantic Scholar Scraper (this Actor)	$5 free credit, then pay-per-use	200M+ papers	Live per run	keyword, author, year, venue, PDF, URL	⚡ 2 min
Semantic Scholar API (direct)	Free with rate limits	Full	Real-time	Many	⏳ Hours (API setup)
Google Scholar	Free	Broad	Manual	Limited	🕒 Per search
Paid academic databases	$1,000-50,000/year	Multi-source	Varies	Many	🐢 Weeks

Pick this Actor when you want academic paper metadata on demand, with filters, without writing API client code.

🚀 How to use

📝 Sign up. Create a free account with $5 credit (takes 2 minutes).
🌐 Open the Actor. Go to the Semantic Scholar Scraper page on the Apify Store.
🎯 Set input. Enter a search query, author, or year range.
🚀 Run it. Click Start.
📥 Download. Grab results in the Dataset tab.

⏱️ Total time: 3-5 minutes. No coding required.

💼 Business use cases

📊 Literature Reviews & Bibliometrics

Build systematic review datasets
Analyze citation networks by topic
Track research trends over time
Compare venue impact by field

🏢 R&D & Industry Research

Monitor competitor publications
Track emerging technologies by keyword
Build prior-art search databases
Identify expert authors by citation count
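
As one illustration of the last item, a rough way to rank authors is to sum `citationCount` over the papers each author appears on in a scraped dataset. This is only a sketch: `rank_authors` is a hypothetical helper, the sample records are made up, and summing citations over whichever papers a run returned is a crude proxy for expertise, not an established metric.

```python
from collections import defaultdict
from typing import Any


def rank_authors(records: list[dict[str, Any]], top_n: int = 10) -> list[tuple[str, int]]:
    """Sum citationCount per author across records and return the top_n authors."""
    totals: dict[str, int] = defaultdict(int)
    names: dict[str, str] = {}
    for record in records:
        citations = record.get("citationCount") or 0
        for author in record.get("authors") or []:
            # Prefer the Semantic Scholar author ID; fall back to the name if it is absent.
            key = author.get("authorId") or author.get("name")
            if key is None:
                continue
            totals[key] += citations
            names[key] = author.get("name") or key
    # Highest total first; ties are broken alphabetically by name.
    ranked = sorted(totals.items(), key=lambda kv: (-kv[1], names[kv[0]]))
    return [(names[key], total) for key, total in ranked[:top_n]]


# Made-up records that follow the output schema.
records = [
    {"citationCount": 10, "authors": [{"name": "A. Author", "authorId": "1"},
                                      {"name": "B. Author", "authorId": "2"}]},
    {"citationCount": 5, "authors": [{"name": "A. Author", "authorId": "1"}]},
]

for name, total in rank_authors(records):
    print(f"{total:>6}  {name}")
```

Keying on `authorId` rather than on the name avoids merging different researchers who share a name.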

🌟 Beyond business use cases

Data like this powers more than commercial workflows. The same structured records support research, education, civic projects, and personal initiatives.

🎓 Research and academia

Empirical datasets for papers, thesis work, and coursework
Longitudinal studies tracking changes across snapshots
Reproducible research with cited, versioned data pulls
Classroom exercises on data analysis and ethical scraping

🎨 Personal and creative

Side projects, portfolio demos, and indie app launches
Data visualizations, dashboards, and infographics
Content research for bloggers, YouTubers, and podcasters
Hobbyist collections and personal trackers

🤝 Non-profit and civic

Transparency reporting and accountability projects
Advocacy campaigns backed by public-interest data
Community-run databases for local issues
Investigative journalism on public records

🧪 Experimentation

Prototype AI and machine-learning pipelines with real data
Validate product-market hypotheses before engineering spend
Train small domain-specific models on niche corpora
Test dashboard concepts with live input

🔌 Automating Semantic Scholar Scraper

🟢 Node.js. Install the apify-client NPM package.
🐍 Python. Use the apify-client PyPI package.
📚 See the Apify API documentation for full details.
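
A minimal Python sketch of an automated run follows. It assumes the `apify-client` package is installed and relies on its `ApifyClient(token)`, `.actor(...).call(run_input=...)`, and `.dataset(...).iterate_items()` calls as the author of this sketch understands that client, so check them against the Apify API documentation. The Actor ID is a placeholder to copy from the Actor's page, the `APIFY_TOKEN` environment variable name is an assumption, and `fetch_papers` and `main` are hypothetical helpers. Calling the Apify API requires an Apify API token.

```python
import os
from typing import Any

# Placeholder: copy the real ID from the Actor's page on the Apify Store.
ACTOR_ID = "<username>/<actor-name>"


def fetch_papers(client: Any, actor_id: str, run_input: dict[str, Any]) -> list[dict[str, Any]]:
    """Start the Actor, wait for the run to finish, and return its dataset items."""
    run = client.actor(actor_id).call(run_input=run_input)
    if run is None:
        raise RuntimeError("The Actor run did not return run details")
    # Each run stores its results in a default dataset.
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


def main() -> None:
    # Imported here so the helper above can be used without the package installed.
    from apify_client import ApifyClient  # pip install apify-client

    client = ApifyClient(os.environ["APIFY_TOKEN"])  # assumed variable name
    papers = fetch_papers(client, ACTOR_ID, {"author": "Yoshua Bengio", "maxItems": 50})
    for paper in papers:
        print(paper.get("year"), paper.get("title"))
```

Passing the client in as an argument keeps `fetch_papers` easy to test with a stub. Call `main()` once the token is set and the placeholder ID is replaced.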

🔌 Integrate with any app

Make - Automate workflows
Zapier - Connect 5,000+ apps
Slack - Get notifications
Airbyte - Data pipelines
GitHub - Trigger from commits
Google Drive - Export to Sheets

🔗 Recommended Actors

📚 Rate My Professors Scraper - Professor ratings
🏥 ClinicalTrials.gov Scraper - Clinical trial data
📰 PR Newswire Scraper - Press releases
📊 Indexmundi Scraper - Global indicators
🔗 Broken Link Checker - URL validation

💡 Pro Tip: browse the complete ParseForge collection for more research and data scrapers.

🆘 Need Help? Open our contact form to request a new scraper, propose a custom data project, or report an issue.

⚠️ Disclaimer: this Actor is an independent tool and is not affiliated with, endorsed by, or sponsored by Semantic Scholar or the Allen Institute for AI. All trademarks mentioned are the property of their respective owners. Only publicly available academic metadata is collected.
