Semantic Scholar Scraper
Pricing
Pay per event
Developer: Stas Persiianenko
Search and extract academic paper data from Semantic Scholar. Find research papers, analyze citations, and track references. No API key needed.
What does Semantic Scholar Scraper do?
Semantic Scholar Scraper extracts academic research data from Semantic Scholar's database of 200M+ papers. It has four modes:
- Search: Find papers by keyword, filter by year, field of study, and citation count
- Details: Get full metadata for specific papers by ID or DOI
- Citations: List all papers that cite a given paper (who cited this?)
- References: List all papers referenced by a given paper (bibliography)
Why use Semantic Scholar Scraper?
- No API key needed: Uses the public Semantic Scholar Academic Graph API
- 200M+ papers: Access one of the largest academic paper databases
- Citation analysis: Track who cites a paper and build citation graphs
- Open access detection: Find papers with free PDF access
- Rich metadata: Authors, abstracts, venues, DOIs, ArXiv IDs, fields of study
- Influential citations: Distinguish routine citations from influential ones
Use cases
- Literature reviews: Find all relevant papers on a topic with citation counts
- Research tracking: Monitor new publications in your field
- Citation analysis: Build citation networks and find influential papers
- Academic SEO: Track citation impact of your publications
- Competitive research: Monitor competitor institutions' publications
- Dataset building: Create structured datasets of academic literature for ML/NLP
Sample output
Paper data
| Field | Example |
|---|---|
| title | Attention Is All You Need |
| year | 2017 |
| citationCount | 120000 |
| authors | Ashish Vaswani, Noam Shazeer, ... |
| venue | NeurIPS |
| doi | 10.48550/arXiv.1706.03762 |
| fieldsOfStudy | Computer Science |
| isOpenAccess | true |
Pricing
| Event | Price |
|---|---|
| Start (per run) | $0.005 |
| Paper scraped | $0.001 |
Free plan estimate: ~200 papers per month on the Apify Free plan.
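As a rough illustration of how the two events in the pricing table add up, here is a minimal sketch. The function name `estimate_run_cost` is hypothetical, and it assumes the start fee and the per-paper fee are the only charges; any other platform costs are not modeled.

```python
from decimal import Decimal

# Prices copied from the pricing table above.
START_PRICE = Decimal("0.005")  # charged once per run
PAPER_PRICE = Decimal("0.001")  # charged per paper scraped


def estimate_run_cost(papers: int, runs: int = 1) -> Decimal:
    """Estimate the charge in USD for `runs` runs returning `papers` papers in total.

    Hypothetical helper: assumes only the two events in the table are billed.
    """
    if papers < 0 or runs < 1:
        raise ValueError("papers must be >= 0 and runs must be >= 1")
    return START_PRICE * runs + PAPER_PRICE * papers


print(estimate_run_cost(50))      # one run, 50 papers  -> 0.055
print(estimate_run_cost(200, 4))  # four runs, 200 papers in total -> 0.220
```

Because the start fee is charged per run, batching several search terms into one run should cost slightly less than running them separately.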
How to search academic papers
- Go to the Semantic Scholar Scraper page on Apify
- Select mode (search, details, citations, or references)
- Enter keywords or paper IDs
- Set filters (year, field of study, minimum citations)
- Click "Start" and download results as JSON, CSV, or Excel
Input parameters
| Parameter | Type | Description |
|---|---|---|
| mode | string | search, details, citations, or references |
| searchTerms | string[] | Keywords to search (search mode) |
| paperIds | string[] | Paper IDs, DOIs, or ArXiv IDs (details/citations/references mode) |
| year | string | Year filter (e.g. "2023", "2020-2024") |
| fieldsOfStudy | string[] | Research fields to filter by |
| openAccessOnly | boolean | Only papers with free PDF (default: false) |
| minCitations | number | Minimum citation count filter |
| maxResults | number | Max papers per query (default: 50) |
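The parameters in the table can be assembled into a run input like the sketch below. `build_input` is a hypothetical client-side helper, not part of the Actor. The check that search mode needs `searchTerms` and the other modes need `paperIds` is inferred from the table descriptions, so how the Actor itself validates input remains an assumption.

```python
from typing import Any, Dict

VALID_MODES = {"search", "details", "citations", "references"}


def build_input(mode: str, **options: Any) -> Dict[str, Any]:
    """Assemble a run input dict and check the mode-specific field (hypothetical helper)."""
    if mode not in VALID_MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    # Per the table: search mode takes searchTerms, the other modes take paperIds.
    required = "searchTerms" if mode == "search" else "paperIds"
    if not options.get(required):
        raise ValueError(f"mode {mode!r} needs a non-empty {required!r} list")
    return {"mode": mode, **options}


search_input = build_input(
    "search",
    searchTerms=["graph neural networks"],
    year="2020-2024",
    fieldsOfStudy=["Computer Science"],
    openAccessOnly=True,
    minCitations=25,
    maxResults=50,
)
print(search_input)
```

The resulting dict can be passed as `run_input` in the Python example under API usage.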
Output fields
type, paperId, title, year, citationCount, authors, authorIds, abstract, venue, publicationDate, doi, arxivId, url, pdfUrl, pdfLicense, fieldsOfStudy, isOpenAccess, influentialCitationCount, referenceCount, searchTerm, scrapedAt
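To show how these fields might be used after a run, the sketch below picks the most-cited open-access papers from a list of result items. The records are hand-written placeholders that reuse the field names above, not real scraper output, and `top_open_access` is a hypothetical helper.

```python
def top_open_access(items, limit=5):
    """Return up to `limit` open-access papers, most cited first."""
    open_access = [p for p in items if p.get("isOpenAccess")]
    # Treat a missing or null citationCount as 0 so sorting never fails.
    open_access.sort(key=lambda p: p.get("citationCount") or 0, reverse=True)
    return open_access[:limit]


# Placeholder records for illustration only.
sample = [
    {"title": "Paper A", "citationCount": 12, "isOpenAccess": True,
     "pdfUrl": "https://example.org/a.pdf"},
    {"title": "Paper B", "citationCount": 340, "isOpenAccess": False,
     "pdfUrl": None},
    {"title": "Paper C", "citationCount": 95, "isOpenAccess": True,
     "pdfUrl": "https://example.org/c.pdf"},
]

for paper in top_open_access(sample):
    print(paper["citationCount"], paper["title"], paper["pdfUrl"])
```

With the placeholder data this prints Paper C first, then Paper A; Paper B is skipped because it is not open access.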
Tips
- Paper IDs: You can use Semantic Scholar IDs (40-char hex), DOIs (prefix with DOI:), ArXiv IDs (prefix with ARXIV:), or Corpus IDs (prefix with CorpusId:).
- Year ranges: Use 2020-2024 for a range, or 2023- for 2023 onwards.
- Fields of study: Common values: Computer Science, Medicine, Physics, Biology, Chemistry, Mathematics, Economics, Psychology, Engineering.
- Influential citations: The influentialCitationCount field counts citations that meaningfully build on the work (not just routine mentions).
- Rate limits: Semantic Scholar allows ~1 request/second without an API key. For higher throughput, request a free key at semanticscholar.org/product/api.
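The ID prefixes and year formats from the tips can be checked before starting a run. The sketch below is a hypothetical pre-processing step. The regular expressions are simplified assumptions about what DOIs and new-style ArXiv IDs look like, and the prefixes are matched exactly as written in the tips.

```python
import re

S2_ID = re.compile(r"[0-9a-fA-F]{40}")          # Semantic Scholar ID: 40-char hex
DOI = re.compile(r"10\.\d{4,9}/\S+")            # simplified DOI shape (assumption)
ARXIV = re.compile(r"\d{4}\.\d{4,5}(v\d+)?")    # new-style ArXiv ID (assumption)
YEAR_FILTER = re.compile(r"\d{4}(-(\d{4})?)?")  # "2023", "2020-2024", or "2023-"
PREFIXES = ("DOI:", "ARXIV:", "CorpusId:")


def normalize_paper_id(raw: str) -> str:
    """Add the DOI: or ARXIV: prefix when the ID shape is recognizable."""
    value = raw.strip()
    if value.startswith(PREFIXES) or S2_ID.fullmatch(value):
        return value  # already prefixed, or a bare Semantic Scholar ID
    if DOI.fullmatch(value):
        return "DOI:" + value
    if ARXIV.fullmatch(value):
        return "ARXIV:" + value
    raise ValueError(f"unrecognized paper ID: {raw!r}")


def is_valid_year_filter(value: str) -> bool:
    """Check that a year filter matches one of the formats listed in the tips."""
    return YEAR_FILTER.fullmatch(value) is not None


print(normalize_paper_id("1706.03762"))                 # ARXIV:1706.03762
print(normalize_paper_id("10.48550/arXiv.1706.03762"))  # DOI:10.48550/arXiv.1706.03762
print(is_valid_year_filter("2023-"))                    # True
```

Corpus IDs are left alone unless they already carry the CorpusId: prefix, since a bare number is ambiguous.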
Integrations
Export paper data to Google Sheets, Slack, Zapier, Make, or any webhook. Connect via the Apify API for automated research monitoring. Schedule weekly runs to track new publications.
API usage
Node.js
```javascript
import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: 'YOUR_TOKEN' });

// Start the Actor and wait for the run to finish.
const run = await client.actor('automation-lab/semantic-scholar-scraper').call({
    mode: 'search',
    searchTerms: ['transformer neural network'],
    year: '2023-',
    minCitations: 100,
    maxResults: 50,
});

// Read the scraped papers from the run's default dataset.
const { items } = await client.dataset(run.defaultDatasetId).listItems();
console.log(items);
```
Python
```python
from apify_client import ApifyClient

client = ApifyClient('YOUR_TOKEN')

# Start the Actor in citations mode and wait for the run to finish.
run = client.actor('automation-lab/semantic-scholar-scraper').call(run_input={
    'mode': 'citations',
    'paperIds': ['649def34f8be52c8b66281af98ae884c09aef38b'],
    'maxResults': 100,
})

# Read the scraped papers from the run's default dataset.
items = client.dataset(run['defaultDatasetId']).list_items().items
print(items)
```
cURL
```bash
curl "https://api.apify.com/v2/acts/automation-lab~semantic-scholar-scraper/runs" \
  -X POST \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -d '{"mode": "search", "searchTerms": ["CRISPR gene editing"], "maxResults": 50}'
```
Legality
Semantic Scholar Scraper accesses publicly available data through the official Semantic Scholar Academic Graph API. This API is provided by the Allen Institute for AI (AI2) and is designed for programmatic access to academic paper metadata. All data is derived from public academic publications.
FAQ
Q: Do I need an API key?
A: No. The Semantic Scholar API works without authentication, with a rate limit of ~1 request/second. For higher throughput, you can get a free API key from Semantic Scholar.
Q: How many papers are in the database?
A: Over 200 million papers from all fields of science, indexed from major publishers, ArXiv, PubMed, and other sources.
Q: Can I search by author?
A: The search mode uses keyword matching on titles and abstracts. For author-specific searches, find a paper by that author first, then use the details or citations mode.
Q: What's the difference between citations and references?
A: Citations are papers that cite the target paper (who cited this?). References are papers that the target paper cites (its bibliography).
Q: Does this include full paper text?
A: No. The API provides metadata (title, abstract, authors, etc.) and the pdfUrl field links to open-access PDFs when available.
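The citations versus references distinction from the FAQ comes down to edge direction in a citation graph. The sketch below turns result items into (citing, cited) pairs. `citation_edges` is a hypothetical helper, the IDs are placeholders, and it assumes you pass in the target paper ID you queried rather than reading it from the output.

```python
def citation_edges(target_id, items, mode):
    """Return (citing_id, cited_id) pairs for results about `target_id`."""
    # Skip records without a paperId so every edge has two endpoints.
    ids = [p["paperId"] for p in items if p.get("paperId")]
    if mode == "citations":
        # Each result cites the target paper.
        return [(paper_id, target_id) for paper_id in ids]
    if mode == "references":
        # The target paper cites each result.
        return [(target_id, paper_id) for paper_id in ids]
    raise ValueError("mode must be 'citations' or 'references'")


# Placeholder IDs for illustration only.
results = [{"paperId": "paper-a"}, {"paperId": "paper-b"}, {"paperId": None}]
print(citation_edges("target", results, "citations"))
print(citation_edges("target", results, "references"))
```

Combining the edge lists from both modes for several target papers gives a directed graph that any graph library can consume.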
Related scrapers
- ArXiv Scraper: Search and extract preprints from ArXiv
- CrossRef Scraper: Extract DOI metadata and citation data
- OpenAlex Scraper: Academic paper data from OpenAlex