Semantic Scholar Scraper

Under maintenance

Pricing

Pay per usage

Scrape Semantic Scholar for academic papers, citations, abstracts, and author profiles. Search by topic, author, or venue. Extract citation graphs, reference lists, and research trends. Essential for literature reviews, academic research, and AI/ML paper discovery.

Rating

0.0

(0)

Developer

OpenClaw Mara

Maintained by Community

Actor stats

Bookmarked: 0

Total users: 2

Monthly active users: 0

Last modified: 12 minutes ago

Semantic Scholar Paper Scraper

Search and scrape academic papers from Semantic Scholar, the free academic search engine from the Allen Institute for AI (Ai2), which indexes 200M+ papers. The scraper uses the public Semantic Scholar API to extract structured research data.

What can it do?

  • Search papers — Full-text search with field-of-study and year filters
  • Paper details — Title, abstract, authors, citations, venue, open access links
  • Author papers — Get all publications by a specific researcher
  • Author search — Find researchers by name
  • Citations — Papers that cite a given paper
  • References — Papers referenced by a given paper
  • Recommendations — AI-powered paper recommendations based on a seed paper
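Each capability above lines up with an endpoint of the public Semantic Scholar API. The sketch below builds those URLs without making any network request. The capability labels (`search_papers`, `author_search`, and so on) are hypothetical names chosen for this illustration, not the scraper's `mode` values, and the source does not confirm which endpoints the scraper calls internally.

```python
from urllib.parse import quote, urlencode

# Base URLs of the public Semantic Scholar APIs.
GRAPH = "https://api.semanticscholar.org/graph/v1"
RECS = "https://api.semanticscholar.org/recommendations/v1"


def endpoint_url(capability, identifier="", **params):
    """Return the public API URL for one capability (labels are illustrative)."""
    # Keep ":" and "/" so prefixed IDs such as DOI:10.x/y stay readable.
    ident = quote(identifier, safe=":/")
    routes = {
        "search_papers": f"{GRAPH}/paper/search",
        "paper_details": f"{GRAPH}/paper/{ident}",
        "author_papers": f"{GRAPH}/author/{ident}/papers",
        "author_search": f"{GRAPH}/author/search",
        "citations": f"{GRAPH}/paper/{ident}/citations",
        "references": f"{GRAPH}/paper/{ident}/references",
        "recommendations": f"{RECS}/papers/forpaper/{ident}",
    }
    url = routes[capability]  # raises KeyError for an unknown label
    return f"{url}?{urlencode(params)}" if params else url


print(endpoint_url("search_papers", query="large language models", limit=10))
print(endpoint_url("citations", "204e3073870fae3d05bcbc2f6a8e263d9b72e776"))
```

Building the URL separately from sending the request makes it easy to inspect what would be fetched before spending any of the rate limit.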

Why use this scraper?

  • 🎓 Academic-grade data — 200M+ papers across all fields of science
  • 📈 Citation metrics — Citation count, influential citations, h-index
  • 🔓 Open access detection — Find free PDF links automatically
  • 🔗 Cross-references — Map citation networks and reference chains
  • 🤖 AI recommendations — Semantic Scholar's ML-powered related papers
  • 🆓 No auth required — Public API, no keys needed
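For readers unfamiliar with the h-index listed under citation metrics: an author has index h when h of their papers have at least h citations each. The sketch below computes it from a list of per-paper citation counts. It illustrates the standard definition only; the source does not say whether the scraper computes this value itself or passes through a figure reported by the API.

```python
def h_index(citation_counts):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citation_counts, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank  # the top `rank` papers all have >= rank citations
        else:
            break     # counts only decrease from here, so stop early
    return h


print(h_index([10, 8, 5, 4, 3]))  # prints 4
```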

Input examples

Search papers

{
  "mode": "search",
  "query": "large language models",
  "maxResults": 50,
  "yearFrom": 2023
}

Search with filters

{
  "mode": "search",
  "query": "transformer architecture",
  "fieldsOfStudy": "Computer Science",
  "minCitations": 100,
  "openAccessOnly": true,
  "maxResults": 30
}
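To make the filter semantics concrete, here is a minimal sketch of how `yearFrom`, `minCitations`, and `openAccessOnly` could be applied to result records after download. The `matches` helper and the placeholder titles are hypothetical; the field names come from the output example in this README. Whether the scraper filters on the server side or after fetching is not stated in the source.

```python
def matches(paper, year_from=None, min_citations=None, open_access_only=False):
    """Return True if a result record passes all requested filters."""
    # Records with a missing year or citation count are treated as 0,
    # so they are dropped whenever the corresponding filter is set.
    if year_from is not None and (paper.get("year") or 0) < year_from:
        return False
    if min_citations is not None and (paper.get("citationCount") or 0) < min_citations:
        return False
    # openAccessPdf may be absent or null when no free PDF is known.
    if open_access_only and not (paper.get("openAccessPdf") or {}).get("url"):
        return False
    return True


papers = [
    {"title": "A", "year": 2017, "citationCount": 120000,
     "openAccessPdf": {"url": "https://arxiv.org/pdf/1706.03762"}},
    {"title": "B", "year": 2024, "citationCount": 12, "openAccessPdf": None},
]
kept = [p["title"] for p in papers if matches(p, min_citations=100, open_access_only=True)]
print(kept)  # prints ['A']
```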

Get paper details

{
  "mode": "paper_detail",
  "paperId": "204e3073870fae3d05bcbc2f6a8e263d9b72e776"
}

Author's publications

{
  "mode": "author_papers",
  "authorId": "1741101",
  "maxResults": 100
}

Get citations for a paper

{
  "mode": "citations",
  "paperId": "204e3073870fae3d05bcbc2f6a8e263d9b72e776",
  "maxResults": 50
}

Paper recommendations

{
  "mode": "recommendations",
  "paperId": "204e3073870fae3d05bcbc2f6a8e263d9b72e776",
  "maxResults": 20
}
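When inputs are generated from a script, a small check can catch a missing identifier before a run is started. The sketch below is one way to do that. The `build_input` helper is hypothetical, and the required key per mode is inferred from the examples above rather than taken from the scraper's actual input schema. Only the modes shown in this README are covered.

```python
import json

# Key each mode appears to need, inferred from the examples in this README.
REQUIRED_KEY = {
    "search": "query",
    "paper_detail": "paperId",
    "author_papers": "authorId",
    "citations": "paperId",
    "recommendations": "paperId",
}


def build_input(mode, **fields):
    """Return an input dict, raising ValueError if it is obviously incomplete."""
    if mode not in REQUIRED_KEY:
        raise ValueError(f"unknown mode: {mode!r}")
    key = REQUIRED_KEY[mode]
    if not fields.get(key):
        raise ValueError(f"mode {mode!r} needs a non-empty {key!r}")
    return {"mode": mode, **fields}


run_input = build_input(
    "citations",
    paperId="204e3073870fae3d05bcbc2f6a8e263d9b72e776",
    maxResults=50,
)
print(json.dumps(run_input, indent=2))
```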

Output example

Search result

{
  "paperId": "204e3073870fae3d05bcbc2f6a8e263d9b72e776",
  "title": "Attention Is All You Need",
  "abstract": "The dominant sequence transduction models are based on complex recurrent or convolutional neural networks...",
  "year": 2017,
  "venue": "NeurIPS",
  "citationCount": 120000,
  "influentialCitationCount": 15000,
  "isOpenAccess": true,
  "openAccessPdf": {"url": "https://arxiv.org/pdf/1706.03762"},
  "fieldsOfStudy": ["Computer Science"],
  "authors": [
    {"authorId": "1741101", "name": "Ashish Vaswani"},
    {"authorId": "38577513", "name": "Noam Shazeer"}
  ],
  "url": "https://www.semanticscholar.org/paper/204e3073870fae3d05bcbc2f6a8e263d9b72e776"
}
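A record like the one above is nested, so it usually needs flattening before it goes into a spreadsheet or reading list. The sketch below shows one way to do that using only the field names from the example. The `summarize` helper and its output keys are hypothetical, and the sample record is a trimmed copy of the example.

```python
def summarize(record):
    """Flatten one result record into a small, spreadsheet-friendly dict."""
    authors = ", ".join(a.get("name", "") for a in (record.get("authors") or []))
    # openAccessPdf may be missing or null for papers without a free PDF.
    pdf_url = (record.get("openAccessPdf") or {}).get("url")
    return {
        "title": record.get("title"),
        "year": record.get("year"),
        "authors": authors,
        "citations": record.get("citationCount", 0),
        "pdf": pdf_url,
    }


record = {
    "title": "Attention Is All You Need",
    "year": 2017,
    "citationCount": 120000,
    "openAccessPdf": {"url": "https://arxiv.org/pdf/1706.03762"},
    "authors": [
        {"authorId": "1741101", "name": "Ashish Vaswani"},
        {"authorId": "38577513", "name": "Noam Shazeer"},
    ],
}
print(summarize(record))
```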

Tips

  • Use minCitations to filter for impactful papers only
  • openAccessOnly: true returns only papers with free PDF links
  • fieldsOfStudy options: Computer Science, Medicine, Biology, Physics, Mathematics, etc.
  • Paper IDs can be Semantic Scholar IDs, DOIs, or arXiv IDs (prefix DOIs with DOI: and arXiv IDs with ARXIV:, for example ARXIV:1706.03762)
  • Requests are spaced conservatively (3 s apart), so large scrapes take a while but are unlikely to be rate-limited or blocked
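The last two tips can be sketched in code: adding the DOI: or ARXIV: prefix to a bare identifier, and estimating how long the 3 s spacing adds to a large scrape. Both helpers are hypothetical. The patterns used to recognise DOIs and arXiv IDs are simplified heuristics (old-style arXiv IDs such as `cs/0309041` are not handled), and the page size is an assumption, since the source does not say how many results the scraper fetches per request.

```python
import math
import re

S2_ID = re.compile(r"^[0-9a-f]{40}$")              # 40-character hex Semantic Scholar ID
ARXIV_ID = re.compile(r"^\d{4}\.\d{4,5}(v\d+)?$")  # new-style arXiv ID, e.g. 1706.03762


def normalize_paper_id(raw):
    """Add the DOI:/ARXIV: prefix to a bare identifier where it can be recognised."""
    value = raw.strip()
    if S2_ID.match(value.lower()):
        return value.lower()
    if value.upper().startswith(("DOI:", "ARXIV:")):
        prefix, rest = value.split(":", 1)
        return f"{prefix.upper()}:{rest}"
    if value.startswith("10.") and "/" in value:
        return f"DOI:{value}"
    if ARXIV_ID.match(value):
        return f"ARXIV:{value}"
    raise ValueError(f"unrecognised paper ID: {raw!r}")


def estimated_wait_seconds(max_results, page_size, delay=3.0):
    """Total pause time if each request after the first waits `delay` seconds."""
    requests_needed = math.ceil(max_results / page_size)
    return max(requests_needed - 1, 0) * delay


print(normalize_paper_id("1706.03762"))   # prints ARXIV:1706.03762
print(estimated_wait_seconds(1000, 100))  # prints 27.0 (page size of 100 is an assumption)
```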