Semantic Scholar Scraper

Pricing

Pay per usage

Rating

0.0

(0)

Developer

Donny Nguyen

Maintained by Community

Actor stats

  • Bookmarked: 0
  • Total users: 2
  • Monthly active users: 1
  • Last modified: 3 days ago

Overview

Semantic Scholar Scraper extracts academic paper data from Semantic Scholar, the AI-powered research tool from the Allen Institute for AI. It collects paper titles, authors, publication years, citation counts, abstracts, venues, fields of study, and open-access PDF links. The actor uses the public Semantic Scholar API to return structured records and can filter results by minimum citation count, which is useful for research impact analysis and literature discovery.

Features

  • Search Semantic Scholar by multiple research topics simultaneously
  • Filter papers by minimum citation count for high-impact research
  • Extract comprehensive paper metadata including citation counts
  • Access open-access PDF links when available
  • Collect fields of study for cross-disciplinary analysis
  • Automatic pagination for large result sets
  • Built-in rate limiting to comply with API usage policies
  • Fallback data so that results are still delivered if the API is unavailable

Input Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| searchTerms | array | ["attention mechanism", "reinforcement learning"] | Research topics to search |
| maxResults | integer | 200 | Maximum number of papers to extract |
| minCitations | integer | 10 | Minimum citation count filter |
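
As an illustration of how these parameters fit together, the sketch below assembles a run input as a Python dictionary. The parameter names and defaults come from the table above; the helper `build_input` and its validation rules are hypothetical and not part of the actor.

```python
import json


def build_input(search_terms, max_results=200, min_citations=10):
    """Assemble a run-input dict using the parameter names from the table.

    The validation rules here are assumptions for illustration only;
    the actor may accept or reject other values.
    """
    # Drop empty or whitespace-only topics.
    terms = [t.strip() for t in search_terms if t and t.strip()]
    if not terms:
        raise ValueError("searchTerms needs at least one non-empty topic")
    if max_results < 1:
        raise ValueError("maxResults must be a positive integer")
    if min_citations < 0:
        raise ValueError("minCitations cannot be negative")
    return {
        "searchTerms": terms,
        "maxResults": int(max_results),
        "minCitations": int(min_citations),
    }


run_input = build_input(["attention mechanism", "reinforcement learning"])
print(json.dumps(run_input, indent=2))
```

The resulting JSON can be pasted into the actor's input editor or passed to whichever client is used to start the run.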

Output Format

Each paper in the dataset includes:

  • title - Paper title
  • authors - Comma-separated author list
  • year - Publication year
  • citationCount - Total number of citations
  • abstract - Paper abstract
  • venue - Publication venue or conference
  • fieldsOfStudy - Academic fields of study
  • pdfUrl - Open-access PDF link (if available)
  • paperId - Semantic Scholar paper identifier
  • semanticScholarUrl - Direct link to Semantic Scholar page
  • doi - Digital Object Identifier
  • searchTerm - Search term that found this paper
  • scrapedAt - Timestamp of data extraction
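
A minimal sketch of post-processing these records in Python follows. It assumes each dataset item is a dictionary keyed by the field names listed above; the helper names and the two sample records are placeholders invented for illustration, not real scraper output.

```python
def split_authors(record):
    """Turn the comma-separated `authors` string into a list of names.

    Assumes individual names contain no commas of their own.
    """
    raw = record.get("authors") or ""
    return [name.strip() for name in raw.split(",") if name.strip()]


def top_cited(records, n=5):
    """Return the n records with the highest citationCount."""
    return sorted(records, key=lambda r: r.get("citationCount") or 0, reverse=True)[:n]


def with_pdf(records):
    """Keep only records that carry an open-access PDF link."""
    return [r for r in records if r.get("pdfUrl")]


# Placeholder records using the documented field names (values are made up).
sample = [
    {"title": "Placeholder paper A", "authors": "A. Author, B. Author",
     "citationCount": 120, "pdfUrl": None},
    {"title": "Placeholder paper B", "authors": "C. Author",
     "citationCount": 450, "pdfUrl": "https://example.org/b.pdf"},
]

for paper in top_cited(sample, n=2):
    print(paper["title"], split_authors(paper), paper["citationCount"])
```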

Use Cases

This scraper is designed for:

  • Researchers conducting literature reviews with citation-based filtering
  • Research managers assessing publication impact across teams
  • Academic institutions tracking research output and influence
  • AI practitioners finding seminal papers in their field
  • Venture capital firms evaluating research commercialization potential
  • Science journalists identifying breakthrough publications

Filtering by citation count makes it quicker to identify the most influential papers in a research domain.

Pricing

This actor uses pay-per-event pricing at $0.30 per 1,000 papers scraped. Because the Semantic Scholar API is free, operational costs stay low. No subscription or platform fees are required. Citation filtering reduces the number of papers extracted, which lowers the cost of a run.
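
To make the arithmetic concrete, here is a rough cost estimate in Python. It assumes the charge scales linearly with the number of papers at the stated $0.30 per 1,000; actual billing granularity may differ, and `estimate_cost` is a hypothetical helper.

```python
from decimal import Decimal

# Rate taken from the pricing statement above.
PRICE_PER_1000_PAPERS = Decimal("0.30")


def estimate_cost(papers):
    """Estimated charge in USD, assuming strictly linear per-paper pricing."""
    if papers < 0:
        raise ValueError("papers cannot be negative")
    return PRICE_PER_1000_PAPERS * Decimal(papers) / Decimal(1000)


# A run at the default maxResults of 200 papers:
print(f"${estimate_cost(200)}")
```

Under this assumption, a run capped at the default `maxResults` of 200 would cost about $0.06.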

Limitations

  • The Semantic Scholar API is rate limited (100 requests per 5 minutes)
  • Open-access PDF links are not available for all papers
  • Citation counts may differ from other sources like Google Scholar
  • Some niche research areas may have limited coverage
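
For readers calling the API directly, the rate limit above can be respected with a sliding-window limiter. The sketch below is one possible approach, not the actor's actual implementation; the class name and its parameters are hypothetical, and the 100-per-300-seconds defaults are taken from the first limitation.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `max_requests` calls within any `window_seconds` span."""

    def __init__(self, max_requests=100, window_seconds=300.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.max_requests = max_requests
        self.window = window_seconds
        self._clock = clock      # injectable so the limiter can be tested
        self._sleep = sleep
        self._stamps = deque()   # timestamps of recent requests, oldest first

    def wait_time(self):
        """Seconds to wait before the next request is allowed (0.0 if none)."""
        now = self._clock()
        # Forget requests that have left the window.
        while self._stamps and now - self._stamps[0] >= self.window:
            self._stamps.popleft()
        if len(self._stamps) < self.max_requests:
            return 0.0
        return self.window - (now - self._stamps[0])

    def acquire(self):
        """Block until a request slot is free, then record the request."""
        while (delay := self.wait_time()) > 0:
            self._sleep(delay)
        self._stamps.append(self._clock())


limiter = SlidingWindowLimiter()
limiter.acquire()  # call once before each API request
```

At 100 requests per 300 seconds, the sustained rate works out to one request every 3 seconds on average, although this limiter permits short bursts as long as the window total stays within the cap.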

Built by consummate_mandala on Apify.