Semantic Scholar Scraper
Pricing
Pay per usage
Developer

Donny Nguyen
3 days ago
Last modified
Overview
Semantic Scholar Scraper extracts academic paper data from Semantic Scholar, the AI-powered research tool from the Allen Institute for AI. It collects paper titles, authors, publication years, citation counts, abstracts, venues, fields of study, and open-access PDF links. The actor uses the public Semantic Scholar API to return structured records and can filter results by minimum citation count, which is useful for research impact analysis and literature discovery.
Features
- Search Semantic Scholar by multiple research topics simultaneously
- Filter papers by minimum citation count for high-impact research
- Extract comprehensive paper metadata including citation counts
- Access open-access PDF links when available
- Collect fields of study for cross-disciplinary analysis
- Automatic pagination for large result sets
- Built-in rate limiting to comply with API usage policies
- Fallback data so that a run still delivers results if the API is unavailable
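The pagination and citation filtering listed above can be sketched as follows. This is a minimal illustration, not the actor's actual code: `iter_papers` and `fetch_page` are hypothetical names, the page shape (a `data` list plus a `next` offset) is an assumption modelled on the public Semantic Scholar search response, and treating `maxResults` as a total across all search terms is also an assumption.

```python
from typing import Any, Callable, Dict, Iterator, List

# Hypothetical page fetcher: (search term, offset, limit) -> one page of results.
# Assumed page shape: {"data": [paper, ...], "next": next_offset_or_None}.
FetchPage = Callable[[str, int, int], Dict[str, Any]]


def iter_papers(
    fetch_page: FetchPage,
    search_terms: List[str],
    max_results: int,
    min_citations: int,
    page_size: int = 100,
) -> Iterator[Dict[str, Any]]:
    """Yield papers that pass the citation filter, across all search terms."""
    emitted = 0
    for term in search_terms:
        offset = 0
        while emitted < max_results:
            page = fetch_page(term, offset, page_size)
            papers = page.get("data") or []
            if not papers:
                break  # no more results for this term
            for paper in papers:
                # A missing citation count is treated as zero.
                if (paper.get("citationCount") or 0) >= min_citations:
                    yield {**paper, "searchTerm": term}
                    emitted += 1
                    if emitted >= max_results:
                        return
            next_offset = page.get("next")
            if next_offset is None:
                break  # last page for this term
            offset = next_offset
```

Passing the fetcher in as a callable keeps the loop independent of any HTTP library, so it can be exercised with an in-memory stand-in before pointing it at a real endpoint.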
Input Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| searchTerms | array | ["attention mechanism", "reinforcement learning"] | Research topics to search |
| maxResults | integer | 200 | Maximum number of papers to extract |
| minCitations | integer | 10 | Minimum citation count filter |
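As a rough illustration of the table above, the snippet below builds an input object from the documented defaults and overrides selected fields. `build_input` is a hypothetical helper, and its validation rules (non-empty term list, non-negative integers) are assumptions rather than the actor's documented behaviour.

```python
import json
from typing import Any, Dict

# Defaults copied from the parameter table.
DEFAULT_INPUT: Dict[str, Any] = {
    "searchTerms": ["attention mechanism", "reinforcement learning"],
    "maxResults": 200,
    "minCitations": 10,
}


def build_input(**overrides: Any) -> Dict[str, Any]:
    """Merge overrides into the defaults and apply basic sanity checks."""
    unknown = set(overrides) - set(DEFAULT_INPUT)
    if unknown:
        raise ValueError(f"Unknown parameters: {sorted(unknown)}")
    run_input = {**DEFAULT_INPUT, **overrides}
    terms = run_input["searchTerms"]
    if not isinstance(terms, list) or not terms:
        raise ValueError("searchTerms must be a non-empty list of strings")
    for key in ("maxResults", "minCitations"):
        value = run_input[key]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, int) or value < 0:
            raise ValueError(f"{key} must be a non-negative integer")
    return run_input


# Example: one custom topic with a stricter citation threshold.
print(json.dumps(build_input(searchTerms=["graph neural networks"], minCitations=50), indent=2))
```

The resulting JSON can be pasted into the actor's input editor or passed as the run input when starting the actor programmatically.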
Output Format
Each paper in the dataset includes:
- title - Paper title
- authors - Comma-separated author list
- year - Publication year
- citationCount - Total number of citations
- abstract - Paper abstract
- venue - Publication venue or conference
- fieldsOfStudy - Academic fields of study
- pdfUrl - Open-access PDF link (if available)
- paperId - Semantic Scholar paper identifier
- semanticScholarUrl - Direct link to Semantic Scholar page
- doi - Digital Object Identifier
- searchTerm - Search term that found this paper
- scrapedAt - Timestamp of data extraction
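To make the record shape concrete, here is a sketch of how a paper object might be mapped onto the output fields listed above. The function `to_record` is hypothetical, and the input keys (`authors`, `openAccessPdf`, `externalIds`, `url`) are assumed to follow the public Semantic Scholar API response; the actor's real mapping, including whether `fieldsOfStudy` is a list or a string, is not shown in this document.

```python
from datetime import datetime, timezone
from typing import Any, Dict


def to_record(paper: Dict[str, Any], search_term: str) -> Dict[str, Any]:
    """Flatten one API paper object into the documented output fields."""
    # Authors arrive as a list of objects; the output is a comma-separated string.
    authors = ", ".join(a.get("name", "") for a in paper.get("authors") or [])
    pdf = paper.get("openAccessPdf") or {}  # absent when no open-access PDF exists
    external_ids = paper.get("externalIds") or {}
    return {
        "title": paper.get("title"),
        "authors": authors,
        "year": paper.get("year"),
        "citationCount": paper.get("citationCount"),
        "abstract": paper.get("abstract"),
        "venue": paper.get("venue"),
        "fieldsOfStudy": paper.get("fieldsOfStudy") or [],
        "pdfUrl": pdf.get("url"),
        "paperId": paper.get("paperId"),
        "semanticScholarUrl": paper.get("url"),
        "doi": external_ids.get("DOI"),
        "searchTerm": search_term,
        "scrapedAt": datetime.now(timezone.utc).isoformat(),
    }
```

Fields that the source does not provide come through as `None` (or an empty list for `fieldsOfStudy`), which matches the note that PDF links are only present when available.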
Use Cases
This scraper is designed for:
- Researchers conducting literature reviews with citation-based filtering
- Research managers assessing publication impact across teams
- Academic institutions tracking research output and influence
- AI practitioners finding seminal papers in their field
- Venture capital firms evaluating research commercialization potential
- Science journalists identifying breakthrough publications

Filtering by citation count makes it quick to identify the most influential papers in a research domain.
Pricing
This actor uses pay-per-event pricing at $0.30 per 1,000 papers scraped. Because the Semantic Scholar API is free, operational costs stay low, and there are no subscription or platform fees. The citation filter also reduces cost by skipping papers that fall below your threshold.
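The per-run cost follows directly from the stated rate. The helper below is a small sketch (`estimate_cost_usd` is a hypothetical name) that assumes the charge scales linearly with the number of papers and ignores any rounding or minimum charge the platform may apply.

```python
from decimal import Decimal

# Rate stated in the pricing section: $0.30 per 1,000 papers.
PRICE_PER_1000_PAPERS_USD = Decimal("0.30")


def estimate_cost_usd(papers: int) -> Decimal:
    """Estimate the charge for a run that scrapes the given number of papers."""
    if papers < 0:
        raise ValueError("papers must be non-negative")
    return PRICE_PER_1000_PAPERS_USD * papers / 1000
```

With the default `maxResults` of 200, a full run works out to about $0.06.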
Limitations
- Semantic Scholar API has rate limits (100 requests per 5 minutes)
- Open-access PDF links are not available for all papers
- Citation counts may differ from other sources like Google Scholar
- Some niche research areas may have limited coverage
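The rate limit above (100 requests per 5 minutes) works out to one request every 3 seconds if requests are spaced evenly. The class below is a minimal sketch of that idea, not the actor's built-in limiter: `Throttle` is a hypothetical name, and even spacing is an assumed strategy that is stricter than the limit requires.

```python
import time
from typing import Callable, Optional


class Throttle:
    """Space calls evenly so that at most max_requests happen per per_seconds."""

    def __init__(
        self,
        max_requests: int = 100,
        per_seconds: float = 300.0,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.min_interval = per_seconds / max_requests  # 3.0 s for 100 per 300 s
        self._clock = clock
        self._sleep = sleep
        self._last: Optional[float] = None  # time the previous request was released

    def wait(self) -> float:
        """Block until the next request is allowed; return the seconds slept."""
        now = self._clock()
        delay = 0.0
        if self._last is not None:
            delay = max(0.0, self._last + self.min_interval - now)
            if delay > 0:
                self._sleep(delay)
        self._last = now + delay
        return delay
```

Calling `wait()` immediately before each API request keeps a client under the limit. The clock and sleep functions are injectable so the timing logic can be checked without real delays.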
Built by consummate_mandala on Apify.