Google Scholar Scraper
Pricing
from $5.00 / 1,000 results
Scrape academic papers, citations, and author profiles from Google Scholar
Developer: codingfrontend (Maintained by Community)
Features
- Academic Papers: Extracts research papers and academic articles from Google Scholar search results
- Citation Count: Captures how many times each paper has been cited by other works
- Author Information: Records the names of all authors for each paper
- Publication Venue: Extracts the journal or conference where the paper was published
- Publication Year: Captures the year each paper was published
- PDF Links: Collects direct PDF links when available for open-access papers
- Text Snippets: Retrieves descriptive text snippets shown in Google Scholar results
- Date Range Filtering: Filters papers by publication year range (yearFrom and yearTo)
- Sort Options: Sorts results by relevance or publication date
- Proxy Support: Uses Apify Proxy with residential proxies by default to reduce the risk of Google Scholar rate limiting
Input Parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| query | String | Yes | "machine learning" | Academic search query to look up on Google Scholar |
| maxItems | Integer | No | 50 | Maximum number of results to retrieve (1–10000) |
| yearFrom | Integer | No | — | Filter results published from this year onwards (e.g., 2020) |
| yearTo | Integer | No | — | Filter results published up to this year (e.g., 2024) |
| sortBy | String | No | "relevance" | Sort results by: relevance or date |
| proxyConfiguration | Object | No | Apify Residential | Proxy settings for the scraper |
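The constraints in the table above can be checked locally before starting a run. The following is a minimal sketch, not part of the Actor itself: the helper name `build_input` is hypothetical, and the check that `yearFrom` does not exceed `yearTo` is an assumption that the table does not state. `proxyConfiguration` is left out so that the documented default applies.

```python
from typing import Any, Dict, Optional

# Allowed values taken from the parameter table.
ALLOWED_SORT = {"relevance", "date"}


def build_input(
    query: str,
    max_items: int = 50,
    year_from: Optional[int] = None,
    year_to: Optional[int] = None,
    sort_by: str = "relevance",
) -> Dict[str, Any]:
    """Build an input object for the scraper (hypothetical helper)."""
    if not query or not query.strip():
        raise ValueError("query is required")
    if not 1 <= max_items <= 10000:
        raise ValueError("maxItems must be between 1 and 10000")
    if sort_by not in ALLOWED_SORT:
        raise ValueError("sortBy must be 'relevance' or 'date'")
    # Assumption: an inverted year range is treated as a caller error.
    if year_from is not None and year_to is not None and year_from > year_to:
        raise ValueError("yearFrom must not be greater than yearTo")

    run_input: Dict[str, Any] = {
        "query": query.strip(),
        "maxItems": max_items,
        "sortBy": sort_by,
    }
    # Optional filters are only sent when the caller sets them.
    if year_from is not None:
        run_input["yearFrom"] = year_from
    if year_to is not None:
        run_input["yearTo"] = year_to
    return run_input
```

Calling `build_input("deep learning", max_items=100, year_from=2020)` returns a dictionary that can be serialized to JSON and passed as the Actor input.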
Input Schema Example
{
  "query": "deep learning natural language processing",
  "maxItems": 100,
  "yearFrom": 2020,
  "yearTo": 2025,
  "sortBy": "date",
  "proxyConfiguration": {
    "useApifyProxy": true,
    "apifyProxyGroups": ["RESIDENTIAL"]
  }
}
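One way to submit an input like this is through the `apify-client` Python package. The sketch below assumes that package is installed (`pip install apify-client`) and that you have an Apify API token. The `ACTOR_ID` value is a placeholder, since this page does not state the Actor's identifier; replace it with the ID shown in Apify Store.

```python
from typing import Any, Dict, List

# Same values as the input example above.
RUN_INPUT: Dict[str, Any] = {
    "query": "deep learning natural language processing",
    "maxItems": 100,
    "yearFrom": 2020,
    "yearTo": 2025,
    "sortBy": "date",
    "proxyConfiguration": {
        "useApifyProxy": True,
        "apifyProxyGroups": ["RESIDENTIAL"],
    },
}

# Placeholder: the real Actor ID is not given in this document.
ACTOR_ID = "<username>/<actor-name>"


def run_scraper(token: str, actor_id: str, run_input: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Start the Actor, wait for it to finish, and return the dataset items."""
    # Imported lazily so the module loads even without apify-client installed.
    from apify_client import ApifyClient

    client = ApifyClient(token)
    run = client.actor(actor_id).call(run_input=run_input)
    if run is None:
        raise RuntimeError("Actor run did not return any run details")
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())
```

A typical call would be `items = run_scraper(os.environ["APIFY_TOKEN"], ACTOR_ID, RUN_INPUT)`, where the environment variable name is also an assumption.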
Output Schema
The scraper outputs structured JSON data for each academic paper found on Google Scholar.
Main Fields
| Field | Type | Description |
|---|---|---|
| position | Integer | Result position in search results |
| title | String | Paper title |
| link | String | URL to the paper |
| authors | String | Paper authors |
| publication | String | Publication venue (journal, conference, etc.) |
| year | Integer | Publication year |
| citedBy | Integer | Number of citations |
| snippet | String | Text snippet from the paper |
| pdfLink | String | Direct link to PDF if available |
| searchQuery | String | The search query used |
| searchUrl | String | Google Scholar search URL |
| scrapedAt | String | ISO timestamp of when the data was scraped |
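For downstream code, the fields above can be mapped onto a typed record. This is a sketch under assumptions: the class name `ScholarResult` is hypothetical, and treating `year`, `citedBy`, and `pdfLink` as possibly missing is a guess (the table only says the PDF link is present "if available").

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Any, Dict, Optional


def _parse_iso(ts: str) -> datetime:
    # Older Python versions do not accept a trailing "Z" in fromisoformat.
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


@dataclass
class ScholarResult:
    position: int
    title: str
    link: str
    authors: str
    publication: str
    year: Optional[int]
    cited_by: Optional[int]
    snippet: str
    pdf_link: Optional[str]
    search_query: str
    search_url: str
    scraped_at: datetime

    @classmethod
    def from_dict(cls, item: Dict[str, Any]) -> "ScholarResult":
        """Convert one dataset item (camelCase keys) into a typed record."""
        year = item.get("year")
        cited_by = item.get("citedBy")
        return cls(
            position=int(item["position"]),
            title=item["title"],
            link=item["link"],
            authors=item.get("authors", ""),
            publication=item.get("publication", ""),
            year=int(year) if year is not None else None,
            cited_by=int(cited_by) if cited_by is not None else None,
            snippet=item.get("snippet", ""),
            pdf_link=item.get("pdfLink") or None,
            search_query=item["searchQuery"],
            search_url=item["searchUrl"],
            scraped_at=_parse_iso(item["scrapedAt"]),
        )
```

Parsing `scrapedAt` into a timezone-aware `datetime` makes it straightforward to compare scrape times across runs.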
Academic Paper Example
{
  "position": 1,
  "title": "Attention Is All You Need",
  "link": "https://arxiv.org/abs/1706.03762",
  "authors": "A Vaswani, N Shazeer, N Parmar, J Uszkoreit",
  "publication": "Advances in neural information processing systems, 2017",
  "year": 2017,
  "citedBy": 98450,
  "snippet": "The dominant sequence transduction models are based on complex recurrent or convolutional neural networks that include an encoder and a decoder...",
  "pdfLink": "https://arxiv.org/pdf/1706.03762",
  "searchQuery": "deep learning natural language processing",
  "searchUrl": "https://scholar.google.com/scholar?q=deep+learning+natural+language+processing",
  "scrapedAt": "2025-01-15T10:30:00.000Z"
}
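Records shaped like the example above can be post-processed with plain Python. The sketch below shows two small, hypothetical helpers: one splits the comma-separated `authors` string into a list, and one ranks items by `citedBy`. It assumes a missing or null `citedBy` should count as zero, which the source does not specify.

```python
from typing import Any, Dict, List


def split_authors(authors: str) -> List[str]:
    """Split the comma-separated authors string into individual names."""
    return [name.strip() for name in authors.split(",") if name.strip()]


def top_cited(items: List[Dict[str, Any]], n: int = 10) -> List[Dict[str, Any]]:
    """Return the n most cited items, highest citation count first."""
    # Assumption: a missing or null citedBy is treated as 0.
    return sorted(items, key=lambda it: it.get("citedBy") or 0, reverse=True)[:n]
```

Note that the example record lists only four author names, so the `authors` string may be a shortened version of the full author list; treat the split result accordingly.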