Google Scholar Scraper

Pricing

from $5.00 / 1,000 results

Scrape academic papers, citations, and author profiles from Google Scholar

Rating

0.0

(0)

Developer

codingfrontend

Maintained by Community

Actor stats

  • Bookmarked: 0
  • Total users: 2
  • Monthly active users: 1
  • Last modified: a day ago

Features

  • Academic Papers: Extracts research papers and academic articles from Google Scholar search results
  • Citations Count: Captures how many times each paper has been cited by other works
  • Author Information: Records the names of all authors for each paper
  • Publication Venue: Extracts the journal or conference where the paper was published
  • Publication Year: Captures the year each paper was published
  • PDF Links: Collects direct PDF links when available for open-access papers
  • Text Snippets: Retrieves descriptive text snippets shown in Google Scholar results
  • Date Range Filtering: Filters papers by publication year range (yearFrom and yearTo)
  • Sort Options: Sorts results by relevance or publication date
  • Proxy Support: Uses Apify Proxy with residential proxies to reduce the risk of Google Scholar rate limiting

Input Parameters

| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| query | String | Yes | "machine learning" | Academic search query to look up on Google Scholar |
| maxItems | Integer | No | 50 | Maximum number of results to retrieve (1–10000) |
| yearFrom | Integer | No | - | Filter results published from this year onwards (e.g., 2020) |
| yearTo | Integer | No | - | Filter results published up to this year (e.g., 2024) |
| sortBy | String | No | "relevance" | Sort results by: relevance or date |
| proxyConfiguration | Object | No | Apify Residential | Proxy settings for the scraper |
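
The query, yearFrom, yearTo, and sortBy parameters presumably end up as query-string parameters on a Google Scholar search URL. The following is a minimal sketch of that mapping, not the actor's actual code. The function name `build_search_url` is hypothetical, and the `as_ylo`, `as_yhi`, and `scisbd` parameter names are assumptions about Google Scholar's URL format rather than anything this page documents.

```python
from typing import Optional
from urllib.parse import urlencode

SCHOLAR_BASE = "https://scholar.google.com/scholar"


def build_search_url(
    query: str,
    year_from: Optional[int] = None,
    year_to: Optional[int] = None,
    sort_by: str = "relevance",
) -> str:
    """Build a Google Scholar search URL (illustrative only)."""
    params = {"q": query}
    # Assumption: as_ylo / as_yhi are the lower and upper year bounds.
    if year_from is not None:
        params["as_ylo"] = str(year_from)
    if year_to is not None:
        params["as_yhi"] = str(year_to)
    # Assumption: scisbd=1 requests date ordering instead of relevance.
    if sort_by == "date":
        params["scisbd"] = "1"
    # urlencode quotes with quote_plus by default, so spaces become "+".
    return f"{SCHOLAR_BASE}?{urlencode(params)}"


print(build_search_url("deep learning natural language processing"))
```

With only a query, the sketch produces a URL in the same form as the searchUrl field in the output example.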

Input Schema Example

{
  "query": "deep learning natural language processing",
  "maxItems": 100,
  "yearFrom": 2020,
  "yearTo": 2025,
  "sortBy": "date",
  "proxyConfiguration": {
    "useApifyProxy": true,
    "apifyProxyGroups": ["RESIDENTIAL"]
  }
}
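
Before starting a run, it can help to check an input object against the constraints listed under Input Parameters. The sketch below does this in plain Python. The helper name `build_run_input` is hypothetical, and rejecting an inverted year range is an assumption, since this page does not say how the actor handles that case.

```python
import json
from typing import Any, Dict, Optional

ALLOWED_SORT = {"relevance", "date"}


def build_run_input(
    query: str,
    max_items: int = 50,
    year_from: Optional[int] = None,
    year_to: Optional[int] = None,
    sort_by: str = "relevance",
) -> Dict[str, Any]:
    """Return an input object shaped like the example, or raise ValueError."""
    if not query.strip():
        raise ValueError("query is required")
    if not 1 <= max_items <= 10000:
        raise ValueError("maxItems must be between 1 and 10000")
    if sort_by not in ALLOWED_SORT:
        raise ValueError("sortBy must be 'relevance' or 'date'")
    # Assumption: an inverted range is treated as a caller mistake.
    if year_from is not None and year_to is not None and year_from > year_to:
        raise ValueError("yearFrom must not be later than yearTo")

    run_input: Dict[str, Any] = {
        "query": query,
        "maxItems": max_items,
        "sortBy": sort_by,
        "proxyConfiguration": {
            "useApifyProxy": True,
            "apifyProxyGroups": ["RESIDENTIAL"],
        },
    }
    # The year filters have no documented default, so omit them when unset.
    if year_from is not None:
        run_input["yearFrom"] = year_from
    if year_to is not None:
        run_input["yearTo"] = year_to
    return run_input


example = build_run_input(
    "deep learning natural language processing", 100, 2020, 2025, "date"
)
print(json.dumps(example, indent=2))
```

The returned dictionary has the same keys as the example above, so it can be serialized to JSON and supplied as the actor's input.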

Output Schema

The scraper outputs structured JSON data for each academic paper found on Google Scholar.

Main Fields

| Field | Type | Description |
|---|---|---|
| position | Integer | Result position in search results |
| title | String | Paper title |
| link | String | URL to the paper |
| authors | String | Paper authors |
| publication | String | Publication venue (journal, conference, etc.) |
| year | Integer | Publication year |
| citedBy | Integer | Number of citations |
| snippet | String | Text snippet from the paper |
| pdfLink | String | Direct link to PDF if available |
| searchQuery | String | The search query used |
| searchUrl | String | Google Scholar search URL |
| scrapedAt | String | ISO timestamp of when the data was scraped |

Academic Paper Example

{
  "position": 1,
  "title": "Attention Is All You Need",
  "link": "https://arxiv.org/abs/1706.03762",
  "authors": "A Vaswani, N Shazeer, N Parmar, J Uszkoreit",
  "publication": "Advances in neural information processing systems, 2017",
  "year": 2017,
  "citedBy": 98450,
  "snippet": "The dominant sequence transduction models are based on complex recurrent or convolutional neural networks that include an encoder and a decoder...",
  "pdfLink": "https://arxiv.org/pdf/1706.03762",
  "searchQuery": "deep learning natural language processing",
  "searchUrl": "https://scholar.google.com/scholar?q=deep+learning+natural+language+processing",
  "scrapedAt": "2025-01-15T10:30:00.000Z"
}
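
Records in this shape are straightforward to post-process. As a rough sketch, the code below turns each item into a small typed object, splits the authors string into a list, and ranks papers by citation count. The `Paper` class and the `parse_paper` and `most_cited` helpers are hypothetical names. Treating a missing citedBy or pdfLink as 0 or None is an assumption, and the second sample record is a made-up placeholder, not real data.

```python
from dataclasses import dataclass
from typing import Any, Dict, Iterable, List, Optional


@dataclass
class Paper:
    title: str
    authors: List[str]
    year: Optional[int]
    cited_by: int
    pdf_link: Optional[str]


def parse_paper(item: Dict[str, Any]) -> Paper:
    """Convert one output record into a Paper."""
    # authors is a single comma-separated string in the example record.
    raw_authors = item.get("authors") or ""
    authors = [a.strip() for a in raw_authors.split(",") if a.strip()]
    return Paper(
        title=item.get("title", ""),
        authors=authors,
        year=item.get("year"),
        # Assumption: citedBy may be absent or null; count that as 0.
        cited_by=item.get("citedBy") or 0,
        # Assumption: pdfLink may be absent or empty when no PDF exists.
        pdf_link=item.get("pdfLink") or None,
    )


def most_cited(items: Iterable[Dict[str, Any]], limit: int = 10) -> List[Paper]:
    """Return up to `limit` papers, highest citation count first."""
    papers = [parse_paper(item) for item in items]
    return sorted(papers, key=lambda p: p.cited_by, reverse=True)[:limit]


sample = [
    # Placeholder record (not real data) with optional fields missing.
    {"title": "Placeholder paper", "authors": "A Author", "year": 2024},
    # Trimmed copy of the example record above.
    {
        "title": "Attention Is All You Need",
        "authors": "A Vaswani, N Shazeer, N Parmar, J Uszkoreit",
        "year": 2017,
        "citedBy": 98450,
        "pdfLink": "https://arxiv.org/pdf/1706.03762",
    },
]

for paper in most_cited(sample):
    print(paper.cited_by, paper.title, paper.authors)
```

Keeping the parsing in one function means any change to how missing fields are handled only needs to be made in one place.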