Pricing

Pay per event

Google Scholar Scraper

Search Google Scholar and extract academic papers. Get titles, authors, citation counts, abstracts, PDF links, and publication details. Supports year filtering.

Developer

Stas Persiianenko

What does Google Scholar Scraper do?

Google Scholar Scraper searches Google Scholar and extracts structured data from academic search results. For each paper, it returns the title, authors, publication year, citation count, abstract snippet, PDF link, and publication source.

🔍 Search by query — supports Google Scholar syntax: exact phrases, author search, exclusions, OR/AND operators
📅 Year filtering — restrict results to a specific date range
📊 Sort by relevance or date — newest first or most relevant
📄 PDF link extraction — direct links to PDF/full-text when available
📈 Citation data — citation counts and links to citing papers
🔗 Direct URL support — paste any Google Scholar search URL to scrape it
📑 Pagination — automatically follows pages up to your maxResults limit

Who is it for?

🎓 Academic researchers building literature reviews and meta-analyses
🧬 R&D teams tracking publications in their domain
📊 Bibliometricians analyzing citation networks and research trends
🤖 AI/ML engineers curating training datasets from scientific papers
📰 Science journalists finding expert sources and trending research
🏢 Patent analysts monitoring prior art and competitor research
📚 Librarians building curated reading lists and research guides

Why use Google Scholar Scraper?

⚡ No browser overhead — pure HTTP requests mean fast execution and low cost
🔄 Automated pagination — scrapes hundreds of results across multiple pages
📋 Structured output — clean JSON with all metadata fields ready for analysis
💰 Pay per result — only pay for the papers you actually extract
🔗 PDF links included — direct access to full-text documents when available
📊 Citation tracking — citation counts help identify influential papers

Data extraction fields

Field	Description
`title`	Paper title
`url`	Link to the paper
`authors`	Author names
`year`	Publication year
`source`	Journal, conference, or publisher
`citationCount`	Number of citations
`snippet`	Abstract or snippet from Google Scholar
`pdfUrl`	Direct link to PDF/full-text (when available)
`type`	Result type: PDF, BOOK, HTML, or CITATION
`citedByUrl`	Google Scholar link to papers that cite this one
`relatedUrl`	Google Scholar link to related articles
`versionCount`	Number of versions available
`clusterId`	Google Scholar cluster ID
`query`	Search query used
`scrapedAt`	ISO 8601 timestamp of when the data was scraped

How much does it cost to scrape Google Scholar?

This actor uses pay-per-event pricing:

Event	Price
Run started	$0.01
Paper scraped	$0.003 per paper

Example costs:

50 papers ≈ $0.16
100 papers ≈ $0.31
1,000 papers ≈ $3.01
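The example costs above follow directly from the two prices in the table. A minimal sketch of that arithmetic, assuming one run-start fee per run and no other charges (the function and constant names are illustrative and not part of the actor):

```python
RUN_START_FEE = 0.01   # USD per run, from the pricing table
PER_PAPER_FEE = 0.003  # USD per paper, from the pricing table


def estimate_cost(papers: int, runs: int = 1) -> float:
    """Estimated charge in USD for scraping `papers` papers across `runs` runs."""
    if papers < 0 or runs < 0:
        raise ValueError("papers and runs must be non-negative")
    # Round to a tenth of a cent to hide floating-point noise.
    return round(runs * RUN_START_FEE + papers * PER_PAPER_FEE, 3)


for n in (50, 100, 1000):
    print(f"{n} papers: ${estimate_cost(n):.2f}")
```

Splitting the same number of papers across several runs adds one run-start fee per extra run under this assumption.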

How to scrape Google Scholar papers

1. Go to Google Scholar Scraper on Apify Store.
2. Enter one or more search queries in the queries field (e.g., machine learning, author:"Yann LeCun").
3. Optionally set a year range with yearFrom and yearTo to filter by publication date.
4. Choose the sort order: relevance (default) or date (newest first).
5. Set maxResults to control how many papers to extract per query.
6. Click Start and wait for the run to complete.
7. Download your data as JSON, CSV, or Excel from the Dataset tab.

Input parameters

Parameter	Type	Description	Default
`queries`	string[]	Search queries (supports Scholar syntax)	—
`urls`	string[]	Direct Google Scholar search URLs	—
`yearFrom`	integer	Only papers published this year or later	—
`yearTo`	integer	Only papers published this year or earlier	—
`sortBy`	string	Sort: `relevance` or `date`	`relevance`
`includePatents`	boolean	Include patent results	`true`
`includeCitations`	boolean	Include citation entries	`true`
`maxResults`	integer	Max papers per query (1–1000)	`100`
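The constraints in the parameter table can be checked locally before a run is started. The helper below is a hypothetical sketch, not part of the actor or the Apify client. It assumes that at least one of queries or urls must be supplied, which the table does not state, and it omits the remaining optional parameters.

```python
from typing import Any, Optional, Sequence

ALLOWED_SORT = ("relevance", "date")  # values listed for sortBy
MAX_RESULTS_RANGE = (1, 1000)         # range listed for maxResults


def build_input(
    queries: Sequence[str] = (),
    urls: Sequence[str] = (),
    year_from: Optional[int] = None,
    year_to: Optional[int] = None,
    sort_by: str = "relevance",
    max_results: int = 100,
) -> dict[str, Any]:
    """Assemble an input object, rejecting values the parameter table rules out."""
    if not queries and not urls:
        # Assumption: the actor needs at least one of the two; the table does not say.
        raise ValueError("provide at least one query or URL")
    if sort_by not in ALLOWED_SORT:
        raise ValueError(f"sort_by must be one of {ALLOWED_SORT}")
    low, high = MAX_RESULTS_RANGE
    if not low <= max_results <= high:
        raise ValueError(f"max_results must be between {low} and {high}")
    if year_from is not None and year_to is not None and year_from > year_to:
        raise ValueError("year_from must not be later than year_to")

    run_input: dict[str, Any] = {"sortBy": sort_by, "maxResults": max_results}
    if queries:
        run_input["queries"] = list(queries)
    if urls:
        run_input["urls"] = list(urls)
    if year_from is not None:
        run_input["yearFrom"] = year_from
    if year_to is not None:
        run_input["yearTo"] = year_to
    return run_input
```

The returned dictionary uses the parameter names from the table, so it could be passed as the run input.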
`maxRequestRetries`	integer	Retry attempts for failed requests	`5`

Input examples

Basic search:

{
    "queries": ["machine learning"],
    "maxResults": 50
}

Filtered search with year range:

{
    "queries": ["transformer neural network"],
    "yearFrom": 2020,
    "yearTo": 2025,
    "sortBy": "date",
    "maxResults": 100
}

Author search:

{
    "queries": ["author:\"Yann LeCun\" deep learning"],
    "maxResults": 30
}

Multiple queries:

{
    "queries": [
        "\"large language models\" safety",
        "reinforcement learning robotics",
        "graph neural networks"
    ],
    "yearFrom": 2023,
    "maxResults": 50
}
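When several queries overlap, the same paper may show up under more than one of them. The sketch below removes such repeats after the run. It assumes that clusterId identifies the same paper across queries; the field list only describes it as the Google Scholar cluster ID, so treat that as an assumption.

```python
from typing import Any, Iterable


def dedupe_by_cluster(items: Iterable[dict[str, Any]]) -> list[dict[str, Any]]:
    """Keep the first record seen for each clusterId; keep records that lack one."""
    seen: set[str] = set()
    unique: list[dict[str, Any]] = []
    for item in items:
        cluster_id = item.get("clusterId")
        if cluster_id:
            if cluster_id in seen:
                continue  # already kept a record for this cluster
            seen.add(cluster_id)
        unique.append(item)
    return unique
```

Records without a clusterId are kept as-is, since there is nothing to compare them on.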

Direct Google Scholar URL:

{
    "urls": ["https://scholar.google.com/scholar?q=CRISPR+gene+editing&as_ylo=2022"],
    "maxResults": 100
}
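A search URL like the one above can also be assembled in code. This is a sketch using only the standard library. The q and as_ylo parameters appear in the example URL; as_yhi as the upper year bound is an assumption that the example does not show.

```python
from typing import Optional
from urllib.parse import urlencode

SCHOLAR_SEARCH = "https://scholar.google.com/scholar"


def scholar_url(query: str, year_from: Optional[int] = None, year_to: Optional[int] = None) -> str:
    """Build a Google Scholar search URL for the urls input field."""
    params: dict[str, str] = {"q": query}
    if year_from is not None:
        params["as_ylo"] = str(year_from)  # lower year bound, as in the example URL
    if year_to is not None:
        params["as_yhi"] = str(year_to)    # assumption: upper-bound counterpart of as_ylo
    return f"{SCHOLAR_SEARCH}?{urlencode(params)}"


print(scholar_url("CRISPR gene editing", year_from=2022))
```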

Output example

Each paper is returned as a JSON object:

{
    "title": "Attention Is All You Need",
    "url": "https://arxiv.org/abs/1706.03762",
    "authors": "A Vaswani, N Shazeer, N Parmar...",
    "year": 2017,
    "source": "Advances in neural information processing systems",
    "citationCount": 125000,
    "snippet": "The dominant sequence transduction models are based on complex recurrent or convolutional neural networks...",
    "pdfUrl": "https://arxiv.org/pdf/1706.03762",
    "type": "PDF",
    "citedByUrl": "https://scholar.google.com/scholar?cites=...",
    "relatedUrl": "https://scholar.google.com/scholar?q=related:...",
    "versionCount": 15,
    "clusterId": "1234567890",
    "query": "transformer neural network",
    "scrapedAt": "2026-03-22T10:00:00.000Z"
}
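For analysis it can help to load records like the one above into a typed structure. The sketch below covers a subset of the fields and ranks papers by citation count. The Paper class and top_cited function are illustrative names, and the code assumes that optional fields may be missing or null when Google Scholar does not provide them.

```python
from dataclasses import dataclass
from typing import Any, Iterable, Optional


@dataclass
class Paper:
    title: str
    year: Optional[int]
    citation_count: int
    pdf_url: Optional[str]

    @classmethod
    def from_item(cls, item: dict[str, Any]) -> "Paper":
        """Map the actor's camelCase keys to attributes, tolerating gaps."""
        return cls(
            title=item.get("title", ""),
            year=item.get("year"),
            citation_count=item.get("citationCount") or 0,  # treat missing/null as 0
            pdf_url=item.get("pdfUrl"),
        )


def top_cited(items: Iterable[dict[str, Any]], n: int = 10) -> list[Paper]:
    """Return the n most-cited papers, highest count first."""
    papers = [Paper.from_item(item) for item in items]
    return sorted(papers, key=lambda p: p.citation_count, reverse=True)[:n]
```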

Google Scholar search syntax

Syntax	Example	Description
`"exact phrase"`	`"deep learning"`	Exact phrase match
`author:"Name"`	`author:"Yann LeCun"`	Search by author
`-term`	`machine learning -survey`	Exclude a term
`OR`	`"CNN" OR "convolutional"`	Match either term
`intitle:`	`intitle:transformer`	Term must appear in title
`source:`	`source:Nature`	Restrict to a journal
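The operators in the table can be combined into one query string. A small sketch of a hypothetical helper that does this (the function is not part of the actor, and the ordering of the parts is an arbitrary choice):

```python
from typing import Optional, Sequence


def build_query(
    keywords: str = "",
    phrase: Optional[str] = None,
    author: Optional[str] = None,
    exclude: Sequence[str] = (),
    intitle: Optional[str] = None,
    source: Optional[str] = None,
) -> str:
    """Compose a Google Scholar query from the operators in the syntax table."""
    parts: list[str] = []
    if author:
        parts.append(f'author:"{author}"')
    if intitle:
        parts.append(f"intitle:{intitle}")
    if source:
        parts.append(f"source:{source}")
    if phrase:
        parts.append(f'"{phrase}"')
    if keywords:
        parts.append(keywords)
    parts.extend(f"-{term}" for term in exclude)
    return " ".join(parts)


print(build_query(keywords="deep learning", author="Yann LeCun"))
print(build_query(keywords="machine learning", exclude=["survey"]))
```

The resulting strings can be used as entries in the queries input.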

Tips

🎯 Use exact phrases for precise results: "attention is all you need" finds that specific paper
👤 Author search works well: author:"Geoffrey Hinton" finds papers by that author
📅 Year filtering is useful for finding recent work in fast-moving fields
🔄 Sort by date to find the newest papers on a topic
📊 Citation count is a rough proxy for a paper's influence; compare papers of similar age and field, since older papers have had more time to accumulate citations
🔗 Direct URLs let you use Google Scholar's Advanced Search UI to build complex queries, then paste the URL here
⏱️ Rate limits: Google Scholar may rate-limit heavy usage. The scraper includes automatic delays between pages. For large-scale scraping, use Apify proxy.

Integrations

Connect Google Scholar Scraper with any cloud service or web app using integrations on the Apify platform. You can connect with Make, Zapier, Slack, Airbyte, GitHub, Google Sheets, Google Drive, and more. You can also use webhooks to trigger actions when a run finishes or fails.

API usage

Node.js

import { ApifyClient } from 'apify-client';

// Authenticate with your Apify API token.
const client = new ApifyClient({ token: 'YOUR_API_TOKEN' });
// Start the actor and wait for the run to finish.
const run = await client.actor('automation-lab/google-scholar-scraper').call({
    queries: ['machine learning'],
    yearFrom: 2023,
    maxResults: 50,
});

// Read the scraped papers from the run's default dataset.
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach(paper => {
    console.log(`${paper.title} (${paper.year}) — Cited by ${paper.citationCount}`);
});

Python

from apify_client import ApifyClient

# Authenticate with your Apify API token.
client = ApifyClient("YOUR_API_TOKEN")
# Start the actor and block until the run finishes.
run = client.actor("automation-lab/google-scholar-scraper").call(run_input={
    "queries": ["machine learning"],
    "yearFrom": 2023,
    "maxResults": 50,
})

# Stream the scraped papers from the run's default dataset.
for paper in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(f"{paper['title']} ({paper['year']}) — Cited by {paper['citationCount']}")
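The items iterated in the Python example can also be written to CSV locally, for instance when only a few columns are needed. This is a sketch using the standard csv module. The column selection is arbitrary, and the function takes any iterable of item dictionaries, so it does not need a live run.

```python
import csv
import io
from typing import Any, Iterable

COLUMNS = ["title", "authors", "year", "source", "citationCount", "pdfUrl"]


def papers_to_csv(items: Iterable[dict[str, Any]]) -> str:
    """Serialize selected fields of each paper to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    for item in items:
        # Missing fields become empty cells instead of raising an error.
        writer.writerow({col: item.get(col, "") for col in COLUMNS})
    return buffer.getvalue()
```

Passing the result of iterate_items() from the example above to papers_to_csv would produce one row per paper.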

cURL

# Start a run; the response is the run object, returned right away, not the scraped papers.
curl "https://api.apify.com/v2/acts/automation-lab~google-scholar-scraper/runs" \
  -X POST \
  -H "Authorization: Bearer YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"queries": ["machine learning"], "yearFrom": 2023, "maxResults": 50}'

Use with AI agents via MCP

Google Scholar Scraper is available as a tool for AI assistants via the Model Context Protocol (MCP).

Setup for Claude Code

claude mcp add --transport http apify "https://mcp.apify.com?tools=automation-lab/google-scholar-scraper"

Setup for Claude Desktop, Cursor, or VS Code

{
    "mcpServers": {
        "apify": {
            "url": "https://mcp.apify.com?tools=automation-lab/google-scholar-scraper"
        }
    }
}

Example prompts

"Search Google Scholar for papers about large language models from 2024"
"Find the most cited papers on CRISPR gene editing"
"Get papers by author Geoffrey Hinton on deep learning"

Learn more in the Apify MCP documentation.

Legality

Google Scholar displays publicly available academic metadata — titles, authors, abstracts, and citation counts. This scraper extracts only publicly visible information that Google Scholar itself aggregates from academic publishers. It does not bypass authentication, access paywalled content, or download full papers.

You should review Google's Terms of Service and ensure your use case complies with applicable laws. Use reasonable request rates and respect robots.txt guidelines.

FAQ

Q: How many papers can I scrape at once? A: Up to 1,000 papers per query. Google Scholar paginates at 10 results per page, so 1,000 papers requires fetching 100 pages.

Q: Does it return full paper text? A: No. The scraper returns metadata (title, authors, abstract snippet, citation count) and PDF links when available. Use the pdfUrl field to access full papers.

Q: Can I search by author only? A: Yes. Use author:"Name" syntax in your query, e.g., author:"Yann LeCun".

Q: Why am I getting fewer results than expected? A: Google Scholar may return fewer results for very specific queries, and rate limiting may cut a large scrape short. Try broader search terms, or lower maxResults so the run makes fewer requests.

Q: Can I filter by journal or conference? A: Yes. Use Google Scholar's source: operator in your query, e.g., source:Nature deep learning.
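The pagination arithmetic from the first answer above (10 results per page, up to 1,000 papers per query) can be sketched as follows. The function name is illustrative, and the real number of requests may be higher if pages are retried.

```python
import math

RESULTS_PER_PAGE = 10      # page size stated in the FAQ
MAX_PAPERS_PER_QUERY = 1000  # upper limit stated in the FAQ


def pages_needed(max_results: int) -> int:
    """Number of result pages required to collect max_results papers for one query."""
    if not 1 <= max_results <= MAX_PAPERS_PER_QUERY:
        raise ValueError("max_results must be between 1 and 1000")
    return math.ceil(max_results / RESULTS_PER_PAGE)


print(pages_needed(1000))
```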

Related scrapers

ArXiv Scraper: scrape preprint papers from arXiv
Crossref Scraper: extract scholarly article metadata via CrossRef
OpenAlex Scraper: extract academic paper metadata, citations, and author data
ClinicalTrials Scraper: extract clinical trial data from ClinicalTrials.gov
Google Search Scraper: scrape Google Search results
