Pricing

from $5.00 / 1,000 results returned

Google Scholar Scraper — Papers, Citations & Author Profiles

Scrape Google Scholar in 6 modes: paper search, citation export (BibTeX/APA/MLA/Chicago), author profiles (h-index, i10-index), publication lists, citation history, and co-author networks. MCP-ready. Scrapes with a stealth Camoufox browser first and falls back to SerpApi (managed or bring-your-own-key) for high reliability.

Rating

0.0

(0)

Developer

Khadin Akbar

Last modified

3 days ago

What does Google Scholar Scraper do?

Google Scholar has no official API and blocks scrapers aggressively. This Actor solves both problems. It extracts structured bibliographic data — titles, authors, publication venues, years, citation counts, PDF links, h-index, i10-index, citation histories, and co-author graphs — and returns one clean, flat JSON record per result. Pick a mode for the job:

Mode	What you get
`search`	Papers matching a keyword query, with filters for year range, document type, patents, case law, and review-only
`cite`	Citation strings in APA, MLA, Chicago, Harvard, Vancouver plus BibTeX, EndNote, RefMan, RefWorks export links
`author_profile`	One author's metrics: affiliation, interests, h-index, i10-index, total & recent citations
`author_articles`	An author's full publication list, paginated and sortable
`author_citation`	An author's year-by-year citation history (great for tracking growth)
`author_co_authors`	An author's co-author network for mapping research communities

Why use this Google Scholar Scraper?

Reliability first. Google Scholar's CAPTCHA wall breaks most scrapers (competitors sit at 84–98% success). This Actor tries a stealth Camoufox browser first, then transparently falls back to a managed SerpApi path — so you get data, not empty runs.
Bring your own key, pay less. Supply your own SerpApi key (BYOK) and the reliable path runs at the standard per-result price with no managed-fallback premium.
Six tools in one. No need to wire up four separate scrapers for search, citations, authors, and co-authors.
Agent-native. Narrow inputs, flat structured JSON output, and clear cost signals make it a clean tool call for Claude, ChatGPT, or any MCP client.
Built for literature reviews, bibliometrics, and RAG. Feed structured scholarly data straight into knowledge graphs, vector stores, or analytics notebooks.

How to use Google Scholar Scraper

Open the Input tab.
Choose a Mode (defaults to paper search).
Fill the field that mode needs — queries for search, resultIds for cite, authorIds for the author modes.
(Optional) Set maxResults, a year range, or your own serpApiKey.
Click Start and watch results stream into the dataset.
Download as JSON, CSV, Excel, or HTML, or pull via the Apify API.
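
The pairing of mode and required field described above can be sketched as a small Python helper. This is an illustrative sketch, not part of the Actor: `build_input` and `REQUIRED_FIELD` are hypothetical names, and only the input field names (`mode`, `queries`, `resultIds`, `authorIds`, `maxResults`, `yearFrom`) come from this page.

```python
from typing import Any

# Required input field per mode, as listed on this page.
REQUIRED_FIELD = {
    "search": "queries",
    "cite": "resultIds",
    "author_profile": "authorIds",
    "author_articles": "authorIds",
    "author_citation": "authorIds",
    "author_co_authors": "authorIds",
}


def build_input(mode: str, values: list[str], **options: Any) -> dict[str, Any]:
    """Build an Actor input dict (hypothetical helper, not an official API)."""
    if mode not in REQUIRED_FIELD:
        raise ValueError(f"unknown mode: {mode!r}")
    field = REQUIRED_FIELD[mode]
    if not values:
        raise ValueError(f"mode {mode!r} needs at least one value in {field!r}")
    run_input: dict[str, Any] = {"mode": mode, field: list(values)}
    run_input.update(options)  # optional fields such as maxResults or yearFrom
    return run_input


search_input = build_input(
    "search", ["transformer architecture"], yearFrom=2020, maxResults=50
)
```

The resulting dict can be pasted into the Input tab as JSON or sent as the run input through the Apify API.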

Input

You usually only need to set Mode and its matching field. For example, to search for recent transformer papers:

{
  "mode": "search",
  "queries": ["transformer architecture"],
  "yearFrom": 2020,
  "maxResults": 50
}

Look up an author's profile:

{ "mode": "author_profile", "authorIds": ["LSsXyncAAAAJ"] }

Export citation formats for a paper (needs a SerpApi key):

{ "mode": "cite", "resultIds": ["u-CT435A0vkJ"], "serpApiKey": "your-key" }

Output

Each result is one flat JSON record. A search paper looks like:

{
  "mode": "search",
  "query": "transformer architecture",
  "position": 1,
  "title": "Attention is all you need",
  "resultId": "u-CT435A0vkJ",
  "link": "https://proceedings.neurips.cc/paper/2017/...",
  "snippet": "The dominant sequence transduction models...",
  "authors": [{ "name": "A Vaswani", "authorId": "...", "profileUrl": "..." }],
  "publicationInfo": "A Vaswani, N Shazeer… - Advances in neural…, 2017",
  "year": 2017,
  "citedByCount": 145203,
  "citedByLink": "https://scholar.google.com/scholar?cites=...",
  "versionsCount": 53,
  "pdfUrl": "https://proceedings.neurips.cc/...pdf",
  "pdfFormat": "PDF",
  "source": "serpapi",
  "scrapedAt": "2026-05-30T12:00:00.000Z"
}

You can download the dataset in various formats such as JSON, HTML, CSV, or Excel.

Data fields

Field	Description
`mode`	The operation that produced the record
`title`	Paper title
`authors`	Array of `{ name, authorId, profileUrl }`
`year`	Publication year
`citedByCount`	Number of citing papers
`publicationInfo`	Venue / journal / publisher summary
`link`, `pdfUrl`	Article URL and direct PDF link
`resultId`	Paper/cluster ID — feed into `cite` mode
`name`, `authorId`, `hIndex`, `i10Index`, `citationsTotal`	Author-mode fields
`citationsByYear`	Year-by-year citation history (author_citation)
`coAuthorName`, `coAuthorId`	Co-author edges (author_co_authors)
`citations`, `exportLinks`	Citation strings + export links (cite)
`source`	`camoufox` (direct) or `serpapi` (fallback)
`scrapedAt`	ISO 8601 timestamp
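
Because every record is flat and carries a `mode` field, downstream filtering stays simple. The sketch below ranks search records by `citedByCount`. It assumes the dataset has already been loaded as a list of dicts, and `top_cited` is a hypothetical helper name. Treating a missing count as 0 is also an assumption, since this page does not say whether the field is always present.

```python
from typing import Any


def top_cited(records: list[dict[str, Any]], n: int = 5) -> list[tuple[str, int]]:
    """Return (title, citedByCount) for the n most-cited search records."""
    # Records from the author and cite modes have different fields, so skip them.
    papers = [r for r in records if r.get("mode") == "search"]
    # Assumption: a missing or null count is treated as 0.
    papers.sort(key=lambda r: r.get("citedByCount") or 0, reverse=True)
    return [(r.get("title", ""), r.get("citedByCount") or 0) for r in papers[:n]]


records = [
    {"mode": "search", "title": "Illustrative paper without a count"},  # made-up record
    {"mode": "search", "title": "Attention is all you need", "citedByCount": 145203},
    {"mode": "author_profile", "name": "Illustrative author"},  # made-up record
]
print(top_cited(records, n=2))
```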

How much does it cost to scrape Google Scholar?

This Actor uses pay-per-event pricing plus optional usage-based billing:

$0.00005 per actor start
$0.005 per result (paper, citation export, citation-history record, or co-author) — capped by your maxResults
$0.01 per author profile (author_profile mode only)

A 50-paper search costs about $0.25. The free Apify tier covers small jobs. Bring your own SerpApi key to avoid any managed-fallback overhead on large runs.
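
The arithmetic behind that estimate can be written as a short sketch. The three prices are the ones listed above, while `estimate_cost` is a hypothetical helper. It deliberately leaves out any managed-fallback premium and platform usage charges, because this page does not quantify them.

```python
from decimal import Decimal

# Event prices in USD, as listed on this page.
PRICE_PER_START = Decimal("0.00005")
PRICE_PER_RESULT = Decimal("0.005")
PRICE_PER_AUTHOR_PROFILE = Decimal("0.01")


def estimate_cost(results: int = 0, author_profiles: int = 0, starts: int = 1) -> Decimal:
    """Estimate the pay-per-event charge for a run (excludes other billing)."""
    return (
        starts * PRICE_PER_START
        + results * PRICE_PER_RESULT
        + author_profiles * PRICE_PER_AUTHOR_PROFILE
    )


print(estimate_cost(results=50))  # 0.25005
```

`Decimal` is used instead of floats so that the sub-cent start fee does not pick up rounding noise.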

Tips and advanced options

For big or time-sensitive jobs, set forceSerpApi: true with a serpApiKey to skip the direct-scrape attempt and go straight to the reliable path.
Chain modes: run search, grab resultId and authors[].authorId from the output, then feed those into cite and author_profile.
Filter tightly with yearFrom/yearTo, reviewArticlesOnly, and sortByDate to keep result counts (and cost) down.
Residential proxies are the default and strongly recommended — Google Scholar blocks datacenter IPs instantly.
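
The mode-chaining tip can be sketched as follows, assuming the search output is already available as a list of dicts shaped like the record in the Output section. `chain_inputs` is a hypothetical helper, and the IDs in the usage lines are placeholders, not real Scholar IDs.

```python
from typing import Any


def chain_inputs(
    search_records: list[dict[str, Any]],
) -> tuple[dict[str, Any], dict[str, Any]]:
    """Derive cite and author_profile inputs from search output records."""
    result_ids: list[str] = []
    author_ids: list[str] = []
    for record in search_records:
        result_id = record.get("resultId")
        if result_id and result_id not in result_ids:
            result_ids.append(result_id)
        for author in record.get("authors", []):
            author_id = author.get("authorId")
            # Assumption: authors without a Scholar profile may lack an ID.
            if author_id and author_id not in author_ids:
                author_ids.append(author_id)
    cite_input = {"mode": "cite", "resultIds": result_ids}
    profile_input = {"mode": "author_profile", "authorIds": author_ids}
    return cite_input, profile_input


sample = [
    {"resultId": "RESULT_1", "authors": [{"name": "A", "authorId": "AUTHOR_1"}]},
    {"resultId": "RESULT_2", "authors": [{"name": "A", "authorId": "AUTHOR_1"}, {"name": "B"}]},
]
cite_input, profile_input = chain_inputs(sample)
```

IDs are deduplicated in order of first appearance so the follow-up runs do not pay twice for the same paper or author. Since the `cite` example above needs a SerpApi key, add `serpApiKey` to `cite_input` before starting that run.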

FAQ, disclaimer, and support

Is scraping Google Scholar legal? This Actor collects only publicly available bibliographic metadata for research, bibliometric, and indexing use. You are responsible for complying with Google's Terms of Service and applicable laws in your jurisdiction. Do not use it to violate copyright or republish protected content.

Why do some runs use the SerpApi source? Google Scholar serves CAPTCHAs to datacenter and residential traffic alike. When the direct Camoufox scrape is blocked, the Actor falls back to a managed SerpApi path so your run still returns data. Provide your own key for the cheapest reliable path.
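
To see how often a run fell back, you can tally the `source` field of the returned records. This is a minimal sketch that assumes the dataset has been loaded as a list of dicts. `source_share` is a hypothetical helper, and the "unknown" bucket for records without a `source` is an assumption.

```python
from collections import Counter
from typing import Any


def source_share(records: list[dict[str, Any]]) -> dict[str, float]:
    """Return the fraction of records produced by each source."""
    counts = Counter(r.get("source", "unknown") for r in records)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {source: count / total for source, count in counts.items()}


run_records = [
    {"source": "camoufox"},
    {"source": "camoufox"},
    {"source": "camoufox"},
    {"source": "serpapi"},
]
print(source_share(run_records))
```

A consistently high `serpapi` share is a signal that supplying your own key is worth it for that workload.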

Found a bug or need a field added? Open an issue on the Actor's Issues tab. Custom scraping solutions are available on request.

Google Patents Scraper — patents, citations & assignee/inventor portfolios for IP context alongside Scholar.
Goodreads Scraper — books, reviews & authors for non-academic bibliographic and biography work.
Google SERP Scraper — Google search results when you need broader web coverage beyond Scholar.
Google Trends Scraper — interest trends for research topics you find in Scholar.
SEC EDGAR Scraper — 10-K, 8-K & 13F filings to anchor academic findings against issuer disclosures.
