Google Scholar | Research Papers, Citations & Author Profiles
Pricing
from $0.01 / 1,000 results
Scrape Google Scholar at scale. Search research papers, get citation formats (MLA, APA, Chicago, BibTeX), author profiles with h-index and i10-index, list an author's publications, view per-article citation history, & map co-author networks. Six modes in one for lit reviews, bibliometrics, & agents.
Developer
John
Maintained by Community
Google Scholar Scraper
Scrape Google Scholar at scale. One actor, six modes - search research papers, pull citation formats, fetch author profiles with h-index and i10-index, paginate an author's full publication list, view per-article citation history, and map an author's full co-author network. Built for literature reviews, bibliometrics, citation tracking, and academic AI agents.
What this actor returns
- Research paper search results: title, link, snippet, authors, publication info, cited-by counts, and version links.
- Citation strings for any paper in MLA, APA, Chicago, Harvard, and Vancouver formats, plus BibTeX / EndNote / RefMan / RefWorks export links.
- Full author profiles with name, affiliations, email domain, interests, photo, and the standard h-index / i10-index / total-citations table (overall and recent window).
- Year-by-year citation history graphs for both authors and individual papers.
- Author publication list, paginated up to 100 results per page.
- Per-article bibliographic detail including journal, volume, issue, pages, publisher, and abstract.
- Full co-author list with profile URLs, affiliations, email domains, and photos.
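For readers unfamiliar with the metrics in the author-profile table, the following is a minimal sketch of how the h-index and i10-index are conventionally defined, computed from a list of per-paper citation counts. The function names and the sample counts are illustrative only and are not part of the actor, which simply returns these metrics in its output.

```python
def h_index(citations):
    """Largest h such that at least h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


def i10_index(citations):
    """Number of papers with at least 10 citations."""
    return sum(1 for count in citations if count >= 10)


counts = [25, 12, 10, 4, 3, 0]  # made-up citation counts, one per paper
print(h_index(counts), i10_index(counts))  # 4 3
```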
The six modes
Choose with the mode input parameter.
| mode value | What it does | Required fields |
|---|---|---|
| `search` | Search Google Scholar for papers | `q` (or `cites`, or `cluster`) |
| `cite` | Get citation formats and BibTeX export for one paper | `result_id` |
| `author_profile` | Fetch an author's profile + citation metrics + graph | `author_id` |
| `author_articles` | Paginate the author's full publication list | `author_id` |
| `author_citation` | Per-article bibliographic record with citation history | `author_id`, `citation_id` |
| `author_co_authors` | Full list of an author's co-authors | `author_id` |
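The required-fields column can be checked locally before paying for a run. Below is a minimal sketch of such a check. `REQUIRED_FIELDS` and `missing_fields` are hypothetical helper names rather than part of the actor, and the actor's own validation may behave differently.

```python
# Required fields per mode, transcribed from the table above.
# Each tuple lists alternatives: for "search", any one of q, cites, or cluster is enough.
REQUIRED_FIELDS = {
    "search": [("q", "cites", "cluster")],
    "cite": [("result_id",)],
    "author_profile": [("author_id",)],
    "author_articles": [("author_id",)],
    "author_citation": [("author_id",), ("citation_id",)],
    "author_co_authors": [("author_id",)],
}


def missing_fields(run_input):
    """Return a list of problems; an empty list means the input looks complete."""
    mode = run_input.get("mode")
    if mode not in REQUIRED_FIELDS:
        return [f"unknown mode: {mode!r}"]
    problems = []
    for alternatives in REQUIRED_FIELDS[mode]:
        if not any(run_input.get(name) for name in alternatives):
            problems.append("missing " + " or ".join(alternatives))
    return problems


print(missing_fields({"mode": "author_citation", "author_id": "LSsXyncAAAAJ"}))
# ['missing citation_id']
```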
Example: search
{"mode": "search","q": "transformer neural network","as_ylo": 2020,"as_yhi": 2024,"num": 10,"max_pages": 3}
Example: cite
{"mode": "cite","result_id": "K7uerNYAAAAJ:u5HHmVD_uO8C"}
Example: author_profile
{"mode": "author_profile","author_id": "LSsXyncAAAAJ"}
Example: author_articles
{"mode": "author_articles","author_id": "LSsXyncAAAAJ","sort": "pubdate","num": 20,"max_pages": 2}
Example: author_citation
{"mode": "author_citation","author_id": "LSsXyncAAAAJ","citation_id": "u5HHmVD_uO8C"}
Example: author_co_authors
{"mode": "author_co_authors","author_id": "LSsXyncAAAAJ"}
Input parameters
| Parameter | Type | Modes | Description |
|---|---|---|---|
| `mode` | string | all (required) | Which operation to run. |
| `q` | string | search | Free-text search. Supports `author:` and `source:` operators. |
| `cites` | string | search | Find papers that cite this article ID. |
| `cluster` | string | search | Fetch all versions of a paper by cluster ID. |
| `result_id` | string | cite | Result ID of a paper to fetch citation formats for. |
| `author_id` | string | author_profile, author_articles, author_citation, author_co_authors | Google Scholar author identifier. |
| `citation_id` | string | author_citation | Per-article ID within an author's profile. |
| `hl` | enum | all | UI language (en, es, fr, de, ...). |
| `lr` | string | search | Restrict to languages, e.g. `lang_en\|lang_fr`. |
| `as_ylo` | integer | search | Earliest publication year. |
| `as_yhi` | integer | search | Latest publication year. |
| `scisbd` | enum | search | 0 relevance, 1 abstracts-only by date, 2 all by date. |
| `as_sdt` | enum | search | 0 exclude patents, 7 include patents, 4 case law. |
| `safe` | enum | search | active / off. |
| `filter` | enum | search | 1 enable similar-results filter (default), 0 disable. |
| `as_vis` | enum | search | 0 include citations (default), 1 exclude citations. |
| `as_rr` | enum | search | 1 review articles only, 0 all (default). |
| `sort` | enum | author_profile, author_articles | title / pubdate. Omit for default citation-count sort. |
| `max_pages` | integer | search, author_articles | Max pages to fetch. 0 = no limit. Default 1. |
| `num` | integer | search, author_articles | Per-page size: 1-20 for search, 1-100 for author_articles. |
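As an illustration of how the search filters combine, here is a minimal sketch that assembles a `mode=search` input from Python values. `build_search_input` is a hypothetical helper, not something the actor ships, and the year-order check is an added assumption about what makes a sensible query.

```python
def build_search_input(q, year_from=None, year_to=None, languages=None,
                       num=10, max_pages=1):
    """Assemble a mode=search input dict, leaving optional filters out when unset."""
    if year_from is not None and year_to is not None and year_from > year_to:
        raise ValueError("year_from must not be later than year_to")
    run_input = {"mode": "search", "q": q, "num": num, "max_pages": max_pages}
    if year_from is not None:
        run_input["as_ylo"] = year_from
    if year_to is not None:
        run_input["as_yhi"] = year_to
    if languages:
        # lr expects pipe-separated codes such as lang_en|lang_fr
        run_input["lr"] = "|".join(f"lang_{code}" for code in languages)
    return run_input


print(build_search_input("transformer neural network", 2020, 2024, ["en", "fr"]))
```

Omitting unset filters keeps the input close to the hand-written examples above instead of sending explicit nulls.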
Example output (mode=search)
{"_mode": "search","_query_index": 1,"search_parameters": { "mode": "search", "q": "transformer", "as_ylo": 2020 },"search_metadata_status": "Success","search_timestamp": "2026-05-13T20:00:00Z","position": 0,"result_id": "K7uerNYAAAAJ:u5HHmVD_uO8C","paper_title": "Attention Is All You Need","link": "https://arxiv.org/abs/1706.03762","snippet": "...","publication_info": {"summary": "A Vaswani, N Shazeer, N Parmar - Advances in NIPS, 2017","authors": [{ "name": "Ashish Vaswani", "author_id": "..." }]},"inline_links": {"cited_by_total": 120000,"cited_by_link": "https://scholar.google.com/...","versions_total": 95,"versions_cluster_id": "13755340029141322000"}}
Example output (mode=author_profile)
{"_mode": "author_profile","_query_index": 1,"author": {"name": "Geoffrey Hinton","affiliations": "Emeritus Prof. Comp Sci, University of Toronto","email": "Verified email at cs.toronto.edu","interests": [{ "title": "Machine Learning", "link": "..." }]},"cited_by_summary": {"citations_all": 800000,"citations_recent": 500000,"h_index_all": 150,"i10_index_all": 380,"recent_since_year": 2020},"cited_by_graph": [{ "year": 2018, "citations": 35000 },{ "year": 2019, "citations": 45000 }]}
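The `cited_by_graph` array lends itself to simple trend calculations. The sketch below derives year-over-year changes from it, assuming the entries have the `year` / `citations` shape shown in the example output. `yearly_growth` is a hypothetical helper name.

```python
def yearly_growth(cited_by_graph):
    """Return (year, change vs. the previous listed year) pairs, sorted by year."""
    points = sorted(cited_by_graph, key=lambda p: p["year"])
    return [
        (cur["year"], cur["citations"] - prev["citations"])
        for prev, cur in zip(points, points[1:])
    ]


graph = [{"year": 2018, "citations": 35000}, {"year": 2019, "citations": 45000}]
print(yearly_growth(graph))  # [(2019, 10000)]
```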
Pricing
Pay-per-event. No subscription.
- Setup: $0.02 per run (charged once).
- Query executed: $0.02 per upstream call. For paginated modes (`search`, `author_articles`), that is once per page. For single-shot modes (`cite`, `author_profile`, `author_citation`, `author_co_authors`), that is once per run.
Worked examples:
- `mode=search` with `max_pages=5` -> $0.02 setup + 5 * $0.02 = $0.12.
- `mode=author_profile` -> $0.02 setup + 1 * $0.02 = $0.04.
- `mode=author_articles` with `max_pages=3`, `num=100` (full author bibliography) -> $0.02 setup + 3 * $0.02 = $0.08.
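The same arithmetic can be written as a small estimator. This is a sketch based only on the two prices listed above. It works in whole cents to avoid floating-point rounding, and `estimate_cost_cents` is a hypothetical helper name.

```python
SETUP_CENTS = 2  # $0.02 per run, charged once
QUERY_CENTS = 2  # $0.02 per upstream call
PAGINATED_MODES = {"search", "author_articles"}


def estimate_cost_cents(mode, pages_fetched=1):
    """Estimated charge in cents for one run.

    pages_fetched is the number of pages actually retrieved, which can be
    lower than max_pages when the results run out early.
    """
    calls = pages_fetched if mode in PAGINATED_MODES else 1
    return SETUP_CENTS + calls * QUERY_CENTS


print(estimate_cost_cents("search", 5))           # 12
print(estimate_cost_cents("author_profile"))      # 4
print(estimate_cost_cents("author_articles", 3))  # 8
```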
Use cases
- Build a literature review: search for a topic, then loop through `organic_results[].result_id` to pull citation strings via `mode=cite`.
- Track citation growth: run `mode=author_profile` on a watchlist of researchers and store the `cited_by_graph` over time.
- Map a research community: take any `author_id` and run `mode=author_co_authors` to harvest the full collaborator network.
- Bibliometric analysis: page through an author's entire publication list with `mode=author_articles` and `max_pages=0` for unlimited.
- AI agents and RAG pipelines: feed structured Google Scholar JSON straight into a knowledge graph or vector store.
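For the literature-review case, the step from search results to citation lookups might look like the sketch below. It assumes each dataset item carries `result_id` at the top level, as in the `mode=search` example output. If your items are nested differently, adjust the lookup. `cite_inputs` is a hypothetical helper name.

```python
def cite_inputs(search_items):
    """Build one mode=cite input per search item with a result_id, skipping duplicates."""
    seen = set()
    inputs = []
    for item in search_items:
        result_id = item.get("result_id")
        if result_id and result_id not in seen:
            seen.add(result_id)
            inputs.append({"mode": "cite", "result_id": result_id})
    return inputs


items = [
    {"result_id": "K7uerNYAAAAJ:u5HHmVD_uO8C", "paper_title": "Attention Is All You Need"},
    {"paper_title": "An item without an ID"},
]
print(cite_inputs(items))
# [{'mode': 'cite', 'result_id': 'K7uerNYAAAAJ:u5HHmVD_uO8C'}]
```

Each generated input is a separate run, so under the pricing above every citation lookup costs the setup fee plus one query.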
How to get started
- Open the actor in the Apify console and click Try for free.
- Pick a `mode` and fill in the required fields shown above.
- Click Run.
- Results land in the default dataset. Download as JSON, CSV, or Excel from the Storage tab, or use the Apify API.
You can also call this actor from your own code with the Apify API client for Python or JavaScript, with curl against the Apify API, or as a tool in any MCP-aware AI agent.
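A minimal sketch of the Python route follows. It assumes the `apify-client` package, whose `ApifyClient` exposes `actor(...).call(run_input=...)` and `dataset(...).iterate_items()`. The actor ID and token are placeholders that you need to fill in from the Apify console, and `run_scholar_actor` is a hypothetical wrapper name.

```python
def run_scholar_actor(client, actor_id, run_input):
    """Start the actor, wait for the run to finish, and return its dataset items."""
    run = client.actor(actor_id).call(run_input=run_input)
    # The run record points at the default dataset where results land.
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


# Usage (requires `pip install apify-client` and an Apify API token):
#   from apify_client import ApifyClient
#   client = ApifyClient("<YOUR_APIFY_TOKEN>")
#   items = run_scholar_actor(
#       client,
#       "<ACTOR_ID>",  # placeholder: copy the real ID from the actor's page
#       {"mode": "author_profile", "author_id": "LSsXyncAAAAJ"},
#   )
```

Passing the client in as an argument keeps the function easy to exercise with a stub before spending money on real runs.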
FAQ
Q: Why is my run charged for setup even when there are no results?
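The $0.02 setup fee covers the run's instance provisioning, so it is charged whether or not any results come back. To keep an empty run cheap, use tight inputs (max_pages=1, a narrow query): the run then costs the setup fee plus a single query charge, $0.04 in total.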
Q: How do I find a result_id or author_id?
Run mode=search first. Each item in the output contains result_id (use it for mode=cite or as a cites / cluster value) and publication_info.authors[].author_id (use it for any author mode).
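As a sketch of the second half of that answer, the helper below collects author IDs from search items, assuming the `publication_info.authors[].author_id` shape shown in the example output. `author_ids` is a hypothetical name, and authors without a Scholar profile are assumed to have no usable `author_id`.

```python
def author_ids(search_items):
    """Collect distinct author_id values from publication_info.authors, in first-seen order."""
    ids = []
    for item in search_items:
        for author in item.get("publication_info", {}).get("authors", []):
            author_id = author.get("author_id")
            if author_id and author_id not in ids:
                ids.append(author_id)
    return ids


items = [
    {"publication_info": {"authors": [{"name": "A", "author_id": "LSsXyncAAAAJ"}]}},
    {"publication_info": {"authors": [{"name": "A", "author_id": "LSsXyncAAAAJ"},
                                      {"name": "B"}]}},
]
print(author_ids(items))  # ['LSsXyncAAAAJ']
```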
Q: What languages are supported?
The hl enum exposes the 29 most common languages. The upstream API supports more; if you need one that is not in the list, open an issue and we will add it.
Q: Why doesn't pagination always reach max_pages?
Google Scholar stops returning results when it runs out of matches. Pagination ends early when the upstream API returns fewer items than num or signals no next page.
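The stopping rule can be sketched as a loop. This is an illustration of the behaviour described above, not the actor's actual source. `fetch_page` is a hypothetical callable that returns one page of items plus a flag saying whether a next page exists.

```python
def paginate(fetch_page, num, max_pages=1):
    """Collect items page by page and return (items, pages_fetched).

    fetch_page(page_index) must return (items, has_next).
    max_pages=0 means no limit, matching the input parameter.
    """
    collected = []
    pages = 0
    while max_pages == 0 or pages < max_pages:
        items, has_next = fetch_page(pages)
        collected.extend(items)
        pages += 1
        if len(items) < num or not has_next:
            break  # short page or no next page: upstream ran out of matches
    return collected, pages


# Stub source with 25 matches, served 10 per page.
data = list(range(25))

def fake_fetch(page):
    chunk = data[page * 10:(page + 1) * 10]
    return chunk, (page + 1) * 10 < len(data)

items, pages = paginate(fake_fetch, num=10, max_pages=5)
print(len(items), pages)  # 25 3
```

In the stub run, five pages were allowed but only three were fetched, which is the early stop the answer describes.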
Q: My run failed with an authentication error.
The actor needs an API key configured by the publisher. If you see "Missing SERPAI_KEY", the deployment is misconfigured - please report it.
Links
- Source code on GitHub: https://github.com/johnvc/ApifyGoogleScholar
- More scrapers from this publisher: https://apify.com/johnvc
Last Updated: 2026.05.17