Google Scholar Scraper — Papers, Authors, Cites

Pricing

from $7.00 / 1,000 scholar pages

Scrape Google Scholar at scale: paper search with year range + language filters, author profile lookup (h-index, i10-index, interests, co-authors, full article list), citation formats (MLA, APA, Chicago, Harvard, Vancouver) with BibTeX / RIS / EndNote / RefWorks exports.

Developer

Scrape Badger

Maintained by Community

What does Google Scholar Scraper do?

Google Scholar Scraper collects Google Scholar data at scale: papers, author profiles, per-year author citation charts, and citation exports (MLA / APA / Chicago / Harvard / Vancouver, plus BibTeX / RIS / EndNote / RefWorks).

Why use Google Scholar Scraper?

  • Five modes. Search Papers, Search Author Profiles, Get Author Profile (with articles), Get Author Citation Chart, Get Paper Citation Formats.
  • Year range filter. as_ylo sets the earliest publication year and as_yhi the latest, for date-bound literature reviews.
  • Citation export. Formatted MLA / APA / Chicago / Harvard / Vancouver citations, plus BibTeX / RIS files.
  • Author citation chart. Per-year citation counts for academic impact tracking.
  • Lowest price among the Scholar Actors in the benchmark below. About $0.70 per 1,000 papers.

What data can Google Scholar Scraper extract?

Field | Type | Description
record_type | string | paper / author / article / citation_chart_point / citation_format
title | string | Paper title / author name
authors | string | Author list
publication | string | Journal / venue
cited_by | number | Citation count
year | number | Publication year
link | string | Paper URL
snippet | string | Abstract preview
cluster_id | string | Google Scholar's paper ID
author_id | string | Google Scholar's author ID
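
To make the field list concrete, here is a minimal Python sketch of a typed record. The field names and types come from the table above; treating every field except record_type as optional is an assumption, because the listing does not say which fields each record type carries. ScholarRecord and parse_record are hypothetical names used only for illustration.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class ScholarRecord:
    """One dataset row. Only record_type is assumed to be always present."""
    record_type: str
    title: Optional[str] = None
    authors: Optional[str] = None
    publication: Optional[str] = None
    cited_by: Optional[int] = None
    year: Optional[int] = None
    link: Optional[str] = None
    snippet: Optional[str] = None
    cluster_id: Optional[str] = None
    author_id: Optional[str] = None


def parse_record(raw: dict) -> ScholarRecord:
    """Keep the documented fields and ignore any other keys in the row."""
    known = {f.name for f in fields(ScholarRecord)}
    return ScholarRecord(**{k: v for k, v in raw.items() if k in known})
```

A row without record_type raises TypeError, which surfaces malformed rows early instead of passing them downstream.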

How to scrape Google Scholar

  1. Click Try for free.
  2. Pick a mode.
  3. For Search Papers: enter q, plus an optional as_ylo / as_yhi year range.
  4. For Get Author Profile: enter author_id.
  5. For Get Paper Citation Formats: enter the paper's cluster_id in q.
  6. Click Start. Papers and authors stream into the dataset.
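
The same steps can be sketched as run inputs built in Python. This is an illustration under stated assumptions: the listing names the five modes but not the exact strings the Actor's input schema expects, so the snake_case mode identifiers below are placeholders, and the choice of mauthors (rather than q) for Search Author Profiles follows the Input table. build_input is a hypothetical helper.

```python
# Hypothetical mode identifiers mapped to the field each mode needs.
# Check the Actor's Input tab for the real mode values before using these.
REQUIRED_BY_MODE = {
    "search_papers": ["q"],
    "search_author_profiles": ["mauthors"],
    "get_author_profile": ["author_id"],
    "get_author_citation_chart": ["author_id"],
    "get_paper_citation_formats": ["q"],  # q carries the cluster_id here
}


def build_input(mode: str, **params) -> dict:
    """Return a run-input dict, failing fast if a required field is missing."""
    if mode not in REQUIRED_BY_MODE:
        raise ValueError(f"unknown mode: {mode!r}")
    missing = [name for name in REQUIRED_BY_MODE[mode] if not params.get(name)]
    if missing:
        raise ValueError(f"{mode} requires: {', '.join(missing)}")
    # Drop optional fields that were left unset (None).
    return {"mode": mode, **{k: v for k, v in params.items() if v is not None}}
```

For example, build_input("search_papers", q="graph neural networks", as_ylo=2020) yields an input limited to papers from 2020 onward.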

How much will it cost?

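$0.007 per page in list modes and $0.003 per single-shot call. The page rate equals the listed $7.00 per 1,000 pages and works out to about $0.70 per 1,000 papers, which implies roughly 10 papers per page. An author citation chart or a citation format lookup is one single-shot call at $0.003.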
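
As a rough budgeting aid, the arithmetic above can be written as a small Python function. The two prices are the ones quoted in this section; the default of 10 papers per page is an assumption inferred from the $0.70 per 1,000 papers figure, not a documented value, and estimate_cost is a hypothetical name.

```python
import math

PRICE_PER_PAGE = 0.007         # list modes, USD per page (from this section)
PRICE_PER_SINGLE_SHOT = 0.003  # author chart or citation format call, USD


def estimate_cost(papers: int = 0, papers_per_page: int = 10,
                  single_shot_calls: int = 0) -> float:
    """Estimate a run's cost in USD; partial pages are billed as full pages."""
    pages = math.ceil(papers / papers_per_page)
    total = pages * PRICE_PER_PAGE + single_shot_calls * PRICE_PER_SINGLE_SHOT
    return round(total, 4)
```

Under these assumptions, 1,000 papers need 100 pages, so the estimate is 100 × $0.007 = $0.70.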

Competitor benchmark

Actor | Author | Price | Notes
easyapi/google-scholar-scraper | easyapi | ~$10 / 1k papers | Search-only
scrapestorm/google-scholar-scraper | ScrapeStorm | ~$8.99 / 1k | Search-only
marco_gullo/google-scholar | Marco Gullo | Variable | Per-run
scrape-badger/google-scholar-scraper | ScrapeBadger | $0.70 / 1k papers | 5 modes in one actor

Input

Configure the run in the Input tab above, or pass a JSON object matching the fields below when calling the Actor via the Apify API.

Field | Required | Description
mode | | One of 5 modes.
q | Search modes | Query (keyword / cluster_id / author name).
author_id | Author modes | Google Scholar author ID.
as_ylo / as_yhi | | Year range filter.
as_sdt | | Article type filter.
num / page | | Pagination.
mauthors | Search Author Profiles | Author name query.
cstart / pagesize | Get Author Profile articles | Pagination.
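
For API use, a minimal sketch with only the Python standard library might look like the following. It targets Apify's synchronous run-sync-get-dataset-items endpoint and takes the Actor slug scrape-badger/google-scholar-scraper from the benchmark table; the mode value in any input you pass is still subject to the Actor's real input schema, and build_run_request and run_actor are hypothetical helper names.

```python
import json
import urllib.request

# In Apify API paths the "/" in an Actor slug is written as "~".
ACTOR_ID = "scrape-badger~google-scholar-scraper"


def build_run_request(run_input: dict, token: str) -> urllib.request.Request:
    """Build (but do not send) a POST that runs the Actor and returns its items."""
    url = f"https://api.apify.com/v2/acts/{ACTOR_ID}/run-sync-get-dataset-items"
    return urllib.request.Request(
        url,
        data=json.dumps(run_input).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def run_actor(run_input: dict, token: str) -> list:
    """Send the request and decode the dataset items from the JSON response."""
    with urllib.request.urlopen(build_run_request(run_input, token)) as resp:
        return json.load(resp)
```

The synchronous endpoint suits short runs; for long paginated jobs, starting the run asynchronously and reading the dataset afterwards is likely the safer pattern.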

Output

Every successful run streams records into the run's dataset. Download as JSON, CSV, XML, Excel, or HTML from the Dataset tab; consume programmatically via the Apify API or webhooks.

Example record:

{
  "record_type": "paper",
  "title": "Attention Is All You Need",
  "authors": "Vaswani, Ashish et al.",
  "publication": "NeurIPS 2017",
  "cited_by": 98234,
  "year": 2017,
  "link": "https://arxiv.org/abs/1706.03762",
  "snippet": "The dominant sequence transduction models\u2026",
  "cluster_id": "1234567890"
}

Tips / Advanced options

  • Use as_ylo for literature reviews. It sets the earliest publication year, e.g. as_ylo: 2020 returns ML papers from 2020 onward.
  • Citation chart for academic impact. Per-year citation counts are well suited to tenure-track impact reports.
  • BibTeX export for reference managers. Get Paper Citation Formats → BibTeX → drop into Zotero / Mendeley.
  • Dedupe by cluster_id. The same paper can appear on arXiv, in a journal, and on a preprint server; Google groups all versions under one cluster_id, so use it as the dedupe key.
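
The dedupe tip can be sketched in Python as follows. It operates on dataset rows shaped like the example record; keeping the row with the highest cited_by when several share a cluster_id is an illustrative choice, not something the Actor prescribes, and dedupe_by_cluster is a hypothetical name.

```python
def dedupe_by_cluster(records: list) -> list:
    """Keep one row per cluster_id, preferring the highest cited_by count.

    Rows without a cluster_id (for example author records) pass through.
    """
    best = {}
    passthrough = []
    for rec in records:
        cid = rec.get("cluster_id")
        if not cid:
            passthrough.append(rec)
            continue
        current = best.get(cid)
        # Treat a missing cited_by as 0 when comparing rows.
        if current is None or (rec.get("cited_by") or 0) > (current.get("cited_by") or 0):
            best[cid] = rec
    return list(best.values()) + passthrough
```

This is mainly useful when merging results from several overlapping queries into one reading list.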

FAQ, Disclaimers, Support

What's cluster_id?

Google Scholar's internal paper identifier. All versions (arXiv, journal, preprint) of the same paper share a cluster_id.

How do I find author_id?

Run Search Author Profiles first; each returned author has an author_id. Alternatively, copy it from the user parameter in the author's profile URL (&user=…).
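
Pulling the ID out of a profile URL can be done with the Python standard library, as in this small sketch. The function name author_id_from_url and the example URL with the placeholder ID EXAMPLE_ID are hypothetical.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse


def author_id_from_url(profile_url: str) -> Optional[str]:
    """Return the value of the user query parameter, or None if it is absent."""
    query = parse_qs(urlparse(profile_url).query)
    values = query.get("user")
    return values[0] if values else None


# Placeholder URL; a real profile URL carries the author's actual ID.
print(author_id_from_url("https://scholar.google.com/citations?hl=en&user=EXAMPLE_ID"))
```

Parsing the query string rather than splitting on "&user=" keeps this working when user is the first parameter or is followed by others.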

Does this include PDF text?

No — only Scholar's metadata (title, authors, snippet, link). Download the PDF from link separately.

What's the citation count source?

Google Scholar's own cited_by count. It can differ from the counts reported by Web of Science, Scopus, or OpenCitations.

Disclaimer

This Actor scrapes public Google data only. You're responsible for compliance with Google's Terms of Service and any applicable data-protection laws (GDPR, CCPA, etc.) in your jurisdiction. ScrapeBadger does not store the scraped results — they are delivered directly to your Apify dataset.

Support

Something not working? Open a ticket in the Issues tab above — we triage within one business day. Full API reference: docs.scrapebadger.com.

Powered by

ScrapeBadger — Google-optimised residential proxy pool + browser-farm fallback, 99.7% uptime, unmetered bandwidth. No CAPTCHAs reach you.