Google Scholar Scraper — Papers, Authors, Cites
Pricing
from $7.00 / 1,000 scholar pages
Scrape Google Scholar at scale: paper search with year range + language filters, author profile lookup (h-index, i10-index, interests, co-authors, full article list), citation formats (MLA, APA, Chicago, Harvard, Vancouver) with BibTeX / RIS / EndNote / RefWorks exports.
Developer
Scrape Badger
What does Google Scholar Scraper do?
Scrape Google Scholar at scale — papers, author profiles, author citation charts, and citation format exports (MLA / APA / Chicago / BibTeX / RIS).
Why use Google Scholar Scraper?
- Five modes. Search Papers, Search Author Profiles, Get Author Profile (with articles), Get Author Citation Chart, Get Paper Citation Formats.
- Year range filter. as_ylo / as_yhi for date-bound literature reviews.
- Citation export. Direct MLA / APA / Chicago / Harvard / Vancouver + BibTeX / RIS.
- Author citation chart. Per-year citation counts for academic impact tracking.
- Low price. About $0.70 / 1k papers, below the fixed-price actors listed in the competitor benchmark.
What data can Google Scholar Scraper extract?
| Field | Type | Description |
|---|---|---|
| record_type | string | paper / author / article / citation_chart_point / citation_format |
| title | string | Paper / author name |
| authors | string | Author list |
| publication | string | Journal / venue |
| cited_by | number | Citation count |
| year | number | Publication year |
| link | string | Paper URL |
| snippet | string | Abstract preview |
| cluster_id | string | Google Scholar's paper ID |
| author_id | string | Google Scholar's author ID |
How to scrape Google Scholar
- Click Try for free.
- Pick mode.
- For Search Papers: enter q, optional as_ylo / as_yhi year range.
- For Author Profile: enter author_id.
- For Citation Formats: enter q (cluster_id).
- Click Start. Papers / authors stream into the dataset.
How much will it cost?
$0.007 per page in list modes and $0.003 per single-shot call. At 10 papers per page, that works out to about $0.70 per 1,000 papers. Author citation chart and citation format requests are single-shot calls at $0.003 each.
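As a rough budgeting aid, the prices above can be turned into a small estimator. This is a sketch, not an official calculator: the function name is hypothetical, and the 10-papers-per-page figure is an assumption inferred from the quoted $0.70 per 1,000 papers at $0.007 per page.

```python
import math

PRICE_PER_PAGE = 0.007         # list modes, per page (from the pricing above)
PRICE_PER_SINGLE_SHOT = 0.003  # author chart / citation format, per call
PAPERS_PER_PAGE = 10           # assumption: implied by $0.70 per 1,000 papers


def estimate_cost(papers: int = 0, single_shot_calls: int = 0) -> float:
    """Return an approximate run cost in USD (hypothetical helper)."""
    # A partially filled page is still billed as a page, so round up.
    pages = math.ceil(papers / PAPERS_PER_PAGE)
    total = pages * PRICE_PER_PAGE + single_shot_calls * PRICE_PER_SINGLE_SHOT
    return round(total, 4)


print(estimate_cost(papers=1000))                     # 100 pages of list results
print(estimate_cost(papers=25, single_shot_calls=2))  # 3 pages plus 2 single-shot calls
```

Actual charges depend on how many results each page returns, so treat the output as an estimate only.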
Competitor benchmark
| Actor | Author | Price | Notes |
|---|---|---|---|
| easyapi/google-scholar-scraper | easyapi | ~$10 / 1k papers | Search-only |
| scrapestorm/google-scholar-scraper | ScrapeStorm | ~$8.99 / 1k | Search-only |
| marco_gullo/google-scholar | Marco Gullo | Variable | Per-run |
| scrape-badger/google-scholar-scraper | ScrapeBadger | $0.70 / 1k papers | 5 modes in one actor |
Input
Configure the run in the Input tab above, or pass a JSON object matching the fields below when calling the Actor via the Apify API.
| Field | Required | Description |
|---|---|---|
| mode | ✅ | One of 5 modes. |
| q | Search modes | Query (keyword / cluster_id / author name). |
| author_id | Author modes | Google Scholar author ID. |
| as_ylo / as_yhi | — | Year range filter. |
| as_sdt | — | Article type filter. |
| num / page | — | Pagination. |
| mauthors | Search Author Profiles | Author name query. |
| cstart / pagesize | Get Author Profile articles | Pagination. |
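A minimal sketch of assembling that JSON object in Python follows. The field names come from the table above, but the mode values are an assumption: they are written here as the display names used on this page, and the Actor's real input schema may use different identifiers. The mapping of required fields per mode is likewise one reading of the table, and `build_input` is a hypothetical helper.

```python
import json

# Assumption: mode values are the display names from this page;
# check the Input tab for the identifiers the Actor actually expects.
REQUIRED_BY_MODE = {
    "Search Papers": ["q"],
    "Search Author Profiles": ["mauthors"],
    "Get Author Profile": ["author_id"],
    "Get Author Citation Chart": ["author_id"],
    "Get Paper Citation Formats": ["q"],  # q carries the cluster_id here
}


def build_input(mode: str, **fields) -> dict:
    """Build a run input dict and check the fields marked as required."""
    if mode not in REQUIRED_BY_MODE:
        raise ValueError(f"unknown mode: {mode!r}")
    missing = [name for name in REQUIRED_BY_MODE[mode] if not fields.get(name)]
    if missing:
        raise ValueError(f"{mode} requires: {', '.join(missing)}")
    # Drop unset optional fields so the JSON only carries what was provided.
    return {"mode": mode, **{k: v for k, v in fields.items() if v is not None}}


run_input = build_input("Search Papers", q="graph neural networks",
                        as_ylo=2020, as_yhi=2024)
print(json.dumps(run_input, indent=2))
```

The resulting dict can be pasted into the Input tab's JSON view or sent as the run input through the Apify API.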
Output
Every successful run streams records into the run's dataset. Download as JSON, CSV, XML, Excel, or HTML from the Dataset tab; consume programmatically via the Apify API or webhooks.
Example record:
{"record_type": "paper","title": "Attention Is All You Need","authors": "Vaswani, Ashish et al.","publication": "NeurIPS 2017","cited_by": 98234,"year": 2017,"link": "https://arxiv.org/abs/1706.03762","snippet": "The dominant sequence transduction models\u2026","cluster_id": "1234567890"}
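Because one dataset can mix several record_type values, it often helps to split records by type before analysis. The sketch below assumes a JSON export that is an array of records shaped like the example above. The second record is invented purely for illustration, and `group_by_record_type` is a hypothetical helper.

```python
import json
from collections import defaultdict

# Raw string so that JSON escapes such as \u2026 are decoded by json.loads.
# The first record is the example from this page; the second is made up.
exported = r"""
[
  {"record_type": "paper", "title": "Attention Is All You Need",
   "authors": "Vaswani, Ashish et al.", "publication": "NeurIPS 2017",
   "cited_by": 98234, "year": 2017,
   "link": "https://arxiv.org/abs/1706.03762",
   "snippet": "The dominant sequence transduction models\u2026",
   "cluster_id": "1234567890"},
  {"record_type": "author", "title": "Example Author",
   "author_id": "EXAMPLE_ID", "cited_by": 120}
]
"""


def group_by_record_type(records):
    """Return {record_type: [records...]}, keeping input order within a type."""
    grouped = defaultdict(list)
    for record in records:
        grouped[record.get("record_type", "unknown")].append(record)
    return dict(grouped)


grouped = group_by_record_type(json.loads(exported))
for record_type, items in grouped.items():
    print(record_type, len(items))
```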
Tips / Advanced options
- Use as_ylo for literature reviews. Bound by date, e.g. as_ylo: 2020 for ML papers from 2020 onward.
- Citation chart for academic impact. Per-year citation counts, useful for tenure-track impact reports.
- BibTeX export for reference managers. Get Paper Citation Formats → BibTeX → drop into Zotero / Mendeley.
- Dedupe by cluster_id. The same paper can appear on arXiv, in a journal, and on a preprint server; Google merges these versions under one cluster_id.
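The dedupe tip can be sketched as below, for example when merging results from several queries or runs. Keeping the record with the highest cited_by is an assumption about which copy is preferable, not something the Actor prescribes, and `dedupe_by_cluster_id` is a hypothetical helper.

```python
def dedupe_by_cluster_id(records):
    """Keep one record per cluster_id, preferring the highest cited_by.

    Records without a cluster_id (authors, chart points, ...) pass through
    unchanged and are appended after the deduplicated papers.
    """
    best = {}
    passthrough = []
    for record in records:
        cluster_id = record.get("cluster_id")
        if not cluster_id:
            passthrough.append(record)
            continue
        current = best.get(cluster_id)
        if current is None or (record.get("cited_by") or 0) > (current.get("cited_by") or 0):
            best[cluster_id] = record
    return list(best.values()) + passthrough


# Illustrative records only; the IDs and counts are made up.
merged = dedupe_by_cluster_id([
    {"record_type": "paper", "cluster_id": "A", "cited_by": 10},
    {"record_type": "paper", "cluster_id": "B", "cited_by": 3},
    {"record_type": "paper", "cluster_id": "A", "cited_by": 12},
    {"record_type": "author", "author_id": "EXAMPLE_ID"},
])
print(len(merged))
```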
FAQ, Disclaimers, Support
What's cluster_id?
Google Scholar's internal paper identifier. All versions (arXiv, journal, preprint) of the same paper share a cluster_id.
How do I find author_id?
Run Search Author Profiles first — each returned author has an author_id. Or grab it from the author's profile URL (&user=…).
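Pulling the ID out of a profile URL can be sketched with Python's standard library. The URL below uses a placeholder ID, and `author_id_from_url` is a hypothetical helper. The only assumption about the URL is that it carries a user query parameter, as described above.

```python
from urllib.parse import urlparse, parse_qs


def author_id_from_url(profile_url: str):
    """Return the value of the `user` query parameter, or None if absent."""
    query = parse_qs(urlparse(profile_url).query)
    values = query.get("user")
    return values[0] if values else None


# EXAMPLE_ID is a placeholder, not a real author ID.
print(author_id_from_url("https://scholar.google.com/citations?hl=en&user=EXAMPLE_ID"))
```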
Does this include PDF text?
No — only Scholar's metadata (title, authors, snippet, link). Download the PDF from link separately.
What's the citation count source?
Google Scholar's own cited_by count. It can differ from the counts reported by Web of Science, Scopus, or OpenCitations.
Disclaimer
This Actor scrapes public Google data only. You're responsible for compliance with Google's Terms of Service and any applicable data-protection laws (GDPR, CCPA, etc.) in your jurisdiction. ScrapeBadger does not store the scraped results — they are delivered directly to your Apify dataset.
Support
Something not working? Open a ticket in the Issues tab above — we triage within one business day. Full API reference: docs.scrapebadger.com.
Powered by
ScrapeBadger — Google-optimised residential proxy pool + browser-farm fallback, 99.7% uptime, unmetered bandwidth. No CAPTCHAs reach you.