Google Scholar Profiles Scraper
Pricing
from $1.21 / 1,000 profile results
Extract public Google Scholar author profiles, citation metrics, h-index, i10-index, interests, coauthors, and publication rows from profile URLs or user IDs.
Developer: Hanna Nosova
Maintained by Community
What does Google Scholar Profiles Scraper do?
Google Scholar Profiles Scraper turns public Google Scholar author pages into a clean Apify dataset.
It helps you collect structured academic profile data without copying fields by hand.
You can provide full profile URLs or the user IDs from Google Scholar profile links.
The actor returns author identity fields, affiliation, verified email domain, research interests, citation metrics, visible publication rows, and optional coauthor cards.
It is designed for small to medium research-intelligence workflows that need repeatable exports.
Who is it for?
- University research offices tracking faculty visibility.
- Research intelligence teams building scholar directories.
- Academic recruiters finding subject-matter experts.
- Publishers and journals mapping authors in a field.
- SEO and content analysts reviewing public academic authority signals.
- Data enrichment teams matching researcher names to public profile evidence.
Why use this Google Scholar scraper?
Manual profile review is slow and inconsistent.
This actor gives you the same set of fields for every public profile you submit.
It is useful when you already have profile URLs from search, CRM records, author pages, or internal lists.
The output is ready for CSV, JSON, Excel, API, Make, Zapier, or database pipelines.
Use Google Scholar profile data as a public researcher API
Use this actor as a repeatable public researcher-data workflow for profile enrichment, citation monitoring, publication exports, and academic authority review. It is not an official Google Scholar API, and it only collects data visible on public Google Scholar profile pages.
What data can you extract?
The output leads with citation metrics (total citations, recent citations, h-index, and i10-index) and also includes public interests, coauthors, and visible publication rows.
| Field | Description |
|---|---|
| profileUrl | Final public Google Scholar profile URL |
| userId | Google Scholar user ID from the URL |
| name | Public author name |
| affiliation | Public affiliation line |
| verifiedEmailDomain | Verified email domain when shown |
| interests | Public research interest tags |
| citations | Total citation count |
| citationsSince2019 | Recent citation count shown by Google Scholar |
| hIndex | Total h-index |
| hIndexSince2019 | Recent h-index |
| i10Index | Total i10-index |
| i10IndexSince2019 | Recent i10-index |
| publicationCount | Number of publication rows saved |
| publications | Visible publication rows with title, authors, venue, year, citations, and URL |
| coauthors | Visible coauthor cards when enabled |
| scrapedAt | Timestamp for freshness tracking |
How much does it cost to scrape Google Scholar profiles?
This actor uses pay-per-event pricing.
You pay a small start fee per run, plus a per-profile result fee for each saved profile. On the BRONZE tier the result fee is about $2.03 per 1,000 saved profiles; higher platform tiers have lower per-profile rates.
The default test input is intentionally small so your first run stays inexpensive.
For best cost control, start with one or two profiles and increase volume after checking the output.
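As a rough planning aid, the arithmetic behind a run's cost can be sketched as below. The $2.03 per 1,000 default is the BRONZE figure quoted above. The start fee amount is not stated in this README, so `start_fee_usd` is a hypothetical parameter that you would fill in from the actor's pricing tab.

```python
def estimate_run_cost(
    profiles: int,
    rate_per_1000_usd: float = 2.03,  # BRONZE figure quoted in this README
    start_fee_usd: float = 0.0,       # assumption: the real start fee is not given here
) -> float:
    """Estimate the cost of one run in USD: start fee plus per-profile fees."""
    if profiles < 0:
        raise ValueError("profiles must be non-negative")
    return start_fee_usd + profiles * rate_per_1000_usd / 1000


# Example: compare a two-profile test run with a 5,000-profile batch.
small_run = estimate_run_cost(2)
large_run = estimate_run_cost(5000)
```

The estimate only counts saved profiles, since that is the event the per-profile fee is tied to.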
Input options
Google Scholar profile URLs
Use profileUrls when you have full profile links.
Example:
[{ "url": "https://scholar.google.com/citations?user=qc6CJjYAAAAJ&hl=en" }]
Google Scholar user IDs
Use userIds when you only have the user= value.
Example:
["qc6CJjYAAAAJ"]
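If your source list mixes full URLs with duplicates, it may help to normalize everything to user IDs before a run. The following is a minimal sketch using only the standard library. The helper names `extract_user_id` and `unique_user_ids` are illustrative and are not part of the actor.

```python
from typing import Iterable, List, Optional
from urllib.parse import parse_qs, urlparse


def extract_user_id(profile_url: str) -> Optional[str]:
    """Return the value of the user= query parameter, or None if it is absent."""
    values = parse_qs(urlparse(profile_url).query).get("user")
    return values[0] if values else None


def unique_user_ids(profile_urls: Iterable[str]) -> List[str]:
    """Collect user IDs in first-seen order, skipping URLs without user=."""
    seen = set()
    ordered = []
    for url in profile_urls:
        user_id = extract_user_id(url)
        if user_id and user_id not in seen:
            seen.add(user_id)
            ordered.append(user_id)
    return ordered
```

The resulting list can be passed as `userIds`. URLs that return `None` are usually search-result links rather than profile links.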
Maximum publications per profile
Use maxPublications to control how many visible publication rows are included for each profile.
Set it to 0 if you only need profile-level metrics.
Include coauthors
Use includeCoauthors to include or skip visible coauthor cards.
Example input
    {
      "profileUrls": [
        { "url": "https://scholar.google.com/citations?user=qc6CJjYAAAAJ&hl=en" }
      ],
      "userIds": [],
      "maxPublications": 10,
      "includeCoauthors": true,
      "proxyConfiguration": {
        "useApifyProxy": true,
        "apifyProxyGroups": ["RESIDENTIAL"]
      }
    }
Example output
    {
      "profileUrl": "https://scholar.google.com/citations?user=qc6CJjYAAAAJ&hl=en",
      "userId": "qc6CJjYAAAAJ",
      "name": "Albert Einstein",
      "affiliation": "Professor of Physics",
      "verifiedEmailDomain": "example.edu",
      "interests": ["Physics", "Relativity"],
      "citations": 123456,
      "citationsSince2019": 12345,
      "hIndex": 99,
      "hIndexSince2019": 40,
      "i10Index": 200,
      "i10IndexSince2019": 80,
      "publicationCount": 10,
      "publications": [
        {
          "title": "Example publication title",
          "authors": "A Einstein",
          "venue": "Journal name",
          "year": 1915,
          "citations": 1000,
          "url": "https://scholar.google.com/citations?..."
        }
      ],
      "coauthors": [],
      "scrapedAt": "2026-06-24T00:00:00.000Z"
    }
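Because `publications` is a nested array, a flat CSV export needs one row per publication. The sketch below shows one way to flatten a profile record shaped like the example output. The column choice and the helper names `publication_rows` and `rows_to_csv` are assumptions made for illustration.

```python
import csv
import io
from typing import Any, Dict, List

# Publication keys taken from the example output above.
PUBLICATION_KEYS = ["title", "authors", "venue", "year", "citations", "url"]
COLUMNS = ["userId", "name"] + PUBLICATION_KEYS


def publication_rows(profile: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return one flat row per publication, tagged with the author's ID and name."""
    rows = []
    for pub in profile.get("publications") or []:
        row = {"userId": profile.get("userId"), "name": profile.get("name")}
        for key in PUBLICATION_KEYS:
            row[key] = pub.get(key)  # missing fields stay empty in the CSV
        rows.append(row)
    return rows


def rows_to_csv(rows: List[Dict[str, Any]]) -> str:
    """Serialize flat rows to CSV text with a fixed column order."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

Profiles scraped with `maxPublications` set to 0 simply produce no rows.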
How to scrape Google Scholar author profiles
- Open the actor input form.
- Paste one or more public Google Scholar profile URLs.
- Or paste one or more Google Scholar user IDs.
- Choose the maximum number of publication rows per profile.
- Decide whether to include coauthors.
- Run the actor.
- Download the dataset as CSV, JSON, Excel, XML, or HTML.
Tips for better results
- Use public profile URLs, not search-result URLs.
- Keep the first run small to confirm field coverage.
- Use user IDs when you have already normalized profiles in your own database.
- Lower maxPublications if you only need citation metrics.
- Enable proxy settings only if your run is rate-limited.
Working with rate limits
Google Scholar can limit automated traffic.
If a run reports a challenge or unusual-traffic page, retry later with a smaller batch.
For larger batches, use Apify residential proxy and keep runs conservative.
The actor is designed to stop clearly when challenged instead of saving empty rows.
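The "smaller batch, retry later" advice can be sketched on the client side as follows. This README does not specify how a challenged run is reported to API callers, so the sketch assumes a hypothetical `run_batch` callable that raises `RuntimeError` when a batch is challenged. The batch size and delays are illustrative defaults, not recommendations from the actor.

```python
import time
from typing import Any, Callable, Dict, Iterator, List, Sequence


def chunked(items: Sequence[str], size: int) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def run_in_batches(
    user_ids: Sequence[str],
    run_batch: Callable[[Sequence[str]], List[Dict[str, Any]]],  # hypothetical wrapper around one actor run
    batch_size: int = 5,
    max_retries: int = 3,
    base_delay_s: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[Dict[str, Any]]:
    """Run small batches in order, backing off exponentially after a challenge."""
    results: List[Dict[str, Any]] = []
    for batch in chunked(user_ids, batch_size):
        for attempt in range(max_retries + 1):
            try:
                results.extend(run_batch(batch))
                break
            except RuntimeError:  # assumption: this is how a challenge surfaces
                if attempt == max_retries:
                    raise
                sleep(base_delay_s * 2 ** attempt)
    return results
```

Passing `sleep` as a parameter keeps the helper easy to test without real waiting.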
Integrations
Use the actor with Apify integrations to automate research workflows.
- Export to Google Sheets for manual review.
- Send JSON to a data warehouse.
- Trigger a Make scenario when a dataset is ready.
- Enrich CRM records with public citation metrics.
- Feed dashboards with profile-level metrics.
API usage
Node.js
    import { ApifyClient } from 'apify-client';

    const client = new ApifyClient({ token: process.env.APIFY_TOKEN });

    const run = await client.actor('fetch_cat/google-scholar-profiles-scraper').call({
        profileUrls: [{ url: 'https://scholar.google.com/citations?user=qc6CJjYAAAAJ&hl=en' }],
        maxPublications: 10,
        includeCoauthors: true,
    });

    console.log(run.defaultDatasetId);
Python
    import os

    from apify_client import ApifyClient

    client = ApifyClient(os.environ['APIFY_TOKEN'])

    run = client.actor('fetch_cat/google-scholar-profiles-scraper').call(run_input={
        'userIds': ['qc6CJjYAAAAJ'],
        'maxPublications': 10,
        'includeCoauthors': True,
    })

    print(run['defaultDatasetId'])
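The Python example stops at printing the dataset ID. A possible next step is to read the saved profile records back, sketched here with the Apify Python client's `dataset(...).iterate_items()`. The helper name `h_index_by_user` is illustrative, and `client` is assumed to be an `ApifyClient` created as in the example above.

```python
from typing import Any, Dict, Optional


def h_index_by_user(client: Any, dataset_id: str) -> Dict[str, Optional[int]]:
    """Map each saved profile's userId to its hIndex.

    `client` is expected to be an apify_client.ApifyClient instance.
    """
    summary: Dict[str, Optional[int]] = {}
    for item in client.dataset(dataset_id).iterate_items():
        summary[item.get("userId")] = item.get("hIndex")
    return summary


# Usage, continuing from the run above:
#   print(h_index_by_user(client, run['defaultDatasetId']))
```

The client is passed in rather than created inside the helper so the same function works with any authenticated client instance.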
cURL
    curl -X POST "https://api.apify.com/v2/acts/fetch_cat~google-scholar-profiles-scraper/runs?token=$APIFY_TOKEN" \
      -H 'Content-Type: application/json' \
      -d '{"userIds":["qc6CJjYAAAAJ"],"maxPublications":10,"includeCoauthors":true,"proxyConfiguration":{"useApifyProxy":true,"apifyProxyGroups":["RESIDENTIAL"]}}'
MCP usage
You can run this actor from AI tools through the Apify MCP server.
Use this MCP URL pattern:
https://mcp.apify.com/?tools=fetch_cat/google-scholar-profiles-scraper
Claude Code setup example:
    $ claude mcp add apify-google-scholar-profiles "https://mcp.apify.com/?tools=fetch_cat/google-scholar-profiles-scraper"
Claude Desktop JSON example:
    {
      "mcpServers": {
        "apify-google-scholar-profiles": {
          "url": "https://mcp.apify.com/?tools=fetch_cat/google-scholar-profiles-scraper"
        }
      }
    }
Example prompts:
- "Run the Google Scholar Profiles Scraper for these three profile URLs and summarize the h-index values."
- "Extract publication rows for this Scholar user ID and export the dataset link."
- "Compare citation metrics for these public author profiles."
Data quality notes
The actor returns fields visible on the public profile page at run time.
Some profiles may hide verified email information.
Some profiles may have fewer visible publication rows than your requested maximum.
Citation metrics can change over time as Google Scholar updates its index.
Limitations
This actor does not log into Google accounts.
It does not scrape private, hidden, or account-only data.
It does not guarantee that every Google Scholar profile is reachable at all times.
Large batches may require slower runs or proxy settings.
Legality
This actor extracts publicly visible profile information.
You are responsible for using the data lawfully and respecting applicable privacy, copyright, database, and platform rules.
Avoid collecting or processing personal data unless you have a valid legal basis.
FAQ and troubleshooting
Why did my run stop with a challenge message?
Google Scholar may have returned a rate-limit or unusual-traffic page.
Try a smaller batch, wait before retrying, or enable proxy settings.
Why are some publication fields empty?
Google Scholar pages do not always show every field for every publication row.
The actor saves the fields that are visible on the profile page.
Why did a profile produce no row?
The profile may be deleted, private, malformed, or temporarily unavailable.
Check that the URL contains a valid user= parameter.
Can I scrape publications for each Google Scholar profile?
Yes. Set maxPublications above 0 to include visible publication rows with titles, authors, venues, years, citations, and URLs where Google Scholar exposes them.
Can I monitor citation metrics over time?
Yes. Run the actor on a schedule and compare citations, citationsSince2019, hIndex, hIndexSince2019, i10Index, and i10IndexSince2019 across datasets.
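A comparison between two scheduled runs might look like the sketch below. It assumes each dataset has been loaded into a list of profile records keyed by `userId`. The helper name `metric_deltas` is illustrative.

```python
from typing import Any, Dict, List

# Metric fields listed in the output table of this README.
METRICS = (
    "citations",
    "citationsSince2019",
    "hIndex",
    "hIndexSince2019",
    "i10Index",
    "i10IndexSince2019",
)


def metric_deltas(
    previous: List[Dict[str, Any]],
    current: List[Dict[str, Any]],
) -> Dict[str, Dict[str, int]]:
    """Return current-minus-previous metric changes per userId.

    Profiles missing from the earlier snapshot, and metrics that are not
    integers in both snapshots, are skipped.
    """
    previous_by_id = {p["userId"]: p for p in previous if p.get("userId")}
    deltas: Dict[str, Dict[str, int]] = {}
    for item in current:
        old = previous_by_id.get(item.get("userId"))
        if old is None:
            continue
        deltas[item["userId"]] = {
            metric: item[metric] - old[metric]
            for metric in METRICS
            if isinstance(item.get(metric), int) and isinstance(old.get(metric), int)
        }
    return deltas
```

Storing `scrapedAt` alongside each delta makes it clear which two runs were compared.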
Does this access private or logged-in Google data?
No. The actor is limited to public Google Scholar profile pages. It does not log in, bypass Google restrictions, or access private account data.
Related scrapers
Explore related actors from fetch_cat:
- https://apify.com/fetch_cat/google-news-scraper
- https://apify.com/fetch_cat/google-autocomplete-scraper
- https://apify.com/fetch_cat/bing-search-results-scraper
Support
If you need a field that is visible on public Google Scholar pages but missing from the dataset, open an issue with an example profile URL.
Include your run ID and a short description of the expected field.
Changelog
0.1
Initial build with public profile metadata, citation metrics, interests, publication rows, coauthors, and Apify dataset output.