Google Scholar Profiles Scraper

Pricing

from $1.21 / 1,000 profile results

Extract public Google Scholar author profiles, citation metrics, h-index, i10-index, interests, coauthors, and publication rows from profile URLs or user IDs.

Developer

Hanna Nosova

Maintained by Community

Actor stats

  • Bookmarked: 0
  • Total users: 2
  • Monthly active users: 1
  • Last modified: 4 days ago

What does Google Scholar Profiles Scraper do?

Google Scholar Profiles Scraper turns public Google Scholar author pages into a clean Apify dataset.

It helps you collect structured academic profile data without copying fields by hand.

You can provide full profile URLs or the user IDs from Google Scholar profile links.

The actor returns author identity fields, affiliation, verified email domain, research interests, citation metrics, visible publication rows, and optional coauthor cards.

It is designed for small to medium research-intelligence workflows that need repeatable exports.

Who is it for?

  • 🎓 University research offices tracking faculty visibility.
  • 🧪 Research intelligence teams building scholar directories.
  • 🧑‍💼 Academic recruiters finding subject-matter experts.
  • 📚 Publishers and journals mapping authors in a field.
  • 🔎 SEO and content analysts reviewing public academic authority signals.
  • 🧩 Data enrichment teams matching researcher names to public profile evidence.

Why use this Google Scholar scraper?

Manual profile review is slow and inconsistent.

This actor gives you the same set of fields for every public profile you submit.

It is useful when you already have profile URLs from search, CRM records, author pages, or internal lists.

The output is ready for CSV, JSON, Excel, API, Make, Zapier, or database pipelines.

Use Google Scholar profile data as a public researcher API

Use this actor as a repeatable public researcher-data workflow for profile enrichment, citation monitoring, publication exports, and academic authority review. It is not an official Google Scholar API, and it only collects data visible on public Google Scholar profile pages.

What data can you extract?

The output leads with citation metrics (total citations, recent citations, h-index, and i10-index) and also includes public interests, coauthors, and visible publication rows.

| Field | Description |
| --- | --- |
| profileUrl | Final public Google Scholar profile URL |
| userId | Google Scholar user ID from the URL |
| name | Public author name |
| affiliation | Public affiliation line |
| verifiedEmailDomain | Verified email domain when shown |
| interests | Public research interest tags |
| citations | Total citation count |
| citationsSince2019 | Recent citation count shown by Google Scholar |
| hIndex | Total h-index |
| hIndexSince2019 | Recent h-index |
| i10Index | Total i10-index |
| i10IndexSince2019 | Recent i10-index |
| publicationCount | Number of publication rows saved |
| publications | Visible publication rows with title, authors, venue, year, citations, and URL |
| coauthors | Visible coauthor cards when enabled |
| scrapedAt | Timestamp for freshness tracking |

How much does it cost to scrape Google Scholar profiles?

This actor uses pay-per-event pricing.

You pay a small start fee per run plus a per-profile fee for each saved profile. On the BRONZE tier, the per-profile fee works out to about $2.03 per 1,000 saved profiles; higher platform tiers have lower per-profile rates.

The default test input is intentionally small so your first run stays inexpensive.

For best cost control, start with one or two profiles and increase volume after checking the output.
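
As a rough worked example of the pricing above, the sketch below estimates the cost of a run from the approximate BRONZE rate of $2.03 per 1,000 saved profiles. The start fee amount is not stated here, so `start_fee_usd` is a placeholder to fill in from the actor's pricing details, and the function name is illustrative only.

```python
def estimate_run_cost(saved_profiles: int,
                      rate_per_1000_usd: float = 2.03,
                      start_fee_usd: float = 0.0) -> float:
    """Estimate the cost of one run in USD.

    rate_per_1000_usd defaults to the approximate BRONZE rate quoted above.
    start_fee_usd is an assumption: replace 0.0 with the real start fee.
    """
    if saved_profiles < 0:
        raise ValueError("saved_profiles must be non-negative")
    per_profile_usd = rate_per_1000_usd / 1000
    return round(start_fee_usd + saved_profiles * per_profile_usd, 4)


# 500 saved profiles at the BRONZE rate, start fee not included.
print(estimate_run_cost(500))
```

Other platform tiers can be compared by passing a different `rate_per_1000_usd`.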

Input options

Google Scholar profile URLs

Use profileUrls when you have full profile links.

Example:

[
  { "url": "https://scholar.google.com/citations?user=qc6CJjYAAAAJ&hl=en" }
]

Google Scholar user IDs

Use userIds when you only have the user= value.

Example:

["qc6CJjYAAAAJ"]
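
If your own records mix full URLs and bare IDs, a small helper can convert between the two forms before you build the input. This is a minimal sketch using only the Python standard library; the function names are hypothetical, and the URL shape is taken from the example above.

```python
from urllib.parse import parse_qs, urlparse

SCHOLAR_PROFILE_BASE = "https://scholar.google.com/citations"


def extract_user_id(profile_url: str) -> str:
    """Return the value of the user= query parameter from a profile URL."""
    query = parse_qs(urlparse(profile_url).query)
    values = query.get("user", [])
    if not values:
        raise ValueError(f"No user= parameter in URL: {profile_url!r}")
    return values[0]


def build_profile_url(user_id: str, language: str = "en") -> str:
    """Build a profile URL in the same shape as the example above."""
    return f"{SCHOLAR_PROFILE_BASE}?user={user_id}&hl={language}"


url = "https://scholar.google.com/citations?user=qc6CJjYAAAAJ&hl=en"
print(extract_user_id(url))  # qc6CJjYAAAAJ
print(build_profile_url("qc6CJjYAAAAJ"))
```

A search-result URL has no `user=` parameter, so `extract_user_id` raises `ValueError` for it instead of returning a wrong ID.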

Maximum publications per profile

Use maxPublications to control how many visible publication rows are included for each profile.

Set it to 0 if you only need profile-level metrics.

Include coauthors

Use includeCoauthors to include or skip visible coauthor cards.

Example input

{
  "profileUrls": [
    { "url": "https://scholar.google.com/citations?user=qc6CJjYAAAAJ&hl=en" }
  ],
  "userIds": [],
  "maxPublications": 10,
  "includeCoauthors": true,
  "proxyConfiguration": {
    "useApifyProxy": true,
    "apifyProxyGroups": ["RESIDENTIAL"]
  }
}
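
When the input is generated by a script, it can help to assemble it in one place. The sketch below builds an object with the field names shown above; the helper name and its validation rules (at least one profile, non-negative `maxPublications`) are assumptions for illustration, not documented behavior of the actor.

```python
import json


def build_run_input(profile_urls=None, user_ids=None,
                    max_publications=10, include_coauthors=True,
                    use_residential_proxy=False):
    """Assemble an input object with the field names shown above."""
    profile_urls = list(profile_urls or [])
    user_ids = list(user_ids or [])
    if not profile_urls and not user_ids:
        raise ValueError("Provide at least one profile URL or user ID")
    if max_publications < 0:
        raise ValueError("max_publications must be 0 or greater")

    run_input = {
        "profileUrls": [{"url": u} for u in profile_urls],
        "userIds": user_ids,
        "maxPublications": max_publications,
        "includeCoauthors": include_coauthors,
    }
    # Only attach proxy settings when explicitly requested.
    if use_residential_proxy:
        run_input["proxyConfiguration"] = {
            "useApifyProxy": True,
            "apifyProxyGroups": ["RESIDENTIAL"],
        }
    return run_input


# Profile-level metrics only: no publication rows.
print(json.dumps(build_run_input(user_ids=["qc6CJjYAAAAJ"], max_publications=0), indent=2))
```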

Example output

{
  "profileUrl": "https://scholar.google.com/citations?user=qc6CJjYAAAAJ&hl=en",
  "userId": "qc6CJjYAAAAJ",
  "name": "Albert Einstein",
  "affiliation": "Professor of Physics",
  "verifiedEmailDomain": "example.edu",
  "interests": ["Physics", "Relativity"],
  "citations": 123456,
  "citationsSince2019": 12345,
  "hIndex": 99,
  "hIndexSince2019": 40,
  "i10Index": 200,
  "i10IndexSince2019": 80,
  "publicationCount": 10,
  "publications": [
    {
      "title": "Example publication title",
      "authors": "A Einstein",
      "venue": "Journal name",
      "year": 1915,
      "citations": 1000,
      "url": "https://scholar.google.com/citations?..."
    }
  ],
  "coauthors": [],
  "scrapedAt": "2026-06-24T00:00:00.000Z"
}
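
Because `publications` is a nested list, spreadsheet-style analysis is often easier with one row per publication. The following sketch flattens a profile item shaped like the example above into CSV; the helper names are hypothetical and the sample values are the placeholder values from the example, not real data.

```python
import csv
import io


def publication_rows(profile: dict) -> list[dict]:
    """Return one flat row per publication, tagged with the author's ID and name."""
    rows = []
    for pub in profile.get("publications") or []:
        rows.append({
            "userId": profile.get("userId"),
            "name": profile.get("name"),
            "title": pub.get("title"),
            "authors": pub.get("authors"),
            "venue": pub.get("venue"),
            "year": pub.get("year"),
            "citations": pub.get("citations"),
            "url": pub.get("url"),
        })
    return rows


def to_csv(rows: list[dict]) -> str:
    """Serialize flat rows to CSV text; returns an empty string for no rows."""
    if not rows:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


sample = {
    "userId": "qc6CJjYAAAAJ",
    "name": "Albert Einstein",
    "publications": [
        {"title": "Example publication title", "authors": "A Einstein",
         "venue": "Journal name", "year": 1915, "citations": 1000,
         "url": "https://scholar.google.com/citations?..."},
    ],
}
print(to_csv(publication_rows(sample)))
```

Fields that Google Scholar does not show for a row come through as empty cells, since `dict.get` returns `None` for missing keys.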

How to scrape Google Scholar author profiles

  1. Open the actor input form.
  2. Paste one or more public Google Scholar profile URLs.
  3. Or paste one or more Google Scholar user IDs.
  4. Choose the maximum number of publication rows per profile.
  5. Decide whether to include coauthors.
  6. Run the actor.
  7. Download the dataset as CSV, JSON, Excel, XML, or HTML.

Tips for better results

  • ✅ Use public profile URLs, not search-result URLs.
  • ✅ Keep the first run small to confirm field coverage.
  • ✅ Use user IDs when you have already normalized profiles in your own database.
  • ✅ Lower maxPublications if you only need citation metrics.
  • ✅ Enable proxy settings only if your run is rate-limited.

Working with rate limits

Google Scholar can limit automated traffic.

If a run reports a challenge or unusual-traffic page, retry later with a smaller batch.

For larger batches, use Apify residential proxy and keep runs conservative.

The actor is designed to stop clearly when challenged instead of saving empty rows.
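
One way to keep runs conservative is to split a long ID list into small batches and wait longer after each challenged attempt. The sketch below is an assumption-heavy illustration: `ChallengeError` is a hypothetical exception your own wrapper would raise when a run reports a challenge, `run_batch` is any callable you supply to start a run, and the delay values are arbitrary.

```python
import time
from typing import Callable, Iterable, Iterator


class ChallengeError(Exception):
    """Hypothetical signal that a run stopped on a challenge page."""


def batched(items: Iterable[str], size: int) -> Iterator[list[str]]:
    """Yield consecutive batches of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    batch: list[str] = []
    for item in items:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def run_with_backoff(run_batch: Callable[[list[str]], object],
                     batch: list[str],
                     max_attempts: int = 3,
                     base_delay_s: float = 60.0,
                     sleep: Callable[[float], None] = time.sleep):
    """Call run_batch, doubling the wait after each challenged attempt."""
    for attempt in range(1, max_attempts + 1):
        try:
            return run_batch(batch)
        except ChallengeError:
            if attempt == max_attempts:
                raise
            sleep(base_delay_s * 2 ** (attempt - 1))


# Placeholder IDs split into batches of two.
for group in batched(["id1", "id2", "id3", "id4", "id5"], 2):
    print(group)
```

Passing `sleep` as a parameter keeps the retry logic testable without real waiting.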

Integrations

Use the actor with Apify integrations to automate research workflows.

  • 📄 Export to Google Sheets for manual review.
  • 🧱 Send JSON to a data warehouse.
  • 🔔 Trigger a Make scenario when a dataset is ready.
  • 🧩 Enrich CRM records with public citation metrics.
  • 📊 Feed dashboards with profile-level metrics.

API usage

Node.js

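import { ApifyClient } from 'apify-client';

// Reads the API token from the APIFY_TOKEN environment variable.
const client = new ApifyClient({ token: process.env.APIFY_TOKEN });

// Start the actor and wait for the run to finish.
const run = await client.actor('fetch_cat/google-scholar-profiles-scraper').call({
    profileUrls: [{ url: 'https://scholar.google.com/citations?user=qc6CJjYAAAAJ&hl=en' }],
    maxPublications: 10,
    includeCoauthors: true,
});

// The scraped profiles are stored in the run's default dataset.
console.log(run.defaultDatasetId);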

Python

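import os

from apify_client import ApifyClient

# Reads the API token from the APIFY_TOKEN environment variable.
client = ApifyClient(os.environ['APIFY_TOKEN'])

# Start the actor and wait for the run to finish.
run = client.actor('fetch_cat/google-scholar-profiles-scraper').call(run_input={
    'userIds': ['qc6CJjYAAAAJ'],
    'maxPublications': 10,
    'includeCoauthors': True,
})

# The scraped profiles are stored in the run's default dataset.
print(run['defaultDatasetId'])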

cURL

curl -X POST "https://api.apify.com/v2/acts/fetch_cat~google-scholar-profiles-scraper/runs?token=$APIFY_TOKEN" \
-H 'Content-Type: application/json' \
-d '{"userIds":["qc6CJjYAAAAJ"],"maxPublications":10,"includeCoauthors":true,"proxyConfiguration":{"useApifyProxy":true,"apifyProxyGroups":["RESIDENTIAL"]}}'

MCP usage

You can run this actor from AI tools through the Apify MCP server.

Use this MCP URL pattern:

https://mcp.apify.com/?tools=fetch_cat/google-scholar-profiles-scraper

Claude Code setup example:

claude mcp add --transport http apify-google-scholar-profiles "https://mcp.apify.com/?tools=fetch_cat/google-scholar-profiles-scraper"

Claude Desktop JSON example:

{
  "mcpServers": {
    "apify-google-scholar-profiles": {
      "url": "https://mcp.apify.com/?tools=fetch_cat/google-scholar-profiles-scraper"
    }
  }
}

Example prompts:

  • "Run the Google Scholar Profiles Scraper for these three profile URLs and summarize the h-index values."
  • "Extract publication rows for this Scholar user ID and export the dataset link."
  • "Compare citation metrics for these public author profiles."

Data quality notes

The actor returns fields visible on the public profile page at run time.

Some profiles may hide verified email information.

Some profiles may have fewer visible publication rows than your requested maximum.

Citation metrics can change over time as Google Scholar updates its index.

Limitations

This actor does not log into Google accounts.

It does not scrape private, hidden, or account-only data.

It does not guarantee that every Google Scholar profile is reachable at all times.

Large batches may require slower runs or proxy settings.

Legality

This actor extracts publicly visible profile information.

You are responsible for using the data lawfully and respecting applicable privacy, copyright, database, and platform rules.

Avoid collecting or processing personal data unless you have a valid legal basis.

FAQ and troubleshooting

Why did my run stop with a challenge message?

Google Scholar may have returned a rate-limit or unusual-traffic page.

Try a smaller batch, wait before retrying, or enable proxy settings.

Why are some publication fields empty?

Google Scholar pages do not always show every field for every publication row.

The actor saves the fields that are visible on the profile page.

Why did a profile produce no row?

The profile may be deleted, private, malformed, or temporarily unavailable.

Check that the URL contains a valid user= parameter.

Can I scrape publications for each Google Scholar profile?

Yes. Set maxPublications above 0 to include visible publication rows with titles, authors, venues, years, citations, and URLs where Google Scholar exposes them.

Can I monitor citation metrics over time?

Yes. Run the actor on a schedule and compare citations, citationsSince2019, hIndex, hIndexSince2019, i10Index, and i10IndexSince2019 across datasets.
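
A comparison between two scheduled runs can be as simple as subtracting the metric fields of an earlier item from a later one. This is a minimal sketch with a hypothetical helper name; the two snapshots are made-up values used only to show the shape of the result.

```python
METRIC_FIELDS = (
    "citations", "citationsSince2019",
    "hIndex", "hIndexSince2019",
    "i10Index", "i10IndexSince2019",
)


def metric_changes(previous: dict, current: dict) -> dict:
    """Return {field: current - previous} for metrics that changed.

    Fields missing or non-integer in either snapshot are skipped.
    """
    changes = {}
    for field in METRIC_FIELDS:
        old, new = previous.get(field), current.get(field)
        if isinstance(old, int) and isinstance(new, int) and new != old:
            changes[field] = new - old
    return changes


earlier = {"userId": "qc6CJjYAAAAJ", "citations": 123456, "hIndex": 99}
later = {"userId": "qc6CJjYAAAAJ", "citations": 123500, "hIndex": 99}
print(metric_changes(earlier, later))  # {'citations': 44}
```

Match items across datasets by `userId` before comparing, since the order of rows is not guaranteed to be the same between runs.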

Does this access private or logged-in Google data?

No. The actor is limited to public Google Scholar profile pages. It does not log in, bypass Google restrictions, or access private account data.

Support

If you need a field that is visible on public Google Scholar pages but missing from the dataset, open an issue with an example profile URL.

Include your run ID and a short description of the expected field.

Changelog

0.1

Initial build with public profile metadata, citation metrics, interests, publication rows, coauthors, and Apify dataset output.