ORCID Researcher Profile Scraper

🔎 Extract public ORCID researcher profiles, affiliations, funding, works, identifiers, keywords, and contact links from the official API.

Developer: Stas Persiianenko (maintained by Community)

Search ORCID public profiles and extract structured researcher identity, affiliation, funding, publication, keyword, and public contact data from the official ORCID public API.

Use it to enrich academic CRMs, map institutional researchers, monitor public profile changes, and build research-intelligence datasets without browser automation or login flows.

What does ORCID Researcher Profile Scraper do?

ORCID Researcher Profile Scraper turns ORCID public API search results and ORCID iDs into clean Apify dataset rows.

It can search by free-text terms, names, affiliations, keywords, or any ORCID-supported Lucene query.

It can also fetch a list of exact ORCID iDs or ORCID profile URLs.

Each output row represents one researcher profile.

The actor normalizes sparse public ORCID records into predictable fields.

Who is it for?

🎓 University research offices can enrich faculty and researcher databases.

📚 Scholarly publishers can validate author identities and public research links.

💼 Academic recruiters can discover researchers by affiliation or topic.

🧪 Grant intelligence teams can map public funding and work summaries.

🧩 CRM and data vendors can append ORCID identifiers to existing profiles.

Why use this ORCID scraper?

It uses the official ORCID public API rather than scraping rendered web pages.

That keeps runs fast, inexpensive, and reliable.

It does not require an ORCID account, OAuth token, captcha solving, or a browser.

It emits one stable dataset schema for easy exports to CSV, JSON, Excel, BigQuery, or your own API pipeline.

What data can you extract?

The exact fields depend on what each researcher has made public in ORCID.

| Group | Example fields |
| --- | --- |
| Identity | ORCID iD, ORCID URL, given names, family name, credit name, display name |
| Profile | biography, keywords, websites, researcher URLs, public emails, countries |
| Identifiers | external identifiers with type, value, and URL |
| Affiliations | employments, educations, memberships, services |
| Research activity | funding summaries, work/publication summaries, counts |
| Metadata | last modified date, source, search query, fetch timestamp |

How much does it cost to extract ORCID researcher profiles?

This actor uses pay-per-event pricing.

There is a small start event per run and a per-profile event for each saved researcher profile.

Current pricing is $0.005 per run start plus a tiered per-profile price.

The BRONZE per-profile price is $0.00004907, with lower prices on higher Apify plans.

For a small test, keep maxItems at 10.

For enrichment jobs, raise maxItems after confirming your query returns the right population.
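As a rough budgeting aid, the two prices quoted above can be combined into a per-run estimate. This is a minimal sketch at the BRONZE tier only; the function name `estimate_run_cost` is illustrative, and the prices are copied from this README and may change.

```python
from decimal import Decimal

# Prices quoted in this README (BRONZE tier); higher plans pay less per profile.
START_EVENT_USD = Decimal("0.005")
BRONZE_PER_PROFILE_USD = Decimal("0.00004907")


def estimate_run_cost(profiles_saved: int) -> Decimal:
    """Estimate the cost in USD of one run that saves `profiles_saved` profiles."""
    if profiles_saved < 0:
        raise ValueError("profiles_saved must be non-negative")
    # One start event per run, plus one per-profile event per saved row.
    return START_EVENT_USD + BRONZE_PER_PROFILE_USD * profiles_saved


if __name__ == "__main__":
    for count in (10, 100, 1000):
        print(count, estimate_run_cost(count))
```

At these prices the start event dominates small test runs, while a 1,000-profile run comes to roughly $0.054.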

Input options

The actor accepts four main inputs.

searchQuery is an ORCID public API Lucene query.

orcidIds is an optional list of ORCID iDs or ORCID profile URLs.

maxItems caps the number of researcher profiles saved.

detailDepth controls how much nested public profile detail is normalized.
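The four inputs can be assembled and checked on the client side before a run starts. The sketch below is illustrative only: `build_input` is a hypothetical helper, the default values are arbitrary, the three `detailDepth` values are taken from the examples in this README, the 1000 cap comes from the Limits section, and the rule that at least one of `searchQuery` or `orcidIds` must be present is an assumption rather than documented behavior.

```python
from typing import Any, Optional, Sequence

# Values taken from this README's examples and Limits section.
DETAIL_DEPTHS = ("profileOnly", "activities", "works")
MAX_ITEMS_CAP = 1000


def build_input(
    search_query: Optional[str] = None,
    orcid_ids: Optional[Sequence[str]] = None,
    max_items: int = 10,
    detail_depth: str = "profileOnly",
) -> dict[str, Any]:
    """Build an actor input dict, rejecting obviously invalid combinations."""
    # Assumption: a run needs a query, explicit iDs, or both.
    if not search_query and not orcid_ids:
        raise ValueError("provide searchQuery, orcidIds, or both")
    if detail_depth not in DETAIL_DEPTHS:
        raise ValueError(f"detailDepth must be one of {DETAIL_DEPTHS}")
    if not 1 <= max_items <= MAX_ITEMS_CAP:
        raise ValueError(f"maxItems must be between 1 and {MAX_ITEMS_CAP}")

    run_input: dict[str, Any] = {"maxItems": max_items, "detailDepth": detail_depth}
    if search_query:
        run_input["searchQuery"] = search_query
    if orcid_ids:
        run_input["orcidIds"] = list(orcid_ids)
    return run_input
```

The returned dict can be passed as the run input in any of the API usage examples in this README.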

ORCID search query examples

Search by institution:

{
    "searchQuery": "affiliation-org-name:\"Stanford University\"",
    "maxItems": 25,
    "detailDepth": "activities"
}

Search by name:

{
    "searchQuery": "family-name:Smith AND given-names:Jane",
    "maxItems": 10,
    "detailDepth": "profileOnly"
}

Search by topic:

{
    "searchQuery": "machine learning",
    "maxItems": 100,
    "detailDepth": "works"
}
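When queries are generated from data (for example, a list of institution names), quoting mistakes are easy to make. The following sketch builds fielded queries like the ones above; `lucene_value` and `and_query` are hypothetical helper names, and the escaping covers only quotes and backslashes inside a quoted phrase, not every corner of Lucene syntax.

```python
import re


def lucene_value(value: str) -> str:
    """Return a value safe to place after `field:` in a Lucene query."""
    value = value.strip()
    if not value:
        raise ValueError("empty query value")
    # Plain alphanumeric tokens can be used bare.
    if re.fullmatch(r"[A-Za-z0-9]+", value):
        return value
    # Anything else becomes a quoted phrase; escape backslashes, then quotes.
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def and_query(fields: dict[str, str]) -> str:
    """Join `field:value` clauses with AND, preserving the dict's order."""
    if not fields:
        raise ValueError("at least one field is required")
    return " AND ".join(f"{name}:{lucene_value(value)}" for name, value in fields.items())


if __name__ == "__main__":
    print(and_query({"affiliation-org-name": "Stanford University"}))
    print(and_query({"family-name": "Smith", "given-names": "Jane"}))
```

A dict is used instead of keyword arguments because the field names shown in this README contain hyphens.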

Fetch exact ORCID iDs

Use orcidIds when you already have profile identifiers.

{
    "orcidIds": [
        "0000-0002-1825-0097",
        "https://orcid.org/0000-0002-9510-6777"
    ],
    "maxItems": 2,
    "detailDepth": "activities"
}

The actor deduplicates ORCID iDs across search results and explicit IDs.
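If you want to clean an ID list before submitting it, the same normalization can be done locally. The sketch below accepts bare iDs or profile URLs, verifies the ISO 7064 MOD 11-2 check character that ORCID iDs carry, and removes duplicates while keeping order. It is a client-side pre-check under those assumptions, not a description of how the actor itself validates or deduplicates; the function names are illustrative.

```python
import re

# Matches a bare iD, or an iD at the end of a profile URL.
_ORCID_RE = re.compile(r"(?:^|/)(\d{4}-\d{4}-\d{4}-\d{3}[\dX])$")


def checksum_char(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def normalize_orcid(value: str) -> str:
    """Return the bare 0000-0000-0000-0000 form, or raise ValueError."""
    match = _ORCID_RE.search(value.strip().rstrip("/").upper())
    if not match:
        raise ValueError(f"not an ORCID iD or profile URL: {value!r}")
    orcid = match.group(1)
    digits = orcid.replace("-", "")
    if checksum_char(digits[:15]) != digits[15]:
        raise ValueError(f"checksum mismatch: {orcid}")
    return orcid


def dedupe_orcids(values: list[str]) -> list[str]:
    """Normalize every entry and drop duplicates, keeping first-seen order."""
    return list(dict.fromkeys(normalize_orcid(v) for v in values))
```

Rejecting checksum failures early avoids spending a run on identifiers that were mistyped upstream.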

Detail depth

profileOnly extracts public identity, biography, keywords, URLs, emails, countries, and summary counts.

activities adds affiliations and funding summaries.

works adds publication/work summaries.

Choose works for research intelligence exports.

Choose profileOnly for fast identity enrichment.

Output example

{
    "orcidId": "0000-0002-1825-0097",
    "orcidUri": "https://orcid.org/0000-0002-1825-0097",
    "displayName": "Example Researcher",
    "keywords": ["machine learning"],
    "employmentsCount": 2,
    "worksCount": 18,
    "source": "orcid-public-api",
    "detailDepth": "works",
    "fetchedAt": "2026-06-29T00:00:00.000Z"
}

Nested arrays contain affiliations, funding summaries, works, and external identifiers.
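Because researchers control what is public, downstream code should tolerate missing or empty fields. The sketch below flattens rows into a small CSV using only field names that appear in the output example above; `flatten_row` and `rows_to_csv` are hypothetical helpers, and treating a missing count as 0 is a simplifying assumption.

```python
import csv
import io
from typing import Any, Iterable

COLUMNS = ["orcidId", "displayName", "keywords", "employmentsCount", "worksCount"]


def flatten_row(row: dict[str, Any]) -> dict[str, Any]:
    """Reduce one dataset row to flat columns, with defaults for sparse fields."""
    return {
        "orcidId": row.get("orcidId", ""),
        "displayName": row.get("displayName") or "",
        # Keywords are a list in the output example; join them for CSV.
        "keywords": "; ".join(row.get("keywords") or []),
        "employmentsCount": row.get("employmentsCount") or 0,
        "worksCount": row.get("worksCount") or 0,
    }


def rows_to_csv(rows: Iterable[dict[str, Any]]) -> str:
    """Serialize rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    for row in rows:
        writer.writerow(flatten_row(row))
    return buffer.getvalue()
```

Nested arrays are deliberately left out here; they are usually better kept in the JSON export.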

Tips for better ORCID results

Use specific institution names for affiliation searches.

Quote multi-word organizations in the ORCID Lucene query.

Start with maxItems 10 to validate query quality.

Use exact ORCID iDs when you need deterministic enrichment.

Expect sparse fields because ORCID users control public visibility.

Integrations

Export results to Google Sheets for research-office review.

Send dataset rows to a CRM enrichment workflow.

Join ORCID iDs with Crossref, PubMed, OpenAlex, or internal publication records.

Schedule recurring Apify runs to monitor public profile changes.

Use webhooks to trigger downstream compliance or data-quality checks.
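For the monitoring use case, one simple approach is to compare the dataset from the latest scheduled run with the previous one. The sketch below is an assumption-heavy illustration: `diff_snapshots` is a hypothetical helper, and `lastModifiedDate` is a placeholder for whatever key the dataset uses for the last modified date, so check a real row before relying on it.

```python
from typing import Any


def diff_snapshots(
    previous: list[dict[str, Any]],
    current: list[dict[str, Any]],
    modified_key: str = "lastModifiedDate",  # placeholder field name
) -> dict[str, list[str]]:
    """Compare two runs' rows by ORCID iD and report added, removed, changed."""
    prev_by_id = {row["orcidId"]: row for row in previous}
    curr_by_id = {row["orcidId"]: row for row in current}

    added = sorted(curr_by_id.keys() - prev_by_id.keys())
    removed = sorted(prev_by_id.keys() - curr_by_id.keys())
    # A profile counts as changed when its last-modified value differs.
    changed = sorted(
        orcid
        for orcid in curr_by_id.keys() & prev_by_id.keys()
        if curr_by_id[orcid].get(modified_key) != prev_by_id[orcid].get(modified_key)
    )
    return {"added": added, "removed": removed, "changed": changed}
```

The resulting lists of iDs can then drive a CRM update or a webhook-triggered data-quality check.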

API usage with Node.js

import { ApifyClient } from 'apify-client';

// Reads the API token from the APIFY_TOKEN environment variable.
const client = new ApifyClient({ token: process.env.APIFY_TOKEN });

const run = await client.actor('automation-lab/orcid-researcher-profile-scraper').call({
    searchQuery: 'affiliation-org-name:"Stanford University"',
    maxItems: 25,
    detailDepth: 'activities',
});

// Results are stored in the run's default dataset.
console.log(run.defaultDatasetId);

API usage with Python

from apify_client import ApifyClient

# Replace the placeholder with your own Apify API token.
client = ApifyClient('MY-APIFY-TOKEN')

run = client.actor('automation-lab/orcid-researcher-profile-scraper').call(run_input={
    'searchQuery': 'machine learning',
    'maxItems': 50,
    'detailDepth': 'works',
})

# Results are stored in the run's default dataset.
print(run['defaultDatasetId'])

API usage with cURL

curl -X POST 'https://api.apify.com/v2/acts/automation-lab~orcid-researcher-profile-scraper/runs?token=MY-APIFY-TOKEN' \
    -H 'Content-Type: application/json' \
    -d '{"searchQuery":"machine learning","maxItems":25,"detailDepth":"works"}'

MCP usage

Use this actor from Claude Desktop, Claude Code, or another MCP-capable client through Apify MCP.

MCP URL:

https://mcp.apify.com/?tools=automation-lab/orcid-researcher-profile-scraper

Claude Code setup:

$ claude mcp add --transport http apify-orcid "https://mcp.apify.com/?tools=automation-lab/orcid-researcher-profile-scraper"

Claude Desktop, Cursor, and VS Code JSON setup:

Add this server entry to your MCP configuration file. For Claude Desktop, use the app's claude_desktop_config.json. For Cursor and VS Code, add the same mcpServers block to the editor MCP settings JSON.

{
    "mcpServers": {
        "apify-orcid": {
            "transport": "http",
            "url": "https://mcp.apify.com/?tools=automation-lab/orcid-researcher-profile-scraper"
        }
    }
}

Example prompts:

  • "Find public ORCID profiles for researchers affiliated with Stanford University."
  • "Fetch these ORCID iDs and summarize their public affiliations."
  • "Export ORCID profiles related to machine learning with publication summary counts."

Legality and ethical use

This actor reads public ORCID API data.

Only collect and process data for lawful purposes.

Respect ORCID public visibility settings and applicable privacy obligations.

Do not use public researcher data for spam, harassment, or discriminatory profiling.

FAQ

Why are some ORCID fields empty?

ORCID users control visibility. Empty fields usually mean the researcher did not make that profile section public.

Can this actor access private ORCID data?

No. It only uses public ORCID API responses and does not use OAuth or private credentials.

Which detail depth should I choose?

Use profileOnly for enrichment, activities for affiliation/funding mapping, and works for publication intelligence.

Troubleshooting

If a profile field is empty, the researcher may not have made that field public.

If a search returns few results, try a broader ORCID Lucene query.

If you receive no results, verify the query syntax in ORCID's public API documentation.

If a run is slow, choose a shallower detailDepth (for example profileOnly) or reduce maxItems.


Dataset columns

The main table view includes ORCID iD, display name, name fields, URL, keywords, countries, counts, last modified date, source, query, detail depth, and fetched timestamp.

Full JSON exports include all nested arrays.

Performance

The actor runs as a lightweight API actor with 256 MB memory.

It uses conservative sequential requests and backoff for HTTP 429 responses.

No proxy is expected for normal ORCID public API usage.
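The actor's exact retry policy is not documented here, but the general pattern of sequential requests with backoff on HTTP 429 can be sketched as follows. Everything in this block is illustrative: `RateLimited`, `call_with_backoff`, and the delay parameters are assumptions, and `fetch` stands for any function of yours that raises `RateLimited` when the server answers 429.

```python
import random
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class RateLimited(Exception):
    """Raised by the caller's fetch function on an HTTP 429 response."""

    def __init__(self, retry_after: Optional[float] = None) -> None:
        super().__init__("HTTP 429")
        self.retry_after = retry_after


def call_with_backoff(
    fetch: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `fetch`, retrying with exponential backoff while it is rate limited."""
    for attempt in range(max_attempts):
        try:
            return fetch()
        except RateLimited as exc:
            if attempt == max_attempts - 1:
                raise  # out of attempts, surface the error
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Honor a server-provided Retry-After value if it asks for longer.
            if exc.retry_after is not None:
                delay = max(delay, exc.retry_after)
            # Up to 10% jitter so parallel clients do not retry in lockstep.
            sleep(delay + random.uniform(0, delay * 0.1))
    raise ValueError("max_attempts must be at least 1")
```

The `sleep` parameter exists so the function can be exercised in tests without actually waiting.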

Limits

maxItems is capped at 1000 per run.

The actor saves only records that can be fetched from the public API.

Private ORCID fields are not available.
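For ID lists longer than the per-run cap, one option is to split the list into several run inputs. This is a minimal sketch; `batch_inputs` is a hypothetical helper, and it assumes the cap applies to the number of saved profiles per run, as stated above, rather than to the length of `orcidIds` itself.

```python
from typing import Any, Iterator, Sequence

MAX_ITEMS_CAP = 1000  # per-run cap stated in this section


def batch_inputs(
    orcid_ids: Sequence[str],
    detail_depth: str = "profileOnly",
    batch_size: int = MAX_ITEMS_CAP,
) -> Iterator[dict[str, Any]]:
    """Yield one actor input per chunk of at most `batch_size` ORCID iDs."""
    if not 1 <= batch_size <= MAX_ITEMS_CAP:
        raise ValueError(f"batch_size must be between 1 and {MAX_ITEMS_CAP}")
    for start in range(0, len(orcid_ids), batch_size):
        chunk = list(orcid_ids[start:start + batch_size])
        # maxItems matches the chunk so every listed iD can be saved.
        yield {"orcidIds": chunk, "maxItems": len(chunk), "detailDepth": detail_depth}
```

Each yielded dict can be submitted as a separate run, and the resulting datasets merged afterwards by ORCID iD.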

Changelog

Initial version supports public ORCID search, exact ORCID iD fetches, profile fields, affiliations, funding summaries, and work summaries.