OpenAlex Institutions Scraper
Pricing
Pay per usage
Gather structured records from OpenAlex Institutions with names, identifiers, dates, descriptions, status flags and source links. Built for research, intelligence and operational dashboards. Run on demand or on a recurring schedule and feed every row into your favourite analytics or workflow stack.
Developer
ParseForge
Maintained by Community

🎓 OpenAlex Institutions Scraper
🚀 Pull 124,000+ research institutions in seconds. Universities, hospitals, companies, government labs, archives and facilities from the OpenAlex scholarly graph, with citation counts, ROR identifiers, geo coordinates and homepage URLs.
🕒 Last updated: 2026-05-27 · 📊 22 fields per record · 124K+ institutions · Global coverage
OpenAlex is the open replacement for Microsoft Academic Graph, indexing research institutions worldwide with citation analytics and rich metadata. This scraper turns the institutions endpoint into a clean dataset you can pull into a spreadsheet, Power BI, Tableau or your own database - no API key, no rate-limit headaches, no scraping HTML.
Every record is normalised to a ROR identifier and tagged with country, type, works count, h-index, i10 index and full geo data. Filter by country code, institution type or any keyword search, and stream every matching record into your downstream pipeline.
| 🎯 Target Audience | 💡 Primary Use Cases |
|---|---|
| Academic researchers | Build a directory of universities and labs in a country |
| Research intelligence teams | Benchmark institutional output and citation impact |
| Grant funders | Identify and segment partner research bodies |
| EdTech and SaaS sales | Source TAM lists of higher-ed and healthcare institutions |
| Data scientists | Enrich author and paper data with institutional metadata |
📋 What the OpenAlex Institutions Scraper does
- Streams institution records from the official OpenAlex `/institutions` endpoint
- Supports full-text `search`, ISO country code filter and institution `type` filter
- Returns normalised metadata: works count, citations, h-index, i10 index, ROR, GRID, Wikipedia, Wikidata
- Adds geo data: city, region, country, latitude and longitude
- No login, no API key, no usage quotas
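For readers curious what a request to the `/institutions` endpoint looks like, here is a minimal sketch of how the actor-style inputs could translate into an OpenAlex query URL. The function name `build_institutions_url` is hypothetical, and the mapping of `country` and `type` onto OpenAlex's `country_code` and `type` filters is an assumption about how such a scraper might work, not the actor's confirmed implementation.

```python
from urllib.parse import urlencode

OPENALEX_INSTITUTIONS = "https://api.openalex.org/institutions"


def build_institutions_url(search=None, country=None, inst_type=None,
                           per_page=200, cursor="*"):
    """Build an OpenAlex /institutions URL (hypothetical helper, for illustration)."""
    # OpenAlex combines several filters with commas inside one `filter` parameter.
    filters = []
    if country:
        filters.append(f"country_code:{country.lower()}")
    if inst_type:
        filters.append(f"type:{inst_type}")

    params = {}
    if search:
        params["search"] = search
    if filters:
        params["filter"] = ",".join(filters)
    params["per-page"] = per_page
    params["cursor"] = cursor  # "*" asks for the first page of a cursor walk

    # Keep ":", "," and "*" unescaped so the URL stays readable.
    return f"{OPENALEX_INSTITUTIONS}?{urlencode(params, safe=':,*')}"


if __name__ == "__main__":
    print(build_institutions_url(country="de", inst_type="education"))
```

Keeping the URL construction in one small function makes it easy to log or inspect the exact query before any request is sent.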
💡 Why it matters: institutional metadata is the bedrock of every research analytics, TAM-building and academic partnership workflow. OpenAlex is the most complete open source, and this actor delivers it as structured records your stack already understands.
🎬 Full Demo (🚧 Coming soon)
⚙️ Input
| Field | Type | Description |
|---|---|---|
| search | string | Full-text search on institution name |
| maxItems | integer | How many records to return (free plan capped at 10) |
| country | string | ISO 2-letter country code, lowercase (us, gb, de, jp) |
| type | enum | education, healthcare, company, archive, nonprofit, government, facility, other |
{ "search": "stanford", "maxItems": 5 }
{ "country": "de", "type": "education", "maxItems": 50 }
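If you generate inputs programmatically, it can help to check them before starting a run. The sketch below validates an input object against the table above. The helper `normalise_input` is hypothetical, and the default of 10 for `maxItems` is an assumption borrowed from the free-plan cap, not a documented default.

```python
ALLOWED_TYPES = {
    "education", "healthcare", "company", "archive",
    "nonprofit", "government", "facility", "other",
}


def normalise_input(raw):
    """Return a cleaned copy of an input dict; raise ValueError on bad values."""
    cleaned = {}

    search = str(raw.get("search", "")).strip()
    if search:
        cleaned["search"] = search

    # The input table asks for a lowercase ISO 2-letter code.
    country = str(raw.get("country", "")).strip().lower()
    if country:
        if len(country) != 2 or not country.isalpha():
            raise ValueError(f"country must be a 2-letter ISO code, got {country!r}")
        cleaned["country"] = country

    inst_type = raw.get("type")
    if inst_type is not None:
        if inst_type not in ALLOWED_TYPES:
            raise ValueError(f"unknown institution type: {inst_type!r}")
        cleaned["type"] = inst_type

    # Assumed default; bool is rejected because it is a subclass of int.
    max_items = raw.get("maxItems", 10)
    if isinstance(max_items, bool) or not isinstance(max_items, int) or max_items < 1:
        raise ValueError("maxItems must be a positive integer")
    cleaned["maxItems"] = max_items

    return cleaned


if __name__ == "__main__":
    print(normalise_input({"country": "DE", "type": "education", "maxItems": 50}))
```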
⚠️ Good to Know: OpenAlex caps `per-page` at 200 records and uses cursor pagination. The actor handles cursors transparently - you only set `maxItems`.
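To make the cursor handling concrete, here is a rough sketch of a cursor walk that stops at `maxItems`. It assumes the OpenAlex list response shape (`results` plus `meta.next_cursor`). The names `iter_institutions` and `fetch_openalex_page` are hypothetical and not taken from the actor's source.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def fetch_openalex_page(cursor, per_page):
    """Fetch one page from OpenAlex (network call; shown for illustration)."""
    query = urlencode({"per-page": per_page, "cursor": cursor})
    with urlopen(f"https://api.openalex.org/institutions?{query}", timeout=30) as resp:
        return json.load(resp)


def iter_institutions(fetch_page, max_items, per_page=200):
    """Yield up to max_items records, following next_cursor between pages.

    fetch_page(cursor, per_page) must return a dict shaped like
    {"meta": {"next_cursor": ...}, "results": [...]}.
    """
    cursor = "*"  # OpenAlex uses "*" to start a cursor walk
    yielded = 0
    while cursor and yielded < max_items:
        page = fetch_page(cursor, min(per_page, 200))  # respect the 200 cap
        results = page.get("results", [])
        if not results:
            break
        for record in results:
            if yielded >= max_items:
                return
            yield record
            yielded += 1
        # A missing or null next_cursor means there are no more pages.
        cursor = page.get("meta", {}).get("next_cursor")
```

Passing the page fetcher in as an argument keeps the loop testable without network access: `iter_institutions(fetch_openalex_page, 500)` would walk the live API, while a stub function can stand in during tests.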
📊 Output
| Field | Description |
|---|---|
| 🖼 imageUrl | Logo thumbnail if available |
| 📛 displayName | Canonical institution name |
| 🏷 type | Institution type |
| 🌍 countryCode | ISO 2-letter country code |
| 📚 worksCount | Total publications |
| 📈 citedByCount | Total citations |
| 🏆 hIndex | Institutional h-index |
| 🔗 rorUrl | ROR identifier URL |
| 🌐 homepageUrl | Official homepage |
| 📍 city / region / country / latitude / longitude | Geo metadata |
| 🆔 acronyms / alternativeNames | Other known names |
| 📚 wikipediaUrl / wikidataUrl / gridId | Cross-references |
| 🔗 sourceUrl | OpenAlex canonical URL |
| 🕒 scrapedAt | ISO timestamp |
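As an illustration of where these fields could come from, the sketch below maps a raw OpenAlex institution object onto the output names in the table. The source keys (`display_name`, `summary_stats.h_index`, `geo`, `ids` and so on) follow the public OpenAlex institution schema. The mapping itself, including the choice of thumbnail for `imageUrl`, is an assumption rather than the actor's confirmed logic, and `to_output_record` is a hypothetical name.

```python
from datetime import datetime, timezone


def to_output_record(inst, scraped_at=None):
    """Map a raw OpenAlex institution dict to the output fields (illustrative)."""
    ids = inst.get("ids") or {}
    geo = inst.get("geo") or {}
    stats = inst.get("summary_stats") or {}
    return {
        # Assumed preference: thumbnail first, full image as fallback.
        "imageUrl": inst.get("image_thumbnail_url") or inst.get("image_url"),
        "displayName": inst.get("display_name"),
        "type": inst.get("type"),
        "countryCode": inst.get("country_code"),
        "worksCount": inst.get("works_count"),
        "citedByCount": inst.get("cited_by_count"),
        "hIndex": stats.get("h_index"),
        "rorUrl": inst.get("ror"),
        "homepageUrl": inst.get("homepage_url"),
        "city": geo.get("city"),
        "region": geo.get("region"),
        "country": geo.get("country"),
        "latitude": geo.get("latitude"),
        "longitude": geo.get("longitude"),
        "acronyms": inst.get("display_name_acronyms") or [],
        "alternativeNames": inst.get("display_name_alternatives") or [],
        "wikipediaUrl": ids.get("wikipedia"),
        "wikidataUrl": ids.get("wikidata"),
        "gridId": ids.get("grid"),
        "sourceUrl": inst.get("id"),
        "scrapedAt": scraped_at or datetime.now(timezone.utc).isoformat(),
    }
```

Missing values come back as `None` (or an empty list for the name fields), so downstream code can rely on every key being present.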
✨ Why choose this Actor
- 🆓 No API key, no auth, no rate-limit drama
- 📡 Direct hit on the official OpenAlex API
- 🧭 Cursor pagination handled automatically
- 🏷 ROR, GRID, Wikidata cross-references in every record
- 📦 Pull as structured records
📈 How it compares to alternatives
| Approach | Cost | Coverage | Setup time |
|---|---|---|---|
| Manual tabular downloads from ROR | Free | Names only | Hours |
| OpenAlex API directly | Free | Full | Code required |
| ParseForge OpenAlex Institutions Scraper | Pay-per-result | Full + structured | Minutes |
🚀 How to use
- Create a free Apify account (includes $5 credit).
- Open the OpenAlex Institutions Scraper.
- Set `search`, `country` or `type` filters.
- Click Start and retrieve the results as tabular, spreadsheet or structured records.
- Schedule daily, weekly or trigger from Make / Zapier.
💼 Business use cases
Academic CRM enrichment - match institution names against ROR to deduplicate and segment your contact database.
Research partnership scouting - rank every healthcare institution in Germany by citation impact.
EdTech go-to-market - build a target list of universities by country and discipline.
Grant funding analytics - benchmark institutional output across your portfolio.
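Several of these use cases come down to filtering and ranking the returned records. A minimal sketch, assuming records carry the `countryCode`, `type` and `citedByCount` fields from the output table (the helper `top_by_citations` is hypothetical):

```python
def top_by_citations(records, country_code=None, inst_type=None, limit=10):
    """Return the most-cited records, optionally filtered by country and type."""
    def matches(record):
        if country_code:
            # Compare case-insensitively, since codes may arrive in either case.
            if (record.get("countryCode") or "").upper() != country_code.upper():
                return False
        if inst_type and record.get("type") != inst_type:
            return False
        return True

    selected = [r for r in records if matches(r)]
    # Records without a citation count sort last.
    selected.sort(key=lambda r: r.get("citedByCount") or 0, reverse=True)
    return selected[:limit]
```

For the partnership-scouting case, `top_by_citations(records, country_code="de", inst_type="healthcare")` would return the ten most-cited German healthcare institutions in the dataset.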
🔌 Automating OpenAlex Institutions Scraper
Hook into Make, Zapier, n8n, Airbyte, Pipedream, Slack, Google Drive, GitHub Actions or any HTTP webhook to schedule recurring runs and pipe data straight to your warehouse.
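For plain HTTP automation, one option is Apify's synchronous run endpoint, which starts a run and returns the dataset items in the response. The sketch below only builds the request. The actor ID you pass in is a placeholder you must replace with the real one from the actor's page, and `build_run_request` is a hypothetical helper.

```python
import json
from urllib.request import Request

APIFY_API = "https://api.apify.com/v2"


def build_run_request(actor_id, token, run_input):
    """Build (but do not send) a POST request that runs an actor synchronously.

    actor_id uses Apify's "username~actor-name" form; the value is an
    assumption to be filled in from the actor's page.
    """
    url = f"{APIFY_API}/acts/{actor_id}/run-sync-get-dataset-items"
    return Request(
        url,
        data=json.dumps(run_input).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )
```

Sending it is one more step: `urllib.request.urlopen(build_run_request(...))` returns the items as JSON. The same request shape works from a cron job, a GitHub Actions step or any tool that can issue an HTTP POST.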
🌟 Beyond business use cases
- Research: track the geographic spread of every healthcare facility on the planet.
- Personal: explore your alma mater's citation network.
- Non-profit: map underrepresented research institutions in the Global South.
- Experimentation: combine with OpenAlex works data to compute institutional rankings.
🤖 Ask an AI assistant about this scraper
Ask ChatGPT, Claude, Perplexity or Copilot: "How do I pull every healthcare institution in France from OpenAlex using the ParseForge Apify actor?"
❓ Frequently Asked Questions
Do I need an OpenAlex API key? No. OpenAlex is fully open. The actor sends a polite User-Agent on your behalf.
How many institutions are in OpenAlex? 124,000+ as of 2026-05-26.
Can I filter by city? Filter by country code; city is returned as metadata for downstream filtering.
What's the difference between OpenAlex and ROR? ROR is an identifier registry. OpenAlex is the full research graph that uses ROR.
Are h-index values official? They are OpenAlex's computed institutional h-index, recomputed monthly.
Does the actor follow lineage? Lineage is returned as a list of parent OpenAlex IDs per institution.
Does the actor work for non-Latin scripts? Yes. OpenAlex stores Japanese, Chinese, Arabic and Cyrillic names as alternatives.
Can I send results to Google Sheets? Yes, via the Apify Google Sheets integration.
Is deduplication handled? OpenAlex already deduplicates against ROR. The actor returns one record per ROR.
Is the data refreshed? OpenAlex refreshes monthly. Re-run the actor to pick up new institutions.
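As background for the h-index question, the standard definitions are simple to state in code: the h-index is the largest h such that h works each have at least h citations, and the i10 index counts works with at least 10 citations. The sketch below applies those textbook definitions to a list of per-work citation counts. It is not OpenAlex's implementation, and the sample numbers are made up.

```python
def h_index(citations):
    """Largest h such that h works have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break  # counts only get smaller from here
    return h


def i10_index(citations):
    """Number of works with at least 10 citations."""
    return sum(1 for cites in citations if cites >= 10)


if __name__ == "__main__":
    sample = [25, 12, 10, 4, 1]  # made-up citation counts for five works
    print(h_index(sample), i10_index(sample))
```

With the sample list, four works have at least 4 citations but there are not five with at least 5, so the h-index is 4; three works reach 10 citations, so the i10 index is 3.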
🔌 Integrate with any app
Apify, Make, Zapier, n8n, Pipedream, Slack, Airbyte, GitHub, Google Drive, Power Automate, AWS Lambda, REST webhook.
🔗 Recommended Actors
| Actor | What it does |
|---|---|
| EU Clinical Trials Register Scraper | Pull every EU CTIS clinical trial |
| OurAirports Scraper | Global airport database |
| FINRA BrokerCheck Scraper | Broker disclosure records |
💡 Pro Tip: browse the complete ParseForge collection for more research and business data scrapers.
🆘 Need Help? Open our contact form
⚠️ Disclaimer: independent tool, not affiliated with OpenAlex. Only publicly available open data is collected.