OpenAlex Institutions Scraper

Pricing

Pay per usage

Gather structured records from OpenAlex Institutions with names, identifiers, dates, descriptions, status flags and source links. Built for research, intelligence and operational dashboards. Run on demand or on a recurring schedule and feed every row into your analytics or workflow stack.

Developer

ParseForge

Maintained by Community

🎓 OpenAlex Institutions Scraper

🚀 Pull 124,000+ research institutions in seconds. Universities, hospitals, companies, government labs, archives and facilities from the OpenAlex scholarly graph, with citation counts, ROR identifiers, geo coordinates and homepage URLs.

🕒 Last updated: 2026-05-27 · 📊 22 fields per record · 124K+ institutions · Global coverage

OpenAlex is the open replacement for Microsoft Academic Graph, indexing research institutions worldwide with citation analytics and rich metadata. This scraper turns the institutions endpoint into a clean dataset you can pull into a spreadsheet, Power BI, Tableau or your own database - no API key, no rate-limit headaches, no scraping HTML.

Every record is normalised to a ROR identifier and tagged with country, type, works count, h-index, i10 index and full geo data. Filter by country code, institution type or any keyword search, and stream as many as a million records into your downstream pipeline.

🎯 Target Audience | 💡 Primary Use Cases
Academic researchers | Build a directory of universities and labs in a country
Research intelligence teams | Benchmark institutional output and citation impact
Grant funders | Identify and segment partner research bodies
EdTech and SaaS sales | Source TAM lists of higher-ed and healthcare institutions
Data scientists | Enrich author and paper data with institutional metadata

📋 What the OpenAlex Institutions Scraper does

  • Streams institution records from the official OpenAlex /institutions endpoint
  • Supports full-text search, ISO country code filter and institution type filter
  • Returns normalised metadata: works count, citations, h-index, i10 index, ROR, GRID, Wikipedia, Wikidata
  • Adds geo data: city, region, country, latitude and longitude
  • No login, no API key, no usage quotas

💡 Why it matters: institutional metadata is the bedrock of every research analytics, TAM-building and academic partnership workflow. OpenAlex is one of the most complete open sources of this data, and this actor delivers it as structured records your stack can ingest directly.

🎬 Full Demo (🚧 Coming soon)

⚙️ Input

Field | Type | Description
search | string | Full-text search on institution name
maxItems | integer | How many records to return (free plan capped at 10)
country | string | ISO 2-letter country code, lowercase (us, gb, de, jp)
type | enum | education, healthcare, company, archive, nonprofit, government, facility, other

Example inputs:

{ "search": "stanford", "maxItems": 5 }
{ "country": "de", "type": "education", "maxItems": 50 }
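
As a rough illustration of how input like this could translate into an OpenAlex request, the sketch below builds a URL for the public /institutions endpoint using the API's search, filter, per-page and cursor query parameters. The function name build_institutions_url and the exact mapping from actor input to API parameters are assumptions made for illustration; the actor's real implementation is not shown in this document.

```python
from urllib.parse import urlencode

# Public OpenAlex institutions endpoint.
OPENALEX_INSTITUTIONS_URL = "https://api.openalex.org/institutions"


def build_institutions_url(actor_input: dict, cursor: str = "*", per_page: int = 200) -> str:
    """Translate actor-style input into an OpenAlex request URL (hypothetical helper)."""
    params = {"per-page": per_page, "cursor": cursor}

    # Full-text search maps onto the OpenAlex `search` parameter.
    if actor_input.get("search"):
        params["search"] = actor_input["search"]

    # Country and type become comma-separated `filter` entries (combined with AND).
    filters = []
    if actor_input.get("country"):
        filters.append(f"country_code:{actor_input['country'].lower()}")
    if actor_input.get("type"):
        filters.append(f"type:{actor_input['type']}")
    if filters:
        params["filter"] = ",".join(filters)

    # Keep ":", "," and "*" unescaped so the URL stays readable.
    return f"{OPENALEX_INSTITUTIONS_URL}?{urlencode(params, safe=':,*')}"


print(build_institutions_url({"country": "de", "type": "education", "maxItems": 50}))
```

Note that maxItems is deliberately not sent to the API in this sketch: it would limit how many records the caller consumes, not what OpenAlex returns per page.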

⚠️ Good to Know: OpenAlex caps per-page at 200 records and uses cursor pagination. The actor handles cursors transparently - you only set maxItems.
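
For readers curious what cursor pagination looks like in practice, here is a minimal sketch, not the actor's actual code. It relies on two documented OpenAlex behaviours: the first page is requested with cursor=*, and each response carries the next cursor in meta.next_cursor (null once the result set is exhausted). The names iter_institutions and fetch_page_http are hypothetical.

```python
import json
from typing import Callable, Iterator, Optional
from urllib.parse import urlencode
from urllib.request import urlopen


def fetch_page_http(cursor: str) -> dict:
    """Fetch one page of up to 200 institutions from OpenAlex (needs network access)."""
    query = urlencode({"per-page": 200, "cursor": cursor}, safe="*")
    with urlopen(f"https://api.openalex.org/institutions?{query}", timeout=30) as resp:
        return json.load(resp)


def iter_institutions(fetch_page: Callable[[str], dict], max_items: int) -> Iterator[dict]:
    """Yield up to max_items records, following cursors from page to page."""
    cursor: Optional[str] = "*"  # "*" requests the first page
    yielded = 0
    while cursor and yielded < max_items:
        page = fetch_page(cursor)
        results = page.get("results", [])
        if not results:
            break
        for record in results:
            if yielded >= max_items:
                return
            yield record
            yielded += 1
        # The next cursor is None once the last page has been served.
        cursor = page.get("meta", {}).get("next_cursor")
```

Passing the fetcher in as a callable keeps the loop testable offline; with network access, list(iter_institutions(fetch_page_http, 5)) would request a single page and stop after five records.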

📊 Output

Field | Description
🖼 imageUrl | Logo thumbnail if available
📛 displayName | Canonical institution name
🏷 type | Institution type
🌍 countryCode | ISO 2-letter country code
📚 worksCount | Total publications
📈 citedByCount | Total citations
🏆 hIndex | Institutional h-index
🔗 rorUrl | ROR identifier URL
🌐 homepageUrl | Official homepage
📍 city / region / country / latitude / longitude | Geo metadata
🆔 acronyms / alternativeNames | Other known names
📚 wikipediaUrl / wikidataUrl / gridId | Cross-references
🔗 sourceUrl | OpenAlex canonical URL
🕒 scrapedAt | ISO timestamp
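
To make the output fields concrete, the sketch below shows one plausible way a raw OpenAlex institution object could be flattened into the record shape listed above. The raw field names (display_name, summary_stats.h_index, geo.city, ids.grid and so on) come from the OpenAlex institution schema; which raw field feeds which output field is an assumption, and normalise_institution is a hypothetical name rather than the actor's own function.

```python
from datetime import datetime, timezone


def normalise_institution(raw: dict) -> dict:
    """Flatten a raw OpenAlex institution into the output record shape (assumed mapping)."""
    ids = raw.get("ids") or {}
    geo = raw.get("geo") or {}
    stats = raw.get("summary_stats") or {}
    return {
        "imageUrl": raw.get("image_thumbnail_url") or raw.get("image_url"),
        "displayName": raw.get("display_name"),
        "type": raw.get("type"),
        "countryCode": raw.get("country_code"),
        "worksCount": raw.get("works_count"),
        "citedByCount": raw.get("cited_by_count"),
        "hIndex": stats.get("h_index"),
        "rorUrl": raw.get("ror"),
        "homepageUrl": raw.get("homepage_url"),
        "city": geo.get("city"),
        "region": geo.get("region"),
        "country": geo.get("country"),
        "latitude": geo.get("latitude"),
        "longitude": geo.get("longitude"),
        "acronyms": raw.get("display_name_acronyms") or [],
        "alternativeNames": raw.get("display_name_alternatives") or [],
        "wikipediaUrl": ids.get("wikipedia"),
        "wikidataUrl": ids.get("wikidata"),
        "gridId": ids.get("grid"),
        "sourceUrl": raw.get("id"),  # OpenAlex canonical URL of the institution
        "scrapedAt": datetime.now(timezone.utc).isoformat(),
    }
```

Missing nested objects fall back to empty dictionaries, so a sparse record yields None values instead of raising.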

✨ Why choose this Actor

  • 🆓 No API key, no auth, no rate-limit drama
  • 📡 Direct hit on the official OpenAlex API
  • 🧭 Cursor pagination handled automatically
  • 🏷 ROR, GRID, Wikidata cross-references in every record
  • 📦 Results delivered as structured records

📈 How it compares to alternatives

Approach | Cost | Coverage | Setup time
Manual data downloads from ROR | Free | Names only | Hours
OpenAlex API directly | Free | Full | Code required
ParseForge OpenAlex Institutions Scraper | Pay-per-result | Full + structured | Minutes

🚀 How to use

  1. Create a free Apify account (includes $5 credit).
  2. Open the OpenAlex Institutions Scraper.
  3. Set search, country or type filters.
  4. Click Start, then retrieve the results in tabular, spreadsheet or structured formats.
  5. Schedule daily, weekly or trigger from Make / Zapier.
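
The same steps can also be driven from code. The sketch below assumes the official apify-client Python package, whose ApifyClient exposes actor(...).call(run_input=...) and dataset(...).iterate_items(). The helper name run_scraper is hypothetical, and the actor ID is left as a placeholder because this page does not state it.

```python
def run_scraper(client, actor_id: str, run_input: dict) -> list:
    """Start an actor run, wait for it to finish and return its dataset items."""
    run = client.actor(actor_id).call(run_input=run_input)
    if run is None:
        raise RuntimeError("Actor run did not return run details")
    # Each run stores its results in a default dataset.
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


# Example usage (requires `pip install apify-client` and an Apify API token):
#
#   from apify_client import ApifyClient
#
#   client = ApifyClient("<YOUR_APIFY_TOKEN>")
#   items = run_scraper(
#       client,
#       "<ACTOR_ID_FROM_THE_STORE_PAGE>",  # placeholder, not a real ID
#       {"country": "de", "type": "education", "maxItems": 50},
#   )
#   print(len(items))
```

Taking the client as an argument keeps the helper independent of how the token is stored.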

💼 Business use cases

Academic CRM enrichment - match institution names against ROR to deduplicate and segment your contact database.

Research partnership scouting - rank healthcare institutions in Germany by citation impact.

EdTech go-to-market - build a target list of universities by country and discipline.

Grant funding analytics - benchmark institutional output across your portfolio.
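
As one hedged example of the benchmarking use case, the sketch below ranks scraped records by citedByCount and adds a simple citations-per-work ratio. It assumes records shaped like the Output table; the helper name rank_by_citations and the citationsPerWork metric are illustrative choices, not something the actor provides.

```python
def rank_by_citations(records: list, top_n: int = 10) -> list:
    """Return the top_n institutions by total citations, with citations per work."""
    ranked = sorted(records, key=lambda r: r.get("citedByCount") or 0, reverse=True)
    table = []
    for record in ranked[:top_n]:
        cited = record.get("citedByCount") or 0
        works = record.get("worksCount") or 0
        table.append({
            "displayName": record.get("displayName"),
            "citedByCount": cited,
            # Avoid dividing by zero for institutions without indexed works.
            "citationsPerWork": round(cited / works, 2) if works else None,
        })
    return table
```

Total citations favour large institutions, so the per-work ratio is often the more useful column when comparing bodies of very different size.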

🔌 Automating OpenAlex Institutions Scraper

Hook into Make, Zapier, n8n, Airbyte, Pipedream, Slack, Google Drive, GitHub Actions or any HTTP webhook to schedule recurring runs and pipe data straight to your warehouse.

🌟 Beyond business use cases

  • Research: track the geographic spread of every healthcare facility on the planet.
  • Personal: explore your alma mater's citation network.
  • Non-profit: map underrepresented research institutions in the Global South.
  • Experimentation: combine with OpenAlex works data to compute institutional rankings.

🤖 Ask an AI assistant about this scraper

Ask ChatGPT, Claude, Perplexity or Copilot: "How do I pull every healthcare institution in France from OpenAlex using the ParseForge Apify actor?"

❓ Frequently Asked Questions

Do I need an OpenAlex API key? No. OpenAlex is fully open. The actor sends a polite User-Agent on your behalf.

How many institutions are in OpenAlex? 124,000+ as of 2026-05-26.

Can I filter by city? Filter by country code; city is returned as metadata for downstream filtering.
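
A minimal sketch of that downstream city filter, assuming records shaped like the Output table with a city field (filter_by_city is a hypothetical helper):

```python
def filter_by_city(records: list, city: str) -> list:
    """Keep records whose city matches, ignoring case and surrounding whitespace."""
    wanted = city.casefold().strip()
    return [
        record for record in records
        if (record.get("city") or "").casefold().strip() == wanted
    ]
```

Records with a missing city are skipped rather than raising an error. City spellings can vary between records, so an exact match may miss some institutions.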

What's the difference between OpenAlex and ROR? ROR is an identifier registry. OpenAlex is the full research graph that uses ROR.

Are h-index values official? They are OpenAlex's computed institutional h-index, recomputed monthly.

Does the actor follow lineage? Lineage is returned as a list of parent OpenAlex IDs per institution.

Does the actor work for non-Latin scripts? Yes. OpenAlex stores Japanese, Chinese, Arabic and Cyrillic names as alternatives.

Can I send results to Google Sheets? Yes, via the Apify Google Sheets integration.

Are duplicates handled? OpenAlex already deduplicates against ROR. The actor returns one record per ROR identifier.

Is the data refreshed? OpenAlex refreshes monthly. Re-run the actor to pick up new institutions.

🔌 Integrate with any app

Apify, Make, Zapier, n8n, Pipedream, Slack, Airbyte, GitHub, Google Drive, Power Automate, AWS Lambda, REST webhook.

Actor | What it does
EU Clinical Trials Register Scraper | Pull every EU CTIS clinical trial
OurAirports Scraper | Global airport database
FINRA BrokerCheck Scraper | Broker disclosure records

💡 Pro Tip: browse the complete ParseForge collection for more research and business data scrapers.

🆘 Need Help? Open our contact form

⚠️ Disclaimer: independent tool, not affiliated with OpenAlex. Only publicly available open data is collected.