OpenAlex Scholarly Works Scraper
Pricing
from $2.00 / 1,000 works returned
Searches OpenAlex (250M+ scholarly works) by keyword and returns structured records: title, authors, institutions, venue, year, citation count, concepts, open-access link, and the full reconstructed abstract for literature reviews.
Developer
Dami's Studio
Search the OpenAlex catalog of 250M+ scholarly works and get clean, structured records — no API key, no login. OpenAlex is a free, open index of scholarship (an open replacement for Microsoft Academic Graph / Scopus).
This actor calls the public OpenAlex works endpoint, walks results with cursor pagination (the reliable way to page deep into a result set), reconstructs each abstract from its inverted index into readable text, and returns one flat row per work.
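The cursor walk can be sketched as follows. This is a minimal illustration, not the actor's actual code: `fetch_page` is a hypothetical callable standing in for the HTTP request, and the sketch assumes the OpenAlex response shape of a `results` list plus a `meta.next_cursor` field, with `cursor=*` starting the walk.

```python
from typing import Callable, Dict, Iterator, Optional


def walk_cursor(fetch_page: Callable[[str], Dict], max_items: int) -> Iterator[Dict]:
    """Yield up to max_items works by following next_cursor until it runs out."""
    cursor: Optional[str] = "*"  # assumed: a cursor walk starts with cursor=*
    yielded = 0
    while cursor and yielded < max_items:
        page = fetch_page(cursor)
        results = page.get("results", [])
        if not results:
            break  # an empty page means the result set is exhausted
        for work in results:
            if yielded >= max_items:
                return
            yield work
            yielded += 1
        # next_cursor is expected to be None once the last page is reached
        cursor = page.get("meta", {}).get("next_cursor")


# Offline stand-in for the API (made-up data) so the sketch runs without network.
FAKE_PAGES = {
    "*": {"meta": {"next_cursor": "c2"}, "results": [{"id": "W1"}, {"id": "W2"}]},
    "c2": {"meta": {"next_cursor": None}, "results": [{"id": "W3"}]},
}

works = list(walk_cursor(lambda cursor: FAKE_PAGES[cursor], max_items=10))
```

Passing the fetch step in as a callable keeps the paging logic separate from networking, so a real HTTP function can replace the fake one without changing the loop.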
The actor is a polite API citizen: every request carries a contact mailto (both as a query parameter and in the User-Agent header), which routes traffic to OpenAlex's faster, more reliable "polite pool".
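A request built this way might look like the sketch below. The contact address and the User-Agent string are hypothetical placeholders, and the function only constructs the request object; it does not send anything.

```python
from urllib.parse import urlencode
from urllib.request import Request

OPENALEX_WORKS_URL = "https://api.openalex.org/works"
CONTACT_EMAIL = "you@example.com"  # hypothetical contact address


def build_polite_request(params: dict) -> Request:
    """Attach the contact email as a query parameter and in the User-Agent."""
    query = urlencode({**params, "mailto": CONTACT_EMAIL})
    # The product name in the User-Agent is an illustrative assumption.
    headers = {"User-Agent": f"openalex-works-sketch/0.1 (mailto:{CONTACT_EMAIL})"}
    return Request(f"{OPENALEX_WORKS_URL}?{query}", headers=headers)


request = build_polite_request({"search": "crispr"})
```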
Input
| Field | Type | Default | Description |
|---|---|---|---|
| query | string | — (required) | Keywords to search (title, abstract, fulltext), e.g. machine learning. |
| sort | string | relevance | relevance, citations (most cited first), or date (newest first). |
| fromDate | string | — | Optional YYYY-MM-DD; only works published on/after this date. |
| filter | string | — | Optional raw OpenAlex filter, e.g. type:article,is_oa:true. Merged with fromDate. |
| maxItems | integer | 100 | Max works to return (50 fetched per page via cursor). |
| proxyConfiguration | object | { "useApifyProxy": false } | Optional. Not needed — OpenAlex is a clean public API. |
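One plausible way these fields translate into OpenAlex query parameters is sketched below. The mapping is an assumption for illustration, not the actor's confirmed implementation: `build_query_params` and `SORT_KEYS` are hypothetical names, the sort values use OpenAlex's `relevance_score`, `cited_by_count`, and `publication_date` sort keys, and `fromDate` is assumed to become a `from_publication_date` filter joined to the raw filter with a comma.

```python
# Assumed mapping from the actor's sort options to OpenAlex sort keys.
SORT_KEYS = {
    "relevance": "relevance_score:desc",
    "citations": "cited_by_count:desc",
    "date": "publication_date:desc",
}


def build_query_params(actor_input: dict) -> dict:
    """Turn actor input into query parameters for the works endpoint."""
    query = (actor_input.get("query") or "").strip()
    if not query:
        raise ValueError("BAD_INPUT: query is required")
    sort = actor_input.get("sort", "relevance")
    if sort not in SORT_KEYS:
        raise ValueError(f"BAD_INPUT: unknown sort {sort!r}")

    # fromDate and the raw filter are merged into one comma-separated filter.
    filters = []
    if actor_input.get("fromDate"):
        filters.append(f"from_publication_date:{actor_input['fromDate']}")
    if actor_input.get("filter"):
        filters.append(actor_input["filter"])

    params = {"search": query, "sort": SORT_KEYS[sort], "per-page": 50, "cursor": "*"}
    if filters:
        params["filter"] = ",".join(filters)
    return params


params = build_query_params(
    {
        "query": "crispr",
        "sort": "citations",
        "fromDate": "2020-01-01",
        "filter": "type:article,is_oa:true",
    }
)
```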
Example input
{
  "query": "crispr",
  "sort": "citations",
  "fromDate": "2020-01-01",
  "maxItems": 120
}
Output
One row per work:
{
  "ok": true,
  "openalexId": "https://openalex.org/W...",
  "doi": "https://doi.org/10....",
  "title": "…",
  "authors": ["Jane Doe", "John Roe"],
  "institutions": ["Some University"],
  "year": 2021,
  "publicationDate": "2021-05-03",
  "type": "article",
  "venue": "Nature",
  "citations": 1234,
  "concepts": ["Biology", "Genetics"],
  "isOpenAccess": true,
  "oaUrl": "https://…pdf",
  "abstract": "Reconstructed abstract text…",
  "url": "https://openalex.org/W..."
}
abstract is rebuilt from OpenAlex's abstract_inverted_index; when no abstract is indexed it is null. Results are deduplicated by openalexId.
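The abstract rebuild and the deduplication step can be sketched as below. The inverted index maps each word to the positions where it occurs, so sorting all (position, word) pairs restores the original word order. The function names and sample data are hypothetical; only `abstract_inverted_index` and `openalexId` come from the description above.

```python
from typing import Dict, List, Optional


def rebuild_abstract(inverted_index: Optional[Dict[str, List[int]]]) -> Optional[str]:
    """Rebuild readable text from a word -> positions mapping, or return None."""
    if not inverted_index:
        return None  # no abstract indexed for this work
    positioned = [
        (position, word)
        for word, positions in inverted_index.items()
        for position in positions
    ]
    return " ".join(word for _, word in sorted(positioned))


def dedupe_by_id(rows: List[dict]) -> List[dict]:
    """Keep the first row seen for each openalexId, preserving order."""
    seen = set()
    unique = []
    for row in rows:
        if row["openalexId"] not in seen:
            seen.add(row["openalexId"])
            unique.append(row)
    return unique


# Made-up sample: "editing" occurs at positions 1 and 4.
sample_index = {"Gene": [0], "editing": [1, 4], "enables": [2], "precise": [3]}
abstract = rebuild_abstract(sample_index)
```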
Diagnostics & billing
On failure or no results, the actor pushes a single diagnostic row (ok:false) with an errorCode (BAD_INPUT, NO_RESULTS, RATE_LIMITED, SERVER_ERROR, NETWORK) instead of failing silently. Only successful work rows are charged (one work unit each) — diagnostics and empty results are never billed.
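A consumer of the dataset might separate work rows from diagnostic rows and estimate the charge roughly as follows. The rows are made-up samples, the function names are hypothetical, and the price constant is the listing's "from $2.00 / 1,000" figure, so the actual charge may differ.

```python
from decimal import Decimal

# Listing's "from" price per 1,000 works; actual pricing may differ.
PRICE_PER_1000_WORKS = Decimal("2.00")


def split_rows(rows):
    """Separate successful work rows (ok: true) from diagnostic rows."""
    works = [row for row in rows if row.get("ok")]
    diagnostics = [row for row in rows if not row.get("ok")]
    return works, diagnostics


def estimated_charge(works) -> Decimal:
    """Only work rows count toward the charge; diagnostics are free."""
    return PRICE_PER_1000_WORKS * len(works) / 1000


# Made-up dataset rows for illustration.
rows = [
    {"ok": True, "openalexId": "https://openalex.org/W1"},
    {"ok": True, "openalexId": "https://openalex.org/W2"},
    {"ok": False, "errorCode": "RATE_LIMITED"},
]
works, diagnostics = split_rows(rows)
charge = estimated_charge(works)
```

Checking `errorCode` on the diagnostic row lets a caller decide whether to retry (for example on RATE_LIMITED or NETWORK) or to fix the input (BAD_INPUT).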
Data source
Data comes from OpenAlex, released under CC0. Please cite OpenAlex when you use it.