Crossref Scholarly Works Scraper
Pricing
$1.00 / 1,000 works returned
Searches the Crossref API (150M+ scholarly works) and returns clean records: DOI, title, authors, journal, publisher, date, citation count, subjects, ISSN, abstract. Filter by work type/date, sort by relevance, citations, or newest for lit reviews.
Developer
Dami's Studio
Maintained by Community
Search the Crossref catalog of 150M+ scholarly works (journal articles, preprints, books, datasets, and more) via its public REST API. There is no API key, no login, and no anti-bot protection to work around.
The actor is a polite Crossref client: it identifies itself with a contact `User-Agent` and a `mailto` query parameter so Crossref routes it to the faster "polite pool". It pages with deep cursors (`cursor=*`, then each response's `next-cursor`), which is the only reliable way to page past 1,000 rows.
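The polite-pool and cursor behavior described above can be sketched roughly as follows. This is a minimal illustration, not the actor's actual source: the function names, the default `User-Agent` string, and the page size of 100 are assumptions. The HTTP call is passed in as `fetch_json` so the paging logic can be exercised without network access.

```python
import json
from typing import Callable, Iterator
from urllib.parse import urlencode
from urllib.request import Request, urlopen

BASE_URL = "https://api.crossref.org/works"


def build_url(query: str, cursor: str, mailto: str, rows: int = 100) -> str:
    """Build one page request; `mailto` opts in to Crossref's polite pool."""
    params = {"query": query, "rows": rows, "cursor": cursor, "mailto": mailto}
    return f"{BASE_URL}?{urlencode(params)}"


def http_fetch_json(url: str, user_agent: str = "example-client/0.1 (mailto:you@example.org)") -> dict:
    """Fetch a URL with a contact User-Agent (the default value is a placeholder)."""
    request = Request(url, headers={"User-Agent": user_agent})
    with urlopen(request, timeout=30) as response:
        return json.load(response)


def iter_works(query: str, mailto: str, fetch_json: Callable[[str], dict],
               max_items: int = 100) -> Iterator[dict]:
    """Yield up to `max_items` raw works, following `next-cursor` page by page."""
    cursor, seen = "*", 0  # "*" asks Crossref to open a new deep-paging cursor
    while seen < max_items:
        message = fetch_json(build_url(query, cursor, mailto))["message"]
        items = message.get("items", [])
        if not items:  # an empty page means the result set is exhausted
            return
        for item in items:
            yield item
            seen += 1
            if seen >= max_items:
                return
        cursor = message.get("next-cursor")
        if not cursor:
            return
```

A live run would look like `list(iter_works("deep learning", "you@example.org", http_fetch_json, max_items=250))`, with your own contact address in place of the placeholder.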
Input
| Field | Type | Default | Description |
|---|---|---|---|
| `query` | string (required) | `deep learning` | Keywords searched across titles, authors, abstracts and metadata. |
| `filterType` | string | `all` | Restrict to a Crossref work type, e.g. `journal-article`. |
| `fromDate` | string `YYYY-MM-DD` | none | Only works published on/after this date. |
| `sort` | enum | `relevance` | `relevance`, `is-referenced-by-count` (most cited), or `published` (newest). |
| `maxItems` | integer | `100` | Max works to return (cursor pagination handles >100). |
| `proxyConfiguration` | object | none | Optional and off by default; Crossref is a public, no-key API with no anti-bot protection, so a proxy adds no benefit. Only enable it if you hit IP-level rate limits. |
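To show how these fields could translate into a Crossref request, here is a hedged sketch. The mapping is an assumption about the actor's internals: it uses Crossref's `type` and `from-pub-date` filters and always requests descending order. The function name `to_crossref_params` is hypothetical.

```python
def to_crossref_params(actor_input: dict) -> dict:
    """Map the actor input fields onto Crossref /works query parameters."""
    params = {"query": actor_input["query"]}

    filters = []
    work_type = actor_input.get("filterType", "all")
    if work_type and work_type != "all":
        filters.append(f"type:{work_type}")
    if actor_input.get("fromDate"):
        filters.append(f"from-pub-date:{actor_input['fromDate']}")
    if filters:
        # Crossref accepts several filters as one comma-separated value.
        params["filter"] = ",".join(filters)

    params["sort"] = actor_input.get("sort", "relevance")
    params["order"] = "desc"  # best match, most cited, or newest first
    return params
```

For example, an input of `{"query": "deep learning", "filterType": "journal-article", "sort": "published"}` would produce a request for journal articles only, newest first.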
Output
Each successful row:
```json
{
  "ok": true,
  "doi": "10.1038/nature14539",
  "title": "Deep learning",
  "authors": ["Yann LeCun", "Yoshua Bengio", "Geoffrey Hinton"],
  "journal": "Nature",
  "publisher": "Springer Science and Business Media LLC",
  "type": "journal-article",
  "publishedDate": "2015-05-28",
  "citations": 70000,
  "subjects": ["Multidisciplinary"],
  "issn": ["0028-0836", "1476-4687"],
  "abstract": null,
  "url": "https://doi.org/10.1038/nature14539"
}
```
- `authors` are formatted `"Given Family"` (organizational authors fall back to their name).
- `publishedDate` is assembled from Crossref's `date-parts` (may be year-only or year-month for older records).
- `citations` is Crossref's `is-referenced-by-count`.
- `abstract` is the JATS-XML abstract stripped to plain text, or `null` when Crossref has none.
- Nullable fields: `title`, `journal`, `publisher`, `type`, `publishedDate`, `abstract`, and `url` may be `null`, and `authors`, `subjects`, and `issn` may be empty arrays, depending on what the publisher deposited with Crossref. `doi` is always present (rows without a DOI are dropped). `citations` defaults to `0` when absent.
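The field rules above could be implemented along these lines. This is a sketch under assumptions, not the actor's code: it reads the date from Crossref's `issued` field, strips JATS markup with a simple regular expression rather than an XML parser, and all function names are hypothetical.

```python
import html
import re
from typing import Optional


def format_author(author: dict) -> Optional[str]:
    """'Given Family', falling back to `name` for organizational authors."""
    full = " ".join(p for p in (author.get("given"), author.get("family")) if p)
    return full or author.get("name")


def assemble_date(work: dict) -> Optional[str]:
    """Join date-parts into YYYY, YYYY-MM, or YYYY-MM-DD."""
    parts = ((work.get("issued") or {}).get("date-parts") or [[]])[0]
    if not parts or parts[0] is None:
        return None
    year, *rest = parts
    return "-".join([f"{year:04d}"] + [f"{p:02d}" for p in rest])


def strip_jats(abstract: Optional[str]) -> Optional[str]:
    """Drop JATS tags, decode entities, and collapse whitespace."""
    if not abstract:
        return None
    text = html.unescape(re.sub(r"<[^>]+>", " ", abstract))
    return " ".join(text.split()) or None


def first(values: Optional[list]) -> Optional[str]:
    return values[0] if values else None


def to_row(work: dict) -> Optional[dict]:
    """Convert one raw Crossref work to an output row; None if it has no DOI."""
    doi = work.get("DOI")
    if not doi:
        return None
    authors = [format_author(a) for a in work.get("author") or []]
    return {
        "ok": True,
        "doi": doi,
        "title": first(work.get("title")),
        "authors": [a for a in authors if a],
        "journal": first(work.get("container-title")),
        "publisher": work.get("publisher"),
        "type": work.get("type"),
        "publishedDate": assemble_date(work),
        "citations": work.get("is-referenced-by-count") or 0,
        "subjects": work.get("subject") or [],
        "issn": work.get("ISSN") or [],
        "abstract": strip_jats(work.get("abstract")),
        "url": work.get("URL"),
    }
```

Replacing tags with a space keeps adjacent words apart, at the cost of an occasional stray space before punctuation; a real JATS parser would be more precise.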
Results are deduplicated by DOI. Charging is per successful work (work event). Diagnostic, empty, or blocked rows (`ok: false` with an `errorCode`) are never charged; this includes `BAD_INPUT` (empty `query` or malformed `fromDate`), `NO_RESULTS`, and any network/block error.
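As a rough illustration of the deduplication and charging rule, the sketch below keeps the first row per DOI and counts only `ok: true` rows as billable. Comparing DOIs case-insensitively is an assumption (DOIs are case-insensitive by specification, but the actor's exact comparison is not documented here), and the function names are hypothetical.

```python
from typing import Iterable, Iterator


def dedupe_by_doi(rows: Iterable[dict]) -> Iterator[dict]:
    """Yield each successful row once per DOI; diagnostics pass through untouched."""
    seen = set()
    for row in rows:
        if not row.get("ok"):
            yield row
            continue
        key = row["doi"].lower()  # assumption: case-insensitive DOI match
        if key in seen:
            continue
        seen.add(key)
        yield row


def billable_count(rows: Iterable[dict]) -> int:
    """Count the rows that would trigger a charge (successful works only)."""
    return sum(1 for row in rows if row.get("ok"))
```

At the listed price of $1.00 per 1,000 works, a run that yields 250 billable rows would cost about $0.25.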
Troubleshooting
- `BAD_INPUT` row, no results: you left `query` empty or `fromDate` isn't `YYYY-MM-DD`. Fix the input and re-run; you were not charged.
- `NO_RESULTS` row: your query/filter combination matched nothing in Crossref. Try broader keywords or drop the type/date filters.
- `RATE_LIMITED` / `BLOCKED` row: rare for Crossref. The actor already retries with backoff; if it persists, enable a proxy to use a different IP.
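If you want to catch the `BAD_INPUT` cases before starting a run, a pre-flight check along these lines may help. It is a sketch only: the `message` field and the function names are hypothetical, and the actor's own validation may differ in detail.

```python
import re
from datetime import date
from typing import Optional

DATE_PATTERN = re.compile(r"\d{4}-\d{2}-\d{2}")


def bad_input(message: str) -> dict:
    """Build a diagnostic row; the `message` field is illustrative."""
    return {"ok": False, "errorCode": "BAD_INPUT", "message": message}


def validate_input(actor_input: dict) -> Optional[dict]:
    """Return a BAD_INPUT diagnostic row, or None when the input is usable."""
    query = (actor_input.get("query") or "").strip()
    if not query:
        return bad_input("query must not be empty")

    from_date = actor_input.get("fromDate")
    if from_date:
        if not DATE_PATTERN.fullmatch(from_date):
            return bad_input("fromDate must look like YYYY-MM-DD")
        try:
            date.fromisoformat(from_date)  # rejects impossible dates such as month 13
        except ValueError:
            return bad_input("fromDate is not a real calendar date")
    return None
```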
Notes
- Powered entirely by the public Crossref REST API (`https://api.crossref.org/works`). Please be considerate of the shared, free service.
- Citation counts and abstracts depend on what publishers deposit with Crossref; coverage varies by record.