Crossref Works Extractor
Pricing
from $2.00 / 1,000 results
Extract scholarly publication metadata from Crossref — one work per row, with DOI, title, authors, publisher, type, dates, and references. 183M+ works. Public data, no key.
Developer: xtractoo (maintained by Community)
Extract scholarly publication metadata from Crossref — one work per row — with DOI, title, authors, publisher, type, publication dates, license, and reference counts.
Built for research and R&D intelligence, bibliometrics, publishing analytics, and reference/citation tooling.
Why use this actor
- 183M+ works (verified) — pull the whole corpus or a slice by type, date, or query.
- One work per row, with a flat header (`doi`, `title`, `type`, `publisher`, `is_referenced_by_count`) plus the full raw Crossref object (authors, references, license, funder, …).
- No login, no key. Add your email to join Crossref's faster "polite pool".
Input
| Field | Type | Description |
|---|---|---|
| `worksType` | dropdown | Journal article / Book chapter / Preprint / Dataset / … |
| `fromPubDate` / `untilPubDate` | date | Publication-date window. |
| `query` | text | Free-text search. |
| `filter` | text | Extra Crossref filter (e.g. `has-orcid:true`), combined with the above. |
| `mailto` | text | Your email; joins the faster pool (recommended). |
| `rows` | int | Results per request, 1–1000 (default 1000). |
| `maxItems` | int | Maximum works to return; 0 = all matching. |
worksType is a pick-list; query/filter are free-text (Crossref's filter grammar is open-ended).
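The input fields above map naturally onto the query parameters of Crossref's public `/works` endpoint. The sketch below shows one plausible mapping. The function name `build_works_url` and the exact way the actor combines its inputs are assumptions made for illustration. Only the Crossref parameter names (`filter`, `query`, `mailto`, `rows`, `cursor`) and the filter names (`type`, `from-pub-date`, `until-pub-date`) come from the Crossref REST API.

```python
from urllib.parse import parse_qs, urlencode, urlparse

CROSSREF_WORKS_URL = "https://api.crossref.org/works"


def build_works_url(works_type=None, from_pub_date=None, until_pub_date=None,
                    query=None, extra_filter=None, mailto=None,
                    rows=1000, cursor="*"):
    """Hypothetical helper: turn actor-style inputs into a Crossref /works URL."""
    # Crossref takes all filters as one comma-separated "name:value" list.
    filters = []
    if works_type:
        filters.append(f"type:{works_type}")
    if from_pub_date:
        filters.append(f"from-pub-date:{from_pub_date}")
    if until_pub_date:
        filters.append(f"until-pub-date:{until_pub_date}")
    if extra_filter:
        # Assumption: the free-text filter is appended to the generated ones.
        filters.append(extra_filter)

    params = {"rows": rows, "cursor": cursor}
    if filters:
        params["filter"] = ",".join(filters)
    if query:
        params["query"] = query
    if mailto:
        params["mailto"] = mailto  # opts in to the "polite pool"
    return f"{CROSSREF_WORKS_URL}?{urlencode(params)}"


url = build_works_url(works_type="journal-article",
                      from_pub_date="2024-01-01",
                      extra_filter="has-orcid:true",
                      mailto="you@example.org")
print(parse_qs(urlparse(url).query)["filter"][0])
```

Building the URL with `urlencode` keeps characters such as `:`, `,` and `*` safely percent-encoded.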
Output — WORK
Each record carries an envelope (`_input`, `_source`, `_scrapedAt`), `recordType: "WORK"`, and the flat header, followed by the raw Crossref work:
    {
      "_input": "type=journal-article; from=2024-01-01",
      "_source": "S1-crossref",
      "_scrapedAt": "2026-06-03T10:00:00Z",
      "recordType": "WORK",
      "doi": "10.1234/example",
      "title": "...",
      "type": "journal-article",
      "publisher": "...",
      "is_referenced_by_count": 12,
      "author": [ "..." ],
      "reference": [ "..." ]
    }
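For spreadsheet-style analysis, the flat header can be pulled out of each record and written as CSV. This is a minimal sketch that assumes the dataset has already been loaded as a list of Python dicts shaped like the record above. The helper names `flat_header` and `records_to_csv` and the sample values are hypothetical.

```python
import csv
import io

# Flat header fields, as listed in the output description.
FLAT_FIELDS = ["doi", "title", "type", "publisher", "is_referenced_by_count"]


def flat_header(record):
    """Keep only the flat header fields; missing fields become None."""
    return {field: record.get(field) for field in FLAT_FIELDS}


def records_to_csv(records):
    """Write one CSV row per WORK record and return the CSV text."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FLAT_FIELDS)
    writer.writeheader()
    for record in records:
        if record.get("recordType") == "WORK":
            writer.writerow(flat_header(record))
    return buf.getvalue()


# Illustrative record with made-up values, not real Crossref data.
sample = [{
    "_source": "S1-crossref",
    "recordType": "WORK",
    "doi": "10.1234/example",
    "title": "Example title",
    "type": "journal-article",
    "publisher": "Example Press",
    "is_referenced_by_count": 12,
    "author": [],
}]
print(records_to_csv(sample))
```

Nested fields such as `author` and `reference` are left out here because they do not fit a single CSV cell without a further flattening choice.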
How it works
- Your type, date, and keyword filters are applied to the search.
- The actor automatically pages through all matching results (up to 1000 per request).
- Each work streams into the dataset.
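The paging step can be sketched as a cursor loop: start with `cursor=*`, read `message.next-cursor` from each response, and stop when a page comes back empty. This is an illustration of the general Crossref deep-paging pattern, not the actor's actual code. `fetch_page` and `iter_works` are hypothetical names, and the `max_items` handling is an assumption modelled on the `maxItems` input.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def fetch_page(params):
    """Fetch one page of results from the Crossref /works endpoint."""
    url = "https://api.crossref.org/works?" + urlencode(params)
    with urlopen(url, timeout=60) as resp:
        return json.load(resp)


def iter_works(base_params, fetch=fetch_page, rows=1000, max_items=0):
    """Yield works one by one, following next-cursor until no items remain."""
    cursor = "*"  # Crossref's marker for "start a new deep-paging session"
    emitted = 0
    while True:
        page = fetch({**base_params, "rows": rows, "cursor": cursor})
        message = page["message"]
        items = message.get("items", [])
        if not items:
            return  # an empty page signals the end of the result set
        for item in items:
            yield item
            emitted += 1
            if max_items and emitted >= max_items:
                return  # max_items == 0 means "no cap"
        cursor = message.get("next-cursor")
        if not cursor:
            return
```

Passing the fetch function as a parameter keeps the loop testable without network access. A call such as `iter_works({"filter": "type:journal-article"}, max_items=5)` would hit the live API.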
Known limits
- Public data: no account needed, runs from any connection. Backs off on HTTP 429. Add `mailto` for the faster pool.
- Verified live 2026-06-03: total 183,058,502; `cursor=*` → `message.next-cursor` confirmed (200 in ~3.5s).
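Backing off on HTTP 429 is commonly done with exponential delays between retries. The sketch below shows that general technique. The actor's real retry policy is not documented here, so the retry count, the base delay, the use of a `Retry-After` header, and the name `with_backoff` are all assumptions.

```python
import time
from urllib.error import HTTPError


def with_backoff(call, max_retries=5, base_delay=1.0, sleep=time.sleep):
    """Run call(); on HTTP 429, wait and retry with exponentially growing delays."""
    for attempt in range(max_retries + 1):
        try:
            return call()
        except HTTPError as err:
            # Only rate-limit responses are retried; other errors propagate.
            if err.code != 429 or attempt == max_retries:
                raise
            retry_after = err.headers.get("Retry-After") if err.headers else None
            if retry_after and retry_after.isdigit():
                delay = float(retry_after)  # honour the server's hint if given
            else:
                delay = base_delay * 2 ** attempt  # 1s, 2s, 4s, ...
            sleep(delay)
```

Any zero-argument callable that performs one request with `urllib` can be wrapped this way, since `urlopen` raises `HTTPError` on a 429 response.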