Crossref Works Extractor avatar

Crossref Works Extractor

Pricing

from $2.00 / 1,000 results

Go to Apify Store
Crossref Works Extractor

Crossref Works Extractor

Extract scholarly publication metadata from Crossref — one work per row, with DOI, title, authors, publisher, type, dates, and references. 183M+ works. Public data, no key.

Pricing

from $2.00 / 1,000 results

Rating

0.0

(0)

Developer

xtractoo

xtractoo

Maintained by Community

Actor stats

0

Bookmarked

2

Total users

1

Monthly active users

2 days ago

Last modified

Share

Extract scholarly publication metadata from Crossref — one work per row — with DOI, title, authors, publisher, type, publication dates, license, and reference counts.

Built for research and R&D intelligence, bibliometrics, publishing analytics, and reference/citation tooling.


Why use this actor

  • 183M+ works (verified) — pull the whole corpus or a slice by type, date, or query.
  • One work per row, with a flat header (doi, title, type, publisher, is_referenced_by_count) plus the full raw Crossref object (authors, references, license, funder, …).
  • No login, no key. Add your email to join Crossref's faster "polite pool".

Input

FieldTypeDescription
worksTypedropdownJournal article / Book chapter / Preprint / Dataset / …
fromPubDate / untilPubDatedatePublication-date window.
querytextFree-text search.
filtertextExtra Crossref filter (e.g. has-orcid:true), combined with the above.
mailtotextYour email — faster pool (recommended).
rowsint1–1000 (default 1000).
maxItemsint0 = all matching.

worksType is a pick-list; query/filter are free-text (Crossref's filter grammar is open-ended).

Output — WORK

Envelope + recordType: "WORK" + flat header, then the raw Crossref work:

{
"_input": "type=journal-article; from=2024-01-01",
"_source": "S1-crossref",
"_scrapedAt": "2026-06-03T10:00:00Z",
"recordType": "WORK",
"doi": "10.1234/example",
"title": "...",
"type": "journal-article",
"publisher": "...",
"is_referenced_by_count": 12,
"author": [ "..." ],
"reference": [ "..." ]
}

How it works

  1. Your type, date, and keyword filters are applied to the search.
  2. The actor automatically pages through all matching results (up to 1000 per request).
  3. Each work streams into the dataset.

Known limits

  • Public data — no account needed, runs from any connection. Backs off on HTTP 429. Add mailto for the faster pool.
  • Verified live 2026-06-03: total 183,058,502; cursor=*message.next-cursor confirmed (200 in ~3.5s).