Deprecated

Pricing

from $2.00 / 1,000 results

Crossref Scholarly Metadata — 150M+ Works

Search 150M+ scholarly works from 20,000+ publishers via Crossref API. Extract DOIs, citations, authors, abstracts, journals, funding data, and publication metadata. Essential for bibliometrics, research impact analysis, citation network studies, and academic data science. No API key required.

Rating: 0.0 (0)

Developer: kettledrum

Last modified: 5 months ago

Crossref Scholarly Metadata

Extract scholarly metadata from the Crossref REST API — the largest open database of academic works with 150M+ records from 20,000+ publishers.

How much does it cost?

This Actor uses pay-per-event pricing. You are charged per result item returned.
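
Because the charge is per result, a run's cost can be estimated before starting it. The sketch below assumes the listed starting rate of $2.00 per 1,000 results applies to your plan. The function name `estimate_cost_usd` is hypothetical and not part of the Actor.

```python
from decimal import Decimal

# Assumption: the listed starting rate of $2.00 per 1,000 results applies.
RATE_PER_1000_USD = Decimal("2.00")


def estimate_cost_usd(result_count: int,
                      rate_per_1000: Decimal = RATE_PER_1000_USD) -> Decimal:
    """Estimate the charge for a run that returns `result_count` items."""
    if result_count < 0:
        raise ValueError("result_count must be non-negative")
    # Linear pricing: rate * (results / 1000), rounded to whole cents.
    return (rate_per_1000 * result_count / 1000).quantize(Decimal("0.01"))


print(estimate_cost_usd(200))   # estimate for a 200-result run
print(estimate_cost_usd(5000))  # estimate for a 5,000-result run
```

Setting `maxResults` on a run caps the result count, so the estimate for that value is also an upper bound on the per-result charge under the same rate assumption.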

No proxy costs. No API key costs. Crossref data is freely accessible from any location.

What data is available?

Articles: Journal articles, conference papers, preprints, dissertations
Books: Books, book chapters, monographs, edited volumes
Datasets: Research datasets with DOIs
Other: Reports, peer reviews, standards, grants

Each record includes DOI, title, authors (with ORCIDs), abstract, citation counts, publication date, journal/book info, subjects, license, and PDF links when available.

Modes

1. Works Search (default)

Search scholarly works by keyword, author, topic, or any combination.

Example — Find highly-cited machine learning papers:

Query: machine learning
Sort: Citation Count
Order: Descending
Max Results: 100

Example — Recent journal articles with abstracts:

Query: CRISPR gene editing
Work Type: Journal Article
From Date: 2024-01-01
Has Abstract: true
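
The two examples above can also be written as Actor input objects. This is a sketch: the field names are taken from the Python integration example later in this README, and treating them as the Actor's complete input schema is an assumption.

```python
import json

# Field names follow the Python integration example in this README.
# Whether the Actor accepts exactly these keys is an assumption.
highly_cited_ml = {
    "mode": "works",
    "query": "machine learning",
    "sort": "citationCount",
    "order": "desc",
    "maxResults": 100,
}

recent_crispr_articles = {
    "mode": "works",
    "query": "CRISPR gene editing",
    "workType": "journal-article",
    "fromDate": "2024-01-01",
    "hasAbstract": True,
}

# Print the inputs as JSON, the form used when pasting input into a run.
print(json.dumps(highly_cited_ml, indent=2))
print(json.dumps(recent_crispr_articles, indent=2))
```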

2. DOI Lookup

Get complete metadata for a single work by its DOI, including the full reference list.

Example:

DOI: 10.1038/nature12373

Returns the work's metadata plus up to 500 references with their DOIs, titles, and authors.
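
For orientation, a DOI lookup corresponds to a single-work request against the public Crossref REST API (`https://api.crossref.org/works/{doi}`). The sketch below only builds that URL. The helper names `normalize_doi` and `work_url` are hypothetical, and whether the Actor normalizes DOIs in this way is an assumption.

```python
from urllib.parse import quote

CROSSREF_API = "https://api.crossref.org"

# Prefixes people commonly paste in front of a DOI.
_DOI_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


def normalize_doi(raw: str) -> str:
    """Strip a resolver or 'doi:' prefix and lowercase the DOI."""
    doi = raw.strip()
    for prefix in _DOI_PREFIXES:
        if doi.lower().startswith(prefix):
            doi = doi[len(prefix):]
            break
    # DOIs are case-insensitive, so lowercasing gives a stable key.
    return doi.lower()


def work_url(raw_doi: str) -> str:
    """Build the Crossref single-work URL for a DOI (no request is made)."""
    return f"{CROSSREF_API}/works/{quote(normalize_doi(raw_doi), safe='/')}"


print(work_url("10.1038/nature12373"))
```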

3. Journals Search

Search journals by name. Returns coverage statistics, ISSN, publisher info, and DOI counts.

Example:

Query: nature

4. Funders Search

Search funding organizations from the Open Funder Registry.

Example:

Query: national science foundation
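
The journals and funders modes correspond to the `/journals` and `/funders` resources of the public Crossref REST API, both of which accept a `query` parameter. The sketch below builds those request URLs. The helper name `search_url` and the default of 20 rows are illustrative assumptions, not the Actor's actual behavior.

```python
from urllib.parse import urlencode

CROSSREF_API = "https://api.crossref.org"


def search_url(resource: str, query: str, rows: int = 20) -> str:
    """Build a Crossref search URL for 'journals' or 'funders'."""
    if resource not in {"journals", "funders"}:
        raise ValueError("resource must be 'journals' or 'funders'")
    # urlencode handles spaces and special characters in the query.
    return f"{CROSSREF_API}/{resource}?{urlencode({'query': query, 'rows': rows})}"


print(search_url("journals", "nature"))
print(search_url("funders", "national science foundation"))
```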

Filters (Works mode)

| Filter | Description |
| --- | --- |
| Work Type | Filter by content type (journal-article, book, dataset, etc.) |
| Sort By | Sort by relevance, publication date, citation count, or reference count |
| From/Until Date | Filter by publication date range (YYYY-MM-DD or YYYY) |
| Has Abstract | Only return works with abstracts |
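
These filters have direct counterparts in the Crossref REST API's `filter`, `sort`, and `order` query parameters. The sketch below shows one plausible translation. The Actor's real mapping is not documented here, so the function `works_query_params` and the sort keys other than `citationCount` are assumptions.

```python
from typing import Dict, Optional

# Assumed mapping from Actor sort options to Crossref `sort` values.
# Only "citationCount" appears in this README; the other keys are hypothetical.
SORT_MAP = {
    "relevance": "relevance",
    "publishedDate": "published",
    "citationCount": "is-referenced-by-count",
    "referenceCount": "references-count",
}


def works_query_params(query: str,
                       work_type: Optional[str] = None,
                       from_date: Optional[str] = None,
                       until_date: Optional[str] = None,
                       has_abstract: bool = False,
                       sort: str = "relevance",
                       order: str = "desc") -> Dict[str, str]:
    """Translate filter options into Crossref /works query parameters."""
    filters = []
    if work_type:
        filters.append(f"type:{work_type}")
    if from_date:
        filters.append(f"from-pub-date:{from_date}")
    if until_date:
        filters.append(f"until-pub-date:{until_date}")
    if has_abstract:
        filters.append("has-abstract:true")

    params = {"query": query, "sort": SORT_MAP[sort], "order": order}
    if filters:
        # Crossref expects multiple filters as one comma-separated value.
        params["filter"] = ",".join(filters)
    return params


print(works_query_params("CRISPR gene editing", work_type="journal-article",
                         from_date="2024-01-01", has_abstract=True))
```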

Output

Each result includes (when available):

| Field | Description |
| --- | --- |
| `doi` | Digital Object Identifier |
| `title` | Work title |
| `authors` | Author names, ORCIDs, and affiliations |
| `abstract` | Abstract text (JATS tags removed) |
| `type` | Content type (journal-article, book-chapter, etc.) |
| `publisher` | Publisher name |
| `containerTitle` | Journal or book title |
| `publishedDate` | Publication date |
| `year` | Publication year |
| `citationCount` | Number of times cited by other works |
| `referenceCount` | Number of references in the work |
| `subjects` | Subject categories |
| `licenseUrl` | License URL (e.g., Creative Commons) |
| `pdfUrl` | Direct link to PDF |
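
Crossref delivers abstracts as JATS XML fragments, which is why the `abstract` field is described as having its tags removed. The sketch below shows one simple way to do that with a regular expression. It is an illustration, not the Actor's actual cleaning code, and `strip_jats` is a hypothetical name.

```python
import html
import re

_TAG_RE = re.compile(r"<[^>]+>")
_SPACE_RE = re.compile(r"\s+")


def strip_jats(abstract: str) -> str:
    """Remove JATS/XML tags and collapse whitespace in an abstract."""
    # Replace each tag with a space so adjacent elements do not run together.
    text = _TAG_RE.sub(" ", abstract)
    # Decode entities such as &amp; after the tags are gone.
    text = html.unescape(text)
    return _SPACE_RE.sub(" ", text).strip()


raw = ("<jats:title>Abstract</jats:title>"
       "<jats:p>CRISPR enables <jats:italic>precise</jats:italic> edits.</jats:p>")
print(strip_jats(raw))
```

Replacing tags with spaces keeps neighboring elements apart, at the cost of occasionally leaving a space before punctuation that directly follows an inline tag.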

Use cases

Literature reviews — systematically collect papers on a topic with citation counts and metadata
Bibliometric analysis — study publication patterns, citation networks, and research impact
Research monitoring — track new publications in a field by date range
Journal analysis — compare journals by DOI count, coverage depth, and publisher info
Funding analysis — identify funding organizations and their research portfolios
Academic data pipelines — feed structured scholarly data into research databases or AI models

FAQ

Q: How large is the Crossref database?
A: Over 150 million works with DOIs from 20,000+ publishers. This includes journal articles, books, datasets, conference papers, preprints, and more. New records are added continuously.

Q: Can I get the full text of papers?
A: No. Crossref provides metadata, not full text. However, many records include a pdfUrl link to the publisher's full-text PDF (especially for open access articles). To limit results to records with abstracts, set hasAbstract: true.

Q: How do I get all papers by a specific author?
A: Use works search with the author's name as the query. For more precise results, match on the author's ORCID, which results include when the author has registered one.

Q: What's the maximum number of results per run?
A: There is no fixed upper limit on maxResults. The Actor uses cursor-based pagination with no depth limit, so a single run can retrieve thousands of records. Cost scales linearly with the number of results.

Q: Can I search by journal or publisher?
A: Yes. Use journals mode to find journal metadata by name, or include the journal name in your works search query. You can also filter by work type (journal-article, book-chapter, etc.).

Q: How often is the data updated?
A: Crossref data updates in near real-time as publishers register new DOIs. Recently published papers typically appear within days of publication.
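
As the full-text answer above notes, only some records carry a `pdfUrl`. A minimal sketch of keeping just those records from a list of result items follows. The sample records are made-up placeholders, not real Crossref data, and `with_pdf_links` is a hypothetical helper.

```python
from typing import Dict, List


def with_pdf_links(items: List[Dict]) -> List[Dict]:
    """Keep only result items that have a non-empty pdfUrl."""
    return [item for item in items if item.get("pdfUrl")]


# Made-up placeholder records for illustration only.
sample = [
    {"doi": "10.1234/example.1", "pdfUrl": "https://example.org/paper1.pdf"},
    {"doi": "10.1234/example.2", "pdfUrl": None},
    {"doi": "10.1234/example.3"},
]

for record in with_pdf_links(sample):
    print(record["doi"], record["pdfUrl"])
```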

Integration with Python

from apify_client import ApifyClient
import pandas as pd

# Replace with your own Apify API token
client = ApifyClient("YOUR_API_TOKEN")

# Run the Actor: journal articles on large language models published
# since 2025-01-01, with abstracts, most cited first
run = client.actor("aligned_kettledrum/crossref-scholarly-data").call(
    run_input={
        "mode": "works",
        "query": "large language models",
        "fromDate": "2025-01-01",
        "workType": "journal-article",
        "hasAbstract": True,
        "sort": "citationCount",
        "order": "desc",
        "maxResults": 200,
    }
)

# Load the run's dataset into a pandas DataFrame
items = list(client.dataset(run["defaultDatasetId"]).iterate_items())
df = pd.DataFrame(items)

# Top cited papers
print(df[["title", "citationCount", "year", "containerTitle"]].head(20))

# Citation distribution
print(f"Total papers: {len(df)}")
print(f"Mean citations: {df['citationCount'].mean():.1f}")
print(f"Median citations: {df['citationCount'].median():.0f}")

Technical details

Data source: Crossref REST API — free, no API key required
Database size: 150M+ works with DOIs
Rate limiting: Uses the Crossref polite pool (10 req/sec) with automatic throttling
Pagination: Cursor-based deep pagination (no depth limit)
Memory: 128-512 MB (API calls only, no browser needed)
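
Cursor-based deep pagination in the Crossref REST API works by sending `cursor=*` on the first request and then passing back the `next-cursor` value from each response until a page comes back empty. The sketch below illustrates that loop and is not the Actor's source code. The names `fetch_page` and `iter_works` are hypothetical, and the `mailto` parameter is how Crossref identifies polite-pool clients.

```python
import json
import urllib.parse
import urllib.request
from typing import Callable, Dict, Iterator, Optional


def fetch_page(params: Dict[str, str]) -> dict:
    """Fetch one page of /works from the public Crossref API (network call)."""
    url = "https://api.crossref.org/works?" + urllib.parse.urlencode(params)
    with urllib.request.urlopen(url, timeout=30) as resp:
        return json.load(resp)["message"]


def iter_works(query: str,
               max_results: int,
               fetch: Callable[[Dict[str, str]], dict] = fetch_page,
               rows: int = 100,
               mailto: Optional[str] = None) -> Iterator[dict]:
    """Yield up to `max_results` works, following Crossref cursors."""
    cursor = "*"  # "*" asks Crossref to start a new cursor session
    yielded = 0
    while yielded < max_results:
        params = {"query": query, "rows": str(rows), "cursor": cursor}
        if mailto:
            params["mailto"] = mailto  # identifies the client for the polite pool
        message = fetch(params)
        items = message.get("items", [])
        if not items:
            return  # an empty page means the result set is exhausted
        for item in items:
            yield item
            yielded += 1
            if yielded >= max_results:
                return
        cursor = message.get("next-cursor")
        if not cursor:
            return
```

Passing the page fetcher as an argument keeps the loop testable without network access. A real run would call something like `iter_works("machine learning", 250, mailto="you@example.org")`, where the address is a placeholder.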

MCP Server

Need Crossref data inside your AI agent? Use the Crossref MCP Server — same data, native MCP integration for Claude, ChatGPT, and other LLM frameworks.

Crossref Scraper: Scholarly DOIs & Citations

themineworks/crossref-scholarly-metadata

Scrape 150M+ scholarly works from Crossref: DOIs, authors, journals & citation counts by topic, year and type. Clean structured JSON for bibliometrics, literature reviews and RAG. No API key. Use it as an MCP server in Claude, ChatGPT & AI agents.

The Mine Works

Academic Papers Scraper - Citations & Metadata

benthepythondev/crossref-papers-scraper

Search 150M+ scholarly works by keyword (or look up by DOI) for structured metadata: title, authors with ORCID, journal, publication date, type, publisher, citation count, subjects, ISSN, volume/issue/pages and URL. Fast and reliable via the public Crossref API.

Ben

Crossref Scraper — DOI Metadata for Academic Papers

openclawmara/crossref-scraper

Scrape Crossref — largest DOI registry for academic literature. Modes: search works, DOI lookup, journal metadata, funder info, affiliation search. Extracts titles, authors, DOIs, ISSN, references, citations. Official REST API, no auth, 50 req/sec. For research & citation analysis.

OpenClaw Mara

Crossref Scholarly Works Scraper

dami_studio/crossref-scraper

Searches the Crossref API (150M+ scholarly works) and returns clean records: DOI, title, authors, journal, publisher, date, citation count, subjects, ISSN, abstract. Filter by work type/date, sort by relevance, citations, or newest for lit reviews.

Dami's Studio

5.0

(1)

Crossref Academic Paper Search

ryanclinton/crossref-paper-search

Search over 150 million scholarly works indexed by Crossref -- the largest open registry of DOI metadata in the world. Retrieve structured publication data including titles, authors with ORCID identifiers, citation counts, journal names, funding information, abstracts, and more. No API key required.

Ryan Clinton

Open Library Scraper — Books, Authors & Editions

openclawmara/openlibrary-scraper

Scrape Open Library (Internet Archive) for books, authors, and editions. Modes: search by title/author/subject, book details by ISBN/OLID, author works, recent additions. Extracts titles, authors, ISBNs, covers, subjects, publish dates, editions. Uses official Search & Works API. No auth.

OpenClaw Mara

DBLP Computer Science Publication Search

ryanclinton/dblp-publication-search

Search and extract computer science publications from DBLP -- the largest open bibliography database for computer science with over 6 million publications from journals, conferences, and workshops. Filter by keyword, author, venue, year, and publication type.

Ryan Clinton

Academic Research Mcp

ryanclinton/academic-research-mcp

Academic Research Mcp. Available on the Apify Store with pay-per-event pricing.

Ryan Clinton

CrossRef Scraper - Academic DOI & Metadata Extractor

klondikeking/crossref-academic-scraper

Extract academic paper metadata, DOIs, authors, citations, and abstracts from CrossRef via the public REST API. No scraping needed - fast, reliable, and cost-effective for researchers and data scientists.

Pierrick McD0nald

Crossref Academic Citation Scraper

cloud9_ai/crossref-scraper

Search and extract scholarly publication metadata from Crossref. Get DOIs, citations, authors, journals for 140M+ works.

cloud9

Crossref Scholarly Scraper — DOIs, Citations & Journals

logiover/crossref-scraper

Scrape Crossref by keyword, ISSN, or DOI list. Extract title, authors, DOI, citations, journal, publisher, funding, license for research, bibliometrics, and academic analysis. No API key required.