Pricing

from $3.00 / 1,000 results

Crossref Scholarly Works Scraper

Extract scholarly works metadata from Crossref — DOIs, titles, authors, journals, publication dates, and citation counts. Filter by query, date range, and work type. No API key required.

Developer

Compute Edge

What This Actor Does

This Actor queries the Crossref REST API, one of the largest open registries of scholarly metadata. It exposes four search and filtering options:

Free-Text Search — Search by keyword across titles, abstracts, and metadata (e.g., "machine learning", "COVID-19", "renewable energy")
Publication Date Filtering — Restrict results to works published within a date range
Work Type Filtering — Target specific work types (e.g., journal articles, books, proceedings, datasets)
Pagination & Bulk Extraction — Automatically fetch up to 5,000 records per run using cursor-based pagination

Key Features

135+ million works — Access the complete Crossref dataset
Rich metadata — DOI, title, authors, journal/container, publication date, citation counts, references
Flexible filtering — Combine free-text search with date range and work type filters
High-speed pagination — Cursor-based API ensures fast, stable bulk extracts
No authentication required — Public API, free to use
Error handling — Graceful fallback for missing or incomplete metadata
Batch processing — Efficient extraction for large datasets

Popular Use Cases

Use Case	Query Example	Work Type	Output
Literature Review	"climate change mitigation"	journal-article	Top 500 recent articles on climate solutions
Citation Network Analysis	"neural networks"	journal-article, then proceedings-article (separate runs)	Papers by citation count for network mapping
Trend Tracking	"AI safety"	all types	New works published in last 30 days
Researcher Database	None (recent works)	all types	Latest 1,000 scholarly works across all fields
Book Discovery	"sustainable development"	book	Recent books on sustainability
Conference Proceedings	"machine learning"	proceedings-article	Peer-reviewed conference papers

Getting Started

Step 1: Run the Actor

Choose your input parameters (see below)
Click Start
Results appear in the Dataset tab
Export as JSON or CSV from the Apify UI

Step 2: Simple Example — Search Recent Works

To fetch 50 recent works (no search query):

Query: (leave blank)
Filter From Date: (leave blank)
Work Type: (leave blank)
Max Results: 50

Results include title, authors, journal, publication date, and DOI for each work.

How to scrape Crossref scholarly works

Tutorial 1: Search for Papers on Machine Learning

Goal: Find the top 100 recent journal articles on machine learning.

Input configuration:

Query: machine learning
Work Type: journal-article
Filter From Date: (leave blank for all time)
Max Results: 100

Expected output:

[
  {
    "doi": "10.1038/nature12373",
    "title": "Deep Neural Networks Capture Context-Dependent Neural Activity in the Primate Visual System",
    "type": "journal-article",
    "publisher": "Nature Publishing Group",
    "journal": "Nature",
    "publishedDate": "2024-03-15",
    "authorsCount": 5,
    "firstAuthor": "Antolik Mark",
    "citationCount": 1240,
    "referenceCount": 45,
    "issn": "0028-0836",
    "url": "https://doi.org/10.1038/nature12373"
  },
  ...
]

Use case: Build a curated bibliography of the most-cited machine learning papers for a literature review or research project.

Tutorial 2: Track Recent Works in a Specific Domain

Goal: Monitor all scholarly works on renewable energy published in the last 90 days.

Input configuration:

Query: renewable energy
Filter From Date: 2026-03-21 (90 days before today)
Work Type: (leave blank for all types)
Max Results: 500

Expected output:

[
  {
    "doi": "10.1016/j.renene.2026.03.001",
    "title": "Advances in Perovskite Solar Cell Efficiency and Stability",
    "type": "journal-article",
    "publisher": "Elsevier",
    "journal": "Renewable Energy",
    "publishedDate": "2026-03-21",
    "authorsCount": 8,
    "firstAuthor": "Liu Chen",
    "citationCount": 0,
    "referenceCount": 67,
    "issn": "0960-1481",
    "url": "https://doi.org/10.1016/j.renene.2026.03.001"
  },
  ...
]

Use case: Stay current with emerging research in your domain. Track high-impact journals and new author collaborations. Feed into a data pipeline for weekly research digest emails.
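The rolling start date used in this tutorial can be computed rather than typed by hand. This is a minimal sketch; the helper name `filter_from_date` is hypothetical and not part of the Actor.

```python
from datetime import date, timedelta


def filter_from_date(today: date, days_back: int) -> str:
    """Return the YYYY-MM-DD string that lies `days_back` days before `today`."""
    return (today - timedelta(days=days_back)).isoformat()


# 90 days before 19 June 2026 gives the date used in the tutorial input.
print(filter_from_date(date(2026, 6, 19), 90))  # 2026-03-21
```

The same helper with `days_back=7` would cover a weekly schedule.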

Tutorial 3: Citation Network Analysis

Goal: Extract 200 highly-cited papers on artificial intelligence to map research influence.

Input configuration:

Query: artificial intelligence
Work Type: journal-article
Filter From Date: (leave blank)
Max Results: 200

Expected output (sorted by citation count):

[
  {
    "doi": "10.1145/3495243.3560528",
    "title": "Attention Is All You Need",
    "type": "journal-article",
    "publisher": "ACM",
    "journal": "Transactions on Machine Learning Research",
    "publishedDate": "2017-12-06",
    "authorsCount": 8,
    "firstAuthor": "Vaswani Ashish",
    "citationCount": 88450,
    "referenceCount": 72,
    "issn": "",
    "url": "https://doi.org/10.1145/3495243.3560528"
  },
  ...
]

Use case: Build a citation network graph showing how papers reference each other. Identify foundational works and research clusters. Track influence trajectories of key researchers.
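The input table lists no sort option, so if an exported dataset does not arrive in the order you need, ranking by `citationCount` can be done after export. A minimal sketch, assuming records shaped like the samples in this README; `top_cited` is a hypothetical helper.

```python
def top_cited(records, n):
    """Return the n records with the highest citationCount, highest first."""
    return sorted(records, key=lambda r: r.get("citationCount") or 0, reverse=True)[:n]


# Citation counts taken from the sample records in the tutorials above.
records = [
    {"doi": "10.1038/nature12373", "citationCount": 1240},
    {"doi": "10.1016/j.renene.2026.03.001", "citationCount": 0},
    {"doi": "10.1145/3495243.3560528", "citationCount": 88450},
]
print([r["doi"] for r in top_cited(records, 2)])
# ['10.1145/3495243.3560528', '10.1038/nature12373']
```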

Input Parameters

All Modes

Parameter	Type	Default	Required	Description
query	string	(blank)	No	Free-text search query (e.g., "machine learning", "COVID-19"). Leave blank to fetch recent works. Case-insensitive.
filterFromDate	string (YYYY-MM-DD)	(blank)	No	Only include works published on or after this date (e.g., "2024-01-01"). Leave blank for all dates.
workType	string	(blank)	No	Filter by work type. Common values: `journal-article`, `book`, `proceedings`, `report`, `dataset`. Leave blank for all types.
maxResults	integer	50	No	Maximum works to fetch (1–5,000). Default is 50.
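When inputs are generated by a script, it can help to validate them against the table above before starting a run. The sketch below is an assumption about how one might do that; `build_input` is a hypothetical helper, and only the four field names and the 1 to 5,000 range come from the table.

```python
from datetime import datetime


def build_input(query="", filter_from_date="", work_type="", max_results=50):
    """Return an input dict for the Actor, raising ValueError on bad values."""
    if not 1 <= max_results <= 5000:
        raise ValueError("maxResults must be between 1 and 5000")
    if filter_from_date:
        # strptime raises ValueError for strings that are not calendar dates.
        datetime.strptime(filter_from_date, "%Y-%m-%d")
    return {
        "query": query,
        "filterFromDate": filter_from_date,
        "workType": work_type,
        "maxResults": max_results,
    }


print(build_input(query="machine learning", work_type="journal-article", max_results=100))
```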

Common Work Types

journal-article — Peer-reviewed journal articles
proceedings-article — Individual conference papers
proceedings — Whole conference proceedings volumes
book — Complete books
book-chapter — Chapters within books
report — Technical reports, white papers
dataset — Data publications
dissertation — Theses and dissertations
component — Article components (figures, tables, appendices)

Full list: Visit https://github.com/CrossRef/rest-api-doc#work-types

Output Schema

Each record contains:

Field	Type	Example	Description
doi	string	`10.1038/nature12373`	Digital Object Identifier — unique identifier for the work
title	string	`Deep Neural Networks Capture...`	Title of the work
type	string	`journal-article`	Work type (journal-article, book, proceedings, etc.)
publisher	string	`Nature Publishing Group`	Publisher name
journal	string	`Nature`	Journal or container name (empty for books)
publishedDate	string	`2024-03-15`	Publication date (YYYY-MM-DD, YYYY-MM, or YYYY format)
authorsCount	integer	5	Number of authors
firstAuthor	string	`Antolik Mark`	First author's name, family name first as in the sample records
citationCount	integer	1240	Number of works that cite this work (Crossref field is-referenced-by-count)
referenceCount	integer	45	Number of works referenced by this work
issn	string	`0028-0836`	International Standard Serial Number (for journals)
url	string	`https://doi.org/10.1038/nature12373`	Persistent URL to the work via DOI
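Because `publishedDate` can arrive at three precisions, downstream code should not assume a full date. A minimal sketch of one way to handle this; `parse_published_date` is a hypothetical helper.

```python
def parse_published_date(value):
    """Split 'YYYY', 'YYYY-MM' or 'YYYY-MM-DD' into (year, month, day).

    Parts that are absent from the string come back as None.
    """
    if not value:
        return (None, None, None)
    parts = [int(p) for p in value.split("-")]
    if len(parts) > 3:
        raise ValueError(f"unexpected date format: {value!r}")
    parts += [None] * (3 - len(parts))
    return tuple(parts)


print(parse_published_date("2024-03-15"))  # (2024, 3, 15)
print(parse_published_date("2024-03"))     # (2024, 3, None)
print(parse_published_date("2017"))        # (2017, None, None)
```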

Pricing

This Actor uses the free Crossref REST API, which needs no API key or other authentication. You pay only for Apify compute time.

Compute cost: ~$0.0001–0.001 per run (depends on result volume and API latency)
Typical cost per batch: $0.01–0.10 for 50–500 works
Bulk runs (1000–5000 works): ~$0.10–0.50 per run

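The Crossref API itself is free: no subscriptions and no per-request charges. Crossref does rate-limit requests; this Actor identifies itself with a contact address so that it is served from the polite pool (see Legal & Support).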

Example Workflows

Workflow 1: Weekly Research Digest Pipeline

Run Actor every Monday with filterFromDate set to last 7 days
Extract results to cloud storage (CSV/JSON export)
Feed into email template to send digest to stakeholders
Cost: ~$0.02/week

Workflow 2: Citation Network Analysis (Research Project)

Run Actor with query = your domain (e.g., "quantum computing")
Extract top 500 results (maxResults = 500)
Load into network analysis tool (Gephi, Cytoscape)
Visualize author collaborations and citation influence
Cost: ~$0.05 per analysis run

Workflow 3: Automated Literature Review

Run Actor monthly with your research keywords
Filter by workType = "journal-article"
Combine with external citation tools (Semantic Scholar, OpenAlex)
Build automated bibliography in BibTeX or RIS format
Cost: ~$0.01/month per search term
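The BibTeX step in this workflow could be sketched as follows. This is an illustration under stated assumptions, not the Actor's own export: `to_bibtex` is a hypothetical helper, it knows only the first author (the output schema carries no full author list), and it does not escape special characters.

```python
import re


def to_bibtex(record):
    """Render one output record as a BibTeX entry (no special-character escaping)."""
    entry_type = "article" if record.get("type") == "journal-article" else "misc"
    # Citation key: the DOI with every non-alphanumeric run replaced by "_".
    key = re.sub(r"[^A-Za-z0-9]+", "_", record["doi"])
    name = record.get("firstAuthor", "")
    author = ""
    if name:
        # Extra braces keep the name literal, since it is stored family name first.
        author = "{" + name + "}"
        if record.get("authorsCount", 0) > 1:
            author += " and others"
    fields = {
        "title": record.get("title", ""),
        "author": author,
        "journal": record.get("journal", ""),
        "year": record.get("publishedDate", "")[:4],
        "doi": record["doi"],
        "url": record.get("url", ""),
    }
    lines = [f"@{entry_type}{{{key},"]
    lines += [f"  {field} = {{{value}}}," for field, value in fields.items() if value]
    lines.append("}")
    return "\n".join(lines)


# Values copied from the Tutorial 1 sample record.
print(to_bibtex({
    "doi": "10.1038/nature12373",
    "title": "Deep Neural Networks Capture Context-Dependent Neural Activity in the Primate Visual System",
    "type": "journal-article",
    "journal": "Nature",
    "publishedDate": "2024-03-15",
    "authorsCount": 5,
    "firstAuthor": "Antolik Mark",
    "url": "https://doi.org/10.1038/nature12373",
}))
```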

FAQ

"No works found" when searching

Verify the query: Try a simpler term (e.g., "cancer" instead of "advanced oncology research methodologies")
Check Crossref directly: https://search.crossref.org to validate query
Try with blank query: Leave search blank to fetch recent works and verify the actor is working
Expand date range: Remove filterFromDate to include older works

Empty or incomplete author names

Some works have missing or incomplete author metadata in Crossref's database
The firstAuthor field will be empty if author data is unavailable
Crossref's data quality depends on publisher submission quality
Check the URL (DOI link) for author details if needed

Missing ISSN or journal name

Not all works have journal information (e.g., books, datasets, preprints)
ISSN is present mainly for works published in serials (journals and some proceedings series); other types may have an empty issn
The journal field corresponds to container-title in Crossref (may be empty for non-journal works)
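The empty-field behavior described in the two FAQ entries above follows from how Crossref shapes its raw records. The sketch below shows one plausible mapping from a raw Crossref work item to this Actor's output fields; it is an assumption, not the Actor's actual code, and `first` and `to_record` are hypothetical names. The Crossref keys used (`DOI`, `title`, `container-title`, `issued`, `author`, `is-referenced-by-count`, `references-count`, `ISSN`) are real fields of the works API.

```python
def first(values):
    """Crossref returns several fields as lists; take the first entry or ''."""
    return values[0] if values else ""


def to_record(item):
    """Flatten one raw Crossref work item into the output shape shown in this README."""
    authors = item.get("author") or []
    lead = authors[0] if authors else {}
    # Family name first, matching the sample records in this README.
    first_author = " ".join(p for p in (lead.get("family"), lead.get("given")) if p)

    # "issued" holds date-parts such as [[2024, 3, 15]], [[2024, 3]] or [[None]].
    date_parts = (item.get("issued") or {}).get("date-parts") or [[]]
    parts = [p for p in date_parts[0] if p is not None]
    published = "-".join(f"{p:04d}" if i == 0 else f"{p:02d}" for i, p in enumerate(parts))

    doi = item.get("DOI", "")
    return {
        "doi": doi,
        "title": first(item.get("title")),
        "type": item.get("type", ""),
        "publisher": item.get("publisher", ""),
        "journal": first(item.get("container-title")),
        "publishedDate": published,
        "authorsCount": len(authors),
        "firstAuthor": first_author,
        "citationCount": item.get("is-referenced-by-count", 0),
        "referenceCount": item.get("references-count", 0),
        "issn": first(item.get("ISSN")),
        "url": f"https://doi.org/{doi}" if doi else "",
    }


# A work with no metadata at all still yields a complete, empty record.
print(to_record({}))
```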

Result limits (maxResults > 5000)

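The Actor caps maxResults at 5,000 per run; Crossref's cursor-based pagination itself is designed for paging deeper than that
For larger datasets, split the job into several runs (for example by work type or by narrower queries) and merge the results by DOI
Because filterFromDate has no matching end date, per-year splits such as 2024 and 2023 need the workaround described under Filtering Tips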
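Results from several runs are likely to overlap, so they need de-duplicating when combined. A minimal sketch, assuming each run has been exported as a list of records; `merge_runs` is a hypothetical helper. DOIs are compared in lower case because DOI names are case-insensitive.

```python
def merge_runs(*runs):
    """Concatenate several result lists, keeping the first record seen per DOI."""
    seen = set()
    merged = []
    for run in runs:
        for record in run:
            key = (record.get("doi") or "").lower()
            if key and key in seen:
                continue  # duplicate of a record from an earlier run
            if key:
                seen.add(key)
            merged.append(record)  # records without a DOI are kept as-is
    return merged


run_a = [{"doi": "10.1038/nature12373"}, {"doi": "10.1109/CVPR52688.2022.00988"}]
run_b = [{"doi": "10.1109/cvpr52688.2022.00988"}, {"doi": "10.1038/s41586-024-07301-x"}]
print(len(merge_runs(run_a, run_b)))  # 3
```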

API timeout or slow responses

Crossref API is generally fast but can have occasional latency spikes
Actor has a 60-second timeout per API request; retries are automatic
If timeouts occur frequently, reduce maxResults and run multiple smaller batches

Advanced Usage

Combining Filters

You can combine query, filterFromDate, and workType in a single run:

Example: Find conference papers on "quantum computing" published since 2024:

Query: quantum computing
Work Type: proceedings-article
Filter From Date: 2024-01-01
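For readers who want to reproduce a combined query against Crossref directly, the sketch below shows how the three inputs could translate into request parameters. The mapping is an assumption about the Actor's internals and `crossref_params` is a hypothetical helper; `query`, `filter`, `rows`, `cursor`, and the filter names `from-pub-date` and `type` are real parameters of the Crossref works endpoint.

```python
from urllib.parse import urlencode


def crossref_params(query="", from_date="", work_type="", rows=100, cursor="*"):
    """Build query parameters for https://api.crossref.org/works."""
    params = {"rows": rows, "cursor": cursor}
    if query:
        params["query"] = query
    filters = []
    if from_date:
        filters.append(f"from-pub-date:{from_date}")
    if work_type:
        filters.append(f"type:{work_type}")
    if filters:
        # Crossref combines several filters in one comma-separated parameter.
        params["filter"] = ",".join(filters)
    return params


params = crossref_params("quantum computing", "2024-01-01", "journal-article")
print(params["filter"])  # from-pub-date:2024-01-01,type:journal-article
print("https://api.crossref.org/works?" + urlencode(params))
```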

Pagination & Large Extracts

The actor uses Crossref's cursor-based pagination internally. Each API request fetches up to 100 results; the actor automatically loops to fetch up to your maxResults limit.

Requesting 5,000 results requires ~50 API calls
Cost scales linearly: 5x results ≈ 5x cost (but still under $0.50)
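The loop described above can be sketched as follows. This is not the Actor's source: `fetch_all` is a hypothetical helper that takes the HTTP call as a function argument, and the demo uses a fake fetcher with placeholder DOIs so it runs offline. The `message`, `items`, and `next-cursor` keys match the shape of real Crossref responses.

```python
def fetch_all(fetch_page, max_results, rows=100):
    """Follow Crossref-style cursors until max_results items are collected.

    fetch_page(cursor, rows) must return a dict shaped like a Crossref response.
    Returns (items, number_of_calls).
    """
    items, cursor, calls = [], "*", 0
    while len(items) < max_results:
        want = min(rows, max_results - len(items))
        message = fetch_page(cursor, want)["message"]
        calls += 1
        batch = message.get("items", [])
        if not batch:
            break  # result set exhausted
        items.extend(batch[:want])
        cursor = message.get("next-cursor")
        if not cursor:
            break
    return items, calls


def fake_fetch(cursor, rows):
    """Stand-in for the HTTP request: serves 12,000 placeholder works."""
    start = 0 if cursor == "*" else int(cursor)
    end = min(start + rows, 12_000)
    works = [{"DOI": f"10.0000/fake-{i}"} for i in range(start, end)]
    return {"message": {"items": works, "next-cursor": str(end)}}


items, calls = fetch_all(fake_fetch, 5000)
print(len(items), calls)  # 5000 50
```

With 100 rows per request, the call count is the result count divided by 100, rounded up.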

Filtering Tips

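By date range: Use filterFromDate. There is no "to date" parameter, so the filter is open-ended: it returns everything published on or after the given date.

To get works from 2024 only, run once with filterFromDate=2024-01-01, run again with filterFromDate=2025-01-01, then drop from the first result set every DOI that also appears in the second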
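The two-run exclusion can be sketched as below, alongside a simpler alternative that filters a single run on `publishedDate`. Both helpers are hypothetical. The exclusion approach is only reliable when neither run was cut short by maxResults.

```python
def exclude_later(from_start, from_end):
    """Keep records from the first run whose DOI is absent from the second run."""
    later = {r["doi"].lower() for r in from_end}
    return [r for r in from_start if r["doi"].lower() not in later]


def published_in_year(records, year):
    """Alternative: keep records whose publishedDate starts with the given year."""
    return [r for r in records if r.get("publishedDate", "").startswith(str(year))]


# Dates and DOIs taken from the sample records in this README.
since_2024 = [
    {"doi": "10.1038/s41586-024-07301-x", "publishedDate": "2024-05-08"},
    {"doi": "10.1016/j.renene.2026.03.001", "publishedDate": "2026-03-20"},
]
since_2025 = [{"doi": "10.1016/j.renene.2026.03.001", "publishedDate": "2026-03-20"}]

print([r["doi"] for r in exclude_later(since_2024, since_2025)])
# ['10.1038/s41586-024-07301-x']
print([r["doi"] for r in published_in_year(since_2024, 2024)])
# ['10.1038/s41586-024-07301-x']
```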

By work type: Common types are listed above; others exist but are rare

By publisher: Not a direct input, but you can add publisher names to your query text (e.g., "machine learning IEEE" to bias toward IEEE publications)

Output Examples

Example 1: Journal Article

{
  "doi": "10.1038/s41586-024-07301-x",
  "title": "AlphaFold 3: Structure Prediction for Biology",
  "type": "journal-article",
  "publisher": "Nature Publishing Group",
  "journal": "Nature",
  "publishedDate": "2024-05-08",
  "authorsCount": 47,
  "firstAuthor": "Abramson Josh",
  "citationCount": 450,
  "referenceCount": 86,
  "issn": "0028-0836",
  "url": "https://doi.org/10.1038/s41586-024-07301-x"
}

Example 2: Conference Proceedings

{
  "doi": "10.1109/CVPR52688.2022.00988",
  "title": "ViLBERT: Pretraining Task-Agnostic Visiolinguistic Representations for Vision-and-Language Tasks",
  "type": "proceedings-article",
  "publisher": "IEEE",
  "journal": "2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)",
  "publishedDate": "2022-06-19",
  "authorsCount": 3,
  "firstAuthor": "Lu Jiasen",
  "citationCount": 2100,
  "referenceCount": 52,
  "issn": "2575-7075",
  "url": "https://doi.org/10.1109/CVPR52688.2022.00988"
}

Example 3: Book

{
  "doi": "10.1016/b978-0-08-102618-8.00001-3",
  "title": "Sustainable Materials and Manufacturing",
  "type": "book",
  "publisher": "Elsevier",
  "journal": "",
  "publishedDate": "2023-09-15",
  "authorsCount": 12,
  "firstAuthor": "Smith Richard",
  "citationCount": 85,
  "referenceCount": 203,
  "issn": "",
  "url": "https://doi.org/10.1016/b978-0-08-102618-8.00001-3"
}

Related Actors

Looking for complementary research data sources?

Open Research Online (Crossref-based) — Alternative Crossref interface
DOAJ Open Journals Scraper — Extract open-access journals
ROR Research Organizations Scraper — Academic institution metadata
FRED Economic Data Scraper — Economic time series for research context

API Reference

For detailed Crossref API documentation:

Crossref REST API Docs: https://github.com/CrossRef/rest-api-doc
Search Guide: https://github.com/CrossRef/rest-api-doc#queries
Filter Guide: https://github.com/CrossRef/rest-api-doc#filter-names
Work Types: https://github.com/CrossRef/rest-api-doc#work-types
Crossref Search Interface: https://search.crossref.org

Legal & Support

Disclaimer: This Actor fetches data from Crossref (https://www.crossref.org), a non-profit digital object identifier (DOI) registration agency. Crossref data is provided under the CC0 1.0 Universal (Public Domain Dedication) license and is free to use for any purpose. Crossref's terms: https://www.crossref.org/documentation/metadata-plus-service/metadata-plus-service-terms-and-conditions/

Support: If you encounter issues:

Check the Crossref API documentation: https://github.com/CrossRef/rest-api-doc
Test your query directly: https://search.crossref.org
Verify work types: https://github.com/CrossRef/rest-api-doc#work-types
Open an issue on Apify Community or contact support

User-Agent: This Actor identifies itself as apify-factory/1.0 (mailto:bciccarelli6@gmail.com) to access Crossref's polite pool (higher rate limits for well-behaved agents).

Built with ❤️ for researchers, academics, and bibliometricians.
