Open Citations Scraper

Pricing: Pay per event

Comprehensive OpenCitations scraper for extracting citation and reference data from the OpenCitations API. Built for researchers, academics, and data scientists who need automated access to citation networks, bibliographic metadata, and citation analysis data.

Rating: 0.0 (0)

Developer: ParseForge (maintained by community)

Actor stats: 0 bookmarks · 3 total users · 2 monthly active users · last modified 4 days ago

📚 OpenCitations Scraper

🚀 Extract citation networks and bibliographic metadata from OpenCitations in seconds. Search by DOI, PMID, or OMID. No coding, no API keys required.

🕒 Last updated: 2026-04-16 · 📊 20 fields · 🔍 Citations and references modes · 📄 Optional detailed metadata

OpenCitations is an open scholarly infrastructure providing free access to citation data from millions of academic publications. This scraper collects citation relationships, self-citation flags, and optional bibliographic metadata (authors, titles, venues, publication dates) for any publication identified by DOI, PubMed ID, or OpenCitations Meta ID. Choose between citations mode (who cited this work) and references mode (what this work cites) to map research influence in either direction.

Researchers, bibliometric analysts, and data scientists use this actor to build citation networks, track research impact, identify self-citations, and analyze how knowledge flows between publications. Instead of querying the OpenCitations API manually and parsing responses, you get clean, structured data exported as JSON, CSV, or Excel. Every record includes the citing and cited entity IDs, creation date, timespan, and self-citation flags. With metadata enabled, each record also carries the title, authors, publication date, venue, and publisher.
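As a rough illustration of the "citation network" use case, the exported records can be folded into a simple adjacency map. This is a minimal sketch, not part of the actor: the function name `build_citation_graph` and the placeholder identifiers are hypothetical, and it assumes that failed rows carry a non-empty `error` field.

```python
from collections import defaultdict


def build_citation_graph(records):
    """Map each citing ID to the sorted list of IDs it cites."""
    graph = defaultdict(set)
    for record in records:
        # Assumption: rows that failed processing have a non-empty "error".
        if record.get("error"):
            continue
        citing, cited = record.get("citing"), record.get("cited")
        if citing and cited:
            graph[citing].add(cited)
    return {node: sorted(targets) for node, targets in graph.items()}


# Hypothetical records; the identifiers are placeholders, not real output.
sample = [
    {"citing": "paper-A", "cited": "paper-C"},
    {"citing": "paper-B", "cited": "paper-C"},
    {"citing": "paper-A", "cited": "paper-D"},
    {"citing": "paper-A", "cited": "paper-C"},  # duplicate edge collapses
    {"error": "lookup failed"},
]
graph = build_citation_graph(sample)
print(graph)
```

A plain dictionary keeps the sketch dependency-free; the same edge list could be handed to a graph library for centrality or clustering analysis.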

| 🎯 Target Audience | 💡 Use Cases |
| --- | --- |
| Bibliometric analysts | Map citation networks and measure impact |
| Academic researchers | Track who cites your publications |
| University administrators | Evaluate research impact for departments |
| Science policy makers | Analyze knowledge flow between institutions |
| Data scientists | Build citation graph datasets for analysis |
| Librarians | Enrich catalog records with citation data |

📋 What the OpenCitations Scraper does

  • 🔍 DOI-based search to find citations or references for any published work
  • 🆔 PMID support for biomedical publications indexed in PubMed
  • 📋 OMID support for OpenCitations internal identifier lookups
  • 🔄 Bidirectional search with citations (incoming) and references (outgoing) modes
  • 📊 Self-citation detection with flags for author and journal self-citations
  • 📝 Optional metadata including titles, authors, venues, and publication dates

The scraper queries the OpenCitations API with your identifier and search type, retrieves the matching citation relationships (up to maxItems), and extracts structured data for each record. When metadata is enabled, it also fetches detailed bibliographic information for each citing or cited work. Results include unique citation identifiers (OCI), entity IDs, creation dates, timespans, self-citation flags, and, with metadata enabled, full publication metadata.
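The two-step flow described above (fetch citation links, then optionally enrich each one) can be pictured with a small sketch. This is not the actor's implementation: `collect`, `fetch_links`, and `fetch_metadata` are hypothetical names, and the fetchers are passed in as plain callables so the example runs without network access.

```python
def collect(identifier, search_type, fetch_links, fetch_metadata,
            include_metadata=False, max_items=None):
    """Fetch citation links and optionally merge metadata into each record."""
    records = []
    for link in fetch_links(identifier, search_type):
        if max_items is not None and len(records) >= max_items:
            break
        record = dict(link)
        if include_metadata:
            # Citations mode enriches the citing work,
            # references mode enriches the cited work.
            key = "citing" if search_type == "citations" else "cited"
            record.update(fetch_metadata(link[key]))
        records.append(record)
    return records


# Stub fetchers standing in for real API calls (hypothetical data).
def fake_links(identifier, search_type):
    return [
        {"oci": "oci-1", "citing": "work-1", "cited": identifier},
        {"oci": "oci-2", "citing": "work-2", "cited": identifier},
    ]


def fake_metadata(work_id):
    return {"title": f"Title of {work_id}"}


rows = collect("work-0", "citations", fake_links, fake_metadata,
               include_metadata=True, max_items=1)
print(rows)
```

Enrichment costs one extra lookup per record in this sketch, which is consistent with the note that enabling metadata makes runs slower.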

💡 Why it matters: Manually collecting citation data from OpenCitations involves API queries, pagination, and metadata enrichment. This scraper handles everything automatically, delivering structured citation networks ready for analysis, visualization, or integration with other research tools.


🎬 Full Demo

🚧 Coming soon...


⚙️ Input

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| maxItems | integer | No | Max records to collect. Free: up to 10. Paid: up to 1,000,000 |
| doi | string | No | Digital Object Identifier (e.g., 10.1016/j.jmb.2005.08.075) |
| pmid | string | No | PubMed ID for biomedical publications |
| omid | string | No | OpenCitations Meta Identifier (e.g., omid:br/06140242082) |
| searchType | string | No | Search direction: citations (incoming) or references (outgoing) |
| includeMetadata | boolean | No | Fetch detailed metadata (title, authors, date) for each record |

Example 1: Get citations for a DOI

{
  "doi": "10.1016/j.jmb.2005.08.075",
  "searchType": "citations",
  "includeMetadata": true,
  "maxItems": 50
}

Example 2: Get references from a PubMed article

{
  "pmid": "16325459",
  "searchType": "references",
  "includeMetadata": true,
  "maxItems": 100
}

⚠️ Good to Know: Provide one identifier (DOI, PMID, or OMID), not multiple. Enabling metadata makes the scraper slower but provides full bibliographic details for each citation. The default search type is "citations" (incoming citations).
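If you build inputs programmatically, a small client-side check can enforce the one-identifier rule before a run is started. This is a minimal sketch: `build_input` is a hypothetical helper, and rejecting an input with no identifier at all is an assumption of the sketch rather than documented actor behavior.

```python
def build_input(doi=None, pmid=None, omid=None, search_type="citations",
                include_metadata=False, max_items=None):
    """Return an actor input dict, enforcing exactly one identifier."""
    provided = {
        name: value
        for name, value in {"doi": doi, "pmid": pmid, "omid": omid}.items()
        if value
    }
    if len(provided) != 1:
        raise ValueError("Provide exactly one of doi, pmid, or omid")
    if search_type not in ("citations", "references"):
        raise ValueError("searchType must be 'citations' or 'references'")

    run_input = dict(provided)
    run_input["searchType"] = search_type
    run_input["includeMetadata"] = include_metadata
    if max_items is not None:
        run_input["maxItems"] = max_items
    return run_input


example_input = build_input(pmid="16325459", search_type="references",
                            include_metadata=True, max_items=100)
print(example_input)
```

The returned dictionary has the same shape as Example 2 above and can be passed as the run input in the Python client.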


📊 Output

🧾 Schema

| Emoji | Field | Type | Description |
| --- | --- | --- | --- |
| 📝 | oci | string | Unique Open Citation Identifier |
| 👤 | citing | string | Identifier of the citing entity |
| 👤 | cited | string | Identifier of the cited entity |
| 📅 | creationDate | string | When the citation relationship was recorded |
| ⏱️ | timespan | string | Time between the publication dates of the citing and cited works |
| 📊 | journalSelfCitation | boolean | Whether the citation is within the same journal |
| 📊 | authorSelfCitation | boolean | Whether the author cites their own work |
| 📝 | title | string | Publication title (with metadata enabled) |
| 👥 | authors | string | Author names (with metadata enabled) |
| 📅 | publicationDate | string | Publication date (with metadata enabled) |
| 📖 | volume | string | Journal volume |
| 📄 | issue | string | Journal issue |
| 📍 | venue | string | Journal or venue name |
| 🏷️ | publicationType | string | Type of publication |
| 📄 | page | string | Page range |
| 🏢 | publisher | string | Publisher name |
| ✏️ | editor | string | Editor name |
| 🆔 | workId | string | Internal work identifier |
| | scrapedAt | string | Collection timestamp |
| ⚠️ | error | string | Error message if processing failed |
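For typed downstream code, the schema above can be mirrored as a `TypedDict`. This is a sketch under assumptions: the class name `CitationRecord` and the helper `split_errors` are hypothetical, every field is treated as optional because metadata fields only appear when metadata is enabled, and successful rows are assumed to have a missing or empty `error`.

```python
from typing import TypedDict


class CitationRecord(TypedDict, total=False):
    """Field names and types copied from the output schema table."""
    oci: str
    citing: str
    cited: str
    creationDate: str
    timespan: str
    journalSelfCitation: bool
    authorSelfCitation: bool
    title: str
    authors: str
    publicationDate: str
    volume: str
    issue: str
    venue: str
    publicationType: str
    page: str
    publisher: str
    editor: str
    workId: str
    scrapedAt: str
    error: str


def split_errors(records):
    """Separate usable rows from rows that report a processing error."""
    ok = [r for r in records if not r.get("error")]
    failed = [r for r in records if r.get("error")]
    return ok, failed


# Hypothetical rows for illustration only.
ok_rows, failed_rows = split_errors([
    {"oci": "oci-1", "citing": "work-1", "cited": "work-0"},
    {"error": "lookup failed"},
])
print(len(ok_rows), len(failed_rows))
```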

✨ Why choose this Actor

| Feature | Details |
| --- | --- |
| 🔍 Three identifier types | Search by DOI, PubMed ID, or OpenCitations Meta ID |
| 🔄 Bidirectional search | Find incoming citations or outgoing references |
| 📊 Self-citation detection | Flags for author and journal self-citations |
| 📝 Optional metadata | Full bibliographic details when enabled |
| 🆓 Open data | All OpenCitations data is freely available |
| 📦 Flexible export | JSON, CSV, or Excel output |
| ⚡ Automatic pagination | Handles large citation networks automatically |

📊 Map citation networks for any publication with up to 1,000,000 records per run, including self-citation detection and full metadata.


📈 How it compares to alternatives

| Feature | This Actor | Manual API Queries | Generic Scrapers |
| --- | --- | --- | --- |
| DOI, PMID, and OMID support | | Manual | |
| Self-citation detection | | | |
| Optional metadata enrichment | | Manual | |
| Bidirectional search | | Manual | |
| Bulk collection (up to 1M records) | | Manual | |
| Structured JSON/CSV output | | JSON only | Varies |
| Scheduled runs | | | |

Get structured citation data at scale without writing API code or managing pagination.


🚀 How to use

  1. Create an Apify account - Sign up free with $5 credit
  2. Open the OpenCitations Scraper - Navigate to the actor page on Apify
  3. Enter a DOI, PMID, or OMID - Provide the identifier for the publication you want to analyze
  4. Choose search type and options - Select citations or references mode and enable metadata if needed
  5. Click Start - The actor collects citation relationships and delivers structured data

⏱️ A typical run with 50 citations completes in under 1 minute.


💼 Business use cases

📊 Bibliometric Analysis
  • Map citation networks for research impact assessment
  • Identify self-citations to calculate adjusted metrics
  • Track citation accumulation over time
  • Compare citation patterns across disciplines
🎓 Academic Research
  • Build citation graphs for literature reviews
  • Track who is citing your publications
  • Identify influential papers in your field
  • Analyze reference patterns in competitor research
🏛️ Research Administration
  • Evaluate faculty research impact for reviews
  • Track department-level citation metrics
  • Monitor publication influence across programs
  • Build reporting dashboards for stakeholders
📈 Data Science
  • Build citation graph datasets for network analysis
  • Train models on citation prediction tasks
  • Analyze knowledge flow between research fields
  • Create visualization datasets for research mapping
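Two of the bibliometric tasks listed above, adjusted counts that exclude self-citations and citation accumulation over time, reduce to simple aggregations over the output records. The sketch below uses hypothetical helper names and sample rows, and it assumes `creationDate` starts with a four-digit year; check your own output before relying on that.

```python
from collections import Counter


def citation_summary(records):
    """Count total, self, and independent citations using the two flags."""
    author_self = sum(1 for r in records if r.get("authorSelfCitation"))
    journal_self = sum(1 for r in records if r.get("journalSelfCitation"))
    independent = sum(
        1 for r in records
        if not r.get("authorSelfCitation") and not r.get("journalSelfCitation")
    )
    return {
        "total": len(records),
        "authorSelf": author_self,
        "journalSelf": journal_self,
        "independent": independent,
    }


def citations_per_year(records):
    """Group records by the year prefix of creationDate (assumed format)."""
    years = Counter(
        r["creationDate"][:4] for r in records if r.get("creationDate")
    )
    return dict(sorted(years.items()))


# Hypothetical rows for illustration only.
sample_rows = [
    {"creationDate": "2019-05-14", "authorSelfCitation": False, "journalSelfCitation": False},
    {"creationDate": "2019-11", "authorSelfCitation": True, "journalSelfCitation": False},
    {"creationDate": "2021", "authorSelfCitation": False, "journalSelfCitation": True},
    {"creationDate": "2021-02-01", "authorSelfCitation": False, "journalSelfCitation": False},
]
summary = citation_summary(sample_rows)
per_year = citations_per_year(sample_rows)
print(summary)
print(per_year)
```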

🔌 Automating OpenCitations Scraper

Integrate the OpenCitations Scraper into your workflow using the Apify API or client libraries.

Node.js:

import { ApifyClient } from 'apify-client';

// Authenticate with your Apify API token
const client = new ApifyClient({ token: 'YOUR_API_TOKEN' });

// Start the actor and wait for the run to finish
const run = await client.actor("parseforge/open-citations-scraper").call({
    doi: "10.1016/j.jmb.2005.08.075",
    searchType: "citations",
    includeMetadata: true,
    maxItems: 100
});

// Read the collected records from the run's default dataset
const { items } = await client.dataset(run.defaultDatasetId).listItems();
console.log(items);

Python:

from apify_client import ApifyClient

# Authenticate with your Apify API token
client = ApifyClient("YOUR_API_TOKEN")

# Start the actor and wait for the run to finish
run = client.actor("parseforge/open-citations-scraper").call(run_input={
    "doi": "10.1016/j.jmb.2005.08.075",
    "searchType": "citations",
    "includeMetadata": True,
    "maxItems": 100
})

# Read the collected records from the run's default dataset
items = list(client.dataset(run["defaultDatasetId"]).iterate_items())
print(items)

Schedules: Set up recurring runs to monitor citation growth for your publications. Configure weekly or monthly schedules from the Apify Console to track new citations automatically.
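To turn scheduled runs into a "what is new since last time" report, you can compare the records of two runs by their OCI. This is a minimal sketch: `new_citations` is a hypothetical helper, the sample rows are placeholders, and it assumes the `oci` value for a given citation stays the same across runs.

```python
def new_citations(previous, current):
    """Return records from the current run whose OCI was not seen before."""
    seen = {r["oci"] for r in previous if r.get("oci")}
    return [r for r in current if r.get("oci") and r["oci"] not in seen]


# Hypothetical datasets from two consecutive scheduled runs.
last_week = [{"oci": "oci-1"}, {"oci": "oci-2"}]
this_week = [{"oci": "oci-1"}, {"oci": "oci-2"}, {"oci": "oci-3"}]
fresh = new_citations(last_week, this_week)
print(fresh)
```

In practice, `previous` and `current` would be the item lists read from the datasets of two runs, for example with the Python client shown above.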


🔌 Integrate with any app

  • 🔗 Make (Integromat) - Connect citation data to Google Sheets, Notion, or any of 1,500+ apps
  • 🔗 Zapier - Trigger workflows when new citations are detected
  • 🔗 Slack - Get notified when new citations appear for your publications
  • 🔗 Airbyte - Stream citation data into your data warehouse
  • 🔗 GitHub - Store citation datasets in repositories for version control
  • 🔗 Google Drive - Automatically save CSV exports to shared folders

| Actor | Description |
| --- | --- |
| Crossref Scraper | Extract DOI metadata for 155M+ research publications |
| PubMed Citation Scraper | Extract publication metadata from PubMed for biomedical research |
| Open Library Scraper | Search and download book data from the Internet Archive |
| ROR Scraper | Collect research organization data from ROR |
| US Census Bureau Scraper | Extract demographic and economic data from the Census Bureau |
💡 Pro Tip: Combine the OpenCitations Scraper with the Crossref Scraper to get both citation networks and full publication metadata for each cited work.


🆘 Need Help? Open our contact form and we will get back to you within 24 hours. We are happy to help with custom setups, integrations, or feature requests.


Disclaimer: This actor is not affiliated with, endorsed by, or connected to OpenCitations. It accesses publicly available data through the OpenCitations API. Use responsibly and in accordance with applicable terms of service.