Open Citations Scraper
Pricing
Pay per event
Comprehensive OpenCitations scraper for extracting citation and reference data from the OpenCitations API. Built for researchers, academics, and data scientists who need automated access to citation networks, bibliographic metadata, and citation analysis data.
Developer
ParseForge
📚 OpenCitations Scraper
🚀 Extract citation networks and bibliographic metadata from OpenCitations in seconds. Search by DOI, PMID, or OMID. No coding, no API keys required.
🕒 Last updated: 2026-04-16 · 📊 20 fields · 🔍 Citations and references modes · 📄 Optional detailed metadata
OpenCitations is an open scholarly infrastructure providing free access to citation data from millions of academic publications. This scraper collects citation relationships, self-citation flags, and optional bibliographic metadata (authors, titles, venues, publication dates) for any publication identified by DOI, PubMed ID, or OpenCitations Meta ID. Choose between citations mode (who cited this work) and references mode (what this work cites) to map research influence in either direction.
Researchers, bibliometric analysts, and data scientists use this actor to build citation networks, track research impact, identify self-citations, and analyze how knowledge flows between publications. Instead of querying the OpenCitations API manually and parsing responses, you get clean, structured data exported as JSON, CSV, or Excel. With metadata enabled, every record includes the citing and cited entity IDs, creation date, timespan, self-citation flags, plus the full title, authors, publication date, venue, and publisher.
| 🎯 Target Audience | 💡 Use Cases |
|---|---|
| Bibliometric analysts | Map citation networks and measure impact |
| Academic researchers | Track who cites your publications |
| University administrators | Evaluate research impact for departments |
| Science policy makers | Analyze knowledge flow between institutions |
| Data scientists | Build citation graph datasets for analysis |
| Librarians | Enrich catalog records with citation data |
📋 What the OpenCitations Scraper does
- 🔍 DOI-based search to find citations or references for any published work
- 🆔 PMID support for biomedical publications indexed in PubMed
- 📋 OMID support for OpenCitations internal identifier lookups
- 🔄 Bidirectional search with citations (incoming) and references (outgoing) modes
- 📊 Self-citation detection with flags for author and journal self-citations
- 📝 Optional metadata including titles, authors, venues, and publication dates
The scraper queries the OpenCitations API with your identifier and search type, retrieves all matching citation relationships, and extracts structured data for each record. When metadata is enabled, it also fetches detailed bibliographic information for each citing or cited work. Results include unique citation identifiers (OCI), entity IDs, creation dates, timespans, self-citation flags, and full publication metadata.
💡 Why it matters: Manually collecting citation data from OpenCitations involves API queries, pagination, and metadata enrichment. This scraper handles everything automatically, delivering structured citation networks ready for analysis, visualization, or integration with other research tools.
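As a rough illustration of what "ready for analysis" can mean, the sketch below turns exported records into a simple adjacency map. It assumes only the `citing` and `cited` fields from the output schema. The `build_citation_graph` helper and the sample identifiers are hypothetical placeholders, not part of the actor.

```python
from collections import defaultdict


def build_citation_graph(records):
    """Map each citing ID to the sorted list of IDs it cites."""
    graph = defaultdict(set)
    for record in records:
        citing, cited = record.get("citing"), record.get("cited")
        # Skip rows without both endpoints (for example, error rows).
        if citing and cited:
            graph[citing].add(cited)
    return {node: sorted(targets) for node, targets in graph.items()}


# Placeholder records; real identifiers come from the actor's dataset.
sample_records = [
    {"citing": "paper-A", "cited": "paper-B"},
    {"citing": "paper-A", "cited": "paper-C"},
    {"citing": "paper-C", "cited": "paper-B"},
    {"error": "lookup failed"},
]

graph = build_citation_graph(sample_records)
print(graph)
```

A plain dictionary keeps the example dependency-free; the same edges could be loaded into a graph library if you need centrality or path metrics.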
🎬 Full Demo
🚧 Coming soon...
⚙️ Input
| Field | Type | Required | Description |
|---|---|---|---|
| maxItems | integer | No | Max records to collect. Free: up to 10. Paid: up to 1,000,000 |
| doi | string | No | Digital Object Identifier (e.g., 10.1016/j.jmb.2005.08.075) |
| pmid | string | No | PubMed ID for biomedical publications |
| omid | string | No | OpenCitations Meta Identifier (e.g., omid:br/06140242082) |
| searchType | string | No | Search direction: citations (incoming) or references (outgoing) |
| includeMetadata | boolean | No | Fetch detailed metadata (title, authors, date) for each record |
Example 1: Get citations for a DOI

    {
        "doi": "10.1016/j.jmb.2005.08.075",
        "searchType": "citations",
        "includeMetadata": true,
        "maxItems": 50
    }

Example 2: Get references from a PubMed article

    {
        "pmid": "16325459",
        "searchType": "references",
        "includeMetadata": true,
        "maxItems": 100
    }

⚠️ Good to Know: Provide exactly one identifier (DOI, PMID, or OMID) per run. Enabling metadata slows the run because the scraper fetches bibliographic details for every record. If searchType is omitted, it defaults to "citations" (incoming citations).
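If you build inputs programmatically, a small client-side check can catch the "one identifier only" rule before a run starts. The following is a minimal sketch: `build_input` is a hypothetical helper, and its defaults for `include_metadata` and `max_items` are assumptions chosen for illustration. Only the default search type ("citations") comes from the note above.

```python
ID_FIELDS = ("doi", "pmid", "omid")
SEARCH_TYPES = ("citations", "references")


def build_input(doi=None, pmid=None, omid=None, search_type="citations",
                include_metadata=False, max_items=10):
    """Return an actor input dict, enforcing exactly one identifier."""
    supplied = {k: v for k, v in zip(ID_FIELDS, (doi, pmid, omid)) if v}
    if len(supplied) != 1:
        raise ValueError("Provide exactly one of: doi, pmid, omid")
    if search_type not in SEARCH_TYPES:
        raise ValueError(f"searchType must be one of {SEARCH_TYPES}")
    return {
        **supplied,
        "searchType": search_type,
        # Assumed defaults for this sketch, not documented actor defaults.
        "includeMetadata": include_metadata,
        "maxItems": max_items,
    }


print(build_input(pmid="16325459", search_type="references", max_items=100))
```

The returned dictionary can be passed as the run input in the client examples further down.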
📊 Output
🧾 Schema
| Emoji | Field | Type | Description |
|---|---|---|---|
| 📝 | oci | string | Unique Open Citation Identifier |
| 👤 | citing | string | Identifier of the citing entity |
| 👤 | cited | string | Identifier of the cited entity |
| 📅 | creationDate | string | When the citation relationship was recorded |
| ⏱️ | timespan | string | Interval between the publication dates of the citing and cited works |
| 📊 | journalSelfCitation | boolean | Whether the citing and cited works appear in the same journal |
| 📊 | authorSelfCitation | boolean | Whether the citing and cited works share at least one author |
| 📝 | title | string | Publication title (with metadata enabled) |
| 👥 | authors | string | Author names (with metadata enabled) |
| 📅 | publicationDate | string | Publication date (with metadata enabled) |
| 📖 | volume | string | Journal volume |
| 📄 | issue | string | Journal issue |
| 📍 | venue | string | Journal or venue name |
| 🏷️ | publicationType | string | Type of publication |
| 📄 | page | string | Page range |
| 🏢 | publisher | string | Publisher name |
| ✏️ | editor | string | Editor name |
| 🆔 | workId | string | Internal work identifier |
| ⏰ | scrapedAt | string | Collection timestamp |
| ⚠️ | error | string | Error message if processing failed |
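To show how these fields might be consumed, here is a small sketch that separates failed rows from usable ones and counts self-citations. It assumes that a failed row carries a non-empty `error` value and that the two self-citation fields are true booleans, as the schema suggests. `summarize` and the sample rows are hypothetical.

```python
def summarize(records):
    """Count usable rows, failed rows, and self-citations."""
    ok = [r for r in records if not r.get("error")]
    return {
        "usable": len(ok),
        "failed": len(records) - len(ok),
        "authorSelfCitations": sum(
            1 for r in ok if r.get("authorSelfCitation") is True
        ),
        "journalSelfCitations": sum(
            1 for r in ok if r.get("journalSelfCitation") is True
        ),
    }


# Placeholder rows shaped like the schema above; values are made up.
sample_rows = [
    {"oci": "oci-1", "authorSelfCitation": True, "journalSelfCitation": False},
    {"oci": "oci-2", "authorSelfCitation": False, "journalSelfCitation": True},
    {"oci": "oci-3", "authorSelfCitation": False, "journalSelfCitation": False},
    {"error": "metadata lookup failed"},
]

summary = summarize(sample_rows)
print(summary)
```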
✨ Why choose this Actor
| Feature | Details |
|---|---|
| 🔍 Three identifier types | Search by DOI, PubMed ID, or OpenCitations Meta ID |
| 🔄 Bidirectional search | Find incoming citations or outgoing references |
| 📊 Self-citation detection | Flags for author and journal self-citations |
| 📝 Optional metadata | Full bibliographic details when enabled |
| 🆓 Open data | All OpenCitations data is freely available |
| 📦 Flexible export | JSON, CSV, or Excel output |
| ⚡ Automatic pagination | Handles large citation networks automatically |
📊 Map citation networks for any publication with up to 1,000,000 records per run, including self-citation detection and full metadata.
📈 How it compares to alternatives
| Feature | This Actor | Manual API Queries | Generic Scrapers |
|---|---|---|---|
| DOI, PMID, and OMID support | ✅ | Manual | ❌ |
| Self-citation detection | ✅ | ✅ | ❌ |
| Optional metadata enrichment | ✅ | Manual | ❌ |
| Bidirectional search | ✅ | Manual | ❌ |
| Bulk collection (up to 1M records) | ✅ | Manual | ❌ |
| Structured JSON/CSV output | ✅ | JSON only | Varies |
| Scheduled runs | ✅ | ❌ | ❌ |
Get structured citation data at scale without writing API code or managing pagination.
🚀 How to use
- Create an Apify account - Sign up free with $5 credit
- Open the OpenCitations Scraper - Navigate to the actor page on Apify
- Enter a DOI, PMID, or OMID - Provide the identifier for the publication you want to analyze
- Choose search type and options - Select citations or references mode and enable metadata if needed
- Click Start - The actor collects citation relationships and delivers structured data
⏱️ A typical run with 50 citations completes in under 1 minute.
💼 Business use cases
- 📊 Bibliometric Analysis
- 🎓 Academic Research
- 🏛️ Research Administration
- 📈 Data Science
🔌 Automating OpenCitations Scraper
Integrate the OpenCitations Scraper into your workflow using the Apify API or client libraries.
Node.js:

    import { ApifyClient } from 'apify-client';

    const client = new ApifyClient({ token: 'YOUR_API_TOKEN' });

    // Start the actor and wait for the run to finish.
    const run = await client.actor("parseforge/open-citations-scraper").call({
        doi: "10.1016/j.jmb.2005.08.075",
        searchType: "citations",
        includeMetadata: true,
        maxItems: 100,
    });

    // Read the collected records from the run's default dataset.
    const { items } = await client.dataset(run.defaultDatasetId).listItems();
    console.log(items);

Python:

    from apify_client import ApifyClient

    client = ApifyClient("YOUR_API_TOKEN")

    # Start the actor and wait for the run to finish.
    run = client.actor("parseforge/open-citations-scraper").call(run_input={
        "doi": "10.1016/j.jmb.2005.08.075",
        "searchType": "citations",
        "includeMetadata": True,
        "maxItems": 100,
    })

    # Read the collected records from the run's default dataset.
    items = list(client.dataset(run["defaultDatasetId"]).iterate_items())
    print(items)

Schedules: Set up recurring runs to monitor citation growth for your publications. Configure weekly or monthly schedules from the Apify Console to track new citations automatically.
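One way to turn scheduled runs into a "new citations" report is to compare the records of two runs by their `oci` value. This is a sketch under the assumption that `oci` is stable across runs for the same citation. `new_citations` is a hypothetical helper, and the sample records are placeholders.

```python
def new_citations(previous, current):
    """Return records in `current` whose `oci` was not seen in `previous`."""
    seen = {r["oci"] for r in previous if r.get("oci")}
    return [r for r in current if r.get("oci") and r["oci"] not in seen]


# Placeholder datasets standing in for last week's and this week's runs.
last_run = [{"oci": "oci-1"}, {"oci": "oci-2"}]
this_run = [{"oci": "oci-1"}, {"oci": "oci-2"}, {"oci": "oci-3"}]

added = new_citations(last_run, this_run)
print(added)
```

In practice, `previous` and `current` would be the item lists fetched from the datasets of two consecutive runs.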
🔌 Integrate with any app
- 🔗 Make (Integromat) - Connect citation data to Google Sheets, Notion, or any of 1,500+ apps
- 🔗 Zapier - Trigger workflows when new citations are detected
- 🔗 Slack - Get notified when new citations appear for your publications
- 🔗 Airbyte - Stream citation data into your data warehouse
- 🔗 GitHub - Store citation datasets in repositories for version control
- 🔗 Google Drive - Automatically save CSV exports to shared folders
🔗 Recommended Actors
| Actor | Description |
|---|---|
| Crossref Scraper | Extract DOI metadata for 155M+ research publications |
| PubMed Citation Scraper | Extract publication metadata from PubMed for biomedical research |
| Open Library Scraper | Search and download book data from the Internet Archive |
| ROR Scraper | Collect research organization data from ROR |
| US Census Bureau Scraper | Extract demographic and economic data from the Census Bureau |
💡 Pro Tip: Combine the OpenCitations Scraper with the Crossref Scraper to get both citation networks and full publication metadata for each cited work.
🆘 Need Help? Open our contact form and we will get back to you within 24 hours. We are happy to help with custom setups, integrations, or feature requests.
Disclaimer: This actor is not affiliated with, endorsed by, or connected to OpenCitations. It accesses publicly available data through the OpenCitations API. Use responsibly and in accordance with applicable terms of service.