Pricing

Pay per event

Open Citations Scraper

Comprehensive OpenCitations scraper for extracting citation and reference data from the OpenCitations API. Built for researchers, academics, and data scientists who need automated access to citation networks, bibliographic metadata, and citation analysis data.

Rating

0.0

(0)

Developer

ParseForge

Actor stats

Last modified: 10 days ago

📚 OpenCitations Scraper

🚀 Extract citation networks and bibliographic metadata from OpenCitations in seconds. Search by DOI, PMID, or OMID. No coding, no API keys required.

🕒 Last updated: 2026-04-23 · 📊 20 fields · 🔍 Citations and references modes · 📄 Optional detailed metadata

OpenCitations is an open scholarly infrastructure providing free access to citation data from millions of academic publications. This scraper collects citation relationships, self-citation flags, and optional bibliographic metadata (authors, titles, venues, publication dates) for any publication identified by DOI, PubMed ID, or OpenCitations Meta ID. Choose between citations mode (who cited this work) and references mode (what this work cites) to map research influence in either direction.

Researchers, bibliometric analysts, and data scientists use this actor to build citation networks, track research impact, identify self-citations, and analyze how knowledge flows between publications. Instead of querying the OpenCitations API manually and parsing responses, you get clean, structured data exported as JSON, CSV, or Excel. With metadata enabled, every record includes the citing and cited entity IDs, creation date, timespan, self-citation flags, plus the full title, authors, publication date, venue, and publisher.

🎯 Target Audience	💡 Use Cases
Bibliometric analysts	Map citation networks and measure impact
Academic researchers	Track who cites your publications
University administrators	Evaluate research impact for departments
Science policy makers	Analyze knowledge flow between institutions
Data scientists	Build citation graph datasets for analysis
Librarians	Enrich catalog records with citation data

📋 What the OpenCitations Scraper does

🔍 DOI-based search to find citations or references for any published work
🆔 PMID support for biomedical publications indexed in PubMed
📋 OMID support for OpenCitations internal identifier lookups
🔄 Bidirectional search with citations (incoming) and references (outgoing) modes
📊 Self-citation detection with flags for author and journal self-citations
📝 Optional metadata including titles, authors, venues, and publication dates

The scraper queries the OpenCitations API with your identifier and search type, retrieves the matching citation relationships (up to maxItems), and extracts structured data for each record. When metadata is enabled, it also fetches detailed bibliographic information for each citing or cited work. Results include unique citation identifiers (OCI), entity IDs, creation dates, timespans, self-citation flags, and, with metadata enabled, full publication metadata.

💡 Why it matters: Manually collecting citation data from OpenCitations involves API queries, pagination, and metadata enrichment. This scraper handles everything automatically, delivering structured citation networks ready for analysis, visualization, or integration with other research tools.
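As a rough illustration of what "ready for analysis" can mean, the citing and cited identifiers in each record can be read as directed edges of a citation graph. The sketch below assumes records are plain dictionaries using the field names from the output schema; the helper names build_citation_graph and count_incoming are hypothetical and not part of the actor.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, Set


def build_citation_graph(records: Iterable[Dict[str, Any]]) -> Dict[str, Set[str]]:
    """Map each citing identifier to the set of identifiers it cites.

    Hypothetical helper: assumes each record has "citing" and "cited" keys.
    Records missing either key are skipped.
    """
    graph: Dict[str, Set[str]] = defaultdict(set)
    for record in records:
        citing = record.get("citing")
        cited = record.get("cited")
        if citing and cited:
            graph[citing].add(cited)
    return dict(graph)


def count_incoming(graph: Dict[str, Set[str]]) -> Dict[str, int]:
    """Count how many distinct works cite each identifier in the graph."""
    counts: Dict[str, int] = defaultdict(int)
    for cited_works in graph.values():
        for cited in cited_works:
            counts[cited] += 1
    return dict(counts)
```

A plain dictionary of sets keeps the example dependency-free; the same edges could be loaded into a graph library if richer network metrics are needed.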

⚙️ Input

Field	Type	Required	Description
maxItems	integer	No	Max records to collect. Free: up to 10. Paid: up to 1,000,000
doi	string	No	Digital Object Identifier (e.g., 10.1016/j.jmb.2005.08.075)
pmid	string	No	PubMed ID for biomedical publications
omid	string	No	OpenCitations Meta Identifier (e.g., omid:br/06140242082)
searchType	string	No	Search direction: citations (incoming) or references (outgoing)
includeMetadata	boolean	No	Fetch detailed metadata (title, authors, date) for each record

Example 1: Get citations for a DOI

{
  "doi": "10.1016/j.jmb.2005.08.075",
  "searchType": "citations",
  "includeMetadata": true,
  "maxItems": 50
}

Example 2: Get references from a PubMed article

{
  "pmid": "16325459",
  "searchType": "references",
  "includeMetadata": true,
  "maxItems": 100
}

⚠️ Good to Know: Provide one identifier (DOI, PMID, or OMID), not multiple. Enabling metadata makes the scraper slower but provides full bibliographic details for each citation. The default search type is "citations" (incoming citations).
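The rules above (exactly one identifier, "citations" as the default search type) can be checked before a run is started. The following is a minimal sketch, not the actor's own validation: build_run_input is a hypothetical helper, and the defaults chosen for includeMetadata (False) and maxItems (10) are assumptions made for illustration.

```python
from typing import Any, Dict, Optional

VALID_SEARCH_TYPES = {"citations", "references"}


def build_run_input(
    doi: Optional[str] = None,
    pmid: Optional[str] = None,
    omid: Optional[str] = None,
    search_type: str = "citations",   # documented default search type
    include_metadata: bool = False,   # assumed default, not confirmed by the source
    max_items: int = 10,              # assumed default, matches the free-tier cap
) -> Dict[str, Any]:
    """Build an input dict using the field names from the Input table."""
    identifiers = {"doi": doi, "pmid": pmid, "omid": omid}
    provided = {name: value for name, value in identifiers.items() if value}
    if len(provided) != 1:
        raise ValueError("Provide exactly one of doi, pmid, or omid")
    if search_type not in VALID_SEARCH_TYPES:
        raise ValueError("searchType must be 'citations' or 'references'")
    if max_items < 1:
        raise ValueError("maxItems must be a positive integer")
    return {
        **provided,
        "searchType": search_type,
        "includeMetadata": include_metadata,
        "maxItems": max_items,
    }
```

The returned dictionary has the same shape as Example 1 and Example 2, so it can be pasted into the input editor or passed to an API client.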

📊 Output

🧾 Schema

Emoji	Field	Type	Description
📝	oci	string	Unique Open Citation Identifier
👤	citing	string	Identifier of the citing entity
👤	cited	string	Identifier of the cited entity
📅	creationDate	string	When the citation relationship was recorded
⏱️	timespan	string	Time between publication dates
📊	journalSelfCitation	boolean	Whether the citation is within the same journal
📊	authorSelfCitation	boolean	Whether the author cites their own work
📝	title	string	Publication title (with metadata enabled)
👥	authors	string	Author names (with metadata enabled)
📅	publicationDate	string	Publication date (with metadata enabled)
📖	volume	string	Journal volume
📄	issue	string	Journal issue
📍	venue	string	Journal or venue name
🏷️	publicationType	string	Type of publication
📄	page	string	Page range
🏢	publisher	string	Publisher name
✏️	editor	string	Editor name
🆔	workId	string	Internal work identifier
⏰	scrapedAt	string	Collection timestamp
⚠️	error	string	Error message if processing failed
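To show how the schema fields might be consumed, here is a small sketch that tallies self-citations and separates out records carrying an error message. It assumes records are dictionaries keyed by the field names in the table above and that the two self-citation fields are real booleans; summarize_self_citations is a hypothetical helper.

```python
from typing import Any, Dict, Iterable


def summarize_self_citations(records: Iterable[Dict[str, Any]]) -> Dict[str, int]:
    """Count valid records, self-citation flags, and failed records.

    Records with a non-empty "error" field are counted as failed and
    excluded from the other totals.
    """
    summary = {
        "records": 0,
        "authorSelfCitations": 0,
        "journalSelfCitations": 0,
        "failed": 0,
    }
    for record in records:
        if record.get("error"):
            summary["failed"] += 1
            continue
        summary["records"] += 1
        if record.get("authorSelfCitation") is True:
            summary["authorSelfCitations"] += 1
        if record.get("journalSelfCitation") is True:
            summary["journalSelfCitations"] += 1
    return summary
```

Subtracting authorSelfCitations from records gives one possible "adjusted" citation count of the kind mentioned under bibliometric analysis.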

✨ Why choose this Actor

Feature	Details
🔍 Three identifier types	Search by DOI, PubMed ID, or OpenCitations Meta ID
🔄 Bidirectional search	Find incoming citations or outgoing references
📊 Self-citation detection	Flags for author and journal self-citations
📝 Optional metadata	Full bibliographic details when enabled
🆓 Open data	All OpenCitations data is freely available
📦 Flexible export	JSON, CSV, or Excel output
⚡ Automatic pagination	Handles large citation networks automatically

📊 Map citation networks for any publication with up to 1,000,000 records per run, including self-citation detection and full metadata.

📈 How it compares to alternatives

Feature	This Actor	Manual API Queries	Generic Scrapers
DOI, PMID, and OMID support	✅	Manual	❌
Self-citation detection	✅	✅	❌
Optional metadata enrichment	✅	Manual	❌
Bidirectional search	✅	Manual	❌
Bulk collection (up to 1M records)	✅	Manual	❌
Structured JSON/CSV output	✅	JSON only	Varies
Scheduled runs	✅	❌	❌

Get structured citation data at scale without writing API code or managing pagination.

🚀 How to use

Create an Apify account - Sign up free with $5 credit
Open the OpenCitations Scraper - Navigate to the actor page on Apify
Enter a DOI, PMID, or OMID - Provide the identifier for the publication you want to analyze
Choose search type and options - Select citations or references mode and enable metadata if needed
Click Start - The actor collects citation relationships and delivers structured data

⏱️ A typical run with 50 citations completes in under 1 minute.

💼 Business use cases

📊 Bibliometric Analysis

Map citation networks for research impact assessment
Identify self-citations to calculate adjusted metrics
Track citation accumulation over time
Compare citation patterns across disciplines

🎓 Academic Research

Build citation graphs for literature reviews
Track who is citing your publications
Identify influential papers in your field
Analyze reference patterns in competitor research

🏛️ Research Administration

Evaluate faculty research impact for reviews
Track department-level citation metrics
Monitor publication influence across programs
Build reporting dashboards for stakeholders

📈 Data Science

Build citation graph datasets for network analysis
Train models on citation prediction tasks
Analyze knowledge flow between research fields
Create visualization datasets for research mapping

🌟 Beyond business use cases

Data like this powers more than commercial workflows. The same structured records support research, education, civic projects, and personal initiatives.

🎓 Research and academia

Empirical datasets for papers, thesis work, and coursework
Longitudinal studies tracking changes across snapshots
Reproducible research with cited, versioned data pulls
Classroom exercises on data analysis and ethical scraping

🎨 Personal and creative

Side projects, portfolio demos, and indie app launches
Data visualizations, dashboards, and infographics
Content research for bloggers, YouTubers, and podcasters
Hobbyist collections and personal trackers

🤝 Non-profit and civic

Transparency reporting and accountability projects
Advocacy campaigns backed by public-interest data
Community-run databases for local issues
Investigative journalism on public records

🧪 Experimentation

Prototype AI and machine-learning pipelines with real data
Validate product-market hypotheses before engineering spend
Train small domain-specific models on niche corpora
Test dashboard concepts with live input

🔌 Automating OpenCitations Scraper

Integrate the OpenCitations Scraper into your workflow using the Apify API or client libraries.

Node.js:

import { ApifyClient } from 'apify-client';

// Authenticate with your Apify API token
const client = new ApifyClient({ token: 'YOUR_API_TOKEN' });

// Start the actor and wait for the run to finish
const run = await client.actor('parseforge/open-citations-scraper').call({
  doi: '10.1016/j.jmb.2005.08.075',
  searchType: 'citations',
  includeMetadata: true,
  maxItems: 100,
});

// Read the collected records from the run's default dataset
const { items } = await client.dataset(run.defaultDatasetId).listItems();
console.log(items);

Python:

from apify_client import ApifyClient

# Authenticate with your Apify API token
client = ApifyClient("YOUR_API_TOKEN")

# Start the actor and wait for the run to finish
run = client.actor("parseforge/open-citations-scraper").call(run_input={
    "doi": "10.1016/j.jmb.2005.08.075",
    "searchType": "citations",
    "includeMetadata": True,
    "maxItems": 100,
})

# Read the collected records from the run's default dataset
items = list(client.dataset(run["defaultDatasetId"]).iterate_items())
print(items)
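If only a few columns are needed locally, the items returned by the client can be written to CSV with the standard library. This is a sketch under the assumption that each item is a dictionary keyed by the schema field names; records_to_csv and the DEFAULT_COLUMNS selection are illustrative choices, not part of the actor or the Apify client.

```python
import csv
import io
from typing import Any, Dict, Iterable, Sequence

# Illustrative column subset taken from the output schema.
DEFAULT_COLUMNS = ("oci", "citing", "cited", "creationDate", "timespan")


def records_to_csv(
    records: Iterable[Dict[str, Any]],
    columns: Sequence[str] = DEFAULT_COLUMNS,
) -> str:
    """Serialize selected fields of each record to CSV text.

    Fields not listed in `columns` are ignored; missing fields are
    written as empty strings.
    """
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(
        buffer,
        fieldnames=list(columns),
        extrasaction="ignore",
        restval="",
    )
    writer.writeheader()
    for record in records:
        writer.writerow(record)
    return buffer.getvalue()
```

The returned string can be written to a file with open(path, "w", newline="") so the line endings produced by the csv module are preserved.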

Schedules: Set up recurring runs to monitor citation growth for your publications. Configure weekly or monthly schedules from the Apify Console to track new citations automatically.
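One way to track new citations across scheduled runs is to compare the records of two runs by their oci value, which the schema describes as a unique identifier. The sketch below assumes both runs have already been loaded as lists of dictionaries; new_citations is a hypothetical helper.

```python
from typing import Any, Dict, Iterable, List


def new_citations(
    previous: Iterable[Dict[str, Any]],
    current: Iterable[Dict[str, Any]],
) -> List[Dict[str, Any]]:
    """Return records from `current` whose "oci" was not in `previous`.

    Records without an "oci" value are skipped, since they cannot be
    matched between runs.
    """
    seen = {record["oci"] for record in previous if record.get("oci")}
    return [
        record
        for record in current
        if record.get("oci") and record["oci"] not in seen
    ]
```

The result preserves the order of the current run, so it can be forwarded as-is to a notification or spreadsheet integration.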

🔌 Integrate with any app

🔗 Make (Integromat) - Connect citation data to Google Sheets, Notion, or any of 1,500+ apps
🔗 Zapier - Trigger workflows when new citations are detected
🔗 Slack - Get notified when new citations appear for your publications
🔗 Airbyte - Stream citation data into your data warehouse
🔗 GitHub - Store citation datasets in repositories for version control
🔗 Google Drive - Automatically save CSV exports to shared folders

🔗 Recommended Actors

Actor	Description
Crossref Scraper	Extract DOI metadata for 155M+ research publications
PubMed Citation Scraper	Extract publication metadata from PubMed for biomedical research
Open Library Scraper	Search and download book data from the Internet Archive
ROR Scraper	Collect research organization data from ROR
US Census Bureau Scraper	Extract demographic and economic data from the Census Bureau

💡 Pro Tip: Combine the OpenCitations Scraper with the Crossref Scraper to get both citation networks and full publication metadata for each cited work.

🆘 Need Help? Open our contact form and we will get back to you within 24 hours. We are happy to help with custom setups, integrations, or feature requests.

Disclaimer: This actor is not affiliated with, endorsed by, or connected to OpenCitations. It accesses publicly available data through the OpenCitations API. Use responsibly and in accordance with applicable terms of service.
