DOAJ Open Access Journals Scraper

Pricing

Pay per event

Export all open-access journals from the Directory of Open Access Journals (DOAJ). 22,000+ peer-reviewed journals across every subject. Filter by country, subject, language, or publication frequency. Pull titles, ISSNs, publishers, licenses, APC fees.

Rating: 5.0 (1)

Developer: ParseForge (Maintained by Community)

Actor stats: 1 bookmarked · 10 total users · 1 monthly active user · last modified 2 days ago

📖 DOAJ Journal Metrics Scraper

🚀 Extract open access journal data from DOAJ in seconds. Filter by subject, language, country, or custom query. No coding, no API keys required.

🕒 Last updated: 2026-04-24 · 📊 25+ fields · 🔄 Runs on Apify cloud or locally · 📁 Export: JSON, CSV, Excel

The DOAJ Journal Metrics Scraper connects to the Directory of Open Access Journals to collect detailed metrics and publishing information for 22,000+ peer-reviewed journals. Each record includes 25+ structured fields covering journal titles, ISSN numbers, publisher details, article processing charges (APCs), licensing terms, editorial review types, and preservation services. Whether you need a quick sample of 10 journals or the entire DOAJ catalog, this tool handles it automatically.

Built for researchers evaluating journal options, academic librarians managing journal collections, publishers studying market trends, and funding agencies analyzing open access policies. The scraper uses DOAJ's own search API, supports advanced Lucene queries, and delivers clean structured data ready for spreadsheets and databases. No proxy required, no authentication needed, just fast access to the world's most complete open access journal directory.

| Target Audience | Use Cases |
| --- | --- |
| Academic Researchers | Journal selection, APC comparison |
| University Librarians | Collection management, subscription analysis |
| Publishers | Market research, competitor analysis |
| Funding Agencies | Open access policy assessment |
| Data Scientists | Scholarly communication studies |
| Research Administrators | Compliance monitoring, reporting |

📋 What the DOAJ Journal Metrics Scraper does

  • 📝 Extracts journal titles and identifiers including official names, alternative titles, and DOAJ IDs for accurate tracking
  • 🔢 Collects ISSN numbers for both print (PISSN) and electronic (EISSN) editions for cross-database identification
  • 💰 Captures article processing charge (APC) details including fee amounts, currencies, and waiver availability
  • ⚖️ Gathers licensing and copyright information including license types and whether authors retain copyright
  • 📋 Pulls editorial process details with review types (single-blind, double-blind, open), plagiarism detection, and publication timelines
  • 🌍 Collects publisher and geographic data including publisher names, headquarters countries, and journal languages

The scraper queries the DOAJ search API with your specified filters, handles pagination automatically, and processes results in parallel for speed. You can use simple keyword searches or advanced Lucene query syntax for precise filtering.
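The pagination and parallelism described above can be pictured with a small planning helper. This is only a sketch, not the Actor's actual implementation: the page size of 100 and the function names `plan_pages` and `batches` are assumptions made for illustration.

```python
from math import ceil


def plan_pages(max_items: int, page_size: int = 100) -> list[tuple[int, int]]:
    """Return (page_number, items_wanted) pairs, with pages numbered from 1.

    page_size=100 is an assumed value, not a documented Actor setting.
    """
    if max_items <= 0 or page_size <= 0:
        return []
    page_count = ceil(max_items / page_size)
    plan = []
    for page in range(1, page_count + 1):
        remaining = max_items - (page - 1) * page_size
        # The last page may need fewer items than a full page.
        plan.append((page, min(page_size, remaining)))
    return plan


def batches(plan: list[tuple[int, int]], max_concurrency: int = 5) -> list[list[tuple[int, int]]]:
    """Group planned pages so that at most max_concurrency run at once."""
    return [plan[i:i + max_concurrency] for i in range(0, len(plan), max_concurrency)]
```

With these assumptions, a request for 250 journals becomes three pages (100, 100, 50), and a request for 1,200 journals becomes 12 pages fetched in three batches at the default concurrency of 5.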

💡 Why it matters: DOAJ is the gold standard directory for open access journals. Manually browsing 22,000+ entries is impractical. This scraper returns a filtered subset in minutes, or the full catalog in roughly 60-90 minutes, as structured, analysis-ready data.


🎬 Full Demo

🚧 Coming soon...


⚙️ Input

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| searchQuery | string | No | Lucene query string. Use * for all journals. Examples: bibjson.keywords:medicine, publisher:elsevier |
| maxItems | integer | No | Number of journals to retrieve. Free users: limited to 10. Paid users: up to 1,000,000. |
| subject | string | No | Library of Congress subject code (e.g., R for Medicine, Q for Science, H for Social Sciences). |
| language | string | No | ISO 639-1 language code (e.g., EN, FR, ES, DE, ZH). |
| country | string | No | ISO 3166-1 alpha-2 country code (e.g., US, GB, DE, IN, BR). |
| sort | string | No | Sort order: title:asc, title:desc, issn:asc, or issn:desc. Default: title:asc. |
| maxConcurrency | integer | No | Parallel request limit. Lower if rate-limited. Default: 5. |

Example 1: Medical journals in English

{
    "searchQuery": "bibjson.keywords:medicine",
    "language": "EN",
    "maxItems": 100
}

Example 2: All journals from a specific country

{
    "searchQuery": "*",
    "country": "BR",
    "sort": "title:asc",
    "maxItems": 500
}

⚠️ Good to Know: Use * as the search query to get all journals. The subject, language, and country filters can be combined with any search query. Free users are automatically limited to 10 items per run.
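When inputs are generated programmatically, a small builder can catch mistakes before a run starts. The sketch below only mirrors the input table above. The helper name `build_input` is hypothetical, and uppercasing the filter codes simply follows the casing used in the examples. Whether the Actor also accepts lowercase codes is not stated here.

```python
ALLOWED_SORTS = {"title:asc", "title:desc", "issn:asc", "issn:desc"}


def build_input(search_query="*", subject=None, language=None, country=None,
                sort="title:asc", max_items=10):
    """Build an Actor input dict, rejecting values the input table does not list."""
    if sort not in ALLOWED_SORTS:
        raise ValueError(f"unsupported sort: {sort!r}")
    if max_items < 1:
        raise ValueError("max_items must be at least 1")

    # An empty query falls back to "*", which selects all journals.
    run_input = {"searchQuery": search_query or "*", "sort": sort, "maxItems": max_items}

    # Filters are optional and can be combined with any search query.
    optional = {"subject": subject, "language": language, "country": country}
    for key, value in optional.items():
        if value:
            run_input[key] = value.strip().upper()  # match the casing of the examples
    return run_input
```

For instance, `build_input("bibjson.keywords:medicine", language="en", max_items=100)` reproduces Example 1 with the default sort order added.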


📊 Output

🧾 Schema

| Emoji | Field | Type | Description |
| --- | --- | --- | --- |
| 📝 | title | string | Official journal title |
| 📝 | alternativeTitle | string | Alternative journal name |
| 🆔 | doajId | string | Unique DOAJ identifier |
| 🔢 | pissn | string | Print ISSN |
| 🔢 | eissn | string | Electronic ISSN |
| 🏢 | publisher | string | Publisher name |
| 🌍 | publisherCountry | string | Publisher headquarters country |
| 💰 | hasApc | boolean | Whether the journal charges APCs |
| 💰 | apcAmount | number | APC fee amount |
| 💱 | apcCurrency | string | APC fee currency |
| 🎫 | hasWaiver | boolean | Whether APC waivers are available |
| ⚖️ | licenseType | string | License type (CC BY, CC BY-NC, etc.) |
| ⚖️ | authorRetainsCopyright | boolean | Whether authors keep copyright |
| 📋 | reviewType | string | Peer review type (single-blind, double-blind, open) |
| 🔍 | plagiarismDetection | boolean | Whether plagiarism screening is used |
| ⏱️ | publicationTimeWeeks | number | Average time from submission to publication (weeks) |
| 🗄️ | preservationServices | array | Digital preservation programs |
| 🌐 | languages | array | Publication languages |
| 📂 | subjects | array | LCC subject classifications |
| 🔗 | journalUrl | string | Journal website URL |
| 📅 | addedToDoaj | string | Date added to DOAJ |
| 📅 | lastUpdated | string | Last update timestamp |
| 📊 | articleCount | number | Number of articles indexed |
| 📅 | scrapedAt | string | Data collection timestamp |
|  | error | string | Error message if extraction failed |
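As an illustration of working with these fields, the sketch below computes the median APC per currency. The records in `sample` are invented placeholders, not real DOAJ data, and the helper name is hypothetical. It also assumes that `error` is either absent or empty on successful records, which the schema does not spell out.

```python
from collections import defaultdict
from statistics import median


def median_apc_by_currency(items):
    """Median apcAmount per apcCurrency, skipping failed and non-APC records."""
    amounts = defaultdict(list)
    for item in items:
        if item.get("error"):
            continue  # extraction failed for this record
        if item.get("hasApc") and item.get("apcAmount") is not None and item.get("apcCurrency"):
            amounts[item["apcCurrency"]].append(item["apcAmount"])
    return {currency: median(values) for currency, values in amounts.items()}


# Invented placeholder records that use the field names from the schema.
sample = [
    {"title": "Journal A", "hasApc": True, "apcAmount": 1200, "apcCurrency": "USD"},
    {"title": "Journal B", "hasApc": True, "apcAmount": 800, "apcCurrency": "USD"},
    {"title": "Journal C", "hasApc": True, "apcAmount": 500, "apcCurrency": "EUR"},
    {"title": "Journal D", "hasApc": False},
    {"error": "extraction failed"},
]

print(median_apc_by_currency(sample))
```

Amounts are grouped by currency rather than converted, since the schema carries no exchange-rate information.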


✨ Why choose this Actor

| Feature | Details |
| --- | --- |
| 📊 25+ structured fields | Titles, ISSNs, APCs, licensing, review types, and more |
| 🌐 22,000+ journals covered | Access the full DOAJ catalog |
| 💰 APC comparison | Fee amounts, currencies, and waiver availability |
| ⚖️ License tracking | License types and author copyright retention |
| 🔍 Advanced query syntax | Lucene queries for precise filtering |
| 🌍 Multi-filter support | Subject, language, country, and sort options |
| ⚡ Parallel processing | Concurrent requests for fast data collection |

📈 Typical performance: Collects 300+ journal records per minute. The full DOAJ catalog of 22,000+ journals takes roughly 60-90 minutes.
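The estimate follows from simple division: 22,000 journals at 300 records per minute is about 73 minutes, which sits inside the quoted 60-90 minute range. A minimal sketch of that arithmetic, with a hypothetical helper name and the throughput treated as a rough planning figure rather than a guarantee:

```python
def estimate_minutes(journal_count: int, records_per_minute: float = 300) -> float:
    """Rough run time in minutes for a given number of journals."""
    if records_per_minute <= 0:
        raise ValueError("records_per_minute must be positive")
    return journal_count / records_per_minute


# Full catalog at the quoted throughput.
print(round(estimate_minutes(22_000), 1))
```

Actual run time depends on the `maxConcurrency` setting and on any rate limiting, so a lower throughput value gives a more cautious estimate.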


📈 How it compares to alternatives

FeatureThis ActorManual DOAJ BrowsingGeneric Scrapers
25+ structured fields per journalPartial
APC and waiver information✅ (one at a time)
Advanced Lucene query filteringPartial
Export to CSV/JSON/ExcelPartial
Full catalog download
No coding requiredN/A
Scheduled runsPartial

Purpose-built for DOAJ data, with every journal field mapped and pagination handled automatically.


🚀 How to use

  1. Create a free Apify account - Signing up includes free credits
  2. Open the DOAJ Journal Metrics Scraper - Navigate to the Actor page and click "Start"
  3. Set your search - Enter a search query or use * for all journals, then apply subject, language, or country filters
  4. Choose your limit - Set maxItems (free users: up to 10)
  5. Run and download - Click "Start", wait for completion, then export as JSON, CSV, or Excel

⏱️ First results appear in under 10 seconds. A run of 100 journals completes in about 30 seconds.


💼 Business use cases

Academic Libraries

  • Compare APC fees across journals in a field
  • Audit open access compliance for funded research
  • Build journal recommendation lists for faculty
  • Track new journal additions to DOAJ

Publishers & Editors

  • Benchmark APCs against competitor journals
  • Analyze peer review practices by discipline
  • Study publication timelines across markets
  • Identify gaps in journal coverage by subject

Research Administration

  • Monitor open access policy compliance
  • Generate reports on institutional publishing patterns
  • Evaluate preservation service coverage
  • Track license type adoption trends

Data Science & Bibliometrics

  • Build datasets for scholarly communication studies
  • Analyze geographic distribution of open access
  • Create journal classification models
  • Study the relationship between APCs and review quality

🌟 Beyond business use cases

Data like this powers more than commercial workflows. The same structured records support research, education, civic projects, and personal initiatives.

🎓 Research and academia

  • Empirical datasets for papers, thesis work, and coursework
  • Longitudinal studies tracking changes across snapshots
  • Reproducible research with cited, versioned data pulls
  • Classroom exercises on data analysis and ethical scraping

🎨 Personal and creative

  • Side projects, portfolio demos, and indie app launches
  • Data visualizations, dashboards, and infographics
  • Content research for bloggers, YouTubers, and podcasters
  • Hobbyist collections and personal trackers

🤝 Non-profit and civic

  • Transparency reporting and accountability projects
  • Advocacy campaigns backed by public-interest data
  • Community-run databases for local issues
  • Investigative journalism on public records

🧪 Experimentation

  • Prototype AI and machine-learning pipelines with real data
  • Validate product-market hypotheses before engineering spend
  • Train small domain-specific models on niche corpora
  • Test dashboard concepts with live input

🔌 Automating DOAJ Journal Metrics Scraper

Node.js

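import { ApifyClient } from 'apify-client';

// Authenticate with your Apify API token.
const client = new ApifyClient({ token: 'YOUR_API_TOKEN' });

// Start the Actor and wait for the run to finish.
// Top-level await requires an ES module (for example, a .mjs file).
const run = await client.actor('parseforge/doaj-scraper').call({
    searchQuery: '*',
    subject: 'R',
    maxItems: 200,
});

// Fetch the results from the run's default dataset.
const { items } = await client.dataset(run.defaultDatasetId).listItems();
console.log(items);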

Python

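from apify_client import ApifyClient

# Authenticate with your Apify API token.
client = ApifyClient("YOUR_API_TOKEN")

# Start the Actor and wait for the run to finish.
run = client.actor("parseforge/doaj-scraper").call(run_input={
    "searchQuery": "*",
    "subject": "R",
    "maxItems": 200,
})

# Read every record from the run's default dataset.
items = list(client.dataset(run["defaultDatasetId"]).iterate_items())
print(items)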

Schedules: Set up monthly runs with Apify Schedules to track new journals added to DOAJ, monitor APC changes, and maintain an up-to-date journal directory.
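To act on scheduled runs, two monthly snapshots can be compared after download. The following is a sketch that assumes each snapshot is a list of records keyed by `doajId`, with `apcAmount` and `apcCurrency` as in the output schema. The function name `diff_snapshots` and the shape of its result are illustrative choices, not part of the Actor.

```python
def diff_snapshots(previous, current):
    """Report journals that are new in `current` and journals whose APC changed."""
    prev_by_id = {j["doajId"]: j for j in previous if j.get("doajId")}
    added, apc_changed = [], []
    for journal in current:
        doaj_id = journal.get("doajId")
        if not doaj_id:
            continue  # skip records without an identifier (for example, failed extractions)
        old = prev_by_id.get(doaj_id)
        if old is None:
            added.append(doaj_id)
        elif (old.get("apcAmount"), old.get("apcCurrency")) != (
            journal.get("apcAmount"),
            journal.get("apcCurrency"),
        ):
            apc_changed.append(doaj_id)
    return {"added": added, "apcChanged": apc_changed}
```

Keying on `doajId` rather than on the title keeps the comparison stable if a journal is renamed between runs.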

🔌 Integrate with any app

  • 🔗 Make (Integromat) - Connect DOAJ data to 1,000+ apps with visual workflows
  • 🔗 Zapier - Trigger notifications when new journals match your criteria
  • 🔗 Slack - Send alerts to your team when new open access journals appear
  • 🔗 Airbyte - Sync journal data to your data warehouse or database
  • 🔗 GitHub - Automate journal directory updates with GitHub Actions
  • 🔗 Google Drive - Export journal listings to Google Sheets

| Actor | Description |
| --- | --- |
| 📚 PubMed Citation Scraper | Extract citation data and metadata from PubMed biomedical literature |
| 🔬 OpenAlex Scraper | Query 250M+ scholarly records from the OpenAlex open catalog |
| 📖 Open Library Scraper | Extract book metadata and availability from Open Library |
| 🧬 Crossref Scraper | Collect DOI metadata and citation information from Crossref |
| 📄 Unpaywall Scraper | Find free legal copies of research articles via Unpaywall |

💡 Pro Tip: Combine the DOAJ Scraper with the Crossref Scraper to match journal-level metadata with article-level citation data for full bibliometric analysis.
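One way to do this matching is to join on ISSN. The sketch below is built on assumptions: the `pissn`, `eissn`, and `title` fields come from the output schema above, but the `issn` field on article records is hypothetical, because the Crossref Scraper's output format is not described here. All helper names are illustrative.

```python
def normalize_issn(value):
    """Return an ISSN as NNNN-NNNC, or None if it does not have 8 valid characters."""
    if not value:
        return None
    compact = "".join(ch for ch in str(value).upper() if ch.isdigit() or ch == "X")
    return f"{compact[:4]}-{compact[4:]}" if len(compact) == 8 else None


def index_journals(journals):
    """Map every print and electronic ISSN to its journal record."""
    index = {}
    for journal in journals:
        for field in ("pissn", "eissn"):
            issn = normalize_issn(journal.get(field))
            if issn:
                index[issn] = journal
    return index


def attach_journal(articles, index, issn_field="issn"):
    """Add journalTitle to each article; None when no journal matches.

    issn_field is an assumed field name on the article records.
    """
    matched = []
    for article in articles:
        journal = index.get(normalize_issn(article.get(issn_field)))
        matched.append({**article, "journalTitle": journal.get("title") if journal else None})
    return matched
```

Normalizing before the lookup matters because ISSNs may appear with or without the hyphen, depending on the source.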


🆘 Need Help? Open our contact form and we will get back to you within 24 hours. For bug reports, feature requests, or integration help, we are here to assist.


Disclaimer: This Actor is provided as-is, without warranty. It is not affiliated with or endorsed by the Directory of Open Access Journals (DOAJ). Use it responsibly and in compliance with applicable terms of service. The authors are not responsible for how the collected data is used. Always verify data accuracy for critical applications.