# DOAJ Open Access Journals Scraper — Directory API (`compute-edge/doaj-open-journals-scraper`) Actor

Extract open-access journals from DOAJ (Directory of Open Access Journals), 23k+ vetted journals. Filter by search query. Returns title, publisher, country, ISSNs, subjects, languages, homepage, APC fees, license, and peer-review process for academic and publishing intelligence.

- **URL**: https://apify.com/compute-edge/doaj-open-journals-scraper.md
- **Developed by:** [Compute Edge](https://apify.com/compute-edge) (community)
- **Categories:** Lead generation
- **Stats:** 2 total users, 1 monthly user, 100.0% runs succeeded
- **User rating**: No ratings yet

## Pricing

from $3.00 / 1,000 results

This Actor is paid per event. You are not charged for the Apify platform usage, but only a fixed price for specific events.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with a capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## DOAJ Open Access Journals Scraper

Extract comprehensive metadata for **23,000+ peer-reviewed open-access journals** from the **Directory of Open Access Journals (DOAJ)** — the world's largest curated list of vetted, open-access scholarly journals. This Actor provides seamless access to journal titles, publishers, ISSN identifiers, subject classifications, article processing charge (APC) information, and homepage URLs via the free DOAJ API.

DOAJ is maintained by a global community and includes journals in every academic discipline, from medicine and biology to social sciences and humanities. All journals are independently quality-reviewed and freely available online. This Actor is ideal for **academic research**, **library acquisition workflows**, **publisher intelligence**, and **open science initiatives**.

### Key Features

- **Complete DOAJ catalog access** — 23,000+ peer-reviewed open-access journals in a single Actor run
- **Free API** — No authentication required, no rate limits, no API keys
- **Search filtering** — Use DOAJ's query syntax to filter journals by title, subject, keywords, publisher, or ISSN
- **Full pagination support** — Automatically paginate through results, configurable result limits
- **Rich metadata extraction** — Titles, publishers, ISSN numbers, subjects, keywords, languages, APC status, and homepage URLs
- **Batch-optimized output** — Clean JSON ready for library systems, publisher databases, academic research, or RAG pipelines
- **Default fallback** — Run with empty input to fetch all journals (subject to maxResults limit)

### Output Data Fields

| Field | Type | Description |
|-------|------|-------------|
| `id` | string | DOAJ journal identifier |
| `title` | string | Journal title |
| `alternativeTitle` | string | Alternative or translated journal title (if available) |
| `publisher` | string | Publisher name |
| `publisherCountry` | string | ISO country code of publisher |
| `pissn` | string | Print ISSN (if available) |
| `eissn` | string | Electronic ISSN |
| `subjects` | string | Subject classifications (comma-separated) |
| `keywords` | string | Journal keywords (comma-separated) |
| `languages` | string | Languages published (comma-separated) |
| `homepage` | string | Journal homepage URL |
| `hasAPC` | boolean | Whether the journal charges an article processing charge (APC) |
| `apcMaxPrice` | string | Maximum APC amount (if applicable) |
| `apcCurrency` | string | Currency of APC pricing |
| `oaStart` | integer | Year the journal started open access publication |
| `licenseType` | string | Primary license type (e.g., CC-BY, CC-BY-NC) |
| `reviewProcess` | string | Peer review process type (comma-separated) |
| `lastUpdated` | string | Last update timestamp from DOAJ |
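
Several fields in the table above arrive as comma-separated strings. The sketch below shows one way to turn them into lists after download. It assumes dataset items are plain dictionaries keyed as in the table and that a single value never contains a comma itself, which this README does not guarantee. `split_list_fields` is a hypothetical helper, not part of the Actor.

```python
from typing import Any

# Fields the table above describes as comma-separated strings.
LIST_FIELDS = ("subjects", "keywords", "languages", "reviewProcess")


def split_list_fields(item: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of a dataset item with comma-separated fields as lists."""
    out = dict(item)
    for field in LIST_FIELDS:
        raw = item.get(field) or ""
        # Assumption: a comma never appears inside a single value.
        out[field] = [part.strip() for part in raw.split(",") if part.strip()]
    return out


if __name__ == "__main__":
    sample = {"title": "PLOS Medicine", "subjects": "Medicine, Health professions", "keywords": ""}
    print(split_list_fields(sample))
```

Missing or empty fields become empty lists, so downstream code can iterate without checking for `None`.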

### How to Scrape DOAJ Journal Data

1. Navigate to the **DOAJ Open Access Journals Scraper** Actor page on Apify Store.
2. Click **Start** to open the input configuration form.
3. (Optional) Enter a **Search Query** to filter journals:
   - Leave blank or use `*` for all journals
   - Examples: `medicine`, `cancer`, `bibjson.title:health`, `bibjson.keywords:AI`
4. (Optional) Set **Max Results** to control output size (default: 1000, set to 0 for unlimited).
5. Click **Start** to run the Actor.
6. Download results as JSON, CSV, or Excel from the **Dataset** tab.
7. Integrate results into your library system, publisher database, or research workflow.

### Input Example

```json
{
    "query": "medicine",
    "maxResults": 500
}
````

To fetch all journals (setting `maxResults` to 0 removes the limit):

```json
{
    "query": "*",
    "maxResults": 0
}
```

### Output Example

```json
{
    "id": "7e9fe18d6e1b49c9a2e8d1f5b6c2e9a3",
    "title": "PLOS Medicine",
    "alternativeTitle": "",
    "publisher": "Public Library of Science (PLOS)",
    "publisherCountry": "US",
    "pissn": "1549-1277",
    "eissn": "1549-1676",
    "subjects": "Medicine, Health professions",
    "keywords": "medicine, health, diseases",
    "languages": "English",
    "homepage": "https://journals.plos.org/plosmedicine/",
    "hasAPC": true,
    "apcMaxPrice": "2450",
    "apcCurrency": "USD",
    "oaStart": 2004,
    "licenseType": "CC-BY",
    "reviewProcess": "Double blind peer review",
    "lastUpdated": "2026-01-15T10:32:45Z"
}
```
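
In the example above, `apcMaxPrice` is a string and `lastUpdated` is an ISO 8601 timestamp. A minimal sketch of converting both to typed values follows. It assumes the formats match the sample record; `parse_apc` and `parse_last_updated` are hypothetical helper names.

```python
from datetime import datetime
from decimal import Decimal, InvalidOperation
from typing import Optional


def parse_apc(item: dict) -> Optional[Decimal]:
    """Return the maximum APC as a Decimal, or None when absent or unparsable."""
    if not item.get("hasAPC"):
        return None
    try:
        return Decimal(str(item.get("apcMaxPrice", "")).strip())
    except InvalidOperation:
        return None


def parse_last_updated(item: dict) -> Optional[datetime]:
    """Parse the `lastUpdated` string into a timezone-aware datetime."""
    raw = item.get("lastUpdated")
    if not raw:
        return None
    # Older Python versions reject a trailing "Z" in fromisoformat().
    return datetime.fromisoformat(raw.replace("Z", "+00:00"))
```

`Decimal` is used rather than `float` so that prices can be compared and summed without rounding surprises. Amounts in different `apcCurrency` values should not be mixed.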

### Pricing

This Actor fetches data from a free, public DOAJ API with no authentication or proxy costs.

- **Cost per run**: ~$0.0005 (API calls only, no browser required)
- **Actor start event**: Default platform rate
- **Per-result pricing**: $0.003/result

Typical run time is 10-60 seconds depending on the number of results requested.
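
The per-result price above makes run costs easy to estimate ahead of time. The sketch below multiplies a result count by the quoted $0.003. It leaves out the Actor start event, whose rate is not stated here, so treat the figure as a lower bound. `estimate_result_cost` is a hypothetical helper.

```python
from decimal import Decimal

# Per-result price quoted above ($0.003 per result, i.e. $3.00 per 1,000).
PRICE_PER_RESULT_USD = Decimal("0.003")


def estimate_result_cost(num_results: int) -> Decimal:
    """Estimate the per-result charge in USD, excluding the Actor start event."""
    if num_results < 0:
        raise ValueError("num_results must be non-negative")
    return PRICE_PER_RESULT_USD * num_results


if __name__ == "__main__":
    for n in (500, 1000, 23000):
        print(n, estimate_result_cost(n))
```

Under these assumptions, 500 results come to $1.50 and a full catalog run of roughly 23,000 results to about $69.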

### Use Cases

- **Library acquisition workflows** — Identify open-access journals in your subject areas for collection development
- **Publisher intelligence** — Monitor competitors' open-access journal portfolios and APC pricing
- **Academic research** — Analyze trends in open-access publishing by subject, geography, or APC policy
- **Open science initiatives** — Build comprehensive datasets of freely available peer-reviewed journals
- **Journal discovery** — Help researchers find quality open-access venues for manuscript submission
- **Metadata enrichment** — Augment your journal database with DOAJ's curated, peer-reviewed catalog
- **RAG pipeline ingestion** — Clean structured output ready for LLM-based academic research analysis

### DOAJ Query Syntax (Advanced)

The `query` parameter supports DOAJ's Lucene-like search syntax:

- `*` — All journals (default)
- `medicine` — Keyword match across all fields
- `bibjson.title:health` — Filter by title containing "health"
- `bibjson.keywords:AI` — Filter by keywords containing "AI"
- `bibjson.publisher.name:Springer` — Filter by publisher
- `bibjson.pissn:2214-5095` — Filter by PISSN
- `bibjson.eissn:2214-5095` — Filter by EISSN
- Multiple terms: `bibjson.title:cancer AND bibjson.publisher.country:US`

See [DOAJ API documentation](https://doaj.org/api/v3/docs) for the complete query syntax reference.
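
Fielded queries like the ones listed above can be assembled programmatically. The following is a rough sketch that assumes the Actor passes the `query` string through unchanged and that values containing spaces need phrase quotes, as in Lucene-style syntax. `build_query` is a hypothetical helper and does not escape reserved characters.

```python
def build_query(filters: dict[str, str], operator: str = "AND") -> str:
    """Join field:value pairs into one DOAJ-style query string."""
    if not filters:
        return "*"  # the match-all query described above
    parts = []
    for field, value in filters.items():
        # Assumption: values containing whitespace need phrase quotes.
        if any(ch.isspace() for ch in value):
            value = f'"{value}"'
        parts.append(f"{field}:{value}")
    return f" {operator} ".join(parts)


if __name__ == "__main__":
    print(build_query({"bibjson.title": "cancer", "bibjson.publisher.country": "US"}))
```

The result can be used directly as the `query` value in the Actor input.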

### FAQ

**Q: Can I search for specific journals by ISSN?**
A: Yes. Use `bibjson.pissn:XXXX-XXXX` or `bibjson.eissn:XXXX-XXXX` to find journals by their ISSN.

**Q: Are results limited to English-language journals?**
A: No. DOAJ includes journals published in many languages. The `languages` field in output shows which languages each journal publishes.

**Q: What's the maximum number of results I can retrieve?**
A: `maxResults` accepts values from 0 to 25,000. Set it to 0 to retrieve every matching journal, or lower it to cap the output.

**Q: Can I get journals by subject area?**
A: Yes. Use the `subjects` field in output to filter results, or query by discipline (e.g., `medicine`, `computer science`).

**Q: Are there any rate limits?**
A: DOAJ's public API has no rate limits. You can run this Actor as many times as needed without restrictions.

**Q: What does "APC" mean?**
A: APC (Article Processing Charge) is a fee some open-access journals charge authors to publish articles. The `hasAPC` field indicates whether a journal charges fees.
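
The subject and APC answers above can be combined into a simple post-filter on downloaded results. This is a sketch only, assuming items are dictionaries with the `subjects` and `hasAPC` fields from the output table. `filter_journals` is a hypothetical helper.

```python
from typing import Iterable, Iterator


def filter_journals(
    items: Iterable[dict], subject: str = "", no_apc_only: bool = False
) -> Iterator[dict]:
    """Yield items whose subjects contain `subject` and, optionally, charge no APC."""
    needle = subject.casefold()
    for item in items:
        # Case-insensitive substring match on the comma-separated subjects string.
        if needle and needle not in (item.get("subjects") or "").casefold():
            continue
        if no_apc_only and item.get("hasAPC"):
            continue
        yield item


if __name__ == "__main__":
    journals = [
        {"title": "A", "subjects": "Medicine", "hasAPC": True},
        {"title": "B", "subjects": "Medicine, Public health", "hasAPC": False},
    ]
    print([j["title"] for j in filter_journals(journals, "medicine", no_apc_only=True)])
```

Substring matching is deliberately loose. For exact subject matching, split the string into a list first and compare whole entries.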

### Legal Disclaimer

This Actor accesses publicly available data from the **DOAJ (Directory of Open Access Journals)** API, which is maintained as a free public service. All data is provided under DOAJ's terms of use. Respect the intellectual property rights of journal publishers and authors. Use extracted data only for legitimate academic, research, or commercial purposes in compliance with local laws and the journals' terms of service.

DOAJ is a community-curated resource supported by a diverse editorial team. Consider contributing to DOAJ if your institution benefits from its services.

### Support

For issues, feature requests, or questions about this Actor, please contact the developer or file an issue on the Actor's repository.

# Actor input Schema

## `query` (type: `string`):

Search query for journals. Use '\*' for all journals (default). Examples: 'medicine', 'cancer', 'bibjson.title:health', 'bibjson.keywords:AI'.

## `maxResults` (type: `integer`):

Maximum number of journal records to return. Set to 0 for unlimited (up to 23,000+). Default: 1000.

## Actor input object example

```json
{
  "query": "*",
  "maxResults": 1000
}
```
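
Validating the two input fields before submitting a run can catch mistakes early. The sketch below applies the defaults described above and the 0 to 25,000 bounds that the OpenAPI specification in this document gives for `maxResults`. `make_input` is a hypothetical helper, and whether the platform rejects out-of-range values on its own is not covered here.

```python
def make_input(query: str = "*", max_results: int = 1000) -> dict:
    """Build an Actor input object, enforcing the documented bounds."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(max_results, bool) or not isinstance(max_results, int):
        raise TypeError("max_results must be an integer")
    if not 0 <= max_results <= 25000:
        raise ValueError("max_results must be between 0 and 25000")
    # Fall back to the match-all query when the query is blank.
    return {"query": query.strip() or "*", "maxResults": max_results}


if __name__ == "__main__":
    print(make_input("medicine", 500))
```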

# Actor output Schema

## `dataset` (type: `string`):

No description

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {};

// Run the Actor and wait for it to finish
const run = await client.actor("compute-edge/doaj-open-journals-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {}

# Run the Actor and wait for it to finish
run = client.actor("compute-edge/doaj-open-journals-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```
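
The example above runs the Actor with an empty input, which falls back to the defaults. A variant with an explicit `query` and `maxResults` might look like the sketch below. It assumes the token is stored in an `APIFY_TOKEN` environment variable (a naming choice made for this example), and `fetch_journals` is a hypothetical wrapper around the same client calls.

```python
from typing import Any, Iterator

ACTOR_ID = "compute-edge/doaj-open-journals-scraper"


def fetch_journals(client: Any, query: str = "*", max_results: int = 1000) -> Iterator[dict]:
    """Run the Actor with explicit input and yield its dataset items."""
    # Because this is a generator, the run starts on first iteration.
    run = client.actor(ACTOR_ID).call(run_input={"query": query, "maxResults": max_results})
    if run is None:
        raise RuntimeError("Actor run did not return run details")
    yield from client.dataset(run["defaultDatasetId"]).iterate_items()


if __name__ == "__main__":
    import os

    from apify_client import ApifyClient  # pip install apify-client

    # Assumption: the API token is provided via the APIFY_TOKEN variable.
    client = ApifyClient(os.environ["APIFY_TOKEN"])
    for journal in fetch_journals(client, query="bibjson.title:health", max_results=50):
        print(journal["title"], journal.get("eissn"))
```

Passing the client in as an argument keeps the function easy to exercise with a stub in tests.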

## CLI example

```bash
echo '{}' |
apify call compute-edge/doaj-open-journals-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=compute-edge/doaj-open-journals-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "DOAJ Open Access Journals Scraper — Directory API",
        "description": "Extract open-access journals from DOAJ (Directory of Open Access Journals), 23k+ vetted journals. Filter by search query. Returns title, publisher, country, ISSNs, subjects, languages, homepage, APC fees, license, and peer-review process for academic and publishing intelligence.",
        "version": "0.1",
        "x-build-id": "ENi3tTsL2IUH6MUPK"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/compute-edge~doaj-open-journals-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-compute-edge-doaj-open-journals-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/compute-edge~doaj-open-journals-scraper/runs": {
            "post": {
                "operationId": "runs-sync-compute-edge-doaj-open-journals-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/compute-edge~doaj-open-journals-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-compute-edge-doaj-open-journals-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "properties": {
                    "query": {
                        "title": "Search Query",
                        "type": "string",
                        "description": "Search query for journals. Use '*' for all journals (default). Examples: 'medicine', 'cancer', 'bibjson.title:health', 'bibjson.keywords:AI'.",
                        "default": "*"
                    },
                    "maxResults": {
                        "title": "Max Results",
                        "minimum": 0,
                        "maximum": 25000,
                        "type": "integer",
                        "description": "Maximum number of journal records to return. Set to 0 for unlimited (up to 23,000+). Default: 1000.",
                        "default": 1000
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
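
For projects without a client library, the `run-sync-get-dataset-items` path from the specification above can be called directly. The following is a minimal sketch using only the Python standard library. It assumes the response body is a JSON array of dataset items and that the run finishes within the chosen timeout. `build_request` and `run_sync` are hypothetical helper names.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://api.apify.com/v2"
ACTOR_PATH = "/acts/compute-edge~doaj-open-journals-scraper/run-sync-get-dataset-items"


def build_request(token: str, actor_input: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request described by the spec."""
    url = f"{BASE_URL}{ACTOR_PATH}?{urllib.parse.urlencode({'token': token})}"
    body = json.dumps(actor_input).encode("utf-8")
    return urllib.request.Request(
        url, data=body, method="POST", headers={"Content-Type": "application/json"}
    )


def run_sync(token: str, actor_input: dict, timeout: float = 300.0) -> list:
    """Send the request and decode the response body as JSON."""
    # Assumption: the body is a JSON array of dataset items.
    with urllib.request.urlopen(build_request(token, actor_input), timeout=timeout) as resp:
        return json.load(resp)


if __name__ == "__main__":
    req = build_request("<YOUR_API_TOKEN>", {"query": "medicine", "maxResults": 5})
    print(req.get_method(), req.full_url)
```

Building the request separately from sending it makes the URL and payload easy to inspect before any run is charged.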
