# PLOS Open Access Journals Scraper (`crawlerbros/plos-one-scraper`) Actor

Search and browse PLOS (Public Library of Science) open-access academic journals - PLOS ONE, PLOS Biology, PLOS Medicine, PLOS Genetics, PLOS Computational Biology, PLOS Pathogens, and PLOS Neglected Tropical Diseases. No API key required.

- **URL**: https://apify.com/crawlerbros/plos-one-scraper.md
- **Developed by:** [Crawler Bros](https://apify.com/crawlerbros) (community)
- **Categories:** Agents, Automation, Developer tools
- **Stats:** 2 total users, 1 monthly user, 100.0% runs succeeded, 0 bookmarks
- **User rating**: No ratings yet

## Pricing

from $3.00 / 1,000 results

This Actor is paid per event and usage. You are charged a fixed price for specific events as well as for Apify platform usage.
Because this Actor supports Apify Store discounts, the price decreases on higher subscription plans.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event
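As a rough illustration of the arithmetic, the sketch below estimates the per-result charge at the listed starting price. It is a minimal sketch, assuming the $3.00 per 1,000 results rate applies unchanged; platform usage fees and plan discounts are deliberately left out, and the function name is hypothetical.

```python
from decimal import Decimal

# Assumption: the listed starting price of $3.00 per 1,000 results applies.
# Platform usage fees and subscription discounts are NOT modelled here.
PRICE_PER_1000_RESULTS = Decimal("3.00")


def estimate_result_cost(result_count: int) -> Decimal:
    """Return the estimated per-result charge in USD, rounded to cents."""
    if result_count < 0:
        raise ValueError("result_count must be non-negative")
    cost = PRICE_PER_1000_RESULTS * result_count / 1000
    return cost.quantize(Decimal("0.01"))


print(estimate_result_cost(100))     # cost of 100 results
print(estimate_result_cost(10_000))  # cost of 10,000 results
```

The actual invoice may differ, since platform usage is billed on top of the per-result events.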

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with a capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## PLOS Open Access Journals Scraper

Search and browse academic articles from **PLOS (Public Library of Science)**, one of the world's largest publishers of fully open-access peer-reviewed research. This Actor uses the official PLOS Solr search API, which is free and requires no API key.

**Supported Journals:**
- PLOS ONE
- PLOS Biology
- PLOS Medicine
- PLOS Genetics
- PLOS Computational Biology
- PLOS Pathogens
- PLOS Neglected Tropical Diseases

---

### What You Can Do

- **Keyword search** across 400,000+ articles in all PLOS journals
- **Browse by journal** to get the latest articles from a specific PLOS publication
- **Filter by article type** — Research Articles, Reviews, Meta-Analyses, Perspectives, and more
- **Filter by date range** to narrow results to a publication window
- **Paginate** through large result sets up to 10,000 articles per run

---

### Input

| Field | Type | Description |
|---|---|---|
| `mode` | Select | `searchArticles` — keyword search; `browseJournal` — list articles from one journal |
| `query` | String | Search terms (e.g. `machine learning`, `title:CRISPR`, `abstract:vaccine`). Supports Solr syntax. |
| `journal` | Select | Filter to a specific PLOS journal (default: PLOS ONE) |
| `articleType` | Select | Filter by article type (Research Article, Review, Meta-Analysis, etc.) |
| `fromDate` | String | Earliest publication date in `YYYY-MM-DD` format (e.g. `2023-01-01`) |
| `toDate` | String | Latest publication date in `YYYY-MM-DD` format (e.g. `2024-12-31`) |
| `maxItems` | Integer | Max articles to return (default: 100, max: 10,000) |

#### Example Inputs

**Search for machine learning articles in PLOS Computational Biology:**
```json
{
  "mode": "searchArticles",
  "query": "machine learning",
  "journal": "PLOS Computational Biology",
  "articleType": "Research Article",
  "maxItems": 50
}
````

**Browse recent PLOS Medicine articles:**

```json
{
  "mode": "browseJournal",
  "journal": "PLOS Medicine",
  "maxItems": 100
}
```

**Search for COVID-19 papers published in 2023:**

```json
{
  "mode": "searchArticles",
  "query": "COVID-19 vaccine",
  "fromDate": "2023-01-01",
  "toDate": "2023-12-31",
  "maxItems": 200
}
```
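Inputs like the ones above can be assembled and checked before a run is started. The following is a minimal sketch, assuming the constraints listed in the input table (allowed modes and journals, `YYYY-MM-DD` dates, `maxItems` between 1 and 10,000); the helper name `build_input` and its parameter names are hypothetical and not part of the Actor.

```python
from datetime import date
from typing import Any, Optional

MODES = {"searchArticles", "browseJournal"}
JOURNALS = {
    "PLOS ONE", "PLOS Biology", "PLOS Medicine", "PLOS Genetics",
    "PLOS Computational Biology", "PLOS Pathogens",
    "PLOS Neglected Tropical Diseases",
}


def _parse_day(label: str, value: str) -> date:
    """Parse a strict YYYY-MM-DD string or raise ValueError."""
    parsed = date.fromisoformat(value)
    if parsed.isoformat() != value:
        raise ValueError(f"{label} must use YYYY-MM-DD format: {value!r}")
    return parsed


def build_input(
    mode: str,
    query: Optional[str] = None,
    journal: Optional[str] = None,
    article_type: Optional[str] = None,
    from_date: Optional[str] = None,
    to_date: Optional[str] = None,
    max_items: int = 100,
) -> dict[str, Any]:
    """Build an Actor input object, rejecting values the table rules out."""
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    if journal is not None and journal not in JOURNALS:
        raise ValueError(f"unsupported journal: {journal!r}")
    if not 1 <= max_items <= 10_000:
        raise ValueError("maxItems must be between 1 and 10,000")
    start = _parse_day("fromDate", from_date) if from_date else None
    end = _parse_day("toDate", to_date) if to_date else None
    if start and end and start > end:
        raise ValueError("fromDate must not be later than toDate")

    payload: dict[str, Any] = {"mode": mode, "maxItems": max_items}
    optional = {
        "query": query,
        "journal": journal,
        "articleType": article_type,
        "fromDate": from_date,
        "toDate": to_date,
    }
    # Omit unset fields so the Actor's own defaults apply.
    payload.update({key: value for key, value in optional.items() if value})
    return payload
```

Calling `build_input("searchArticles", query="COVID-19 vaccine", from_date="2023-01-01", to_date="2023-12-31", max_items=200)` reproduces the third example input.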

---

### Output

Each record in the dataset contains:

| Field | Type | Description |
|---|---|---|
| `doi` | String | Digital Object Identifier, e.g. `10.1371/journal.pone.0123456` |
| `title` | String | Article title |
| `authors` | Array | Author names, e.g. `["Alice Smith", "Bob Jones"]` |
| `journal` | String | Journal name, e.g. `PLOS ONE` |
| `publication_date` | String | Publication date in `YYYY-MM-DD` format |
| `article_type` | String | Article type, e.g. `Research Article` |
| `subjects` | Array | Subject category paths, e.g. `["/Science/Biology/Genetics"]` |
| `abstract` | String | Article abstract text |
| `url` | String | Canonical DOI URL, e.g. `https://doi.org/10.1371/journal.pone.0123456` |
| `plos_url` | String | PLOS journal page URL, e.g. `https://journals.plos.org/plosone/article?id=...` |
| `scrapedAt` | String | ISO 8601 timestamp of when the record was scraped |

#### Example Output Record

```json
{
  "doi": "10.1371/journal.pone.0123456",
  "title": "Genomic Analysis of SARS-CoV-2 Variants",
  "authors": ["Alice Smith", "Bob Jones", "Carol White"],
  "journal": "PLOS ONE",
  "publication_date": "2024-03-15",
  "article_type": "Research Article",
  "subjects": [
    "/Science/Biology/Genetics/Genomics",
    "/Medicine/Infectious Diseases/Viral Diseases/COVID 19"
  ],
  "abstract": "We performed whole-genome sequencing on 500 SARS-CoV-2 isolates...",
  "url": "https://doi.org/10.1371/journal.pone.0123456",
  "plos_url": "https://journals.plos.org/plosone/article?id=10.1371%2Fjournal.pone.0123456",
  "scrapedAt": "2026-06-04T10:00:00+00:00"
}
```
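For downstream processing it can help to convert each record into a typed object. The sketch below is one possible approach, assuming every field from the output table is present and non-null in the record (which this README does not guarantee); the `Article` class and both function names are hypothetical.

```python
from dataclasses import dataclass
from datetime import date, datetime
from typing import Any


@dataclass(frozen=True)
class Article:
    doi: str
    title: str
    authors: tuple[str, ...]
    journal: str
    publication_date: date
    article_type: str
    subjects: tuple[str, ...]
    abstract: str
    url: str
    plos_url: str
    scraped_at: datetime


def parse_article(record: dict[str, Any]) -> Article:
    """Convert one dataset record into an Article.

    Assumption: all documented fields are present; a missing key raises KeyError.
    """
    return Article(
        doi=record["doi"],
        title=record["title"],
        authors=tuple(record["authors"]),
        journal=record["journal"],
        publication_date=date.fromisoformat(record["publication_date"]),
        article_type=record["article_type"],
        subjects=tuple(record["subjects"]),
        abstract=record["abstract"],
        url=record["url"],
        plos_url=record["plos_url"],
        scraped_at=datetime.fromisoformat(record["scrapedAt"]),
    )


def top_level_subjects(article: Article) -> set[str]:
    """Return the first segment of each subject path, e.g. 'Science'."""
    return {
        path.strip("/").split("/")[0]
        for path in article.subjects
        if path.strip("/")
    }
```

Applied to the example record above, `top_level_subjects` would yield `Science` and `Medicine`.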

---

### Search Query Syntax

The `query` field supports full **Solr query syntax**:

| Example | Effect |
|---|---|
| `machine learning` | Full-text search for "machine learning" |
| `title:CRISPR` | Search only in article titles |
| `abstract:vaccine` | Search only in abstracts |
| `author:Smith` | Filter by author name |
| `*:*` | Match all articles |
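Field-scoped queries with more than one word are easy to get wrong: under the standard Solr query parser, an unquoted `title:machine learning` applies `title:` only to `machine`. The helpers below are a small sketch for building quoted, field-scoped phrases; the function names are hypothetical, and it is an assumption that the Actor forwards the `query` string to Solr unchanged.

```python
def field_phrase(field: str, phrase: str) -> str:
    """Return a field-scoped phrase query such as title:"machine learning".

    Inside a quoted phrase, backslashes and double quotes are escaped.
    """
    escaped = phrase.replace("\\", "\\\\").replace('"', '\\"')
    return f'{field}:"{escaped}"'


def all_of(*clauses: str) -> str:
    """Combine clauses so that every one of them must match (Solr AND)."""
    if not clauses:
        raise ValueError("at least one clause is required")
    return " AND ".join(f"({clause})" for clause in clauses)


query = all_of(
    field_phrase("title", "machine learning"),
    field_phrase("author", "Smith"),
)
print(query)
```

The resulting string can be passed as the `query` input field.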

---

### Use Cases

- **Literature reviews** — Gather all Research Articles on a topic for systematic review
- **Citation analysis** — Extract article metadata for bibliometric analysis
- **Research monitoring** — Track new publications in specific PLOS journals
- **Academic data pipelines** — Feed article metadata into knowledge graphs or RAG systems
- **Science journalism** — Find recent research on trending topics

---

### FAQ

**Do I need an API key?**
No. The PLOS Solr API is completely free and open. No registration, account, or API key is required.

**How many articles can I scrape?**
The PLOS API indexes over 400,000 articles. You can retrieve up to 10,000 per run using the `maxItems` parameter.

**How fresh is the data?**
The PLOS API returns live data. Articles are indexed shortly after publication.

**Can I search specific fields?**
Yes — the `query` field supports Solr syntax. Use `title:keyword` to search titles, `abstract:keyword` for abstracts, etc.

**Are all PLOS journals covered?**
The seven journals listed under Supported Journals are covered. These are the only values the `journal` input accepts.

**What article types are available?**
Research Articles, Reviews, Meta-Analyses, Perspectives, Corrections, Retractions, Editorials, Opinions, Essays, Primers, Community Pages, Software papers, and Methods & Resources articles.

**Will results include the full text?**
No. The Actor returns metadata and abstracts. Full text is available on the linked PLOS article page.
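Because each run returns at most 10,000 articles, one way to collect a larger result set is to split the publication window into disjoint date ranges and start one run per range. The sketch below only computes the ranges; it assumes `fromDate` and `toDate` are both inclusive, as the input descriptions state, and the function name is hypothetical. Whether a given window stays under the cap depends on the query and is not checked here.

```python
from datetime import date, timedelta


def split_date_range(start: str, end: str, days_per_window: int) -> list[tuple[str, str]]:
    """Split [start, end] into consecutive, non-overlapping inclusive windows."""
    first, last = date.fromisoformat(start), date.fromisoformat(end)
    if first > last:
        raise ValueError("start must not be later than end")
    if days_per_window < 1:
        raise ValueError("days_per_window must be at least 1")

    windows: list[tuple[str, str]] = []
    cursor = first
    while cursor <= last:
        # Each window covers days_per_window days, clipped to the end date.
        window_end = min(cursor + timedelta(days=days_per_window - 1), last)
        windows.append((cursor.isoformat(), window_end.isoformat()))
        cursor = window_end + timedelta(days=1)
    return windows


for from_date, to_date in split_date_range("2023-01-01", "2023-01-10", 4):
    print({"fromDate": from_date, "toDate": to_date})
```

Each pair can be merged into an otherwise identical input object, and the resulting datasets can be de-duplicated by `doi` if needed.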

---

### Data Source

This Actor uses the official **PLOS Search API** at `https://api.plos.org/search`, which is powered by Apache Solr. PLOS (Public Library of Science) is a non-profit publisher that provides fully open-access research articles at no cost. See [PLOS API documentation](https://api.plos.org/) for details.
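To give a feel for what a request to this endpoint might look like, the sketch below builds a search URL from standard Solr parameters (`q`, `fq`, `rows`, `start`, `wt`). How the Actor itself constructs its requests is not documented here, so this is an illustration only; the function name is hypothetical, and any filter strings passed in (for example a journal filter) rely on index field names that should be checked against the PLOS API documentation.

```python
from typing import Iterable
from urllib.parse import urlencode

PLOS_SEARCH_URL = "https://api.plos.org/search"


def build_plos_search_url(
    query: str,
    rows: int = 10,
    start: int = 0,
    filters: Iterable[str] = (),
) -> str:
    """Build a search URL using standard Solr request parameters.

    `filters` are raw Solr filter queries (fq); their field names are the
    caller's responsibility and are not validated here.
    """
    if rows < 0 or start < 0:
        raise ValueError("rows and start must be non-negative")
    params = [("q", query), ("rows", rows), ("start", start), ("wt", "json")]
    params.extend(("fq", filter_query) for filter_query in filters)
    return f"{PLOS_SEARCH_URL}?{urlencode(params)}"


print(build_plos_search_url("title:CRISPR", rows=5))
```

Paging through results would then amount to increasing `start` by `rows` on each request.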

# Actor input Schema

## `mode` (type: `string`):

Select what to fetch. 'searchArticles' performs a keyword search across all PLOS journals; 'browseJournal' lists recent articles from a specific PLOS journal.

## `query` (type: `string`):

Keyword query for article search (mode=searchArticles). Supports Solr syntax e.g. 'title:"machine learning"', 'abstract:CRISPR'. Use '*:*' to match all articles.

## `journal` (type: `string`):

PLOS journal to filter results or browse (mode=browseJournal). Defaults to PLOS ONE.

## `articleType` (type: `string`):

Filter results by article type. Leave blank for no filter.

## `fromDate` (type: `string`):

Filter articles published on or after this date (ISO format, e.g. '2023-01-01'). Leave blank for no lower bound.

## `toDate` (type: `string`):

Filter articles published on or before this date (ISO format, e.g. '2024-12-31'). Leave blank for no upper bound.

## `maxItems` (type: `integer`):

Maximum number of articles to return.

## Actor input object example

```json
{
  "mode": "searchArticles",
  "query": "open source software",
  "journal": "PLOS ONE",
  "articleType": "",
  "maxItems": 100
}
```

# Actor output Schema

## `articles` (type: `string`):

Dataset containing all scraped PLOS academic articles.

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "mode": "searchArticles",
    "query": "open source software",
    "maxItems": 100
};

// Run the Actor and wait for it to finish
const run = await client.actor("crawlerbros/plos-one-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "mode": "searchArticles",
    "query": "open source software",
    "maxItems": 100,
}

# Run the Actor and wait for it to finish
run = client.actor("crawlerbros/plos-one-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```
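In longer-lived scripts it is usually worth checking that the run actually succeeded before reading the dataset. The helper below is a minimal sketch using only the client calls shown above; the function name is hypothetical, and it assumes the run object is a dictionary whose `status` is `SUCCEEDED` on success.

```python
from typing import Any

ACTOR_ID = "crawlerbros/plos-one-scraper"


def run_and_collect(client: Any, run_input: dict[str, Any]) -> list[dict[str, Any]]:
    """Run the Actor, verify the run succeeded, and return all dataset items.

    `client` is expected to be an ApifyClient instance (or anything exposing
    the same .actor(...).call(...) and .dataset(...).iterate_items() calls).
    """
    run = client.actor(ACTOR_ID).call(run_input=run_input)
    if run is None:
        raise RuntimeError("The Actor run returned no run details")
    status = run.get("status")
    if status != "SUCCEEDED":
        raise RuntimeError(f"Actor run ended with status {status!r}")
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


# Example usage (requires `pip install apify-client` and a valid token;
# reading the token from an APIFY_TOKEN environment variable is an assumption):
#
#   import os
#   from apify_client import ApifyClient
#   client = ApifyClient(os.environ["APIFY_TOKEN"])
#   articles = run_and_collect(client, {"mode": "searchArticles", "query": "open source software"})
```

Keeping the token in an environment variable avoids committing it to source control.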

## CLI example

```bash
echo '{
  "mode": "searchArticles",
  "query": "open source software",
  "maxItems": 100
}' |
apify call crawlerbros/plos-one-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=crawlerbros/plos-one-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "PLOS Open Access Journals Scraper",
        "description": "Search and browse PLOS (Public Library of Science) open-access academic journals - PLOS ONE, PLOS Biology, PLOS Medicine, PLOS Genetics, PLOS Computational Biology, PLOS Pathogens, and PLOS Neglected Tropical Diseases. No API key required.",
        "version": "1.0",
        "x-build-id": "Tvf58ebiffoUPEaEQ"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/crawlerbros~plos-one-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-crawlerbros-plos-one-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/crawlerbros~plos-one-scraper/runs": {
            "post": {
                "operationId": "runs-sync-crawlerbros-plos-one-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/crawlerbros~plos-one-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-crawlerbros-plos-one-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "required": [
                    "mode"
                ],
                "properties": {
                    "mode": {
                        "title": "Mode",
                        "enum": [
                            "searchArticles",
                            "browseJournal"
                        ],
                        "type": "string",
                        "description": "Select what to fetch. 'searchArticles' performs a keyword search across all PLOS journals; 'browseJournal' lists recent articles from a specific PLOS journal.",
                        "default": "searchArticles"
                    },
                    "query": {
                        "title": "Search query",
                        "type": "string",
                        "description": "Keyword query for article search (mode=searchArticles). Supports Solr syntax e.g. 'title:machine learning', 'abstract:CRISPR'. Use '*:*' to match all articles.",
                        "default": "open source"
                    },
                    "journal": {
                        "title": "Journal",
                        "enum": [
                            "PLOS ONE",
                            "PLOS Biology",
                            "PLOS Medicine",
                            "PLOS Genetics",
                            "PLOS Computational Biology",
                            "PLOS Pathogens",
                            "PLOS Neglected Tropical Diseases"
                        ],
                        "type": "string",
                        "description": "PLOS journal to filter results or browse (mode=browseJournal). Defaults to PLOS ONE.",
                        "default": "PLOS ONE"
                    },
                    "articleType": {
                        "title": "Article type",
                        "enum": [
                            "",
                            "Research Article",
                            "Review",
                            "Meta-Analysis",
                            "Perspective",
                            "Correction",
                            "Retraction",
                            "Editorial",
                            "Opinion",
                            "Essay",
                            "Primer",
                            "Community Page",
                            "Software",
                            "Methods and Resources"
                        ],
                        "type": "string",
                        "description": "Filter results by article type. Leave blank for no filter.",
                        "default": ""
                    },
                    "fromDate": {
                        "title": "From date",
                        "type": "string",
                        "description": "Filter articles published on or after this date (ISO format, e.g. '2023-01-01'). Leave blank for no lower bound."
                    },
                    "toDate": {
                        "title": "To date",
                        "type": "string",
                        "description": "Filter articles published on or before this date (ISO format, e.g. '2024-12-31'). Leave blank for no upper bound."
                    },
                    "maxItems": {
                        "title": "Max items",
                        "minimum": 1,
                        "maximum": 10000,
                        "type": "integer",
                        "description": "Maximum number of articles to return.",
                        "default": 100
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
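The specification above can also be exercised without a client library. The sketch below prepares a POST request for the `run-sync-get-dataset-items` path using only the Python standard library; the function name is hypothetical, and the request is built but not sent.

```python
import json
from typing import Any
from urllib.parse import urlencode
from urllib.request import Request

API_BASE = "https://api.apify.com/v2"
SYNC_ITEMS_PATH = "/acts/crawlerbros~plos-one-scraper/run-sync-get-dataset-items"


def build_sync_request(token: str, run_input: dict[str, Any]) -> Request:
    """Prepare the POST request described by the OpenAPI spec (not sent here)."""
    if not token:
        raise ValueError("an Apify API token is required")
    # The spec defines `token` as a required query parameter.
    url = f"{API_BASE}{SYNC_ITEMS_PATH}?{urlencode({'token': token})}"
    body = json.dumps(run_input).encode("utf-8")
    return Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


request = build_sync_request("<YOUR_API_TOKEN>", {"mode": "searchArticles", "maxItems": 10})
print(request.get_method(), request.full_url)
```

Passing the prepared request to `urllib.request.urlopen` would start the run and wait for it to finish; according to the endpoint summary, the response then contains the dataset items.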
