# Open Science Evidence Finder (`mrbridge/open-science-evidence-finder`) Actor

Find, verify, deduplicate, and score open scientific metadata from OpenAlex, Crossref, arXiv, and Europe PMC for LLM source grounding.

- **URL**: https://apify.com/mrbridge/open-science-evidence-finder.md
- **Developed by:** [MrBridge](https://apify.com/mrbridge) (community)
- **Categories:** AI, MCP servers, Developer tools
- **Stats:** 1 total users, 1 monthly users, 0.0% runs succeeded, 0 bookmarks
- **User rating**: No ratings yet

## Pricing

from $2.00 / 1,000 tool-reads

This Actor is paid per event and usage. You are charged both the fixed price for specific events and for Apify platform usage.
Because this Actor supports Apify Store discounts, the price decreases on higher subscription plans.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event

## What's an Apify Actor?

Actors are software tools running on the Apify platform for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action that can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server that can be used as a website, API, or MCP server.
Actors are written with a capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## Open Science Evidence Finder

Open Science Evidence Finder is an Apify Standby MCP server that retrieves scientific metadata from OpenAlex, Crossref, arXiv, and Europe PMC, then normalizes, deduplicates, scores, and stores evidence candidates for LLM source grounding. It exposes MCP tools for literature search and DOI verification; it does not call an LLM.

Ask for "recent papers about urban heat waves and mortality" or paste a DOI such as `10.1038/s41586-020-2649-2` and get structured metadata back: identifiers, title, authors, venue, publication date, open-access status, source provenance, warnings, and heuristic scores. More free tools and studies are available at [mr-bridge.com](https://mr-bridge.com).

### What is Open Science Evidence Finder?

Open Science Evidence Finder is a metadata retrieval Actor for evidence discovery, DOI verification, literature discovery, RAG source grounding, and open-access paper discovery. It queries official open APIs and returns normalized `EvidenceItem` objects in the default dataset, plus a compact run summary in the key-value store.

It is useful when an LLM needs sourced candidate papers before writing a literature summary, but it should not be treated as a systematic review engine or a scientific-quality judge.

#### Sources used

* **OpenAlex** - primary source for works, identifiers, authorship, topics, citations, and open-access metadata.
* **Crossref** - DOI verification and bibliographic enrichment.
* **arXiv** - recent preprints and category metadata.
* **Europe PMC** - biomedical and life-science publications, PMIDs, PMCIDs, abstracts, and open-access flags.

Thank you to arXiv for use of its open access interoperability. This project is not endorsed by arXiv.

### How to use Open Science Evidence Finder

1. Create a free [Apify account](https://apify.com/sign-up?fpr=mrbridge).
2. Start this Actor in Standby mode on Apify. The Input tab is only for MCP connection instructions; there is no research query field there.
3. Connect your LLM client to `https://mrbridge--open-science-evidence-finder.apify.actor/mcp?token=YOUR_APIFY_TOKEN`.
4. Ask the LLM for literature discovery or DOI verification. The LLM calls the MCP tools with the right arguments.
5. Read the returned tool JSON directly in the LLM, or inspect the default dataset plus the `RUN_SUMMARY` and `OUTPUT` records in Apify storage.

### MCP connection

The Actor exposes Streamable HTTP MCP at `/mcp`:

```text
https://mrbridge--open-science-evidence-finder.apify.actor/mcp?token=YOUR_APIFY_TOKEN
````

Client setup:

- **Claude Desktop** -> Settings -> Connectors -> Add custom connector -> paste the URL.
- **ChatGPT** (Plus / Pro / Team / Enterprise) -> Settings -> Connectors -> enable Developer mode -> Add custom connector -> paste the URL.
- **Apify Universal MCP** -> add `https://mcp.apify.com?tools=mrbridge/open-science-evidence-finder` to your existing config.
- **Other MCP clients** -> use `/mcp` over Streamable HTTP with `Authorization: Bearer YOUR_APIFY_TOKEN` or the `?token=` query parameter.

The Input tab intentionally has no query, DOI, sorting, language, author, or result fields. Those belong to the MCP tool calls made by the connected LLM.
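
For clients that are configured in code rather than through a settings screen, the two authentication styles listed above can be sketched as follows. This is a minimal illustration: the helper name `mcp_connection` and its signature are assumptions made for this example, not part of the Actor or of any MCP SDK.

```python
from urllib.parse import urlencode

# Public Standby endpoint quoted in this README.
MCP_ENDPOINT = "https://mrbridge--open-science-evidence-finder.apify.actor/mcp"


def mcp_connection(token: str, use_query_token: bool = False) -> tuple[str, dict[str, str]]:
    """Return (url, headers) for one of the two authentication styles.

    Hypothetical helper: the name and signature are illustrative only.
    """
    if not token:
        raise ValueError("An Apify API token is required")
    if use_query_token:
        # Style 1: token passed in the `?token=` query parameter.
        return f"{MCP_ENDPOINT}?{urlencode({'token': token})}", {}
    # Style 2: token passed in the Authorization header, which keeps it out of the URL.
    return MCP_ENDPOINT, {"Authorization": f"Bearer {token}"}
```

Pass the returned URL and headers to whichever Streamable HTTP MCP client you use. The header style is usually the safer default, because URLs tend to end up in logs and shell history.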

### MCP tools

#### `find_scientific_evidence`

Finds, normalizes, deduplicates, and scores open scientific metadata for a natural-language research request.

Tool arguments:

```json
{
  "query": "retrieval augmented generation evaluation benchmarks since 2021",
  "maxResults": 10
}
```
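
MCP clients normally build the request for you, but it can help to see what travels over the wire. MCP tool calls are JSON-RPC 2.0 messages with the method `tools/call`. The sketch below only builds that message; the `initialize` handshake and the HTTP transport are left out, and the helper name `build_tool_call` is an assumption made for this example.

```python
import itertools
import json

# JSON-RPC request ids should be unique within a session.
_request_ids = itertools.count(1)


def build_tool_call(name: str, arguments: dict) -> dict:
    """Build an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


request = build_tool_call(
    "find_scientific_evidence",
    {
        "query": "retrieval augmented generation evaluation benchmarks since 2021",
        "maxResults": 10,
    },
)
print(json.dumps(request, indent=2))
```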

#### `verify_scientific_doi`

Verifies one DOI and returns normalized metadata enriched from OpenAlex, Crossref, and Europe PMC.

Tool arguments:

```json
{
  "doi": "10.1038/s41586-020-2649-2"
}
```
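
Users often paste DOIs as resolver URLs or with a `doi:` prefix. Whether the Actor accepts those forms is not documented here, so a client may prefer to send a bare DOI. The sketch below is one way to do that. The helper `normalize_doi` is hypothetical, and the pattern is the one Crossref recommends for modern DOIs, so rare legacy DOIs may be rejected.

```python
import re

# Pattern recommended by Crossref for modern DOIs; some legacy DOIs do not match it.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/[-._;()/:A-Z0-9]+$", re.IGNORECASE)
DOI_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


def normalize_doi(raw: str) -> str:
    """Strip common resolver prefixes and lowercase the DOI, or raise ValueError."""
    doi = raw.strip()
    lowered = doi.lower()
    for prefix in DOI_PREFIXES:
        if lowered.startswith(prefix):
            doi = doi[len(prefix):]
            break
    # DOIs are case-insensitive, so lowercasing is safe for comparison and lookup.
    doi = doi.strip().lower()
    if not DOI_PATTERN.match(doi):
        raise ValueError(f"Not a recognizable DOI: {raw!r}")
    return doi
```

The result can be passed as the `doi` argument of `verify_scientific_doi`.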

For search requests, the Actor queries OpenAlex, Crossref, arXiv, and Europe PMC automatically. `OPENALEX_API_KEY` and `CROSSREF_MAILTO` can be supplied as environment variables when needed.

### Standby mode and monetization

This server runs in Apify Standby mode as a hosted Streamable HTTP MCP server. Warm requests are suitable for conversational use; the first request after inactivity may take longer while Apify starts the container.

This Actor is designed for Apify pay-per-event monetization. Check the [Pricing tab](https://apify.com/mrbridge/open-science-evidence-finder/pricing?fpr=mrbridge) for the live configuration on your Apify plan.

Recommended pay-per-event setup in Apify Console:

| Event name | What it bills |
|------------|---------------|
| `tool-read` | Data retrieval calls such as `find_scientific_evidence` ($0.003/call) |
| `tool-match` | Metadata enrichment calls reserved for multi-record publication enrichment ($0.005/call) |
| `tool-analysis` | Evidence verification calls such as `verify_scientific_doi` ($0.015/call) |
| `apify-actor-start` | Actor startup at Apify's default low price |

Do not enable `apify-default-dataset-item`; tool calls already charge by MCP event, and EvidenceItems are stored in the dataset for observability. Charging both the tool event and dataset rows would double-bill the same user action.
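
To get a rough idea of event charges for a workload, the per-call prices from the table can be multiplied out. This is a sketch under two assumptions: the prices below are copied from the table and may differ from the live Pricing tab, and the `apify-actor-start` event and platform usage are left out because their prices are not stated here.

```python
from decimal import Decimal

# Per-call prices copied from the table above; the live Pricing tab may differ.
EVENT_PRICES_USD = {
    "tool-read": Decimal("0.003"),
    "tool-match": Decimal("0.005"),
    "tool-analysis": Decimal("0.015"),
}


def estimate_event_cost(calls: dict[str, int]) -> Decimal:
    """Sum the fixed event charges for a mapping of event name to call count."""
    total = Decimal("0")
    for event, count in calls.items():
        if event not in EVENT_PRICES_USD:
            raise KeyError(f"No price listed for event {event!r}")
        if count < 0:
            raise ValueError("Call counts cannot be negative")
        total += EVENT_PRICES_USD[event] * count
    return total


# 200 searches and 40 DOI verifications: 200 * 0.003 + 40 * 0.015 USD.
print(estimate_event_cost({"tool-read": 200, "tool-analysis": 40}))
```

`Decimal` is used instead of `float` so that sums of small prices do not pick up binary rounding noise.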

### Output

Each dataset row is a normalized `EvidenceItem`:

```json
{
  "itemType": "work",
  "title": "Example scientific work",
  "normalizedTitle": "example scientific work",
  "doi": "10.1000/example",
  "ids": {
    "openalex": "https://openalex.org/W123",
    "doi": "10.1000/example",
    "pmid": null,
    "pmcid": null,
    "arxiv": null,
    "crossref": "10.1000/example"
  },
  "publicationYear": 2024,
  "publicationDate": "2024-01-15",
  "workType": "journal-article",
  "venue": {
    "name": "Example Journal",
    "issn": ["1234-5678"],
    "publisher": "Example Publisher"
  },
  "authors": [
    {
      "name": "A. Researcher",
      "orcid": null,
      "institutions": []
    }
  ],
  "abstract": "Short abstract when available and allowed.",
  "abstractSource": "crossref",
  "abstractTruncated": false,
  "openAccess": {
    "isOpenAccess": true,
    "oaStatus": "gold",
    "url": "https://example.org/work",
    "license": "cc-by"
  },
  "metrics": {
    "citedByCount": 42,
    "referencedWorksCount": null
  },
  "topics": ["information retrieval"],
  "keywords": ["retrieval"],
  "sourceCoverage": {
    "openalex": true,
    "crossref": true,
    "arxiv": false,
    "europepmc": false
  },
  "sourceRecords": [
    {
      "source": "openalex",
      "recordId": "https://openalex.org/W123",
      "recordUrl": "https://openalex.org/W123",
      "apiUrl": "https://api.openalex.org/works?...",
      "retrievedAt": "2026-06-24T00:00:00.000Z"
    }
  ],
  "scores": {
    "relevanceScore": 0.91,
    "evidenceScore": 0.84,
    "recencyScore": 1,
    "metadataCompletenessScore": 0.88
  },
  "warnings": [],
  "raw": null
}
```
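
A consumer of these rows typically filters and formats them before handing them to an LLM. The sketch below works on plain dictionaries shaped like the `EvidenceItem` above. The function names, the `0.5` threshold, and the citation format are assumptions made for this example, not behavior of the Actor.

```python
from typing import Any


def evidence_score(item: dict[str, Any]) -> float:
    """Read `scores.evidenceScore`, treating missing or null values as 0."""
    return float((item.get("scores") or {}).get("evidenceScore") or 0.0)


def open_access_candidates(items: list[dict[str, Any]], min_evidence: float = 0.5) -> list[dict[str, Any]]:
    """Keep open-access items at or above a score threshold, best first.

    The 0.5 default is an arbitrary illustration, not a recommended cutoff.
    """
    kept = [
        item
        for item in items
        if (item.get("openAccess") or {}).get("isOpenAccess") and evidence_score(item) >= min_evidence
    ]
    return sorted(kept, key=evidence_score, reverse=True)


def cite(item: dict[str, Any]) -> str:
    """Format a compact one-line reference from an EvidenceItem."""
    authors = [a["name"] for a in (item.get("authors") or []) if a.get("name")]
    if not authors:
        lead = "Unknown author"
    elif len(authors) == 1:
        lead = authors[0]
    else:
        lead = f"{authors[0]} et al."
    year = item.get("publicationYear") or "n.d."
    parts = [f"{lead} ({year})", item.get("title") or "Untitled"]
    venue = (item.get("venue") or {}).get("name")
    if venue:
        parts.append(venue)
    if item.get("doi"):
        parts.append(f"https://doi.org/{item['doi']}")
    return ". ".join(parts)
```

Because many fields can be `null`, every lookup falls back to a default instead of assuming the key is populated.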

The Actor also writes:

- `RUN_SUMMARY` - query, DOI, requested/succeeded/failed sources, raw and deduplicated counts, warnings, and up to five top items without raw records.
- `OUTPUT` - MCP-friendly object containing `resultsCount`, compact `results`, and the same summary.

### Scoring

Scores are deterministic metadata heuristics between 0 and 1:

- `relevanceScore` combines source rank, query-term matches in title/abstract/topics/keywords, and exact DOI match when DOI mode is used.
- `recencyScore` favors recent publications while keeping older works eligible.
- `metadataCompletenessScore` checks DOI, date, authors, venue, abstract availability, OA URL, identifiers, and provenance.
- `evidenceScore` combines relevance, completeness, log-scaled citations, source coverage, and recency.

These scores are not measures of scientific quality, causal validity, peer-review rigor, consensus, or medical/legal reliability.
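
To make the idea of a combined heuristic concrete, here is one way such a score could be assembled from the five inputs named above. It is not the Actor's implementation: the weights, the citation cap, and every function name are hypothetical, and the result will not reproduce the scores the Actor returns.

```python
import math

# Hypothetical weights for illustration; the Actor's real weights are not documented here.
WEIGHTS = {
    "relevance": 0.40,
    "completeness": 0.20,
    "citations": 0.15,
    "coverage": 0.15,
    "recency": 0.10,
}
CITATION_CAP = 1000  # assumed saturation point for the log-scaled citation term


def citation_component(cited_by_count: int) -> float:
    """Map a citation count to [0, 1] on a log scale, saturating at CITATION_CAP."""
    return min(1.0, math.log1p(max(0, cited_by_count)) / math.log1p(CITATION_CAP))


def coverage_component(source_coverage: dict[str, bool]) -> float:
    """Fraction of sources that returned a record for this work."""
    if not source_coverage:
        return 0.0
    return sum(source_coverage.values()) / len(source_coverage)


def combined_score(
    relevance: float,
    completeness: float,
    cited_by_count: int,
    source_coverage: dict[str, bool],
    recency: float,
) -> float:
    """Weighted sum of the five components, clamped to [0, 1]."""
    parts = {
        "relevance": relevance,
        "completeness": completeness,
        "citations": citation_component(cited_by_count),
        "coverage": coverage_component(source_coverage),
        "recency": recency,
    }
    score = sum(WEIGHTS[name] * parts[name] for name in WEIGHTS)
    return round(min(1.0, max(0.0, score)), 2)
```

The log scale keeps a handful of very highly cited works from dominating the ranking: going from 0 to 10 citations moves the citation term far more than going from 500 to 510.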

### Limits and responsible use

Metadata may be incomplete, stale, duplicated, or inconsistent across sources. This Actor is not a systematic review, not medical advice, and not legal advice.

The Actor does not download PDFs or full text by default and does not store long copyrighted content. Abstracts are handled conservatively: arXiv and Europe PMC abstracts are used when returned by the API, Crossref abstracts are cleaned and truncated when present, and OpenAlex inverted-index abstracts are not reconstructed unless explicitly requested.
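
Crossref abstracts usually arrive wrapped in JATS markup such as `<jats:p>`. The cleaning and truncation step could look roughly like the sketch below. The 600-character limit and the function name are assumptions; the Actor's actual limit is not documented here. The returned flag plays the same role as `abstractTruncated` in the output.

```python
import html
import re

TAG_PATTERN = re.compile(r"<[^>]+>")  # matches JATS/XML tags such as <jats:p>
MAX_ABSTRACT_CHARS = 600  # hypothetical limit; the Actor's actual limit is not documented


def clean_abstract(raw: str, limit: int = MAX_ABSTRACT_CHARS) -> tuple[str, bool]:
    """Return (plain_text, truncated) for a Crossref-style abstract."""
    text = TAG_PATTERN.sub(" ", raw)  # drop markup, leaving a space where a tag was
    text = html.unescape(text)  # decode entities such as &amp;
    text = re.sub(r"\s+", " ", text).strip()  # collapse runs of whitespace
    if len(text) <= limit:
        return text, False
    # Cut at the last word boundary inside the limit, then mark the cut.
    cut = text[:limit].rsplit(" ", 1)[0]
    return cut + "...", True
```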

The Actor uses official APIs and includes retries for HTTP 429 and 5xx responses. Keep result limits reasonable and respect each source's API terms, rate limits, attribution expectations, and robots or reuse policies where applicable.
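
If you call the same source APIs from your own code, a comparable retry policy can be sketched as follows. This is an illustration, not the Actor's code: `fetch` is a hypothetical callable returning `(status_code, body)`, the attempt count and delays are assumed values, and honoring a `Retry-After` header is left out for brevity.

```python
import random
import time
from typing import Callable

RETRYABLE = {429} | set(range(500, 600))  # rate limiting and server errors


def fetch_with_retries(
    fetch: Callable[[], tuple[int, str]],
    max_attempts: int = 4,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call `fetch` until it succeeds, fails permanently, or attempts run out.

    `fetch` is a hypothetical callable returning (status_code, body).
    """
    for attempt in range(1, max_attempts + 1):
        status, body = fetch()
        if status < 400:
            return body
        if status not in RETRYABLE or attempt == max_attempts:
            raise RuntimeError(f"Request failed with HTTP {status} after {attempt} attempt(s)")
        # Exponential backoff with jitter: roughly 0.5 s, 1 s, 2 s, ...
        sleep(base_delay * 2 ** (attempt - 1) + random.uniform(0, base_delay / 2))
    raise RuntimeError("max_attempts must be at least 1")
```

Client errors such as 404 are raised immediately, since repeating the same request would not change the outcome. The `sleep` parameter exists so the delay can be replaced in tests.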

### FAQ

#### Can I use this Actor via MCP?

Yes. This Actor is now a Standby MCP server. Connect your LLM client to `/mcp`; the available tools are `find_scientific_evidence` and `verify_scientific_doi`.

#### Does it call an LLM?

No. It only retrieves, normalizes, deduplicates, scores, and stores source metadata. The consuming LLM should perform the final synthesis.

#### What happens if one source fails?

The run continues if at least one requested source succeeds. Source failures are logged and included in `RUN_SUMMARY.warnings`.
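
The partial-failure behavior described here can be pictured with the sketch below. It is not the Actor's code: the function name, the `sources` mapping of callables, and the summary field names are hypothetical stand-ins for whatever the Actor uses internally.

```python
from typing import Callable


def query_sources(query: str, sources: dict[str, Callable[[str], list[dict]]]) -> dict:
    """Query every source, keep partial results, and record failures as warnings."""
    items: list[dict] = []
    succeeded: list[str] = []
    failed: list[str] = []
    warnings: list[str] = []
    for name, search in sources.items():
        try:
            results = search(query)
        except Exception as exc:  # one failing source must not abort the others
            failed.append(name)
            warnings.append(f"{name} failed: {exc}")
            continue
        succeeded.append(name)
        items.extend(results)
    if not succeeded:
        # Nothing usable came back, so the whole request is reported as failed.
        raise RuntimeError("All requested sources failed: " + "; ".join(warnings))
    return {
        "items": items,
        "succeededSources": succeeded,
        "failedSources": failed,
        "warnings": warnings,
    }
```

A consuming LLM should check the warnings before treating an empty or short result list as evidence that little literature exists.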

#### Your feedback

Report bugs, source mapping issues, or feature requests in the Actor's Issues tab on Apify.

# Actor input Schema

## Actor input object example

```json
{}
```

# Actor output Schema

## `mcpEndpoint` (type: `string`):

Streamable HTTP MCP endpoint for this Standby run. Use the public Actor endpoint for normal MCP clients.

## `results` (type: `string`):

Complete normalized EvidenceItem records from the default dataset.

## `overview` (type: `string`):

Compact dataset view with title, DOI, year, venue, access status, citations, and scores.

## `provenance` (type: `string`):

Dataset view focused on source coverage, source records, and warnings.

## `summary` (type: `string`):

Structured `RUN_SUMMARY` record with source status, counts, warnings, and up to five top evidence items.

## `output` (type: `string`):

Default OUTPUT key-value store record for integrations that expect a single JSON output pointer.

## `runUrl` (type: `string`):

Link to the public Actor run details page.

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {};

// Run the Actor and wait for it to finish
const run = await client.actor("mrbridge/open-science-evidence-finder").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {}

# Run the Actor and wait for it to finish
run = client.actor("mrbridge/open-science-evidence-finder").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{}' |
apify call mrbridge/open-science-evidence-finder --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=mrbridge/open-science-evidence-finder",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Open Science Evidence Finder",
        "description": "Find, verify, deduplicate, and score open scientific metadata from OpenAlex, Crossref, arXiv, and Europe PMC for LLM source grounding.",
        "version": "0.1",
        "x-build-id": "aeJp8VIQUVVp1R4JO"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/mrbridge~open-science-evidence-finder/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-mrbridge-open-science-evidence-finder",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/mrbridge~open-science-evidence-finder/runs": {
            "post": {
                "operationId": "runs-sync-mrbridge-open-science-evidence-finder",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/mrbridge~open-science-evidence-finder/run-sync": {
            "post": {
                "operationId": "run-sync-mrbridge-open-science-evidence-finder",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "properties": {}
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
