OpenRouter Fusion MCP server

Pricing

Pay per usage

MCP server exposing the OpenRouter Fusion Router (multi-model deliberation) as a tool, via the apify/openrouter proxy. No API key required.

Developer

Yann Feunteun

Maintained by Community

OpenRouter Fusion MCP server

An Apify Actor running a Model Context Protocol server (Streamable HTTP transport) that exposes the OpenRouter Fusion Router as a single tool.

ask_fusion sends your prompt to a panel of frontier models in parallel; each panelist can search the web and cite sources, then a judge model fuses them into one answer plus a structured analysis — consensus, contradictions (attributed per model), unique insights, and blind spots. You get the fused answer, the cited source URLs, and every panelist's full response.

It reaches OpenRouter through the apify/openrouter proxy, so no OpenRouter API key is required — requests are authenticated and billed with your Apify token.

Why use it

Use it for hard, high-stakes, or ambiguous questions where one model is not enough: a single model can be confidently wrong, omit a key angle, or miss recent facts. Fusion runs several models, grounds them in live web sources, and surfaces exactly where they agree, disagree, and fall short — so you can trust the consensus and scrutinise the contradictions.

Tool: ask_fusion

| Input | Type | Default | Description |
| --- | --- | --- | --- |
| prompt | string | (required) | The question or task to deliberate on. |
| system | string | none | System instruction to steer the panel. |
| preset | "high" \| "budget" | "high" | Panel preset. high = frontier models (best, pricey). budget = cheaper/faster panel with a frontier judge. |
| analysisModels | string[] | preset panel | Override the panel with explicit OpenRouter model slugs (≥ 2; fewer is ignored). Beats preset. |
| maxToolCalls | number | 8 (OpenRouter default) | Max web search/fetch iterations per panelist (clamped 1–16). Lower = faster/cheaper, fewer citations. |
| web | boolean | true | Require the panel to web-search and cite source URLs for factual/current claims. Set false for pure reasoning with no grounding. |
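
The defaults and the maxToolCalls clamp from the table can be sketched in TypeScript. This is only an illustration of the documented rules, not the server's actual code; `AskFusionArgs` and `normalizeArgs` are hypothetical names, and analysisModels is passed through unchanged in this sketch.

```typescript
// Hypothetical type mirroring the documented ask_fusion inputs.
interface AskFusionArgs {
  prompt: string;
  system?: string;
  preset?: "high" | "budget";
  analysisModels?: string[];
  maxToolCalls?: number;
  web?: boolean;
}

// Apply the documented defaults and the 1..16 clamp (illustrative only).
function normalizeArgs(args: AskFusionArgs): AskFusionArgs {
  const out: AskFusionArgs = {
    ...args,
    preset: args.preset ?? "high", // documented default
    web: args.web ?? true, // documented default
  };
  // When maxToolCalls is omitted, OpenRouter's own default (8) applies.
  if (args.maxToolCalls !== undefined) {
    out.maxToolCalls = Math.min(16, Math.max(1, args.maxToolCalls));
  }
  return out;
}

// A request for 40 tool calls is clamped to the documented maximum of 16.
const normalized = normalizeArgs({
  prompt: "Compare Bayesian optimization vs reinforcement learning for HPO.",
  maxToolCalls: 40,
});
```

Only prompt is required, so a minimal call can omit every other field and rely on the defaults.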

Presets

preset selects the panel that deliberates (the judge is always the frontier model anthropic/claude-opus-4.8). Override either preset's panel with analysisModels.

| Preset | Panel models | Use when |
| --- | --- | --- |
| high (default) | ~anthropic/claude-opus-latest, ~openai/gpt-latest, ~google/gemini-pro-latest | Best quality and most reliable citations. Pricey, slow (~$0.4–1.1/call). |
| budget | google/gemini-3.5-flash, deepseek/deepseek-v4-flash, openai/gpt-5.4-mini | Faster/cheaper. Weaker panelists cite less reliably (the judge still vets sources). |

The ~…-latest slugs are OpenRouter floating aliases that always resolve to the current frontier model.
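
How preset and analysisModels interact can be sketched as a small lookup. The slugs are copied from the presets table; `PRESET_PANELS` and `resolvePanel` are hypothetical names, and this is an assumed reading of the documented precedence rather than the server's implementation.

```typescript
type Preset = "high" | "budget";

// Panel slugs as listed in the presets table.
const PRESET_PANELS: Record<Preset, string[]> = {
  high: [
    "~anthropic/claude-opus-latest",
    "~openai/gpt-latest",
    "~google/gemini-pro-latest",
  ],
  budget: [
    "google/gemini-3.5-flash",
    "deepseek/deepseek-v4-flash",
    "openai/gpt-5.4-mini",
  ],
};

// analysisModels wins when it names at least two slugs; otherwise the preset panel is used.
function resolvePanel(preset: Preset = "high", analysisModels?: string[]): string[] {
  if (analysisModels !== undefined && analysisModels.length >= 2) {
    return [...analysisModels];
  }
  return [...PRESET_PANELS[preset]];
}

// A single-slug override is ignored, so the budget panel is returned.
const panelModels = resolvePanel("budget", ["openai/gpt-5.4-mini"]);
```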

Output

ask_fusion returns the fused answer as text (with a Sources: list appended), plus structured fields:

| Field | Description |
| --- | --- |
| answer | The judge's fused answer. |
| sources | Distinct source URLs from the judge's vetted answer. The judge drops citations it deems fabricated/unverifiable, so these are not the raw union of panel URLs (weak panel models can hallucinate plausible-looking URLs; those are filtered out). The full per-panel citations remain in panel. |
| analysis.consensus | Points all/most panelists agreed on. |
| analysis.contradictions | Topics where panelists disagreed, with each model's stance, plus an "evidence" stance that is the judge's own adjudication (not a panel model). |
| analysis.uniqueInsights | Points only one model raised. |
| analysis.blindSpots | Topics the panel under-addressed. |
| analysis.partialCoverage | Points some (not all) panelists made. |
| panel | Each panel model and its full response (where its own citations live). |
| model | The judge/router model that produced the fused answer. |
| usage | Token usage reported by the proxy. |

Grounding & citations. With web: true (default), the panel is instructed to verify current facts/prices/versions via web search, cite the real source URL, and say "no source found" rather than guess. The judge then adjudicates the panel's citations — fabricated or unverifiable URLs (which weaker panel models can invent) are dropped from sources. For the most reliable citations, use preset: "high" (frontier panelists search and cite more reliably than the budget panel).
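
A client consuming the structured fields might type them roughly as below. The interface names and `splitStances` are hypothetical, and the element types of uniqueInsights and partialCoverage are assumed to be strings, by analogy with consensus and blindSpots. The helper separates panel stances from the judge's "evidence" adjudication.

```typescript
interface Stance { model: string; stance: string; }
interface Contradiction { topic: string; stances: Stance[]; }

// Assumed shape of the structured result, based on the field descriptions.
interface FusionResult {
  answer: string;
  sources: string[];
  analysis: {
    consensus: string[];
    contradictions: Contradiction[];
    uniqueInsights: string[]; // assumed string[]
    blindSpots: string[];
    partialCoverage: string[]; // assumed string[]
  };
  panel: { model: string; content: string }[];
  model: string;
  usage: { promptTokens: number; completionTokens: number; totalTokens: number };
}

// Split one contradiction into panel stances and the judge's own verdict.
function splitStances(c: Contradiction): { panel: Stance[]; judge?: Stance } {
  const panel = c.stances.filter((s) => s.model !== "evidence");
  const judge = c.stances.find((s) => s.model === "evidence");
  return judge ? { panel, judge } : { panel };
}

const sampleContradiction: Contradiction = {
  topic: "Has RL been 'replaced' by BO in neural architecture search?",
  stances: [
    { model: "~google/gemini-pro-latest", stance: "RL was replaced by BO and evolutionary strategies." },
    { model: "~anthropic/claude-opus-latest", stance: "The field moved to one-shot/differentiable methods." },
    { model: "evidence", stance: "RL controllers were displaced mainly by one-shot/weight-sharing methods." },
  ],
};
const split = splitStances(sampleContradiction);
```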

Example output

Prompt: "For tuning the hyperparameters of an expensive-to-train ML model, compare Bayesian optimization vs reinforcement learning … cite sources." (abridged)

{
  "answer": "# Bayesian Optimization vs. Reinforcement Learning …\n\nFor tuning a single expensive-to-train model, **Bayesian optimization (BO) is the right default** on sample efficiency, practicality, and tooling maturity. **RL is a specialized choice** … BO builds a probabilistic surrogate (GP or TPE) and finds strong configs in a few dozen to a few hundred trials ([Frazier, BO tutorial](https://arxiv.org/pdf/1807.02811)). Standard model-free RL is sample-hungry — Hyp-RL needed ~10M frames / 24 GPU-hours of pretraining ([Hyp-RL](https://ar5iv.labs.arxiv.org/html/1906.11527)) …",
  "sources": [
    "https://arxiv.org/pdf/1807.02811",
    "https://ar5iv.labs.arxiv.org/html/1906.11527",
    "https://arxiv.org/abs/1611.01578",
    "https://research.google/pubs/google-vizier-a-service-for-black-box-optimization/",
    "https://arxiv.org/pdf/1807.01774"
    // … 18 more
  ],
  "analysis": {
    "consensus": [
      "For tuning a single expensive model on a limited budget, Bayesian optimization is the better default.",
      "Standard model-free RL is sample-inefficient for one-off HPO; its cost only pays off when amortized across many tasks."
    ],
    "contradictions": [
      {
        "topic": "Has RL been 'replaced' by BO in neural architecture search?",
        "stances": [
          { "model": "~google/gemini-pro-latest", "stance": "A 2023 NAS survey says RL was outperformed and replaced by BO and evolutionary strategies." },
          { "model": "~anthropic/claude-opus-latest", "stance": "The field moved to one-shot/differentiable methods (DARTS), not specifically BO." },
          // "evidence" is the judge's own adjudication of the disagreement, not a panel model.
          { "model": "evidence", "stance": "In NAS, RL controllers were displaced mainly by one-shot/weight-sharing methods (ENAS, DARTS), not BO." }
        ]
      }
    ],
    "blindSpots": [
      "Population Based Training (PBT) — a leading method for dynamic schedules — is omitted.",
      "Parallel/async evaluation often dominates wall-clock efficiency more than raw sample count."
    ]
  },
  "panel": [
    { "model": "~anthropic/claude-opus-latest", "content": "# Bayesian Optimization vs. Reinforcement Learning …" },
    { "model": "~openai/gpt-latest", "content": "## Short answer\nFor a single expensive-to-train ML model, pick Bayesian optimization …" },
    { "model": "~google/gemini-pro-latest", "content": "When tuning the hyperparameters of an expensive-to-train model …" }
  ],
  "model": "anthropic/claude-4.8-opus-20260528",
  "usage": { "promptTokens": 15831, "completionTokens": 3103, "totalTokens": 18934 }
}

This Actor is an MCP server — the result is returned inline in the tool response (the structuredContent above). It does not write to an Apify dataset or key-value store, so there is no run output to download separately; consume it through your MCP client.

Endpoint

Deployed as a standby Actor, the MCP server is reachable at:

https://straightforward-under--fusion-mcp.apify.actor/mcp

Authenticate with your Apify API token as a Bearer token:

Authorization: Bearer <YOUR_APIFY_API_TOKEN>
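
For clients that speak MCP directly, the request carrying an ask_fusion call could be assembled as in the sketch below. The JSON-RPC `tools/call` shape and the `Accept` header follow the MCP Streamable HTTP transport; `buildToolCall` and `HttpRequest` are hypothetical names. A real session normally starts with an `initialize` handshake, and whether this server requires a session ID is not stated here, so an MCP client library is the safer route in practice.

```typescript
const ENDPOINT = "https://straightforward-under--fusion-mcp.apify.actor/mcp";

// Hypothetical plain description of an HTTP request; pass its parts to any HTTP client.
interface HttpRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Build (but do not send) a JSON-RPC tools/call request for ask_fusion.
function buildToolCall(token: string, id: number, args: Record<string, unknown>): HttpRequest {
  return {
    url: ENDPOINT,
    method: "POST",
    headers: {
      Authorization: `Bearer ${token}`,
      "Content-Type": "application/json",
      // Streamable HTTP servers may answer with plain JSON or an SSE stream.
      Accept: "application/json, text/event-stream",
    },
    body: JSON.stringify({
      jsonrpc: "2.0",
      id,
      method: "tools/call",
      params: { name: "ask_fusion", arguments: args },
    }),
  };
}

// Placeholder token; substitute a real Apify API token at runtime.
const toolCallRequest = buildToolCall("YOUR_APIFY_API_TOKEN", 1, {
  prompt: "Compare Bayesian optimization vs reinforcement learning for HPO.",
  preset: "budget",
});
```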

Note: a fusion deliberation is slow (frontier panel + web search routinely runs ~100s+) and the proxy can occasionally return an incomplete body on long runs. The server streams keepalive notifications so MCP clients don't time out, and retries incomplete responses automatically. Use preset: "budget" and/or a low maxToolCalls for faster, cheaper runs.

Use with Claude Code

Add the server with claude mcp add over the Streamable HTTP transport. Replace YOUR_APIFY_API_TOKEN with your Apify token.

Project scope

Scoped to the current project and written to a .mcp.json file in the repo (shared with anyone who checks it out):

claude mcp add --transport http --scope project fusion \
https://straightforward-under--fusion-mcp.apify.actor/mcp \
--header "Authorization: Bearer YOUR_APIFY_API_TOKEN"

Security: --scope project stores the header in .mcp.json, which is committed to git. Do not hard-code a real token there. Claude Code expands environment variables in .mcp.json, so prefer a placeholder and let each user supply their own — set the header to Authorization: Bearer ${APIFY_TOKEN} (single-quote it in the shell so it is stored literally) and export APIFY_TOKEN in your environment.
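
The placeholder approach can be illustrated by generating the file content in TypeScript. The layout of .mcp.json shown here (an `mcpServers` map with `type`, `url`, and `headers`) is an assumption about what Claude Code writes, so compare it with the file that claude mcp add actually produces. `mcpConfig` and `mcpJsonText` are hypothetical names.

```typescript
// Assumed .mcp.json layout; verify against the file generated by `claude mcp add`.
const mcpConfig = {
  mcpServers: {
    fusion: {
      type: "http",
      url: "https://straightforward-under--fusion-mcp.apify.actor/mcp",
      headers: {
        // A double-quoted string is not interpolated in TypeScript, so the
        // placeholder is written literally and expanded later by Claude Code.
        Authorization: "Bearer ${APIFY_TOKEN}",
      },
    },
  },
};

// Text that would be committed to the repo: it contains no real token.
const mcpJsonText = JSON.stringify(mcpConfig, null, 2);
```

Each user then exports APIFY_TOKEN in their own environment, and the committed file stays free of secrets.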

User scope

Available across all your projects on this machine; stored in your user config, never in the repo:

claude mcp add --transport http --scope user fusion \
https://straightforward-under--fusion-mcp.apify.actor/mcp \
--header "Authorization: Bearer YOUR_APIFY_API_TOKEN"

After adding, restart Claude Code and run /mcp to confirm fusion is connected. Claude can then call the ask_fusion tool. Remove it any time with claude mcp remove fusion.

Running locally

npm install
APIFY_TOKEN=<your-apify-token> npm run start:dev

The server listens for MCP requests at http://localhost:3000/mcp. The token is needed locally because ask_fusion calls the apify/openrouter proxy; on the Apify platform it is injected automatically.

Deploying your own

Push the Actor to Apify with standby mode enabled:

apify push

Your standby endpoint will be https://<username>--fusion-mcp.apify.actor/mcp.

Cost

This Actor does not charge a per-call fee. You pay for:

  • Apify compute while the standby Actor is running, and
  • OpenRouter model usage, billed by the apify/openrouter proxy against your Apify token.

A fusion call is much pricier than a normal model call, because preset: "high" runs three frontier models plus a judge, each able to web-search. Expect roughly $0.4–$1.1 per call depending on prompt length and web usage; preset: "budget" is several times cheaper. To control cost, prefer budget, lower maxToolCalls, or override analysisModels with cheaper slugs.
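
As a rough budgeting aid, the quoted per-call range for preset: "high" can be multiplied out. This sketch covers only model usage at the approximate $0.4 to $1.1 figure and ignores Apify compute; `estimateHighCostUsd` is a hypothetical helper, and the amounts are kept in cents to avoid floating-point drift.

```typescript
// Approximate per-call range for preset "high", in US cents.
const HIGH_CENTS = { min: 40, max: 110 };

// Estimated model-usage cost in USD for a number of "high" calls (compute excluded).
function estimateHighCostUsd(calls: number): { min: number; max: number } {
  return {
    min: (calls * HIGH_CENTS.min) / 100,
    max: (calls * HIGH_CENTS.max) / 100,
  };
}

// 25 calls at the quoted range: between 10 and 27.5 USD.
const dailyEstimate = estimateHighCostUsd(25);
```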

A Pay Per Event schema is included in .actor/pay_per_event.json for anyone who wants to monetize a fork. It is not enabled on this Actor, and Actor.charge is not called.

Resources