OpenRouter Fusion MCP server
Pricing
Pay per usage
MCP server exposing the OpenRouter Fusion Router (multi-model deliberation) as a tool, via the apify/openrouter proxy. No API key required.
Developer
Yann Feunteun
Maintained by Community
OpenRouter Fusion MCP server
An Apify Actor running a Model Context Protocol server (Streamable HTTP transport) that exposes the OpenRouter Fusion Router as a single tool.
ask_fusion sends your prompt to a panel of frontier models in parallel; each panelist can search the web and cite sources, then a judge model fuses them into one answer plus a structured analysis — consensus, contradictions (attributed per model), unique insights, and blind spots. You get the fused answer, the cited source URLs, and every panelist's full response.
It reaches OpenRouter through the apify/openrouter proxy, so no OpenRouter API key is required — requests are authenticated and billed with your Apify token.
Why use it
Use it for hard, high-stakes, or ambiguous questions where one model is not enough: a single model can be confidently wrong, omit a key angle, or miss recent facts. Fusion runs several models, grounds them in live web sources, and surfaces exactly where they agree, disagree, and fall short — so you can trust the consensus and scrutinise the contradictions.
Tool: ask_fusion
| Input | Type | Default | Description |
|---|---|---|---|
| prompt | string | (required) | The question or task to deliberate on. |
| system | string | none | System instruction to steer the panel. |
| preset | "high" \| "budget" | "high" | Panel preset. high = frontier models (best quality, most expensive). budget = cheaper, faster panel with a frontier judge. |
| analysisModels | string[] | preset panel | Override the panel with explicit OpenRouter model slugs (at least 2; a shorter list is ignored). Takes precedence over preset. |
| maxToolCalls | number | 8 (OpenRouter default) | Max web search/fetch iterations per panelist (clamped to 1–16). Lower = faster and cheaper, but fewer citations. |
| web | boolean | true | Require the panel to web-search and cite source URLs for factual/current claims. Set false for pure reasoning with no grounding. |
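
The defaults and limits in the table can be sketched as a small normalization step. This is an illustration only, not the Actor's source: the names AskFusionInput, NormalizedInput, and normalizeInput are hypothetical, and truncating a fractional maxToolCalls before clamping is an assumption.

```typescript
// Hypothetical types and helper that mirror the documented input table.
type Preset = "high" | "budget";

interface AskFusionInput {
  prompt: string;
  system?: string;
  preset?: Preset;
  analysisModels?: string[];
  maxToolCalls?: number;
  web?: boolean;
}

interface NormalizedInput {
  prompt: string;
  system?: string;
  preset: Preset;
  analysisModels?: string[];
  maxToolCalls?: number;
  web: boolean;
}

function normalizeInput(input: AskFusionInput): NormalizedInput {
  if (input.prompt.trim() === "") {
    throw new Error("prompt is required");
  }
  const out: NormalizedInput = {
    prompt: input.prompt,
    preset: input.preset ?? "high", // documented default
    web: input.web ?? true,         // documented default
  };
  if (input.system !== undefined) {
    out.system = input.system;
  }
  // A list with fewer than two slugs is ignored, so the preset panel applies.
  if (input.analysisModels && input.analysisModels.length >= 2) {
    out.analysisModels = [...input.analysisModels];
  }
  // Clamp to 1..16. When omitted, stay undefined so the upstream default (8) applies.
  if (input.maxToolCalls !== undefined) {
    out.maxToolCalls = Math.min(16, Math.max(1, Math.trunc(input.maxToolCalls)));
  }
  return out;
}

const normalized = normalizeInput({
  prompt: "Compare Bayesian optimization and RL for hyperparameter tuning.",
  analysisModels: ["openai/gpt-5.4-mini"], // a single slug: ignored
  maxToolCalls: 40,                        // above the limit: clamped to 16
});
console.log(normalized);
```

In this example the single-slug override is dropped, so the high preset panel would be used, and maxToolCalls ends up at 16.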
Presets
preset selects the panel that deliberates; the judge is always the frontier model anthropic/claude-opus-4.8. To replace the panel of either preset, pass analysisModels.
| Preset | Panel models | Use when |
|---|---|---|
| high (default) | ~anthropic/claude-opus-latest, ~openai/gpt-latest, ~google/gemini-pro-latest | Best quality and most reliable citations. Pricey, slow (~$0.4–1.1/call). |
| budget | google/gemini-3.5-flash, deepseek/deepseek-v4-flash, openai/gpt-5.4-mini | Faster/cheaper. Weaker panelists cite less reliably (the judge still vets sources). |
The ~…-latest slugs are OpenRouter floating aliases that always resolve to the current frontier model.
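
How preset and analysisModels interact can be sketched as a lookup with an override. The slugs are copied from the presets table; PRESET_PANELS and resolvePanel are hypothetical names used for illustration, not the Actor's actual code.

```typescript
type Preset = "high" | "budget";

// Panel slugs as listed in the presets table.
const PRESET_PANELS: Record<Preset, readonly string[]> = {
  high: [
    "~anthropic/claude-opus-latest",
    "~openai/gpt-latest",
    "~google/gemini-pro-latest",
  ],
  budget: [
    "google/gemini-3.5-flash",
    "deepseek/deepseek-v4-flash",
    "openai/gpt-5.4-mini",
  ],
};

// Hypothetical helper: an explicit list of at least two slugs wins over the preset.
function resolvePanel(preset: Preset = "high", analysisModels?: string[]): string[] {
  if (analysisModels && analysisModels.length >= 2) {
    return [...analysisModels];
  }
  return [...PRESET_PANELS[preset]];
}

console.log(resolvePanel());                               // the high preset panel
console.log(resolvePanel("budget", ["openai/gpt-5.4-mini"])); // one slug is ignored: budget panel
console.log(
  resolvePanel("high", ["google/gemini-3.5-flash", "openai/gpt-5.4-mini"]),
); // explicit two-model panel
```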
Output
ask_fusion returns the fused answer as text (with a Sources: list appended), plus structured fields:
| Field | Description |
|---|---|
| answer | The judge's fused answer. |
| sources | Distinct source URLs from the judge's vetted answer. The judge drops citations it deems fabricated/unverifiable, so these are not the raw union of panel URLs (weak panel models can hallucinate plausible-looking URLs; those are filtered out). The full per-panel citations remain in panel. |
| analysis.consensus | Points all/most panelists agreed on. |
| analysis.contradictions | Topics where panelists disagreed, with each model's stance, plus an "evidence" stance that is the judge's own adjudication (not a panel model). |
| analysis.uniqueInsights | Points only one model raised. |
| analysis.blindSpots | Topics the panel under-addressed. |
| analysis.partialCoverage | Points some (not all) panelists made. |
| panel | Each panel model and its full response (where its own citations live). |
| model | The judge/router model that produced the fused answer. |
| usage | Token usage reported by the proxy. |
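
Because the "evidence" entry in a contradiction is the judge's adjudication rather than a panel model, a consumer may want to separate it from the panel stances. A minimal sketch, assuming each contradiction has the topic and stances shape shown in the example output; Stance, Contradiction, and separateJudgeStance are hypothetical names.

```typescript
interface Stance {
  model: string;
  stance: string;
}

interface Contradiction {
  topic: string;
  stances: Stance[];
}

interface Adjudicated {
  topic: string;
  panelStances: Stance[];
  judgeVerdict?: string; // absent when no "evidence" stance is present
}

// Hypothetical helper: split panel stances from the judge's "evidence" stance.
function separateJudgeStance(c: Contradiction): Adjudicated {
  const judge = c.stances.find((s) => s.model === "evidence");
  const adjudicated: Adjudicated = {
    topic: c.topic,
    panelStances: c.stances.filter((s) => s.model !== "evidence"),
  };
  if (judge) {
    adjudicated.judgeVerdict = judge.stance;
  }
  return adjudicated;
}

// Data abridged from the example output in this README.
const sampleContradiction: Contradiction = {
  topic: "Has RL been 'replaced' by BO in neural architecture search?",
  stances: [
    { model: "~google/gemini-pro-latest", stance: "A 2023 NAS survey says RL was outperformed and replaced by BO and evolutionary strategies." },
    { model: "~anthropic/claude-opus-latest", stance: "The field moved to one-shot/differentiable methods (DARTS), not specifically BO." },
    { model: "evidence", stance: "In NAS, RL controllers were displaced mainly by one-shot/weight-sharing methods (ENAS, DARTS), not BO." },
  ],
};

const adjudicatedSample = separateJudgeStance(sampleContradiction);
console.log(adjudicatedSample.panelStances.map((s) => s.model));
console.log(adjudicatedSample.judgeVerdict);
```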
Grounding & citations. With web: true (default), the panel is instructed to verify current facts/prices/versions via web search, cite the real source URL, and say "no source found" rather than guess. The judge then adjudicates the panel's citations: fabricated or unverifiable URLs (which weaker panel models can invent) are dropped from sources. For the most reliable citations, use preset: "high" (frontier panelists search and cite more reliably than the budget panel).
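
Since sources holds only the URLs the judge kept, comparing it with the URLs in each panelist's response shows which citations did not make it into the fused answer. The sketch below is a rough client-side check, not part of the Actor: extractUrls and panelOnlyUrls are hypothetical helpers, the URL pattern is deliberately simple, and matching is by exact string, so a URL the judge cited in a slightly different form would be reported as missing.

```typescript
interface PanelEntry {
  model: string;
  content: string;
}

// Pull http(s) URLs out of free text or Markdown and de-duplicate them.
function extractUrls(text: string): string[] {
  const matches = text.match(/https?:\/\/[^\s<>"')\]]+/g) ?? [];
  // Strip trailing sentence punctuation the pattern may have captured.
  const cleaned = matches.map((u) => u.replace(/[.,;:!?]+$/, ""));
  return Array.from(new Set(cleaned));
}

// For each panel model, list the URLs it cited that are absent from `sources`.
function panelOnlyUrls(panel: PanelEntry[], sources: string[]): Record<string, string[]> {
  const vetted = new Set(sources);
  const byModel: Record<string, string[]> = {};
  for (const entry of panel) {
    const extra = extractUrls(entry.content).filter((u) => !vetted.has(u));
    if (extra.length > 0) {
      byModel[entry.model] = extra;
    }
  }
  return byModel;
}

// Made-up panel content for illustration; example.com stands in for an unvetted URL.
const samplePanel: PanelEntry[] = [
  {
    model: "google/gemini-3.5-flash",
    content: "See [Frazier](https://arxiv.org/pdf/1807.02811) and https://example.com/made-up.",
  },
];
const unvetted = panelOnlyUrls(samplePanel, ["https://arxiv.org/pdf/1807.02811"]);
console.log(unvetted);
```

A URL appearing here is not proof of fabrication; it only means the judge did not carry it into the vetted list.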
Example output
Prompt: "For tuning the hyperparameters of an expensive-to-train ML model, compare Bayesian optimization vs reinforcement learning … cite sources." (abridged)

    {
      "answer": "# Bayesian Optimization vs. Reinforcement Learning …\n\nFor tuning a single expensive-to-train model, **Bayesian optimization (BO) is the right default** on sample efficiency, practicality, and tooling maturity. **RL is a specialized choice** … BO builds a probabilistic surrogate (GP or TPE) and finds strong configs in a few dozen to a few hundred trials ([Frazier, BO tutorial](https://arxiv.org/pdf/1807.02811)). Standard model-free RL is sample-hungry — Hyp-RL needed ~10M frames / 24 GPU-hours of pretraining ([Hyp-RL](https://ar5iv.labs.arxiv.org/html/1906.11527)) …",
      "sources": [
        "https://arxiv.org/pdf/1807.02811",
        "https://ar5iv.labs.arxiv.org/html/1906.11527",
        "https://arxiv.org/abs/1611.01578",
        "https://research.google/pubs/google-vizier-a-service-for-black-box-optimization/",
        "https://arxiv.org/pdf/1807.01774"
        // … 18 more
      ],
      "analysis": {
        "consensus": [
          "For tuning a single expensive model on a limited budget, Bayesian optimization is the better default.",
          "Standard model-free RL is sample-inefficient for one-off HPO; its cost only pays off when amortized across many tasks."
        ],
        "contradictions": [
          {
            "topic": "Has RL been 'replaced' by BO in neural architecture search?",
            "stances": [
              { "model": "~google/gemini-pro-latest", "stance": "A 2023 NAS survey says RL was outperformed and replaced by BO and evolutionary strategies." },
              { "model": "~anthropic/claude-opus-latest", "stance": "The field moved to one-shot/differentiable methods (DARTS), not specifically BO." },
              // "evidence" is the judge's own adjudication of the disagreement, not a panel model.
              { "model": "evidence", "stance": "In NAS, RL controllers were displaced mainly by one-shot/weight-sharing methods (ENAS, DARTS), not BO." }
            ]
          }
        ],
        "blindSpots": [
          "Population Based Training (PBT) — a leading method for dynamic schedules — is omitted.",
          "Parallel/async evaluation often dominates wall-clock efficiency more than raw sample count."
        ]
      },
      "panel": [
        { "model": "~anthropic/claude-opus-latest", "content": "# Bayesian Optimization vs. Reinforcement Learning …" },
        { "model": "~openai/gpt-latest", "content": "## Short answer\nFor a single expensive-to-train ML model, pick Bayesian optimization …" },
        { "model": "~google/gemini-pro-latest", "content": "When tuning the hyperparameters of an expensive-to-train model …" }
      ],
      "model": "anthropic/claude-4.8-opus-20260528",
      "usage": { "promptTokens": 15831, "completionTokens": 3103, "totalTokens": 18934 }
    }

This Actor is an MCP server — the result is returned inline in the tool response (the structuredContent above). It does not write to an Apify dataset or key-value store, so there is no run output to download separately; consume it through your MCP client.
Endpoint
Deployed as a standby Actor, the MCP server is reachable at:
https://straightforward-under--fusion-mcp.apify.actor/mcp
Authenticate with your Apify API token as a Bearer token:
Authorization: Bearer <YOUR_APIFY_API_TOKEN>
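
For clients that do not use an MCP library, the request for a tool call can be sketched by hand. The block below only builds the HTTP request and sends nothing. It assumes the standard MCP JSON-RPC tools/call message and the Accept header that the Streamable HTTP transport expects; buildAskFusionRequest is a hypothetical helper. A real session normally begins with an initialize exchange, which an MCP client library handles for you, so treat this as an illustration of the message shape rather than a complete client.

```typescript
const FUSION_MCP_URL = "https://straightforward-under--fusion-mcp.apify.actor/mcp";

interface AskFusionArgs {
  prompt: string;
  preset?: "high" | "budget";
  maxToolCalls?: number;
  web?: boolean;
}

interface ToolCallRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

// Hypothetical helper: build (but do not send) a JSON-RPC tools/call request.
function buildAskFusionRequest(apifyToken: string, args: AskFusionArgs, id = 1): ToolCallRequest {
  return {
    url: FUSION_MCP_URL,
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apifyToken}`,
        "Content-Type": "application/json",
        // Streamable HTTP servers may answer with plain JSON or an SSE stream.
        Accept: "application/json, text/event-stream",
      },
      body: JSON.stringify({
        jsonrpc: "2.0",
        id,
        method: "tools/call",
        params: { name: "ask_fusion", arguments: args },
      }),
    },
  };
}

const toolCall = buildAskFusionRequest("YOUR_APIFY_API_TOKEN", {
  prompt: "Compare Bayesian optimization and RL for hyperparameter tuning.",
  preset: "budget",
  maxToolCalls: 4,
});
console.log(toolCall.init.body);
```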
Note: a fusion deliberation is slow (a frontier panel with web search routinely takes 100 s or more), and the proxy can occasionally return an incomplete body on long runs. The server streams keepalive notifications so MCP clients don't time out, and retries incomplete responses automatically. Use preset: "budget" and/or a low maxToolCalls for faster, cheaper runs.
Use with Claude Code
Add the server with claude mcp add over the Streamable HTTP transport. Replace YOUR_APIFY_API_TOKEN with your Apify token.
Project scope
Scoped to the current project and written to a .mcp.json file in the repo (shared with anyone who checks it out):

    claude mcp add --transport http --scope project fusion \
      https://straightforward-under--fusion-mcp.apify.actor/mcp \
      --header "Authorization: Bearer YOUR_APIFY_API_TOKEN"

Security: --scope project stores the header in .mcp.json, which is committed to git. Do not hard-code a real token there. Claude Code expands environment variables in .mcp.json, so prefer a placeholder and let each user supply their own: set the header to Authorization: Bearer ${APIFY_TOKEN} (single-quote it in the shell so it is stored literally) and export APIFY_TOKEN in your environment.
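
To illustrate the placeholder approach, the sketch below builds the kind of entry that would end up in .mcp.json and checks that the Authorization header holds an environment-variable placeholder rather than a literal token. The file shape (mcpServers, type, url, headers) is an assumption based on Claude Code's project configuration format, and fusionProjectConfig and usesPlaceholder are hypothetical helpers, not part of any tool.

```typescript
interface McpHttpServer {
  type: "http";
  url: string;
  headers: Record<string, string>;
}

interface McpProjectConfig {
  mcpServers: Record<string, McpHttpServer>;
}

// Build a project config whose header references an environment variable.
function fusionProjectConfig(tokenVar = "APIFY_TOKEN"): McpProjectConfig {
  return {
    mcpServers: {
      fusion: {
        type: "http",
        url: "https://straightforward-under--fusion-mcp.apify.actor/mcp",
        // Stored literally; expanded when the config is loaded.
        headers: { Authorization: "Bearer ${" + tokenVar + "}" },
      },
    },
  };
}

// True only if every Authorization header is exactly "Bearer ${SOME_VAR}".
function usesPlaceholder(config: McpProjectConfig): boolean {
  const placeholder = /^Bearer \$\{[A-Za-z_][A-Za-z0-9_]*\}$/;
  return Object.keys(config.mcpServers).every((key) => {
    const auth = config.mcpServers[key].headers["Authorization"];
    return auth === undefined || placeholder.test(auth);
  });
}

const projectConfig = fusionProjectConfig();
console.log(JSON.stringify(projectConfig, null, 2));
console.log(usesPlaceholder(projectConfig)); // true
```

A check like usesPlaceholder could run in a pre-commit hook to catch a real token before it reaches the repository.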
User scope
Available across all your projects on this machine; stored in your user config, never in the repo:

    claude mcp add --transport http --scope user fusion \
      https://straightforward-under--fusion-mcp.apify.actor/mcp \
      --header "Authorization: Bearer YOUR_APIFY_API_TOKEN"

After adding, restart Claude Code and run /mcp to confirm fusion is connected. Claude can then call the ask_fusion tool. Remove it any time with claude mcp remove fusion.
Running locally

    npm install
    APIFY_TOKEN=<your-apify-token> npm run start:dev

The server listens for MCP requests at http://localhost:3000/mcp. The token is needed locally because ask_fusion calls the apify/openrouter proxy; on the Apify platform it is injected automatically.
Deploying your own
Push the Actor to Apify with standby mode enabled:

    apify push

Your standby endpoint will be https://<username>--fusion-mcp.apify.actor/mcp.
Cost
This Actor does not charge a per-call fee. You pay for:
- Apify compute while the standby Actor is running, and
- OpenRouter model usage, billed by the apify/openrouter proxy against your Apify token.
A fusion call is much pricier than a normal model call — preset: "high" runs three frontier models plus a judge, each able to web-search. Expect roughly $0.4–$1.1 per call depending on prompt length and web usage; preset: "budget" is several times cheaper. To control cost, prefer budget, lower maxToolCalls, or override analysisModels with cheaper slugs.
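
As a rough budgeting aid, the per-call range quoted for preset: "high" can be scaled to a planned call volume. The sketch below works in cents to avoid floating-point drift. The 440-call workload (20 calls a day over 22 working days) is a made-up example, estimateHighPresetCost is a hypothetical helper, and Apify compute time is not included.

```typescript
// Per-call range for preset "high" as quoted above: $0.40 to $1.10.
const HIGH_PRESET_CENTS_PER_CALL = { low: 40, high: 110 } as const;

// Hypothetical helper: scale the quoted range to a number of calls.
function estimateHighPresetCost(calls: number): { lowUsd: number; highUsd: number } {
  if (!Number.isInteger(calls) || calls < 0) {
    throw new RangeError("calls must be a non-negative integer");
  }
  return {
    lowUsd: (HIGH_PRESET_CENTS_PER_CALL.low * calls) / 100,
    highUsd: (HIGH_PRESET_CENTS_PER_CALL.high * calls) / 100,
  };
}

// Example workload (an assumption): 20 calls per day over 22 working days.
const monthlyEstimate = estimateHighPresetCost(20 * 22);
console.log(monthlyEstimate); // { lowUsd: 176, highUsd: 484 }
```

At that volume the model usage alone would land between about $176 and $484 a month, which is where the budget preset or a lower maxToolCalls starts to matter.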
A Pay Per Event schema is included in .actor/pay_per_event.json for anyone who wants to monetize a fork. It is not enabled on this Actor, and Actor.charge is not called.