Openrouter
OpenRouter is a platform that provides a unified API for accessing various large language models (LLMs) from different providers, allowing developers to integrate multiple AI models seamlessly.
Verdict
Common use cases
- Compare outputs across Claude and GPT-4 side by side
- Route to cheaper models for drafts, premium for finals
- Test prompt performance on 10 models in one session
- Monitor token costs before running expensive generations
- Access niche models like Code Llama or Mixtral
Integration
- Vendor
- Openrouter
- Category
- other
- Auth
- API_KEY
- Tools
- 7
- Composio slug
openrouter
Tools
- Create Chat Completion
Tool to generate a chat-style completion. use after assembling messages and selecting a model. supports streaming and function calls.
- Create Completion
Tool to generate a text completion for a given prompt or set of messages. use when you need a model-generated response from a specified model.
- Get Credits
Tool to get the current api credit balance for the authenticated user. use after authenticating to monitor remaining credits before making further api calls.
- Get Generation
Tool to retrieve a generation result by its unique id. use after a generation completes to fetch metadata like token counts, cost, and latency.
- List Available Models
Tool to list available models via openrouter api. use after confirming authentication to fetch the model catalog.
- OpenRouter List Model Endpoints
Tool to list endpoints for a specific model. use after specifying model author and slug to get endpoint details including pricing, context length, and supported parameters.
- OpenRouter List Providers
Tool to list all ai model providers available through the openrouter api. use after authentication to retrieve available provider options for routing configuration.
Setup
Setup guide
- 11. In Switchy, open your workspace settings and navigate to the Integrations tab. 2. Click 'Add MCP Integration' and select OpenRouter from the catalog. 3. Visit openrouter.ai/keys to generate an API key with full access to model routing and generation endpoints. 4. Paste the key into Switchy's auth dialog and click 'Connect'. 5. Switchy will verify the key by fetching your credit balance and the model catalog. 6. Open any Space and type '@OpenRouter list available models' to confirm the connection works. 7. To generate text, use '@OpenRouter create chat completion' followed by your model choice and prompt. 8. Check remaining credits anytime with '@OpenRouter get credits' to avoid mid-workflow interruptions.
What teammates see: by default, memories from Openrouter are scoped to the Space (PROJECT visibility) - you can mark any memory PRIVATE or share it ORG-wide.
Works well with
Top models
Compatibility data appears once enough Spaces have used this MCP together with a given model.
How Switchy teams use it
Starter prompts
List All Models
@OpenRouter list available models and show me the top 10 by context length, including pricing per million tokens.Open in a Space →
Compare Model Responses
@OpenRouter create chat completion using claude-3-5-sonnet, gpt-4-turbo, and llama-3-70b with this prompt: 'Explain quantum entanglement in one paragraph for a high school audience.'Open in a Space →
Check Credit Balance
@OpenRouter get credits and tell me how many dollars I have left, plus an estimate of how many GPT-4 requests that covers.Open in a Space →
Get Generation Details
@OpenRouter get generation for ID gen_abc123 and summarize the token usage, total cost, and response time.Open in a Space →
Find Cheapest Model
@OpenRouter list providers and model endpoints, then recommend the cheapest model with at least 8k context for summarizing customer feedback.Open in a Space →
Example outputs
Illustrative - representative of the model's voice and quality, not literal recordings.
@openrouter list all available models and show me which ones support function calling under $0.50 per million tokens
I found 23 models matching your criteria. Here are the top options: Anthropic's Claude 3.5 Haiku ($0.25/1M tokens, supports function calling), Google's Gemini 1.5 Flash ($0.075/1M tokens, supports function calling), and Meta's Llama 3.1 8B ($0.055/1M tokens, supports function calling). Each model includes context length (ranging from 8K to 200K tokens), latency estimates, and provider-specific features. Would you like detailed endpoint information for any of these models?
This example shows how the MCP helps teams compare model options across providers before committing to a specific API. Requires valid API key authentication. Pricing and availability change frequently — always verify current rates before production use.
@openrouter create a chat completion using Claude 3.5 Sonnet to summarize the last 5 messages in this Space, then tell me how many tokens that cost
I've generated a summary using Claude 3.5 Sonnet (anthropic/claude-3.5-sonnet). The model identified three main discussion threads: API integration strategy, rate limit handling, and error logging patterns. The completion used 847 input tokens and 203 output tokens, costing approximately $0.0126. Generation ID: gen_abc123xyz. Latency was 2.3 seconds. Your current account balance is $47.82 remaining.
This demonstrates the MCP's ability to route requests through OpenRouter's unified API while tracking usage. The generation ID allows you to retrieve detailed metadata later. Be aware that costs accumulate across the team — consider setting up usage alerts in your OpenRouter dashboard.
@openrouter check my current credit balance and estimate how many Claude Opus calls I can make with 2000 input tokens each
Your current OpenRouter credit balance is $47.82. Based on Claude Opus pricing ($15/1M input tokens, $75/1M output tokens), you can make approximately 1,594 calls with 2000 input tokens each, assuming an average output of 500 tokens per response. This estimate accounts for $0.03 per call ($0.03 input + $0.0375 output). Actual costs vary based on response length. Consider switching to Claude 3.5 Sonnet ($3/1M input tokens) to extend your budget 5x.
This example shows how the MCP helps teams manage API spend across multiple model providers. The cost calculation is illustrative — actual token counts depend on conversation context and model behavior. OpenRouter's credit system pools usage across all team members with the same API key.
Use-case deep-dives
When Openrouter beats vendor lock-in for rapid testing
A 5-person product team building a conversational feature needs to test Claude, GPT-4, and Gemini side-by-side before committing to one vendor. Openrouter wins here because the Create Chat Completion tool routes to any model through a single API key—no juggling three separate accounts or billing setups. The List Available Models and OpenRouter List Model Endpoints tools let the team compare pricing and context windows in real time during sprint planning. The Get Credits tool keeps burn rate visible so the PM can cap spend at $200/month while the team runs A/B tests. If your team already standardized on one vendor or needs sub-50ms latency guarantees, direct vendor APIs are simpler. But for teams in the discovery phase who need model flexibility without infrastructure overhead, Openrouter is the fastest path from hypothesis to production.
How Openrouter routes support queries to cheaper models first
A 12-person SaaS support team handles 400 tickets a week, half of which are FAQ-tier questions that don't need frontier models. Openrouter's routing logic lets them send simple queries to Llama or Mistral (pennies per call) and escalate complex cases to GPT-4o only when needed. The Get Generation tool logs token counts and cost per ticket, so the ops lead can prove ROI to finance—last quarter they cut LLM spend by 60% while maintaining CSAT. The Create Completion tool handles both workflows through one integration, so engineers don't maintain separate pipelines for cheap-vs-smart routing. If your ticket volume is under 50/week or your queries are uniformly complex, a single-vendor setup is less cognitive load. But for teams with bimodal support complexity and a CFO watching AI line items, Openrouter turns model selection into a cost lever instead of a vendor commitment.
When Openrouter simplifies client billing and model choice
A solo developer building AI features for three different clients needs each client to use their preferred model (one wants Claude for legal work, another wants GPT-4 for creative, a third wants the cheapest option). Openrouter's single API key and per-request model selection mean the dev writes one integration and routes by client ID at runtime—no managing three vendor accounts or reconciling three invoices. The Get Credits tool lets the dev monitor burn rate across all clients from one dashboard, and the OpenRouter List Providers tool helps pitch new clients on model options without vendor research. If you're building for one client with fixed requirements, a direct vendor integration is cleaner. But for freelancers or agencies juggling multiple clients with different model preferences and budgets, Openrouter collapses three billing relationships into one and keeps the codebase DRY.
Frequently asked
What does the OpenRouter MCP do in Switchy?
It routes your prompts to 200+ AI models from providers like Anthropic, OpenAI, Google, and Meta through a single API. You can switch models mid-conversation, compare outputs side-by-side, or let OpenRouter pick the cheapest model that meets your requirements. Useful when you need flexibility beyond a single vendor's model lineup.
Do I need an OpenRouter account to use this MCP?
Yes. You need an OpenRouter API key, which you generate after signing up at openrouter.ai. The key authenticates all requests and tracks your credit balance. OpenRouter charges per-token based on the model you use, so you'll want to monitor credits with the Get Credits tool before running expensive prompts.
Can this MCP fine-tune models or access private deployments?
No. It only calls publicly available models through OpenRouter's routing layer. If you need fine-tuned models, you'd connect directly to the provider's MCP (like OpenAI or Anthropic). OpenRouter is for switching between off-the-shelf models, not customising them.
Why use OpenRouter instead of calling model APIs directly?
OpenRouter handles fallback routing, rate limits, and unified billing across providers. If Claude is overloaded, it can auto-route to GPT-4 without changing your code. You also get a single invoice instead of managing separate accounts with Anthropic, OpenAI, and Google. Trade-off: you pay a small markup per token.
Who on the team should connect the OpenRouter MCP?
Whoever manages your AI model budget. The API key controls spending across all models, so you don't want every team member burning credits on GPT-4 when a cheaper model would work. One person connects it, then shares access to specific models via Switchy's workspace permissions.