Mistral: Mistral Medium 3.5
Mistral Medium 3.5 is a dense 128B instruction-following model from Mistral AI. It supports text and image inputs with text output, and is designed for agentic workflows, coding, and complex reasoning tasks.
Anyone in the Space can @-mention Mistral: Mistral Medium 3.5 with the team's shared context — pooled credits, one chat, one memory.
Starter is free forever — 1 Space, 100 credits/month, 1 MCP. No card.
Verdict
Best for
- Long-context document analysis under budget
- Vision tasks on screenshots and diagrams
- Cost-sensitive multilingual workflows
- Rapid prototyping with large context needs
- Teams already invested in Mistral ecosystem
Strengths
The 262K context window handles entire codebases, legal documents, or multi-chapter manuscripts in a single call. Vision support covers common document and UI analysis tasks without needing separate OCR pipelines. Pricing undercuts GPT-4o and Claude Sonnet by 40-50% on input tokens, making it viable for high-volume applications. Mistral's European heritage often translates to stronger multilingual performance, especially on French, German, and Spanish.
Trade-offs
No public benchmarks means you're flying blind on reasoning depth, code generation accuracy, and instruction-following compared to Claude Sonnet 4.5 or GPT-4o. Mistral models historically lag OpenAI and Anthropic on complex multi-step reasoning and nuanced creative writing. Vision capabilities are newer and less battle-tested than GPT-4o's. Output pricing at $7.50/Mtok climbs quickly for verbose responses. Early adopters should budget time for prompt tuning and output validation.
Specifications
- Provider
- mistralai
- Category
- llm
- Context length
- 262,144 tokens
- Max output
- —
- Modalities
- text, image
- License
- proprietary
- Released
- 2026-04-30
Pricing
- Input
- $1.50/Mtok
- Output
- $7.50/Mtok
- Model ID
mistralai/mistral-medium-3-5
Per-token prices show what the model costs upstream. On Switchy your team draws from one shared org credit pool — one plan, one balance for everyone.
Team cost calculator
5 seats · 80 msgs/day
Switchy meters this usage against your org's shared credit pool.
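As a rough illustration of the upstream token cost behind the "5 seats · 80 msgs/day" scenario, here is a minimal sketch. The per-message token counts are assumptions for illustration, not Switchy figures:

```python
# Rough upstream-cost sketch for 5 seats at 80 messages/day.
# Token counts per message are assumed averages, not Switchy figures.
INPUT_PRICE = 1.50 / 1_000_000   # $/token
OUTPUT_PRICE = 7.50 / 1_000_000  # $/token

seats, msgs_per_day, days = 5, 80, 30
in_tokens_per_msg, out_tokens_per_msg = 500, 300  # assumptions

msgs = seats * msgs_per_day * days  # 12,000 messages/month
est_monthly_cost = msgs * (in_tokens_per_msg * INPUT_PRICE
                           + out_tokens_per_msg * OUTPUT_PRICE)
print(f"${est_monthly_cost:.2f}/month")  # $36.00 at these assumptions
```

At these assumed message sizes, output tokens dominate the bill even though they are a minority of the volume, which is worth remembering when estimating verbose workloads.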
Providers
| Provider | Context | Input | Output | P50 latency | Throughput | 30d uptime |
|---|---|---|---|---|---|---|
| mistralai | 262k | $1.50/Mtok | $7.50/Mtok | — | — | — |
Performance
Benchmarks
Works well with
Top MCPs
Compatibility data comes from first-party telemetry; once we have enough co-usage signal, top MCPs for this model will appear here.
How Switchy teams use it
Starter prompts
Analyze Legal Contract
Review this contract and create a structured summary with: parties involved, key obligations for each party, payment terms, termination clauses, and any unusual or high-risk provisions. Highlight ambiguous language.
Screenshot to Requirements
Analyze this screenshot and generate a technical requirements document. Include: UI components visible, user interactions implied, data fields needed, validation rules, and any accessibility considerations.
Multilingual Support Triage
Read this customer message and provide: detected language, sentiment (positive/neutral/negative), issue category, urgency level (low/medium/high), and a one-sentence summary in English.
Codebase Context Search
Given this codebase, locate all functions that handle user authentication. For each, provide: function name, file path, a one-line description, and any security-relevant details like token validation or session management.
Research Paper Synthesis
Summarize this research paper in three sections: core hypothesis and methodology, key findings with supporting data, and limitations or open questions. Use technical language appropriate for a graduate-level audience.
Example outputs
Illustrative — representative of the model's voice and quality, not literal recordings.
Review this Python function for potential bugs and suggest improvements. Focus on edge cases and performance:

```python
def merge_sorted_lists(list1, list2):
    result = []
    i = j = 0
    while i < len(list1) and j < len(list2):
        if list1[i] < list2[j]:
            result.append(list1[i])
            i += 1
        else:
            result.append(list2[j])
            j += 1
    return result
```
The function has a critical bug: it doesn't handle remaining elements after one list is exhausted. When the while loop exits, elements from the longer list are discarded. Fix by adding:

```python
result.extend(list1[i:])
result.extend(list2[j:])
```

Edge cases to consider: empty lists (currently handled correctly), lists with duplicate values (works fine), and single-element lists (works after the fix). Performance is O(n+m), which is optimal for this problem. The repeated `len()` calls in the loop condition are O(1) and cheap in CPython, but you could hoist them into local variables if profiling shows it matters.
This example shows Mistral Medium's code review capabilities with the 262K token context window supporting larger codebases. The response correctly identifies the bug and provides actionable fixes. However, without published benchmarks, it's unclear how this model compares to specialized code models on complex refactoring tasks or multi-file analysis.
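Putting the review together, the complete corrected function (a runnable sketch of the fix, not an official reference implementation) looks like this:

```python
def merge_sorted_lists(list1, list2):
    """Merge two pre-sorted lists into one sorted list in O(n + m)."""
    result = []
    i = j = 0
    while i < len(list1) and j < len(list2):
        if list1[i] < list2[j]:
            result.append(list1[i])
            i += 1
        else:
            result.append(list2[j])
            j += 1
    # The fix: append whatever remains once one list is exhausted.
    result.extend(list1[i:])
    result.extend(list2[j:])
    return result

print(merge_sorted_lists([1, 3, 5], [2, 4, 6, 8]))  # [1, 2, 3, 4, 5, 6, 8]
```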
I'm designing a SaaS dashboard. Describe a layout for a metrics overview page that displays: monthly recurring revenue, churn rate, active users, and conversion funnel. The design should work on desktop and be data-dense but scannable.
Use a 2×2 grid layout for the four primary metrics, each in a card with the metric value prominent (48px bold), label above (14px), and a sparkline showing 90-day trend below. Place MRR top-left as the anchor metric. Below the grid, add a full-width conversion funnel visualization using a horizontal bar chart with percentage drop-offs between stages. Use a single accent color (suggest blue) with opacity variations to show depth. For scannability: align all numbers right, use consistent spacing (24px between cards), and add subtle dividers. Include a date range selector in the top-right corner. The layout stays readable down to 1280px width without horizontal scroll.
Mistral Medium handles multimodal input, making it suitable for design discussions that reference visual concepts. The response is structured and specific. The trade-off: without image generation capabilities, users must translate these descriptions into mockups themselves, unlike models that can produce visual outputs directly.
Analyze this customer support transcript and extract: the customer's core issue, any product bugs mentioned, sentiment progression, and recommended next steps. Transcript: [3,200-word conversation between customer and support agent about billing discrepancies, failed payment retries, and confusion over pro-rated charges]
**Core Issue:** Customer was charged $47 instead of the expected $29 after downgrading from an annual to a monthly plan mid-cycle.

**Product Bugs Identified:**
1. Payment retry logic attempted charges 4 times in 2 hours instead of the documented 24-hour intervals
2. Downgrade confirmation email didn't explain the pro-rating calculation

**Sentiment Progression:** Started frustrated (mentions "third time contacting support"), became confused during the pro-rating explanation, ended cautiously satisfied after the refund offer.

**Recommended Next Steps:** Issue an $18 refund, update the downgrade email template to include the pro-rating formula, review payment retry timing in the billing service, and follow up in 48 hours to confirm resolution.
The 262K token context window makes Mistral Medium well-suited for analyzing long-form documents like support transcripts, legal contracts, or research papers. This example demonstrates structured extraction from conversational data. The $1.50/$7.50 per Mtok pricing means processing a 3,200-word transcript costs under $0.02, making it economical for high-volume document analysis workflows.
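The per-transcript figure checks out with a quick back-of-envelope calculation. The word-to-token ratio and the output length below are rough assumptions, not measured values:

```python
# Back-of-envelope cost for one 3,200-word transcript.
# ~1.33 tokens per English word is a common rough assumption.
words = 3_200
input_tokens = words * 4 // 3   # ~4,266 tokens
output_tokens = 400             # assumed structured-summary length

cost = input_tokens * 1.50 / 1e6 + output_tokens * 7.50 / 1e6
print(f"${cost:.4f}")  # well under $0.02
```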
Use-case deep-dives
When 262K context beats chaining for legal doc review
A 4-person legal ops team processing vendor agreements needs to compare clauses across 8-12 contracts simultaneously. Mistral Medium 3.5's 262,144-token window fits roughly 200,000 words—enough to load an entire contract portfolio in one prompt without chunking or retrieval overhead. At $1.50/Mtok input, a 200K-token analysis costs $0.30, making it cheaper than running sequential calls through smaller-context models that need summarization passes. The trade-off: if your contracts average under 40 pages each and you're only comparing 2-3 at a time, a 128K model saves you half the cost. Use this when you're routinely cross-referencing 6+ documents and need the model to hold all context without lossy summarization.
Cost-effective ticket routing for mid-scale support teams
A 12-person SaaS support team handling 800 tickets daily needs to auto-categorize and route incoming requests. Mistral Medium 3.5's $1.50 input pricing means a 500-token ticket costs $0.00075 to classify—$0.60 per 800 tickets, or $18/month at that volume. The model's multimodal capability handles screenshot attachments without a separate vision API call. The threshold: if you're processing under 200 tickets/day, the setup overhead isn't worth it; above 500/day, the input-side savings versus $3/Mtok alternatives reach roughly $18/month at 800 tickets, with more on top once output pricing differences are counted. Without public benchmarks, validate classification accuracy on your ticket taxonomy during a 2-week pilot before committing to production routing.
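The break-even arithmetic in this scenario can be sketched as follows. Ticket size, volumes, and the $3/Mtok comparison rate come from the text; the input-only framing is an assumption (classification outputs are typically short):

```python
# Monthly input-side classification cost at a given ticket volume.
def monthly_cost(tickets_per_day, tokens_per_ticket=500,
                 price_per_mtok=1.50, days=30):
    return tickets_per_day * tokens_per_ticket * days * price_per_mtok / 1e6

print(monthly_cost(800))                       # $18.00 at $1.50/Mtok
print(monthly_cost(800, price_per_mtok=3.00))  # $36.00 at a $3/Mtok model
```

At 800 tickets/day the input-side gap is $18/month; short classification outputs add little, so most of any further savings comes from output-price differences on verbose responses.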
When extended context memory beats conversation stitching
A 3-person content studio runs 90-minute brand strategy sessions where the AI needs to remember 40+ ideas, client feedback, and evolving direction without losing thread. Mistral Medium 3.5's 262K window holds an entire session transcript (roughly 65,000 words) plus reference docs—no mid-session summarization that flattens nuance. At $7.50/Mtok output, a 5,000-token synthesis response costs $0.0375, manageable for weekly sessions. The catch: if your sessions are under 30 minutes or you're fine with periodic context resets, a 32K model at half the output cost is smarter. Choose this when session continuity and callback to early ideas matter more than per-token cost, and you're running fewer than 20 sessions/month.
Frequently asked
Is Mistral Medium 3.5 good for general text tasks?
Yes, Mistral Medium 3.5 handles general text work well—drafting, summarization, Q&A, light reasoning. It sits between Mistral's small and large tiers, so you get decent quality without paying flagship prices. The 262k token context window means you can throw entire codebases or long documents at it without chunking.
Is Mistral Medium 3.5 cheaper than GPT-4o or Claude Sonnet?
Yes, significantly. At $1.50 input and $7.50 output per million tokens, Mistral Medium 3.5 undercuts GPT-4o ($2.50/$10) and Claude 3.5 Sonnet ($3/$15). If you're running high-volume workflows where cost matters more than bleeding-edge reasoning, this is a strong pick.
Can Mistral Medium 3.5 handle image inputs reliably?
It supports image inputs, but Mistral hasn't published vision benchmarks for this model. Expect basic image understanding—OCR, simple scene description—but don't rely on it for complex visual reasoning or fine-grained object detection. For serious vision work, use GPT-4o or Claude 3.5 Sonnet instead.
How does Mistral Medium 3.5 compare to Mistral Large?
Mistral Large is listed at $2/$6 per Mtok (more than Medium 3.5 on input, slightly less on output) and delivers stronger reasoning and coding performance. Medium 3.5 is the budget option when you need Mistral's speed and context window but can tolerate slightly weaker outputs. Without public benchmarks, you're trading proven capability for lower cost.
Should I use Mistral Medium 3.5 for production chatbots?
Only if cost is your primary constraint. The lack of public benchmarks means you're flying blind on quality versus alternatives. Test it thoroughly against your use case first. If users notice worse responses compared to GPT-4o mini or Claude Haiku, the cost savings won't matter.