
Mistral: Mistral Medium 3.5

Mistral Medium 3.5 is a dense 128B instruction-following model from Mistral AI. It supports text and image inputs with text output, and is designed for agentic workflows, coding, and complex...

Anyone in the Space can @-mention Mistral: Mistral Medium 3.5 with the team's shared context - pooled credits, one chat, one memory.



Verdict

Mistral Medium 3.5 sits in the middle of Mistral's lineup with a 262K context window and multimodal support at $1.50/$7.50 per Mtok (input/output). No public benchmarks are available yet, so its positioning as a cost-performance balance between Mistral's smaller and flagship models rests on tier and price rather than measured results. The pricing undercuts GPT-4o ($2.50/$10 per Mtok) while keeping vision and file handling. Reach for this when you need long-context multimodal work but don't want to pay flagship rates; just know you're trading some capability ceiling for the savings.

Best for

  • Budget-conscious long-context analysis
  • Multimodal tasks with cost constraints
  • Document processing with vision needs
  • Teams migrating from Mistral Small

Strengths

The 262K context window handles full codebases and lengthy documents without chunking. Multimodal support covers text, images, and file uploads in a single request. Input pricing at $1.50 per Mtok runs 40% cheaper than GPT-4o while maintaining vision capabilities. The model slots naturally into existing Mistral workflows for teams already using their API, with consistent prompt formatting across their lineup.

Trade-offs

The absence of public benchmarks makes capability hard to assess against Claude or GPT-4, where MMLU, HumanEval, and reasoning scores are documented. As a mid-tier offering, expect it to trail Mistral Large on complex reasoning and code generation tasks. Output pricing at $7.50 per Mtok is five times the input rate, so verbose generation tasks eat budget quickly. Vision performance likely lags GPT-4o or Claude Sonnet for detailed image analysis.

Specifications

Provider: mistralai
Category: llm
Context length: 262,144 tokens
Max output: (not listed)
Modalities: text, image, file
License: proprietary
Released: 2026-04-30

Pricing

Input: $1.50/Mtok
Output: $7.50/Mtok
Model ID: mistralai/mistral-medium-3-5

Per-token prices show what the model costs upstream. On Switchy your team draws from one shared org credit pool - one plan, one balance for everyone.

Team cost calculator

Estimated monthly spend: $58.08
17.6M tokens / month
5 seats · 80 msgs/day

Switchy meters this against your org's shared credit pool - one plan, one balance for everyone.
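The calculator does not state how it arrives at its figures. The sketch below reproduces them under three assumptions that are inferred from the numbers, not taken from the page: 22 working days per month, about 2,000 tokens per message, and a 70/30 split between input and output tokens. The function name and parameters are illustrative only.

```python
# Prices from the Pricing section (USD per million tokens).
INPUT_PRICE_PER_MTOK = 1.50
OUTPUT_PRICE_PER_MTOK = 7.50


def estimate_monthly_spend(seats, msgs_per_day, working_days=22,
                           tokens_per_msg=2_000, input_share=0.70):
    """Return (total_tokens, usd) for one month of team usage.

    working_days, tokens_per_msg and input_share are assumptions chosen
    to match the page's example; they are not documented values.
    """
    total_tokens = seats * msgs_per_day * working_days * tokens_per_msg
    input_tokens = total_tokens * input_share
    output_tokens = total_tokens - input_tokens
    usd = (input_tokens * INPUT_PRICE_PER_MTOK
           + output_tokens * OUTPUT_PRICE_PER_MTOK) / 1_000_000
    return total_tokens, usd


tokens, usd = estimate_monthly_spend(seats=5, msgs_per_day=80)
print(f"{tokens / 1e6:.1f}M tokens / month, ${usd:.2f}")
```

Changing `input_share` shows how sensitive the total is to output volume, since output tokens cost five times as much as input tokens.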

Providers

Provider | Context | Input | Output | P50 latency | Throughput | 30d uptime
mistralai | 262k | $1.50/Mtok | $7.50/Mtok | - | - | -

Performance

Performance snapshots are collected daily. Check back after the next ingestion run.

Benchmarks

Public benchmark scores are not available yet for this model. Check back after the next ingestion run.

Works well with

Top MCPs

Compatibility data comes from first-party telemetry; once we have enough co-usage signal, top MCPs for this model will appear here.

How Switchy teams use it

Not enough Spaces have used this model yet to share anonymised team stats. We wait for at least 50 distinct Spaces per week before publishing any aggregate.

Starter prompts

Analyze Multi-Page Contract

Review this contract and create a structured summary with: parties involved, key obligations for each party, payment terms, termination clauses, and any unusual provisions. Flag ambiguous language.

Screenshot to Code

Generate React component code for this UI screenshot. Include responsive layout, appropriate semantic HTML, and Tailwind classes. Add comments explaining layout decisions.

Research Paper Synthesis

I'm uploading three research papers on the same topic. Identify the main argument of each, compare their methodologies, highlight where they agree or conflict, and suggest gaps in the current research.

Codebase Documentation Audit

Analyze this codebase and identify: functions lacking docstrings, modules without README files, complex logic needing inline comments, and inconsistent documentation patterns. Prioritize by impact on maintainability.

Invoice Data Extraction

Extract from this invoice image: vendor name, invoice number, date, line items with descriptions and amounts, subtotal, tax, and total. Return as JSON with consistent field names.
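Because extraction quality for this model is not benchmarked, it can help to sanity-check the JSON that comes back from the invoice prompt. The sketch below assumes a hypothetical schema (the field names, the `amount` key on each line item, and the sample invoice are all invented for illustration) and checks that required fields exist and that the totals add up.

```python
import json
from decimal import Decimal

# Hypothetical field names; the prompt only asks for "consistent" names.
REQUIRED = {"vendor_name", "invoice_number", "date",
            "line_items", "subtotal", "tax", "total"}


def check_invoice(raw_json):
    """Return a list of problems found in a model-produced invoice JSON string."""
    # Parse numbers as Decimal so currency sums compare exactly.
    data = json.loads(raw_json, parse_float=Decimal, parse_int=Decimal)
    problems = [f"missing field: {name}" for name in sorted(REQUIRED - set(data))]
    if problems:
        return problems
    items_sum = sum((item["amount"] for item in data["line_items"]), Decimal(0))
    if items_sum != data["subtotal"]:
        problems.append(f"line items sum to {items_sum}, subtotal is {data['subtotal']}")
    if data["subtotal"] + data["tax"] != data["total"]:
        problems.append("subtotal + tax does not equal total")
    return problems


# Invented sample, standing in for a model response.
sample = json.dumps({
    "vendor_name": "Acme", "invoice_number": "INV-1", "date": "2026-01-15",
    "line_items": [{"description": "Widget", "amount": 19.99},
                   {"description": "Shipping", "amount": 5.00}],
    "subtotal": 24.99, "tax": 2.00, "total": 26.99,
})
print(check_invoice(sample))
```

A non-empty result is a signal to re-run the extraction or route the invoice to manual review.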

Example outputs

Illustrative - representative of the model's voice and quality, not literal recordings.

Prompt

Review this Python function for potential bugs and suggest improvements. Focus on edge cases and performance:

```python
def merge_sorted_lists(list1, list2):
    result = []
    i = j = 0
    while i < len(list1) and j < len(list2):
        if list1[i] < list2[j]:
            result.append(list1[i])
            i += 1
        else:
            result.append(list2[j])
            j += 1
    return result
```

Output

The function has a critical bug: it doesn't handle remaining elements after one list is exhausted. When the while loop exits, elements from the longer list are discarded. Fix by adding these two lines before the return:

```python
result.extend(list1[i:])
result.extend(list2[j:])
```

Edge cases to consider: empty lists (if only one input is empty, the original returns an empty result, which the fix also corrects), lists with duplicate values (works fine), and single-element lists (works after the fix). Performance is O(n+m), which is optimal for this problem. The repeated `len()` calls in the loop condition are cheap because `len()` on a list is O(1), but you could pre-compute them if profiling shows it matters.
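For reference, here is a sketch of the prompt's function with the suggested two-line fix applied. It is not part of the illustrative output, just the prompt's code plus the `extend` calls, followed by a few quick checks.

```python
def merge_sorted_lists(list1, list2):
    """Merge two ascending lists into one ascending list."""
    result = []
    i = j = 0
    # Take the smaller head element until one list runs out.
    while i < len(list1) and j < len(list2):
        if list1[i] < list2[j]:
            result.append(list1[i])
            i += 1
        else:
            result.append(list2[j])
            j += 1
    # Append whatever remains; at most one of these slices is non-empty.
    result.extend(list1[i:])
    result.extend(list2[j:])
    return result


print(merge_sorted_lists([1, 3, 5], [2, 4, 6, 8]))
print(merge_sorted_lists([], [1, 2]))
```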

Notes

This example shows Mistral Medium's code review capabilities with the 262K token context window supporting larger codebases. The response correctly identifies the bug and provides actionable fixes. However, without published benchmarks, it's unclear how this model compares to specialized code models on complex refactoring tasks or multi-file analysis.

Prompt

I'm designing a SaaS dashboard. Describe a layout for a metrics overview page that displays: monthly recurring revenue, churn rate, active users, and conversion funnel. The design should work on desktop and be data-dense but scannable.

Output

Use a 2×2 grid layout for the four primary metrics, each in a card with the metric value prominent (48px bold), label above (14px), and a sparkline showing 90-day trend below. Place MRR top-left as the anchor metric. Below the grid, add a full-width conversion funnel visualization using a horizontal bar chart with percentage drop-offs between stages. Use a single accent color (suggest blue) with opacity variations to show depth. For scannability: align all numbers right, use consistent spacing (24px between cards), and add subtle dividers. Include a date range selector in the top-right corner. The layout stays readable down to 1280px width without horizontal scroll.

Notes

Mistral Medium handles multimodal input, making it suitable for design discussions that reference visual concepts. The response is structured and specific. The trade-off: without image generation capabilities, users must translate these descriptions into mockups themselves, unlike models that can produce visual outputs directly.

Prompt

Analyze this customer support transcript and extract: the customer's core issue, any product bugs mentioned, sentiment progression, and recommended next steps. Transcript: [3,200-word conversation between customer and support agent about billing discrepancies, failed payment retries, and confusion over pro-rated charges]

Output

**Core Issue:** Customer was charged $47 instead of expected $29 after downgrading from annual to monthly plan mid-cycle.

**Product Bugs Identified:**
1. Payment retry logic attempted charges 4 times in 2 hours instead of the documented 24-hour intervals
2. Downgrade confirmation email didn't explain pro-rating calculation

**Sentiment Progression:** Started frustrated (mentions "third time contacting support"), became confused during pro-rating explanation, ended cautiously satisfied after refund offer.

**Recommended Next Steps:** Issue $18 refund, update downgrade email template to include pro-rating formula, review payment retry timing in billing service, follow up in 48 hours to confirm resolution.

Notes

The 262K token context window makes Mistral Medium well-suited for analyzing long-form documents like support transcripts, legal contracts, or research papers. This example demonstrates structured extraction from conversational data. The $1.50/$7.50 per Mtok pricing means processing a 3,200-word transcript costs under $0.02, making it economical for high-volume document analysis workflows.

Use-case deep-dives

Multi-document contract analysis

When 262K context beats chaining for legal doc review

A 4-person legal ops team processing vendor agreements needs to compare clauses across 8-12 contracts simultaneously. Mistral Medium 3.5's 262,144-token window fits roughly 200,000 words of English text, enough to load an entire contract portfolio in one prompt without chunking or retrieval overhead. At $1.50/Mtok input, a 200K-token analysis costs $0.30, which can be cheaper than running sequential calls through smaller-context models that need summarization passes. The trade-off: if your contracts average under 40 pages each and you're only comparing 2-3 at a time, the job fits in a 128K window and a smaller-context model may do it for less, depending on its per-token price. Use this when you're routinely cross-referencing 6+ documents and need the model to hold all context without lossy summarization.
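Whether a portfolio actually fits depends on page density and tokenization, neither of which the page specifies. The sketch below uses two rough assumptions (about 500 words per page and about 0.75 words per token for English text) plus a hypothetical reserve for the model's reply; all three are illustrative and should be replaced with measurements from your own documents.

```python
CONTEXT_WINDOW_TOKENS = 262_144   # from the Specifications section
INPUT_PRICE_PER_MTOK = 1.50       # from the Pricing section

# Assumptions, not from the source: rough English averages.
WORDS_PER_PAGE = 500
WORDS_PER_TOKEN = 0.75


def portfolio_fit(num_contracts, pages_each, reserve_for_output=8_000):
    """Return (estimated_tokens, fits_in_window, input_cost_usd)."""
    words = num_contracts * pages_each * WORDS_PER_PAGE
    tokens = round(words / WORDS_PER_TOKEN)
    # Leave headroom for the response (reserve size is an assumption).
    fits = tokens + reserve_for_output <= CONTEXT_WINDOW_TOKENS
    cost = tokens * INPUT_PRICE_PER_MTOK / 1_000_000
    return tokens, fits, cost


for n in (3, 8, 12):
    tokens, fits, cost = portfolio_fit(n, pages_each=40)
    print(f"{n} contracts: ~{tokens:,} tokens, fits={fits}, input ${cost:.2f}")
```

Under these particular assumptions, twelve 40-page contracts exceed the window, so the upper end of the 8-12 range is worth checking against real token counts before relying on a single prompt.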

High-volume customer support triage

Cost-effective ticket routing for mid-scale support teams

A 12-person SaaS support team handling 800 tickets daily needs to auto-categorize and route incoming requests. Mistral Medium 3.5's $1.50 input pricing means a 500-token ticket costs $0.00075 to classify: $0.60 per 800 tickets, or $18/month at that volume. The model's multimodal capability handles screenshot attachments without a separate vision API call. The threshold: if you're processing under 200 tickets/day, the setup overhead isn't worth it; above 500/day the savings start to add up, and at 800/day you're saving about $18/month on input versus $3/Mtok alternatives, a gap that grows linearly with volume. Without public benchmarks, validate classification accuracy on your ticket taxonomy during a 2-week pilot before committing to production routing.
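The arithmetic behind these figures is simple enough to check directly. This sketch counts input tokens only (a short classification label adds a small amount of output cost on top) and assumes a 30-day month; the function name is illustrative.

```python
def monthly_input_cost(tickets_per_day, tokens_per_ticket, price_per_mtok, days=30):
    """Input-token cost in USD for one month of ticket classification."""
    total_tokens = tickets_per_day * days * tokens_per_ticket
    return total_tokens * price_per_mtok / 1_000_000


medium = monthly_input_cost(800, 500, 1.50)       # Mistral Medium 3.5 input rate
alternative = monthly_input_cost(800, 500, 3.00)  # a $3/Mtok alternative

print(f"Medium 3.5: ${medium:.2f}/month")
print(f"$3/Mtok alternative: ${alternative:.2f}/month")
print(f"Difference: ${alternative - medium:.2f}/month")
```

Plugging in your own ticket volume and average ticket length gives a quick read on whether the price difference is large enough to matter.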

Long-session creative brainstorming

When extended context memory beats conversation stitching

A 3-person content studio runs 90-minute brand strategy sessions where the AI needs to remember 40+ ideas, client feedback, and evolving direction without losing thread. Mistral Medium 3.5's 262K window holds an entire session transcript (roughly 65,000 words) plus reference docs, with no mid-session summarization that flattens nuance. At $7.50/Mtok output, a 5,000-token synthesis response costs $0.0375, manageable for weekly sessions. The catch: if your sessions are under 30 minutes or you're fine with periodic context resets, a smaller-context model with cheaper output may be the smarter choice. Choose this when session continuity and callback to early ideas matter more than per-token cost, and you're running fewer than 20 sessions/month.

Frequently asked

Is Mistral Medium 3.5 good for general text tasks?

Yes, Mistral Medium 3.5 handles general text work well—drafting, summarization, Q&A, light reasoning. It sits between Mistral's small and large tiers, so you get decent quality without paying flagship prices. The 262k token context window means you can throw entire codebases or long documents at it without chunking.

Is Mistral Medium 3.5 cheaper than GPT-4o or Claude Sonnet?

Yes, significantly. At $1.50 input and $7.50 output per million tokens, Mistral Medium 3.5 undercuts GPT-4o ($2.50/$10) and Claude 3.5 Sonnet ($3/$15). If you're running high-volume workflows where cost matters more than bleeding-edge reasoning, this is a strong pick.

Can Mistral Medium 3.5 handle image inputs reliably?

It supports image inputs, but Mistral hasn't published vision benchmarks for this model. Expect basic image understanding—OCR, simple scene description—but don't rely on it for complex visual reasoning or fine-grained object detection. For serious vision work, use GPT-4o or Claude 3.5 Sonnet instead.

How does Mistral Medium 3.5 compare to Mistral Large?

Mistral Large is listed at $2/$6 per Mtok, so it costs more on input but less on output, and it is expected to deliver stronger reasoning and coding performance. Medium 3.5 is the cheaper option for input-heavy workloads (more than about three input tokens per output token) when you need Mistral's speed and context window but can tolerate slightly weaker outputs. Without public benchmarks, you're trading proven capability for lower cost.
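Taking the two price pairs quoted on this page at face value, which model is cheaper depends on the ratio of input to output tokens. The sketch below compares blended cost for a given workload; the function names are illustrative, and the Mistral Large prices are simply the ones quoted in this answer.

```python
# (input, output) prices in USD per million tokens, as quoted on this page.
MEDIUM_3_5 = (1.50, 7.50)
LARGE = (2.00, 6.00)


def blended_cost(input_mtok, output_mtok, prices):
    """Cost in USD for a workload measured in millions of tokens."""
    input_price, output_price = prices
    return input_mtok * input_price + output_mtok * output_price


def cheaper_model(input_mtok, output_mtok):
    """Return 'medium', 'large', or 'tie' for the given token volumes."""
    medium = blended_cost(input_mtok, output_mtok, MEDIUM_3_5)
    large = blended_cost(input_mtok, output_mtok, LARGE)
    if medium < large:
        return "medium"
    if large < medium:
        return "large"
    return "tie"


# Break-even is at 3 input tokens per output token:
# 1.5*i + 7.5*o == 2*i + 6*o  ->  i == 3*o
for i, o in [(10, 1), (3, 1), (2, 1)]:
    print(f"{i}:{o} input:output -> {cheaper_model(i, o)}")
```

Long-document analysis with short answers sits well on the Medium side of that line; chat-style or generation-heavy workloads may not.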

Should I use Mistral Medium 3.5 for production chatbots?

Only if cost is your primary constraint. The lack of public benchmarks means you're flying blind on quality versus alternatives. Test it thoroughly against your use case first. If users notice worse responses compared to GPT-4o mini or Claude Haiku, the cost savings won't matter.

Data last verified just now. Sources aggregated hourly to weekly. See docs/architecture/model-directory.md.