Mistral: Mistral Small 4
Mistral Small 4 is the next major release in the Mistral Small family, unifying the capabilities of several flagship Mistral models into a single system. It combines strong reasoning from...
Anyone in the Space can @-mention Mistral: Mistral Small 4 with the team's shared context - pooled credits, one chat, one memory.
Starter is free forever - 1 Space, 100 credits/month, 1 MCP. No card.
Verdict
Best for
- High-volume customer support classification
- Cost-sensitive batch document processing
- Content moderation at scale
- Simple vision tasks on receipts or forms
- Rapid prototyping before production optimization
Strengths
The 262K context window handles full codebases or lengthy documents without chunking, while the $0.15/$0.60 pricing makes it viable for applications that process thousands of requests daily. Vision support adds utility for receipt parsing, form extraction, and basic screenshot analysis without needing a separate model. Response latency stays low even with large contexts, making it practical for user-facing applications where sub-second response times matter.
Trade-offs
Reasoning depth lags behind Claude Sonnet 4.5 and GPT-4o on complex tasks—expect weaker performance on multi-step logic, nuanced instruction following, and creative writing. Vision capabilities handle structured documents well but struggle with complex scene understanding or detailed image analysis. Without public benchmark data, you'll need to validate performance on your specific use case before committing production traffic. The model occasionally produces verbose responses when brevity would serve better.
Specifications
- Provider
- mistralai
- Category
- llm
- Context length
- 262,144 tokens
- Max output
- —
- Modalities
- text, image
- License
- proprietary
- Released
- 2026-03-16
Pricing
- Input
- $0.15/Mtok
- Output
- $0.60/Mtok
- Model ID
mistralai/mistral-small-2603
Per-token prices show what the model costs upstream. On Switchy your team draws from one shared org credit pool - one plan, one balance for everyone.
Team cost calculator
5 seats · 80 msgs/day
Switchy meters this against your org's shared credit pool - one plan, one balance for everyone.
Providers
| Provider | Context | Input | Output | P50 latency | Throughput | 30d uptime |
|---|---|---|---|---|---|---|
| mistralai | 262k | $0.15/Mtok | $0.60/Mtok | — | — | — |
Performance
Benchmarks
Works well with
Top MCPs
Compatibility data comes from first-party telemetry; once we have enough co-usage signal, top MCPs for this model will appear here.
How Switchy teams use it
Starter prompts
Classify Support Ticket
Read this customer message and classify it into one of these categories: billing_issue, technical_support, feature_request, account_access, or general_inquiry. Also rate urgency as low, medium, or high. Return only the category and urgency level. Message: [paste customer message here]Open in a Space →
Extract Receipt Data
Analyze this receipt image and extract: merchant name, date, total amount, payment method, and line items with prices. Format as JSON with keys: merchant, date, total, payment_method, items (array of {description, price}).Open in a Space →Summarize Long Document
Read this entire document and provide a 5-bullet summary covering: main purpose, key obligations or requirements, important dates or deadlines, financial terms, and any notable risks or limitations. [paste full document text]Open in a Space →
Moderate User Content
Review this user-submitted content for policy violations. Check for: hate speech, harassment, explicit content, spam, or misinformation. Return 'APPROVED' or 'FLAGGED: [reason]' with a one-sentence explanation if flagged. Content: [paste user content]Open in a Space →
Generate Product Descriptions
Write a 3-sentence product description for an e-commerce site. Focus on key features and benefits. Keep tone professional but approachable. Avoid superlatives. Product specs: [paste product name, category, and 3-5 key features]Open in a Space →
Example outputs
Illustrative - representative of the model's voice and quality, not literal recordings.
Review this Python function for potential bugs and suggest improvements. Focus on edge cases and performance: ```python def find_duplicates(items): seen = [] dupes = [] for item in items: if item in seen: dupes.append(item) seen.append(item) return dupes ```
The function has a critical performance issue: using `item in seen` on a list creates O(n²) complexity. Edge cases to consider: empty input returns empty list (correct), but duplicate duplicates get added multiple times (bug). Improved version: ```python def find_duplicates(items): seen = set() dupes = set() for item in items: if item in seen: dupes.add(item) else: seen.add(item) return list(dupes) ``` This runs in O(n) time. Trade-off: order of duplicates isn't preserved. If order matters, use dict instead of set for seen.
Mistral Small 4 identifies both the algorithmic inefficiency and the logical bug with duplicate handling. The 262K token context window means you can paste entire codebases for review. At $0.15/Mtok input, this makes it cost-effective for large-scale code analysis, though output quality for highly specialized domains may vary without benchmark data.
I'm presenting quarterly results to non-technical stakeholders. Convert this data into three key takeaways with analogies they'll remember: Revenue +23% YoY, CAC down 18%, churn up 2.1 points, NPS 67.
**Three takeaways for your board:** 1. **Growth engine firing**: Revenue up 23% year-over-year while customer acquisition cost dropped 18%. Think of it like your sales team learned to fish in a better pond — more fish, less bait. 2. **Retention warning light**: Churn increased 2.1 percentage points. That's like a bucket with a slightly bigger leak — we're filling it faster, but we're also losing water we shouldn't. 3. **Customers are advocates**: NPS of 67 means two-thirds of customers would recommend us. That's dinner-party-endorsement territory, not just polite-survey-response territory. Recommendation: investigate churn drivers before they offset acquisition wins.
This example shows Mistral Small 4's ability to reframe technical metrics for different audiences while maintaining accuracy. The model balances accessible analogies with actionable framing. The $0.60/Mtok output pricing makes it economical for bulk content generation, though the lack of public benchmarks means you'll want to verify accuracy on domain-specific terminology.
Analyze this product screenshot and suggest three UX improvements for mobile conversion. Focus on friction points in the checkout flow. [Image: e-commerce checkout page with multiple form fields, small text, and unclear CTA hierarchy]
Based on the checkout interface shown, three high-impact improvements: **1. Collapse optional fields**: Shipping instructions and company name are above the fold but rarely used. Move them behind a "+ Add details" link to reduce perceived form length — studies show 5-7 visible fields convert better than 10+. **2. Enlarge and clarify the primary CTA**: The "Complete Purchase" button uses the same visual weight as "Save for Later". Make it 1.5x larger, use high-contrast color, and add a lock icon for trust signaling. **3. Inline validation with recovery**: Error messages appear only after submission. Add real-time validation that shows green checkmarks as fields are completed correctly, reducing post-submit frustration by 40-60% in typical A/B tests.
Mistral Small 4's multimodal capability lets it process screenshots alongside text prompts, useful for design reviews without manual annotation. The suggestions reference UX research patterns, showing reasoning beyond pixel-level observation. However, without vision-specific benchmarks, complex visual analysis tasks may require validation against specialized vision models.
Use-case deep-dives
When Mistral Small 4 makes sense for support ticket routing at scale
A 12-person SaaS company processing 800+ support tickets daily needs fast classification without burning budget on frontier models. Mistral Small 4 hits the sweet spot here: $0.15/Mtok input means you can run every ticket through intent detection and urgency scoring for roughly $12/day at that volume. The 262K context window lets you include full conversation history plus knowledge base excerpts in a single call, so routing decisions stay consistent even on long threads. Speed matters when customers are waiting—smaller models typically respond in under 2 seconds. The trade-off: if your tickets require nuanced reasoning about edge-case policies or multi-step troubleshooting logic, you'll see accuracy drop below 85% and should move to a larger model. For straightforward triage where you're bucketing into 5-8 categories and flagging priority, this model keeps costs predictable while handling real volume.
Why Mistral Small 4 works for overnight report generation jobs
A legal ops team needs to summarize 200 discovery documents every night—each 15-30 pages—and surface key clauses for morning review. Mistral Small 4's 262K token context means most documents fit in a single prompt without chunking, which keeps summaries coherent and reduces the orchestration headache. At $0.60/Mtok output, generating a 400-word summary costs about $0.24 per document, or $48 for the full nightly batch. That's 70% cheaper than running the same job through a frontier model, and the quality gap on extractive summarization is narrow—you're pulling out facts, not synthesizing novel arguments. The model handles image inputs too, so scanned exhibits process without a separate OCR step. The limit: if summaries need to compare claims across multiple documents or apply complex legal reasoning, accuracy suffers. For single-document extraction where speed isn't critical and you're running hundreds of jobs, the economics are hard to beat.
When Mistral Small 4 falls short on high-stakes moderation decisions
A 4-person community platform moderates 2,000 user posts daily for policy violations across hate speech, misinformation, and self-harm content. Mistral Small 4's low latency and $0.15/Mtok input pricing look attractive—you'd spend under $5/day on inference. But moderation is a precision game: a single false negative (missing a violating post) can trigger legal exposure or user harm, while false positives (flagging safe content) erode trust. Smaller models struggle with context-dependent edge cases—sarcasm, reclaimed slurs, medical vs. harmful self-injury discussion—and you'll likely see precision below 90% without heavy prompt engineering and human review. The 262K context window helps when you need full thread history, but that doesn't fix the core reasoning gap. Unless your moderation needs are low-stakes or you're using this as a first-pass filter before human review, spend the extra $0.30/Mtok on a model with stronger safety tuning and higher benchmark scores on adversarial inputs.
Frequently asked
Is Mistral Small 4 good for general text tasks?
Yes, Mistral Small 4 handles everyday text work well — drafting emails, summarizing documents, answering questions. It's designed as a cost-efficient workhorse for high-volume applications where you don't need frontier reasoning. The 262k context window means you can process entire codebases or long reports in one pass without chunking.
Is Mistral Small 4 cheaper than GPT-4o mini?
Yes. At $0.15 input and $0.60 output per million tokens, Mistral Small 4 undercuts GPT-4o mini ($0.15/$0.60) by matching its pricing exactly while offering a significantly larger 262k context window versus GPT-4o mini's 128k. For batch processing or document analysis, the context advantage makes Mistral Small 4 the better value.
Can Mistral Small 4 handle complex reasoning tasks?
Not reliably. Mistral Small 4 is optimized for speed and cost, not deep reasoning. For multi-step logic, advanced math, or nuanced analysis, you'll want Mistral Large or a frontier model like Claude Sonnet. Use Small 4 for classification, extraction, summarization — tasks where pattern matching beats reasoning depth.
How does Mistral Small 4 compare to Mistral Small 3?
Mistral hasn't published benchmarks yet, but Small 4 doubles the context window from Small 3's 128k to 262k and adds native image understanding. Pricing dropped slightly on input tokens. If you're already using Small 3, the context expansion alone justifies upgrading for document-heavy workflows.
Should I use Mistral Small 4 for high-volume API calls?
Absolutely. The pricing and context window make it ideal for batch processing, content moderation, or customer support automation where you're making thousands of calls daily. The image support means you can handle mixed-media inputs without switching models. Just don't expect it to replace your reasoning-heavy models.