
MiniMax: MiniMax M2-her

MiniMax M2-her is a dialogue-first large language model built for immersive roleplay, character-driven chat, and expressive multi-turn conversations. Designed to stay consistent in tone and personality, it supports rich message...

Anyone in the Space can @-mention MiniMax: MiniMax M2-her with the team's shared context - pooled credits, one chat, one memory.


Verdict

MiniMax M2-her is a cost-effective text model with a 65K context window, priced at $0.30 input and $1.20 output per million tokens. Without public benchmark data, its performance profile remains unclear relative to established alternatives like GPT-4o-mini or Claude Haiku. The pricing sits between budget and mid-tier options, making it worth testing if you need moderate context length and are willing to validate quality on your own workloads. Best suited for teams already in the MiniMax ecosystem or those seeking alternatives to Western providers.

Best for

  • Cost-sensitive applications with moderate context needs
  • Teams exploring non-Western model providers
  • Workloads requiring 65K token windows
  • Prototyping before committing to premium models

Strengths

The 65K context window handles most document analysis and multi-turn conversation tasks without chunking. Input pricing at $0.30/Mtok undercuts many mid-tier models, making it viable for high-volume read-heavy workflows. The output rate of $1.20/Mtok keeps generation costs reasonable for applications that produce moderate-length responses. As a proprietary model from MiniMax, it may offer different reasoning patterns than OpenAI or Anthropic alternatives.

Trade-offs

Lack of public benchmarks means you cannot compare performance against GPT-4o-mini, Claude Haiku, or Gemini Flash on standard reasoning and coding tasks. The context window is smaller than Gemini 2.0 Flash's 1M tokens, limiting use cases that require full-book or codebase ingestion. Output pricing is 4x the input rate, which penalizes generation-heavy tasks like creative writing or code synthesis. Without transparency on training data or safety tuning, compliance and content moderation behavior is harder to predict.
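To make the 4x output premium concrete, here is a minimal sketch of the blended per-token rate as the share of generated tokens grows. The two prices come from the Pricing section of this page; the function name and the example output shares are illustrative assumptions.

```python
INPUT_USD_PER_MTOK = 0.30   # from the Pricing section
OUTPUT_USD_PER_MTOK = 1.20  # from the Pricing section


def blended_rate(output_share: float) -> float:
    """Average USD per million tokens when `output_share` of all tokens are generated."""
    if not 0.0 <= output_share <= 1.0:
        raise ValueError("output_share must be between 0 and 1")
    input_share = 1.0 - output_share
    return input_share * INPUT_USD_PER_MTOK + output_share * OUTPUT_USD_PER_MTOK


if __name__ == "__main__":
    # Hypothetical workload mixes, from read-heavy to generation-heavy.
    for share in (0.1, 0.3, 0.5, 0.9):
        print(f"{share:.0%} output -> ${blended_rate(share):.2f}/Mtok")
```

A read-heavy job with 10% output averages about $0.39/Mtok, while a generation-heavy job with 90% output averages about $1.11/Mtok, which is why the trade-off matters mostly for creative writing and code synthesis.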

Specifications

Provider: minimax
Category: llm
Context length: 65,536 tokens
Max output: 2,048 tokens
Modalities: text
License: proprietary
Released: 2026-01-23

Pricing

Input: $0.30/Mtok
Output: $1.20/Mtok
Model ID: minimax/minimax-m2-her

Per-token prices show what the model costs upstream. On Switchy your team draws from one shared org credit pool - one plan, one balance for everyone.

Team cost calculator

Estimated monthly spend: $10.03
Volume: 17.6M tokens / month
Team: 5 seats · 80 msgs/day

Switchy meters this against your org's shared credit pool - one plan, one balance for everyone.
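The page does not document how the calculator arrives at its figure. The sketch below is one set of assumptions that reproduces it: 22 working days per month, 2,000 tokens per message, and a 70/30 input/output split. All three values are guesses made for illustration, not published parameters; only the two per-token prices come from this page.

```python
INPUT_USD_PER_MTOK = 0.30   # from the Pricing section
OUTPUT_USD_PER_MTOK = 1.20  # from the Pricing section


def monthly_estimate(
    seats: int,
    msgs_per_seat_per_day: int,
    working_days: int = 22,        # assumption, not stated on the page
    tokens_per_msg: int = 2_000,   # assumption, not stated on the page
    input_share: float = 0.70,     # assumption, not stated on the page
) -> tuple[int, float]:
    """Return (total tokens per month, estimated USD per month)."""
    total_tokens = seats * msgs_per_seat_per_day * working_days * tokens_per_msg
    input_tokens = total_tokens * input_share
    output_tokens = total_tokens - input_tokens
    cost = (
        input_tokens * INPUT_USD_PER_MTOK + output_tokens * OUTPUT_USD_PER_MTOK
    ) / 1_000_000
    return total_tokens, cost


if __name__ == "__main__":
    tokens, cost = monthly_estimate(seats=5, msgs_per_seat_per_day=80)
    print(f"{tokens / 1e6:.1f}M tokens / month, ${cost:.2f}")
```

With 5 seats and 80 messages per day these defaults give 17.6M tokens and $10.03, matching the figures shown. Change the defaults to match your own message sizes before relying on the number.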

Providers

Provider | Context | Input | Output | P50 latency | Throughput | 30d uptime
--- | --- | --- | --- | --- | --- | ---
minimax | 66k | $0.30/Mtok | $1.20/Mtok | - | - | -

Performance

Performance snapshots are collected daily. Check back after the next ingestion run.

Benchmarks

Public benchmark scores are not available yet for this model. Check back after the next ingestion run.

Works well with

Top MCPs

Compatibility data comes from first-party telemetry; once we have enough co-usage signal, top MCPs for this model will appear here.

How Switchy teams use it

Not enough Spaces have used this model yet to share anonymised team stats. We wait for at least 50 distinct Spaces per week before publishing any aggregate.

Starter prompts

Document Q&A Test

Read this 20-page technical specification and answer: What are the three main security requirements listed in section 4.2, and how do they interact with the authentication flow described in section 3?

Cost-Optimized Summarization

Summarize this 5,000-word article in 150 words, focusing on the author's main argument and the two strongest pieces of supporting evidence.

Multi-Turn Conversation

I'm planning a three-day trip to Kyoto. Ask me five questions one at a time to understand my preferences, then suggest an itinerary based on my answers.

Structured Data Extraction

Extract all product names, prices, and availability status from this e-commerce page HTML and return them as a JSON array of objects.

Code Review Assistance

Review this Python function for potential bugs, performance issues, and style violations. Explain each issue and suggest a fix.

Example outputs

Illustrative - representative of the model's voice and quality, not literal recordings.

Prompt

Review this Python function for potential edge cases and suggest improvements: def calculate_discount(price, discount_percent): return price - (price * discount_percent / 100)

Output

This function has several edge cases to consider:

1. **Negative values**: No validation for negative prices or discount percentages, which could produce nonsensical results.
2. **Discount > 100%**: A discount_percent above 100 would result in a negative final price.
3. **Type safety**: No type checking; strings or None could cause runtime errors.

Improved version:

```python
def calculate_discount(price: float, discount_percent: float) -> float:
    """Return the price after applying a percentage discount."""
    # Reject inputs that would produce a meaningless result.
    if price < 0:
        raise ValueError("Price cannot be negative")
    if not 0 <= discount_percent <= 100:
        raise ValueError("Discount must be between 0 and 100")
    return price * (1 - discount_percent / 100)
```

This adds type hints (checked by static analysis tools, not at runtime), validates inputs, and simplifies the calculation to a single multiplication.

Notes

Demonstrates systematic code review focused on input validation and robustness. The 65K token context window supports reviewing larger files in a single prompt, although the 2,048-token output cap limits how much rewritten code can come back in one response. Without benchmark data, it's unclear how this model compares to specialized code models on complex refactoring tasks.

Prompt

Explain the difference between TCP and UDP protocols to a junior developer who understands basic networking but hasn't worked with transport layers directly.

Output

Think of TCP and UDP as two different delivery services:

**TCP (Transmission Control Protocol)** is like registered mail. It guarantees your package arrives, in order, and confirms delivery. Before sending, it establishes a connection (handshake), then sends data in numbered packets. If a packet is lost, TCP resends it. This reliability comes with overhead: slower but dependable. Use TCP for web browsing, file transfers, emails: anything where accuracy matters more than speed.

**UDP (User Datagram Protocol)** is like dropping postcards in the mail. No handshake, no delivery confirmation, no guaranteed order. Packets just fire off. Much faster, but some might get lost or arrive scrambled. Use UDP for video streaming, online gaming, DNS lookups: scenarios where speed matters and occasional data loss is acceptable.

Rule of thumb: need reliability? TCP. Need speed and can tolerate loss? UDP.

Notes

Shows clear technical explanation with appropriate analogies for the target audience. The $0.30/$1.20 per Mtok pricing makes this cost-effective for documentation and educational content generation. The explanation balances accuracy with accessibility, though depth on advanced topics remains untested without benchmark comparisons.

Prompt

Generate a professional email declining a meeting request because of scheduling conflicts, but expressing interest in connecting later. Keep it warm but concise.

Output

Subject: Re: Meeting Request: Alternative Times?

Hi [Name],

Thank you for reaching out. I'd genuinely like to discuss [topic] with you. Unfortunately, I have conflicts during the times you proposed. Would any of these work instead?

  • [Date/Time Option 1]
  • [Date/Time Option 2]
  • [Date/Time Option 3]

If none of these suit, I'm happy to work around your availability. Looking forward to connecting.

Best,
[Your Name]

Notes

Produces professional business communication with appropriate tone calibration. The model handles the nuance of declining politely while maintaining relationship warmth. At this price point, it's viable for routine business writing tasks. The 65K context window is oversized for this use case but enables handling longer email threads when needed.

Use-case deep-dives

Budget-conscious content drafting

When MiniMax M2-her makes sense for high-volume blog teams

A 4-person content studio pushing 80 blog drafts per month needs a model that won't blow the budget on first-pass generation. MiniMax M2-her at $0.30 input / $1.20 output per million tokens sits in the lower third of the market, at a small fraction of GPT-4o's list price. The 65k token window handles full article briefs plus reference docs without chunking, but the 2,048-token output cap means longer drafts have to be generated section by section. Without public benchmarks, you're trading proven performance data for a lower price. If your workflow already includes human editing and you're spending $400+/month on generation, run a two-week pilot on 20% of your drafts. Watch for factual drift and tone consistency. If those hold, the savings compound fast at volume.

Internal documentation summarization

MiniMax M2-her for teams consolidating Slack and Notion archives

A 12-person ops team wants to turn 6 months of Slack threads and Notion pages into onboarding summaries. MiniMax M2-her's 65k context window fits most weekly thread dumps in a single call, and the $0.30 input rate keeps batch jobs under $50 for the full archive. The lack of public benchmarks means you can't predict summarization quality against ROUGE or BERTScore baselines, so the model fits only if your use case tolerates 15-20% manual correction. Run the first month's archive as a test batch. If the summaries need less than 10 minutes of editing per doc, the model pays for itself in saved research time. If correction time climbs past 20 minutes, switch to a benchmarked alternative.
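For a batch job like this, documents have to be grouped so each call stays inside the window. The following is a minimal sketch of greedy packing. The context and output limits come from the Specifications section. The 4-characters-per-token estimate, the 500-token prompt overhead, and the idea that output tokens count against the same window are assumptions; use the provider's tokenizer if one is available.

```python
CONTEXT_TOKENS = 65_536     # from the Specifications section
MAX_OUTPUT_TOKENS = 2_048   # from the Specifications section
CHARS_PER_TOKEN = 4         # assumption: rough heuristic for English text


def estimate_tokens(text: str) -> int:
    """Crude token estimate; replace with a real tokenizer when possible."""
    return len(text) // CHARS_PER_TOKEN + 1


def pack_batches(docs: list[str], prompt_overhead: int = 500) -> list[list[str]]:
    """Greedily group documents so each batch fits one call.

    The budget reserves room for the reply and for instructions
    (`prompt_overhead` is a hypothetical figure).
    """
    budget = CONTEXT_TOKENS - MAX_OUTPUT_TOKENS - prompt_overhead
    batches: list[list[str]] = []
    current: list[str] = []
    used = 0
    for doc in docs:
        need = estimate_tokens(doc)
        if need > budget:
            raise ValueError("document exceeds the per-call budget; split it first")
        if current and used + need > budget:
            # Current batch is full: close it and start a new one.
            batches.append(current)
            current, used = [], 0
        current.append(doc)
        used += need
    if current:
        batches.append(current)
    return batches


if __name__ == "__main__":
    weekly_dumps = ["x" * 100_000] * 5  # five placeholder thread dumps
    print([len(batch) for batch in pack_batches(weekly_dumps)])
```

Because the reply is capped at 2,048 tokens, packing more documents into one call also means a shorter summary per document, so smaller batches may be preferable even when the input fits.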

Exploratory chatbot prototyping

When to prototype customer support bots with MiniMax M2-her

A 3-person SaaS team is building a support chatbot MVP and needs 2-3 weeks of iteration before committing to a production model. MiniMax M2-her's pricing lets you run 500 test conversations for under $20, and the 65k window handles full ticket histories without truncation. The missing benchmark data is a non-issue in prototype phase—you're validating conversation flow and response structure, not optimizing F1 scores. If your MVP gets traction and conversation volume crosses 200/day, plan the migration to a model with public evals before you scale. For the prototype window, MiniMax M2-her keeps experimentation costs low enough to test 4-5 prompt strategies without budget anxiety.
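Chat costs grow faster than message count because each turn usually resends the whole history as input. The sketch below estimates the cost of one conversation under that assumption. The message sizes and system prompt length are hypothetical defaults; only the per-token prices come from this page.

```python
INPUT_USD_PER_MTOK = 0.30   # from the Pricing section
OUTPUT_USD_PER_MTOK = 1.20  # from the Pricing section


def conversation_cost(
    turns: int,
    user_tokens: int = 150,     # assumption: average user message
    reply_tokens: int = 300,    # assumption: average model reply
    system_tokens: int = 400,   # assumption: system prompt / persona
) -> float:
    """Estimate USD for one conversation where every turn resends the full history."""
    input_total = 0
    history = system_tokens
    for _ in range(turns):
        history += user_tokens    # the new user message joins the context
        input_total += history    # the whole context is billed as input
        history += reply_tokens   # the reply joins the context for the next turn
    output_total = turns * reply_tokens
    return (
        input_total * INPUT_USD_PER_MTOK + output_total * OUTPUT_USD_PER_MTOK
    ) / 1_000_000


if __name__ == "__main__":
    per_chat = conversation_cost(turns=10)
    print(f"one 10-turn chat: ${per_chat:.4f}, 500 chats: ${500 * per_chat:.2f}")
```

Under these assumptions a 10-turn conversation costs a little over one cent, so 500 test conversations land well inside the $20 figure. Longer ticket histories pasted into the context raise the input side quickly.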

Frequently asked

Is MiniMax M2-her good for general text generation tasks?

M2-her handles standard text generation competently with its 65K token context window, making it suitable for document analysis and long-form content. Without public benchmarks, it's hard to assess quality against GPT-4 or Claude, but the pricing suggests a mid-tier model. Test it on your specific use case before committing to production workloads.

Is MiniMax M2-her cheaper than GPT-4o or Claude Sonnet?

Yes, significantly. At $0.30 input and $1.20 output per million tokens, M2-her costs about 17x less on input and 12.5x less on output than GPT-4o ($5/$15), and about 10x less on input and 12.5x less on output than Claude Sonnet 3.5 ($3/$15), going by the list prices quoted here. If you're processing high volumes of text where top-tier reasoning isn't critical, the cost savings are substantial. Just verify output quality meets your standards first.
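The multipliers follow directly from dividing the per-million-token prices. This short sketch uses the competitor prices as quoted in this answer; they may be out of date, so treat them as inputs to replace rather than as current list prices.

```python
# (input USD/Mtok, output USD/Mtok); competitor figures are those quoted in this FAQ.
PRICES = {
    "MiniMax M2-her": (0.30, 1.20),
    "GPT-4o": (5.00, 15.00),
    "Claude Sonnet 3.5": (3.00, 15.00),
}


def price_ratios(baseline: str, other: str) -> tuple[float, float]:
    """Return how many times more `other` costs than `baseline` for (input, output)."""
    base_in, base_out = PRICES[baseline]
    other_in, other_out = PRICES[other]
    return other_in / base_in, other_out / base_out


if __name__ == "__main__":
    for name in ("GPT-4o", "Claude Sonnet 3.5"):
        ratio_in, ratio_out = price_ratios("MiniMax M2-her", name)
        print(f"{name}: {ratio_in:.1f}x input, {ratio_out:.1f}x output")
```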

Can MiniMax M2-her handle 65K tokens reliably in practice?

The 65K context window is advertised, but real-world performance depends on how the model maintains coherence across that span. No needle-in-haystack or long-context benchmarks have been published, so there is no evidence either way; as a precaution, plan for possible degradation in the upper part of the window until your own tests say otherwise. For critical applications requiring full context utilization, test with your actual documents before deploying. Gemini 1.5 Pro offers better-validated long-context performance if that's your priority.
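A simple way to run such a test yourself is a needle-in-a-haystack probe: hide one fact at different depths of a long prompt and check whether the model retrieves it. The sketch below is a harness only. `ask_model` is a hypothetical callable that you would wire to your own client, and `stub_model` is a stand-in used here so the example runs without any API.

```python
import re
from typing import Callable, Dict, Iterable, Optional

FILLER = "The quick brown fox jumps over the lazy dog. "


def build_haystack(needle: str, total_chars: int, depth: float) -> str:
    """Insert `needle` at a relative depth (0.0 = start, 1.0 = end) of filler text."""
    body = (FILLER * (total_chars // len(FILLER) + 1))[:total_chars]
    cut = int(len(body) * depth)
    return body[:cut] + "\n" + needle + "\n" + body[cut:]


def run_probe(
    ask_model: Callable[[str], str],
    secret: str,
    total_chars: int,
    depths: Iterable[float],
) -> Dict[float, bool]:
    """Return, for each depth, whether the model's answer contains the secret."""
    needle = f"The access code for the archive is {secret}."
    question = "\n\nWhat is the access code for the archive? Reply with the code only."
    results: Dict[float, bool] = {}
    for depth in depths:
        prompt = build_haystack(needle, total_chars, depth) + question
        results[depth] = secret in ask_model(prompt)
    return results


def stub_model(prompt: str, visible_chars: Optional[int] = None) -> str:
    """Stand-in for a real model: 'reads' only the first `visible_chars` characters."""
    text = prompt if visible_chars is None else prompt[:visible_chars]
    match = re.search(r"access code for the archive is (\w+)", text)
    return match.group(1) if match else "unknown"


if __name__ == "__main__":
    # Simulate a model that loses everything past the first 1,000 characters.
    limited = lambda prompt: stub_model(prompt, visible_chars=1_000)
    print(run_probe(limited, "K7Q2", 10_000, [0.0, 0.5, 1.0]))
```

Against a real model, repeat the probe at several prompt lengths up to the window size and with your own documents as filler, since repetitive filler text is easier than real content.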

How does MiniMax M2-her compare to other Chinese LLMs like Qwen or DeepSeek?

Without public benchmarks, direct comparison is speculative. Qwen2.5 and DeepSeek-V3 publish extensive eval scores for coding and reasoning tasks, so their performance can be checked before adoption; M2-her's cannot. M2-her's pricing sits between budget and premium tiers, which hints at mid-range capabilities but proves nothing. If you need proven performance metrics for decision-making, choose models with published MMLU, HumanEval, or GSM8K scores instead.

Should I use MiniMax M2-her for production chatbots or customer support?

Only after thorough testing. The lack of public benchmarks means you're flying blind on safety, hallucination rates, and instruction-following quality. For customer-facing applications, start with models that have documented safety evaluations like GPT-4o-mini or Claude Haiku. Use M2-her for internal tools or non-critical workflows where you can tolerate occasional errors and iterate based on real usage.

Data last verified 8 hours ago. Sources aggregated hourly to weekly. See docs/architecture/model-directory.md.