Inflection: Inflection 3 Pi
Inflection 3 Pi powers Inflection's [Pi](https://pi.ai) chatbot, including backstory, emotional intelligence, productivity, and safety. It has access to recent news, and excels in scenarios like customer support and roleplay. Pi...
Anyone in the Space can @-mention Inflection: Inflection 3 Pi with the team's shared context - pooled credits, one chat, one memory.
Starter is free forever - 1 Space, 100 credits/month, 1 MCP. No card.
Verdict
Best for
- Conversational interfaces with empathetic tone
- Customer support chatbots
- Personal assistant style interactions
- Low-stakes dialogue applications
Strengths
Inflection 3 Pi is engineered for dialogue that feels human and empathetic rather than transactional. The model excels at maintaining conversational context within its 8K window and adapting tone to match user sentiment. For applications where user experience hinges on warmth and natural back-and-forth — customer support, personal assistants, companion bots — this model delivers a distinct personality that generic LLMs lack.
Trade-offs
The 8K context window rules out most document analysis and long-form content tasks. Without public benchmark data, you're flying blind on coding ability, reasoning depth, and factual accuracy relative to peers. Output pricing at $10/Mtok is only a modest discount against GPT-4o and Claude 3.5 Sonnet at $15/Mtok, which deliver far broader capabilities and 16-25x larger context windows (128K and 200K tokens). This model trades general competence for conversational polish.
Specifications
- Provider: inflection
- Category: llm
- Context length: 8,000 tokens
- Max output: 1,024 tokens
- Modalities: text
- License: proprietary
- Released: 2024-10-11
Pricing
- Input: $2.50/Mtok
- Output: $10.00/Mtok
- Model ID: inflection/inflection-3-pi
Per-token prices show what the model costs upstream. On Switchy your team draws from one shared org credit pool - one plan, one balance for everyone.
Team cost calculator
5 seats · 80 msgs/day
Switchy meters this against your org's shared credit pool - one plan, one balance for everyone.
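To make the calculator defaults concrete, here is a minimal sketch of the upstream arithmetic. The seat count and messages per day come from the calculator above and the prices from the Pricing section; the tokens per message and the 30-day month are assumptions chosen for illustration, and the result is an upstream dollar estimate, not a Switchy credit figure.

```python
from dataclasses import dataclass

# Upstream list prices from the Pricing section, in USD per million tokens.
INPUT_USD_PER_MTOK = 2.50
OUTPUT_USD_PER_MTOK = 10.00


@dataclass
class Usage:
    seats: int = 5                    # calculator default
    msgs_per_seat_per_day: int = 80   # calculator default
    input_tokens_per_msg: int = 500   # assumption: prompt plus carried history
    output_tokens_per_msg: int = 200  # assumption: well under the 1,024 cap
    days: int = 30                    # assumption: a 30-day month


def monthly_upstream_cost(u: Usage) -> float:
    """Return the estimated upstream cost in USD for one month."""
    messages = u.seats * u.msgs_per_seat_per_day * u.days
    input_cost = messages * u.input_tokens_per_msg / 1_000_000 * INPUT_USD_PER_MTOK
    output_cost = messages * u.output_tokens_per_msg / 1_000_000 * OUTPUT_USD_PER_MTOK
    return round(input_cost + output_cost, 2)


if __name__ == "__main__":
    print(monthly_upstream_cost(Usage()))  # 39.0
```

Under these assumptions, output tokens account for $24 of the $39 even though they are the smaller share of traffic, which is why reply length matters more than prompt length for this model.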
Providers
| Provider | Context | Input | Output | P50 latency | Throughput | 30d uptime |
|---|---|---|---|---|---|---|
| inflection | 8k | $2.50/Mtok | $10.00/Mtok | — | — | — |
Works well with
Top MCPs
Compatibility data comes from first-party telemetry; once we have enough co-usage signal, top MCPs for this model will appear here.
How Switchy teams use it
Starter prompts
Empathetic Support Response
A customer just wrote: 'I've been waiting three days for a refund and no one has responded.' Write a support reply that validates their frustration, explains the delay, and gives a concrete timeline.
Conversational Onboarding
Write a conversational onboarding flow for a budgeting app. The user just signed up. Walk them through linking their first account in a way that feels like a helpful friend, not a manual.
Tone-Matched Email Draft
I received this email: 'Hey, quick question — can I change my subscription plan mid-month or do I have to wait?' Write a reply that matches their casual tone but clearly explains the policy.
Personal Assistant Scheduling
A user says: 'I need to meet with Sarah sometime next week, preferably morning, and I have a dentist appointment Wednesday at 10.' Suggest three meeting times and explain your reasoning.
Empathy-First FAQ Response
Our FAQ says: 'Refunds take 5-7 business days to process.' A user asks why their refund isn't showing yet (it's been 4 days). Write a response that's reassuring and explains what happens next.
Example outputs
Illustrative - representative of the model's voice and quality, not literal recordings.
I'm feeling overwhelmed with work deadlines and family commitments this week. Can you help me think through how to prioritize?
I hear you — that sounds like a lot to juggle at once. Let's break this down together. First, can you tell me which deadline is hitting soonest? And are there any family commitments that are flexible versus ones that absolutely can't move? Sometimes just mapping out what's fixed versus what has wiggle room helps us see where we actually have choices. I'm here to think through this with you, not just hand you a to-do list.
Pi specializes in conversational, empathetic dialogue — this example shows its strength in active listening and collaborative problem-solving rather than directive advice. The 8K context window limits extended back-and-forth sessions compared to newer models with 100K+ windows.
Explain quantum entanglement to me like I'm curious but not a physicist.
Okay, imagine you have two magic coins. You flip one in New York, and instantly — faster than any signal could travel — the other coin in Tokyo lands on the opposite side. Not because they're sending messages, but because they're connected in a way that doesn't fit our everyday intuition about separate objects. That's entanglement: particles that share a quantum state, so measuring one immediately tells you about the other, no matter the distance. Einstein called it 'spooky action at a distance' because it bothered him too.
Demonstrates Pi's conversational teaching style: accessible analogies without condescension. The explanation balances accuracy with readability. At $10/Mtok output, extended tutoring sessions cost only somewhat less than on models like GPT-4o ($15/Mtok), which also offer larger context windows.
I'm drafting an email to decline a project proposal without burning bridges. The client is a long-term relationship. Can you help me find the right tone?
Absolutely. The key here is warmth plus clarity — you want them to feel respected, not dismissed. Start by acknowledging what you value about the relationship and the proposal itself. Then be direct about why it's not the right fit right now, without over-explaining or leaving false hope. Close by reinforcing that you want to stay connected for future opportunities. Would it help if I drafted a few opening lines for you to react to, so we can dial in the tone together?
Showcases Pi's collaborative approach to sensitive communication tasks — it offers process guidance before jumping to solutions. The model's empathetic framing suits interpersonal scenarios well, though the 8K token limit means you can't paste long email threads for context like you could with 128K+ models.
Use-case deep-dives
When empathetic routing matters more than raw accuracy
A 12-person SaaS support team handling 200+ inbound chats daily needs a model that can read emotional tone and route tickets without sounding robotic. Inflection 3 Pi is purpose-built for conversational empathy: it's trained to detect frustration, urgency, and confusion in natural language, then respond in a way that keeps customers calm while gathering context. At $2.50 input / $10.00 output per Mtok, it's roughly 17× the cost of GPT-4o mini ($0.15 / $0.60 per Mtok), so the math only works if you're measuring success by de-escalation rate, not ticket volume. The 8,000-token context window is tight for multi-turn threads that reference past orders or account history; if your average chat runs past 15 exchanges, you're likely to hit the limit. Use this model when your brand voice depends on sounding human and your support load justifies the premium; otherwise, a cheaper general-purpose model with prompt tuning will close 80% of the gap.
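One common way to stay inside a tight window is to drop the oldest turns before each request. The sketch below is illustrative only: the 0.75 words-per-token ratio is a rough English average rather than Pi's actual tokenizer, reserving the full 1,024-token output allowance inside the 8,000-token window is an assumption about how the limit is counted, and `trim_history` is a hypothetical helper name.

```python
CONTEXT_TOKENS = 8_000     # context length from the Specifications section
MAX_OUTPUT_TOKENS = 1_024  # max output from the Specifications section
WORDS_PER_TOKEN = 0.75     # assumption: rough English average, not Pi's tokenizer


def estimate_tokens(text: str) -> int:
    """Approximate a token count from a whitespace word count."""
    words = len(text.split())
    return int(words / WORDS_PER_TOKEN) + 1


def trim_history(
    messages: list[dict],
    budget: int = CONTEXT_TOKENS - MAX_OUTPUT_TOKENS,
) -> list[dict]:
    """Keep the most recent messages whose estimated tokens fit the budget."""
    kept: list[dict] = []
    used = 0
    for message in reversed(messages):  # walk from newest to oldest
        cost = estimate_tokens(message["content"])
        if used + cost > budget:
            break  # everything older than this point is dropped
        kept.append(message)
        used += cost
    kept.reverse()  # restore chronological order
    return kept
```

Trimming by recency is the simplest policy; a support deployment would probably also pin a short summary of order or account details so that dropping old turns does not lose them.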
Where Inflection's empathy training justifies the output cost
A telehealth startup building an AI companion for anxiety management needs a model that won't sound clinical or dismissive when users share vulnerable thoughts. Inflection 3 Pi is explicitly trained on empathetic dialogue: it's designed to validate emotion, ask clarifying questions, and avoid the blunt factual tone that makes other models feel cold in sensitive contexts. The $10.00 output cost per Mtok is steep (roughly 17× GPT-4o mini's $0.60), but if your use case is high-stakes emotional support where a single tone-deaf response tanks user trust, the premium buys you a model that consistently prioritizes relational safety over efficiency. The 8,000-token window limits session depth; if you need to reference weeks of prior conversation, you'll need retrieval augmentation. Deploy this model when empathy is the product and your users are paying for the experience, not when you're optimizing for cost per interaction.
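Retrieval augmentation here means storing short notes from earlier sessions and prepending only the relevant ones to each prompt. The sketch below uses plain word overlap as a stand-in for the embedding similarity a real vector store would provide; `retrieve` and the sample notes are hypothetical and exist only to show the shape of the idea.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase a string and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))


def retrieve(query: str, past_notes: list[str], top_k: int = 2) -> list[str]:
    """Return up to top_k past notes ranked by word overlap with the query."""
    query_words = tokenize(query)
    scored = []
    for note in past_notes:
        overlap = len(query_words & tokenize(note))
        if overlap > 0:
            scored.append((overlap, note))
    # Sort by overlap, highest first; ties keep their original order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [note for _, note in scored[:top_k]]


notes = [
    "User said crowded trains trigger panic on the commute",
    "User enjoys evening walks with the dog",
    "User practiced box breathing before a work presentation",
]
print(retrieve("I felt panic on the train again this morning", notes, top_k=1))
# ['User said crowded trains trigger panic on the commute']
```

Word overlap misses paraphrases ("train" does not match "trains" here), which is the main reason production systems swap in embeddings; the surrounding flow of store, score, and prepend stays the same.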
When conversational warmth beats knowledge retrieval speed
A 40-person remote company wants an AI assistant that answers new-hire questions about benefits, PTO, and company culture without sounding like a policy bot. Inflection 3 Pi excels at this: it's trained to mirror conversational tone, ask follow-ups when questions are vague, and frame answers in a way that feels like talking to a friendly HR rep. At $2.50 input / $10.00 output per Mtok, it's pricier than GPT-4o mini, but the ROI shows up in engagement—new hires actually use it instead of ignoring the chatbot and Slacking HR directly. The 8,000-token context is enough for a single onboarding session but won't hold a week's worth of back-and-forth; if you need persistent memory, pair it with a vector store. Choose this model when your onboarding experience is a retention lever and you're willing to pay for a voice that doesn't sound like a FAQ scraper.
Frequently asked
Is Inflection 3 Pi good for general conversation and personal assistance?
Yes, Pi is specifically designed for conversational use cases and personal assistance. It prioritizes empathetic, natural dialogue over raw task performance. If you need a model for coding, analysis, or structured outputs, use Claude or GPT-4 instead. Pi excels at open-ended chat where tone and personality matter more than benchmark scores.
Is Inflection 3 Pi cheaper than GPT-4 or Claude?
It depends on the comparison. At $2.50 input and $10 output per million tokens, Pi costs roughly 67-75% less than GPT-4 Turbo ($10 / $30 per Mtok) but only about 17-33% less than Claude 3.5 Sonnet ($3 / $15 per Mtok). For high-volume conversational applications where you don't need frontier reasoning capabilities, the saving over GPT-4 Turbo is substantial. Output tokens are the expensive part here, so keep responses concise.
Can Inflection 3 Pi handle long documents with its 8K context window?
No, 8,000 tokens is limiting for document work. That's roughly 6,000 words of input—enough for short articles or chat history, but inadequate for research papers, legal contracts, or multi-document analysis. If you need long-context work, use Claude Opus (200K), Gemini 1.5 Pro (1M), or GPT-4 Turbo (128K) instead.
How does Inflection 3 Pi compare to GPT-3.5 for chatbot applications?
Pi likely offers better conversational quality and personality than GPT-3.5, though without public benchmarks it's hard to quantify reasoning gaps. GPT-3.5 is cheaper ($0.50/$1.50 per Mtok) and has a 16K context window. Choose Pi if conversational tone is critical and you can afford the 5-7x price premium. Choose GPT-3.5 for cost-sensitive deployments or longer context needs.
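The premium can be checked directly from the two price pairs. This is a small sketch using only the per-Mtok figures quoted on this page; `price_multiple` is a hypothetical helper, and the real blended premium depends on your mix of input and output tokens.

```python
# USD per million tokens as (input, output). Pi's prices come from the Pricing
# section; the GPT-3.5 prices are the ones quoted in the answer above.
PI = (2.50, 10.00)
GPT_35 = (0.50, 1.50)


def price_multiple(
    model: tuple[float, float],
    baseline: tuple[float, float],
) -> tuple[float, float]:
    """Return how many times more `model` costs than `baseline` (input, output)."""
    return (round(model[0] / baseline[0], 2), round(model[1] / baseline[1], 2))


print(price_multiple(PI, GPT_35))  # (5.0, 6.67)
```

Because chat workloads tend to be output-heavy relative to their price weight, the effective premium usually sits closer to the output multiple than the input one.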
Should I use Inflection 3 Pi for customer support chatbots?
Only if empathetic tone is your top priority and you don't need tool use or structured outputs. Pi lacks function calling and has no public accuracy benchmarks, making it risky for support scenarios requiring precise information retrieval or CRM integration. For production support, Claude or GPT-4 with function calling will be more reliable despite higher costs.