LLM · openai · Plan: Pro and up

OpenAI: GPT-5 Nano

GPT-5-Nano is the smallest and fastest variant in the GPT-5 system, optimized for developer tools, rapid interactions, and ultra-low latency environments. While limited in reasoning depth compared to its larger...

Anyone in the Space can @-mention OpenAI: GPT-5 Nano with the team's shared context - pooled credits, one chat, one memory.

Verdict

GPT-5 nano is OpenAI's tiniest tier: the model where you stop worrying about per-call cost entirely and start using LLM calls anywhere a regex used to live. Not nuanced, not deep, not a flagship, but cheap enough to be everywhere. What we notice: nano answers simple questions correctly almost always, fumbles on anything that needs reasoning across more than a sentence or two, and has surprisingly good adherence to JSON schemas at this size. The voice is fine for short outputs and forgettable for long ones, which is appropriate, since "long output" isn't what nano is for.

Best for: high-volume routing, classification, sentiment detection, and label extraction; replacing rule-based heuristics in app logic ("does this email need a response?", "is this a refund request?"); pre-filtering before a GPT-5 mini or full GPT-5 call; per-request features inside an app where the per-call cost has to round to nothing.

Avoid for: anything more open-ended than picking from a known list; coding tasks (even the smallest); tasks where the answer being approximately right isn't good enough.

Pricing frame: at sub-$0.10/Mtok input, this is the cheapest tier of any flagship-family model, cheaper than GPT-4o mini was, with usable reasoning for simple tasks. A team running tens of thousands of short nano calls a day lands under $20/month.
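The pre-filter pattern described above can be sketched roughly as follows. This is a minimal illustration, not Switchy's or OpenAI's API: `classify` is an assumed stand-in for whatever function sends the text to nano and returns its raw reply, and the label set, `route`, and `fake_classify` are hypothetical names invented for the example.

```python
from typing import Callable

# Hypothetical label set; a real app would define its own.
LABELS = {"refund_request", "needs_response", "no_action"}


def route(text: str, classify: Callable[[str], str]) -> str:
    """Return a known label, or "escalate" when the cheap model's reply is unusable.

    `classify` is an assumed stand-in for a call to the small model: it takes
    the message text and returns the model's raw string reply.
    """
    reply = classify(text).strip().lower()
    # Accept only labels from the known list; anything else goes to a larger model.
    return reply if reply in LABELS else "escalate"


# Stub classifier so the sketch runs without network access.
def fake_classify(text: str) -> str:
    return "Refund_Request\n" if "refund" in text.lower() else "I think maybe..."


print(route("I want a refund for order 1182", fake_classify))  # refund_request
print(route("What is the meaning of life?", fake_classify))    # escalate
```

Restricting the output to a closed label set keeps the small model inside the "picking from a known list" territory the verdict recommends, and gives the app an explicit path to a larger model when the reply falls outside it.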

Best for

Budget-conscious long-document analysis
High-volume content moderation tasks
Prototyping with large context windows
Cost-sensitive customer support automation

Strengths

The 400K context window is roughly three times GPT-4 Turbo's 128K while undercutting it dramatically on price, so you can process entire codebases or legal documents for pennies. Multimodal support (text, image, file) means you can handle screenshots and PDFs without preprocessing. The $0.05 input rate makes it viable for high-throughput applications where GPT-4o would blow your budget.

Trade-offs

Zero public benchmarks means you can't compare reasoning, code generation, or instruction-following against Claude, Gemini, or even older GPT-4 variants. Early-access models often lag behind flagship versions on complex tasks. The output rate ($0.40/Mtok) is still 8x the input cost, so verbose responses add up fast. If your use case demands proven accuracy on MMLU or HumanEval, you're guessing until data arrives.

Specifications

Provider: openai
Category: llm
Context length: 400,000 tokens
Max output: —
Modalities: text, image, file
License: proprietary
Released: 2025-08-07

Pricing

Input: $0.05/Mtok
Output: $0.40/Mtok
Model ID: openai/gpt-5-nano

Per-token prices show what the model costs upstream. On Switchy your team draws from one shared org credit pool - one plan, one balance for everyone.

Team cost calculator

Seats: 5 people
Messages / seat / day: 80
Avg turn size: 2 ktok
Output share: 30 %

Estimated monthly spend

$2.73

17.6M tokens / month
5 seats · 80 msgs/day

Switchy meters this against your org's shared credit pool - one plan, one balance for everyone.
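The estimate above can be reproduced with a few lines of arithmetic. This is a sketch under stated assumptions, not Switchy's actual calculator: the page does not show its formula, and the 22 working days per month is inferred from the 17.6M tokens figure (5 × 80 × 2,000 × 22). The function name and parameters are hypothetical.

```python
def monthly_spend(seats, msgs_per_seat_per_day, tokens_per_turn, output_share_pct,
                  input_price_per_mtok=0.05, output_price_per_mtok=0.40,
                  working_days=22):
    """Return (total tokens per month, estimated cost in dollars).

    working_days=22 is an assumption inferred from the page's 17.6M figure.
    """
    total_tokens = seats * msgs_per_seat_per_day * tokens_per_turn * working_days
    # Integer arithmetic keeps the input/output token split exact.
    output_tokens = total_tokens * output_share_pct // 100
    input_tokens = total_tokens - output_tokens
    cost = (input_tokens * input_price_per_mtok
            + output_tokens * output_price_per_mtok) / 1_000_000
    return total_tokens, round(cost, 2)


tokens, cost = monthly_spend(5, 80, 2_000, 30)
print(tokens, cost)  # 17600000 2.73
```

Because output is billed at 8x the input rate, the 30 % output share accounts for about $2.11 of the $2.73 total.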

Providers

Provider	Context	Input	Output	P50 latency	Throughput	30d uptime
openai	400k	$0.05/Mtok	$0.40/Mtok	—	—	—

Performance

Performance snapshots are collected daily. Check back after the next ingestion run.

Benchmarks

Public benchmark scores are not available yet for this model. Check back after the next ingestion run.

Works well with

Top MCPs

Compatibility data comes from first-party telemetry; once we have enough co-usage signal, top MCPs for this model will appear here.

How Switchy teams use it

Not enough Spaces have used this model yet to share anonymised team stats. We wait for at least 50 distinct Spaces per week before publishing any aggregate.

Starter prompts

Summarize Legal Contract

Read this contract and produce a bullet-point summary of payment terms, termination clauses, and liability caps. Flag any non-standard language.

Moderate User Content

Review these 50 user comments. Flag any that violate our policy against hate speech, spam, or personal attacks. Return a JSON array with comment IDs and reasons.

Analyze Codebase Structure

Here are 30 Python files from our API service. List the main classes, their dependencies, and any circular imports you find.

Extract Data from Invoice PDFs

Extract vendor name, invoice number, line items, and total from this PDF. Return as JSON with fields: vendor, invoice_id, items (array), total.
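When a prompt like this feeds application code, it usually helps to check the reply before trusting it. The sketch below validates the four fields named in the prompt using only the standard library; the expected types, the `parse_invoice` helper, and the sample reply are assumptions made for illustration, not output recorded from the model.

```python
import json

# Field names come from the prompt; the expected types are an assumption.
REQUIRED = {"vendor": str, "invoice_id": str, "items": list, "total": (int, float)}


def parse_invoice(raw: str) -> dict:
    """Parse the model's reply and check the four requested fields and their types."""
    data = json.loads(raw)  # raises json.JSONDecodeError on malformed JSON
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    for field, expected in REQUIRED.items():
        if field not in data:
            raise ValueError(f"missing field: {field}")
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(data[field], bool) or not isinstance(data[field], expected):
            raise ValueError(f"wrong type for field: {field}")
    return data


# Invented sample reply, used only to exercise the validator.
reply = '{"vendor": "Acme Ltd", "invoice_id": "INV-001", "items": [], "total": 120.5}'
print(parse_invoice(reply)["total"])  # 120.5
```

A failed check is a natural point to retry the request or hand the document to a larger model.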

Draft Customer Support Reply

A customer says their order arrived damaged. Draft a reply apologizing, offering a replacement or refund, and asking for photos of the damage.

Example outputs

Illustrative - representative of the model's voice and quality, not literal recordings.

Prompt

Review this Python function for edge cases and suggest improvements. Focus on input validation and error handling that might break in production.

Output

This function lacks validation for empty iterables and doesn't handle non-numeric types gracefully. The division operation will raise ZeroDivisionError when the list is empty. Add a guard clause checking `if not values: raise ValueError("Cannot compute mean of empty sequence")` at the start. Consider using `isinstance()` checks or try-except blocks for type safety. The current implementation also fails silently with None values—either filter them explicitly or document that behavior. For production use, add logging around the exception paths and consider returning a Result type instead of raising.

Notes

GPT-5 Nano's 400K context window allows it to analyze entire codebases for systemic patterns, but this example shows its strength in focused code review. The response identifies three distinct failure modes and provides actionable fixes. Output is billed at $0.40/Mtok, 8x the input rate, so in longer refactoring sessions the generated code, not the context, drives the cost.

Prompt

Analyze this customer support transcript and extract: sentiment, main issue, whether it was resolved, and any product feedback. Format as structured JSON.

Output

Based on the transcript, the model would produce: `{"sentiment": "frustrated → satisfied", "main_issue": "billing_error_double_charge", "resolved": true, "resolution_time_minutes": 12, "product_feedback": ["unclear refund timeline in UI", "wants email confirmation for disputes"], "agent_performance": "effective—acknowledged frustration early, explained process clearly"}`. The JSON structure captures both explicit statements and implied context, like the sentiment shift after the agent's explanation. It correctly identifies the UI clarity issue even though the customer didn't use those exact words.

Notes

This showcases GPT-5 Nano's multimodal file handling: it can process uploaded transcripts, chat logs, or support tickets directly. The inference quality is strong for business document analysis. Even at $0.05/Mtok input, though, batch processing thousands of tickets daily can cost more than a small specialized classification model, so compare both before committing.

Prompt

I'm attaching a photo of a handwritten recipe card from my grandmother. Transcribe it, convert measurements to metric, and suggest modern substitutions for any outdated ingredients.

Output

The model would process the image and return: "**Grandma's Molasses Cookies** // Original: 1 cup shortening → 225g butter or neutral oil // 1 cup molasses (keep as-is, or use 200g dark treacle) // 2 eggs → 2 large eggs (100g) // 3 cups flour → 375g all-purpose flour // [continues with full transcription] // **Modern substitutions:** Replace shortening with European-style butter for richer flavor, or use refined coconut oil for dairy-free. The recipe calls for 'sour milk'—use 240ml buttermilk or add 1 tbsp lemon juice to regular milk. Baking soda amount (2 tsp) is correct for the molasses acidity."

Notes

Image understanding combined with domain knowledge (baking chemistry, ingredient evolution) demonstrates GPT-5 Nano's multimodal reasoning. The 400K context means it could process an entire handwritten cookbook in one request. Trade-off: no public benchmarks yet, so vision performance relative to GPT-4o or Claude 3.5 Sonnet remains unverified in production scenarios.

Use-case deep-dives

High-volume customer support triage

Why GPT-5 Nano handles 500+ daily tickets without breaking budget

A 12-person SaaS company routing 500-800 support tickets daily needs fast classification and draft responses without the per-token cost of frontier models. GPT-5 Nano delivers at $0.05 input / $0.40 output per Mtok, roughly 50x cheaper than GPT-4o on input ($2.50) and 25x cheaper on output ($10.00). The 400k context window means you can dump entire ticket histories, past resolutions, and knowledge base articles into a single prompt without chunking. At this volume, you're processing maybe 15-20M tokens daily; Nano keeps that under $10/day where GPT-4o would run $60-80. The trade-off: no public benchmarks yet, so you'll want to A/B test classification accuracy against your current model for two weeks before full rollover. If accuracy holds within 2-3 points of your baseline, the cost savings fund another support hire.
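The A/B check described above comes down to comparing two accuracy figures against the same human-labeled tickets. A minimal sketch, assuming you already have label lists for each model; the function names, the 3-point tolerance default, and the toy labels are illustrative rather than taken from any real evaluation.

```python
def accuracy(predictions, gold):
    """Share of tickets where the predicted label equals the human label."""
    if len(predictions) != len(gold) or not gold:
        raise ValueError("need two equal-length, non-empty label lists")
    return sum(p == g for p, g in zip(predictions, gold)) / len(gold)


def within_tolerance(candidate_acc, baseline_acc, max_drop_points=3.0):
    """True when the candidate loses no more than `max_drop_points` percentage points."""
    return (baseline_acc - candidate_acc) * 100 <= max_drop_points


# Toy labels, invented for illustration.
gold     = ["billing", "bug", "billing", "howto", "bug", "billing", "howto", "bug", "bug", "howto"]
baseline = ["billing", "bug", "billing", "howto", "bug", "billing", "howto", "bug", "bug", "bug"]
nano     = ["billing", "bug", "billing", "howto", "bug", "billing", "bug", "bug", "bug", "bug"]

print(accuracy(baseline, gold), accuracy(nano, gold))                    # 0.9 0.8
print(within_tolerance(accuracy(nano, gold), accuracy(baseline, gold)))  # False
```

Ten tickets is far too few for a real decision; two weeks of traffic at 500-800 tickets a day gives a sample large enough for a 2-3 point difference to mean something.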

Multi-document contract analysis

When 400k context beats RAG for legal teams under 10 people

A 6-attorney firm reviewing vendor contracts (20-40 pages each, plus exhibits) traditionally chunks documents into a RAG pipeline or pays for Claude Opus at $15 input / $75 output per Mtok. GPT-5 Nano's 400k window fits 3-4 full contracts plus a 10-page checklist in one prompt: no retrieval layer, no embedding costs, no context loss from chunking. At $0.05 input, a 150k-token prompt (roughly 3 contracts) costs $0.0075; the same workload on Opus is $2.25. For firms processing 50-100 contracts monthly at about 50k tokens each, that is 2.5-5M input tokens: roughly $37-75 on Opus versus $0.13-0.25 on Nano for a single pass, with output tokens and repeat passes widening the gap. The risk: without MMLU-Pro or GPQA scores, you don't know if Nano matches Opus on clause extraction accuracy. Run 10 contracts side-by-side, score precision on non-compete and indemnity clauses, and if Nano hits 95%+ agreement, make the switch.
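One way to score the side-by-side run is to treat the incumbent model's flagged clauses as the reference and measure what fraction of Nano's flagged clauses appear in it. That choice of reference is an assumption; attorney-reviewed labels would be a stronger baseline. The function name and the toy clause tuples below are invented for illustration.

```python
def clause_precision(candidate: set, reference: set) -> float:
    """Fraction of the candidate's flagged clauses that the reference also flagged."""
    if not candidate:
        raise ValueError("candidate flagged no clauses; precision is undefined")
    return len(candidate & reference) / len(candidate)


# Toy data: (contract, clause type, section) tuples, invented for illustration.
reference = {("A", "indemnity", "7.2"), ("A", "non-compete", "4.1"), ("B", "indemnity", "9")}
candidate = reference | {("B", "non-compete", "3")}

score = clause_precision(candidate, reference)
print(score)          # 0.75
print(score >= 0.95)  # False, so this toy run would not clear the 95% bar
```

Precision alone misses clauses the candidate failed to flag, so a fuller comparison would also compute recall with the two sets swapped.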

Real-time meeting transcription enrichment

How Nano turns 90-minute calls into structured Notion pages for $0.12 a month

A 15-person agency records client calls (60-90 minutes, ~40k tokens transcribed) and needs action items, decisions, and follow-ups extracted into Notion within 5 minutes of the call ending. GPT-5 Nano processes a 40k-token transcript for $0.002 input, generates a 2k-token structured summary for $0.0008 output—total cost $0.003 per call. At 40 calls/month, that's $0.12 versus $4.80 on GPT-4o. The 400k context means you can include the last 3 calls with this client for continuity without prompt engineering tricks. The caveat: image and file modalities suggest Nano can handle slide decks shared during calls, but latency matters here—if Nano's time-to-first-token is over 3 seconds, users will feel the lag. Test 5 calls end-to-end; if summaries land in Notion under 20 seconds and capture 90%+ of action items compared to manual notes, deploy it.
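The per-call figures above follow directly from the token counts and the per-Mtok rates. A short sketch of the arithmetic, where the GPT-4o rates of $2.50 input / $10.00 output per Mtok are an assumption (they are the rates that reproduce the $4.80 figure) and `call_cost` is a hypothetical helper:

```python
def call_cost(input_tokens, output_tokens, input_price, output_price):
    """Cost in dollars, with prices given in dollars per million tokens."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


nano = call_cost(40_000, 2_000, 0.05, 0.40)
gpt4o = call_cost(40_000, 2_000, 2.50, 10.00)  # assumed GPT-4o list prices

print(round(nano, 4), round(nano * 40, 2))    # 0.0028 0.11
print(round(gpt4o, 2), round(gpt4o * 40, 2))  # 0.12 4.8
```

The unrounded monthly total for Nano is about $0.11; the $0.12 quoted above comes from rounding each call up to $0.003 before multiplying by 40.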

Frequently asked

Is GPT-5 Nano good for coding tasks?

Only for narrow, well-scoped tasks, and with verification. GPT-5 Nano can review a single function or flag obvious bugs, and the 400k token context window means you can feed it entire codebases for analysis, but the Verdict above advises against relying on it for coding, and the lack of public benchmarks makes direct comparisons harder. At $0.05 input per million tokens, it's cheap enough to try on frequent review passes; the pricing suggests OpenAI positioned this as a workhorse model.

Is GPT-5 Nano cheaper than GPT-4o?

GPT-5 Nano is significantly cheaper on both input ($0.05 vs GPT-4o's $2.50 per Mtok) and output ($0.40 vs $10.00), which is 50x less on input and 25x less on output at list prices. The advantage is largest for tasks where you send large contexts but need short responses, such as document analysis, code review, or search. It stays cheaper for chat applications with longer outputs as well; there the deciding factor is output quality rather than price.

Can GPT-5 Nano handle 400k tokens in practice?

The 400k context window is real and usable, but expect slower response times and higher costs as you approach the limit. At $0.05 per million input tokens, a full 400k context costs $0.02—cheap enough for document processing workflows. Image and file inputs count against this limit, so a PDF-heavy prompt will eat tokens faster than plain text.

How does GPT-5 Nano compare to GPT-4 Turbo?

GPT-5 Nano trades raw capability for cost efficiency. The input pricing is 200x cheaper than GPT-4 Turbo's $10/Mtok list price, making it viable for high-volume applications where GPT-4 Turbo's cost would be prohibitive. Without public benchmarks, we can't quantify the quality gap, but the "Nano" naming suggests OpenAI expects some performance trade-off. Use Nano for batch processing; use Turbo when quality is non-negotiable.

Should I use GPT-5 Nano for customer-facing chatbots?

Only if your responses are short and simple. Cost is rarely the obstacle: at $0.40/Mtok output, a 1,000-token response costs $0.0004, and GPT-4o's $10/Mtok output rate is 25 times higher. The constraint is quality over long, open-ended replies. Nano is better for support ticket triage or summarization, where you read a lot and write a little, than for extended conversation.