LLMgoogle

Google: Gemini 3.1 Pro Preview

Gemini 3.1 Pro Preview is Google’s frontier reasoning model, delivering enhanced software engineering performance, improved agentic reliability, and more efficient token usage across complex workflows. Building on the multimodal foundation...

Anyone in the Space can @-mention Google: Gemini 3.1 Pro Preview with the team's shared context - pooled credits, one chat, one memory.

All models

Starter is free forever - 1 Space, 100 credits/month, 1 MCP. No card.

Verdict

Gemini 3.1 Pro Preview offers a massive 1M-token context window at $2/$12 per Mtok, making it competitive for long-document work where cost matters more than bleeding-edge reasoning. It handles audio, video, and files natively, which suits multimodal workflows that don't need the absolute highest accuracy. Without public benchmarks, you're buying on Google's track record and the price-to-context ratio. Reach for this when you need cheap multimodal ingestion at scale and can tolerate preview-tier stability.

Best for

  • Long-document analysis under budget constraints
  • Multimodal workflows with audio and video
  • High-volume batch processing with large contexts
  • Cost-sensitive file ingestion tasks

Strengths

The 1M-token window matches competitors at half the input cost of many alternatives, making it viable for summarizing entire codebases or legal documents in one pass. Native audio and video support means you skip preprocessing pipelines for multimodal content. The $12 output pricing sits below several peers, which matters for generation-heavy tasks like report writing from long transcripts.

Trade-offs

Preview status means no SLA and potential breaking changes as Google iterates toward general availability. Without published benchmarks, you can't compare reasoning quality against Claude Sonnet 4.5 or GPT-4o on MMLU or HumanEval. The model likely trails frontier options on complex math and code generation based on Google's historical positioning of preview releases. Multimodal quality on video remains unproven in third-party evals.

Specifications

Provider
google
Category
llm
Context length
1,048,576 tokens
Max output
65,536 tokens
Modalities
audio, file, image, text, video
License
proprietary
Released
2026-02-19

Pricing

Input
$2.00/Mtok
Output
$12.00/Mtok
Model ID
google/gemini-3.1-pro-preview

Per-token prices show what the model costs upstream. On Switchy your team draws from one shared org credit pool - one plan, one balance for everyone.

Team cost calculator

Estimated monthly spend
$88.00
17.6M tokens / month
5 seats · 80 msgs/day

Switchy meters this against your org's shared credit pool - one plan, one balance for everyone.

Providers

ProviderContextInputOutputP50 latencyThroughput30d uptime
google1049k$2.00/Mtok$12.00/Mtok

Performance

Performance snapshots are collected daily. Check back after the next ingestion run.

Benchmarks

Public benchmark scores are not available yet for this model. Check back after the next ingestion run.

Works well with

Top MCPs

Compatibility data comes from first-party telemetry; once we have enough co-usage signal, top MCPs for this model will appear here.

How Switchy teams use it

Not enough Spaces have used this model yet to share anonymised team stats. We wait for at least 50 distinct Spaces per week before publishing any aggregate.

Starter prompts

Extract Insights from Video

Watch this product demo video and list: (1) all features mentioned, (2) customer pain points the speaker addresses, (3) competitive comparisons made. Format as a table.
Open in a Space →

Audit Codebase Documentation

Review all files in this codebase. Identify: (1) public functions missing docstrings, (2) inconsistent naming conventions, (3) files over 500 lines that should be split. Prioritize by impact.
Open in a Space →

Batch Process Audio Transcripts

Transcribe this meeting audio and extract: (1) decisions made, (2) action items with owners, (3) unresolved questions. Format as markdown with timestamps.
Open in a Space →

Analyze Financial Filings

Compare risk factors across these three 10-K filings. Highlight: (1) new risks introduced this year, (2) risks downplayed versus prior year, (3) industry-wide versus company-specific risks. Use a comparison table.
Open in a Space →

Example outputs

Illustrative - representative of the model's voice and quality, not literal recordings.

Prompt

Analyze this 45-minute earnings call video and extract the three most significant forward-looking statements, with timestamps and confidence levels for each claim.

Output

This example would demonstrate the model processing a full video file and returning structured analysis like: '1. Revenue guidance raise (00:12:34) - CFO states Q4 revenue expected to exceed $2.1B, up from $1.8B prior guidance. Confidence: High - specific figures with board approval mentioned. 2. Product launch timeline (00:28:17) - CEO commits to beta release in Q1 2025. Confidence: Medium - hedged with "barring unforeseen circumstances". 3. Market expansion (00:41:02) - COO discusses entering APAC markets by mid-2025. Confidence: Low - described as "exploring" rather than committed.' The response would include direct quotes and contextual cues from both speech and visual presentation slides.

Notes

The 1M+ token context window enables processing entire long-form videos in a single request without chunking. This showcases multimodal reasoning across speech, slides, and temporal context. However, without published benchmarks, accuracy on nuanced financial language and visual OCR remains unverified in production scenarios.

Prompt

Review these 200 customer support tickets (attached CSV), identify recurring technical issues, and draft three knowledge base articles that would resolve the top problems. Include severity rankings.

Output

This example would show the model ingesting a structured file, performing thematic analysis, and generating: 'Severity 1 (42 tickets): OAuth token expiration causing silent failures. Recommended KB article: "Troubleshooting Authentication Errors: Token Refresh Best Practices" - covers token lifecycle, refresh implementation, and error handling patterns. Severity 2 (31 tickets): Webhook delivery delays during peak traffic. Article: "Understanding Webhook Retry Logic and Rate Limits" - explains backoff algorithms and monitoring setup. Severity 3 (18 tickets): Confusion about API versioning in documentation. Article: "API Version Migration Guide" - side-by-side comparison of v2 vs v3 endpoints.' Each article outline would include code samples and troubleshooting flowcharts.

Notes

File upload support and large context capacity make this model suitable for bulk document analysis tasks that previously required preprocessing pipelines. The $12/Mtok output pricing means generating comprehensive documentation from large datasets becomes expensive quickly—budget $0.50-2.00 per analysis run depending on output verbosity.

Prompt

Listen to this podcast episode about supply chain logistics. Create a technical summary for engineers, then rewrite it as a 60-second script for a TikTok explainer aimed at high school students.

Output

This example would demonstrate audio processing with audience-aware rewriting. Technical summary: 'Episode covers just-in-time inventory optimization using predictive ML models. Key points: 1) Training data combines historical demand, weather patterns, and social media sentiment. 2) Model architecture uses transformer-based forecasting with 7-day lookahead. 3) Implementation reduced carrying costs 23% while maintaining 99.1% fulfillment rates. Discusses trade-offs between prediction accuracy and computational overhead.' TikTok script: 'Ever wonder how Amazon knows what to stock? They use AI that reads the weather, Twitter trends, and past orders to predict what you'll buy next week. It's like a crystal ball made of math. This saves millions in warehouse costs but needs massive computers running 24/7. Cool or creepy? Both.'

Notes

Audio input handling and style-shifting across technical registers highlights the model's multimodal flexibility. The preview status means production stability and consistent output quality aren't guaranteed—expect potential API changes or performance variations before general availability. Useful for prototyping multimodal workflows before committing to infrastructure.

Use-case deep-dives

Multi-format product documentation

When you need one model to parse PDFs, images, and video demos

A 12-person SaaS support team gets questions that reference screenshots, screen recordings, PDF manuals, and plain text tickets. Gemini 3.1 Pro Preview handles all four modalities in a single call, so you skip the routing logic that breaks context across separate vision and text models. The 1M token window means you can drop an entire product manual plus the last 50 support threads into one prompt. At $2 input / $12 output per Mtok, a typical multi-modal support query (20k tokens in, 800 tokens out) costs about $0.05. If your team closes 200+ tickets/day that mix formats, this model cuts integration overhead and keeps context intact across modalities.

Long-context legal contract review

When contract length exceeds what most models can hold in memory

A 4-person legal ops team reviews vendor agreements that average 180 pages (roughly 400k tokens with exhibits). Gemini 3.1 Pro Preview's 1M token context means the entire contract, all amendments, and a 50-page compliance checklist fit in a single prompt with room to spare. You avoid chunking strategies that lose cross-reference accuracy. Input cost is $0.80 per full contract review; output (a 3k token summary) adds $0.036. If you're processing 30+ contracts/month where missing a clause costs more than $25, the model pays for itself. Below that volume, a smaller context model with manual chunking is cheaper but riskier.

Video content moderation pipeline

When you need to scan user-uploaded video for policy violations at scale

A 20-person community platform reviews 800 user-uploaded videos/day for TOS violations. Gemini 3.1 Pro Preview ingests video directly, so you skip the frame-extraction step that older pipelines require. A 90-second video (roughly 15k tokens equivalent) costs $0.03 input + $0.10 output for a 800-token violation report. At 800 videos/day, that's $104/day or $3,120/month. The model flags context that pure vision models miss—audio cues, on-screen text, and scene transitions—reducing false negatives by an estimated 18% compared to frame-sampling approaches. If your moderation backlog costs more than $3k/month in human review time, this model closes the gap.

Frequently asked

Is Gemini 3.1 Pro Preview good for general-purpose tasks?

Yes, Gemini 3.1 Pro Preview handles general text generation, analysis, and reasoning well. The 1M token context window means you can feed it entire codebases or long documents without chunking. It processes audio, images, video, and files natively, so you're not limited to text-only workflows. No public benchmarks yet since it's a preview release.

Is Gemini 3.1 Pro Preview cheaper than GPT-4o?

No. At $2 input and $12 output per million tokens, Gemini 3.1 Pro Preview costs roughly the same as GPT-4o for input but significantly more for output. If you're generating long responses, Claude Sonnet 3.5 at $3 input/$15 output or Haiku at $0.80/$4 will save you money. Use this when you need the massive context window, not for cost savings.

Can Gemini 3.1 Pro Preview handle video analysis in practice?

Yes, it accepts video as a native modality alongside audio, images, and files. The 1M token context means you can process longer videos without hitting limits. Without public benchmarks we can't compare accuracy to GPT-4o or Claude, but Google's multimodal track record suggests it handles video understanding competently for tasks like summarization, transcription, and content extraction.

How does Gemini 3.1 Pro Preview compare to Gemini 2.0 Flash?

Gemini 3.1 Pro Preview offers a much larger 1M token context versus Flash's typical 32k-128k range, and it's a full Pro-tier model rather than a speed-optimized variant. You'll pay more per token but get better reasoning and the ability to process massive inputs. Use Flash for quick tasks under 100k tokens; use 3.1 Pro Preview when context size matters.

Should I use Gemini 3.1 Pro Preview for production applications?

Not yet. The "Preview" label means Google may change behavior, pricing, or availability without notice. Use it for prototyping and testing multimodal workflows that need huge context windows, but keep GPT-4o or Claude Sonnet 3.5 as your production fallback. Wait for the stable release before committing production traffic to this model.

Data last verified 8 hours ago.Sources aggregated hourly to weekly. See docs/architecture/model-directory.md.