LLMgoogle

Google: Gemma 4 31B (free)

Gemma 4 31B Instruct is Google DeepMind's 30.7B dense multimodal model supporting text and image input with text output. Features a 256K token context window, configurable thinking/reasoning mode, native function...

Anyone in the Space can @-mention Google: Gemma 4 31B (free) with the team's shared context - pooled credits, one chat, one memory.

All models

Starter is free forever - 1 Space, 100 credits/month, 1 MCP. No card.

Verdict

Gemma 4 31B delivers strong reasoning and multimodal capabilities at zero cost, making it ideal for prototyping and high-volume workflows where budget matters more than bleeding-edge performance. The 262K context window handles long documents comfortably, though response quality trails paid alternatives like Claude or GPT-4o on complex tasks. Reach for this when you need a capable workhorse that won't generate an invoice.

Best for

Prototyping multimodal applications without spend
High-volume document processing on tight budgets
Educational projects requiring vision capabilities
Internal tools where cost trumps peak accuracy

Strengths

The zero-cost pricing removes friction from experimentation and scales to unlimited volume without budget anxiety. Native image and video understanding means you can build multimodal workflows without stitching together separate services. The 262K context window accommodates full research papers or lengthy codebases in a single pass, and Google's infrastructure keeps latency reasonable despite the free tier.

Trade-offs

Response quality lags behind frontier models on nuanced reasoning tasks—expect more verbose outputs and occasional logic gaps compared to Claude Sonnet or GPT-4o. Multimodal performance is functional but not exceptional; vision tasks requiring fine detail or complex spatial reasoning will surface limitations. The proprietary license restricts commercial deployment options, and lack of public benchmarks makes apples-to-apples comparison difficult.

Specifications

Provider: google
Category: llm
Context length: 262,144 tokens
Max output: 32,768 tokens
Modalities: image, text, video
License: proprietary
Released: 2026-04-02

Pricing

Input: $0.00/Mtok
Output: $0.00/Mtok
Model ID: google/gemma-4-31b-it:free

Per-token prices show what the model costs upstream. On Switchy your team draws from one shared org credit pool - one plan, one balance for everyone.

Team cost calculator

Seats5 peopleMessages / seat / day80Avg turn size2 ktokOutput share30 %

Estimated monthly spend

Freeno token cost

17.6M tokens / month
5 seats · 80 msgs/day

Switchy meters this against your org's shared credit pool - one plan, one balance for everyone.

Providers

Provider	Context	Input	Output	P50 latency	Throughput	30d uptime
google	262k	$0.00/Mtok	$0.00/Mtok	—	—	—

Performance

Performance snapshots are collected daily. Check back after the next ingestion run.

Benchmarks

Public benchmark scores are not available yet for this model. Check back after the next ingestion run.

Works well with

Top MCPs

Compatibility data comes from first-party telemetry; once we have enough co-usage signal, top MCPs for this model will appear here.

How Switchy teams use it

Not enough Spaces have used this model yet to share anonymised team stats. We wait for at least 50 distinct Spaces per week before publishing any aggregate.

Starter prompts

Analyze Product Screenshots

Analyze this screenshot and list every interactive element you see. For each button, input field, and navigation item, describe its purpose and suggest one UX improvement.

Open in a Space →

Summarize Long Research Papers

Read this entire research paper and provide a 300-word summary covering: research question, methodology, key findings, and limitations. Use bullet points for findings.

Open in a Space →

Generate Test Data Sets

Generate 50 realistic customer records with name, email, purchase history (3-5 items each), and account creation date. Format as JSON array. Make the data diverse and believable.

Open in a Space →

Explain Code from Screenshots

Look at this code screenshot and explain what the function does in plain English. Identify any potential bugs or performance issues, and suggest one refactoring improvement.

Open in a Space →

Draft Educational Content

Create a 5-minute lesson plan teaching high schoolers about photosynthesis. Include a hook, three key concepts with analogies, and one hands-on activity they can do at home.

Open in a Space →

Example outputs

Illustrative - representative of the model's voice and quality, not literal recordings.

Prompt

Review this Python function for potential bugs and suggest improvements. Focus on edge cases and performance: def calculate_discount(price, discount_pct): return price - (price * discount_pct / 100)

Output

The function has several issues worth addressing. First, there's no input validation — negative prices or discount percentages above 100 would produce nonsensical results. Second, floating-point arithmetic can introduce precision errors for currency calculations. Consider: adding type hints, validating that 0 <= discount_pct <= 100, checking price >= 0, and using Decimal for monetary values. For performance, this function is already O(1), but if called frequently in a loop, pre-calculating the multiplier (discount_pct / 100) outside the loop would help. A robust version might look like: from decimal import Decimal; def calculate_discount(price: Decimal, discount_pct: float) -> Decimal with validation guards at the start.

Notes

This example demonstrates Gemma 4's code review capabilities across multiple dimensions — correctness, edge cases, and performance. The 262K token context window means it could review entire codebases in one pass. However, without benchmark data, we can't verify how it compares to specialized code models on complex refactoring tasks.

Prompt

I have a 45-minute video of a product demo. Extract the key feature announcements, their timestamps, and any pricing information mentioned. Summarize each feature in one sentence.

Output

The model would process the video input and return a structured summary like: **00:03:20 - Cloud Sync**: Automatic backup to three cloud providers with end-to-end encryption. **00:12:45 - Collaboration Mode**: Real-time co-editing for up to 50 users simultaneously. **00:28:10 - API Access**: RESTful API with 10,000 requests/month on the free tier. **00:38:55 - Pricing**: $29/month for Pro (unlimited projects), $99/month for Enterprise (custom integrations, SSO). **00:42:30 - Mobile Apps**: Native iOS and Android apps launching Q2 2024. The model would identify these moments by analyzing both visual slides and spoken narration across the full video length.

Notes

Gemma 4's multimodal capabilities (text, image, video) make it suitable for long-form video analysis tasks that would require multiple tools otherwise. The 262K context window accommodates roughly 45-60 minutes of dense video content. The free pricing removes cost barriers for exploratory video processing, though processing speed isn't specified.

Prompt

Generate a detailed creative brief for a product photoshoot. Product: ergonomic wireless mouse. Target audience: remote workers aged 28-45. Brand tone: minimal, professional, slightly warm. Include lighting setup, composition notes, and three specific shot ideas.

Output

**Creative Brief: Ergonomic Wireless Mouse Campaign** **Lighting**: Soft, diffused natural light from a north-facing window, supplemented with a white reflector on the shadow side. Aim for a 2:1 lighting ratio to maintain detail while creating gentle dimensionality. **Composition**: Clean, uncluttered backgrounds — think light oak desk surfaces, linen textures, or matte concrete. Negative space should dominate; the mouse occupies no more than 30% of the frame. **Shot 1 - The Workspace Hero**: Mouse positioned at a 45-degree angle on a minimal desk with a laptop edge visible, coffee cup in soft focus background. Conveys integration into daily workflow. **Shot 2 - The Ergonomic Detail**: Extreme close-up on the thumb rest area, showing texture and contour. Human hand entering frame from the side, not yet touching — implies comfort before contact. **Shot 3 - The Lifestyle Context**: Overhead flat lay with mouse, notebook, wireless earbuds, and a small plant. Suggests the complete remote work ecosystem.

Notes

This showcases Gemma 4's ability to generate structured creative direction that bridges technical photography knowledge with brand strategy. The model handles the image-related reasoning well despite being text-output. The free tier makes it accessible for small creative teams doing planning work, though actual image generation would require a different model.

Use-case deep-dives

Prototype chatbot development

Why Gemma 4 31B works for early-stage product validation

A 4-person startup building a customer support bot needs to iterate fast without burning runway on API costs. Gemma 4 31B is the right call here: $0.00/Mtok means unlimited experimentation during the prototype phase when you're testing 20 prompt variations a day and don't know which conversation flows will stick. The 262k context window handles full support ticket histories without truncation, and multimodal support (text, image, video) future-proofs the bot if you add screenshot troubleshooting later. The trade-off is response quality—once you're past product-market fit and serving real customers at scale, you'll likely graduate to a paid model with stronger reasoning benchmarks. For now, validate the concept without a credit card.

Internal document Q&A

When free context length beats paid accuracy for knowledge retrieval

A 12-person consulting firm wants employees to query a 200-page compliance manual without reading the whole thing. Gemma 4 31B's 262k token window fits the entire document in a single prompt, eliminating the chunking and embedding overhead that paid RAG setups require. At $0.00/Mtok, every team member can ask 50 questions a day without budget approval. The catch: this model lacks public benchmarks, so you're flying blind on accuracy for nuanced legal interpretation. If the manual is straightforward procedural text and you can afford a 5-10% error rate that humans catch, this works. If wrong answers create liability exposure, pay for a benchmarked model with proven MMLU scores above 80.

Video content moderation

How multimodal input at zero cost changes moderation economics

A 20-person online learning platform reviews 300 user-uploaded tutorial videos per week for policy violations. Gemma 4 31B's video modality support means you can pipe raw MP4 files into the model without transcription middleware, and the $0.00 pricing makes high-volume scanning viable even at low revenue per student. The 262k context handles 15-minute videos in full. The risk is false negatives—without public safety benchmarks, you don't know this model's precision on edge cases like subtle hate speech or copyright infringement. Use it as a first-pass filter that flags 80% of obvious violations, then route the borderline 20% to human review or a specialized moderation API with published accuracy metrics.

Frequently asked

Is Gemma 4 31B good for general text generation tasks?

Yes, with 31 billion parameters and a 262k token context window, Gemma 4 31B handles most text tasks well — drafting, summarization, Q&A. It's free, so there's no cost barrier for experimentation. Expect solid performance on everyday prompts, though you'll hit limits on highly specialized domains without fine-tuning.

How does Gemma 4 31B pricing compare to GPT-4o or Claude?

Gemma 4 31B is completely free ($0.00 per million tokens in and out), while GPT-4o costs $2.50/$10.00 and Claude Sonnet runs $3.00/$15.00 per Mtok. For prototyping or high-volume use cases where budget matters more than cutting-edge reasoning, Gemma 4 31B wins on economics alone.

Can Gemma 4 31B process images and video effectively?

Gemma 4 31B supports image, text, and video modalities, which is rare for a free model. Expect basic visual understanding — object recognition, scene description — but don't rely on it for nuanced image analysis or frame-by-frame video reasoning that frontier models handle. Test your specific use case early.

How does Gemma 4 31B compare to the previous Gemma 2 models?

Gemma 4 31B adds multimodal support (images and video) that earlier Gemma 2 versions lacked, and the 262k context window is a major upgrade from Gemma 2's 8k limit. The 31B parameter count sits between Gemma 2's 9B and 27B variants, offering a middle ground for capability and speed.

Should I use Gemma 4 31B for production chatbots?

If you need a free, self-hosted option with decent conversational ability and a large context window, yes. Gemma 4 31B works for internal tools or MVPs where zero API costs matter. For customer-facing bots requiring high accuracy and safety guardrails, consider paid alternatives with better support and reliability SLAs.