
NVIDIA: Nemotron 3 Ultra (free)

NVIDIA Nemotron 3 Ultra is an open frontier-reasoning and orchestration model from NVIDIA, with 55B active parameters out of 550B total (MoE). Built on a hybrid Transformer-Mamba mixture-of-experts architecture, it...

Anyone in the Space can @-mention NVIDIA: Nemotron 3 Ultra (free) with the team's shared context — pooled credits, one chat, one memory.



Verdict

Nemotron 3 Ultra offers a 1M-token context window at zero token cost, making it a strong choice for budget-constrained teams processing long documents or large codebases. With no per-token price, experimentation and high-volume tasks carry no token bill. Trade-off: no public benchmark scores are available yet, so its quality relative to GPT-4o or Claude is unverified; expect to validate outputs more carefully. Reach for this when context length and cost matter more than proven accuracy.

Best for

Long-document analysis on tight budgets
Processing large codebases that fit within the 1M-token window
High-volume experimentation with zero cost
Multi-turn conversations requiring deep context
Prototyping before committing to paid models

Strengths

The 1M-token context window can take entire repositories, legal briefs, or multi-chapter manuscripts in a single pass, provided they fit within the limit. Zero pricing removes the usual calculus around token optimization: you can supply full context without watching the meter. NVIDIA's infrastructure backing may translate into reliable uptime and throughput, though no latency, throughput, or uptime figures are published yet. If those hold up, the model suits teams running batch jobs or continuous workflows where cost predictability matters more than bleeding-edge reasoning.

Trade-offs

No public benchmarks means you can't compare reasoning quality, instruction-following, or factual accuracy against established models like GPT-4o or Claude Sonnet. Early NVIDIA LLMs have lagged behind Anthropic and OpenAI on complex reasoning tasks in third-party evals. The free tier likely comes with rate limits or throttling during peak demand, though specifics aren't published. You'll need to run your own validation suite before trusting it for production use cases where accuracy is non-negotiable.
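
A validation suite of the kind mentioned above can start very small. The sketch below is one possible shape, not a recommended methodology: `ask_model` is a hypothetical callable standing in for however your team reaches the model, `fake_model` is a stub so the example runs offline, and substring matching is a deliberately crude pass criterion.

```python
from typing import Callable


def run_validation(ask_model: Callable[[str], str],
                   cases: list[tuple[str, str]]) -> float:
    """Return the fraction of cases whose answer contains the expected text.

    ask_model is a hypothetical stand-in for a real model call.
    Matching is case-insensitive substring containment, nothing smarter.
    """
    if not cases:
        raise ValueError("at least one validation case is required")
    passed = 0
    for prompt, expected in cases:
        answer = ask_model(prompt)
        if expected.lower() in answer.lower():
            passed += 1
    return passed / len(cases)


def fake_model(prompt: str) -> str:
    """Offline stub (assumption): answers one question right, the rest wrong."""
    if "capital of France" in prompt:
        return "The capital of France is Paris."
    return "I am not sure."


cases = [
    ("What is the capital of France?", "Paris"),
    ("What is 12 * 12?", "144"),
]
accuracy = run_validation(fake_model, cases)
print(f"pass rate: {accuracy:.0%}")
```

Swapping `fake_model` for a real client and growing `cases` from your own production prompts gives a pass rate you can track before relying on the model where accuracy is non-negotiable.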

Specifications

Provider: nvidia
Category: llm
Context length: 1,000,000 tokens
Max output: 65,536 tokens
Modalities: text
License: proprietary
Released: 2026-06-04
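
To make the two limits above concrete, here is a rough sketch of a pre-flight check for whether a set of documents is likely to fit. It rests on two assumptions that the listing does not confirm: that output tokens count against the same 1,000,000-token window, and that about four characters equal one token (a common rule of thumb, not this model's tokenizer). The function names are illustrative.

```python
CONTEXT_LIMIT = 1_000_000   # from the specifications above
MAX_OUTPUT = 65_536         # from the specifications above
CHARS_PER_TOKEN = 4         # assumption: rough heuristic, not the real tokenizer


def estimate_tokens(text: str) -> int:
    """Estimate the token count by ceiling-dividing the character count."""
    return -(-len(text) // CHARS_PER_TOKEN)


def fits_in_context(documents: list[str],
                    reserved_output: int = MAX_OUTPUT) -> tuple[bool, int, int]:
    """Return (fits, estimated_prompt_tokens, prompt_budget).

    Assumes the reply shares the context window with the prompt, so the
    prompt budget is the context limit minus the reserved output tokens.
    """
    prompt_tokens = sum(estimate_tokens(doc) for doc in documents)
    budget = CONTEXT_LIMIT - reserved_output
    return prompt_tokens <= budget, prompt_tokens, budget


docs = ["x" * 400, "y" * 1_000]
ok, used, budget = fits_in_context(docs)
print(ok, used, budget)
```

Under these assumptions the prompt budget works out to 934,464 tokens when the full 65,536-token output is reserved. A real tokenizer count should replace the heuristic before relying on the result.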

Pricing

Input: $0.00/Mtok
Output: $0.00/Mtok
Model ID: nvidia/nemotron-3-ultra-550b-a55b:free

Per-token prices show what the model costs upstream. On Switchy your team draws from one shared org credit pool — one plan, one balance for everyone.

Team cost calculator

Seats: 5 people
Messages / seat / day: 80
Avg turn size: 2 ktok
Output share: 30 %

Estimated monthly spend

Free (no token cost)

17.6M tokens / month
5 seats · 80 msgs/day

Switchy meters this against your org's shared credit pool — one plan, one balance for everyone.
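
The calculator's arithmetic can be reproduced with a short sketch. The 22 working days per month is an inference, not a documented setting: it is the value that makes 5 seats, 80 messages a day, and 2,000 tokens per turn come out at the 17.6M tokens shown. Function and parameter names are illustrative, and Switchy's own credit metering may differ.

```python
def monthly_tokens(seats: int, msgs_per_seat_per_day: int,
                   avg_turn_tokens: int, working_days: int = 22) -> int:
    """Total tokens per month; working_days=22 is an inferred assumption."""
    return seats * msgs_per_seat_per_day * avg_turn_tokens * working_days


def monthly_cost(total_tokens: int, output_share_pct: int,
                 input_price_per_mtok: float,
                 output_price_per_mtok: float) -> float:
    """Upstream token cost in dollars, splitting tokens by output share."""
    output_tokens = total_tokens * output_share_pct // 100
    input_tokens = total_tokens - output_tokens
    return (input_tokens * input_price_per_mtok
            + output_tokens * output_price_per_mtok) / 1_000_000


total = monthly_tokens(seats=5, msgs_per_seat_per_day=80, avg_turn_tokens=2_000)
cost = monthly_cost(total, output_share_pct=30,
                    input_price_per_mtok=0.00, output_price_per_mtok=0.00)
print(total, cost)  # 17600000 0.0
```

With both prices at $0.00/Mtok the cost is zero regardless of volume; the same functions show how the bill would scale if a paid model's prices were substituted.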

Providers

Provider	Context	Input	Output	P50 latency	Throughput	30d uptime
nvidia	1000k	$0.00/Mtok	$0.00/Mtok	—	—	—

Performance

Performance snapshots are collected daily. Check back after the next ingestion run.

Benchmarks

Public benchmark scores are not available yet for this model. Check back after the next ingestion run.

Works well with

Top MCPs

Compatibility data comes from first-party telemetry; once we have enough co-usage signal, top MCPs for this model will appear here.

How Switchy teams use it

Not enough Spaces have used this model yet to share anonymised team stats. We wait for at least 50 distinct Spaces per week before publishing any aggregate.

Starter prompts

Summarize Legal Brief

Read this entire legal document and provide a structured summary: (1) parties and effective dates, (2) core obligations for each party, (3) termination clauses, (4) liability caps and indemnification terms. Highlight any unusual or high-risk provisions.


Analyze Codebase Architecture

You're reviewing this entire codebase. Identify the main architectural patterns (MVC, microservices, etc.), list the core modules and their responsibilities, and flag any circular dependencies or code smells you notice.


Extract Research Themes

I'm pasting five research papers below. Read all of them and identify the top three recurring themes or methodologies. For each theme, cite which papers discuss it and summarize the consensus or disagreement among authors.


Draft Multi-Chapter Outline

Using the research notes and chapter ideas I've provided, draft a detailed outline for a non-fiction book. Include chapter titles, 3-4 subheadings per chapter, and a one-sentence summary of each section's argument.


Compare Policy Documents

Compare these two versions of our company policy document. List every substantive change—additions, deletions, and modifications—organized by section. Flag any changes that introduce new compliance requirements or alter employee obligations.
