LLMnvidia

NVIDIA: Llama 3.3 Nemotron Super 49B V1.5

Llama-3.3-Nemotron-Super-49B-v1.5 is a 49B-parameter, English-centric reasoning/chat model derived from Meta’s Llama-3.3-70B-Instruct with a 128K context. It’s post-trained for agentic workflows (RAG, tool calling) via SFT across math, code, science, and...

Anyone in the Space can @-mention NVIDIA: Llama 3.3 Nemotron Super 49B V1.5 with the team's shared context — pooled credits, one chat, one memory.

All models

Starter is free forever — 1 Space, 100 credits/month, 1 MCP. No card.

Specifications

Provider
nvidia
Category
llm
Context length
131,072 tokens
Max output
16,384 tokens
Modalities
text
License
proprietary
Released
2025-10-10

Pricing

Input
$0.10/Mtok
Output
$0.40/Mtok
Model ID
nvidia/llama-3.3-nemotron-super-49b-v1.5

Per-token prices show what the model costs upstream. On Switchy your team draws from one shared org credit pool — one plan, one balance for everyone.

Team cost calculator

Estimated monthly spend
$3.34
17.6M tokens / month
5 seats · 80 msgs/day

Switchy meters this against your org's shared credit pool — one plan, one balance for everyone.

Providers

ProviderContextInputOutputP50 latencyThroughput30d uptime
nvidia131k$0.10/Mtok$0.40/Mtok

Performance

Performance snapshots are collected daily. Check back after the next ingestion run.

Benchmarks

Public benchmark scores are not available yet for this model. Check back after the next ingestion run.

Works well with

Top MCPs

Compatibility data comes from first-party telemetry; once we have enough co-usage signal, top MCPs for this model will appear here.

How Switchy teams use it

Not enough Spaces have used this model yet to share anonymised team stats. We wait for at least 50 distinct Spaces per week before publishing any aggregate.

Starter prompts

Starter prompts for this model will land here soon.
Data last verified 14 hours ago.Sources aggregated hourly to weekly. See docs/architecture/model-directory.md.