
NVIDIA: Llama 3.3 Nemotron Super 49B V1.5

Llama-3.3-Nemotron-Super-49B-v1.5 is a 49B-parameter, English-centric reasoning and chat model with a 128K context window, derived from Meta’s Llama-3.3-70B-Instruct. It is post-trained for agentic workflows (RAG, tool calling) via SFT across math, code, science, and...


Specifications

Provider: nvidia
Category: llm
Context length: 131,072 tokens
Max output: —
Modalities: text
License: proprietary
Released: 2025-10-10

Pricing

Input: $0.10/Mtok
Output: $0.40/Mtok
Model ID: nvidia/llama-3.3-nemotron-super-49b-v1.5

Team cost calculator

Seats: 5 people
Messages / seat / day: 80
Avg turn size: 2 ktok
Output share: 30 %

Estimated monthly spend

$3.34

17.6M tokens / month
5 seats · 80 msgs/day
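
The estimate above can be reproduced with a few lines of arithmetic. This is a minimal sketch, not the calculator's actual implementation. It assumes 1 ktok means 1,000 tokens and that a month counts 22 billable days; the listing states neither, but 22 is the value that makes 5 seats × 80 messages × 2,000 tokens come out to 17.6M tokens. The function and constant names are hypothetical.

```python
# Prices per million tokens, copied from the Pricing section of the listing.
INPUT_PRICE_PER_MTOK = 0.10
OUTPUT_PRICE_PER_MTOK = 0.40

# Assumption: 22 billable days per month, inferred from
# 17.6M tokens / (5 seats * 80 msgs * 2,000 tokens). Not stated in the listing.
ASSUMED_DAYS_PER_MONTH = 22


def monthly_tokens(seats, msgs_per_seat_per_day, avg_turn_tokens,
                   days=ASSUMED_DAYS_PER_MONTH):
    """Total tokens per month across all seats."""
    return seats * msgs_per_seat_per_day * avg_turn_tokens * days


def monthly_spend(total_tokens, output_share):
    """Split tokens into input and output shares and price each side."""
    output_tokens = total_tokens * output_share
    input_tokens = total_tokens - output_tokens
    cost = (input_tokens * INPUT_PRICE_PER_MTOK
            + output_tokens * OUTPUT_PRICE_PER_MTOK)
    return cost / 1_000_000


# Calculator inputs from the listing; 2 ktok is assumed to mean 2,000 tokens.
tokens = monthly_tokens(seats=5, msgs_per_seat_per_day=80, avg_turn_tokens=2_000)
spend = monthly_spend(tokens, output_share=0.30)

print(f"{tokens / 1e6:.1f}M tokens / month")
print(f"${spend:.2f}")
```

Under these assumptions, 70 % of the tokens are billed at the input rate and 30 % at the output rate, which gives the $3.34 shown above.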

Providers

Provider	Context	Input	Output	P50 latency	Throughput	30d uptime
nvidia	131k	$0.10/Mtok	$0.40/Mtok	—	—	—

Performance

Performance snapshots are collected daily. Check back after the next ingestion run.

Benchmarks

Public benchmark scores are not available yet for this model. Check back after the next ingestion run.

Works well with

Top MCPs

Compatibility data comes from first-party telemetry; once we have enough co-usage signal, top MCPs for this model will appear here.

How Switchy teams use it

Not enough Spaces have used this model yet to share anonymised team stats. We wait for at least 50 distinct Spaces per week before publishing any aggregate.

Starter prompts

Starter prompts for this model will land here soon.