NVIDIA: Llama 3.3 Nemotron Super 49B V1.5
Llama-3.3-Nemotron-Super-49B-v1.5 is a 49B-parameter, English-centric reasoning/chat model derived from Meta’s Llama-3.3-70B-Instruct with a 128K context. It’s post-trained for agentic workflows (RAG, tool calling) via SFT across math, code, science, and...
Specifications
- Provider: nvidia
- Category: llm
- Context length: 131,072 tokens
- Max output: —
- Modalities: text
- License: proprietary
- Released: 2025-10-10
Pricing
- Input: $0.10/Mtok
- Output: $0.40/Mtok
- Model ID: nvidia/llama-3.3-nemotron-super-49b-v1.5
Team cost calculator
Estimated monthly spend
$3.34
17.6M tokens / month
5 seats · 80 msgs/day
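The page does not say how the calculator arrives at its estimate, so the following is only a rough sketch of one way such a figure could be computed from the listed prices. The per-message token count, the number of working days, and the input/output split are all hypothetical values chosen for illustration; they happen to reproduce the numbers shown above, but that does not confirm they are what the calculator uses.

```python
# Prices come from the Pricing section; everything else is an assumption.
INPUT_PRICE_PER_MTOK = 0.10   # USD per million input tokens
OUTPUT_PRICE_PER_MTOK = 0.40  # USD per million output tokens


def monthly_tokens(seats, msgs_per_day, tokens_per_msg, working_days):
    """Total tokens per month for a team (tokens_per_msg and working_days are assumed)."""
    return seats * msgs_per_day * tokens_per_msg * working_days


def monthly_cost(total_tokens, input_share):
    """Blend input and output prices by the share of tokens that are input."""
    input_tokens = total_tokens * input_share
    output_tokens = total_tokens - input_tokens
    return (input_tokens * INPUT_PRICE_PER_MTOK
            + output_tokens * OUTPUT_PRICE_PER_MTOK) / 1_000_000


# Hypothetical values: 2,000 tokens per message, 22 working days, 70% input tokens.
tokens = monthly_tokens(seats=5, msgs_per_day=80, tokens_per_msg=2_000, working_days=22)
cost = monthly_cost(tokens, input_share=0.70)
print(f"{tokens / 1e6:.1f}M tokens / month, ${cost:.2f}")
```

Because output tokens cost four times as much as input tokens here, the estimate is most sensitive to `input_share`: a team whose usage is mostly long generations will land noticeably above the blended figure.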
Providers
| Provider | Context | Input | Output | P50 latency | Throughput | 30d uptime |
|---|---|---|---|---|---|---|
| nvidia | 131k | $0.10/Mtok | $0.40/Mtok | — | — | — |
Performance
Performance snapshots are collected daily. Check back after the next ingestion run.
Benchmarks
Works well with
Top MCPs
Compatibility data comes from first-party telemetry; once we have enough co-usage signal, top MCPs for this model will appear here.
How Switchy teams use it
Not enough Spaces have used this model yet to share anonymised team stats. We wait for at least 50 distinct Spaces per week before publishing any aggregate.