# NVIDIA: Llama 3.1 Nemotron 70B Instruct

Provider: nvidia  
Category: llm  
Model ID: `nvidia/llama-3.1-nemotron-70b-instruct`

NVIDIA's Llama 3.1 Nemotron 70B is a language model designed to generate precise and useful responses. Built on the [Llama 3.1 70B](/models/meta-llama/llama-3.1-70b-instruct) architecture and trained with Reinforcement Learning from Human Feedback (RLHF), it excels...

## Specs

- Context length: 131072 tokens
- Max output: 16384 tokens
- Modalities: text
- Released: 2024-10-15
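
The two token limits above interact when sizing a request. The sketch below is a rough illustration only: it assumes that prompt and completion tokens share the 131072-token context window, which is a common convention but is not stated on this page. The helper name `max_completion_tokens` is hypothetical and not part of any provider API.

```python
# Limits copied from the Specs list above.
CONTEXT_LENGTH = 131072  # total context window, in tokens
MAX_OUTPUT = 16384       # maximum completion length, in tokens


def max_completion_tokens(prompt_tokens: int) -> int:
    """Return the largest completion budget available for a given prompt size.

    Assumption (not confirmed by this page): prompt and completion tokens
    are counted against the same context window.
    """
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    remaining = CONTEXT_LENGTH - prompt_tokens
    # The budget is capped by the output limit and can never go below zero.
    return max(0, min(MAX_OUTPUT, remaining))


if __name__ == "__main__":
    for prompt in (1_000, 120_000, 131_072):
        print(prompt, max_completion_tokens(prompt))
```

Under that assumption, short prompts are bounded by the 16384-token output limit, while prompts close to the context limit are bounded by whatever space remains in the window.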

## Pricing

- Input: $1.20 per million tokens
- Output: $1.20 per million tokens
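
As a minimal sketch of how these rates translate into a per-request cost, the snippet below applies the listed prices to token counts. The function name `estimate_cost_usd` is hypothetical, and the estimate ignores anything this page does not mention, such as taxes, minimum charges, or provider-specific rounding.

```python
# Rates copied from the Pricing list above (USD per one million tokens).
INPUT_USD_PER_MILLION = 1.20
OUTPUT_USD_PER_MILLION = 1.20


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request from its token counts."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (
        input_tokens * INPUT_USD_PER_MILLION
        + output_tokens * OUTPUT_USD_PER_MILLION
    )
    return total / 1_000_000


if __name__ == "__main__":
    # Example: a 100,000-token prompt with a full 16,384-token completion.
    print(f"${estimate_cost_usd(100_000, 16_384):.4f}")
```

Because input and output are priced identically here, the cost depends only on the total number of tokens processed, so the example request of 116,384 tokens comes to roughly $0.14.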

## Providers

- **nvidia** — ctx 131072, input $1.20/M, output $1.20/M

---
Last verified: 2026-04-23T23:46:29.618Z  
Canonical URL: https://switchy.build/models/llama-3-1-nemotron-70b-instruct