Nemotron 3 Ultra

● Excellent

by NVIDIA · Tag: Intelligent · Rank #21 of 48

8.5 / 10

NVIDIA Nemotron 3 Ultra is an open frontier-reasoning model with 55B active parameters (550B total), built on a hybrid Transformer-Mamba mixture-of-experts (MoE) architecture. It supports a 1M-token context window for agentic workflows, including orchestration, coding, and deep research. It combines strong multi-step reasoning with high-throughput inference for agent pipelines.

Specifications

Specifications for Nemotron 3 Ultra

Attribute                       Value
Lab                             NVIDIA
Tags                            Intelligent
Overall Score                   8.5/10
Release Date                    2026-06
Context Window                  1,000,000 tokens
Input Price (per 1M tokens)     $0.50
Output Price (per 1M tokens)    $2.50
Input Modalities                Text
Output Modalities               Text
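
To make the per-token pricing concrete, the sketch below estimates the cost of a single request from the input and output prices listed in the specifications. The function name `estimate_cost` and the example token counts are hypothetical and chosen only for illustration. Actual billing may differ (for example through caching discounts or provider-specific rates, neither of which is listed here).

```python
# Prices copied from the specification table (USD per 1M tokens).
INPUT_PRICE_PER_M = 0.50
OUTPUT_PRICE_PER_M = 2.50
TOKENS_PER_M = 1_000_000


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Hypothetical helper: estimated USD cost of one request.

    Assumes simple linear pricing with no discounts or minimum charges.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = input_tokens / TOKENS_PER_M * INPUT_PRICE_PER_M
    output_cost = output_tokens / TOKENS_PER_M * OUTPUT_PRICE_PER_M
    return input_cost + output_cost


if __name__ == "__main__":
    # Illustrative request: a 200k-token prompt with a 4k-token answer.
    print(f"${estimate_cost(200_000, 4_000):.2f}")
```

Under these assumptions, output tokens cost five times as much as input tokens, so long-context requests with short answers are dominated by the input side.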

Strengths

  • 55B active / 550B total MoE with hybrid Transformer-Mamba architecture
  • 1M-token context window for long-horizon reasoning and agent orchestration
  • Open-weight release on HuggingFace (BF16 weights)
  • High-throughput inference optimized for production agent pipelines
  • Strong multi-step reasoning and planning capabilities
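
As a rough illustration of what the parameter counts above imply, the following sketch computes the share of parameters active per token and the raw storage needed for the BF16 weights. It assumes the rounded counts stated here (55B active, 550B total) and 2 bytes per BF16 parameter. It ignores the KV cache, Mamba state, activations, and runtime overhead, so real deployment memory will be higher. The helper `weight_bytes` is hypothetical.

```python
# Rounded parameter counts as stated in the listing.
TOTAL_PARAMS = 550_000_000_000
ACTIVE_PARAMS = 55_000_000_000
BYTES_PER_BF16_PARAM = 2  # BF16 stores each value in 16 bits


def weight_bytes(n_params: int, bytes_per_param: int = BYTES_PER_BF16_PARAM) -> int:
    """Hypothetical helper: raw bytes needed to store n_params weights."""
    return n_params * bytes_per_param


# Fraction of the model's parameters used for each token (MoE routing).
active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS

# Storage for the full weight set, in decimal terabytes.
total_weight_tb = weight_bytes(TOTAL_PARAMS) / 1e12

if __name__ == "__main__":
    print(f"active fraction: {active_fraction:.0%}")
    print(f"BF16 weights: {total_weight_tb:.1f} TB")
```

With these numbers, about 10% of the parameters are active per token, while the full BF16 weight set occupies roughly 1.1 TB.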

Weaknesses

  • Text-only — no vision, image, or multimodal input support
  • Newer entrant with a less mature ecosystem than models from more established labs

Best For

  • Agent orchestration and multi-agent pipelines
  • Long-horizon reasoning and deep research tasks
  • High-volume enterprise agent deployments
  • Complex coding with extended context windows

Scores are aggregated from public benchmarks (MMLU, HumanEval, MATH, GSM8K, LMSYS Chatbot Arena) and normalized to a 1–10 scale.