Nemotron 3 Ultra

Name: Nemotron 3 Ultra
Rating: 8.5 (24 reviews)
Author: NVIDIA

Excellent (8.5 / 10)

by NVIDIA, tagged Intelligent, ranked #21 of 48

NVIDIA Nemotron 3 Ultra is an open-weight frontier-reasoning model with 55B active parameters (550B total) that uses a hybrid Transformer-Mamba mixture-of-experts (MoE) architecture. It supports a 1M-token context window for agentic workflows, including orchestration, coding, and deep research, and it pairs strong multi-step reasoning with high-throughput inference for agent pipelines.

Specifications

Specifications for Nemotron 3 Ultra
Attribute	Value
Lab	NVIDIA
Tags	Intelligent
Overall Score	8.5/10
Release Date	2026-06
Context Window	1,000,000 tokens
Input Price (USD per 1M tokens)	$0.50
Output Price (USD per 1M tokens)	$2.50
Input Modalities	Text
Output Modalities	Text
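
The listed prices translate into a per-request cost by simple arithmetic. The following is a minimal sketch, assuming billing is strictly linear in token count at the listed rates (no caching discounts, tiers, or minimum charges, none of which this page describes). The function name `estimate_cost_usd` and the example token counts are hypothetical.

```python
# Listed prices from the specifications table, in USD per 1M tokens.
INPUT_PRICE_PER_M = 0.50
OUTPUT_PRICE_PER_M = 2.50


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request, assuming purely linear per-token billing."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical request: a 200,000-token prompt with a 4,000-token answer.
    print(f"${estimate_cost_usd(200_000, 4_000):.2f}")
```

Under these assumptions, output tokens cost five times as much as input tokens, so long-prompt, short-answer workloads are dominated by the input side only once the prompt is more than five times longer than the response.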

Strengths

55B active / 550B total MoE with hybrid Transformer-Mamba architecture
1M-token context window for long-horizon reasoning and agent orchestration
Open-weight release on HuggingFace (BF16 weights)
High-throughput inference optimized for production agent pipelines
Strong multi-step reasoning and planning capabilities
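
The parameter counts and BF16 weight format listed above imply some back-of-the-envelope figures. This is a rough sketch, assuming 2 bytes per parameter (the size of one BF16 value) and counting weights only; KV cache, activations, and runtime overhead are ignored, and the helper names are hypothetical.

```python
ACTIVE_PARAMS = 55e9    # active parameters, as listed on this page
TOTAL_PARAMS = 550e9    # total parameters, as listed on this page
BYTES_PER_BF16 = 2      # BF16 stores each value in 16 bits


def active_fraction(active: float, total: float) -> float:
    """Share of the total parameters that are active."""
    return active / total


def weight_size_tb(params: float, bytes_per_param: int = BYTES_PER_BF16) -> float:
    """Size of the weights alone, in decimal terabytes (1 TB = 1e12 bytes)."""
    return params * bytes_per_param / 1e12


if __name__ == "__main__":
    print(f"active fraction: {active_fraction(ACTIVE_PARAMS, TOTAL_PARAMS):.0%}")
    print(f"BF16 weights: {weight_size_tb(TOTAL_PARAMS):.1f} TB")
```

On these figures, about a tenth of the parameters are active, while the full BF16 checkpoint comes to roughly 1.1 TB of weights before any serving overhead.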

Weaknesses

Text-only: no vision, image, or other multimodal input support
Newer entrant with a less mature ecosystem than models from established labs

Best For

Agent orchestration and multi-agent pipelines
Long-horizon reasoning and deep research tasks
High-volume enterprise agent deployments
Complex coding with extended context windows

Scores are aggregated from public benchmarks (MMLU, HumanEval, MATH, GSM8K, LMSYS) and normalized to a 1–10 scale.
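
The page does not spell out how the normalization works, so the following is only one plausible reading: a minimal sketch that linearly rescales each benchmark result onto 1–10 and averages the results with equal weights. The function names, the equal weighting, the per-benchmark ranges, and the sample values are all assumptions for illustration, not the site's methodology or real results.

```python
def to_ten_point_scale(raw: float, lo: float, hi: float) -> float:
    """Linearly map raw in [lo, hi] onto [1, 10], clamping out-of-range values."""
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    clamped = min(max(raw, lo), hi)
    return 1.0 + 9.0 * (clamped - lo) / (hi - lo)


def aggregate(results: dict[str, tuple[float, float, float]]) -> float:
    """Equal-weight mean of scaled scores; each value is (raw, lo, hi)."""
    if not results:
        raise ValueError("at least one benchmark result is required")
    scaled = [to_ten_point_scale(raw, lo, hi) for raw, lo, hi in results.values()]
    return sum(scaled) / len(scaled)


if __name__ == "__main__":
    # Placeholder inputs only: an accuracy-style score on 0-100 and a
    # rating-style score on an assumed 1000-1500 range.
    placeholder = {
        "accuracy_style": (80.0, 0.0, 100.0),
        "rating_style": (1300.0, 1000.0, 1500.0),
    }
    print(round(aggregate(placeholder), 2))
```

Rescaling per benchmark before averaging matters because the listed sources do not share a scale: accuracy benchmarks report percentages, while arena-style leaderboards report ratings.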