DeepSeek: DeepSeek R1 0528 Qwen3 8B

by deepseek

Context 131K tokens
Modalities Text
Input Price $0.01 / million tokens
Output Price $0.05 / million tokens

Overview

DeepSeek-R1-0528 is a minor-version upgrade of DeepSeek R1 that uses more compute and improved post-training to bring its reasoning performance close to that of flagship models such as o3 and Gemini 2.5 Pro. It now tops math, programming, and logic leaderboards, reflecting a marked increase in reasoning depth. The distilled variant, DeepSeek-R1-0528-Qwen3-8B, transfers this chain-of-thought behavior into an 8B-parameter model; on AIME 2024 it beats the standard Qwen3 8B by 10 percentage points and matches the 235B-parameter “thinking” model.

Key Features

  • 131K tokens context window
  • API access available

Model Information

Developer: deepseek
Release Date: May 29, 2025
Context Window: 131K tokens
Modalities: Text

Pricing

Input Tokens $0.01 / million tokens
Output Tokens $0.05 / million tokens
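
As a rough illustration of how these per-token prices translate into a per-request cost, here is a minimal Python sketch. It assumes billing is linear in token counts with no minimum charge or rounding, which the listing does not confirm; the function name `estimate_cost_usd` and the example token counts are hypothetical.

```python
# Prices copied from the listing above, in USD per million tokens.
INPUT_PRICE_PER_MILLION = 0.01
OUTPUT_PRICE_PER_MILLION = 0.05


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request (hypothetical helper).

    Assumes cost scales linearly with token counts; actual billing
    rules (rounding, minimums) are not stated in the listing.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (input_tokens * INPUT_PRICE_PER_MILLION
             + output_tokens * OUTPUT_PRICE_PER_MILLION)
    return total / 1_000_000


if __name__ == "__main__":
    # Example: a 100,000-token prompt with a 20,000-token response.
    cost = estimate_cost_usd(100_000, 20_000)
    print(f"Estimated cost: ${cost:.4f}")
```

Because output tokens cost five times as much as input tokens, long reasoning traces are likely to dominate the bill for this kind of model.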
