Name: DeepSeek: DeepSeek V3.1 Base
Brand: deepseek
Price: 2.5e-7 USD per input token

DeepSeek: DeepSeek V3.1 Base

by deepseek

Context 164K tokens

Modalities Text

Input Price $0.25 / million tokens

Output Price $1.00 / million tokens

Overview

This is a base model, trained only for raw next-token prediction. Unlike instruct/chat models, it has not been fine-tuned to follow user instructions. Prompts therefore work best when written like training text or worked examples that the model can continue, rather than as bare requests (e.g., a prompt that opens with “Translate the following sentence…” and sets up the answer, instead of just “Translate this”).

DeepSeek-V3.1 Base is a 671B-parameter open Mixture-of-Experts (MoE) language model with 37B parameters active per forward pass and a context length of 128K tokens (this listing offers a 164K-token context window). It was trained on 14.8T tokens using FP8 mixed precision, which the developer credits for efficient and stable training, and it performs strongly across language, reasoning, math, and coding tasks.
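To illustrate the prompting style described in the overview, here is a minimal sketch of a few-shot prompt laid out as text for the model to continue. The helper name `build_few_shot_prompt`, the `English:`/`French:` labels, and the example sentences are assumptions made for illustration; they are not part of the listing or of any DeepSeek API, and the block only builds a string without calling a model.

```python
def build_few_shot_prompt(examples, query, src="English", tgt="French"):
    """Format translation pairs as a document the model can continue.

    Hypothetical helper: the labels and layout are illustrative choices.
    """
    lines = []
    for source_text, target_text in examples:
        lines.append(f"{src}: {source_text}")
        lines.append(f"{tgt}: {target_text}")
        lines.append("")  # blank line separates one example from the next
    # Leave the last target line open so the next tokens are the translation.
    lines.append(f"{src}: {query}")
    lines.append(f"{tgt}:")
    return "\n".join(lines)


examples = [
    ("Good morning.", "Bonjour."),
    ("Thank you very much.", "Merci beaucoup."),
]
prompt = build_few_shot_prompt(examples, "Where is the station?")
print(prompt)
```

Because a base model keeps predicting tokens until it is stopped, a prompt like this is usually paired with a stop sequence (for example, a newline) or a small output-token limit on whichever completion endpoint is used.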

Key Features

164K tokens context window
API access available

Model Information

Developer: deepseek

Release Date: August 20, 2025

Context Window: 164K tokens

Modalities: Text

Pricing

Input Tokens $0.25 / million tokens

Output Tokens $1.00 / million tokens
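As a rough guide to what these rates mean per request, the sketch below estimates cost from token counts. The two rates come from the listing; the function name `estimate_cost_usd` and the sample token counts are assumptions for illustration, and actual billing may differ (for example through rounding or provider-specific fees).

```python
INPUT_USD_PER_MILLION = 0.25   # listed input price
OUTPUT_USD_PER_MILLION = 1.00  # listed output price


def estimate_cost_usd(input_tokens, output_tokens):
    """Estimate the cost of one request in USD (hypothetical helper)."""
    total = (input_tokens * INPUT_USD_PER_MILLION
             + output_tokens * OUTPUT_USD_PER_MILLION)
    return total / 1_000_000


# Assumed example: a 100,000-token prompt with a 2,000-token completion.
cost = estimate_cost_usd(100_000, 2_000)
print(f"${cost:.4f}")
```

At these rates each output token costs four times as much as an input token, so long completions dominate the bill even when prompts are large.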
