MoonshotAI: Kimi VL A3B Thinking

by moonshotai

Context 131K tokens
Modalities Text, Image → Text
Input Price $0.06 / million tokens
Output Price $0.25 / million tokens

Overview

Kimi-VL is a lightweight Mixture-of-Experts vision-language model that activates only 2.8B parameters per step while delivering strong performance on multimodal reasoning and long-context tasks. The Kimi-VL-A3B-Thinking variant, fine-tuned with chain-of-thought and reinforcement learning, excels on math and visual reasoning benchmarks such as MathVision, MMMU, and MathVista, rivaling much larger models such as Qwen2.5-VL-7B and Gemma-3-12B. It supports a 128K-token context window (131,072 tokens, listed on this page as 131K) and high-resolution image input via its MoonViT encoder.

Key Features

  • Multimodal capabilities (Text, Image → Text)
  • 131K tokens context window
  • API access available

Model Information

Developer: moonshotai
Release Date: April 10, 2025
Context Window: 131K tokens
Modalities: Text, Image → Text

Pricing

Input Tokens $0.06 / million tokens
Output Tokens $0.25 / million tokens
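
To make the per-million-token prices concrete, here is a minimal sketch of a per-request cost estimate. The two price constants come from the listing above; the helper name `estimate_cost` and the sample token counts are illustrative assumptions, not part of any official SDK, and actual billing may differ (for example, in how image inputs are counted as tokens).

```python
from decimal import Decimal

# Listed prices in USD per million tokens (taken from the Pricing section).
INPUT_PRICE_PER_M = Decimal("0.06")
OUTPUT_PRICE_PER_M = Decimal("0.25")
TOKENS_PER_MILLION = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the USD cost of one request (hypothetical helper)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (Decimal(input_tokens) * INPUT_PRICE_PER_M
             + Decimal(output_tokens) * OUTPUT_PRICE_PER_M)
    return total / TOKENS_PER_MILLION


if __name__ == "__main__":
    # Assumed example: a 100,000-token prompt with a 2,000-token reply.
    print(f"${estimate_cost(100_000, 2_000)}")
```

Decimal arithmetic is used here so that small per-request amounts are not distorted by binary floating-point rounding.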