LongCat-2.0: China's 1.6T-Parameter Coding Model Trained Without a Single Nvidia GPU
The Nvidia Moat Just Got a Hairline Crack
On June 30, 2026, Meituan open-sourced LongCat-2.0, a 1.6-trillion-parameter Mixture-of-Experts model that scored 59.5 on SWE-bench Pro — narrowly beating GPT-5.5's 58.6. The headline number is impressive, but it's not the real story.
LongCat-2.0 was trained entirely on a cluster of more than 50,000 domestic Chinese ASICs. No Nvidia GPUs. No H100s, B200s, or any other restricted hardware. Meituan claims this is the "industry's first trillion-parameter model to complete full-process training and inference on a 50,000-card domestic computing power cluster" (source: VentureBeat). Previous Chinese frontier models like DeepSeek V4-pro used domestic chips only for the cheaper inference step. LongCat-2.0 did everything — pre-training, fine-tuning, inference — on Chinese silicon.
This release lands amid escalating US export controls and a growing narrative that Chinese AI is permanently bottlenecked by hardware access. LongCat-2.0 directly challenges that assumption.
Who Was Owl Alpha?
Before Meituan claimed it, the model appeared anonymously on OpenRouter as "Owl Alpha" on April 28, 2026 (source: OpenRouter). For two months, it ran unidentified, processing roughly 10.1 trillion monthly tokens — 559 billion per day — with 242% month-over-month growth. It entered OpenRouter's global top three by call volume. It ranked #1 on Hermes Agent workspace, #2 on Claude Code deployments, and #3 in OpenClaw environments (source: VentureBeat).
The stealth launch doubled as a stress test: the model handled production-scale traffic for two months before Meituan attached its name to it.
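As a rough sanity check on the traffic figures above, the arithmetic can be sketched in a few lines of Python. The 30-day month and the reading of "242% growth" as a 3.42x multiplier are assumptions, not details from the source. Under those assumptions the monthly and daily figures do not describe the same average, so the daily number may be a peak or a more recent reading; the source does not say how either was measured.

```python
# Figures as reported in the article; the 30-day month is an assumption.
MONTHLY_TOKENS = 10.1e12   # "roughly 10.1 trillion monthly tokens"
DAILY_TOKENS = 559e9       # "559 billion per day"
DAYS_PER_MONTH = 30        # assumed month length


def average_daily(monthly_tokens: float, days: int = DAYS_PER_MONTH) -> float:
    """Average tokens per day implied by a monthly total."""
    return monthly_tokens / days


def monthly_at_rate(daily_tokens: float, days: int = DAYS_PER_MONTH) -> float:
    """Monthly total implied by sustaining a daily rate for a full month."""
    return daily_tokens * days


def growth_multiplier(percent_growth: float) -> float:
    """Assumes '+242% growth' means volume is 3.42x the prior month."""
    return 1 + percent_growth / 100


if __name__ == "__main__":
    print(f"Average daily from monthly total: {average_daily(MONTHLY_TOKENS) / 1e9:.0f}B")
    print(f"Monthly total at the daily rate:  {monthly_at_rate(DAILY_TOKENS) / 1e12:.2f}T")
    print(f"Month-over-month multiplier:      {growth_multiplier(242):.2f}x")
```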
Model Specs and Hardware
- Architecture: Mixture-of-Experts (MoE)
- Total parameters: 1.6 trillion
- Active parameters per token: ~33–56 billion
- Context window: 1 million tokens native (via LongCat Sparse Attention)
- License: MIT (fully open, commercially permissive)
- Pretraining data: More than 35 trillion tokens, with "no rollbacks or irrecoverable loss spikes" across millions of accelerator-hours (source: GitHub)
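To put the specs above in proportion, here is a minimal sketch of the sparsity arithmetic. It uses only the figures listed; treating "more than 35 trillion" as exactly 35 trillion is an assumption that makes the tokens-per-parameter ratio a lower bound.

```python
# Figures from the spec list; PRETRAIN_TOKENS is a lower bound ("more than 35 trillion").
TOTAL_PARAMS = 1.6e12
ACTIVE_PARAMS_LOW = 33e9
ACTIVE_PARAMS_HIGH = 56e9
PRETRAIN_TOKENS = 35e12


def active_fraction(active_params: float, total_params: float = TOTAL_PARAMS) -> float:
    """Share of the model's parameters used for a single token."""
    return active_params / total_params


def tokens_per_param(tokens: float, params: float) -> float:
    """Pretraining tokens seen per parameter."""
    return tokens / params


if __name__ == "__main__":
    low = active_fraction(ACTIVE_PARAMS_LOW)
    high = active_fraction(ACTIVE_PARAMS_HIGH)
    print(f"Active share per token: {low:.2%} to {high:.2%}")
    print(f"Tokens per total parameter: {tokens_per_param(PRETRAIN_TOKENS, TOTAL_PARAMS):.1f}")
```

In other words, each token touches only about 2% to 3.5% of the weights, which is what lets a 1.6-trillion-parameter MoE run at a per-token compute cost closer to that of a model in the tens of billions of parameters.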
The chip supplier was not explicitly named, but Meituan confirmed using the Huawei Collective Communication Library (HCCL) for chip-to-chip communication, pointing strongly to the Huawei Ascend ecosystem (source: VentureBeat). Meituan's AI research team began exploring domestic chips in 2023.
Benchmark Performance: Real Gains, Real Gaps
All benchmark figures below are Meituan's self-reported results from the official benchmarks page. Independent third-party verification had not yet emerged as of this writing.
| Benchmark | Claimed Score | Context |
|---|---|---|
| SWE-bench Pro | 59.5 | Surpasses GPT-5.5 (58.6); beats Gemini 3.1 Pro and Claude Opus 4.6 on deep software engineering |
| SWE-bench Multilingual | 77.3 | Multilingual coding capability |
| Terminal-Bench 2.1 | 70.8 | Real terminal interaction and error recovery |
| FORTE | 73.2 | General corporate workflow simulation (trails Claude Opus 4.8 per VentureBeat) |
| BrowseComp | 79.9 | Complex browsing and retrieval |
| RWSearch | 78.8 | Search agent tasks |
The Terminal-Bench 2.1 score of 70.8 trails the 84.6 posted by Claude Opus 4.8, the current leader on the independent leaderboard (source: Artificial Analysis). LongCat-2.0 is near the frontier but not at the top: it competes, but it does not dominate.
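The size of these margins is easy to lose in a table, so here is a small sketch that computes them from the scores quoted in this section. The `Comparison` class is a hypothetical helper written for illustration. Note that the Terminal-Bench row compares a self-reported score against an independent leaderboard entry, so the two numbers may not come from identical evaluation setups.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Comparison:
    """One head-to-head score pair quoted in the article (hypothetical helper)."""
    benchmark: str
    longcat: float
    rival: str
    rival_score: float

    @property
    def gap(self) -> float:
        """LongCat-2.0 score minus the rival's score, in points."""
        return round(self.longcat - self.rival_score, 1)


COMPARISONS = [
    Comparison("SWE-bench Pro", 59.5, "GPT-5.5", 58.6),
    Comparison("Terminal-Bench 2.1", 70.8, "Claude Opus 4.8", 84.6),
]

if __name__ == "__main__":
    for c in COMPARISONS:
        print(f"{c.benchmark}: {c.gap:+.1f} points vs {c.rival}")
```

The lead on SWE-bench Pro is under one point, while the deficit on Terminal-Bench 2.1 is nearly fourteen.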
What This Means for the Supply Chain
LongCat-2.0 is a proof point: frontier-scale open-weight coding models can be built without Nvidia hardware. US export controls on advanced chips were designed to prevent exactly this outcome. LongCat-2.0 suggests they are not working as intended, or at least not quickly enough.
Meituan's achievement does not mean domestic Chinese chips are better or cheaper than Nvidia's. Training took more than 50,000 ASICs, which suggests lower per-chip performance than H100s or B200s. The engineering cost, in hardware, power, and the orchestration of a 50,000-accelerator cluster, was likely higher than it would have been for a comparable Nvidia-based training run.
But it worked. That's the point. The bottleneck is no longer absolute, just relative.
Concrete Takeaway
LongCat-2.0 is an MIT-licensed, 1.6T-parameter MoE coding model that scores competitively on SWE-bench Pro, SWE-bench Multilingual, and Terminal-Bench 2.1, and was trained end-to-end on domestic Chinese ASICs. It is immediately accessible for self-hosting, fine-tuning, and commercial use. For developers and enterprises evaluating coding models, LongCat-2.0 offers a viable open-weight alternative to GPT-5.5 and Claude Opus 4.6, especially if API geography or hardware independence matters. For anyone watching the geopolitics of AI, it signals that the gap between Nvidia-backed and domestic-chip training runs can be closed. The question is how quickly — and at what scale.
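For anyone weighing the self-hosting option, a back-of-the-envelope estimate of weight memory is a useful first filter. The sketch below is built on assumptions: the precisions listed are common storage formats rather than confirmed release formats, and the 80 GB device size is a hypothetical value. It counts weights only and ignores KV cache, activations, and runtime overhead, so real requirements would be higher.

```python
import math

TOTAL_PARAMS = 1.6e12  # total parameter count from the article

# Assumed storage formats; the article does not say which ones are released.
BYTES_PER_PARAM = {"bf16": 2.0, "int8": 1.0, "int4": 0.5}

DEVICE_MEMORY_BYTES = 80e9  # hypothetical 80 GB accelerator


def weight_bytes(precision: str) -> float:
    """Bytes needed to hold all weights at the given precision."""
    return TOTAL_PARAMS * BYTES_PER_PARAM[precision]


def min_devices(precision: str, device_bytes: float = DEVICE_MEMORY_BYTES) -> int:
    """Lower bound on devices needed just to store the weights."""
    return math.ceil(weight_bytes(precision) / device_bytes)


if __name__ == "__main__":
    for precision in BYTES_PER_PARAM:
        terabytes = weight_bytes(precision) / 1e12
        print(f"{precision}: {terabytes:.1f} TB of weights, at least {min_devices(precision)} devices")
```

Because all 1.6 trillion parameters must be resident even though only a small share is active per token, the memory bill is set by the total parameter count, not the active one.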