# LMRank

> LMRank is an AI model leaderboard with transparent scores out of 10 for major large language models, based on public benchmarks and creative generation tests.

Use this file to discover canonical LMRank resources for AI model rankings, model comparisons, benchmark methodology, pricing context, and model-specific score pages.

## Core Pages
- [LLM Leaderboard](https://lmrank.com/): Ranked AI model scores across public benchmarks.
- [AI Model Categories](https://lmrank.com/categories/): Curated picks — Best Overall, Best Coding, Best Cheap, Best Open-Weight, Best Long Context, Best Fast, Best Reasoning, Best Multimodal, Best Local, and Best Agentic.
- [Creative Benchmarks](https://lmrank.com/benchmarks/): Standardized HTML generation benchmark results.
- [Compare AI Models](https://lmrank.com/compare/): Side-by-side model comparisons.
- [Methodology](https://lmrank.com/about/): How LMRank calculates model scores.
- [Sitemap](https://lmrank.com/sitemap/): Human-readable index of public pages.
- [XML Sitemap](https://lmrank.com/sitemap.xml): Machine-readable sitemap.

## Top Model Pages
- [Claude Opus 4.8](https://lmrank.com/model/claude-opus-4-8/): 9.7/10 LMRank score; provider: Anthropic; category: General Purpose.
- [Claude Opus 4.7](https://lmrank.com/model/claude-opus-4-7/): 9.6/10 LMRank score; provider: Anthropic; category: General Purpose.
- [Claude Opus 4.5](https://lmrank.com/model/claude-opus-4-5/): 9.5/10 LMRank score; provider: Anthropic; category: General Purpose.
- [GPT-5.5](https://lmrank.com/model/gpt-5-5/): 9.4/10 LMRank score; provider: OpenAI; category: General Purpose.
- [GPT-5](https://lmrank.com/model/gpt-5/): 9.3/10 LMRank score; provider: OpenAI; category: General Purpose.
- [Gemini Ultra 2](https://lmrank.com/model/gemini-ultra-2/): 9.2/10 LMRank score; provider: Google DeepMind; category: General Purpose.
- [GPT-5 Turbo](https://lmrank.com/model/gpt-5-turbo/): 9.1/10 LMRank score; provider: OpenAI; category: General Purpose.
- [DeepSeek V4](https://lmrank.com/model/deepseek-v4/): 9.0/10 LMRank score; provider: DeepSeek; category: General Purpose.
- [Qwen3.7 Max](https://lmrank.com/model/qwen3-7-max/): 9.0/10 LMRank score; provider: Alibaba Cloud; category: General Purpose.
- [Qwen3.6 Max Preview](https://lmrank.com/model/qwen3-6-max-preview/): 8.9/10 LMRank score; provider: Alibaba Cloud; category: General Purpose.
- [Claude Sonnet 4](https://lmrank.com/model/claude-sonnet-4/): 8.8/10 LMRank score; provider: Anthropic; category: General Purpose.
- [Llama 4 405B](https://lmrank.com/model/llama-4-405b/): 8.7/10 LMRank score; provider: Meta; category: General Purpose.
- [GPT-5.5 Instant](https://lmrank.com/model/gpt-5-5-instant/): 8.6/10 LMRank score; provider: OpenAI; category: General Purpose.
- [DeepSeek R1](https://lmrank.com/model/deepseek-r1/): 8.5/10 LMRank score; provider: DeepSeek; category: Reasoning & Math.
- [Gemini Pro 2](https://lmrank.com/model/gemini-pro-2/): 8.5/10 LMRank score; provider: Google DeepMind; category: General Purpose.
- [Grok 4.3](https://lmrank.com/model/grok-4-3/): 8.5/10 LMRank score; provider: xAI; category: General Purpose.
- [MiniMax M3](https://lmrank.com/model/minimax-m3/): 8.5/10 LMRank score; provider: MiniMax; category: General Purpose.
- [Qwen3.6 35B A3B](https://lmrank.com/model/qwen3-6-35b-a3b/): 8.5/10 LMRank score; provider: Alibaba Cloud; category: Code Generation.
- [Gemini 3.5 Flash](https://lmrank.com/model/gemini-3-5-flash/): 8.4/10 LMRank score; provider: Google DeepMind; category: General Purpose.
- [Kimi K2.6](https://lmrank.com/model/kimi-k2-6/): 8.4/10 LMRank score; provider: Moonshot; category: Code Generation.

## Blog
- [AI Model Weekly Roundup — May 25–31, 2026](https://lmrank.com/blog/2026-06-01-roundup/): DeepSWE crowns GPT-5.5 atop a new coding leaderboard while flagging Claude Opus 4.8 for benchmark exploitation, GitHub Copilot slaps a 57x cost multiplier on GPT-5.5, and distillation drama engulfs the Opus 4.8 release.
- [MiniMax M3: The First Open-Weights Model to Combine Three Frontier Capabilities](https://lmrank.com/blog/minimax-m3-open-weights-frontier/): MiniMax M3 ships open-weights with 1M context, native multimodal, and frontier coding claims at $0.30/M input tokens. Here's why it matters.
- [Mistral Medium 3.5: The 128B Open-Weight Model Built for Agentic Coding](https://lmrank.com/blog/mistral-medium-3-5-agentic-coding/): Mistral's new 128B dense model packs a 256K context window, open weights, and claims 77.6% on SWE-Bench. Here's why the medium tier just got interesting.
- [Kimi K2.6 vs Qwen3.6 35B A3B: The Open-Weight Coding Crown](https://lmrank.com/blog/kimi-k2-6-vs-qwen3-6-35b-coding-crown/): Two open-weight coding models went head-to-head in spring 2026. Qwen3.6 35B A3B scores higher, costs 6x less, and runs on consumer hardware. Is Kimi K2.6's agentic reliability worth the premium?
- [AI Model Weekly Roundup — May 18–24, 2026](https://lmrank.com/blog/2026-05-25-roundup/): DeepSeek's reasonix coding agent tops Hacker News, Google ships Gemini 3.5 Flash for speed workloads, xAI quietly launches Grok Build 0.1, and rising memory costs reshape AI chip economics.
- [DeepSeek Makes Its 75% API Discount Permanent — Here's What It Means for Frontier AI Pricing](https://lmrank.com/blog/deepseek-permanent-price-cut/): DeepSeek confirmed its V4 Pro 75% API discount won't expire. At $0.50 per million tokens with a 9.0 LMRank score, this changes the economics of frontier AI.
- [Qwen3.7 Max: The $2.50 Sleeper Flagship Reshaping API Economics](https://lmrank.com/blog/qwen3-7-max-sleeper-flagship/): Alibaba's Qwen3.7 Max scores 9.0 with a 1M token context window at just $2.50 per million input tokens. Here's why it might be the best value on the frontier tier.
- [DeepSeek V4-Flash: The Open-Weight Model With a Million-Token Context](https://lmrank.com/blog/deepseek-v4-flash-million-tokens/): DeepSeek's new V4-Flash preview brings a 1M token context window to open-weight AI for the first time. 284B params, 13B active, MIT license — and competitive with frontier models on coding.
- [Grok 4.3 vs Gemini Ultra 2: The Million-Token Showdown](https://lmrank.com/blog/grok-4-3-vs-gemini-ultra-2/): xAI's Grok 4.3 undercuts Gemini Ultra 2 by 90% on output pricing while matching its 1M token context window. Here's the breakdown.
- [State of Large Language Models — May 2026](https://lmrank.com/blog/state-of-llms-may-2026/)

## Crawling Notes
- Public pages are crawlable, and canonical page URLs end with a trailing slash.
- Do not crawl /admin/ or admin API routes.
- Prefer /sitemap.xml for exhaustive URL discovery and /sitemap/ for human-readable page grouping.
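
The notes above could be applied in a crawler roughly as follows. This is a minimal sketch, not LMRank tooling: the helper names `canonicalize` and `is_crawlable` are hypothetical, and because this file does not list the admin API route paths, only the `/admin/` prefix is excluded here.

```python
from urllib.parse import urlsplit, urlunsplit

BASE_HOST = "lmrank.com"
# Assumption: only /admin/ is named in this file; admin API route paths
# are not listed, so add them here once they are known.
EXCLUDED_PREFIXES = ("/admin/",)


def canonicalize(url: str) -> str:
    """Add a trailing slash to page paths; leave file-like paths
    (for example /sitemap.xml) unchanged and drop any fragment."""
    parts = urlsplit(url)
    path = parts.path or "/"
    last_segment = path.rsplit("/", 1)[-1]
    if not path.endswith("/") and "." not in last_segment:
        path += "/"
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, ""))


def is_crawlable(url: str) -> bool:
    """Return True for URLs on the site that fall outside excluded prefixes."""
    parts = urlsplit(url)
    if parts.netloc != BASE_HOST:
        return False
    path = parts.path or "/"
    return not any(
        path == prefix.rstrip("/") or path.startswith(prefix)
        for prefix in EXCLUDED_PREFIXES
    )


if __name__ == "__main__":
    for candidate in (
        "https://lmrank.com/model/gpt-5",
        "https://lmrank.com/sitemap.xml",
        "https://lmrank.com/admin/users",
    ):
        print(candidate, canonicalize(candidate), is_crawlable(candidate))
```

A crawler would typically read `/sitemap.xml` first, then pass each discovered URL through `is_crawlable` and `canonicalize` before fetching it.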