The Simple Idea

LMRank assigns every AI model a single score out of 10. No wall of benchmark numbers, no spin, just a clear answer to "how good is this model?"

Score Sources

We aggregate results from established public benchmarks:

  • MMLU — Knowledge and understanding across 57 subjects
  • HumanEval — Code generation correctness
  • MATH / GSM8K — Mathematical reasoning
  • BIG-Bench — Hard reasoning tasks
  • Chatbot Arena (LMSYS) — Human preference rankings

Score Scale

LMRank score tier scale
Range      Tier        Meaning
9.0–10     Elite       Frontier models, best in class
8.0–8.9    Excellent   Strong performers, near-frontier
7.0–7.9    Strong      Capable models for most tasks
6.0–6.9    Capable     Entry-level or older models
1.0–5.9    Basic       Limited capability, niche use cases
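
The tier scale above can be read as a simple lookup from score to tier. The following is a minimal Python sketch of that lookup, not LMRank's actual implementation: the names TIERS and tier_for are hypothetical, and it assumes scores are rounded to one decimal place before the lookup, which the one-decimal range boundaries suggest but the page does not state.

```python
# Lower bound of each tier, copied from the score scale above.
# Ordered from highest to lowest so the first match wins.
TIERS = [
    (9.0, "Elite"),
    (8.0, "Excellent"),
    (7.0, "Strong"),
    (6.0, "Capable"),
    (1.0, "Basic"),
]


def tier_for(score: float) -> str:
    """Return the tier name for a score on the 1.0 to 10 scale."""
    if not 1.0 <= score <= 10.0:
        raise ValueError(f"score must be between 1.0 and 10, got {score}")
    # Assumption: scores are reported to one decimal place.
    rounded = round(score, 1)
    for lower_bound, name in TIERS:
        if rounded >= lower_bound:
            return name
    # Not reachable: every valid score is at least 1.0, the "Basic" bound.
    raise AssertionError("no tier matched")


if __name__ == "__main__":
    for example in (9.3, 8.9, 7.0, 6.5, 1.0):
        print(example, tier_for(example))
```

Keeping the boundaries in one ordered list means a change to the scale only touches the data, not the lookup logic.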

Updates

We update scores as new models are released and as benchmark results change. New models are typically added within one week of public launch.