
MMLU-Pro

A harder, cleaned-up successor to MMLU spanning 14 subject areas, with 10-way multiple choice and reasoning-heavy items.

Domain: Knowledge
Metric: %
Orientation: Higher is better
Results: 10
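
As a rough illustration of what a percentage score on a 10-way multiple-choice benchmark means, the sketch below computes plain accuracy over a handful of made-up items. The function name `accuracy_percent` and the sample answers are hypothetical, and the scoring protocol behind any vendor-reported number (prompting, answer extraction, number of shots) may differ from this.

```python
# Minimal sketch: accuracy on multiple-choice items, reported as a percentage.
# All names and sample data here are illustrative assumptions, not the
# benchmark's official scoring code.

OPTIONS = "ABCDEFGHIJ"  # ten answer options per item


def accuracy_percent(predictions, answers):
    """Return the share of correctly answered items as a percentage."""
    if len(predictions) != len(answers):
        raise ValueError("predictions and answers must have the same length")
    if not answers:
        raise ValueError("at least one item is required")
    correct = sum(p == a for p, a in zip(predictions, answers))
    return 100.0 * correct / len(answers)


# With ten options, uniform random guessing scores 10% in expectation.
chance_baseline = 100.0 / len(OPTIONS)

answers = ["A", "C", "J", "B"]      # hypothetical gold labels
predictions = ["A", "C", "D", "B"]  # hypothetical model outputs
score = accuracy_percent(predictions, answers)

print(f"score: {score:.1f}%  (chance: {chance_baseline:.1f}%)")
```

Because higher is better and chance sits at 10%, scores in the 70s and 80s are well above guessing.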

Ranking

#   Model              Developer     Score  Source                                 Type    Status
1   DeepSeek V4-Pro    DeepSeek      87.5%  DeepSeek: V4-Pro model card            vendor  unverified
2   Nemotron 3 Ultra   NVIDIA        86.8%  NVIDIA: Nemotron 3 Ultra model card    vendor  unverified
3   Qwen3.6-27B        Alibaba Qwen  86.2%  Alibaba: Qwen3.6-27B model card        vendor  unverified
4   DeepSeek V4-Flash  DeepSeek      86.2%  DeepSeek: V4-Flash model card          vendor  unverified
5   Qwen3.6-35B-A3B    Alibaba Qwen  85.2%  Alibaba: Qwen3.6-35B-A3B model card    vendor  unverified
6   DeepSeek V3.2      DeepSeek      85%    DeepSeek: V3.2 technical report        vendor  unverified
7   Kimi K2 Thinking   Moonshot AI   84.6%  Moonshot: Kimi K2 Thinking model card  vendor  unverified
8   Nova 2 Pro         Amazon        81.6%  Amazon: Nova 2 technical report        vendor  unverified
9   Llama 4 Maverick   Meta AI       80.5%  Meta: Llama 4                          vendor  unverified
10  Llama 4 Scout      Meta AI       74.3%  Meta: Llama 4                          vendor  unverified
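
Two entries in the ranking share a score of 86.2% yet occupy consecutive positions. The sketch below shows one common alternative, standard competition ranking, in which tied scores share a rank and the next rank is skipped. This is an assumption for illustration only, not a description of how this page orders its rows; the function name `competition_rank` is hypothetical, and only a subset of the listed scores is used.

```python
# Minimal sketch: standard competition ranking ("1, 2, 3, 3, 5") over
# (model, score) pairs taken from the ranking. Illustrative only.

rows = [
    ("DeepSeek V4-Pro", 87.5),
    ("Nemotron 3 Ultra", 86.8),
    ("Qwen3.6-27B", 86.2),
    ("DeepSeek V4-Flash", 86.2),
    ("Qwen3.6-35B-A3B", 85.2),
]


def competition_rank(rows):
    """Rank rows by score, descending; equal scores share the same rank."""
    # sorted() is stable, so tied rows keep their original relative order.
    ordered = sorted(rows, key=lambda r: r[1], reverse=True)
    ranked = []
    for i, (name, score) in enumerate(ordered):
        if i > 0 and score == ordered[i - 1][1]:
            rank = ranked[-1][0]  # tie: reuse the previous rank
        else:
            rank = i + 1          # otherwise rank is the 1-based position
        ranked.append((rank, name, score))
    return ranked


ranked = competition_rank(rows)
for rank, name, score in ranked:
    print(f"{rank:>2}  {name:<18} {score}%")
```

Under this scheme both 86.2% entries would be listed at rank 3 and the following model at rank 5.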