
Humanity's Last Exam

A broad, extremely difficult exam spanning math, the humanities, and science, designed to remain unsaturated by frontier models. Scores here are reported without external tools.

Domain: Frontier
Metric: %
Orientation: Higher is better
Results: 22

Ranking

 #  Model                Developer        Score  Source                                      Type       Status
 1  Doubao Seed 2.0 Pro  ByteDance        54.2%  ByteDance: Doubao Seed 2.0                  vendor     unverified
 2  Claude Opus 4.8      Anthropic        49.8%  Anthropic: Claude Opus 4.8                  vendor     unverified
 3  Claude Opus 4.7      Anthropic        46.9%  Anthropic: Claude Opus 4.7                  vendor     unverified
 4  Gemini 3.1 Pro       Google DeepMind  44.4%  Google DeepMind: Gemini 3.1 Pro model card  vendor     unverified
 5  Qwen3.7 Max          Alibaba Qwen     41.4%  Qwen: Qwen3.7 Max                           vendor     unverified
 6  GPT-5.5              OpenAI           41.4%  OpenAI: GPT-5.5                             vendor     unverified
 7  Gemini 3.5 Flash     Google DeepMind  40.2%  Google: Gemini 3.5 Flash                    vendor     unverified
 8  Claude Opus 4.6      Anthropic          40%  Anthropic: Claude Opus 4.6                  vendor     unverified
 9  Muse Spark           Meta AI          39.9%  Artificial Analysis: Muse Spark             3rd-party  unverified
10  DeepSeek V4-Pro      DeepSeek         37.7%  DeepSeek: V4-Pro model card                 vendor     unverified
11  Gemini 3 Pro         Google DeepMind  37.5%  Google: Gemini 3 Pro                        vendor     unverified
12  DeepSeek V4-Flash    DeepSeek         34.8%  DeepSeek: V4-Flash model card               vendor     unverified
13  Kimi K2.6            Moonshot AI      34.7%  Moonshot: Kimi K2.6 model card              vendor     unverified
14  GPT-5.2              OpenAI           34.5%  llm-stats: GPT-5.2 (vendor-reported)        3rd-party  unverified
15  Gemini 3 Flash       Google DeepMind  33.7%  Google: Gemini 3 Flash                      vendor     unverified
16  GLM-5.1              Z.ai (Zhipu)       31%  Zhipu / Z.ai: GLM-5.1 model card            vendor     unverified
17  Nemotron 3 Ultra     NVIDIA           26.7%  NVIDIA: Nemotron 3 Ultra model card         vendor     unverified
18  DeepSeek V3.2        DeepSeek         25.1%  DeepSeek: V3.2 technical report             vendor     unverified
19  Qwen3.6-27B          Alibaba Qwen       24%  Alibaba: Qwen3.6-27B model card             vendor     unverified
20  Kimi K2 Thinking     Moonshot AI      23.9%  Moonshot: Kimi K2 Thinking model card       vendor     unverified
21  Gemini 2.5 Pro       Google DeepMind  21.6%  Google DeepMind: Gemini 2.5 Pro model card  vendor     unverified
22  Qwen3.6-35B-A3B      Alibaba Qwen     21.4%  Alibaba: Qwen3.6-35B-A3B model card         vendor     unverified
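
The ranking is ordered by score under the "higher is better" orientation. A minimal sketch of that ordering follows, using four rows copied from the ranking. The `Result` type and `rank` function are hypothetical names introduced here for illustration. The page does not state how ties are broken (two entries share 41.4%), so the sketch assumes tied entries simply keep their input order.

```python
from typing import NamedTuple


class Result(NamedTuple):
    model: str
    score: float  # percent, as listed in the ranking


def rank(results: list[Result], higher_is_better: bool = True) -> list[tuple[int, Result]]:
    """Return (position, result) pairs ordered by score.

    sorted() is stable, even with reverse=True, so tied scores keep their
    input order. The tie-break rule is an assumption, not taken from the page.
    """
    ordered = sorted(results, key=lambda r: r.score, reverse=higher_is_better)
    return list(enumerate(ordered, start=1))


# Four rows taken from the ranking, deliberately out of order.
sample = [
    Result("Qwen3.7 Max", 41.4),
    Result("Claude Opus 4.8", 49.8),
    Result("GPT-5.5", 41.4),
    Result("Doubao Seed 2.0 Pro", 54.2),
]

for position, result in rank(sample):
    print(f"{position}. {result.model}: {result.score}%")
```

For a benchmark where lower is better (an error rate, for instance), passing `higher_is_better=False` flips the order without changing how ties are handled.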