
LegalBench

LegalBench is a legal-reasoning task suite that originated at Stanford CodeX. The scores listed here come from independent runs by Vals AI and are reported as overall accuracy across tasks.
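
As an illustration of what "overall accuracy across tasks" can mean, the sketch below contrasts two common ways of aggregating per-task results: averaging the per-task accuracies (each task counts equally) and pooling all items (larger tasks count more). Which of these Vals AI uses is not stated here, so treat this as an assumption. The task names and counts are hypothetical and do not come from LegalBench.

```python
from statistics import mean

# Hypothetical per-task results: task name -> (correct answers, total items).
# These numbers are invented for illustration only.
EXAMPLE_RESULTS = {
    "task_a": (90, 100),
    "task_b": (30, 50),
    "task_c": (8, 10),
}


def macro_accuracy(per_task: dict[str, tuple[int, int]]) -> float:
    """Average of per-task accuracies, so every task has equal weight."""
    return mean(correct / total for correct, total in per_task.values())


def micro_accuracy(per_task: dict[str, tuple[int, int]]) -> float:
    """Accuracy over all items pooled together, so larger tasks weigh more."""
    total_correct = sum(correct for correct, _ in per_task.values())
    total_items = sum(total for _, total in per_task.values())
    return total_correct / total_items


if __name__ == "__main__":
    print(f"macro: {macro_accuracy(EXAMPLE_RESULTS):.1%}")
    print(f"micro: {micro_accuracy(EXAMPLE_RESULTS):.1%}")
```

The two aggregates can differ by several points when task sizes are uneven, which is worth keeping in mind when comparing scores that are only a few tenths of a percent apart.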

Domain: Law
Metric: %
Orientation: Higher is better
Results: 24

Ranking

#   Model              Developer        Score  Source                            Status
1   Gemini 3.1 Pro     Google DeepMind  87.4%  Vals AI - LegalBench (3rd-party)  unverified
2   Gemini 3 Pro       Google DeepMind  87%    Vals AI - LegalBench (3rd-party)  unverified
3   Gemini 3 Flash     Google DeepMind  86.9%  Vals AI - LegalBench (3rd-party)  unverified
4   GPT-5.5            OpenAI           86.5%  Vals AI - LegalBench (3rd-party)  unverified
5   GPT-5.4            OpenAI           86%    Vals AI - LegalBench (3rd-party)  unverified
6   Claude Opus 4.7    Anthropic        85.3%  Vals AI - LegalBench (3rd-party)  unverified
7   Claude Opus 4.6    Anthropic        85.3%  Vals AI - LegalBench (3rd-party)  unverified
8   Qwen3.7 Max        Alibaba Qwen     84.9%  Vals AI - LegalBench (3rd-party)  unverified
9   Kimi K2.6          Moonshot AI      84.7%  Vals AI - LegalBench (3rd-party)  unverified
10  Grok 4.3           xAI              84.5%  Vals AI - LegalBench (3rd-party)  unverified
11  GLM-5.1            Z.ai (Zhipu)     84.4%  Vals AI - LegalBench (3rd-party)  unverified
12  Muse Spark         Meta AI          84.2%  Vals AI - LegalBench (3rd-party)  unverified
13  Claude Opus 4.8    Anthropic        83.6%  Vals AI - LegalBench (3rd-party)  unverified
14  Gemini 3.5 Flash   Google DeepMind  83.6%  Vals AI - LegalBench (3rd-party)  unverified
15  GPT-5.2            OpenAI           82.8%  Vals AI - LegalBench (3rd-party)  unverified
16  Claude Sonnet 4.6  Anthropic        82.1%  Vals AI - LegalBench (3rd-party)  unverified
17  Claude Haiku 4.5   Anthropic        81.2%  Vals AI - LegalBench (3rd-party)  unverified
18  DeepSeek V4-Pro    DeepSeek         80.3%  Vals AI - LegalBench (3rd-party)  unverified
19  Kimi K2 Thinking   Moonshot AI      80.2%  Vals AI - LegalBench (3rd-party)  unverified
20  Mistral Large 3    Mistral AI       79.1%  Vals AI - LegalBench (3rd-party)  unverified
21  Llama 4 Maverick   Meta AI          77.8%  Vals AI - LegalBench (3rd-party)  unverified
22  Grok 4.20          xAI              77.7%  Vals AI - LegalBench (3rd-party)  unverified
23  DeepSeek V3.2      DeepSeek         76.1%  Vals AI - LegalBench (3rd-party)  unverified
24  Llama 4 Scout      Meta AI          72%    Vals AI - LegalBench (3rd-party)  unverified