Corporate Finance
Vals AI CorpFin v2: expert-built questions over long-context corporate credit agreements. An independent finance benchmark, run in-house.
- Domain: Finance
- Metric: %
- Orientation: Higher is better
- Results: 26
Ranking
| # | Model | Developer | Score | Source | Status |
|---|---|---|---|---|---|
| 1 | Grok 4.3 | xAI | 68.5% | Vals AI CorpFin v2 (3rd-party) | unverified |
| 2 | GPT-5.5 | OpenAI | 68.4% | Vals AI CorpFin v2 (3rd-party) | unverified |
| 3 | Claude Opus 4.6 | Anthropic | 67% | Vals AI CorpFin v2 (3rd-party) | unverified |
| 4 | Claude Opus 4.8 | Anthropic | 66.7% | Vals AI CorpFin v2 (3rd-party) | unverified |
| 5 | Kimi K2.6 | Moonshot AI | 66.7% | Vals AI CorpFin v2 (3rd-party) | unverified |
| 6 | Gemini 3 Flash | Google DeepMind | 66.4% | Vals AI CorpFin v2 (3rd-party) | unverified |
| 7 | Claude Opus 4.7 | Anthropic | 66.1% | Vals AI CorpFin v2 (3rd-party) | unverified |
| 8 | GPT-5.2 | OpenAI | 65.9% | Vals AI CorpFin v2 (3rd-party) | unverified |
| 9 | Claude Sonnet 4.6 | Anthropic | 65.3% | Vals AI CorpFin v2 (3rd-party) | unverified |
| 10 | GPT-5.4 | OpenAI | 65.3% | Vals AI CorpFin v2 (3rd-party) | unverified |
| 11 | Muse Spark | Meta AI | 65.1% | Vals AI CorpFin v2 (3rd-party) | unverified |
| 12 | Gemini 3.5 Flash | Google DeepMind | 64.7% | Vals AI CorpFin v2 (3rd-party) | unverified |
| 13 | Gemini 3.1 Pro | Google DeepMind | 64.5% | Vals AI CorpFin v2 (3rd-party) | unverified |
| 14 | GLM-5.1 | Z.ai (Zhipu) | 64.5% | Vals AI CorpFin v2 (3rd-party) | unverified |
| 15 | Qwen3.7 Max | Alibaba Qwen | 63.7% | Vals AI CorpFin v2 (3rd-party) | unverified |
| 16 | Gemini 3 Pro | Google DeepMind | 63.7% | Vals AI CorpFin v2 (3rd-party) | unverified |
| 17 | Grok 4.20 | xAI | 63.7% | Vals AI CorpFin v2 (3rd-party) | unverified |
| 18 | Qwen3.6-27B | Alibaba Qwen | 62.3% | Vals AI CorpFin v2 (3rd-party) | unverified |
| 19 | DeepSeek V4-Pro | DeepSeek | 61.4% | Vals AI CorpFin v2 (3rd-party) | unverified |
| 20 | Mistral Large 3 | Mistral AI | 61% | Vals AI CorpFin v2 (3rd-party) | unverified |
| 21 | Gemini 2.5 Pro | Google DeepMind | 60.8% | Vals AI CorpFin v2 (3rd-party) | unverified |
| 22 | Kimi K2 Thinking | Moonshot AI | 60.6% | Vals AI CorpFin v2 (3rd-party) | unverified |
| 23 | Claude Haiku 4.5 | Anthropic | 60.6% | Vals AI CorpFin v2 (3rd-party) | unverified |
| 24 | DeepSeek V3.2 | DeepSeek | 51% | Vals AI CorpFin v2 (3rd-party) | unverified |
| 25 | Llama 4 Maverick | Meta AI | 49.7% | Vals AI CorpFin v2 (3rd-party) | unverified |
| 26 | Llama 4 Scout | Meta AI | 46.8% | Vals AI CorpFin v2 (3rd-party) | unverified |
