Benchmarks
Composite · higher is better
AA Intelligence Index
Artificial Analysis Intelligence Index (v4.0): an independent composite across roughly 10 evaluations (including GPQA Diamond, HLE, Terminal-Bench, SciCode, GDPval, and τ²-Bench). It is the de facto cross-model standard; higher is better. Scores are shown for reference and are normalized relative to this set.
Benchmark source
- Domain: Composite
- Metric: pts
- Orientation: Higher is better
- Results: 31
Ranking
| # | Model | Developer | Score (pts) | Source | Status |
|---|---|---|---|---|---|
| 1 | Claude Opus 4.8 | Anthropic | 61 | Artificial Analysis (3rd-party) | unverified |
| 2 | GPT-5.5 | OpenAI | 60 | Artificial Analysis (3rd-party) | unverified |
| 3 | Claude Opus 4.7 | Anthropic | 57 | Artificial Analysis (3rd-party) | unverified |
| 4 | Gemini 3.1 Pro | Google DeepMind | 57 | Artificial Analysis (3rd-party) | unverified |
| 5 | Qwen3.7 Max | Alibaba Qwen | 57 | Artificial Analysis (3rd-party) | unverified |
| 6 | GPT-5.4 | OpenAI | 57 | Artificial Analysis (3rd-party) | unverified |
| 7 | Gemini 3.5 Flash | Google DeepMind | 55 | Artificial Analysis (3rd-party) | unverified |
| 8 | MiniMax M3 | MiniMax | 55 | Artificial Analysis (3rd-party) | unverified |
| 9 | Kimi K2.6 | Moonshot AI | 54 | Artificial Analysis (3rd-party) | unverified |
| 10 | Claude Opus 4.6 | Anthropic | 53 | Artificial Analysis (3rd-party) | unverified |
| 11 | Grok 4.3 | xAI | 53 | Artificial Analysis (3rd-party) | unverified |
| 12 | DeepSeek V4-Pro | DeepSeek | 52 | Artificial Analysis (3rd-party) | unverified |
| 13 | Muse Spark | Meta AI | 52 | Artificial Analysis (3rd-party) | unverified |
| 14 | GLM-5.1 | Z.ai (Zhipu) | 51 | Artificial Analysis (3rd-party) | unverified |
| 15 | GPT-5.2 | OpenAI | 51 | Artificial Analysis (3rd-party) | unverified |
| 16 | Grok 4.20 | xAI | 49 | Artificial Analysis (3rd-party) | unverified |
| 17 | Gemini 3 Pro | Google DeepMind | 48 | Artificial Analysis (3rd-party) | unverified |
| 18 | Nemotron 3 Ultra | NVIDIA | 48 | Artificial Analysis (3rd-party) | unverified |
| 19 | DeepSeek V4-Flash | DeepSeek | 47 | Artificial Analysis (3rd-party) | unverified |
| 20 | Qwen3.6-27B | Alibaba Qwen | 46 | Artificial Analysis (3rd-party) | unverified |
| 21 | Claude Sonnet 4.6 | Anthropic | 44 | Artificial Analysis (3rd-party) | unverified |
| 22 | Qwen3.6-35B-A3B | Alibaba Qwen | 43 | Artificial Analysis (3rd-party) | unverified |
| 23 | Kimi K2 Thinking | Moonshot AI | 41 | Artificial Analysis (3rd-party) | unverified |
| 24 | Gemini 3 Flash | Google DeepMind | 35 | Artificial Analysis (3rd-party) | unverified |
| 25 | Gemini 2.5 Pro | Google DeepMind | 35 | Artificial Analysis (3rd-party) | unverified |
| 26 | DeepSeek V3.2 | DeepSeek | 32 | Artificial Analysis (3rd-party) | unverified |
| 27 | Claude Haiku 4.5 | Anthropic | 31 | Artificial Analysis (3rd-party) | unverified |
| 28 | Nova 2 Pro | Amazon | 23 | Artificial Analysis (3rd-party) | unverified |
| 29 | Mistral Large 3 | Mistral AI | 23 | Artificial Analysis (3rd-party) | unverified |
| 30 | Llama 4 Maverick | Meta AI | 18 | Artificial Analysis (3rd-party) | unverified |
| 31 | Llama 4 Scout | Meta AI | 14 | Artificial Analysis (3rd-party) | unverified |
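
The description says scores are normalized relative to this set but does not state how. As a rough sketch only, the snippet below assumes a simple min-max scaling to a 0-100 range; the helper name `normalize_to_set` and the choice of min-max scaling are hypothetical illustrations, not the method documented by Artificial Analysis or by this page.

```python
# Hypothetical helper: min-max scaling is an ASSUMPTION for illustration,
# not the normalization method documented by the benchmark source.
def normalize_to_set(scores):
    """Scale each score to 0..100 relative to the lowest and highest score in the set."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores identical: treat every entry as the top of the range.
        return {name: 100.0 for name in scores}
    return {name: 100.0 * (s - lo) / (hi - lo) for name, s in scores.items()}


# A few raw scores taken from the ranking table.
scores = {
    "Claude Opus 4.8": 61,
    "GPT-5.5": 60,
    "Claude Haiku 4.5": 31,
    "Llama 4 Scout": 14,
}

normalized = normalize_to_set(scores)
for name, value in normalized.items():
    print(f"{name}: {value:.1f}")
```

Under this assumption the top-ranked model maps to 100 and the lowest-ranked to 0, so the normalized value depends on which models are in the set and changes whenever the set changes.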
