Language Models · Proprietary
Claude Opus 4.8
Anthropic's frontier model and the current intelligence leader; excels at agentic coding and long-horizon reasoning.
Crosshair Index: 76.4 (#2 of 32 · Language Models)
- Provider: Anthropic
- Released: 2026-05-28
- Parameters: Undisclosed
- Context: 1M tokens
Token pricing
- Input: $5 / 1M
- Output: $25 / 1M
- Cache read: $0.50 / 1M
- Cache write: $6.25 / 1M
USD per 1M tokens · cache read = cached input (hit), cache write = caching surcharge · official list pricing (June 2026).
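As a worked example of the arithmetic, the sketch below estimates the cost of a single request from the list prices above. The function name `estimate_cost_usd` and its parameters are hypothetical, not part of any provider SDK. It also assumes each token is billed in exactly one bucket (uncached input, output, cache read, or cache write) and that the cache-write figure is the full per-token price for tokens written to the cache; the page does not confirm either point.

```python
from decimal import Decimal

# List prices in USD per 1M tokens, copied from the pricing list above.
PRICE_PER_MILLION = {
    "input": Decimal("5"),
    "output": Decimal("25"),
    "cache_read": Decimal("0.5"),
    "cache_write": Decimal("6.25"),
}


def estimate_cost_usd(input_tokens=0, output_tokens=0,
                      cache_read_tokens=0, cache_write_tokens=0):
    """Estimate request cost in USD (hypothetical helper).

    Assumption: every token falls into exactly one billing bucket.
    """
    counts = {
        "input": input_tokens,
        "output": output_tokens,
        "cache_read": cache_read_tokens,
        "cache_write": cache_write_tokens,
    }
    for bucket, n in counts.items():
        if n < 0:
            raise ValueError(f"{bucket} token count must be non-negative")
    # Sum price * tokens per bucket, then scale from per-1M to per-token.
    total = sum((PRICE_PER_MILLION[b] * n for b, n in counts.items()), Decimal(0))
    return total / Decimal(1_000_000)


# 200k uncached input tokens, 800k tokens read from cache, 10k output tokens.
cost = estimate_cost_usd(input_tokens=200_000,
                         cache_read_tokens=800_000,
                         output_tokens=10_000)
print(f"${cost:.2f}")  # prints $1.65
```

`Decimal` is used instead of floats so that fractional prices such as $0.50 and $6.25 per 1M tokens do not pick up rounding noise when summed.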
Industry skill web
Professional-domain strengths, composed from the benchmarks relevant to each field.
Software Engineering (skill): 89
Shipping working code against real repositories: bug fixes, feature patches, and competitive programming under tests.
Scorecard
| Benchmark | Category | Score | Source | Source type | Status |
|---|---|---|---|---|---|
| GPQA Diamond | Reasoning | 93.6% | Anthropic: Claude Opus 4.8 | vendor | unverified |
| Humanity's Last Exam | Frontier | 49.8% | Anthropic: Claude Opus 4.8 | vendor | unverified |
| SWE-bench Verified | Agentic Coding | 88.6% (best) | Anthropic: Claude Opus 4.8 | vendor | unverified |
| LiveCodeBench | Coding | - | - | - | not evaluated |
| MMLU-Pro | Knowledge | - | - | - | not evaluated |
| LMArena Elo | Human Preference | - | - | - | not evaluated |
| AA Intelligence Index | Composite | 61 (best) | Artificial Analysis | 3rd-party | unverified |
| Corporate Finance | Finance | 66.7% | Vals AI: CorpFin v2 | 3rd-party | unverified |
| LegalBench | Law | 83.6% | Vals AI: LegalBench | 3rd-party | unverified |
| TaxEval | Tax & Accounting | 75.6% | Vals AI: TaxEval v2 | 3rd-party | unverified |
| Medical Coding | Medicine | 53.2% | Vals AI: MedCode | 3rd-party | unverified |
