Language Models · Proprietary

Gemini 3.1 Pro

Google's most capable Gemini (preview); tops several reasoning benchmarks.

text · image · audio · video · code
Crosshair Index: 75.9 (#3 of 32 · Language Models)
Provider: Google DeepMind
Released: 2026-02-19
Parameters: Undisclosed
Context: 1,048,576 tokens

Token pricing

Input: $2 / 1M
Output: $12 / 1M
Cache read: $0.20 / 1M

USD per 1M tokens · cache read = cached input (hit), cache write = caching surcharge · official list pricing (June 2026).
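
As a rough illustration of how the list prices above translate into a per-request cost, the sketch below multiplies token counts by the per-million rates. The function name `request_cost_usd` and the example token counts are hypothetical. It assumes that cached input tokens are billed at the cache-read rate instead of the input rate, and it ignores any cache-write surcharge, since no rate for that is listed here.

```python
# Illustrative only: rates are the list prices quoted above (USD per 1M tokens).
INPUT_PER_M = 2.00
OUTPUT_PER_M = 12.00
CACHE_READ_PER_M = 0.20


def request_cost_usd(input_tokens: int, output_tokens: int, cached_tokens: int = 0) -> float:
    """Estimate the cost of one request in USD.

    cached_tokens is the portion of input_tokens served from cache
    (assumption: those tokens are charged the cache-read rate only).
    """
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")
    fresh_tokens = input_tokens - cached_tokens
    total = (
        fresh_tokens * INPUT_PER_M
        + cached_tokens * CACHE_READ_PER_M
        + output_tokens * OUTPUT_PER_M
    )
    return total / 1_000_000


# Hypothetical request: 100k input tokens (80k of them cached), 2k output tokens.
print(f"${request_cost_usd(100_000, 2_000, cached_tokens=80_000):.4f}")
```

Under these assumptions the example request comes to roughly $0.08; actual billing may differ, for instance if pricing is tiered by prompt length.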

Industry skill web

Professional-domain strengths, composed from the benchmarks relevant to each field.

Software · IB / Finance · Law · Medicine · Research · Consulting · Accounting

Software Engineering

Skill: 81

Shipping working code against real repositories: bug fixes, feature patches, and competitive programming under tests.
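
The page does not state how an axis score is composed from its benchmarks. As one plausible reading, the sketch below takes an unweighted mean of the benchmarks that have a score and skips those marked "not evaluated". The helper `axis_score`, the equal weighting, and the assignment of SWE-bench Verified and LiveCodeBench to the software axis are all assumptions for illustration, not Crosshair's documented method.

```python
from typing import Optional


def axis_score(benchmarks: dict[str, Optional[float]]) -> Optional[int]:
    """Unweighted mean of the available scores, rounded to an integer.

    None marks a benchmark that was not evaluated; such entries are skipped.
    Equal weighting is an assumption, not the site's stated formula.
    """
    scored = [value for value in benchmarks.values() if value is not None]
    if not scored:
        return None  # no evidence for this axis at all
    return round(sum(scored) / len(scored))


# Hypothetical mapping of scorecard rows to the software axis.
software = {"SWE-bench Verified": 80.6, "LiveCodeBench": None}
print(axis_score(software))  # 81
```

With these inputs the sketch happens to match the displayed skill value, but that coincidence does not confirm the actual weighting or benchmark selection.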

Scorecard

| Benchmark | Category | Score | Source | Source type | Status |
|---|---|---|---|---|---|
| GPQA Diamond | Reasoning | 94.3% (best) | Google DeepMind, Gemini 3.1 Pro model card | vendor | unverified |
| Humanity's Last Exam | Frontier | 44.4% | Google DeepMind, Gemini 3.1 Pro model card | vendor | unverified |
| SWE-bench Verified | Agentic Coding | 80.6% | Google DeepMind, Gemini 3.1 Pro model card | vendor | unverified |
| LiveCodeBench | Coding | not evaluated | | | |
| MMLU-Pro | Knowledge | not evaluated | | | |
| LMArena Elo | Human Preference | 1,488 | LMArena (arena.ai) | 3rd-party | unverified |
| AA Intelligence Index | Composite | 57 | Artificial Analysis | 3rd-party | unverified |
| Corporate Finance | Finance | 64.5% | Vals AI, CorpFin v2 | 3rd-party | unverified |
| LegalBench | Law | 87.4% (best) | Vals AI, LegalBench | 3rd-party | unverified |
| TaxEval | Tax & Accounting | 72.9% | Vals AI, TaxEval v2 | 3rd-party | unverified |
| Medical Coding | Medicine | 59.1% (best) | Vals AI, MedCode | 3rd-party | unverified |