Scientific Research

Frontier problem solving — graduate-level science, the hardest multi-domain exams, and broad expert knowledge.

The Scientific Research score is the mean of a model’s normalized 0–100 scores across the 3 benchmarks below. Normalization is direction-aware, so lower-is-better metrics are inverted. This is the same figure the leaderboard’s industry view ranks by.
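The calculation described above can be sketched roughly as follows. This is a minimal illustration, not the leaderboard's actual implementation: the page does not say how scores are normalized, so min-max scaling against fixed bounds is an assumption, and the benchmark names (`bench_a`, `bench_b`, `bench_c`), their bounds, and the raw values are all hypothetical.

```python
from statistics import mean


def normalize(value, lo, hi, higher_is_better=True):
    """Scale a raw score to 0-100 (assumed min-max scaling).

    For lower-is-better metrics the scaled value is inverted,
    so that 100 always means "best".
    """
    if hi == lo:
        raise ValueError("bounds must differ")
    scaled = 100.0 * (value - lo) / (hi - lo)
    return scaled if higher_is_better else 100.0 - scaled


def industry_score(raw_scores, specs):
    """Mean of the normalized scores across all benchmarks in specs."""
    return mean(
        normalize(raw_scores[name], lo, hi, higher_is_better)
        for name, (lo, hi, higher_is_better) in specs.items()
    )


# Hypothetical benchmark specs: (lower bound, upper bound, higher_is_better)
specs = {
    "bench_a": (0.0, 100.0, True),   # accuracy in percent
    "bench_b": (0.0, 1.0, True),     # accuracy as a fraction
    "bench_c": (0.0, 50.0, False),   # error rate, lower is better
}

# Hypothetical raw scores for a single model
raw = {"bench_a": 80.0, "bench_b": 0.5, "bench_c": 10.0}

print(industry_score(raw, specs))  # normalized 80, 50, 80 -> mean 70.0
```

In this sketch the error-rate metric scales to 20 and is then inverted to 80, so a low error rate raises the average in the same way a high accuracy does.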

Leaders

Nova 2 Pro leads this industry with a score of 81.5.

Benchmarks in this score

Each model’s scores on these are normalized and averaged to produce the industry score above.