Crosshair
Benchmarks
Industry: 3 benchmarks

Medicine

Clinical knowledge and diagnostic reasoning, including the medical coding accuracy and science depth that real practice demands.

The Medicine score is the mean of a model’s normalized 0–100 scores across the 3 benchmarks below. Normalization is direction-aware, so lower-is-better metrics are inverted before averaging. This is the same figure the leaderboard’s industry view ranks by.
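As a rough illustration of the calculation described above, the sketch below averages direction-aware normalized scores. The page does not say how raw results are mapped onto 0–100, so the min-max scaling against per-benchmark bounds (`lo`, `hi`) is an assumption. The function names and the three sample results are hypothetical, not Crosshair's actual code or data.

```python
from statistics import mean


def normalize(raw, lo, hi, higher_is_better=True):
    """Map a raw result onto 0-100 using assumed bounds [lo, hi].

    Min-max scaling is an assumption; the source only says scores are
    normalized to 0-100 and that lower-is-better metrics are inverted.
    """
    if hi == lo:
        raise ValueError("lo and hi must differ")
    scaled = 100.0 * (raw - lo) / (hi - lo)
    scaled = max(0.0, min(100.0, scaled))  # clamp out-of-range results
    # Invert lower-is-better metrics so that 100 is always the best score.
    return scaled if higher_is_better else 100.0 - scaled


def industry_score(results):
    """Return the unweighted mean of the normalized benchmark scores."""
    return mean(normalize(**r) for r in results)


# Hypothetical results for one model on three benchmarks.
sample_results = [
    {"raw": 80, "lo": 0, "hi": 100, "higher_is_better": True},   # accuracy, %
    {"raw": 10, "lo": 0, "hi": 50, "higher_is_better": False},   # error rate, %
    {"raw": 45, "lo": 0, "hi": 50, "higher_is_better": True},    # points
]

print(round(industry_score(sample_results), 1))  # 83.3
```

In this sketch the error-rate benchmark scales to 20 and is inverted to 80, so the three normalized scores are 80, 80, and 90, and their mean is about 83.3.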

Leaders

DeepSeek V4-Flash leads this industry with a score of 87.2.

Benchmarks in this score

Each model’s scores on these benchmarks are normalized and averaged to produce the industry score above.