LiveCodeBench
Contamination-resistant competitive-programming problems collected over time to avoid training-set overlap.
- Domain: Coding
- Metric: %
- Orientation: Higher is better
- Results: 13
Ranking
| # | Model | Developer | Score | Source | Source type | Status |
|---|---|---|---|---|---|---|
| 1 | DeepSeek V4-Pro | DeepSeek | 93.5% | DeepSeek: V4-Pro model card | vendor | unverified |
| 2 | Qwen3.7 Max | Alibaba Qwen | 91.6% | Qwen: Qwen3.7 Max | vendor | unverified |
| 3 | DeepSeek V4-Flash | DeepSeek | 91.6% | DeepSeek: V4-Flash model card | vendor | unverified |
| 4 | Kimi K2.6 | Moonshot AI | 89.6% | Moonshot: Kimi K2.6 model card | vendor | unverified |
| 5 | Nemotron 3 Ultra | NVIDIA | 89% | NVIDIA: Nemotron 3 Ultra model card | vendor | unverified |
| 6 | Qwen3.6-27B | Alibaba Qwen | 83.9% | Alibaba: Qwen3.6-27B model card | vendor | unverified |
| 7 | DeepSeek V3.2 | DeepSeek | 83.3% | DeepSeek: V3.2 technical report | vendor | unverified |
| 8 | Kimi K2 Thinking | Moonshot AI | 83.1% | Moonshot: Kimi K2 Thinking model card | vendor | unverified |
| 9 | Qwen3.6-35B-A3B | Alibaba Qwen | 80.4% | Alibaba: Qwen3.6-35B-A3B model card | vendor | unverified |
| 10 | Nova 2 Pro | Amazon | 74.6% | Amazon: Nova 2 technical report | vendor | unverified |
| 11 | Gemini 2.5 Pro | Google DeepMind | 69% | Google DeepMind: Gemini 2.5 Pro model card | vendor | unverified |
| 12 | Llama 4 Maverick | Meta AI | 43.4% | Meta: Llama 4 | vendor | unverified |
| 13 | Llama 4 Scout | Meta AI | 32.8% | Meta: Llama 4 | vendor | unverified |
