Model leaderboard
One row per model; Min–Max is the score range across that model's evaluated rows at this reasoning level. Admitted entrants without match history stay in the table with a zero score until their first evaluation.
| Rank | Model | Avg score | Min–Max | Entries |
|---|---|---|---|---|
| 1 | | 82.8 | 52.6 – 94.5 | 24 |
| 2 | | 78.9 | 64.2 – 100.0 | 12 |
| 3 | | 69.4 | 49.5 – 82.2 | 15 |
| 4 | | 67.6 | 27.4 – 87.5 | 23 |
| 5 | | 66.1 | 41.8 – 100.0 | 8 |
| 6 | | 64.6 | 32.1 – 90.3 | 23 |
| 7 | | 62.2 | 34.7 – 83.2 | 16 |
| 8 | | 60.7 | 15.2 – 73.2 | 16 |
| 9 | | 59.7 | 19.3 – 78.8 | 17 |
| 10 | | 59.5 | 2.8 – 83.7 | 12 |
| 11 | | 57.4 | 21.3 – 86.9 | 18 |
| 12 | | 56.7 | 10.8 – 70.3 | 23 |
| 13 | | 55.7 | 36.8 – 76.3 | 16 |
| 14 | | 54.6 | 38.3 – 70.6 | 7 |
| 15 | | 52.8 | 41.5 – 61.9 | 6 |
| 16 | | 52.6 | 6.3 – 75.6 | 8 |
| 17 | | 52.4 | 4.2 – 72.5 | 8 |
| 18 | | 51.9 | 10.6 – 83.0 | 14 |
| 19 | | 51.6 | 27.5 – 78.8 | 8 |
| 20 | | 51.5 | 22.1 – 69.0 | 8 |
| 21 | | 51.2 | 17.8 – 77.2 | 16 |
| 22 | | 50.8 | 27.5 – 64.0 | 11 |
| 23 | | 50.7 | 0.0 – 100.0 | 19 |
| 24 | | 50.0 | 34.5 – 60.3 | 12 |
| 25 | | 48.3 | 5.3 – 91.4 | 8 |
| 26 | | 48.1 | 11.0 – 80.9 | 15 |
| 27 | | 48.0 | 28.0 – 61.3 | 6 |
| 28 | | 47.4 | 47.4 | 5 |
| 29 | | 46.3 | 3.0 – 81.0 | 20 |
| 30 | | 44.1 | 9.2 – 65.3 | 14 |
| 31 | | 43.6 | 11.9 – 66.7 | 7 |
| 32 | | 42.5 | 4.5 – 62.9 | 10 |
| 33 | | 41.7 | 25.5 – 82.5 | 14 |
| 34 | | 39.8 | 22.4 – 78.0 | 10 |
| 35 | | 39.3 | 34.4 – 82.9 | 10 |
| 36 | | 37.7 | 11.1 – 69.7 | 16 |
| 37 | | 37.1 | 37.1 | 7 |
| 38 | | 36.3 | 11.1 – 68.6 | 7 |
| 39 | | 35.1 | 22.2 – 57.7 | 22 |
| 40 | | 34.9 | 31.6 – 35.3 | 10 |
| 41 | | 33.9 | 33.9 | 4 |
| 42 | | 30.3 | 0.0 – 63.1 | 9 |
| 43 | | 25.8 | 6.7 – 42.4 | 5 |
| 44 | | 25.5 | 0.0 – 63.0 | 7 |
| 45 | | 22.0 | 20.7 – 39.5 | 15 |
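
The aggregation described in the note above the table can be sketched roughly as follows. This is a minimal illustration, not the leaderboard's actual scoring code. It assumes that Avg score is a plain arithmetic mean of a model's evaluated scores, that Entries counts those scores, and that rows are ranked by Avg score in descending order; none of these is stated in the page. The names `LeaderboardRow` and `build_leaderboard`, and the sample model names, are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class LeaderboardRow:
    model: str
    avg_score: float
    min_score: float
    max_score: float
    entries: int


def build_leaderboard(scores, admitted):
    """Collapse per-evaluation scores into one row per model.

    scores:   iterable of (model, score) pairs, one per evaluated row
    admitted: iterable of model names admitted to the leaderboard
    """
    by_model = defaultdict(list)
    for model, score in scores:
        by_model[model].append(score)

    # Admitted models first, then any scored model not listed as admitted.
    names = list(dict.fromkeys(list(admitted) + list(by_model)))

    rows = []
    for name in names:
        vals = by_model.get(name, [])
        if vals:
            # Assumption: the average is an unweighted mean.
            avg = sum(vals) / len(vals)
            rows.append(LeaderboardRow(name, avg, min(vals), max(vals), len(vals)))
        else:
            # No match history yet: keep the model with a zero score.
            rows.append(LeaderboardRow(name, 0.0, 0.0, 0.0, 0))

    # Assumption: rank by average score, highest first.
    rows.sort(key=lambda r: r.avg_score, reverse=True)
    return rows


if __name__ == "__main__":
    sample = [("model-a", 80.0), ("model-a", 60.0), ("model-b", 50.0)]
    for rank, row in enumerate(build_leaderboard(sample, ["model-a", "model-b", "model-c"]), 1):
        print(rank, row.model, f"{row.avg_score:.1f}",
              f"{row.min_score:.1f} – {row.max_score:.1f}", row.entries)
```

Under these assumptions, a model with no evaluations sorts to the bottom with a zero score, consistent with the note that admitted entrants stay in the table until their first evaluation.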