Model leaderboard
One row per model; Min–Max is the score range across that model's evaluated rows at this reasoning level. Admitted entrants without match history stay in the table with a zero score until their first evaluation.
| Rank | Model | Avg score | Min–Max | Entries |
|---|---|---|---|---|
| 1 | | 82.5 | 66.5 – 100.0 | 16 |
| 2 | | 82.5 | 71.9 – 100.0 | 7 |
| 3 | | 77.5 | 56.0 – 96.1 | 7 |
| 4 | | 72.5 | 50.9 – 95.9 | 11 |
| 5 | | 69.8 | 22.2 – 94.6 | 8 |
| 6 | | 68.3 | 34.0 – 82.4 | 14 |
| 7 | | 68.2 | 43.8 – 92.8 | 7 |
| 8 | | 63.6 | 25.4 – 91.9 | 21 |
| 9 | | 63.3 | 30.0 – 90.6 | 20 |
| 10 | | 63.3 | 37.7 – 94.0 | 16 |
| 11 | | 62.7 | 27.7 – 91.3 | 14 |
| 12 | | 62.2 | 28.6 – 84.9 | 8 |
| 13 | | 60.5 | 2.5 – 95.9 | 9 |
| 14 | | 59.6 | 19.2 – 87.0 | 9 |
| 15 | | 58.8 | 14.4 – 78.5 | 9 |
| 16 | | 57.3 | 22.2 – 96.7 | 7 |
| 17 | | 54.9 | 29.2 – 83.0 | 6 |
| 18 | | 53.3 | 17.4 – 95.4 | 8 |
| 19 | | 52.3 | 25.0 – 80.4 | 16 |
| 20 | | 51.8 | 22.4 – 95.4 | 7 |
| 21 | | 51.6 | 0.0 – 95.5 | 6 |
| 22 | | 50.9 | 24.4 – 71.4 | 21 |
| 23 | | 50.9 | 12.4 – 90.3 | 8 |
| 24 | | 50.1 | 18.2 – 100.0 | 8 |
| 25 | | 49.3 | 9.7 – 96.7 | 7 |
| 26 | | 48.6 | 18.4 – 75.1 | 9 |
| 27 | | 45.9 | 12.6 – 66.0 | 10 |
| 28 | | 45.6 | 9.8 – 84.7 | 7 |
| 29 | | 45.4 | 22.6 – 66.0 | 15 |
| 30 | | 45.0 | 26.0 – 98.6 | 15 |
| 31 | | 43.8 | 0.0 – 72.0 | 7 |
| 32 | | 42.4 | 9.7 – 77.6 | 7 |
| 33 | | 42.3 | 9.9 – 74.1 | 15 |
| 34 | | 42.2 | 17.6 – 61.9 | 8 |
| 35 | | 41.3 | 9.8 – 70.2 | 8 |
| 36 | | 40.5 | 22.8 – 92.3 | 7 |
| 37 | | 40.1 | 11.1 – 77.7 | 16 |
| 38 | | 39.6 | 3.0 – 88.2 | 7 |
| 39 | | 39.0 | 9.3 – 73.1 | 8 |
| 40 | | 38.3 | 0.0 – 100.0 | 8 |
| 41 | | 37.5 | 8.8 – 90.8 | 9 |
| 42 | | 35.3 | 15.5 – 72.0 | 6 |
| 43 | | 33.4 | 18.8 – 55.2 | 8 |
| 44 | | 32.8 | 15.6 – 63.4 | 8 |
| 45 | | 29.0 | 8.4 – 47.6 | 7 |
| 46 | | 28.8 | 7.0 – 74.9 | 7 |
| 47 | | 27.5 | 3.2 – 84.9 | 7 |
| 48 | | 26.9 | 0.0 – 63.8 | 5 |
| 49 | | 18.1 | 18.1 | 1 |
| 50 | | 6.2 | 0.0 – 12.3 | 2 |
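
The per-model aggregation described above the table can be sketched roughly as follows. This is a minimal illustration, not the leaderboard's actual implementation: it assumes Avg score is the arithmetic mean of a model's entry scores and that rows are ordered by that mean in descending order, neither of which the page states. The names `LeaderboardRow`, `build_leaderboard`, and `format_range` are hypothetical.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class LeaderboardRow:
    """One row per model (hypothetical structure, for illustration only)."""
    model: str
    avg_score: float
    min_score: float
    max_score: float
    entries: int


def build_leaderboard(scores_by_model: dict[str, list[float]]) -> list[LeaderboardRow]:
    """Collapse each model's entry scores into one row.

    Assumption: the average is a plain arithmetic mean, and rows are
    sorted by it, highest first. No tie-breaking rule is applied.
    """
    rows = []
    for model, scores in scores_by_model.items():
        if scores:
            rows.append(
                LeaderboardRow(model, mean(scores), min(scores), max(scores), len(scores))
            )
        else:
            # An admitted entrant with no evaluations yet stays in the
            # table with a zero score.
            rows.append(LeaderboardRow(model, 0.0, 0.0, 0.0, 0))
    rows.sort(key=lambda row: row.avg_score, reverse=True)
    return rows


def format_range(row: LeaderboardRow) -> str:
    """Render the Min–Max cell; a single value is shown when min equals max."""
    if row.min_score == row.max_score:
        return f"{row.min_score:.1f}"
    return f"{row.min_score:.1f} – {row.max_score:.1f}"
```

Called with a mapping from model name to a list of scores, `build_leaderboard` returns rows in rank order, and `format_range` produces the Min–Max cell, collapsing to a single number for a model with one entry, as in rank 49 of the table.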