Model set
Which models appear
Filter, toggle checkboxes, or use Top 12 / All / Clear. The same selection updates every chart.
Reasoning view
Applies to Scores, Economics (when available), and Per game below.
Estimated cost versus score will be available once the benchmarks include Game 09 or later.
Costs and scores include only economics-segment games (Game 09+). Estimated cost uses token usage and published list prices at the time of the benchmark release. “All” pools official reasoning runs for those games; Highest / Medium / None use cost from runs at that level only, plotted against mean score on those games at that level. Use the Reasoning view control above to switch modes. Not every model has an estimate; repair attempts can increase totals.
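As a rough illustration of the description above, the following sketch computes one cost-versus-score point for a single model. It is not the site's actual code. The `Run` fields, the function names, the split into input and output token prices, and the choice to sum cost across runs are all assumptions made for this example.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Optional, Tuple

# Assumption: economics-segment games start at Game 09, as stated in the text.
ECONOMICS_FIRST_GAME = 9


@dataclass
class Run:
    """One official benchmark run (hypothetical record layout)."""
    game: int            # game number, e.g. 9 for Game 09
    level: str           # reasoning level: "highest", "medium", or "none"
    input_tokens: int
    output_tokens: int
    score: float


def run_cost(run: Run, price_in: float, price_out: float) -> float:
    """Estimated cost of one run; prices are list prices per million tokens."""
    return (run.input_tokens * price_in + run.output_tokens * price_out) / 1_000_000


def cost_vs_score(
    runs: list, price_in: float, price_out: float, view: str = "all"
) -> Optional[Tuple[float, float]]:
    """Return (total estimated cost, mean score) for the chosen reasoning view.

    "all" pools every reasoning level; any other view keeps only runs at that
    level. Games before the economics segment are always excluded.
    Returns None when no run qualifies (the model has no estimate).
    """
    selected = [
        r for r in runs
        if r.game >= ECONOMICS_FIRST_GAME and (view == "all" or r.level == view)
    ]
    if not selected:
        return None
    total = sum(run_cost(r, price_in, price_out) for r in selected)
    return total, mean(r.score for r in selected)
```

Calling `cost_vs_score(runs, price_in, price_out, view="highest")` would give the point plotted in the Highest view under these assumptions. A `None` result corresponds to a model without an estimate.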