Rankings based on user votes and Elo ratings
| # | Model (3/3) | Elo | Win % | Votes | Record |
|---|---|---|---|---|---|
| 1 | GPT-5 Nano | 1232 | 50.0% | 12 | 6W-5L-1T |
| 2 | Kimi K2 Thinking | 1191 | 47.4% | 19 | 9W-8L-2T |
| 3 | GPT-5 Mini | 1191 | 33.3% | 9 | 3W-4L-2T |
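
The Win % column appears to be wins divided by total votes, with ties counted in the denominator. This is inferred from the rows above, not stated by the leaderboard. A minimal sketch that checks this reading against the Record column, using a hypothetical helper `win_pct`:

```python
def win_pct(record: str) -> float:
    """Parse a record such as '6W-5L-1T' and return wins as a percentage of all votes.

    Assumption: ties count toward the total, which matches the table's figures.
    """
    # Each part looks like '6W': a count followed by a one-letter outcome code.
    counts = {part[-1]: int(part[:-1]) for part in record.split("-")}
    total = counts["W"] + counts["L"] + counts["T"]
    return round(100 * counts["W"] / total, 1)


# Records copied from the table above.
records = {
    "GPT-5 Nano": "6W-5L-1T",
    "Kimi K2 Thinking": "9W-8L-2T",
    "GPT-5 Mini": "3W-4L-2T",
}

for model, record in records.items():
    print(f"{model}: {win_pct(record)}%")
```

Under this assumption the computed values agree with the listed 50.0%, 47.4%, and 33.3%.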
Rankings by category
| Model (3/3) | Overall | Long Prompt | Hard Prompt | Math | Instruction-Following | Coding |
|---|---|---|---|---|---|---|
| GPT-5 Nano | 1 | — | 3 | — | — | — |
| Kimi K2 Thinking | 2 | — | 2 | 1 | — | — |
| GPT-5 Mini | 3 | — | 1 | — | — | — |