Model Rankings
This document was translated from Chinese by AI and has not yet been reviewed.
This is a leaderboard generated through an automated process based on Chatbot Arena (lmarena.ai) data.
Data Update Time: 2025-07-09 11:44:37 UTC / 2025-07-09 19:44:37 CST (Beijing Time)
Leaderboard
Explanation
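Rank (UB): Upper-bound ranking derived from Bradley-Terry model scores. A model's Rank (UB) is one plus the number of models that are statistically better than it, where a model counts as statistically better when the lower bound of its 95% confidence interval exceeds the other model's upper bound. It is therefore an optimistic estimate of the model's rank given the uncertainty in the scores, not an upper bound on the score itself.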
Rank (StyleCtrl): Ranking after controlling for conversational style. This ranking aims to reduce preference bias caused by how a response is presented (e.g., response length and Markdown formatting) rather than what it says, so that it reflects the model's core capabilities more closely.
Model Name: The name of the Large Language Model (LLM). Each name links to the corresponding model page.
Score: The Elo rating obtained by the model through user votes in the arena. Elo rating is a relative ranking system, where a higher score indicates better model performance. This score is dynamic and reflects the model's relative strength in the current competitive environment.
Confidence Interval: The 95% confidence interval of the model's Elo rating (e.g., +6/-6). A smaller interval indicates a more stable and reliable rating; conversely, a larger interval may suggest insufficient data or significant fluctuations in model performance. It provides a quantitative assessment of rating accuracy.
Votes: The total number of votes received by the model in the arena. More votes generally imply higher statistical reliability of its rating.
Provider: The organization or company providing the model.
License: The type of license for the model, such as Proprietary, Apache 2.0, MIT, etc.
Knowledge Cutoff Date: The knowledge cutoff date of the model's training data. No Data indicates that relevant information is not provided or unknown.
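To make the Score, Confidence Interval, and Rank (UB) columns more concrete, here is a minimal Python sketch. It rests on two assumptions that are not stated on this page: that scores follow the conventional Elo scale (base 10, divisor 400), and that Rank (UB) is one plus the number of models whose lower confidence bound exceeds the target model's upper bound, as Chatbot Arena describes it. The `Entry` class, the function names, and the sample rows are all invented for illustration; they are not real leaderboard data or part of any project's API.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    name: str        # hypothetical model name
    score: float     # arena score on the Elo scale
    ci_plus: float   # upper half-width of the 95% interval (the "+6" in "+6/-6")
    ci_minus: float  # lower half-width of the 95% interval (the "-6" in "+6/-6")


def win_probability(score_a: float, score_b: float) -> float:
    """Expected probability that A is preferred over B (conventional Elo formula)."""
    return 1.0 / (1.0 + 10 ** ((score_b - score_a) / 400.0))


def rank_ub(entries: list[Entry]) -> dict[str, int]:
    """Rank (UB): 1 + number of models whose lower bound exceeds this model's upper bound."""
    ranks = {}
    for e in entries:
        upper = e.score + e.ci_plus
        better = sum(1 for o in entries if o.score - o.ci_minus > upper)
        ranks[e.name] = 1 + better
    return ranks


# Invented sample rows, for illustration only.
entries = [
    Entry("model-a", 1300, 6, 6),
    Entry("model-b", 1295, 8, 8),
    Entry("model-c", 1250, 5, 5),
]

ranks = rank_ub(entries)
print(ranks)  # {'model-a': 1, 'model-b': 1, 'model-c': 3}
print(round(win_probability(1300, 1250), 3))  # 0.571
```

In this sample, model-a and model-b share Rank (UB) 1 because their intervals overlap, while model-c drops to 3 because both other models are statistically better. A 50-point score gap corresponds to only about a 57% expected preference rate, which is why small score differences should be read alongside the confidence interval.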
Data Source and Update Frequency
This leaderboard data is automatically generated and provided by the fboulnois/llm-leaderboard-csv project, which obtains and processes data from lmarena.ai. This leaderboard is automatically updated daily by GitHub Actions.
Disclaimer
This report is for reference only. The leaderboard data is dynamic and based on user preference votes on Chatbot Arena within a specific period. The completeness and accuracy of the data depend on the upstream data source and the updates and processing of the fboulnois/llm-leaderboard-csv project. Different models may adopt different license agreements; please refer to the official documentation of the model provider when using them.