Model Rankings
This document was translated from Chinese by AI and has not yet been reviewed.
This is a leaderboard based on Chatbot Arena (lmarena.ai) data, generated through an automated process.
Data Update Time: 2025-10-29 08:08:13 UTC / 2025-10-29 16:08:13 CST (Beijing Time)
Leaderboard
Explanation
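Rank (UB): The model's upper-bound rank, derived from its Bradley-Terry score and confidence interval. On Chatbot Arena this is 1 plus the number of models that are statistically better, where a model counts as statistically better when the lower bound of its 95% confidence interval lies above this model's upper bound. It is therefore the best rank the model could plausibly hold given the uncertainty in its score, and several models can share the same rank.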
Model: The name of the Large Language Model (LLM). Some model names may have embedded links.
Score: The Elo rating obtained by the model through user votes in the arena. Elo rating is a relative ranking system, where a higher score indicates better model performance.
95% Confidence Interval (±): The 95% confidence interval of the model's Elo rating (e.g., ±6). A smaller interval indicates a more stable and reliable model rating.
Votes: The total number of votes received by the model in the arena. More votes generally mean higher statistical reliability of its rating.
Organization/Company: The organization or company that provides the model.
License: The type of license agreement for the model, such as Proprietary, Apache 2.0, MIT, etc.
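To make the Score, confidence interval, and Rank (UB) columns concrete, here is a minimal Python sketch. It assumes the standard Elo expectation formula on a 400-point scale, a symmetric ± interval, and the upper-bound rank rule "1 + number of models whose lower bound exceeds this model's upper bound". The `Entry` class, the function names, and the sample numbers are all hypothetical and are not taken from the leaderboard or its generation script.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    name: str
    score: float  # arena score on the Elo scale
    ci: float     # half-width of the 95% confidence interval (the ± value)


def expected_win_probability(score_a: float, score_b: float) -> float:
    """Standard Elo expectation that model A is preferred over model B."""
    return 1.0 / (1.0 + 10 ** ((score_b - score_a) / 400.0))


def rank_ub(entries: list[Entry]) -> dict[str, int]:
    """Upper-bound rank: 1 + count of models that are statistically better.

    A model is treated as statistically better when its lower bound
    (score - ci) is strictly above the target's upper bound (score + ci).
    """
    ranks: dict[str, int] = {}
    for target in entries:
        upper = target.score + target.ci
        better = sum(1 for other in entries if other.score - other.ci > upper)
        ranks[target.name] = 1 + better
    return ranks


# Made-up numbers, for illustration only.
board = [
    Entry("model-a", 1450, 6),
    Entry("model-b", 1441, 5),
    Entry("model-c", 1420, 8),
]

print(rank_ub(board))
print(round(expected_win_probability(1450, 1420), 3))
```

In this sample, model-a and model-b both receive rank 1 because their intervals overlap, while model-c is ranked 3 because both other models are clear of its upper bound. This is why tied ranks are common near the top of the board, and why a 30-point score gap corresponds to only a modest preference rate under the Elo formula.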
Data Source and Update Frequency
The leaderboard data is fetched directly from the official website (lmarena.ai) by an automated script and is refreshed automatically every day by GitHub Actions.
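As an illustration of one step such a script might perform, the sketch below renders a single timestamp in both UTC and Beijing time, matching the "Data Update Time" line near the top of this page. The function name `format_update_time` is hypothetical, and the actual update script may work differently.

```python
from datetime import datetime, timedelta, timezone

# Beijing time is UTC+8 all year round (no daylight saving time),
# so a fixed offset is sufficient here.
BEIJING = timezone(timedelta(hours=8))


def format_update_time(moment: datetime) -> str:
    """Format one timezone-aware timestamp in UTC and Beijing time."""
    fmt = "%Y-%m-%d %H:%M:%S"
    utc = moment.astimezone(timezone.utc)
    cst = utc.astimezone(BEIJING)
    return (
        f"Data Update Time: {utc.strftime(fmt)} UTC / "
        f"{cst.strftime(fmt)} CST (Beijing Time)"
    )


# In a scheduled job this would typically be datetime.now(timezone.utc).
stamp = datetime(2025, 10, 29, 8, 8, 13, tzinfo=timezone.utc)
print(format_update_time(stamp))
```

Working from a timezone-aware UTC value and converting only for display keeps the two printed times consistent with each other regardless of where the job runs.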
Disclaimer
This report is for informational purposes only. Leaderboard data is dynamic and based on user preference votes on Chatbot Arena over a specific period. The completeness and accuracy of the data depend on the upstream data source. Different models may adopt different license agreements; please refer to the official documentation of the model provider when using them.