Rankings last updated: 2025-01-19
| Rank (UB) | Rank (StyleCtrl) | Model | Score | Confidence interval | Votes | Organization |
|---|---|---|---|---|---|---|
| 1 | 1 | Gemini-Exp-1206 | 1374 | +5/-4 | 20,845 | |
| 2 | 1 | ChatGPT-4o-latest (2024-11-20) | 1365 | +4/-4 | 34,030 | OpenAI |
| 2 | 4 | Gemini-2.0-Flash-Thinking-Exp-1219 | 1364 | +5/-5 | 16,357 | |
| 2 | 4 | Gemini-2.0-Flash-Exp | 1357 | +5/-5 | 19,640 | |
| 4 | 1 | o1-2024-12-17 | 1352 | +7/-7 | 7,957 | OpenAI |
| 6 | 4 | o1-preview | 1335 | +5/-3 | 33,197 | OpenAI |
| 7 | 7 | DeepSeek-V3 | 1320 | +5/-6 | 11,591 | DeepSeek |
| 7 | 10 | Step-2-16K-Exp | 1306 | +8/-8 | 4,028 | StepFun |
| 8 | 11 | o1-mini | 1305 | +4/-3 | 48,654 | OpenAI |
| 8 | 8 | Gemini-1.5-Pro-002 | 1303 | +3/-3 | 45,203 | |
| 11 | 13 | Grok-2-08-13 | 1288 | +4/-3 | 66,281 | xAI |
| 11 | 15 | Yi-Lightning | 1287 | +5/-3 | 28,959 | 01 AI |
| 11 | 10 | GPT-4o-2024-05-13 | 1285 | +3/-2 | 117,760 | OpenAI |
| 11 | 7 | Claude 3.5 Sonnet (20241022) | 1284 | +3/-3 | 47,437 | Anthropic |
| 11 | 22 | Qwen2.5-plus-1127 | 1282 | +7/-6 | 7,680 | Alibaba |
| 11 | 18 | Deepseek-v2.5-1210 | 1279 | +7/-5 | 7,261 | DeepSeek |
| 14 | 22 | Athene-v2-Chat-72B | 1277 | +5/-4 | 21,014 | NexusFlow |
| 15 | 21 | GLM-4-Plus | 1274 | +5/-3 | 27,773 | Zhipu AI |
| 15 | 21 | GPT-4o-mini-2024-07-18 | 1273 | +3/-2 | 60,551 | OpenAI |
| 16 | 24 | Gemini-1.5-Flash-002 | 1271 | +3/-3 | 34,540 | |
| 16 | 35 | Llama-3.1-Nemotron-70B-Instruct | 1269 | +6/-6 | 7,596 | Nvidia |
| 18 | 11 | Meta-Llama-3.1-405B-Instruct-bf16 | 1268 | +4/-3 | 21,285 | Meta |
| 20 | 10 | Claude 3.5 Sonnet (20240620) | 1268 | +2/-3 | 86,177 | Anthropic |
| 20 | 12 | Meta-Llama-3.1-405B-Instruct-fp8 | 1267 | +3/-3 | 63,202 | Meta |
| 20 | 11 | Gemini Advanced App (2024-05-14) | 1266 | +3/-3 | 52,148 | |
| 20 | 31 | Grok-2-Mini-08-13 | 1266 | +3/-3 | 54,893 | xAI |
| 20 | 12 | GPT-4o-2024-08-06 | 1265 | +3/-2 | 47,981 | OpenAI |
| 21 | 23 | Qwen-Max-0919 | 1263 | +4/-4 | 17,436 | Alibaba |
| 28 | 19 | Gemini-1.5-Pro-001 | 1260 | +2/-2 | 82,432 | |
| 28 | 28 | Deepseek-v2.5 | 1258 | +3/-4 | 26,353 | DeepSeek |
| 28 | 33 | Qwen2.5-72B-Instruct | 1257 | +3/-4 | 39,984 | Alibaba |
| 28 | 20 | Llama-3.3-70B-Instruct | 1257 | +5/-4 | 15,516 | Meta |
| 29 | 18 | GPT-4-Turbo-2024-04-09 | 1256 | +2/-2 | 102,126 | OpenAI |
| 30 | 23 | Mistral-Large-2407 | 1252 | +3/-3 | 48,207 | Mistral |
| 30 | 30 | Athene-70B | 1250 | +5/-5 | 20,618 | NexusFlow |
| 30 | 38 | Llama-3.1-Tulu-3-70B | 1244 | +11/-9 | 3,026 | Ai2 |
| 33 | 22 | GPT-4-1106-preview | 1250 | +3/-2 | 103,732 | OpenAI |
| 34 | 39 | Meta-Llama-3.1-70B-Instruct | 1248 | +3/-3 | 58,786 | Meta |
| 34 | 20 | Claude 3 Opus | 1247 | +2/-2 | 202,735 | Anthropic |
| 34 | 39 | Amazon Nova Pro 1.0 | 1244 | +5/-5 | 13,076 | Amazon |
| 36 | 24 | GPT-4-0125-preview | 1245 | +3/-2 | 97,070 | OpenAI |
| 36 | 37 | Mistral-Large-2411 | 1243 | +5/-5 | 10,634 | Mistral |
| 39 | 20 | Claude 3.5 Haiku (20241022) | 1237 | +7/-4 | 10,773 | Anthropic |
| 40 | 39 | Reka-Core-20240904 | 1235 | +6/-6 | 7,937 | Reka AI |
| 44 | 42 | Gemini-1.5-Flash-001 | 1227 | +3/-2 | 65,650 | |
| 45 | 40 | Jamba-1.5-Large | 1221 | +5/-5 | 9,125 | AI21 Labs |
| 46 | 41 | Gemma-2-27B-it | 1220 | +3/-3 | 72,425 | |
| 46 | 55 | Amazon Nova Lite 1.0 | 1218 | +6/-5 | 10,868 | Amazon |
| 46 | 48 | Qwen2.5-Coder-32B-Instruct | 1217 | +7/-6 | 5,731 | Alibaba |
| 46 | 43 | Gemma-2-9B-it-SimPO | 1216 | +5/-6 | 10,557 | Princeton |
| 46 | 44 | Command R+ (08-2024) | 1215 | +5/-5 | 10,546 | Cohere |
| 46 | 39 | Llama-3.1-Nemotron-51B-Instruct | 1211 | +7/-8 | 3,897 | Nvidia |
| 46 | 54 | Phi-4 | 1210 | +9/-10 | 2,485 | Microsoft |
| 47 | 56 | Gemini-1.5-Flash-8B-001 | 1212 | +4/-4 | 36,162 | |
| 48 | 55 | Aya-Expanse-32B | 1209 | +4/-4 | 26,994 | Cohere |
| 48 | 52 | GLM-4-0520 | 1207 | +8/-5 | 10,214 | Zhipu AI |
| 49 | 46 | Nemotron-4-340B-Instruct | 1209 | +4/-3 | 20,609 | Nvidia |
| 49 | 47 | Reka-Flash-20240904 | 1205 | +7/-5 | 8,130 | Reka AI |
| 52 | 48 | Llama-3-70B-Instruct | 1206 | +2/-3 | 163,797 | Meta |
| 57 | 46 | Claude 3 Sonnet | 1201 | +2/-2 | 113,033 | Anthropic |
| 57 | 69 | Amazon Nova Micro 1.0 | 1197 | +5/-6 | 10,956 | Amazon |
| 61 | 57 | Gemma-2-9B-it | 1191 | +3/-3 | 50,239 | |
| 61 | 56 | Command R+ (04-2024) | 1190 | +2/-2 | 80,868 | Cohere |
| 61 | 69 | Hunyuan-Standard-256K | 1189 | +8/-9 | 2,899 | Tencent |
| 61 | 70 | Llama-3.1-Tulu-3-8B | 1185 | +11/-10 | 3,077 | Ai2 |
| 62 | 56 | Qwen2-72B-Instruct | 1187 | +3/-3 | 38,889 | Alibaba |
| 62 | 44 | GPT-4-0314 | 1186 | +3/-2 | 55,965 | OpenAI |
| 62 | 69 | Ministral-8B-2410 | 1182 | +7/-7 | 5,109 | Mistral |
| 64 | 70 | Aya-Expanse-8B | 1181 | +6/-6 | 8,797 | Cohere |
| 64 | 56 | Command R (08-2024) | 1180 | +5/-5 | 10,846 | Cohere |
| 66 | 60 | Claude 3 Haiku | 1179 | +2/-2 | 122,291 | Anthropic |
| 66 | 56 | DeepSeek-Coder-V2-Instruct | 1178 | +5/-5 | 15,752 | DeepSeek AI |
| 66 | 69 | Jamba-1.5-Mini | 1176 | +5/-6 | 9,269 | AI21 Labs |
| 67 | 84 | Meta-Llama-3.1-8B-Instruct | 1176 | +3/-3 | 52,646 | Meta |
| 75 | 55 | GPT-4-0613 | 1163 | +2/-3 | 91,642 | OpenAI |
| 75 | 69 | Qwen1.5-110B-Chat | 1161 | +4/-4 | 27,467 | Alibaba |
| 75 | 85 | Yi-1.5-34B-Chat | 1157 | +4/-4 | 25,126 | 01 AI |
| 75 | 69 | Mistral-Large-2402 | 1157 | +3/-3 | 64,912 | Mistral |
| 75 | 70 | Reka-Flash-21B-online | 1156 | +5/-4 | 16,024 | Reka AI |
| 75 | 103 | QwQ-32B-Preview | 1153 | +11/-11 | 3,415 | Alibaba |
| 78 | 78 | Llama-3-8B-Instruct | 1152 | +2/-2 | 109,211 | Meta |
| 78 | 90 | InternLM2.5-20B-chat | 1149 | +5/-5 | 10,599 | InternLM |
| 80 | 74 | Command R (04-2024) | 1149 | +2/-3 | 56,380 | Cohere |
| 80 | 78 | Mistral Medium | 1148 | +3/-3 | 35,554 | Mistral |
| 80 | 78 | Reka-Flash-21B | 1148 | +4/-3 | 25,803 | Reka AI |
| 80 | 72 | Mixtral-8x22b-Instruct-v0.1 | 1148 | +2/-3 | 53,788 | Mistral |
| 80 | 72 | Qwen1.5-72B-Chat | 1147 | +3/-3 | 40,638 | Alibaba |
| 80 | 81 | Granite-3.1-8B-Instruct | 1139 | +13/-12 | 2,468 | IBM |
| 82 | 91 | Gemma-2-2b-it | 1142 | +3/-4 | 41,868 | |
| 89 | 72 | Gemini-1.0-Pro-001 | 1131 | +4/-4 | 18,793 | |
| 89 | 82 | Zephyr-ORPO-141b-A35b-v0.1 | 1127 | +10/-8 | 4,860 | HuggingFace |
| 89 | 86 | Qwen1.5-32B-Chat | 1125 | +4/-4 | 22,772 | Alibaba |
| 89 | 93 | Granite-3.1-2B-Instruct | 1117 | +11/-11 | 2,476 | IBM |
| 90 | 90 | Phi-3-Medium-4k-Instruct | 1123 | +4/-3 | 26,112 | Microsoft |
| 91 | 103 | Starling-LM-7B-beta | 1119 | +5/-5 | 16,671 | Nexusflow |
| 94 | 93 | Mixtral-8x7B-Instruct-v0.1 | 1114 | +0/-0 | 76,138 | Mistral |
| 94 | 97 | Yi-34B-Chat | 1111 | +5/-4 | 15,922 | 01 AI |
| 94 | 82 | Gemini Pro | 1110 | +7/-8 | 6,559 | |
| 96 | 96 | Qwen1.5-14B-Chat | 1109 | +3/-5 | 18,678 | Alibaba |
| 96 | 95 | WizardLM-70B-v1.0 | 1106 | +7/-6 | 8,380 | Microsoft |
| 96 | 82 | GPT-3.5-Turbo-0125 | 1106 | +3/-3 | 68,873 | OpenAI |
| 96 | 101 | Meta-Llama-3.2-3B-Instruct | 1103 | +8/-7 | 8,411 | Meta |
| 97 | 92 | DBRX-Instruct-Preview | 1103 | +4/-3 | 33,737 | Databricks |
| 97 | 99 | Phi-3-Small-8k-Instruct | 1102 | +5/-6 | 18,477 | Microsoft |
| 98 | 102 | Tulu-2-DPO-70B | 1099 | +6/-6 | 6,663 | AllenAI/UW |
| 103 | 92 | Granite-3.0-8B-Instruct | 1093 | +6/-6 | 7,005 | IBM |
| 103 | 97 | OpenChat-3.5-0106 | 1091 | +5/-5 | 12,980 | OpenChat |
| 104 | 112 | Llama-2-70B-chat | 1093 | +3/-3 | 39,634 | Meta |
| 105 | 104 | Vicuna-33B | 1091 | +4/-4 | 22,950 | LMSYS |
| 105 | 107 | Starling-LM-7B-alpha | 1088 | +7/-4 | 10,416 | UC Berkeley |
| 105 | 116 | Nous-Hermes-2-Mixtral-8x7B-DPO | 1084 | +11/-10 | 3,835 | NousResearch |
| 106 | 97 | Snowflake Arctic Instruct | 1090 | +3/-3 | 34,172 | Snowflake |
| 106 | 112 | NV-Llama2-70B-SteerLM-Chat | 1081 | +10/-8 | 3,637 | Nvidia |
| 107 | 99 | Gemma-1.1-7B-it | 1084 | +4/-4 | 25,065 | |
| 111 | 101 | DeepSeek-LLM-67B-Chat | 1077 | +7/-7 | 4,987 | DeepSeek AI |
| 112 | 100 | OpenChat-3.5 | 1076 | +5/-8 | 8,112 | OpenChat |
| 112 | 100 | OpenHermes-2.5-Mistral-7B | 1074 | +8/-6 | 5,089 | NousResearch |
| 112 | 107 | Granite-3.0-2B-Instruct | 1074 | +7/-8 | 7,193 | IBM |
| 113 | 118 | Mistral-7B-Instruct-v0.2 | 1072 | +4/-4 | 20,058 | Mistral |
| 113 | 118 | Phi-3-Mini-4K-Instruct-June-24 | 1071 | +4/-4 | 12,818 | Microsoft |
| 113 | 118 | Qwen1.5-7B-Chat | 1070 | +6/-9 | 4,868 | Alibaba |
| 114 | 93 | GPT-3.5-Turbo-1106 | 1068 | +5/-5 | 17,032 | OpenAI |
| 115 | 122 | Phi-3-Mini-4k-Instruct | 1066 | +4/-4 | 21,085 | Microsoft |
| 115 | 116 | Dolphin-2.2.1-Mistral-7B | 1062 | +10/-13 | 1,713 | Cognitive Computations |
| 115 | 117 | SOLAR-10.7B-Instruct-v1.0 | 1062 | +10/-10 | 4,288 | Upstage AI |
| 119 | 123 | Llama-2-13b-chat | 1063 | +5/-5 | 19,738 | Meta |
| 121 | 118 | WizardLM-13b-v1.2 | 1059 | +7/-7 | 7,178 | Microsoft |
| 123 | 127 | CodeLlama-70B-instruct | 1041 | +21/-17 | 1,194 | Meta |
| 124 | 128 | Meta-Llama-3.2-1B-Instruct | 1054 | +6/-6 | 8,535 | Meta |
| 124 | 125 | Zephyr-7B-beta | 1053 | +6/-6 | 11,334 | HuggingFace |
| 124 | 118 | SmolLM2-1.7B-Instruct | 1047 | +13/-13 | 2,371 | HuggingFace |
| 124 | 118 | MPT-30B-chat | 1046 | +12/-10 | 2,648 | MosaicML |
| 125 | 121 | Zephyr-7B-alpha | 1042 | +11/-15 | 1,814 | HuggingFace |
| 127 | 126 | CodeLlama-34B-instruct | 1043 | +7/-5 | 7,515 | Meta |
| 127 | 118 | falcon-180b-chat | 1034 | +18/-17 | 1,327 | TII |
| 130 | 118 | Vicuna-13B | 1042 | +5/-5 | 19,790 | LMSYS |
| 130 | 126 | Gemma-7B-it | 1037 | +6/-5 | 9,176 | |
| 130 | 127 | Phi-3-Mini-128k-Instruct | 1037 | +4/-4 | 21,632 | Microsoft |
| 130 | 141 | Llama-2-7B-chat | 1037 | +6/-5 | 14,555 | Meta |
| 130 | 118 | Qwen-14B-Chat | 1035 | +7/-7 | 5,070 | Alibaba |
| 130 | 128 | Guanaco-33B | 1033 | +10/-12 | 2,999 | UW |
| 139 | 132 | Gemma-1.1-2b-it | 1021 | +6/-5 | 11,348 | |
| 139 | 135 | StripedHyena-Nous-7B | 1018 | +9/-7 | 5,273 | Together AI |
| 140 | 148 | OLMo-7B-instruct | 1016 | +6/-7 | 6,504 | Allen AI |
| 143 | 140 | Mistral-7B-Instruct-v0.1 | 1008 | +6/-6 | 9,144 | Mistral |
| 143 | 142 | Vicuna-7B | 1005 | +8/-7 | 7,015 | LMSYS |
| 143 | 129 | PaLM-Chat-Bison-001 | 1004 | +8/-5 | 8,744 | |
| 148 | 146 | Gemma-2B-it | 989 | +7/-9 | 4,922 | |
| 148 | 145 | Qwen1.5-4B-Chat | 988 | +6/-6 | 7,813 | Alibaba |
| 150 | 149 | Koala-13B | 964 | +8/-6 | 7,034 | UC Berkeley |
| 150 | 150 | ChatGLM3-6B | 955 | +8/-8 | 4,765 | Tsinghua |
| 152 | 149 | GPT4All-13B-Snoozy | 932 | +13/-15 | 1,786 | Nomic AI |
| 152 | 150 | MPT-7B-Chat | 928 | +11/-8 | 4,012 | MosaicML |
| 152 | 155 | ChatGLM2-6B | 924 | +13/-10 | 2,707 | Tsinghua |
| 152 | 155 | RWKV-4-Raven-14B | 922 | +9/-8 | 4,934 | RWKV |
| 156 | 150 | Alpaca-13B | 902 | +9/-9 | 5,876 | Stanford |
| 156 | 156 | OpenAssistant-Pythia-12B | 893 | +7/-8 | 6,380 | OpenAssistant |
| 157 | 158 | ChatGLM-6B | 879 | +10/-9 | 4,988 | Tsinghua |
| 158 | 158 | FastChat-T5-3B | 868 | +9/-8 | 4,302 | LMSYS |
| 160 | 160 | StableLM-Tuned-Alpha-7B | 840 | +11/-11 | 3,341 | Stability AI |
| 160 | 158 | Dolly-V2-12B | 822 | +11/-12 | 3,485 | Databricks |
| 161 | 160 | LLaMA-13B | 800 | +12/-14 | 2,444 | Meta |
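Rank (UB): upper-bound rank based on the Bradley-Terry model
Rank (StyleCtrl): rank under style control, which accounts for response style
Confidence interval: the confidence interval for the model's performance, shown as +upper/-lower around the score
Score: Arena score based on model performance
Data source: lmarena.ai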
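
The notes above do not say how the upper-bound rank is derived from the scores and intervals. One plausible reading, assumed here rather than taken from this page, is that a model's Rank (UB) is one plus the number of models whose interval lower bound lies strictly above that model's interval upper bound. The sketch below applies that rule to the first four leaderboard rows; the `Row` type and the `rank_ub` function are hypothetical names introduced only for illustration.

```python
from typing import NamedTuple


class Row(NamedTuple):
    model: str
    score: int
    plus: int   # the "+" part of the confidence interval
    minus: int  # the "-" part of the confidence interval


def rank_ub(rows: list[Row]) -> dict[str, int]:
    """Assumed rule: 1 + number of models whose lower bound
    is strictly above the target model's upper bound."""
    ranks = {}
    for target in rows:
        upper = target.score + target.plus
        better = sum(1 for other in rows if other.score - other.minus > upper)
        ranks[target.model] = 1 + better
    return ranks


# First four rows of the leaderboard (score, +, -).
rows = [
    Row("Gemini-Exp-1206", 1374, 5, 4),
    Row("ChatGPT-4o-latest (2024-11-20)", 1365, 4, 4),
    Row("Gemini-2.0-Flash-Thinking-Exp-1219", 1364, 5, 5),
    Row("Gemini-2.0-Flash-Exp", 1357, 5, 5),
]

for model, rank in rank_ub(rows).items():
    print(rank, model)
```

Under this assumption the four rows come out as ranks 1, 2, 2, 2, which agrees with the leaderboard. Because the published scores and intervals are rounded to integers, rows further down may not reproduce exactly when a lower and an upper bound coincide after rounding.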