lmarena.ai Leaderboard

Rankings updated: 2025-01-23

Rank (UB)

Rank (StyleCtrl)

Model

Score

Confidence interval

Votes

Provider

Gemini-2.0-Flash-Thinking-Exp-01-21

1382

+8/-6

6,437

Google

Gemini-Exp-1206

1374

+5/-4

22,116

Google

ChatGPT-4o-latest (2024-11-20)

1365

+4/-4

35,328

OpenAI

DeepSeek-R1

1357

+12/-13

1,883

DeepSeek

Gemini-2.0-Flash-Exp

1356

+4/-4

20,939

Google

o1-2024-12-17

1352

+6/-6

9,230

OpenAI

o1-preview

1335

+3/-3

33,186

OpenAI

DeepSeek-V3

1317

+6/-5

13,640

DeepSeek

Step-2-16K-Exp

1305

+9/-7

4,533

StepFun

o1-mini

1305

+2/-3

49,952

OpenAI

Gemini-1.5-Pro-002

1302

+3/-4

46,621

Google

Grok-2-08-13

1288

+3/-3

67,150

xAI

Yi-Lightning

1287

+3/-4

28,955

01 AI

GPT-4o-2024-05-13

1285

+2/-2

117,745

OpenAI

Claude 3.5 Sonnet (20241022)

1283

+3/-3

48,847

Anthropic

Qwen2.5-plus-1127

1283

+5/-7

9,050

Alibaba

Deepseek-v2.5-1210

1279

+5/-6

7,261

DeepSeek

Athene-v2-Chat-72B

1276

+4/-5

22,355

NexusFlow

GLM-4-Plus

1274

+4/-4

27,771

Zhipu AI

GPT-4o-mini-2024-07-18

1273

+3/-3

61,233

OpenAI

Gemini-1.5-Flash-002

1271

+4/-3

35,199

Google

Llama-3.1-Nemotron-70B-Instruct

1269

+6/-5

7,598

Nvidia

Meta-Llama-3.1-405B-Instruct-bf16

1268

+4/-4

22,703

Meta

Claude 3.5 Sonnet (20240620)

1268

+2/-3

86,167

Anthropic

Meta-Llama-3.1-405B-Instruct-fp8

1267

+3/-3

63,187

Meta

Gemini Advanced App (2024-05-14)

1267

+3/-2

52,145

Google

Grok-2-Mini-08-13

1266

+3/-3

55,507

xAI

GPT-4o-2024-08-06

1265

+3/-3

47,975

OpenAI

Qwen-Max-0919

1263

+6/-5

17,434

Alibaba

Gemini-1.5-Pro-001

1260

+3/-2

82,433

Google

Deepseek-v2.5

1258

+4/-4

26,345

DeepSeek

Qwen2.5-72B-Instruct

1257

+4/-3

40,664

Alibaba

GPT-4-Turbo-2024-04-09

1256

+2/-2

102,125

OpenAI

Llama-3.3-70B-Instruct

1256

+4/-6

16,905

Meta

Mistral-Large-2407

1251

+3/-3

48,205

Mistral

Athene-70B

1250

+3/-5

20,609

NexusFlow

GPT-4-1106-preview

1250

+2/-2

103,732

OpenAI

Meta-Llama-3.1-70B-Instruct

1248

+3/-3

58,785

Meta

Llama-3.1-Tulu-3-70B

1244

+9/-10

3,031

Ai2

Claude 3 Opus

1247

+2/-2

202,713

Anthropic

Amazon Nova Pro 1.0

1243

+6/-4

13,738

Amazon

Mistral-Large-2411

1244

+5/-5

12,008

Mistral

GPT-4-0125-preview

1245

+2/-2

97,064

OpenAI

Claude 3.5 Haiku (20241022)

1239

+6/-7

12,181

Anthropic

Reka-Core-20240904

1235

+6/-5

7,942

Reka AI

Gemini-1.5-Flash-001

1227

+3/-2

65,656

Google

Jamba-1.5-Large

1221

+5/-5

9,125

AI21 Labs

Qwen2.5-Coder-32B-Instruct

1217

+8/-8

5,730

Alibaba

Gemma-2-27B-it

1220

+3/-3

73,168

Google

Amazon Nova Lite 1.0

1219

+5/-6

11,563

Amazon

Command R+ (08-2024)

1215

+5/-5

10,541

Cohere

Gemma-2-9B-it-SimPO

1216

+5/-6

10,551

Princeton

Phi-4

1211

+7/-10

4,039

Microsoft

Llama-3.1-Nemotron-51B-Instruct

1211

+8/-10

3,898

Nvidia

Gemini-1.5-Flash-8B-001

1212

+4/-3

36,856

Google

Nemotron-4-340B-Instruct

1209

+4/-3

20,604

Nvidia

Aya-Expanse-32B

1209

+4/-4

27,764

Cohere

GLM-4-0520

1207

+5/-5

10,212

Zhipu AI

Reka-Flash-20240904

1205

+6/-7

8,131

Reka AI

Llama-3-70B-Instruct

1206

+2/-2

163,809

Meta

Claude 3 Sonnet

1201

+2/-3

113,011

Anthropic

Amazon Nova Micro 1.0

1197

+5/-5

11,643

Amazon

Hunyuan-Standard-256K

1189

+10/-11

2,900

Tencent

Gemma-2-9B-it

1191

+4/-3

50,951

Google

Command R+ (04-2024)

1190

+3/-3

80,852

Cohere

Llama-3.1-Tulu-3-8B

1185

+9/-8

3,079

Ai2

Qwen2-72B-Instruct

1187

+3/-3

38,880

Alibaba

GPT-4-0314

1186

+4/-3

55,956

OpenAI

Ministral-8B-2410

1182

+9/-6

5,114

Mistral

Command R (08-2024)

1180

+5/-5

10,846

Cohere

Aya-Expanse-8B

1179

+7/-8

9,571

Cohere

Claude 3 Haiku

1179

+2/-2

122,299

Anthropic

DeepSeek-Coder-V2-Instruct

1178

+4/-5

15,754

DeepSeek AI

Jamba-1.5-Mini

1176

+6/-6

9,270

AI21 Labs

Meta-Llama-3.1-8B-Instruct

1176

+3/-2

52,651

Meta

GPT-4-0613

1163

+3/-3

91,646

OpenAI

Qwen1.5-110B-Chat

1161

+4/-3

27,466

Alibaba

Yi-1.5-34B-Chat

1157

+4/-4

25,126

01 AI

103

QwQ-32B-Preview

1153

+11/-11

3,412

Alibaba

Mistral-Large-2402

1157

+2/-3

64,925

Mistral

Reka-Flash-21B-online

1156

+4/-4

16,034

Reka AI

InternLM2.5-20B-chat

1149

+6/-6

10,595

InternLM

Llama-3-8B-Instruct

1152

+2/-2

109,237

Meta

Granite-3.1-8B-Instruct

1142

+10/-12

3,064

IBM

Command R (04-2024)

1149

+3/-3

56,375

Cohere

Mistral Medium

1148

+4/-3

35,554

Mistral

Reka-Flash-21B

1148

+3/-4

25,798

Reka AI

Mixtral-8x22b-Instruct-v0.1

1148

+3/-3

53,787

Mistral

Qwen1.5-72B-Chat

1147

+3/-3

40,630

Alibaba

Gemma-2-2b-it

1143

+3/-3

42,631

Google

Gemini-1.0-Pro-001

1131

+4/-4

18,792

Google

Zephyr-ORPO-141b-A35b-v0.1

1127

+6/-7

4,863

HuggingFace

Qwen1.5-32B-Chat

1125

+5/-4

22,759

Alibaba

Granite-3.1-2B-Instruct

1118

+10/-10

3,127

IBM

Phi-3-Medium-4k-Instruct

1123

+3/-4

26,110

Microsoft

105

Starling-LM-7B-beta

1119

+4/-5

16,672

Nexusflow

Mixtral-8x7B-Instruct-v0.1

1114

+0/-0

76,133

Mistral

Yi-34B-Chat

1111

+5/-4

15,920

01 AI

Gemini Pro

1110

+7/-8

6,559

Google

Qwen1.5-14B-Chat

1109

+5/-4

18,675

Alibaba

WizardLM-70B-v1.0

1106

+6/-6

8,379

Microsoft

GPT-3.5-Turbo-0125

1106

+3/-2

68,861

OpenAI

102

Meta-Llama-3.2-3B-Instruct

1103

+6/-7

8,403

Meta

DBRX-Instruct-Preview

1103

+4/-3

33,730

Databricks

100

Phi-3-Small-8k-Instruct

1102

+4/-3

18,478

Microsoft

101

Tulu-2-DPO-70B

1099

+7/-7

6,664

AllenAI/UW

103

Granite-3.0-8B-Instruct

1093

+6/-7

7,002

IBM

105

OpenChat-3.5-0106

1092

+6/-6

12,984

OpenChat

106

112

Llama-2-70B-chat

1093

+2/-2

39,634

Meta

106

105

Vicuna-33B

1091

+4/-4

22,950

LMSYS

106

Snowflake Arctic Instruct

1090

+3/-3

34,173

Snowflake

106

108

Starling-LM-7B-alpha

1088

+6/-5

10,417

UC Berkeley

106

113

Nous-Hermes-2-Mixtral-8x7B-DPO

1084

+9/-8

3,834

NousResearch

108

Gemma-1.1-7B-it

1084

+4/-4

25,059

Google

108

113

NV-Llama2-70B-SteerLM-Chat

1081

+9/-12

3,637

Nvidia

111

102

DeepSeek-LLM-67B-Chat

1077

+9/-7

4,988

DeepSeek AI

113

100

OpenChat-3.5

1076

+7/-8

8,111

OpenChat

113

104

OpenHermes-2.5-Mistral-7B

1074

+9/-7

5,091

NousResearch

113

108

Granite-3.0-2B-Instruct

1074

+7/-7

7,195

IBM

114

119

Mistral-7B-Instruct-v0.2

1072

+5/-4

20,051

Mistral

114

119

Qwen1.5-7B-Chat

1070

+8/-7

4,869

Alibaba

115

119

Phi-3-Mini-4K-Instruct-June-24

1071

+6/-4

12,818

Microsoft

115

GPT-3.5-Turbo-1106

1068

+5/-4

17,033

OpenAI

115

123

Phi-3-Mini-4k-Instruct

1066

+4/-5

21,095

Microsoft

115

Dolphin-2.2.1-Mistral-7B

1062

+13/-13

1,713

Cognitive Computations

115

118

SOLAR-10.7B-Instruct-v1.0

1062

+8/-9

4,287

Upstage AI

119

124

Llama-2-13b-chat

1063

+4/-4

19,736

Meta

122

119

WizardLM-13b-v1.2

1059

+7/-7

7,177

Microsoft

125

127

Meta-Llama-3.2-1B-Instruct

1054

+7/-6

8,531

Meta

126

125

Zephyr-7B-beta

1053

+6/-6

11,332

HuggingFace

126

123

SmolLM2-1.7B-Instruct

1047

+12/-14

2,370

HuggingFace

126

119

MPT-30B-chat

1046

+11/-12

2,648

MosaicML

126

124

Zephyr-7B-alpha

1042

+13/-16

1,815

HuggingFace

126

125

CodeLlama-70B-instruct

1041

+14/-20

1,193

Meta

128

126

CodeLlama-34B-instruct

1043

+7/-8

7,512

Meta

128

119

falcon-180b-chat

1034

+16/-16

1,326

TII

131

120

Vicuna-13B

1042

+4/-4

19,786

LMSYS

131

126

Gemma-7B-it

1037

+7/-5

9,174

Google

131

127

Phi-3-Mini-128k-Instruct

1037

+5/-4

21,627

Microsoft

131

120

Qwen-14B-Chat

1035

+8/-7

5,072

Alibaba

131

143

Llama-2-7B-chat

1037

+4/-5

14,553

Meta

131

129

Guanaco-33B

1033

+10/-8

3,000

140

132

Gemma-1.1-2b-it

1021

+5/-7

11,346

Google

140

136

StripedHyena-Nous-7B

1017

+7/-8

5,271

Together AI

141

150

OLMo-7B-instruct

1015

+7/-7

6,508

Allen AI

143

141

Mistral-7B-Instruct-v0.1

1008

+5/-6

9,144

Mistral

144

143

Vicuna-7B

1005

+6/-8

7,016

LMSYS

144

132

PaLM-Chat-Bison-001

1004

+6/-5

8,743

Google

148

147

Gemma-2B-it

989

+9/-9

4,921

Google

149

147

Qwen1.5-4B-Chat

988

+5/-6

7,814

Alibaba

151

Koala-13B

964

+6/-7

7,036

UC Berkeley

151

ChatGLM3-6B

955

+9/-7

4,764

Tsinghua

153

151

GPT4All-13B-Snoozy

932

+11/-14

1,786

Nomic AI

153

151

MPT-7B-Chat

928

+10/-11

4,013

MosaicML

153

156

ChatGLM2-6B

924

+11/-10

2,707

Tsinghua

153

156

RWKV-4-Raven-14B

922

+8/-10

4,934

RWKV

157

151

Alpaca-13B

902

+7/-9

5,876

Stanford

157

OpenAssistant-Pythia-12B

894

+6/-8

6,381

OpenAssistant

158

159

ChatGLM-6B

879

+8/-11

4,988

Tsinghua

159

FastChat-T5-3B

868

+7/-11

4,299

LMSYS

161

162

StableLM-Tuned-Alpha-7B

841

+9/-12

3,341

Stability AI

161

159

Dolly-V2-12B

822

+8/-11

3,485

Databricks

162

160

LLaMA-13B

800

+14/-14

2,444

Meta

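Notes

Rank (UB): upper-bound rank derived from the Bradley-Terry model
Rank (StyleCtrl): rank after controlling for conversation style (style control)
Confidence interval: the confidence interval around the score, shown as +upper/-lower
Score: the arena score, reflecting model performance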
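
As a rough illustration of how an upper-bound rank can follow from the scores and confidence intervals, the sketch below assumes this definition: a model's Rank (UB) is one plus the number of models whose lower bound is above that model's upper bound. That definition, the `Entry` class, and the `rank_ub` function are assumptions made for this example and are not stated on this page. The inputs are the first four rows of the table above, using the rounded values shown, so the results only approximate what the site computes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """One leaderboard row (hypothetical helper type for this sketch)."""
    model: str
    score: int      # arena score
    ci_plus: int    # upper offset of the confidence interval
    ci_minus: int   # lower offset of the confidence interval

    @property
    def upper(self) -> int:
        return self.score + self.ci_plus

    @property
    def lower(self) -> int:
        return self.score - self.ci_minus


def rank_ub(entries: list[Entry]) -> dict[str, int]:
    """Assumed definition: 1 + number of models whose lower bound
    exceeds this model's upper bound."""
    return {
        e.model: 1 + sum(other.lower > e.upper for other in entries)
        for e in entries
    }


# First four rows of the table above.
rows = [
    Entry("Gemini-2.0-Flash-Thinking-Exp-01-21", 1382, 8, 6),
    Entry("Gemini-Exp-1206", 1374, 5, 4),
    Entry("ChatGPT-4o-latest (2024-11-20)", 1365, 4, 4),
    Entry("DeepSeek-R1", 1357, 12, 13),
]

ranks = rank_ub(rows)
for model, rank in ranks.items():
    print(rank, model)
```

Under this assumption, two models can share a rank when their intervals overlap, which is why a wide interval such as +12/-13 can tie a model with one that has a higher score.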

Data source

Data is from lmarena.ai
