Model Leaderboard

Rankings updated: 2025-05-18

| Rank (UB) | Rank (StyleCtrl) | Model | Score | Confidence interval | Votes | Provider | License |
|---|---|---|---|---|---|---|---|
| 1 | 1 | gemini-2.5-pro-preview-05-06 | 1446 | +8/-9 | 4500 | Google | Proprietary |
| 2 | 1 | o3-2025-04-16 | 1413 | +8/-7 | 6689 | OpenAI | Proprietary |
| 2 | 2 | chatgpt-4o-latest-20250326 | 1408 | +6/-6 | 10290 | OpenAI | Proprietary |
| 2 | 5 | grok-3-preview-02-24 | 1403 | +4/-4 | 14843 | xAI | Proprietary |
| 3 | 3 | gpt-4.5-preview-2025-02-27 | 1398 | +4/-5 | 15275 | OpenAI | Proprietary |
| 4 | 5 | gemini-2.5-flash-preview-04-17 | 1394 | +7/-7 | 5959 | Google | Proprietary |
| 7 | 5 | deepseek-v3-0324 | 1373 | +7/-5 | 8753 | DeepSeek | MIT |
| 7 | 5 | gpt-4.1-2025-04-14 | 1366 | +7/-8 | 5102 | OpenAI | Proprietary |
| 8 | 8 | deepseek-r1 | 1358 | +4/-4 | 18493 | DeepSeek | MIT |
| 8 | 15 | gemini-2.0-flash-001 | 1355 | +4/-3 | 24913 | Google | Proprietary |
| 8 | 13 | hunyuan-turbos-20250416 | 1355 | +9/-10 | 3699 | Tencent | Proprietary |
| 8 | 5 | o4-mini-2025-04-16 | 1351 | +10/-7 | 5083 | OpenAI | Proprietary |
| 9 | 7 | o1-2024-12-17 | 1350 | +4/-4 | 29036 | OpenAI | Proprietary |
| 9 | 14 | qwen3-235b-a22b | 1343 | +11/-9 | 3611 | Alibaba | Apache 2.0 |
| 11 | 15 | gemma-3-27b-it | 1341 | +5/-4 | 12343 | Google | Gemma |
| 12 | 15 | qwen2.5-max | 1341 | +4/-3 | 23180 | Alibaba | Proprietary |
| 14 | 12 | o1-preview | 1335 | +3/-3 | 33171 | OpenAI | Proprietary |
| 17 | 24 | gemma-3-12b-it | 1321 | +11/-11 | 3016 | Google | Gemma |
| 18 | 15 | o3-mini-high | 1325 | +4/-4 | 19410 | OpenAI | Proprietary |
| 18 | 13 | gpt-4.1-mini-2025-04-14 | 1322 | +6/-7 | 4950 | OpenAI | Proprietary |
| 18 | 21 | deepseek-v3 | 1318 | +4/-3 | 22838 | DeepSeek | DeepSeek |
| 19 | 27 | qwq-32b | 1313 | +6/-5 | 9946 | Alibaba | Apache 2.0 |
| 19 | 23 | gemini-2.0-flash-lite-preview-02-05 | 1312 | +3/-3 | 25020 | Google | Proprietary |
| 19 | 27 | glm-4-plus-0111 | 1311 | +6/-7 | 6024 | Zhipu | Proprietary |
| 19 | 24 | qwen-plus-0125 | 1310 | +5/-6 | 6058 | Alibaba | Proprietary |
| 21 | 24 | command-a-03-2025 | 1306 | +4/-5 | 10513 | Cohere | CC-BY-NC-4.0 |
| 21 | 29 | step-2-16k-exp-202412 | 1305 | +8/-7 | 5125 | StepFun | Proprietary |
| 21 | 23 | hunyuan-turbos-20250226 | 1303 | +11/-9 | 2449 | Tencent | Proprietary |
| 22 | 23 | o3-mini | 1305 | +4/-3 | 24918 | OpenAI | Proprietary |
| 24 | 29 | o1-mini | 1304 | +2/-3 | 54953 | OpenAI | Proprietary |
| 24 | 24 | gemini-1.5-pro-002 | 1302 | +2/-2 | 58645 | Google | Proprietary |
| 24 | 10 | claude-3-7-sonnet-20250219-thinking | 1300 | +5/-5 | 12038 | Anthropic | Proprietary |
| 24 | 30 | llama-3.3-nemotron-49b-super-v1 | 1297 | +9/-10 | 2366 | Nvidia | Nvidia |
| 24 | 24 | hunyuan-turbo-0110 | 1296 | +11/-10 | 2512 | Tencent | Proprietary |
| 31 | 13 | claude-3-7-sonnet-20250219 | 1290 | +6/-5 | 17387 | Anthropic | Proprietary |
| 33 | 35 | grok-2-2024-08-13 | 1288 | +2/-2 | 67079 | xAI | Proprietary |
| 33 | 38 | yi-lightning | 1287 | +3/-4 | 28968 | 01 AI | Proprietary |
| 33 | 26 | gpt-4o-2024-05-13 | 1285 | +2/-1 | 117741 | OpenAI | Proprietary |
| 33 | 47 | qwen2.5-plus-1127 | 1282 | +5/-4 | 10711 | Alibaba | Proprietary |
| 33 | 54 | gemma-3-4b-it | 1275 | +12/-10 | 3326 | Google | Gemma |
| 34 | 42 | deepseek-v2.5-1210 | 1279 | +7/-6 | 7242 | DeepSeek | DeepSeek |
| 36 | 15 | claude-3-5-sonnet-20241022 | 1283 | +2/-3 | 65435 | Anthropic | Proprietary |
| 38 | 38 | hunyuan-large-2025-02-10 | 1272 | +11/-6 | 3858 | Tencent | Proprietary |
| 40 | 50 | athene-v2-chat | 1275 | +3/-3 | 26077 | NexusFlow | NexusFlow |
| 40 | 40 | gpt-4.1-nano-2025-04-14 | 1271 | +7/-6 | 5121 | OpenAI | Proprietary |
| 41 | 45 | glm-4-plus | 1274 | +3/-3 | 27787 | Zhipu AI | Proprietary |
| 41 | 47 | gpt-4o-mini-2024-07-18 | 1272 | +2/-2 | 71363 | OpenAI | Proprietary |
| 41 | 50 | gemini-1.5-flash-002 | 1271 | +2/-3 | 37025 | Google | Proprietary |
| 41 | 31 | llama-4-maverick-17b-128e-instruct | 1270 | +6/-6 | 7798 | Meta | Llama 4 |
| 41 | 63 | llama-3.1-nemotron-70b-instruct | 1269 | +7/-6 | 7579 | Nvidia | Llama 3.1 |
| 42 | 30 | llama-3.1-405b-instruct-bf16 | 1269 | +3/-2 | 43799 | Meta | Llama 3.1 Community |
| 44 | 28 | claude-3-5-sonnet-20240620 | 1268 | +2/-2 | 86162 | Anthropic | Proprietary |
| 44 | 30 | gemini-advanced-0514 | 1267 | +3/-2 | 52136 | Google | Proprietary |
| 44 | 47 | hunyuan-standard-2025-02-10 | 1261 | +9/-8 | 4015 | Tencent | Proprietary |
| 45 | 31 | llama-3.1-405b-instruct-fp8 | 1267 | +2/-2 | 63030 | Meta | Llama 3.1 Community |
| 45 | 62 | grok-2-mini-2024-08-13 | 1266 | +2/-2 | 55434 | xAI | Proprietary |
| 46 | 34 | gpt-4o-2024-08-06 | 1265 | +2/-3 | 47970 | OpenAI | Proprietary |
| 46 | 49 | qwen-max-0919 | 1263 | +4/-5 | 17432 | Alibaba | Qwen |
| 56 | 44 | gemini-1.5-pro-001 | 1260 | +2/-2 | 82431 | Google | Proprietary |
| 56 | 58 | deepseek-v2.5 | 1258 | +4/-4 | 26339 | DeepSeek | DeepSeek |
| 57 | 63 | qwen2.5-72b-instruct | 1257 | +3/-2 | 41517 | Alibaba | Qwen |
| 57 | 47 | llama-3.3-70b-instruct | 1257 | +3/-3 | 38088 | Meta | Llama-3.3 |
| 57 | 42 | gpt-4-turbo-2024-04-09 | 1256 | +2/-2 | 102145 | OpenAI | Proprietary |
| 59 | 50 | mistral-large-2407 | 1252 | +3/-2 | 48217 | Mistral | Mistral Research |
| 63 | 63 | athene-70b-0725 | 1250 | +3/-4 | 20578 | NexusFlow | CC-BY-NC-4.0 |
| 63 | 63 | mistral-large-2411 | 1249 | +4/-3 | 29643 | Mistral | MRL |
| 64 | 47 | gpt-4-1106-preview | 1250 | +2/-2 | 103749 | OpenAI | Proprietary |
| 64 | 69 | llama-3.1-70b-instruct | 1248 | +2/-2 | 58637 | Meta | Llama 3.1 Community |
| 64 | 69 | llama-3.1-tulu-3-70b | 1244 | +8/-9 | 3009 | Ai2 | Llama 3.1 |
| 65 | 45 | claude-3-opus-20240229 | 1247 | +1/-2 | 202651 | Anthropic | Proprietary |
| 65 | 69 | amazon-nova-pro-v1.0 | 1245 | +3/-3 | 25794 | Amazon | Proprietary |
| 66 | 50 | gpt-4-0125-preview | 1245 | +2/-2 | 97080 | OpenAI | Proprietary |
| 72 | 47 | claude-3-5-haiku-20241022 | 1237 | +3/-3 | 36354 | Anthropic | Proprietary |
| 72 | 69 | reka-core-20240904 | 1235 | +5/-5 | 7946 | Reka AI | Proprietary |
| 75 | 72 | gemini-1.5-flash-001 | 1227 | +2/-3 | 65662 | Google | Proprietary |
| 75 | 71 | jamba-1.5-large | 1222 | +6/-5 | 9125 | AI21 Labs | Jamba Open |
| 75 | 79 | qwen2.5-coder-32b-instruct | 1217 | +7/-8 | 5733 | Alibaba | Apache 2.0 |
| 76 | 72 | gemma-2-27b-it | 1220 | +2/-2 | 79527 | Google | Gemma license |
| 76 | 79 | mistral-small-24b-instruct-2501 | 1218 | +4/-4 | 15323 | Mistral | Apache 2.0 |
| 76 | 87 | amazon-nova-lite-v1.0 | 1217 | +3/-3 | 20652 | Amazon | Proprietary |
| 76 | 74 | gemma-2-9b-it-simpo | 1216 | +5/-4 | 10549 | Princeton | MIT |
| 76 | 75 | command-r-plus-08-2024 | 1215 | +5/-5 | 10537 | Cohere | CC-BY-NC-4.0 |
| 76 | 70 | llama-3.1-nemotron-51b-instruct | 1212 | +8/-10 | 3890 | Nvidia | Llama 3.1 |
| 78 | 88 | gemini-1.5-flash-8b-001 | 1213 | +3/-3 | 37695 | Google | Proprietary |
| 78 | 87 | olmo-2-0325-32b-instruct | 1206 | +9/-9 | 3183 | Allen AI | Apache-2.0 |
| 80 | 86 | c4ai-aya-expanse-32b | 1209 | +3/-3 | 28757 | Cohere | CC-BY-NC-4.0 |
| 80 | 77 | nemotron-4-340b-instruct | 1209 | +3/-4 | 20610 | Nvidia | NVIDIA Open Model |
| 80 | 82 | glm-4-0520 | 1207 | +5/-5 | 10224 | Zhipu AI | Proprietary |
| 81 | 77 | reka-flash-20240904 | 1206 | +5/-6 | 8134 | Reka AI | Proprietary |
| 83 | 79 | llama-3-70b-instruct | 1207 | +2/-2 | 163642 | Meta | Llama 3 Community |
| 83 | 91 | phi-4 | 1205 | +4/-3 | 25221 | Microsoft | MIT |
| 87 | 77 | claude-3-sonnet-20240229 | 1201 | +2/-2 | 113071 | Anthropic | Proprietary |
| 87 | 100 | amazon-nova-micro-v1.0 | 1198 | +5/-3 | 20660 | Amazon | Proprietary |
| 90 | 101 | hunyuan-standard-256k | 1189 | +11/-9 | 2900 | Tencent | Proprietary |
| 94 | 88 | gemma-2-9b-it | 1192 | +2/-2 | 57201 | Google | Gemma license |
| 94 | 87 | command-r-plus | 1190 | +2/-2 | 80856 | Cohere | CC-BY-NC-4.0 |
| 94 | 102 | llama-3.1-tulu-3-8b | 1185 | +8/-10 | 3075 | Ai2 | Llama 3.1 |
| 94 | 100 | ministral-8b-2410 | 1182 | +8/-7 | 5109 | Mistral | MRL |
| 95 | 87 | qwen2-72b-instruct | 1187 | +2/-2 | 38870 | Alibaba | Qianwen LICENSE |
| 95 | 75 | gpt-4-0314 | 1186 | +3/-3 | 55967 | OpenAI | Proprietary |
| 96 | 102 | c4ai-aya-expanse-8b | 1180 | +6/-5 | 10395 | Cohere | CC-BY-NC-4.0 |
| 97 | 91 | command-r-08-2024 | 1180 | +4/-5 | 10849 | Cohere | CC-BY-NC-4.0 |
| 97 | 87 | deepseek-coder-v2 | 1178 | +5/-5 | 15756 | DeepSeek AI | DeepSeek License |
| 98 | 92 | claude-3-haiku-20240307 | 1179 | +2/-2 | 122315 | Anthropic | Proprietary |
| 98 | 101 | jamba-1.5-mini | 1176 | +5/-5 | 9273 | AI21 Labs | Jamba Open |
| 99 | 117 | llama-3.1-8b-instruct | 1176 | +2/-3 | 52576 | Meta | Llama 3.1 Community |
| 107 | 86 | gpt-4-0613 | 1163 | +2/-2 | 91628 | OpenAI | Proprietary |
| 107 | 102 | qwen1.5-110b-chat | 1161 | +3/-4 | 27438 | Alibaba | Qianwen LICENSE |
| 107 | 117 | yi-1.5-34b-chat | 1157 | +4/-3 | 25136 | 01 AI | Apache-2.0 |
| 107 | 103 | reka-flash-21b-20240226-online | 1156 | +6/-4 | 16027 | Reka AI | Proprietary |
| 107 | 132 | qwq-32b-preview | 1153 | +11/-8 | 3412 | Alibaba | Apache 2.0 |
| 108 | 101 | mistral-large-2402 | 1157 | +2/-3 | 64924 | Mistral | Proprietary |
| 109 | 109 | llama-3-8b-instruct | 1152 | +2/-3 | 109093 | Meta | Llama 3 Community |
| 111 | 124 | internlm2_5-20b-chat | 1149 | +3/-5 | 10596 | InternLM | Other |
| 111 | 111 | granite-3.1-8b-instruct | 1143 | +10/-8 | 3290 | IBM | Apache 2.0 |
| 112 | 105 | command-r | 1149 | +2/-3 | 56392 | Cohere | CC-BY-NC-4.0 |
| 112 | 105 | mixtral-8x22b-instruct-v0.1 | 1148 | +2/-3 | 53761 | Mistral | Apache 2.0 |
| 112 | 110 | mistral-medium | 1148 | +3/-3 | 35566 | Mistral | Proprietary |
| 112 | 106 | qwen1.5-72b-chat | 1147 | +3/-3 | 40665 | Alibaba | Qianwen LICENSE |
| 112 | 108 | reka-flash-21b-20240226 | 1147 | +3/-4 | 25807 | Reka AI | Proprietary |
| 113 | 124 | gemma-2-2b-it | 1144 | +3/-2 | 48901 | Google | Gemma license |
| 121 | 105 | gemini-pro-dev-api | 1131 | +4/-4 | 18803 | Google | Proprietary |
| 122 | 114 | zephyr-orpo-141b-A35b-v0.1 | 1127 | +7/-10 | 4853 | HuggingFace | Apache 2.0 |
| 122 | 119 | qwen1.5-32b-chat | 1125 | +3/-4 | 22762 | Alibaba | Qianwen LICENSE |
| 122 | 124 | granite-3.1-2b-instruct | 1119 | +8/-11 | 3380 | IBM | Apache 2.0 |
| 123 | 123 | phi-3-medium-4k-instruct | 1123 | +3/-3 | 26112 | Microsoft | MIT |
| 123 | 134 | starling-lm-7b-beta | 1119 | +4/-4 | 16674 | Nexusflow | Apache-2.0 |
| 125 | 114 | gemini-pro | 1111 | +7/-6 | 6557 | Google | Proprietary |
| 126 | 124 | mixtral-8x7b-instruct-v0.1 | 1114 | +2/-2 | 76134 | Mistral | Apache 2.0 |
| 126 | 129 | yi-34b-chat | 1111 | +5/-5 | 15918 | 01 AI | Yi License |
| 127 | 126 | qwen1.5-14b-chat | 1109 | +4/-3 | 18685 | Alibaba | Qianwen LICENSE |
| 127 | 127 | wizardlm-70b | 1106 | +7/-6 | 8383 | Microsoft | Llama 2 Community |
| 128 | 114 | gpt-3.5-turbo-0125 | 1106 | +2/-2 | 68871 | OpenAI | Proprietary |
| 128 | 132 | llama-3.2-3b-instruct | 1103 | +6/-6 | 8394 | Meta | Llama 3.2 |
| 129 | 124 | dbrx-instruct-preview | 1103 | +3/-3 | 33743 | Databricks | DBRX LICENSE |
| 129 | 131 | phi-3-small-8k-instruct | 1102 | +4/-4 | 18477 | Microsoft | MIT |
| 129 | 132 | tulu-2-dpo-70b | 1099 | +7/-7 | 6658 | AllenAI/UW | AI2 ImpACT Low-risk |
| 135 | 124 | granite-3.0-8b-instruct | 1093 | +6/-7 | 7002 | IBM | Apache 2.0 |
| 136 | 129 | openchat-3.5-0106 | 1092 | +5/-5 | 12986 | OpenChat | Apache-2.0 |
| 137 | 143 | llama-2-70b-chat | 1093 | +3/-4 | 39602 | Meta | Llama 2 Community |
| 137 | 135 | vicuna-33b | 1091 | +3/-4 | 22937 | LMSYS | Non-commercial |
| 137 | 129 | snowflake-arctic-instruct | 1090 | +2/-3 | 34181 | Snowflake | Apache 2.0 |
| 137 | 139 | starling-lm-7b-alpha | 1088 | +5/-6 | 10414 | UC Berkeley | CC-BY-NC-4.0 |
| 137 | 146 | nous-hermes-2-mixtral-8x7b-dpo | 1084 | +8/-8 | 3837 | NousResearch | Apache-2.0 |
| 138 | 143 | llama2-70b-steerlm-chat | 1080 | +11/-10 | 3636 | Nvidia | Llama 2 Community |
| 139 | 131 | gemma-1.1-7b-it | 1084 | +3/-3 | 25063 | Google | Gemma license |
| 143 | 132 | deepseek-llm-67b-chat | 1077 | +7/-9 | 4988 | DeepSeek AI | DeepSeek License |
| 143 | 132 | openchat-3.5 | 1076 | +6/-7 | 8107 | OpenChat | Apache-2.0 |
| 143 | 139 | granite-3.0-2b-instruct | 1074 | +8/-8 | 7187 | IBM | Apache 2.0 |
| 144 | 134 | openhermes-2.5-mistral-7b | 1074 | +7/-7 | 5086 | NousResearch | Apache-2.0 |
| 145 | 150 | phi-3-mini-4k-instruct-june-2024 | 1071 | +7/-5 | 12800 | Microsoft | MIT |
| 145 | 150 | qwen1.5-7b-chat | 1070 | +7/-7 | 4873 | Alibaba | Qianwen LICENSE |
| 145 | 146 | dolphin-2.2.1-mistral-7b | 1062 | +14/-11 | 1713 | Cognitive Computations | Apache-2.0 |
| 146 | 150 | mistral-7b-instruct-v0.2 | 1072 | +3/-4 | 20063 | Mistral | Apache-2.0 |
| 146 | 126 | gpt-3.5-turbo-1106 | 1068 | +4/-5 | 17035 | OpenAI | Proprietary |
| 146 | 154 | phi-3-mini-4k-instruct | 1066 | +4/-3 | 21092 | Microsoft | MIT |
| 146 | 149 | solar-10.7b-instruct-v1.0 | 1062 | +8/-8 | 4286 | Upstage AI | CC-BY-NC-4.0 |
| 150 | 154 | llama-2-13b-chat | 1063 | +4/-3 | 19713 | Meta | Llama 2 Community |
| 151 | 150 | wizardlm-13b | 1059 | +7/-7 | 7175 | Microsoft | Llama 2 Community |
| 156 | 160 | llama-3.2-1b-instruct | 1054 | +7/-6 | 8518 | Meta | Llama 3.2 |
| 157 | 158 | zephyr-7b-beta | 1053 | +4/-6 | 11321 | HuggingFace | MIT |
| 157 | 154 | smollm2-1.7b-instruct | 1046 | +10/-12 | 2376 | HuggingFace | Apache 2.0 |
| 157 | 150 | mpt-30b-chat | 1045 | +11/-10 | 2644 | MosaicML | CC-BY-NC-SA-4.0 |
| 157 | 159 | codellama-70b-instruct | 1041 | +16/-12 | 1192 | Meta | Llama 2 Community |
| 159 | 157 | zephyr-7b-alpha | 1040 | +11/-13 | 1812 | HuggingFace | MIT |
| 160 | 158 | codellama-34b-instruct | 1043 | +6/-7 | 7508 | Meta | Llama 2 Community |
| 160 | 150 | falcon-180b-chat | 1034 | +15/-16 | 1327 | TII | Falcon-180B TII License |
| 161 | 152 | vicuna-13b | 1042 | +5/-4 | 19773 | LMSYS | Llama 2 Community |
| 162 | 158 | gemma-7b-it | 1037 | +5/-7 | 9176 | Google | Gemma license |
| 162 | 173 | llama-2-7b-chat | 1037 | +5/-4 | 14532 | Meta | Llama 2 Community |
| 162 | 158 | phi-3-mini-128k-instruct | 1037 | +4/-4 | 21625 | Microsoft | MIT |
| 162 | 150 | qwen-14b-chat | 1035 | +7/-9 | 5067 | Alibaba | Qianwen LICENSE |
| 162 | 162 | guanaco-33b | 1033 | +9/-9 | 2995 | UW | Non-commercial |
| 170 | 163 | gemma-1.1-2b-it | 1021 | +6/-5 | 11349 | Google | Gemma license |
| 172 | 167 | stripedhyena-nous-7b | 1017 | +8/-8 | 5277 | Together AI | Apache 2.0 |
| 173 | 180 | olmo-7b-instruct | 1015 | +6/-7 | 6501 | Allen AI | Apache-2.0 |
| 175 | 172 | mistral-7b-instruct | 1008 | +7/-8 | 9144 | Mistral | Apache 2.0 |
| 175 | 173 | vicuna-7b | 1005 | +8/-8 | 7015 | LMSYS | Llama 2 Community |
| 175 | 162 | palm-2 | 1003 | +7/-5 | 8711 | Google | Proprietary |
| 178 | 178 | gemma-2b-it | 990 | +9/-6 | 4918 | Google | Gemma license |
| 180 | 175 | qwen1.5-4b-chat | 989 | +5/-7 | 7819 | Alibaba | Qianwen LICENSE |
| 182 | 181 | koala-13b | 965 | +6/-8 | 7020 | UC Berkeley | Non-commercial |
| 182 | 182 | chatglm3-6b | 955 | +8/-10 | 4762 | Tsinghua | Apache-2.0 |
| 183 | 182 | gpt4all-13b-snoozy | 932 | +13/-14 | 1788 | Nomic AI | Non-commercial |
| 184 | 182 | mpt-7b-chat | 928 | +9/-10 | 3997 | MosaicML | CC-BY-NC-SA-4.0 |
| 184 | 187 | chatglm2-6b | 924 | +12/-13 | 2712 | Tsinghua | Apache-2.0 |
| 184 | 185 | RWKV-4-Raven-14B | 922 | +10/-10 | 4920 | RWKV | Apache 2.0 |
| 188 | 182 | alpaca-13b | 901 | +8/-7 | 5864 | Stanford | Non-commercial |
| 188 | 188 | oasst-pythia-12b | 893 | +7/-7 | 6368 | OpenAssistant | Apache 2.0 |
| 189 | 190 | chatglm-6b | 879 | +9/-9 | 4983 | Tsinghua | Non-commercial |
| 190 | 190 | fastchat-t5-3b | 868 | +9/-9 | 4288 | LMSYS | Apache 2.0 |
| 192 | 193 | stablelm-tuned-alpha-7b | 840 | +7/-10 | 3336 | Stability AI | CC-BY-NC-SA-4.0 |
| 192 | 190 | dolly-v2-12b | 822 | +10/-13 | 3480 | Databricks | MIT |
| 193 | 191 | llama-13b | 800 | +11/-13 | 2446 | Meta | Non-commercial |

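Notes

  • Rank (UB): the model's upper-bound rank, derived from Bradley-Terry model scores

  • Rank (StyleCtrl): the model's rank after controlling for response style

  • Confidence interval: the uncertainty range around the score, written as +upper/-lower offsets (for example, +8/-9)

  • Score: the arena score reflecting model performance

  • Votes: the number of votes the model has received

  • Provider: the organization or company that provides the model

  • License: the license under which the model is released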
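
The relationship between the score, the confidence interval, and Rank (UB) can be illustrated with a short Python sketch. It assumes the definition commonly given for this column: a model's Rank (UB) is one plus the number of models whose lower bound (score minus the lower offset) is strictly above that model's upper bound (score plus the upper offset). The `Entry` class and `rank_ub` function are hypothetical names introduced only for this illustration, and because the published scores are rounded, recomputed ranks for borderline models may differ slightly from the listed ones.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """One leaderboard row, reduced to the fields needed for Rank (UB)."""
    model: str
    score: int      # arena score
    ci_plus: int    # upper offset of the confidence interval
    ci_minus: int   # lower offset of the confidence interval

    @property
    def lower(self) -> int:
        return self.score - self.ci_minus

    @property
    def upper(self) -> int:
        return self.score + self.ci_plus


def rank_ub(entries: list[Entry]) -> dict[str, int]:
    """Assumed rule: 1 + number of models whose lower bound exceeds this model's upper bound."""
    return {
        e.model: 1 + sum(other.lower > e.upper for other in entries)
        for e in entries
    }


# First five rows of the leaderboard above: (model, score, +offset, -offset).
top5 = [
    Entry("gemini-2.5-pro-preview-05-06", 1446, 8, 9),
    Entry("o3-2025-04-16", 1413, 8, 7),
    Entry("chatgpt-4o-latest-20250326", 1408, 6, 6),
    Entry("grok-3-preview-02-24", 1403, 4, 4),
    Entry("gpt-4.5-preview-2025-02-27", 1398, 4, 5),
]

ranks = rank_ub(top5)
for name, rank in ranks.items():
    print(rank, name)
```

Under this assumed rule the five rows come out as ranks 1, 2, 2, 2, 3, which agrees with the Rank (UB) column for those models. It also explains why several models can share a rank: their confidence intervals overlap, so none is statistically separated from the others.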

Data source
