Model Leaderboard

Widely regarded as the most authoritative general-purpose LLM leaderboard. Rankings come from blind human evaluation.

Rankings last updated: 2025-05-01

| Rank (UB) | Rank (StyleCtrl) | Model | Score | Confidence interval | Votes | Provider | License |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | 1 | gemini-2.5-pro-exp-03-25 | 1439 | +6/-5 | 10,389 | Google | Proprietary |
| 2 | 1 | o3-2025-04-16 | 1418 | +14/-9 | 2,211 | OpenAI | Proprietary |
| 2 | 2 | chatgpt-4o-latest-20250326 | 1408 | +6/-5 | 9,229 | OpenAI | Proprietary |
| 3 | 5 | grok-3-preview-02-24 | 1402 | +4/-5 | 14,840 | xAI | Proprietary |
| 3 | 5 | gemini-2.5-flash-preview-04-17 | 1393 | +10/-7 | 4,073 | Google | Proprietary |
| 4 | 3 | gpt-4.5-preview-2025-02-27 | 1398 | +4/-5 | 15,285 | OpenAI | Proprietary |
| 7 | 11 | gemini-2.0-flash-thinking-exp-01-21 | 1380 | +4/-4 | 26,903 | Google | Proprietary |
| 7 | 5 | deepseek-v3-0324 | 1373 | +6/-7 | 6,792 | DeepSeek | MIT |
| 8 | 5 | gpt-4.1-2025-04-14 | 1363 | +10/-9 | 2,927 | OpenAI | Proprietary |
| 9 | 7 | deepseek-r1 | 1358 | +5/-4 | 16,857 | DeepSeek | MIT |
| 9 | 5 | gemini-2.0-flash-001 | 1354 | +3/-3 | 23,060 | Google | Proprietary |
| 9 | 6 | o4-mini-2025-04-16 | 1351 | +10/-10 | 2,070 | OpenAI | Proprietary |
| 9 | 5 | o1-2024-12-17 | 1350 | +4/-3 | 29,044 | OpenAI | Proprietary |
| 13 | 16 | gemma-3-27b-it | 1342 | +4/-5 | 10,735 | Google | Gemma |
| 13 | 16 | qwen2.5-max | 1340 | +4/-4 | 22,105 | Alibaba | Proprietary |
| 14 | 13 | o1-preview | 1335 | +2/-3 | 33,172 | OpenAI | Proprietary |
| 17 | 14 | o3-mini-high | 1325 | +4/-4 | 18,934 | OpenAI | Proprietary |
| 17 | 16 | gpt-4.1-mini-2025-04-14 | 1322 | +9/-13 | 2,700 | OpenAI | Proprietary |
| 17 | 19 | deepseek-v3 | 1318 | +4/-4 | 22,828 | DeepSeek | DeepSeek |
| 18 | 26 | qwq-32b | 1314 | +6/-6 | 8,563 | Alibaba | Apache 2.0 |
| 18 | 23 | gemini-2.0-flash-lite-preview-02-05 | 1311 | +4/-4 | 23,099 | Google | Proprietary |
| 18 | 26 | glm-4-plus-0111 | 1311 | +8/-6 | 6,024 | Zhipu | Proprietary |
| 18 | 23 | qwen-plus-0125 | 1310 | +7/-8 | 6,056 | Alibaba | Proprietary |
| 19 | 27 | step-2-16k-exp-202412 | 1305 | +7/-6 | 5,125 | StepFun | Proprietary |
| 19 | 23 | command-a-03-2025 | 1305 | +6/-6 | 8,521 | Cohere | CC-BY-NC-4.0 |
| 19 | 23 | o3-mini | 1305 | +4/-3 | 24,941 | OpenAI | Proprietary |
| 19 | 22 | hunyuan-turbos-20250226 | 1302 | +11/-10 | 2,452 | Tencent | Proprietary |
| 20 | 7 | claude-3-7-sonnet-20250219-thinking-32k | 1301 | +7/-5 | 9,952 | Anthropic | Proprietary |
| 21 | 27 | o1-mini | 1304 | +3/-3 | 54,961 | OpenAI | Proprietary |
| 22 | 30 | llama-3.3-nemotron-49b-super-v1 | 1296 | +10/-8 | 2,368 | Nvidia | Nvidia |
| 23 | 23 | gemini-1.5-pro-002 | 1302 | +2/-2 | 58,652 | Google | Proprietary |
| 23 | 23 | hunyuan-turbo-0110 | 1295 | +9/-9 | 2,512 | Tencent | Proprietary |
| 29 | 14 | claude-3-7-sonnet-20250219 | 1292 | +6/-5 | 15,281 | Anthropic | Proprietary |
| 31 | 34 | grok-2-2024-08-13 | 1288 | +2/-2 | 67,096 | xAI | Proprietary |
| 31 | 36 | yi-lightning | 1287 | +3/-3 | 28,976 | 01 AI | Proprietary |
| 32 | 23 | gpt-4o-2024-05-13 | 1285 | +2/-2 | 117,751 | OpenAI | Proprietary |
| 32 | 46 | qwen2.5-plus-1127 | 1282 | +5/-5 | 10,719 | Alibaba | Proprietary |
| 33 | 16 | claude-3-5-sonnet-20241022 | 1283 | +3/-2 | 65,174 | Anthropic | Proprietary |
| 35 | 39 | deepseek-v2.5-1210 | 1279 | +6/-6 | 7,239 | DeepSeek | DeepSeek |
| 36 | 34 | gpt-4.1-nano-2025-04-14 | 1271 | +12/-9 | 2,787 | OpenAI | Proprietary |
| 37 | 34 | hunyuan-large-2025-02-10 | 1272 | +9/-8 | 3,855 | Tencent | Proprietary |
| 38 | 48 | athene-v2-chat | 1275 | +3/-3 | 26,086 | NexusFlow | NexusFlow |
| 38 | 29 | llama-4-maverick-17b-128e-instruct | 1271 | +6/-7 | 5,772 | Meta | Llama 4 |
| 39 | 44 | glm-4-plus | 1274 | +2/-4 | 27,786 | Zhipu AI | Proprietary |
| 39 | 52 | gpt-4o-mini-2024-07-18 | 1272 | +2/-2 | 71,378 | OpenAI | Proprietary |
| 39 | 61 | gemini-1.5-flash-002 | 1271 | +3/-3 | 37,024 | Google | Proprietary |
| 39 | 61 | llama-3.1-nemotron-70b-instruct | 1269 | +6/-5 | 7,576 | Nvidia | Llama 3.1 |
| 40 | 30 | llama-3.1-405b-instruct-bf16 | 1269 | +3/-3 | 43,785 | Meta | Llama 3.1 Community |
| 41 | 26 | claude-3-5-sonnet-20240620 | 1268 | +2/-3 | 86,163 | Anthropic | Proprietary |
| 41 | 30 | gemini-advanced-0514 | 1267 | +3/-3 | 52,140 | Google | Proprietary |
| 43 | 30 | llama-3.1-405b-instruct-fp8 | 1267 | +2/-2 | 63,056 | Meta | Llama 3.1 Community |
| 43 | 32 | grok-2-mini-2024-08-13 | 1266 | +2/-2 | 55,446 | xAI | Proprietary |
| 43 | 46 | gpt-4o-2024-08-06 | 1265 | +3/-2 | 47,974 | OpenAI | Proprietary |
| 43 | 46 | hunyuan-standard-2025-02-10 | 1261 | +8/-8 | 4,018 | Tencent | Proprietary |
| 44 | 46 | qwen-max-0919 | 1263 | +4/-3 | 17,438 | Alibaba | Qwen |
| 53 | 42 | gemini-1.5-pro-001 | 1260 | +2/-2 | 82,450 | Google | Proprietary |
| 54 | 54 | deepseek-v2.5 | 1258 | +3/-3 | 26,346 | DeepSeek | DeepSeek |
| 54 | 61 | qwen2.5-72b-instruct | 1257 | +3/-3 | 41,532 | Alibaba | Qwen |
| 55 | 46 | gpt-4-turbo-2024-04-09 | 1256 | +2/-2 | 102,153 | OpenAI | Proprietary |
| 55 | 46 | llama-3.3-70b-instruct | 1256 | +3/-3 | 38,099 | Meta | Llama-3.3 |
| 59 | 46 | mistral-large-2407 | 1251 | +2/-2 | 48,214 | Mistral | Mistral Research |
| 59 | 64 | athene-70b-0725 | 1250 | +3/-3 | 20,580 | NexusFlow | CC-BY-NC-4.0 |
| 59 | 64 | llama-3.1-tulu-3-70b | 1244 | +9/-9 | 3,011 | Ai2 | Llama 3.1 |
| 61 | 45 | gpt-4-1106-preview | 1250 | +2/-2 | 103,739 | OpenAI | Proprietary |
| 61 | 61 | mistral-large-2411 | 1249 | +3/-3 | 29,493 | Mistral | MRL |
| 61 | 43 | claude-3-opus-20240229 | 1247 | +2/-2 | 202,698 | Anthropic | Proprietary |
| 63 | 48 | gpt-4-0125-preview | 1245 | +2/-2 | 97,064 | OpenAI | Proprietary |
| 63 | 67 | amazon-nova-pro-v1.0 | 1244 | +3/-3 | 24,768 | Amazon | Proprietary |
| 69 | 46 | claude-3-5-haiku-20241022 | 1237 | +3/-3 | 34,282 | Anthropic | Proprietary |
| 69 | 67 | reka-core-20240904 | 1235 | +5/-5 | 7,941 | Reka AI | Proprietary |
| 72 | 68 | gemini-1.5-flash-001 | 1227 | +2/-3 | 65,656 | Google | Proprietary |
| 72 | 77 | jamba-1.5-large | 1222 | +6/-6 | 9,122 | AI21 Labs | Jamba Open |
| 72 | 69 | qwen2.5-coder-32b-instruct | 1217 | +7/-6 | 5,731 | Alibaba | Apache 2.0 |
| 73 | 69 | gemma-2-27b-it | 1220 | +2/-2 | 79,532 | Google | Gemma license |
| 73 | 76 | amazon-nova-lite-v1.0 | 1217 | +4/-4 | 20,653 | Amazon | Proprietary |
| 73 | 69 | mistral-small-24b-instruct-2501 | 1217 | +4/-4 | 15,089 | Mistral | Apache 2.0 |
| 73 | 71 | gemma-2-9b-it-simpo | 1216 | +6/-5 | 10,548 | Princeton | MIT |
| 73 | 67 | command-r-plus-08-2024 | 1215 | +5/-5 | 10,534 | Cohere | CC-BY-NC-4.0 |
| 74 | 87 | llama-3.1-nemotron-51b-instruct | 1211 | +10/-8 | 3,889 | Nvidia | Llama 3.1 |
| 77 | 74 | nemotron-4-340b-instruct | 1209 | +3/-4 | 20,614 | Nvidia | NVIDIA Open Model |
| 77 | 79 | c4ai-aya-expanse-32b | 1209 | +3/-3 | 28,752 | Cohere | CC-BY-NC-4.0 |
| 77 | 75 | glm-4-0520 | 1207 | +5/-5 | 10,222 | Zhipu AI | Proprietary |
| 77 | 76 | reka-flash-20240904 | 1205 | +6/-5 | 8,136 | Reka AI | Proprietary |
| 81 | 75 | llama-3-70b-instruct | 1207 | +1/-2 | 163,662 | Meta | Llama 3 Community |
| 81 | 89 | phi-4 | 1205 | +3/-3 | 25,228 | Microsoft | MIT |
| 85 | 75 | claude-3-sonnet-20240229 | 1201 | +1/-2 | 113,053 | Anthropic | Proprietary |
| 85 | 97 | amazon-nova-micro-v1.0 | 1198 | +4/-4 | 20,658 | Amazon | Proprietary |
| 89 | 87 | gemma-2-9b-it | 1192 | +2/-3 | 57,206 | Google | Gemma license |
| 89 | 98 | hunyuan-standard-256k | 1189 | +9/-11 | 2,902 | Tencent | Proprietary |
| 89 | 98 | llama-3.1-tulu-3-8b | 1185 | +9/-11 | 3,076 | Ai2 | Llama 3.1 |
| 90 | 84 | command-r-plus | 1190 | +2/-2 | 80,848 | Cohere | CC-BY-NC-4.0 |
| 90 | 97 | qwen2-72b-instruct | 1187 | +2/-3 | 38,886 | Alibaba | Qianwen LICENSE |
| 91 | 71 | ministral-8b-2410 | 1182 | +7/-6 | 5,108 | Mistral | MRL |
| 92 | 98 | gpt-4-0314 | 1186 | +2/-2 | 55,970 | OpenAI | Proprietary |
| 94 | 88 | c4ai-aya-expanse-8b | 1180 | +6/-5 | 10,397 | Cohere | CC-BY-NC-4.0 |
| 94 | 89 | command-r-08-2024 | 1179 | +3/-5 | 10,843 | Cohere | CC-BY-NC-4.0 |
| 94 | 84 | claude-3-haiku-20240307 | 1179 | +2/-2 | 122,315 | Anthropic | Proprietary |
| 94 | 97 | deepseek-coder-v2 | 1178 | +4/-4 | 15,757 | DeepSeek AI | DeepSeek License |
| 94 | 113 | jamba-1.5-mini | 1176 | +5/-6 | 9,273 | AI21 Labs | Jamba Open |
| 103 | 83 | llama-3.1-8b-instruct | 1176 | +2/-3 | 52,586 | Meta | Llama 3.1 Community |
| 103 | 98 | gpt-4-0613 | 1163 | +2/-2 | 91,638 | OpenAI | Proprietary |
| 103 | 98 | qwen1.5-110b-chat | 1161 | +3/-3 | 27,438 | Alibaba | Qianwen LICENSE |
| 103 | 132 | reka-flash-21b-20240226-online | 1156 | +5/-5 | 16,032 | Reka AI | Proprietary |
| 104 | 97 | qwq-32b-preview | 1153 | +8/-11 | 3,412 | Alibaba | Apache 2.0 |
| 104 | 111 | mistral-large-2402 | 1157 | +2/-2 | 64,917 | Mistral | Proprietary |
| 106 | 106 | yi-1.5-34b-chat | 1157 | +3/-4 | 25,131 | 01 AI | Apache-2.0 |
| 106 | 119 | llama-3-8b-instruct | 1152 | +2/-2 | 109,091 | Meta | Llama 3 Community |
| 107 | 102 | internlm2_5-20b-chat | 1149 | +5/-5 | 10,596 | InternLM | Other |
| 107 | 106 | command-r | 1149 | +2/-3 | 56,381 | Cohere | CC-BY-NC-4.0 |
| 107 | 110 | mistral-medium | 1148 | +3/-3 | 35,564 | Mistral | Proprietary |
| 108 | 101 | granite-3.1-8b-instruct | 1142 | +10/-10 | 3,293 | IBM | Apache 2.0 |
| 108 | 106 | mixtral-8x22b-instruct-v0.1 | 1148 | +2/-3 | 53,758 | Mistral | Apache 2.0 |
| 108 | 102 | reka-flash-21b-20240226 | 1147 | +3/-3 | 25,807 | Reka AI | Proprietary |
| 109 | 120 | qwen1.5-72b-chat | 1147 | +3/-3 | 40,670 | Alibaba | Qianwen LICENSE |
| 117 | 101 | gemma-2-2b-it | 1144 | +2/-2 | 48,902 | Google | Gemma license |
| 117 | 113 | gemini-pro-dev-api | 1131 | +4/-4 | 18,801 | Google | Proprietary |
| 118 | 114 | zephyr-orpo-141b-A35b-v0.1 | 1127 | +8/-7 | 4,860 | HuggingFace | Apache 2.0 |
| 118 | 122 | qwen1.5-32b-chat | 1125 | +4/-4 | 22,757 | Alibaba | Qianwen LICENSE |
| 119 | 119 | granite-3.1-2b-instruct | 1119 | +10/-10 | 3,383 | IBM | Apache 2.0 |
| 119 | 130 | phi-3-medium-4k-instruct | 1123 | +3/-4 | 26,109 | Microsoft | MIT |
| 122 | 120 | starling-lm-7b-beta | 1119 | +4/-4 | 16,674 | Nexusflow | Apache-2.0 |
| 122 | 125 | mixtral-8x7b-instruct-v0.1 | 1114 | +2/-3 | 76,140 | Mistral | Apache 2.0 |
| 122 | 110 | yi-34b-chat | 1111 | +5/-5 | 15,918 | 01 AI | Yi License |
| 123 | 122 | gemini-pro | 1111 | +6/-7 | 6,556 | Google | Proprietary |
| 123 | 124 | qwen1.5-14b-chat | 1109 | +4/-4 | 18,683 | Alibaba | Qianwen LICENSE |
| 124 | 128 | wizardlm-70b | 1106 | +6/-6 | 8,385 | Microsoft | Llama 2 Community |
| 125 | 110 | llama-3.2-3b-instruct | 1103 | +6/-7 | 8,400 | Meta | Llama 3.2 |
| 125 | 120 | gpt-3.5-turbo-0125 | 1106 | +2/-3 | 68,869 | OpenAI | Proprietary |
| 125 | 128 | dbrx-instruct-preview | 1103 | +3/-4 | 33,738 | Databricks | DBRX LICENSE |
| 125 | 128 | phi-3-small-8k-instruct | 1102 | +5/-5 | 18,472 | Microsoft | MIT |
| 129 | 120 | granite-3.0-8b-instruct | 1093 | +7/-8 | 7,001 | IBM | Apache 2.0 |
| 132 | 125 | openchat-3.5-0106 | 1091 | +5/-5 | 12,989 | OpenChat | Apache-2.0 |
| 133 | 139 | llama-2-70b-chat | 1093 | +2/-3 | 39,605 | Meta | Llama 2 Community |
| 133 | 125 | vicuna-33b | 1091 | +4/-3 | 22,946 | LMSYS | Non-commercial |
| 133 | 135 | snowflake-arctic-instruct | 1090 | +4/-5 | 34,176 | Snowflake | Apache 2.0 |
| 133 | 142 | starling-lm-7b-alpha | 1088 | +5/-5 | 10,417 | UC Berkeley | CC-BY-NC-4.0 |
| 135 | 127 | nous-hermes-2-mixtral-8x7b-dpo | 1084 | +9/-10 | 3,838 | NousResearch | Apache-2.0 |
| 135 | 140 | gemma-1.1-7b-it | 1084 | +4/-3 | 25,068 | Google | Gemma license |
| 139 | 128 | llama2-70b-steerlm-chat | 1080 | +9/-12 | 3,636 | Nvidia | Llama 2 Community |
| 140 | 128 | deepseek-llm-67b-chat | 1077 | +7/-9 | 4,988 | DeepSeek AI | DeepSeek License |
| 141 | 135 | openchat-3.5 | 1076 | +5/-6 | 8,106 | OpenChat | Apache-2.0 |
| 141 | 132 | granite-3.0-2b-instruct | 1074 | +6/-7 | 7,186 | IBM | Apache 2.0 |
| 141 | 146 | openhermes-2.5-mistral-7b | 1074 | +6/-6 | 5,086 | NousResearch | Apache-2.0 |
| 141 | 146 | mistral-7b-instruct-v0.2 | 1072 | +4/-4 | 20,066 | Mistral | Apache-2.0 |
| 141 | 146 | phi-3-mini-4k-instruct-june-2024 | 1071 | +5/-5 | 12,804 | Microsoft | MIT |
| 141 | 142 | qwen1.5-7b-chat | 1070 | +7/-8 | 4,871 | Alibaba | Qianwen LICENSE |
| 141 | 121 | dolphin-2.2.1-mistral-7b | 1063 | +13/-12 | 1,713 | Cognitive Computations | Apache-2.0 |
| 142 | 150 | gpt-3.5-turbo-1106 | 1068 | +4/-5 | 17,034 | OpenAI | Proprietary |
| 143 | 144 | phi-3-mini-4k-instruct | 1066 | +4/-4 | 21,094 | Microsoft | MIT |
| 147 | 153 | solar-10.7b-instruct-v1.0 | 1062 | +7/-10 | 4,289 | Upstage AI | CC-BY-NC-4.0 |
| 149 | 146 | llama-2-13b-chat | 1063 | +4/-4 | 19,717 | Meta | Llama 2 Community |
| 152 | 155 | wizardlm-13b | 1059 | +5/-6 | 7,176 | Microsoft | Llama 2 Community |
| 153 | 153 | llama-3.2-1b-instruct | 1054 | +7/-7 | 8,517 | Meta | Llama 3.2 |
| 153 | 147 | zephyr-7b-beta | 1053 | +5/-5 | 11,320 | HuggingFace | MIT |
| 153 | 146 | smollm2-1.7b-instruct | 1047 | +10/-12 | 2,377 | HuggingFace | Apache 2.0 |
| 153 | 150 | mpt-30b-chat | 1045 | +10/-11 | 2,644 | MosaicML | CC-BY-NC-SA-4.0 |
| 153 | 153 | codellama-70b-instruct | 1042 | +14/-20 | 1,190 | Meta | Llama 2 Community |
| 155 | 146 | zephyr-7b-alpha | 1041 | +12/-13 | 1,811 | HuggingFace | MIT |
| 156 | 154 | falcon-180b-chat | 1034 | +17/-17 | 1,328 | TII | Falcon-180B TII License |
| 156 | 147 | codellama-34b-instruct | 1043 | +6/-7 | 7,509 | Meta | Llama 2 Community |
| 158 | 155 | vicuna-13b | 1042 | +6/-4 | 19,771 | LMSYS | Llama 2 Community |
| 158 | 168 | phi-3-mini-128k-instruct | 1037 | +4/-4 | 21,633 | Microsoft | MIT |
| 158 | 155 | llama-2-7b-chat | 1037 | +5/-5 | 14,532 | Meta | Llama 2 Community |
| 158 | 146 | gemma-7b-it | 1037 | +5/-7 | 9,176 | Google | Gemma license |
| 158 | 158 | qwen-14b-chat | 1035 | +7/-8 | 5,067 | Alibaba | Qianwen LICENSE |
| 166 | 159 | guanaco-33b | 1033 | +9/-13 | 2,997 | UW | Non-commercial |
| 167 | 161 | gemma-1.1-2b-it | 1021 | +6/-5 | 11,348 | Google | Gemma license |
| 167 | 176 | stripedhyena-nous-7b | 1017 | +9/-7 | 5,278 | Together AI | Apache 2.0 |
| 171 | 168 | olmo-7b-instruct | 1015 | +7/-7 | 6,502 | Allen AI | Apache-2.0 |
| 171 | 171 | mistral-7b-instruct | 1008 | +6/-6 | 9,143 | Mistral | Apache 2.0 |
| 172 | 158 | vicuna-7b | 1005 | +6/-7 | 7,014 | LMSYS | Llama 2 Community |
| 174 | 174 | palm-2 | 1003 | +5/-6 | 8,712 | Google | Proprietary |
| 176 | 172 | gemma-2b-it | 989 | +9/-9 | 4,921 | Google | Gemma license |
| 178 | 178 | qwen1.5-4b-chat | 988 | +7/-7 | 7,821 | Alibaba | Qianwen LICENSE |
| 178 | 178 | koala-13b | 965 | +7/-7 | 7,021 | UC Berkeley | Non-commercial |
| 180 | 177 | chatglm3-6b | 955 | +7/-7 | 4,761 | Tsinghua | Apache-2.0 |
| 180 | 178 | gpt4all-13b-snoozy | 932 | +13/-14 | 1,788 | Nomic AI | Non-commercial |
| 180 | 183 | mpt-7b-chat | 928 | +9/-9 | 3,996 | MosaicML | CC-BY-NC-SA-4.0 |
| 180 | 183 | chatglm2-6b | 924 | +10/-10 | 2,712 | Tsinghua | Apache-2.0 |
| 184 | 178 | RWKV-4-Raven-14B | 922 | +7/-9 | 4,920 | RWKV | Apache 2.0 |
| 184 | 184 | alpaca-13b | 901 | +9/-11 | 5,865 | Stanford | Non-commercial |
| 185 | 186 | oasst-pythia-12b | 893 | +7/-8 | 6,367 | OpenAssistant | Apache 2.0 |
| 186 | 186 | chatglm-6b | 879 | +8/-9 | 4,987 | Tsinghua | Non-commercial |
| 188 | 189 | fastchat-t5-3b | 868 | +9/-9 | 4,288 | LMSYS | Apache 2.0 |
| 188 | 186 | stablelm-tuned-alpha-7b | 840 | +12/-11 | 3,339 | Stability AI | CC-BY-NC-SA-4.0 |
| 190 | 187 | dolly-v2-12b | 822 | +10/-10 | 3,481 | Databricks | MIT |
| 190 | 190 | llama-13b | 799 | +12/-13 | 2,446 | Meta | Non-commercial |

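Notes

  • Rank (UB): the upper-bound rank derived from the Bradley-Terry model, i.e. one plus the number of models whose score is statistically higher

  • Rank (StyleCtrl): the style-controlled rank, computed after accounting for the style of each response

  • Confidence interval: the uncertainty around a model's score, written as +upper/-lower margins

  • Score: the arena score, a rating of model performance estimated from the votes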
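As a rough illustration of how the columns fit together, the sketch below recomputes Rank (UB) from the score and confidence-interval columns. It assumes that a model's Rank (UB) is one plus the number of models whose interval lower bound lies strictly above that model's upper bound. That assumption is not stated on this page, and the published ranks are presumably computed from unrounded values, so results on the rounded table figures may differ slightly. The `Entry` class and both function names are hypothetical, and `expected_win_rate` assumes the conventional Elo-style scale (base 10, 400 points), which this page does not confirm.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """One leaderboard row (hypothetical helper type)."""
    name: str
    score: int      # "Score" column
    ci_plus: int    # upper margin of the confidence interval
    ci_minus: int   # lower margin of the confidence interval

    @property
    def upper(self) -> int:
        return self.score + self.ci_plus

    @property
    def lower(self) -> int:
        return self.score - self.ci_minus


def rank_ub(entries: list[Entry]) -> dict[str, int]:
    """Assumed rule: 1 + number of models whose lower bound beats this upper bound."""
    return {
        e.name: 1 + sum(1 for other in entries if other.lower > e.upper)
        for e in entries
    }


def expected_win_rate(score_a: float, score_b: float) -> float:
    """Win probability of A over B on an assumed Elo-style scale (base 10, 400 points)."""
    return 1.0 / (1.0 + 10 ** ((score_b - score_a) / 400.0))


# First six rows of the table above.
TOP_ROWS = [
    Entry("gemini-2.5-pro-exp-03-25", 1439, 6, 5),
    Entry("o3-2025-04-16", 1418, 14, 9),
    Entry("chatgpt-4o-latest-20250326", 1408, 6, 5),
    Entry("grok-3-preview-02-24", 1402, 4, 5),
    Entry("gemini-2.5-flash-preview-04-17", 1393, 10, 7),
    Entry("gpt-4.5-preview-2025-02-27", 1398, 4, 5),
]

ranks = rank_ub(TOP_ROWS)
for entry in TOP_ROWS:
    print(f"{ranks[entry.name]:>2}  {entry.name}")
```

On these six rows the assumed rule yields 1, 2, 2, 3, 3, 4, which matches the Rank (UB) column. It also shows why a model with a lower score can outrank one with a higher score: a wide interval keeps more of the higher-scored models within reach.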

Data source
