Model Leaderboard

This leaderboard is generated by an automated pipeline from Chatbot Arena (lmarena.ai) data.

Data updated: 2025-10-23 08:08:10 UTC / 2025-10-23 16:08:10 CST (Beijing time)

Leaderboard

| Rank (UB) | Model | Score | 95% CI (±) | Votes | Organization | License |
|---|---|---|---|---|---|---|
| 1 | gemini-2.5-pro | 1451 | ±4 | 54,087 | Google | Proprietary |
| 1 | claude-opus-4-1-20250805-thinking-16k | 1447 | ±5 | 21,306 | Anthropic | Proprietary |
| 1 | claude-sonnet-4-5-20250929-thinking-32k | 1445 | ±8 | 6,287 | Anthropic | Proprietary |
| 1 | gpt-4.5-preview-2025-02-27 | 1441 | ±6 | 14,644 | OpenAI | Proprietary |
| 2 | chatgpt-4o-latest-20250326 | 1440 | ±4 | 40,013 | OpenAI | Proprietary |
| 2 | o3-2025-04-16 | 1440 | ±4 | 51,293 | OpenAI | Proprietary |
| 2 | claude-sonnet-4-5-20250929 | 1438 | ±8 | 6,144 | Anthropic | Proprietary |
| 2 | gpt-5-high | 1437 | ±5 | 23,580 | OpenAI | Proprietary |
| 2 | claude-opus-4-1-20250805 | 1437 | ±5 | 33,298 | Anthropic | Proprietary |
| 3 | qwen3-max-preview | 1434 | ±6 | 18,078 | Alibaba | Proprietary |
| 10 | gpt-5-chat | 1425 | ±5 | 21,630 | OpenAI | Proprietary |
| 10 | qwen3-max-2025-09-23 | 1423 | ±7 | 6,919 | Alibaba | Proprietary |
| 10 | glm-4.6 | 1422 | ±9 | 4,401 | Z.ai | MIT |
| 11 | grok-4-fast | 1420 | ±8 | 7,104 | xAI | Proprietary |
| 11 | claude-opus-4-20250514-thinking-16k | 1419 | ±5 | 35,522 | Anthropic | Proprietary |
| 11 | deepseek-v3.2-exp-thinking | 1419 | ±9 | 4,320 | DeepSeek AI | MIT |
| 11 | qwen3-vl-235b-a22b-instruct | 1418 | ±8 | 6,312 | Alibaba | Apache 2.0 |
| 11 | qwen3-235b-a22b-instruct-2507 | 1418 | ±5 | 29,343 | Alibaba | Apache 2.0 |
| 11 | deepseek-r1-0528 | 1417 | ±6 | 19,284 | DeepSeek | MIT |
| 11 | kimi-k2-0905-preview | 1417 | ±7 | 10,772 | Moonshot | Modified MIT |
| 11 | deepseek-v3.1 | 1416 | ±6 | 15,380 | DeepSeek | MIT |
| 11 | deepseek-v3.1-thinking | 1415 | ±7 | 12,098 | DeepSeek | MIT |
| 11 | kimi-k2-0711-preview | 1415 | ±5 | 28,321 | Moonshot | Modified MIT |
| 11 | deepseek-v3.1-terminus | 1414 | ±10 | 3,775 | DeepSeek AI | MIT |
| 11 | deepseek-v3.1-terminus-thinking | 1413 | ±10 | 3,541 | DeepSeek AI | MIT |
| 12 | grok-4-0709 | 1413 | ±5 | 29,264 | xAI | Proprietary |
| 12 | claude-opus-4-20250514 | 1411 | ±4 | 43,310 | Anthropic | Proprietary |
| 12 | deepseek-v3.2-exp | 1408 | ±9 | 4,684 | DeepSeek AI | MIT |
| 13 | gpt-4.1-2025-04-14 | 1411 | ±4 | 41,918 | OpenAI | Proprietary |
| 14 | grok-3-preview-02-24 | 1409 | ±4 | 34,154 | xAI | Proprietary |
| 18 | mistral-medium-2508 | 1406 | ±5 | 23,844 | Mistral | Proprietary |
| 18 | glm-4.5 | 1406 | ±5 | 22,612 | Z.ai | MIT |
| 18 | gemini-2.5-flash-preview-09-2025 | 1404 | ±7 | 6,730 | Google | Proprietary |
| 23 | claude-haiku-4-5-20251001 | 1397 | ±12 | 2,380 | Anthropic | Proprietary |
| 24 | qwen3-next-80b-a3b-instruct | 1402 | ±6 | 12,793 | Alibaba | Apache 2.0 |
| 29 | o1-2024-12-17 | 1400 | ±4 | 28,039 | OpenAI | Proprietary |
| 29 | longcat-flash-chat | 1398 | ±6 | 11,667 | Meituan | MIT |
| 29 | qwen3-235b-a22b-thinking-2507 | 1397 | ±6 | 9,386 | Alibaba | Apache 2.0 |
| 30 | claude-sonnet-4-20250514-thinking-32k | 1398 | ±5 | 33,827 | Anthropic | Proprietary |
| 30 | qwen3-235b-a22b-no-thinking | 1398 | ±5 | 39,528 | Alibaba | Apache 2.0 |
| 32 | gpt-5-mini-high | 1395 | ±6 | 18,172 | OpenAI | Proprietary |
| 32 | deepseek-r1 | 1394 | ±5 | 18,718 | DeepSeek | MIT |
| 32 | qwen3-vl-235b-a22b-thinking | 1392 | ±8 | 5,956 | Alibaba | Apache 2.0 |
| 36 | deepseek-v3-0324 | 1391 | ±4 | 44,482 | DeepSeek | MIT |
| 36 | o4-mini-2025-04-16 | 1391 | ±4 | 41,513 | OpenAI | Proprietary |
| 36 | mai-1-preview | 1389 | ±6 | 14,528 | Microsoft AI | Proprietary |
| 38 | claude-sonnet-4-20250514 | 1389 | ±5 | 39,329 | Anthropic | Proprietary |
| 38 | hunyuan-t1-20250711 | 1384 | ±9 | 4,845 | Tencent | Proprietary |
| 39 | o1-preview | 1386 | ±5 | 31,505 | OpenAI | Proprietary |
| 39 | qwen3-30b-a3b-instruct-2507 | 1385 | ±5 | 21,853 | Alibaba | Apache 2.0 |
| 40 | claude-3-7-sonnet-20250219-thinking-32k | 1386 | ±4 | 39,987 | Anthropic | Proprietary |
| 41 | qwen3-coder-480b-a35b-instruct | 1384 | ±5 | 23,287 | Alibaba | Apache 2.0 |
| 44 | mistral-medium-2505 | 1381 | ±5 | 34,539 | Mistral | Proprietary |
| 44 | hunyuan-turbos-20250416 | 1379 | ±6 | 11,135 | Tencent | Proprietary |
| 47 | gpt-4.1-mini-2025-04-14 | 1379 | ±4 | 40,621 | OpenAI | Proprietary |
| 50 | gemini-2.5-flash-lite-preview-09-2025-no-thinking | 1374 | ±7 | 6,765 | Google | Proprietary |
| 52 | gemini-2.5-flash-lite-preview-06-17-thinking | 1374 | ±5 | 31,701 | Google | Proprietary |
| 52 | qwen3-235b-a22b | 1372 | ±5 | 27,210 | Alibaba | Apache 2.0 |
| 54 | qwen2.5-max | 1372 | ±4 | 33,541 | Alibaba | Proprietary |
| 54 | glm-4.5-air | 1369 | ±5 | 21,945 | Z.ai | MIT |
| 55 | Claude 3.5 Sonnet (10/22) | 1370 | ±3 | 89,889 | Anthropic | Proprietary |
| 55 | qwen3-next-80b-a3b-thinking | 1367 | ±6 | 11,522 | Alibaba | Apache 2.0 |
| 56 | minimax-m1 | 1368 | ±5 | 31,897 | MiniMax | Apache 2.0 |
| 59 | grok-3-mini-high | 1362 | ±5 | 17,615 | xAI | Proprietary |
| 59 | o3-mini-high | 1362 | ±5 | 18,735 | OpenAI | Proprietary |
| 60 | gemma-3-27b-it | 1363 | ±4 | 44,508 | Google | Gemma |
| 63 | grok-3-mini-beta | 1356 | ±5 | 23,839 | xAI | Proprietary |
| 63 | deepseek-v3 | 1356 | ±5 | 21,994 | DeepSeek | DeepSeek |
| 64 | glm-4.5v | 1351 | ±8 | 5,028 | Z.ai | MIT |
| 65 | mistral-small-2506 | 1353 | ±5 | 18,374 | Mistral | Apache 2.0 |
| 67 | gemini-2.0-flash-lite-preview-02-05 | 1352 | ±4 | 25,215 | Google | Proprietary |
| 67 | Gemini-1.5-Pro-002 | 1351 | ±3 | 56,012 | Google | Proprietary |
| 67 | command-a-03-2025 | 1349 | ±4 | 47,441 | Cohere | CC-BY-NC-4.0 |
| 67 | gpt-oss-120b | 1348 | ±5 | 21,119 | OpenAI | Apache 2.0 |
| 67 | hunyuan-turbos-20250226 | 1345 | ±12 | 2,250 | Tencent | Proprietary |
| 67 | llama-3.1-nemotron-ultra-253b-v1 | 1344 | ±12 | 2,573 | Nvidia | Nvidia Open Model |
| 67 | amazon-nova-experimental-chat-10-09 | 1344 (Preliminary) | ±11 | 2,911 | Amazon | Proprietary |
| 67 | qwen3-32b | 1344 | ±9 | 3,943 | Alibaba | Apache 2.0 |
| 68 | qwen-plus-0125 | 1343 | ±8 | 5,861 | Alibaba | Proprietary |
| 69 | o3-mini | 1347 | ±3 | 58,935 | OpenAI | Proprietary |
| 69 | step-3 | 1344 | ±7 | 6,686 | StepFun | Apache 2.0 |
| 69 | glm-4-plus-0111 | 1343 | ±8 | 5,806 | Zhipu | Proprietary |
| 69 | ling-flash-2.0 | 1341 (Preliminary) | ±9 | 4,893 | Ant Group | MIT |
| 69 | gemma-3-12b-it | 1340 | ±10 | 3,866 | Google | Gemma |
| 69 | nvidia-llama-3.3-nemotron-super-49b-v1.5 | 1339 | ±10 | 3,488 | Nvidia | Nvidia Open |
| 69 | hunyuan-turbo-0110 | 1337 | ±11 | 2,322 | Tencent | Proprietary |
| 72 | gpt-4o-2024-05-13 | 1344 | ±3 | 113,568 | OpenAI | Proprietary |
| 73 | Claude 3.5 Sonnet (06/20) | 1341 | ±3 | 82,864 | Anthropic | Proprietary |
| 73 | gpt-5-nano-high | 1337 | ±7 | 8,465 | OpenAI | Proprietary |
| 77 | llama-3.1-405b-instruct-bf16 | 1335 | ±4 | 41,932 | Meta | Llama 3.1 Community |
| 77 | step-2-16k-exp-202412 | 1331 | ±8 | 4,895 | StepFun | Proprietary |
| 78 | o1-mini | 1334 | ±3 | 52,301 | OpenAI | Proprietary |
| 78 | GPT-4o (08/06) | 1333 | ±4 | 45,787 | OpenAI | Proprietary |
| 78 | gemini-advanced-0514 | 1332 | ±5 | 50,654 | Google | Proprietary |
| 78 | qwq-32b | 1332 | ±4 | 26,309 | Alibaba | Apache 2.0 |
| 79 | llama-3.1-405b-instruct-fp8 | 1333 | ±3 | 60,272 | Meta | Llama 3.1 Community |
| 79 | grok-2-2024-08-13 | 1333 | ±3 | 63,725 | xAI | Proprietary |
| 79 | llama-3.3-nemotron-49b-super-v1 | 1324 | ±12 | 2,243 | Nvidia | Nvidia |
| 83 | hunyuan-large-2025-02-10 | 1323 | ±10 | 3,760 | Tencent | Proprietary |
| 86 | yi-lightning | 1327 | ±5 | 27,624 | 01 AI | Proprietary |
| 87 | llama-4-maverick-17b-128e-instruct | 1327 | ±4 | 41,319 | Meta | Llama 4 |
| 91 | qwen3-30b-a3b | 1325 | ±5 | 27,520 | Alibaba | Apache 2.0 |
| 93 | deepseek-v2.5-1210 | 1321 | ±8 | 6,877 | DeepSeek | DeepSeek |
| 96 | gpt-4-turbo-2024-04-09 | 1323 | ±4 | 98,965 | OpenAI | Proprietary |
| 96 | llama-4-scout-17b-16e-instruct | 1322 | ±5 | 31,329 | Meta | Llama |
| 96 | gpt-4.1-nano-2025-04-14 | 1319 | ±8 | 6,143 | OpenAI | Proprietary |
| 97 | claude-3-opus-20240229 | 1321 | ±3 | 196,368 | Anthropic | Proprietary |
| 97 | claude-3-5-haiku-20241022 | 1320 | ±3 | 71,507 | Anthropic | Proprietary |
| 97 | gemini-1.5-pro-001 | 1320 | ±4 | 79,769 | Google | Proprietary |
| 97 | step-1o-turbo-202506 | 1319 | ±7 | 9,685 | StepFun | Proprietary |
| 97 | gemma-3n-e4b-it | 1318 | ±5 | 23,541 | Google | Gemma |
| 97 | gpt-oss-20b | 1317 | ±6 | 10,906 | OpenAI | Apache 2.0 |
| 97 | ring-flash-2.0 | 1314 (Preliminary) | ±9 | 4,971 | Ant Group | MIT |
| 100 | llama-3.3-70b-instruct | 1319 | ±3 | 56,024 | Meta | Llama-3.3 |
| 100 | glm-4-plus | 1317 | ±5 | 26,342 | Zhipu AI | Proprietary |
| 100 | qwen-max-0919 | 1316 | ±6 | 16,598 | Alibaba | Qwen |
| 101 | qwen2.5-plus-1127 | 1313 | ±6 | 10,252 | Alibaba | Proprietary |
| 102 | GPT-4o-mini (07/18) | 1315 | ±3 | 69,290 | OpenAI | Proprietary |
| 102 | hunyuan-standard-2025-02-10 | 1309 | ±10 | 3,920 | Tencent | Proprietary |
| 105 | gpt-4-1106-preview | 1313 | ±4 | 101,117 | OpenAI | Proprietary |
| 106 | gpt-4-0125-preview | 1313 | ±4 | 94,534 | OpenAI | Proprietary |
| 106 | mistral-large-2407 | 1312 | ±4 | 45,968 | Mistral | Mistral Research |
| 106 | athene-v2-chat | 1312 | ±4 | 24,880 | NexusFlow | NexusFlow |
| 107 | Gemini-1.5-Flash-002 | 1310 | ±4 | 35,180 | Google | Proprietary |
| 114 | gemma-3-4b-it | 1302 | ±9 | 4,195 | Google | Gemma |
| 117 | magistral-medium-2506 | 1304 | ±6 | 12,018 | Mistral | Proprietary |
| 118 | grok-2-mini-2024-08-13 | 1306 | ±4 | 52,789 | xAI | Proprietary |
| 118 | deepseek-v2.5 | 1305 | ±5 | 24,839 | DeepSeek | DeepSeek |
| 118 | athene-70b-0725 | 1304 | ±6 | 19,796 | NexusFlow | CC-BY-NC-4.0 |
| 120 | mistral-large-2411 | 1303 | ±4 | 28,455 | Mistral | MRL |
| 122 | mistral-small-3.1-24b-instruct-2503 | 1301 | ±5 | 31,747 | Mistral | Apache 2.0 |
| 124 | qwen2.5-72b-instruct | 1300 | ±4 | 39,632 | Alibaba | Qwen |
| 124 | llama-3.1-nemotron-70b-instruct | 1296 | ±8 | 7,216 | Nvidia | Llama 3.1 |
| 125 | hunyuan-large-vision | 1293 | ±9 | 5,606 | Tencent | Proprietary |
| 132 | Meta-Llama-3.1-70B-Instruct | 1292 | ±4 | 56,003 | Meta | Llama 3.1 Community |
| 132 | jamba-1.5-large | 1287 | ±7 | 8,730 | AI21 Labs | Jamba Open |
| 132 | reka-core-20240904 | 1286 | ±7 | 7,380 | Reka AI | Proprietary |
| 132 | llama-3.1-tulu-3-70b | 1285 | ±10 | 2,881 | Ai2 | Llama 3.1 |
| 132 | llama-3.1-nemotron-51b-instruct | 1285 | ±10 | 3,777 | Nvidia | Llama 3.1 |
| 133 | amazon-nova-pro-v1.0 | 1287 | ±4 | 25,218 | Amazon | Proprietary |
| 133 | gpt-4-0314 | 1285 | ±5 | 54,754 | OpenAI | Proprietary |
| 134 | gemma-2-27b-it | 1285 | ±3 | 76,195 | Google | Gemma license |
| 135 | gemini-1.5-flash-001 | 1283 | ±4 | 63,418 | Google | Proprietary |
| 135 | claude-3-sonnet-20240229 | 1280 | ±4 | 110,173 | Anthropic | Proprietary |
| 135 | gemma-2-9b-it-simpo | 1277 | ±7 | 10,108 | Princeton | MIT |
| 136 | command-r-plus-08-2024 | 1276 | ±6 | 9,931 | Cohere | CC-BY-NC-4.0 |
| 137 | nemotron-4-340b-instruct | 1277 | ±5 | 19,913 | Nvidia | NVIDIA Open Model |
| 140 | glm-4-0520 | 1272 | ±7 | 9,857 | Zhipu AI | Proprietary |
| 141 | reka-flash-20240904 | 1272 | ±7 | 7,583 | Reka AI | Proprietary |
| 142 | llama-3-70b-instruct | 1275 | ±3 | 158,908 | Meta | Llama 3 Community |
| 142 | gpt-4-0613 | 1274 | ±4 | 89,612 | OpenAI | Proprietary |
| 142 | mistral-small-24b-instruct-2501 | 1272 | ±6 | 14,830 | Mistral | Apache 2.0 |
| 143 | qwen2.5-coder-32b-instruct | 1268 | ±8 | 5,452 | Alibaba | Apache 2.0 |
| 148 | c4ai-aya-expanse-32b | 1266 | ±5 | 27,362 | Cohere | CC-BY-NC-4.0 |
| 150 | command-r-plus | 1263 | ±4 | 78,401 | Cohere | CC-BY-NC-4.0 |
| 150 | deepseek-coder-v2 | 1262 | ±6 | 15,242 | DeepSeek AI | DeepSeek License |
| 151 | gemma-2-9b-it | 1263 | ±4 | 54,954 | Google | Gemma license |
| 151 | qwen2-72b-instruct | 1261 | ±5 | 37,688 | Alibaba | Qianwen LICENSE |
| 153 | claude-3-haiku-20240307 | 1260 | ±4 | 118,626 | Anthropic | Proprietary |
| 153 | gemini-1.5-flash-8b-001 | 1259 | ±4 | 35,914 | Google | Proprietary |
| 153 | amazon-nova-lite-v1.0 | 1259 | ±5 | 19,760 | Amazon | Proprietary |
| 153 | olmo-2-0325-32b-instruct | 1252 | ±11 | 3,377 | Allen AI | Apache-2.0 |
| 156 | phi-4 | 1254 | ±4 | 24,354 | Microsoft | MIT |
| 157 | command-r-08-2024 | 1251 | ±6 | 10,229 | Cohere | CC-BY-NC-4.0 |
| 163 | mistral-large-2402 | 1242 | ±5 | 63,404 | Mistral | Proprietary |
| 163 | amazon-nova-micro-v1.0 | 1240 | ±5 | 19,774 | Amazon | Proprietary |
| 163 | jamba-1.5-mini | 1237 | ±7 | 8,918 | AI21 Labs | Jamba Open |
| 163 | ministral-8b-2410 | 1235 | ±9 | 4,833 | Mistral | MRL |
| 164 | hunyuan-standard-256k | 1232 | ±12 | 2,761 | Tencent | Proprietary |
| 165 | qwen1.5-110b-chat | 1234 | ±5 | 26,679 | Alibaba | Qianwen LICENSE |
| 165 | qwen1.5-72b-chat | 1233 | ±5 | 39,689 | Alibaba | Qianwen LICENSE |
| 165 | reka-flash-21b-20240226-online | 1233 | ±7 | 15,606 | Reka AI | Proprietary |
| 165 | gemini-pro-dev-api | 1233 | ±7 | 18,454 | Google | Proprietary |
| 167 | mixtral-8x22b-instruct-v0.1 | 1229 | ±4 | 52,214 | Mistral | Apache 2.0 |
| 167 | command-r | 1228 | ±5 | 54,710 | Cohere | CC-BY-NC-4.0 |
| 167 | reka-flash-21b-20240226 | 1226 | ±6 | 25,026 | Reka AI | Proprietary |
| 167 | llama-3.1-tulu-3-8b | 1221 | ±11 | 2,943 | Ai2 | Llama 3.1 |
| 167 | gemini-pro | 1220 | ±12 | 6,418 | Google | Proprietary |
| 168 | mistral-medium | 1223 | ±5 | 34,893 | Mistral | Proprietary |
| 168 | c4ai-aya-expanse-8b | 1222 | ±7 | 9,922 | Cohere | CC-BY-NC-4.0 |
| 170 | gpt-3.5-turbo-0125 | 1223 | ±5 | 67,214 | OpenAI | Proprietary |
| 171 | llama-3-8b-instruct | 1222 | ±4 | 106,055 | Meta | Llama 3 Community |
| 174 | zephyr-orpo-141b-A35b-v0.1 | 1213 | ±11 | 4,712 | HuggingFace | Apache 2.0 |
| 175 | granite-3.1-8b-instruct | 1209 | ±11 | 3,142 | IBM | Apache 2.0 |
| 179 | yi-1.5-34b-chat | 1213 | ±5 | 24,417 | 01 AI | Apache-2.0 |
| 181 | llama-3.1-8b-instruct | 1210 | ±4 | 50,234 | Meta | Llama 3.1 Community |
| 181 | qwen1.5-32b-chat | 1205 | ±6 | 22,068 | Alibaba | Qianwen LICENSE |
| 182 | gpt-3.5-turbo-1106 | 1200 | ±9 | 16,760 | OpenAI | Proprietary |
| 185 | phi-3-medium-4k-instruct | 1198 | ±5 | 25,301 | Microsoft | MIT |
| 186 | mixtral-8x7b-instruct-v0.1 | 1197 | ±4 | 74,303 | Mistral | Apache 2.0 |
| 186 | gemma-2-2b-it | 1197 | ±4 | 46,901 | Google | Gemma license |
| 186 | dbrx-instruct-preview | 1195 | ±6 | 32,760 | Databricks | DBRX LICENSE |
| 186 | qwen1.5-14b-chat | 1192 | ±7 | 18,066 | Alibaba | Qianwen LICENSE |
| 186 | internlm2_5-20b-chat | 1192 | ±7 | 10,038 | InternLM | Other |
| 188 | wizardlm-70b | 1185 | ±9 | 8,270 | Microsoft | Llama 2 Community |
| 188 | deepseek-llm-67b-chat | 1183 | ±11 | 4,950 | DeepSeek AI | DeepSeek License |
| 192 | yi-34b-chat | 1184 | ±7 | 15,624 | 01 AI | Yi License |
| 192 | granite-3.0-8b-instruct | 1182 | ±9 | 6,727 | IBM | Apache 2.0 |
| 192 | openchat-3.5-0106 | 1182 | ±8 | 12,712 | OpenChat | Apache-2.0 |
| 192 | openchat-3.5 | 1181 | ±10 | 8,009 | OpenChat | Apache-2.0 |
| 192 | granite-3.1-2b-instruct | 1180 | ±11 | 3,235 | IBM | Apache 2.0 |
| 193 | snowflake-arctic-instruct | 1180 | ±6 | 33,272 | Snowflake | Apache 2.0 |
| 193 | tulu-2-dpo-70b | 1179 | ±10 | 6,579 | AllenAI/UW | AI2 ImpACT Low-risk |
| 193 | openhermes-2.5-mistral-7b | 1175 | ±10 | 5,026 | NousResearch | Apache-2.0 |
| 195 | gemma-1.1-7b-it | 1178 | ±6 | 24,327 | Google | Gemma license |
| 195 | vicuna-33b | 1173 | ±6 | 22,613 | LMSYS | Non-commercial |
| 195 | starling-lm-7b-beta | 1173 | ±7 | 16,190 | Nexusflow | Apache-2.0 |
| 195 | phi-3-small-8k-instruct | 1172 | ±6 | 17,983 | Microsoft | MIT |
| 195 | nous-hermes-2-mixtral-8x7b-dpo | 1166 | ±12 | 3,792 | NousResearch | Apache-2.0 |
| 196 | llama-2-70b-chat | 1171 | ±5 | 38,767 | Meta | Llama 2 Community |
| 196 | starling-lm-7b-alpha | 1168 | ±8 | 10,267 | UC Berkeley | CC-BY-NC-4.0 |
| 198 | llama-3.2-3b-instruct | 1166 | ±8 | 8,043 | Meta | Llama 3.2 |
| 201 | qwq-32b-preview | 1160 | ±11 | 3,256 | Alibaba | Apache 2.0 |
| 203 | llama2-70b-steerlm-chat | 1157 | ±13 | 3,605 | Nvidia | Llama 2 Community |
| 205 | dolphin-2.2.1-mistral-7b | 1151 | ±15 | 1,685 | Cognitive Computations | Apache-2.0 |
| 206 | solar-10.7b-instruct-v1.0 | 1154 | ±13 | 4,187 | Upstage AI | CC-BY-NC-4.0 |
| 210 | granite-3.0-2b-instruct | 1156 | ±8 | 6,922 | IBM | Apache 2.0 |
| 210 | mpt-30b-chat | 1150 | ±12 | 2,606 | MosaicML | CC-BY-NC-SA-4.0 |
| 210 | falcon-180b-chat | 1146 | ±17 | 1,312 | TII | Falcon-180B TII License |
| 211 | wizardlm-13b | 1149 | ±9 | 7,122 | Microsoft | Llama 2 Community |
| 212 | mistral-7b-instruct-v0.2 | 1150 | ±7 | 19,603 | Mistral | Apache-2.0 |
| 213 | qwen1.5-7b-chat | 1144 | ±10 | 4,782 | Alibaba | Qianwen LICENSE |
| 213 | qwen-14b-chat | 1138 | ±11 | 5,004 | Alibaba | Qianwen LICENSE |
| 214 | phi-3-mini-4k-instruct-june-2024 | 1143 | ±6 | 12,415 | Microsoft | MIT |
| 214 | llama-2-13b-chat | 1142 | ±7 | 19,357 | Meta | Llama 2 Community |
| 214 | vicuna-13b | 1141 | ±7 | 19,539 | LMSYS | Llama 2 Community |
| 215 | codellama-34b-instruct | 1136 | ±9 | 7,417 | Meta | Llama 2 Community |
| 215 | palm-2 | 1135 | ±9 | 8,634 | Google | Proprietary |
| 217 | gemma-7b-it | 1133 | ±9 | 9,034 | Google | Gemma license |
| 217 | zephyr-7b-beta | 1132 | ±9 | 11,220 | HuggingFace | MIT |
| 217 | zephyr-7b-alpha | 1128 | ±16 | 1,803 | HuggingFace | MIT |
| 219 | guanaco-33b | 1128 | ±12 | 2,955 | UW | Non-commercial |
| 220 | phi-3-mini-128k-instruct | 1130 | ±7 | 21,024 | Microsoft | MIT |
| 220 | codellama-70b-instruct | 1119 | ±18 | 1,151 | Meta | Llama 2 Community |
| 223 | phi-3-mini-4k-instruct | 1129 | ±6 | 20,539 | Microsoft | MIT |
| 225 | stripedhyena-nous-7b | 1120 | ±11 | 5,214 | Together AI | Apache 2.0 |
| 225 | smollm2-1.7b-instruct | 1118 | ±14 | 2,244 | HuggingFace | Apache 2.0 |
| 230 | vicuna-7b | 1114 | ±9 | 6,972 | LMSYS | Llama 2 Community |
| 233 | llama-3.2-1b-instruct | 1112 | ±8 | 8,166 | Meta | Llama 3.2 |
| 233 | gemma-1.1-2b-it | 1112 | ±8 | 11,035 | Google | Gemma license |
| 233 | mistral-7b-instruct | 1110 | ±9 | 9,042 | Mistral | Apache 2.0 |
| 234 | llama-2-7b-chat | 1108 | ±7 | 14,272 | Meta | Llama 2 Community |
| 241 | gemma-2b-it | 1089 | ±11 | 4,817 | Google | Gemma license |
| 243 | qwen1.5-4b-chat | 1090 | ±9 | 7,662 | Alibaba | Qianwen LICENSE |
| 243 | olmo-7b-instruct | 1075 | ±11 | 6,412 | Allen AI | Apache-2.0 |
| 244 | koala-13b | 1070 | ±10 | 6,998 | UC Berkeley | Non-commercial |
| 244 | gpt4all-13b-snoozy | 1064 | ±15 | 1,773 | Nomic AI | Non-commercial |
| 245 | alpaca-13b | 1064 | ±11 | 5,828 | Stanford | Non-commercial |
| 245 | mpt-7b-chat | 1060 | ±12 | 3,977 | MosaicML | CC-BY-NC-SA-4.0 |
| 245 | chatglm3-6b | 1056 | ±12 | 4,692 | Tsinghua | Apache-2.0 |
| 248 | RWKV-4-Raven-14B | 1041 | ±11 | 4,898 | RWKV | Apache 2.0 |
| 251 | chatglm2-6b | 1025 | ±14 | 2,683 | Tsinghua | Apache-2.0 |
| 251 | oasst-pythia-12b | 1021 | ±11 | 6,343 | OpenAssistant | Apache 2.0 |
| 254 | chatglm-6b | 995 | ±13 | 4,968 | Tsinghua | Non-commercial |
| 254 | fastchat-t5-3b | 990 | ±12 | 4,270 | LMSYS | Apache 2.0 |
| 254 | dolly-v2-12b | 977 | ±14 | 3,471 | Databricks | MIT |
| 254 | llama-13b | 968 | ±16 | 2,441 | Meta | Non-commercial |
| 256 | stablelm-tuned-alpha-7b | 952 | ±13 | 3,325 | Stability AI | CC-BY-NC-SA-4.0 |

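Notes

  • Rank (UB): The ranking computed from the Bradley-Terry model. It reflects the model's overall performance in the arena and provides an upper-bound (UB) estimate of its Elo score, which helps gauge the model's potential competitiveness.

  • Model: The name of the large language model (LLM). Some model names may have a link embedded.

  • Score: The Elo rating the model earned from user votes in the arena. Elo is a relative rating system; a higher score indicates better performance.

  • 95% CI (±): The 95% confidence interval of the model's Elo rating (for example, ±6). The narrower the interval, the more stable and reliable the rating.

  • Votes: The total number of votes the model has received in the arena. More votes generally mean higher statistical reliability of the rating.

  • Organization: The organization or company that provides the model.

  • License: The type of license the model is released under, such as Proprietary, Apache 2.0, or MIT.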
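The relationship between the Score, 95% CI (±), and Rank (UB) columns can be illustrated with a minimal Python sketch. It rests on two assumptions that this page does not state: that Rank (UB) equals one plus the number of models whose lower bound (score minus CI) is above this model's upper bound (score plus CI), and that the ± value can be treated as a symmetric interval. The names `Entry`, `expected_win_rate`, and `rank_ub` are hypothetical helpers, not part of the pipeline that generates this page. Because the published scores are rounded, results for closely spaced models may differ from the table by a position or two.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    model: str
    score: float  # value from the "Score" column
    ci: float     # half-width from the "95% CI (±)" column


def expected_win_rate(score_a: float, score_b: float) -> float:
    """Standard Elo expectation that A is preferred over B (400-point scale)."""
    return 1.0 / (1.0 + 10 ** ((score_b - score_a) / 400.0))


def rank_ub(entries: list[Entry]) -> dict[str, int]:
    """Assumed rule: 1 + number of models whose lower bound exceeds this model's upper bound."""
    ranks: dict[str, int] = {}
    for e in entries:
        upper = e.score + e.ci
        clearly_better = sum(1 for o in entries if o.score - o.ci > upper)
        ranks[e.model] = 1 + clearly_better
    return ranks


# Four rows copied from the table above.
rows = [
    Entry("gemini-2.5-pro", 1451, 4),
    Entry("claude-opus-4-1-20250805-thinking-16k", 1447, 5),
    Entry("chatgpt-4o-latest-20250326", 1440, 4),
    Entry("qwen3-max-preview", 1434, 6),
]

ranks = rank_ub(rows)
for row in rows:
    print(f"{ranks[row.model]:>3}  {row.model}  {row.score}±{row.ci}")

# A 100-point gap corresponds to roughly a 64% expected preference rate.
print(f"{expected_win_rate(1451, 1351):.3f}")
```

For these four rows the sketch yields ranks 1, 1, 2, and 3, which agrees with the table: two models share rank 1 because their intervals overlap, so neither is clearly ahead of the other.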

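Data Source and Update Frequency

The leaderboard data is fetched by an automated script directly from the official Chatbot Arena (lmarena.ai) website. GitHub Actions updates this leaderboard automatically every day.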

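Disclaimer

This report is for reference only. Leaderboard data changes over time and is based on users' preference votes on Chatbot Arena during a specific period. The completeness and accuracy of the data depend on the upstream data source. Different models may use different licenses; always consult the model provider's official documentation before use.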
