
Client Download

Current latest stable version: v1.4.3

Direct Download

Windows Version

Note: Installing Cherry Studio is not supported on Windows 7.

Installer Version (Setup)

x64 Version

Main Link:

【Cherry Studio Official Website】 【GitHub】

Alternate Links:

【Link 1】 【Link 2】 【Link 3】

ARM64 Version

Main Link:

【Cherry Studio Official Website】 【GitHub】

Alternate Links:

【Link 1】 【Link 2】 【Link 3】

Portable Version (Portable)

x64 Version

Main Link:

【Cherry Studio Official Website】 【GitHub】

Alternate Links:

【Link 1】 【Link 2】 【Link 3】

ARM64 Version

Main Link:

【Cherry Studio Official Website】 【GitHub】

Alternate Links:

【Link 1】 【Link 2】 【Link 3】


macOS Version

Intel Chip Version (x64)

Main Link:

【Cherry Studio Official Website】 【GitHub】

Alternate Links:

【Link 1】 【Link 2】 【Link 3】

Apple Silicon Version (ARM64, M-series chips)

Main Link:

【Cherry Studio Official Website】 【GitHub】

Alternate Links:

【Link 1】 【Link 2】 【Link 3】


Linux Version

x86_64 Version

Main Link:

【Cherry Studio Official Website】 【GitHub】

Alternate Links:

【Link 1】 【Link 2】 【Link 3】

ARM64 Version

Main Link:

【Cherry Studio Official Website】 【GitHub】

Alternate Links:

【Link 1】 【Link 2】 【Link 3】


Cloud Drive Download

Quark

Feature Introduction

Settings

Project Introduction

Follow our social accounts: Twitter(X), Xiaohongshu, Weibo, Bilibili, Douyin

Join our communities: QQ Group(575014769), Telegram, Discord, WeChat Group(click to view)


Cherry Studio is an all-in-one AI assistant platform integrating multi-model conversations, knowledge base management, AI painting, translation, and more. Cherry Studio's highly customizable design, powerful extensibility, and user-friendly experience make it an ideal choice for professional users and AI enthusiasts. Whether you are a beginner or a developer, you can find suitable AI functions in Cherry Studio to enhance your work efficiency and creativity.


Core Features & Highlights

1. Basic Chat Functionality

  • One Question, Multiple Answers: Supports generating replies from multiple models simultaneously for the same question, allowing users to compare the performance of different models. For details, see Chat Interface.

  • Automatic Grouping: Conversation records for each assistant are automatically grouped and managed, making it easy for users to quickly find historical conversations.

  • Conversation Export: Supports exporting full or partial conversations to various formats (e.g., Markdown, Word) for easy storage and sharing.

  • Highly Customizable Parameters: In addition to basic parameter adjustments, it also supports custom parameters to meet personalized needs.

  • Assistant Market: Comes with over a thousand built-in industry-specific assistants, covering fields like translation, programming, and writing, and also supports user-defined assistants.

  • Multiple Format Rendering: Supports Markdown rendering, formula rendering, real-time HTML preview, and other functions to enhance content display.

2. Integration of Various Special Features

  • AI Painting: Provides a dedicated painting panel where users can generate high-quality images through natural language descriptions.

  • AI Mini-programs: Integrates various free web-based AI tools, allowing direct use without switching browsers.

  • Translation Function: Supports a dedicated translation panel, in-conversation translation, prompt translation, and other translation scenarios.

  • File Management: Files from conversations, paintings, and knowledge bases are managed in a unified and classified manner, avoiding tedious searches.

  • Global Search: Supports quick location of historical records and knowledge base content, improving work efficiency.

3. Unified Management for Multiple Service Providers

  • Service Provider Model Aggregation: Supports unified calling of models from major service providers like OpenAI, Gemini, Anthropic, and Azure.

  • Automatic Model Fetching: One-click to get a complete list of models without manual configuration.

  • Multi-key Polling: Supports rotating multiple API keys to avoid rate limit issues.

  • Precise Avatar Matching: Automatically matches each model with an exclusive avatar for better recognition.

  • Custom Service Providers: Supports third-party service providers that comply with specifications like OpenAI, Gemini, and Anthropic, offering strong compatibility.

4. Highly Customizable Interface and Layout

  • Custom CSS: Supports global style customization to create a unique interface style.

  • Custom Chat Layout: Supports list or bubble style layouts and allows customization of message styles (e.g., code snippet styles).

  • Custom Avatars: Supports setting personalized avatars for the software and assistants.

  • Custom Sidebar Menu: Users can hide or reorder sidebar functions according to their needs to optimize the user experience.

5. Local Knowledge Base System

  • Multiple Format Support: Supports importing various file formats such as PDF, DOCX, PPTX, XLSX, TXT, and MD.

  • Multiple Data Source Support: Supports local files, URLs, sitemaps, and even manually entered content as knowledge base sources.

  • Knowledge Base Export: Supports exporting processed knowledge bases to share with others.

  • Search and Check Support: After importing a knowledge base, users can perform real-time retrieval tests to check the processing results and segmentation effects.

6. Special Focus Features

  • Quick Q&A: Summon a quick assistant in any context (e.g., WeChat, browser) to get answers quickly.

  • Quick Translation: Supports quick translation of words or text from other contexts.

  • Content Summarization: Quickly summarizes long text content to improve information extraction efficiency.

  • Explanation: Explains complex issues with one click, without needing complicated prompts.

7. Data Security

  • Multiple Backup Solutions: Supports local backup, WebDAV backup, and scheduled backups to ensure data safety.

  • Data Security: Supports fully local usage scenarios, combined with local large models, to avoid data leakage risks.


Project Advantages

  1. Beginner-Friendly: Cherry Studio is committed to lowering the technical barrier, allowing even users with no prior experience to get started quickly, focusing on their work, study, or creation.

  2. Comprehensive Documentation: Provides detailed user manuals and FAQs to help users solve problems quickly.

  3. Continuous Iteration: The project team actively responds to user feedback and continuously optimizes features to ensure the project's healthy development.

  4. Open Source and Extensibility: Supports customization and extension through open-source code to meet personalized needs.


Applicable Scenarios

  • Knowledge Management and Query: Quickly build and query exclusive knowledge bases using the local knowledge base feature, suitable for research, education, and other fields.

  • Multi-model Conversation and Creation: Supports simultaneous conversation with multiple models, helping users quickly obtain information or generate content.

  • Translation and Office Automation: Built-in translation assistants and file processing functions are suitable for users who need cross-lingual communication or document processing.

  • AI Painting and Design: Generate images from natural language descriptions to meet creative design needs.

Star History

Follow Our Social Accounts

Alibaba Cloud Bailian

  1. Log in to Alibaba Cloud Bailian. If you don't have an Alibaba Cloud account, you'll need to register one.

  2. Click the 创建我的 API-KEY (Create My API-KEY) button in the upper right corner.

  3. In the pop-up window, select the default business space (or you can customize it), and you can enter a description if you want.

  4. Click the 确定 (Confirm) button in the lower right corner.

  5. Afterward, you should see a new row added to the list. Click the 查看 (View) button on the right.

  6. Click the 复制 (Copy) button.

  7. Go to Cherry Studio, navigate to Settings → Model Providers → Alibaba Cloud Bailian, find API Key, and paste the copied API key here.

  8. You can adjust the relevant settings as described in Model Providers, and then you can start using it.

If you don't find any models from Alibaba Cloud Bailian in the model list, please ensure you have added the models and enabled this provider as described in Model Providers.

Display Settings

On this page, you can set the software's color theme and page layout, or use Custom CSS for personalized adjustments.

Theme Selection

Here you can set the default interface color mode (Light Mode, Dark Mode, or Follow System).

Topic Settings

This setting is for the layout of the conversation interface.

Topic Position

Auto-switch to Topic

When this setting is enabled, clicking on the assistant's name will automatically switch to the corresponding topic page.

Show Topic Time

When enabled, the creation time of the topic will be displayed below the topic.

Custom CSS


This setting allows for flexible and personalized changes to the interface. For specific methods, please refer to Custom CSS in the advanced tutorials.


Agents

The Agents page is a hub for assistants. Here, you can select or search for the model presets you want. Clicking on a card will add the assistant to the assistant list on the chat page.

You can also edit and create your own assistants on this page.

  • Click on My, then click on Create Agent to start creating your own assistant.

The button in the upper right corner of the prompt input box is for AI-optimizing the prompt. Clicking it will overwrite the original text. The model used is the Global Default Assistant Model.

Default Model Settings

Default Assistant Model

When an assistant does not have a default assistant model set, the model selected by default in a new conversation will be the one set here.

The model set here is also used for optimizing prompts and the pop-up text assistant.

Topic Naming Model

After each conversation, a model is called to generate a topic name for the conversation. The model set here is the one used for naming.

Translation Model

The translation function in input boxes for conversations, drawing, etc., and the translation model on the translation interface all use the model set here.

Quick Assistant Model

The model used by the Quick Assistant feature. For details, see Quick Assistant.

OneAPI and Its Forks

Installation Tutorial

Mini Programs

On the Mini Programs page, you can use the web versions of AI-related programs from major service providers within the client. Currently, custom adding and deleting are not yet supported.

Project Planning

To-Do List

Data Settings

General Settings

On this page, you can set the software's interface language, configure a proxy, etc.

Files

The Files interface displays all files related to conversations, paintings, knowledge bases, and more. You can manage and view them centrally on this page.

Personalization Settings

Contributing to the Documentation

Contact us via email at [email protected] to get editor access.

Title: Application for Cherry Studio Docs Editor Role

Body: State your reasons for applying

Free Web Search Mode

MCP Usage Tutorial

MCP (Model Context Protocol) is an open-source protocol designed to provide context information to Large Language Models (LLMs) in a standardized way.

Frequently Asked Questions

1. mcp-server-time

Solution

In the "Parameters" field, enter:

mcp-server-time
--local-timezone
<Your standard timezone, e.g., Asia/Shanghai>
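For reference, the complete server entry written out in JSON might look roughly like the sketch below. This is only an illustration that assumes the server is launched with uvx and follows the same field layout as the built-in configuration example shown later in this documentation; adjust the command, service ID, and timezone to your environment.

// illustrative sketch only; the service ID and description are placeholders
"mcp-server-time": {
  "name": "mcp-server-time",
  "description": "Time and timezone conversion MCP server",
  "isActive": true,
  "command": "uvx",
  "args": [
    "mcp-server-time",
    "--local-timezone",
    "Asia/Shanghai"
  ]
}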

Web Search Blacklist Configuration

Cherry Studio supports configuring the blacklist manually or by adding subscription sources. For configuration rules, please refer to ublacklist.

Manual Configuration

You can add rules for search results or click the toolbar icon to block specified websites. Rules can be specified using either: match patterns (example: *://*.example.com/*) or regular expressions (example: /example\.(net|org)/).

Subscription Configuration

You can also subscribe to public rule sets. This website lists some subscriptions: https://iorate.github.io/ublacklist/subscriptions

Here are some recommended subscription source links:

  • uBlacklist subscription compilation (Chinese): https://git.io/ublacklist

  • uBlockOrigin-HUGE-AI-Blocklist (AI-generated): https://raw.githubusercontent.com/laylavish/uBlockOrigin-HUGE-AI-Blocklist/main/list_uBlacklist.txt

Provider Settings

This page only introduces the interface functions. For configuration tutorials, please refer to the corresponding tutorial in the basic tutorials.

  • When using built-in providers, you only need to fill in the corresponding key.

  • Different providers may have different names for the key. Secret, Key, API Key, Token, etc., all refer to the same thing.

API Key

In Cherry Studio, a single provider supports using multiple keys in a round-robin fashion. The rotation method is a list loop from front to back.

  • Add multiple keys separated by English commas. For example: sk-xxxx1,sk-xxxx2,sk-xxxx3,sk-xxxx4

You must use English commas.

API Address

When using built-in providers, you generally do not need to fill in the API address. If you need to modify it, please strictly follow the address provided in the corresponding official documentation.

If the address provided by the provider is in the format https://xxx.xxx.com/v1/chat/completions, you only need to fill in the base URL part (https://xxx.xxx.com).

Cherry Studio will automatically append the remaining path (/v1/chat/completions). Failure to fill it in as required may result in it not working correctly.

Note: The large language model routes of most providers are standardized, so the following adjustments are generally unnecessary. If the provider's API path uses a different version, such as /v2/chat/completions or /v3/chat/completions, manually enter that version in the address field and end it with a / (for example, https://xxx.xxx.com/v2/). If the provider's request route is not the standard .../chat/completions format at all, use the complete address provided by the provider and end it with a #.

That is:

  • If the API address ends with /, only "chat/completions" will be appended.

  • If the API address ends with #, no appending operation is performed; only the entered address will be used.
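As a concrete illustration of these rules (api.example.com is a placeholder, not a real provider; for the # form the entered address is used as-is, with the # serving only as a marker):

https://api.example.com                                  → https://api.example.com/v1/chat/completions
https://api.example.com/v2/                              → https://api.example.com/v2/chat/completions
https://api.example.com/custom/route/chat/completions#   → https://api.example.com/custom/route/chat/completions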

Add Models

Usually, clicking the Manage button at the bottom left of the provider configuration page will automatically fetch all models supported by that provider. Click the + sign from the fetched list to add them to the model list.

Clicking the Manage button does not add every model in the pop-up list. You need to click the + to the right of a model to add it to the provider's model list on the configuration page; only then will it appear in the model selection list.

Connectivity Check

Click the check button after the API Key input box to test if the configuration is successful.

The model check uses the last chat model from the added model list by default. If the check fails, please verify that there are no incorrect or unsupported models in the model list.

After successful configuration, be sure to turn on the switch in the upper right corner. Otherwise, the provider will remain disabled, and you will not be able to find the corresponding models in the model list.

OneAPI

  • Log in and go to the token page

  • Create a new token (or you can directly use the default token ↑)

  • Copy the token

  • Open CherryStudio's provider settings and click Add at the bottom of the provider list.

  • Enter a note name, select OpenAI as the provider, and click OK.

  • Paste the key you just copied.

  • Go back to the page where you got the API Key and copy the root address from the browser's address bar. For example, you only need to copy https://xxx.xxx.com; the "/" and the content after it are not needed.

  • When the address is an IP + port, fill in http://IP:port, for example: http://127.0.0.1:3000

  • Strictly distinguish between http and https. If SSL is not enabled, do not use https.

  • Add models (click Manage to automatically fetch or manually enter them) and toggle the switch in the upper right corner to enable them.

Other OneAPI themes may have different interfaces, but the method for adding them is the same as the process described above.

OpenAI

Get API Key

  • On the official API Key page, click + Create new secret key.

  • Copy the generated key and open CherryStudio's Provider Settings.

  • Find the OpenAI provider and enter the key you just obtained.

  • Click Manage or Add at the bottom, add the supported models, and enable the provider switch in the upper right corner to start using it.

  • OpenAI services are not directly available in regions of China other than Taiwan; you will need to resolve the proxy issue yourself.

  • You must have a balance in your account.

Painting

The painting feature currently only supports the painting models from SiliconFlow. You can go to SiliconFlow to register an account and add it to the providers to use it.

For questions about the parameters, you can hover over the ? in the corresponding area to see the description.

More service providers will be added in the future. Please stay tuned.

Built-in MCP Configurations

@cherry/mcp-auto-install

Automatically installs MCP services (beta).

@cherry/memory

A basic implementation of persistent memory based on a local knowledge graph. This allows the model to remember relevant user information across different conversations. Environment variable: MEMORY_FILE_PATH=/path/to/your/file.json

@cherry/sequentialthinking

An MCP server implementation that provides tools for dynamic and reflective problem-solving through structured thought processes.

@cherry/brave-search

An MCP server implementation that integrates the Brave Search API, providing dual functionality for web and local search. Environment variable: BRAVE_API_KEY=YOUR_API_KEY

@cherry/fetch

An MCP server for fetching web page content from a URL.

@cherry/filesystem

A Node.js server that implements the Model Context Protocol (MCP) for file system operations.

Clear CSS Settings

Use this method to clear the CSS settings when you have entered incorrect CSS, or when an incorrect CSS setting prevents you from opening the settings interface.

  • Open the console: click on the CherryStudio window, then press the shortcut Ctrl+Shift+I (macOS: Command+Option+I).

  • In the console window that pops up, click Console.

  • Then, manually type document.getElementById('user-defined-custom-css').remove(). Copying and pasting will likely not execute.

  • After typing, press Enter to confirm and clear the CSS settings. Then, go back to CherryStudio's display settings and delete the problematic CSS code.

Feedback & Suggestions

Telegram Discussion Group

Group members will share their experiences and help you solve problems. Join the discussion at: https://t.me/CherryStudioAI

QQ Group

QQ group members can help each other and share download links. QQ Group: 1025067911

GitHub Issues

Suitable for recording issues to prevent developers from forgetting, or for participating in discussions: https://github.com/CherryHQ/cherry-studio/issues/new/choose

Email

If you can't find other feedback channels, you can contact the developer for help.

Contact the developer via email: [email protected]



Contributing Code

We welcome contributions to Cherry Studio! You can contribute in the following ways:

  1. Contribute Code: Develop new features or optimize existing code.

  2. Fix Bugs: Submit fixes for bugs you find.

  3. Maintain Issues: Help manage GitHub issues.

  4. Product Design: Participate in design discussions.

  5. Write Documentation: Improve user manuals and guides.

  6. Community Engagement: Join discussions and help users.

  7. Promote Usage: Spread the word about Cherry Studio.

How to Participate

Send an email to [email protected]

Email Subject: Apply to become a developer

Email Body: Reason for application

Privacy Policy

Welcome to Cherry Studio (hereinafter referred to as "this software" or "we"). We place a high value on protecting your privacy. This Privacy Policy explains how we handle and protect your personal information and data. Please read and understand this policy carefully before using this software:

I. Scope of Information We Collect

To optimize user experience and improve software quality, we may only collect the following anonymous, non-personal information:

• Software version information;

• Activity and usage frequency of software features;

• Anonymous crash and error log information.

The above information is completely anonymous, does not involve any personally identifiable data, and cannot be associated with your personal information.

II. Information We Do Not Collect

To maximize the protection of your privacy and security, we explicitly promise:

• We will not collect, save, transmit, or process the model service API Key information you enter into this software;

• We will not collect, save, transmit, or process any conversation data generated during your use of this software, including but not limited to chat content, command information, knowledge base information, vector data, and other custom content;

• We will not collect, save, transmit, or process any personally identifiable sensitive information.

III. Data Interaction Description

This software uses the API Key from a third-party model service provider that you apply for and configure yourself to perform model calls and conversation functions. The model services you use (e.g., large models, API interfaces, etc.) are provided by and are the sole responsibility of the third-party provider you choose. Cherry Studio only acts as a local tool to provide the interface calling function with third-party model services.

Therefore:

• All conversation data generated between you and the large model service is unrelated to Cherry Studio. We do not participate in data storage, nor do we conduct any form of data transmission or relay;

• You need to review and accept the privacy policies and related terms of the corresponding third-party model service providers. The privacy policies for these services can be found on the official websites of each provider.

IV. Third-Party Model Service Provider Privacy Policy Statement

You are solely responsible for any privacy risks that may arise from using third-party model service providers. For specific privacy policies, data security measures, and related liabilities, please refer to the relevant content on the official website of your chosen model service provider. We assume no responsibility for this.

V. Agreement Updates and Modifications

This policy may be adjusted appropriately with software version updates. Please check it regularly. When substantial changes to the policy occur, we will notify you in an appropriate manner.

VI. Contact Us

If you have any questions about the content of this policy or Cherry Studio's privacy protection measures, please feel free to contact us.

Thank you for choosing and trusting Cherry Studio. We will continue to provide you with a secure and reliable product experience.

Knowledge Base

For usage of the knowledge base, refer to the Knowledge Base Tutorial in the advanced tutorials.

Quick Assistant

Quick Assistant is a convenient tool provided by Cherry Studio that allows you to quickly access AI functions in any application, enabling instant questioning, translation, summarization, and explanation.

Enable Quick Assistant

  1. Open Settings: Navigate to Settings -> Shortcuts -> Quick Assistant.

  2. Enable the Switch: Find and turn on the switch for Quick Assistant.

  3. Set Shortcut (Optional):

    • The default shortcut for Windows is Ctrl + E.

    • The default shortcut for macOS is ⌘ + E.

    • You can customize the shortcut here to avoid conflicts or to better suit your usage habits.

Using Quick Assistant

  1. Invoke: In any application, press your set shortcut (or the default one) to open the Quick Assistant.

  2. Interact: In the Quick Assistant window, you can perform the following actions directly:

    • Quick Question: Ask the AI any question.

    • Text Translation: Enter the text you need to translate.

    • Content Summarization: Input long text for a summary.

    • Explanation: Enter concepts or terms that need clarification.

  3. Close: Press the ESC key or click anywhere outside the Quick Assistant window to close it.

The model used by the Quick Assistant is the Global Default Conversation Model.

Tips & Tricks

  • Shortcut Conflicts: If the default shortcut conflicts with other applications, please modify it.

  • Explore More Features: In addition to the functions mentioned in the documentation, the Quick Assistant may support other operations, such as code generation, style conversion, etc. It is recommended that you continue to explore during use.

  • Feedback & Improvement: If you encounter any problems or have any suggestions for improvement during use, please provide feedback to the Cherry Studio team in a timely manner.

macOS Installation Tutorial

  1. First, go to the official website's download page to download the Mac version, or click the direct link below.

Please make sure to download the correct chip version for your Mac.

If you don't know which chip version your Mac uses:

  • Click the Apple menu in the menu bar at the top-left corner of your Mac.

  • Click "About This Mac" in the dropdown menu.

  • Check the processor information in the pop-up window.

If it's an Intel chip, download the Intel version installer.

If it's an Apple M* chip, download the Apple chip installer.

  2. After the download is complete, open the downloaded installer.

  3. Drag the Cherry Studio icon into the Applications folder to install.

Go to Launchpad, find the Cherry Studio icon, and click it. If the Cherry Studio main interface opens, the installation is successful.

NewAPI

  • Log in and open the token page

  • Click "Add Token"

  • Enter a token name and click "Submit" (other settings can be configured as needed).

  • Open the provider settings in CherryStudio and click Add at the bottom of the provider list.

  • Enter a memo name, select OpenAI as the provider, and click OK.

  • Paste the key you just copied.

  • Go back to the page where you obtained the API Key and copy the base URL from your browser's address bar. For example, you only need to copy https://xxx.xxx.com; the "/" and everything that follows are not needed.

  • When the address is an IP + port, enter http://IP:port, for example: http://127.0.0.1:3000

  • Strictly distinguish between http and https. If SSL is not enabled, do not use https.

  • Add models (click "Manage" to fetch them automatically or enter them manually), then enable the switch in the top-right corner to start using them.

MCP Environment Setup

Using MCP in Cherry Studio

The following uses the fetch feature as an example to demonstrate how to use MCP in Cherry Studio. You can find more details in the documentation.

Preparation: Install uv, bun

Cherry Studio currently only uses the built-in uv and bun, and will not reuse any existing installations of uv and bun on the system.

In Settings → MCP Server, click the Install button to download and install them automatically. Because the downloads come directly from GitHub, they can be slow and have a relatively high chance of failing. Whether the installation succeeded depends on whether the files exist in the folder mentioned below.

Executable Installation Directory:

Windows: C:\Users\YourUsername\.cherrystudio\bin

macOS, Linux: ~/.cherrystudio/bin

If the installation fails:

You can create a symbolic link (soft link) from the corresponding system command to this directory. If the directory does not exist, you need to create it manually. Alternatively, you can manually download the executable files and place them in this directory:

Bun: https://github.com/oven-sh/bun/releases

UV: https://github.com/astral-sh/uv/releases
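On macOS or Linux, for example, a minimal sketch of the symbolic-link approach could look like the commands below, assuming uv, uvx, and bun are already installed and on your PATH; the exact set of binaries Cherry Studio expects may differ, so check the bin directory after a successful automatic install if unsure.

# create the directory Cherry Studio looks in, then link the system binaries into it
mkdir -p ~/.cherrystudio/bin
ln -s "$(which uv)" ~/.cherrystudio/bin/uv
ln -s "$(which uvx)" ~/.cherrystudio/bin/uvx
ln -s "$(which bun)" ~/.cherrystudio/bin/bun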

Configure and Use MCP

  1. Open Cherry Studio settings.

  2. Find the MCP Server option.

  3. Click Add Server.

  4. Fill in the relevant parameters for the MCP Server (reference link). The content you may need to fill in includes:

    • Name: Customize a name, for example, fetch-server

    • Type: Select STDIO

    • Command: Fill in uvx

    • Arguments: Fill in mcp-server-fetch

    • (There may be other parameters, depending on the specific Server)

  5. Click Save.

After completing the above configuration, Cherry Studio will automatically download the required MCP Server - fetch server. Once the download is complete, we can start using it! Note: If the mcp-server-fetch configuration is unsuccessful, you can try restarting your computer.
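Written out in JSON, and following the same field layout as the other configuration examples in this documentation, the server added above might look roughly like this sketch (fetch-server is just the custom name chosen in step 4, not a required identifier):

"fetch-server": {
  "name": "fetch-server",
  "isActive": true,
  "command": "uvx",
  "args": [
    "mcp-server-fetch"
  ]
}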

Enable MCP Service in the Chat Box

  • Successfully added an MCP server in the MCP Server settings

Usage Demonstration

By integrating MCP's fetch feature, Cherry Studio can better understand the user's query intent, retrieve relevant information from the web, and provide more accurate and comprehensive answers.

Business Cooperation

Contact Person: Mr. Wang

📮:[email protected]

📱:18954281942 (Not a customer service number)

For usage inquiries, you can join our user communication group at the bottom of the official website homepage, or email [email protected]

Or submit issues at: https://github.com/CherryHQ/cherry-studio/issues

If you need more guidance, you can join our Knowledge Planet

Commercial license details: https://docs.cherry-ai.com/contact-us/questions/cherrystudio-xu-ke-xie-yi

Windows Installation Tutorial

Open the Official Website

Note: Cherry Studio cannot be installed on Windows 7.

Click download and select the appropriate version

Wait for the Download to Complete

If the browser prompts that the file is not trusted, choose to keep it.

Choose Keep → Trust Cherry-Studio

Open the File

Install

SiYuan Note Configuration Tutorial

Supports exporting topics and messages to SiYuan Note.

Step 1

Open SiYuan Note and create a new notebook.

Step 2

Open the notebook settings and copy the Notebook ID.

Step 3

Paste the copied Notebook ID into the Cherry Studio settings.

Step 4

Enter the SiYuan Note address.

  • Local: usually http://127.0.0.1:6806

  • Self-hosted: your domain, e.g., http://note.domain.com

Step 5

Copy the SiYuan Note API Token.

Paste it into the Cherry Studio settings and check the connection.

Step 6

Congratulations, the SiYuan Note configuration is complete ✅ You can now export content from Cherry Studio to your SiYuan Note.

Configure Dify Knowledge Base

The Dify Knowledge Base MCP requires upgrading Cherry Studio to v1.2.9 or higher.

Add Dify Knowledge Base MCP Server

  1. Open Search MCP.

  2. Add the dify-knowledge server.

Configure Dify Knowledge Base

Parameters and environment variables need to be configured

  1. The Dify knowledge base key can be obtained as follows

Using the Dify Knowledge Base MCP

GitHub Copilot

To use GitHub Copilot, you first need a GitHub account and a subscription to the GitHub Copilot service. A free version subscription is also acceptable, but the free version does not support the latest Claude 3.7 model. For details, please refer to the official GitHub Copilot website.

Get Device Code

Click "Login with GitHub" to get the Device Code and copy it.

Enter the Device Code in the browser and authorize

After successfully obtaining the Device Code, click the link to open your browser. Log in to your GitHub account in the browser, enter the Device Code, and authorize.

After successful authorization, return to Cherry Studio and click "Connect to GitHub". Upon success, your GitHub username and avatar will be displayed.

Click "Manage" to get the model list

Click the "Manage" button below, and it will automatically connect to the internet to fetch the list of currently supported models.

Frequently Asked Questions

Failed to get Device Code, please retry

First, ensure your network connection is stable. Requests are currently built with Axios, which does not support SOCKS proxies, so please use a system proxy or an HTTP proxy, or do not set a proxy within CherryStudio and use a global proxy instead. This helps avoid failures when obtaining the Device Code.

Infini-AI

Does this sound familiar? You have 26 insightful articles saved in your WeChat Favorites that you never open again, more than 10 files scattered across a "study materials" folder on your computer, and when you try to find a theory you read six months ago, you can only remember a few keywords. When the daily amount of information exceeds your brain's processing limit, 90% of valuable knowledge is forgotten within 72 hours. Now, by building a personal knowledge base with the Infini-AI Large Model Service Platform API + Cherry Studio, you can transform those dust-gathering WeChat articles and fragmented course content into structured knowledge for precise retrieval.

I. Building a Personal Knowledge Base

1. Infini-AI API Service: The "Thinking Hub" of Your Knowledge Base, Easy-to-Use and Stable

As the "thinking hub" of the knowledge base, the Infini-AI Large Model Service Platform offers model versions like the full-power DeepSeek R1, providing stable API services. Currently, it's free to use with no barriers after registration. It supports mainstream embedding models like bge and jina for building knowledge bases. The platform also continuously updates with the latest, most powerful, and stable open-source model services, including various modalities such as images, videos, and voice.

2. Cherry Studio: Build a Knowledge Base with Zero Code

Cherry Studio is an easy-to-use AI tool. Compared to the 1-2 month deployment cycle required for RAG knowledge base development, this tool's advantage is its support for zero-code operation. You can import multiple formats like Markdown/PDF/webpages with one click. A 40MB file can be parsed in 1 minute. Additionally, you can add local computer folders, article URLs from WeChat Favorites, and course notes.

II. Build Your Exclusive Knowledge Butler in 3 Steps

Step 1: Basic Preparation

  1. Visit the official Cherry Studio website to download the appropriate version (https://cherry-ai.com/)

  2. Register an account: Log in to the Infini-AI Large Model Service Platform (https://cloud.infini-ai.com/genstudio/model?cherrystudio)

  • Get an API key: In the "Model Square," select deepseek-r1, click Create to get the API key, and copy the model name.

Step 2: Open Cherry Studio settings, select Infini-AI in the Model Service, fill in the API Key, and enable the Infini-AI model service.

After completing the steps above, you can use Infini-AI's API service in Cherry Studio by selecting the desired large model during interaction. For convenience, you can also set a "Default Model" here.

Step 3: Add a Knowledge Base

Select any version of the bge series or jina series embedding models from the Infini-AI Large Model Service Platform.

III. Real User Scenario Test

  • After importing study materials, enter "Summarize the core formula derivations in Chapter 3 of 'Machine Learning'"

Generated result shown below

SiliconFlow

1. Configure SiliconCloud's Model Service

1.1 Click on Settings in the bottom-left corner and select 【SiliconFlow】 under Model Service

1.2 Click the link to get the SiliconCloud API key

  1. Log in to SiliconCloud (if you haven't registered, an account will be automatically created on your first login)

  2. Visit API Keys to create a new key or copy an existing one

1.3 Click Manage to add a model

2. Using the Model Service

  1. Click the "Chat" button in the left menu bar

  2. Enter text in the input box to start chatting

  3. You can switch models by selecting the model name in the top menu

Knowledge Base Data

All data added to the Cherry Studio knowledge base is stored locally. During the addition process, a copy of the document will be placed in the Cherry Studio data storage directory.

Vector Database: https://turso.tech/libsql

After a document is added to the Cherry Studio knowledge base, the file will be split into several chunks, and then these chunks will be processed by an embedding model.

When using a large model for Q&A, text chunks related to the question will be retrieved and sent to the large language model for processing.

If you have data privacy requirements, it is recommended to use a local embedding database and a local large language model.

Automatic MCP Installation

Automatic MCP installation requires upgrading Cherry Studio to v1.1.18 or a higher version.

Feature Introduction

In addition to manual installation, Cherry Studio has a built-in tool, @mcpmarket/mcp-auto-install, which provides a more convenient way to install MCP servers. You just need to input the corresponding command in a large model conversation that supports MCP services.

Beta Phase Reminder:

  • @mcpmarket/mcp-auto-install is still in its beta phase.

  • The effectiveness depends on the "intelligence" of the large model. Some configurations will be added automatically, while others may still require manual changes to certain parameters in the MCP settings.

  • Currently, the search source is @modelcontextprotocol, which you can configure yourself (explained below).

Usage Instructions

For example, you can enter:

Install a filesystem mcp server for me

The system will automatically recognize your request and complete the installation via @mcpmarket/mcp-auto-install. This tool supports various types of MCP servers, including but not limited to:

  • filesystem

  • fetch

  • sqlite

  • and more...

The MCP_PACKAGE_SCOPES environment variable allows you to customize the search source for MCP services. The default value is @modelcontextprotocol, and it can be configured to point elsewhere.
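As a hedged sketch, overriding the search source would go in the server's env block along these lines, where @your-org is a purely hypothetical scope used for illustration:

"env": {
  "MCP_PACKAGE_SCOPES": "@your-org"
}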

Introduction to the @mcpmarket/mcp-auto-install Library

@mcpmarket/mcp-auto-install is an open-source npm package. You can view its detailed information and documentation on the official npm repository. @mcpmarket is the official collection of MCP services for Cherry Studio.

Default Configuration Reference:

// `axun-uUpaWEdMEMU8C61K` is the service ID, which can be customized
"axun-uUpaWEdMEMU8C61K": {
  "name": "mcp-auto-install",
  "description": "Automatically install MCP services (Beta version)",
  "isActive": false,
  "registryUrl": "https://registry.npmmirror.com",
  "command": "npx",
  "args": [
    "-y",
    "@mcpmarket/mcp-auto-install",
    "connect",
    "--json"
  ],
  "env": {
    "MCP_REGISTRY_PATH": "For details, see https://www.npmjs.com/package/@mcpmarket/mcp-auto-install"
  },
  "disabledTools": []
}

ByteDance (Doubao)

Get API Key

  • Log in to Volcano Engine and click API Key Management at the bottom of the sidebar.

  • Create an API Key.

  • After successful creation, click the eye icon next to the newly created API Key to reveal and copy it.

  • Open Cherry Studio's model service settings, find Volcano Engine, paste the copied API Key, and turn on the provider switch.

Activate and Add Models

  • In Activation Management at the bottom of the Ark console sidebar, activate the models you need, such as the Doubao series and DeepSeek.

  • In the Model List Document, find the Model ID corresponding to the desired model.

  • In Cherry Studio, click Add and paste the previously obtained Model ID into the Model ID text box.

  • Follow this process to add models one by one.

API Address

There are two ways to write the API address:

  • The first is the client default: https://ark.cn-beijing.volces.com/api/v3/

  • The second way is: https://ark.cn-beijing.volces.com/api/v3/chat/completions#

There is no difference between the two formats; you can keep the default without modification. For the difference between addresses ending with / and #, refer to the API Address section in the provider settings documentation.
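Applying the / and # rules described in the provider settings documentation, and assuming the trailing # simply means "use the address as entered", both forms end up sending requests to the same URL:

https://ark.cn-beijing.volces.com/api/v3/                  → https://ark.cn-beijing.volces.com/api/v3/chat/completions
https://ark.cn-beijing.volces.com/api/v3/chat/completions# → https://ark.cn-beijing.volces.com/api/v3/chat/completions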

Notion Configuration Tutorial

Cherry Studio supports importing topics into a Notion database.

Step 1

Go to the Notion Integrations website to create a new integration.

Step 2

Create an integration.

Name: Cherry Studio

Type: Select the first one

Icon: You can save this image

Step 3

Copy the secret token and paste it into the Cherry Studio settings.

Step 4

Go to Notion and create a new page. Select the database type, name it Cherry Studio, and follow the illustration to connect.

Step 5

If your Notion database URL looks like this:

https://www.notion.so/<long_hash_1>?v=<long_hash_2>

Then the Notion database ID is the <long_hash_1> part.
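For example, with a purely hypothetical URL (the hashes below are placeholders, not real IDs):

https://www.notion.so/a1b2c3d4e5f67890a1b2c3d4e5f67890?v=0123456789abcdef0123456789abcdef
→ Notion database ID: a1b2c3d4e5f67890a1b2c3d4e5f67890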

Step 6

Fill in the Page Title Field Name:

If your web page is in English, enter Name. If your web page is in Chinese, enter 名称.

Step 7

Congratulations, your Notion configuration is complete ✅ You can now export content from Cherry Studio to your Notion database.

Add ModelScope MCP Server

ModelScope MCP Server requires upgrading Cherry Studio to v1.2.9 or higher.

In version v1.2.9, Cherry Studio officially partnered with ModelScope, significantly simplifying the process of adding MCP servers. This helps avoid configuration errors and allows you to discover a vast number of MCP servers within the ModelScope community. Follow the steps below to learn how to sync ModelScope's MCP servers in Cherry Studio.

Steps

Sync Entry Point:

Click on MCP Server Settings in the settings, and select Sync Server.

Discover MCP Services:

Select ModelScope and browse to discover MCP services.

View MCP Server Details

Register and log in to ModelScope, and view the MCP service details.

Connect to Server

In the MCP service details, select "Connect Service".

Apply for and Paste API Token

Click "Get API Token" in Cherry Studio, which will redirect you to the official ModelScope website. Copy the API token and paste it back into Cherry Studio.

Successful Sync

In the MCP server list in Cherry Studio, you can see the MCP service connected from ModelScope and call it in conversations.

Incremental Update

For new MCP servers connected on the ModelScope webpage later, simply click Sync Server to add them incrementally.

By following the steps above, you have successfully learned how to easily sync MCP servers from ModelScope in Cherry Studio. The entire configuration process is not only greatly simplified, effectively avoiding the hassle and potential errors of manual configuration, but it also allows you to easily access the vast MCP server resources provided by the ModelScope community.

Start exploring and using these powerful MCP services to bring more convenience and possibilities to your Cherry Studio experience!


How to Ask Questions Effectively

Cherry Studio is a free and open-source project. As the project grows, the workload for the project team has also increased. To reduce communication costs and resolve your issues quickly and efficiently, we hope that you will follow the steps and methods below to handle problems before asking questions. This will allow the project team more time to focus on project maintenance and development. Thank you for your cooperation!

1. Check and Search the Documentation

Most basic questions can be solved by carefully reading the documentation.

  • For questions about the software's features and usage, you can check the Feature Introduction documentation.

  • Frequently asked questions are collected on the FAQ page. You can check there first for solutions.

  • For more complex issues, you can try to solve them by searching the documentation directly or asking your question in the search box.

  • Be sure to carefully read the content in the hint boxes within each document, as this can help you avoid many problems.

  • Check or search the GitHub Issues page for similar problems and solutions.

2. Search Online, Ask an AI

For issues unrelated to the client's functionality (such as model errors, unexpected responses, or parameter settings), it is recommended to first search online for relevant solutions or describe the error message and problem to an AI to find a solution.

3. Ask in Official Communities or Create a GitHub Issue

If the first two steps did not provide an answer or solve your problem, you can describe your issue in detail and seek help in our official Telegram channel, Discord channel, or QQ group (click to join).

  1. If it's a model error, please provide a complete screenshot of the interface and the console error message. You can censor sensitive information, but the model name, parameter settings, and error content must be visible in the screenshot. To learn how to view console error messages, click here.

  2. If it's a software bug, please provide a specific error description and detailed steps to help developers debug and fix it. If it's an intermittent issue that cannot be reproduced, please describe the relevant scenarios, context, and configuration parameters when the problem occurred in as much detail as possible. In addition, you also need to include platform information (Windows, Mac, or Linux) and the software version number in your problem description.

Requesting Documentation or Providing Suggestions

You can contact @Wangmouuu on our Telegram channel or QQ (1355873789), or send an email to: [email protected].

Shortcut Key Settings

On this interface, you can enable (or disable) and set shortcut keys for some functions. Please follow the instructions on the interface for setup.

Model Provider Configuration

Knowledge Sharing

PPIO Cloud

Connecting Cherry Studio to PPIO LLM API

Tutorial Overview

Cherry Studio is a multi-model desktop client that currently supports installation packages for Windows, Linux, and macOS. It aggregates mainstream LLM models and provides multi-scenario assistance. Users can improve their work efficiency through intelligent session management, open-source customization, and multi-themed interfaces.

Cherry Studio is now deeply integrated with the PPIO high-performance API channel—ensuring high-speed responses for DeepSeek-R1/V3 and 99.9% service availability through enterprise-grade computing power, bringing you a fast and smooth experience.

The tutorial below provides a complete integration plan (including API key configuration), allowing you to enable the advanced mode of "Cherry Studio Intelligent Scheduling + PPIO High-Performance API" in just 3 minutes.

1. Open CherryStudio and add "PPIO" as a model provider

First, go to the official website to download Cherry Studio: https://cherry-ai.com/download (if you can't access it, use the following Quark Drive link to download the version you need: https://pan.quark.cn/s/c8533a1ec63e#/list/share).

(1) First, click on Settings in the bottom left corner, set the custom provider name to: PPIO, and click "OK"

(2) Go to PPIO Compute Cloud API Key Management, click on your [User Avatar] — [API Key Management] to enter the console

Click the [+ Create] button to create a new API key. Give it a custom name. The generated key is only displayed at the time of creation. Be sure to copy and save it to a document to avoid affecting future use.

(3) In CherryStudio, enter the API key. Click Settings, select [PPIO Cloud], enter the API key generated on the official website, and finally click [Check].

(4) Select the model: For example, deepseek/deepseek-r1/community. If you need to switch to another model, you can do so directly.

The community versions of DeepSeek R1 and V3 are for trial purposes. They are full-parameter models with no difference in stability or performance. For high-volume usage, you must top up your account and switch to a non-community version.

​2. Model Usage Configuration

(1) Click [Check]. Once it shows "Connection successful," you can start using it normally.

(2) Finally, click on [@] and select the DeepSeek R1 model you just added under the PPIO provider to start chatting successfully~

[Some materials sourced from: 陈恩]

​3. PPIO×Cherry Studio Video Tutorial

If you prefer visual learning, we have prepared a video tutorial on Bilibili. This step-by-step guide will help you quickly master the configuration of "PPIO API + Cherry Studio". Click the link below to go directly to the video and start your smooth development experience → 【Still frustrated by DeepSeek's endless loading?】PPIO Cloud + Full-power DeepSeek =? No more congestion, take off now!

[Video material sourced from: sola]

Web Search Mode

How to Use Web Search Mode in Cherry Studio

Examples of scenarios that require web access:

  • Time-sensitive information: For example, the price of gold futures today/this week/just now.

  • Real-time data: For example, weather, exchange rates, and other dynamic values.

  • Emerging knowledge: For example, new things, new concepts, new technologies, etc...

1. How to Enable Web Search

In the Cherry Studio question window, click the [Little Globe] icon to enable web search.

2. Special Note: There Are Two Web Search Modes

Mode 1: The model provider's large model has a built-in web search function

In this case, after enabling web search, you can use the service directly. It's very simple.

You can quickly determine if a model supports web search by checking for a small globe icon next to the model's name at the top of the chat interface.

On the model management page, this method also allows you to quickly distinguish which models support web search and which do not.

Cherry Studio currently supports the following model providers with web search capabilities:

  • Google Gemini

  • OpenRouter (all models support web search)

  • Tencent Hunyuan

  • Zhipu AI

  • Alibaba Cloud Bailian, etc.

Special Note:

There is a special case where a model can access the web even without the small globe icon, as explained in the tutorial below.


Mode 2: The model does not have a built-in web search function; use the Tavily service to enable it

When we use a large model without a built-in web search function (no small globe icon next to its name), but we need it to retrieve real-time information for processing, we need to use the Tavily web search service.

When using the Tavily service for the first time, a pop-up will prompt you to configure some settings. Please follow the instructions—it's very simple!

After clicking to get the API key, you will be automatically redirected to the official Tavily website's login/registration page. After registering and logging in, create an API key, then copy the key and paste it into Cherry Studio.

If you don't know how to register, refer to the Tavily web search login and registration tutorial in the same directory as this document.

Tavily registration reference document:

The interface below indicates that the registration was successful.

Let's try again to see the effect. The result shows that the web search is now working correctly, and the number of search results is our default setting: 5.

Note: Tavily has a monthly free usage limit. You will need to pay if you exceed it~~

PS: If you find any errors, please feel free to contact us.

Huawei Cloud

This document was translated from Chinese by AI and has not yet been reviewed.

Huawei Cloud

  1. Go to Huawei Cloud to create an account and log in.

  2. Click this link to enter the MaaS console.

  3. Authorization

  1. Click on Authentication Management in the sidebar, create an API Key (secret key), and copy it.

Then, create a new provider in CherryStudio.

After creation, fill in the secret key.

  4. Click on Model Deployment in the sidebar and claim all models.

  5. Click on Invoke.

Copy the address from ① and paste it into the Provider Address field in CherryStudio, and add a "#" symbol at the end (don't forget the "#").

Why add a "#" symbol? See here

Of course, you can also skip reading that and just follow the tutorial;

Alternatively, you can delete v1/chat/completions from the end of the address instead of adding "#". Either method works if you understand what you are doing; if you don't, just follow the tutorial exactly.
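For example, if the Invoke page shows an address ending in v1/chat/completions (the endpoint below is made up for illustration), either of the following forms works in the Provider Address field:

https://example-maas-endpoint.example.com/v1/chat/completions#
https://example-maas-endpoint.example.com/

The trailing "#" generally tells CherryStudio to use the address exactly as written instead of appending its default path; see the link above for the details.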

Then, copy the model name from ②, and in CherryStudio, click the "+Add" button to create a new model.

Enter the model name. Do not add anything extra or include quotes. Copy it exactly as it is written in the example.

Click the Add Model button to finish adding.

In Huawei Cloud, since the address for each model is different, you need to create a new provider for each model. Just repeat the steps above.

Knowledge Base Tutorial

This document was translated from Chinese by AI and has not yet been reviewed.

Knowledge Base Tutorial

In version 0.9.1, CherryStudio introduced the long-awaited knowledge base feature.

Below, we will provide detailed instructions for using CherryStudio step-by-step.

Add Embedding Model

  1. In the Model Management service, find a model. You can click "Embedding Model" to filter quickly;

  2. Find the model you need and add it to "My Models".

Create a Knowledge Base

  1. Knowledge Base Entry: On the left toolbar of CherryStudio, click the knowledge base icon to enter the management page;

  2. Add Knowledge Base: Click "Add" to start creating a knowledge base;

  3. Naming: Enter a name for the knowledge base and add an embedding model, for example, bge-m3, to complete the creation.

Add Files and Vectorize

  1. Add Files: Click the "Add Files" button to open the file selector;

  2. Select Files: Choose supported file formats like pdf, docx, pptx, xlsx, txt, md, mdx, etc., and open them;

  3. Vectorization: The system will automatically perform vectorization. When it shows "Completed" (green ✓), it means vectorization is finished.

Add Data from Multiple Sources

CherryStudio supports adding data in multiple ways:

  1. Folder Directory: You can add an entire folder directory. Files in supported formats within this directory will be automatically vectorized;

  2. URL Link: Supports website URLs, such as https://docs.siliconflow.cn/introduction;

  3. Sitemap: Supports XML-formatted sitemaps, such as https://docs.siliconflow.cn/sitemap.xml;

  4. Plain Text Note: Supports inputting custom content as plain text.

Tip:

  1. Illustrations in documents imported into the knowledge base are not yet supported for vector conversion and need to be manually converted to text;

  2. Using a website URL as a knowledge base source may not always be successful. Some websites have strict anti-scraping mechanisms (or require login, authorization, etc.), so this method may not retrieve accurate content. It is recommended to test by searching after creation.

  3. Most websites provide a sitemap, like CherryStudio's sitemap. Generally, you can get related information by adding /sitemap.xml after the root address (URL) of the website, e.g., aaa.com/sitemap.xml.

  4. If a website does not provide a sitemap or the URLs are complex, you can create your own sitemap.xml file. The file must be provided as a direct link that is publicly accessible on the internet; local file links will not be recognized. A minimal example is shown after this list.

  1. You can ask an AI to generate a sitemap file or write an HTML sitemap generator tool;

  2. Direct links can be generated using methods like OSS direct links or cloud storage direct links. If you don't have a ready-made tool, you can go to the ocoolAI official website, log in, and use the free file upload tool in the top bar to generate a direct link.
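If you write the sitemap yourself, a minimal sitemap.xml (with placeholder URLs) looks like this:

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.com/docs/page-1</loc>
  </url>
  <url>
    <loc>https://example.com/docs/page-2</loc>
  </url>
</urlset>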

Search the Knowledge Base

Once files and other materials have been vectorized, you can start querying:

  1. Click the "Search Knowledge Base" button at the bottom of the page;

  2. Enter your query;

  3. The search results will be displayed;

  4. And the match score for each result will be shown.

Referencing the Knowledge Base to Generate Replies in a Conversation

  1. Create a new topic. In the conversation toolbar, click on the knowledge base icon. A list of created knowledge bases will expand. Select the one you want to reference;

  2. Enter and send your question. The model will return an answer generated from the search results;

  3. Additionally, the referenced data sources will be attached below the answer, allowing for quick access to the source files.

Data Settings

This document was translated from Chinese by AI and has not yet been reviewed.

Data Settings

This interface allows you to perform operations such as cloud and local backup of client data, querying the local data directory, and clearing the cache.

Data Backup

Currently, data backup only supports WebDAV. You can choose a service that supports WebDAV for cloud backup.

Taking Jianguoyun as an Example

  1. Log in to Jianguoyun, click on the username in the upper right corner, and select "Account Info";

  2. Select "Security Options" and click "Add Application";

  3. Enter the application name and generate a random password;

  4. Copy and save the password;

  5. Obtain the server address, account, and password;

  6. In Cherry Studio Settings -> Data Settings, fill in the WebDAV information;

  7. Choose to back up or restore data, and you can set the automatic backup time interval.

Generally, the easiest WebDAV services to get started with are cloud storage providers:

  • Jianguoyun

  • 123Pan (Requires membership)

  • Aliyun Drive (Requires purchase)

  • Box (Free space is 10GB, single file size limit is 250MB.)

  • Dropbox (Dropbox offers 2GB for free, and you can get up to 16GB by inviting friends.)

  • TeraCloud (Free space is 10GB, and an additional 5GB can be obtained through referrals.)

  • Yandex Disk (Provides 10GB of capacity for free users.)

Next are some services that you need to deploy yourself:

  • Alist

  • Cloudreve

  • sharelist

Tavily Web Search Login & Registration Tutorial

How do I register for Tavily?

This document was translated from Chinese by AI and has not yet been reviewed.

Tavily Web Login and Registration Tutorial

1. Tavily Official Website

https://app.tavily.com/home

Some users may experience slow access. If you have a proxy, you can use it.

2. Detailed Tavily Registration Steps

Visit the official website mentioned above, or go to Cherry Studio -> Settings -> Web Search and click "Get API Key". This will redirect you to the Tavily login/registration page.

If this is your first time, you need to Sign up for an account before you can Log in. Note that the page defaults to the login page.

  1. Click to sign up for an account to enter the following interface. Enter your commonly used email address, or use your Google/GitHub account. Then, enter your password in the next step. This is a standard procedure.

  2. 🚨🚨🚨[Crucial Step] After successful registration, there will be a dynamic verification code step. You need to scan a QR code to generate a one-time code to continue.

It's very simple. You have two options at this point.

  1. Download an authenticator app, like Microsoft Authenticator. [Slightly more complicated]

  2. Use the WeChat Mini Program: 腾讯身份验证器 (Tencent Authenticator). [Simple, anyone can do it, recommended]

  1. Open the WeChat Mini Program and search for: 腾讯身份验证器

3. 🎉Registration Successful🎉

After completing the steps above, you will see the interface below, which means your registration was successful. Copy the key to Cherry Studio, and you can start using it happily.

Custom Provider

This document was translated from Chinese by AI and has not yet been reviewed.

Custom Providers

Cherry Studio not only integrates mainstream AI model services but also gives you powerful customization capabilities. With the Custom AI Provider feature, you can easily connect to any AI model you need.

Why Do You Need Custom AI Providers?

  • Flexibility: No longer limited to the preset list of providers, you are free to choose the AI model that best suits your needs.

  • Diversity: Experiment with AI models from various platforms to discover their unique advantages.

  • Controllability: Directly manage your API keys and access addresses to ensure security and privacy.

  • Customization: Integrate privately deployed models to meet the needs of specific business scenarios.

How to Add a Custom AI Provider?

You can add your custom AI provider in Cherry Studio in just a few simple steps:

  1. Open Settings: In the left navigation bar of the Cherry Studio interface, click "Settings" (the gear icon).

  2. Go to Model Services: On the settings page, select the "Model Services" tab.

  3. Add Provider: On the "Model Services" page, you will see a list of existing providers. Click the "+ Add" button below the list to open the "Add Provider" pop-up window.

  4. Fill in Information: In the pop-up window, you need to fill in the following information:

    • Provider Name: Give your custom provider an easily recognizable name (e.g., MyCustomOpenAI).

    • Provider Type: Select your provider type from the drop-down list. Currently supported types are:

      • OpenAI

      • Gemini

      • Anthropic

      • Azure OpenAI

  5. Save Configuration: After filling in the information, click the "Add" button to save your configuration.

Configuring a Custom AI Provider

After adding a provider, you need to find it in the list and configure its details:

  1. Enable Status: On the far right of the custom provider list, there is an enable switch. Turning it on enables this custom service.

  2. API Key:

    • Fill in the API Key provided by your AI service provider.

    • Click the "Check" button on the right to verify the key's validity.

  3. API Address:

    • Fill in the API access address (Base URL) for the AI service.

    • Be sure to refer to the official documentation provided by your AI service provider to get the correct API address.

  4. Model Management:

    • Click the "+ Add" button to manually add the model IDs you want to use under this provider, such as gpt-3.5-turbo, gemini-pro, etc.

    • If you are unsure of the specific model names, please refer to the official documentation provided by your AI service provider.

    • Click the "Manage" button to edit or delete the models that have been added.

Start Using

After completing the above configuration, you can select your custom AI provider and model in the Cherry Studio chat interface and start conversing with the AI!
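If you would like to sanity-check an OpenAI-compatible provider outside Cherry Studio first, a minimal sketch (the endpoint and key below are placeholders) is:

# Minimal sketch: list the models exposed by an OpenAI-compatible provider.
# base_url and api_key are placeholders; use the values from your provider's documentation.
import requests

base_url = "https://api.example-provider.com/v1"   # placeholder
api_key = "sk-..."                                  # placeholder

resp = requests.get(
    f"{base_url}/models",
    headers={"Authorization": f"Bearer {api_key}"},
    timeout=30,
)
resp.raise_for_status()
for model in resp.json().get("data", []):
    print(model.get("id"))

If the endpoint and key are correct, this prints the model IDs you can then add under Model Management.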

Using vLLM as a Custom AI Provider

vLLM is a fast and easy-to-use LLM inference library, similar to Ollama. Here are the steps to integrate vLLM into Cherry Studio:

  1. Start the vLLM Service: Start the service using the OpenAI-compatible interface provided by vLLM. There are two main ways to do this:

    • Start using vllm.entrypoints.openai.api_server

    • Start using uvicorn

Ensure the service starts successfully and listens on the default port 8000. Of course, you can also specify the port number for the vLLM service using the --port parameter.

  1. Add vLLM Provider in Cherry Studio:

    • Follow the steps described earlier to add a new custom AI provider in Cherry Studio.

    • Provider Name: vLLM

    • Provider Type: Select OpenAI.

  2. Configure vLLM Provider:

    • API Key: Since vLLM does not require an API key, you can leave this field blank or fill in any content.

    • API Address: Fill in the API address of the vLLM service. By default, the address is: http://localhost:8000/ (if you use a different port, please modify it accordingly).

    • Model Management: Add the model name you loaded in vLLM. If you started the server with python -m vllm.entrypoints.openai.api_server --model gpt2, you should enter gpt2 here.

  3. Start Chatting: Now, you can select the vLLM provider and the gpt2 model in Cherry Studio and start chatting with the vLLM-powered LLM!
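Before (or after) wiring vLLM into Cherry Studio, you can confirm the service is reachable by calling its OpenAI-compatible endpoint directly. A minimal sketch, assuming the server was started with the gpt2 example on the default port 8000:

# Minimal sketch: send one completion request to a locally running vLLM server.
# Assumes the OpenAI-compatible server is listening on the default port 8000.
import requests

resp = requests.post(
    "http://localhost:8000/v1/completions",
    json={
        "model": "gpt2",               # the model name vLLM was started with
        "prompt": "Hello, my name is",
        "max_tokens": 16,
    },
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["text"])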

Tips and Tricks

  • Read the Documentation Carefully: Before adding a custom provider, be sure to carefully read the official documentation of the AI service provider you are using to understand key information such as API keys, access addresses, and model names.

  • Check the API Key: Use the "Check" button to quickly verify the validity of the API key to avoid issues caused by an incorrect key.

  • Pay Attention to the API Address: The API address may vary for different AI service providers and models. Be sure to fill in the correct address.

  • Add Models On-Demand: Please only add the models you will actually use to avoid adding too many unnecessary models.

Change Storage Location

This document was translated from Chinese by AI and has not yet been reviewed.

Default Storage Location

Cherry Studio's data storage follows system specifications, and data is automatically placed in the user's directory. The specific directory locations are as follows:

macOS: /Users/username/Library/Application Support/CherryStudioDev

Windows: C:\Users\username\AppData\Roaming\CherryStudio

Linux: /home/username/.config/CherryStudio

You can also check the location here:

Change Storage Location (for reference)

Method 1:

This can be achieved by creating a symbolic link. Exit the application, move the data to your desired location, and then create a link at the original location pointing to the new location.

Method 2: Based on the characteristics of Electron applications, you can modify the storage location by configuring launch parameters.

Add the --user-data-dir launch parameter, e.g.: Cherry-Studio-*-x64-portable.exe --user-data-dir="%user_data_dir%"

Example:

init_cherry_studio.bat (encoding: ANSI)

Directory structure of user-data-dir after initialization:

Custom CSS

This document was translated from Chinese by AI and has not yet been reviewed.

Custom CSS

With custom CSS, you can modify the software's appearance to better suit your preferences, like this:

Built-in Variables

Related Recommendations

Knowledge Sharing

This document was translated from Chinese by AI and has not yet been reviewed.

AI Concepts

What are tokens?

Tokens are the basic units that AI models use to process text. You can think of them as the smallest unit of "thought" for the model. They are not exactly equivalent to characters or words as we understand them, but rather a special way the model segments text.

1. Chinese Tokenization

  • A Chinese character is usually encoded as 1-2 tokens.

  • For example: "你好" ≈ 2-4 tokens

2. English Tokenization

  • Common words are usually 1 token.

  • Longer or less common words are broken down into multiple tokens.

  • For example:

    • "hello" = 1 token

    • "indescribable" = 4 tokens

3. Special Characters

  • Spaces, punctuation marks, etc., also consume tokens.

  • A newline character is usually 1 token.

The tokenizers of different service providers are not the same, and even the tokenizers of different models from the same provider can vary. This information is only intended to clarify the concept of a token.
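If you want to see tokenization in action, here is a minimal sketch using OpenAI's tiktoken library (other providers use different tokenizers, so their counts will differ):

# Minimal sketch: count tokens with OpenAI's tiktoken tokenizer.
# Other providers' tokenizers will produce different counts for the same text.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")  # encoding used by several OpenAI models

for text in ["hello", "indescribable", "你好", "Hello, world!"]:
    tokens = enc.encode(text)
    print(f"{text!r}: {len(tokens)} tokens -> {tokens}")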


What is a Tokenizer?

A Tokenizer is the tool an AI model uses to convert text into tokens. It determines how to split the input text into the smallest units that the model can understand.

Why do different models have different Tokenizers?

1. Different Training Data

  • Different corpora lead to different optimization directions.

  • Varying degrees of multilingual support.

  • Specialized optimizations for specific domains (e.g., medical, legal).

2. Different Tokenization Algorithms

  • BPE (Byte Pair Encoding) - OpenAI GPT series

  • WordPiece - Google BERT

  • SentencePiece - Suitable for multilingual scenarios

3. Different Optimization Goals

  • Some focus on compression efficiency.

  • Some focus on semantic preservation.

  • Some focus on processing speed.

Practical Impact

The same text may have a different number of tokens in different models. For example, the input "Hello, world!" is 4 tokens for GPT-3 but 3 tokens for BERT and Claude.


What is an Embedding Model?

Basic Concept: An embedding model is a technique that converts high-dimensional discrete data (text, images, etc.) into low-dimensional continuous vectors. This transformation allows machines to better understand and process complex data. Imagine it as simplifying a complex puzzle into a simple coordinate point that still retains the key features of the puzzle. In the large model ecosystem, it acts as a "translator," converting human-understandable information into a numerical form that AI can compute.

How it Works: Taking natural language processing as an example, an embedding model can map words to specific positions in a vector space. In this space, words with similar meanings will automatically cluster together. For example:

  • The vectors for "king" and "queen" will be very close.

  • Pet-related words like "cat" and "dog" will also be near each other.

  • Words with unrelated meanings, like "car" and "bread," will be far apart.
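A toy sketch of this idea (the 3-dimensional vectors below are invented purely for illustration; real embedding models output hundreds or thousands of dimensions):

# Toy sketch: semantically similar words end up with similar vectors.
# The 3-dimensional vectors are made up for illustration only.
import math

vectors = {
    "king":  [0.90, 0.80, 0.10],
    "queen": [0.88, 0.82, 0.12],
    "bread": [0.10, 0.05, 0.95],
}

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

print(cosine_similarity(vectors["king"], vectors["queen"]))  # close to 1
print(cosine_similarity(vectors["king"], vectors["bread"]))  # much smaller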

Main Application Scenarios:

  • Text analysis: document classification, sentiment analysis

  • Recommendation systems: personalized content recommendations

  • Image processing: similar image retrieval

  • Search engines: semantic search optimization

Core Advantages:

  1. Dimensionality Reduction: Simplifies complex data into easy-to-process vector form.

  2. Semantic Preservation: Retains key semantic information from the original data.

  3. Computational Efficiency: Significantly improves the training and inference efficiency of machine learning models.

Technical Value: Embedding models are fundamental components of modern AI systems. They provide high-quality data representations for machine learning tasks and are a key technology driving progress in fields like natural language processing and computer vision.


How Embedding Models Work in Knowledge Retrieval

Basic Workflow:

  1. Knowledge Base Preprocessing Stage

    • Split documents into appropriately sized chunks.

    • Use an embedding model to convert each chunk into a vector.

    • Store the vectors and the original text in a vector database.

  2. Query Processing Stage

    • Convert the user's question into a vector.

    • Retrieve similar content from the vector database.

    • Provide the retrieved relevant content to the LLM as context.
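A schematic sketch of this workflow (this is not Cherry Studio's actual implementation; embed() is a stand-in for a real embedding model and the vector "database" is just a Python list):

# Schematic sketch of knowledge-base retrieval, not Cherry Studio's real code.
# embed() stands in for a real embedding model; the vector "database" is a list.
import math

def embed(text):
    # Stand-in embedding: a character-frequency vector, for illustration only.
    vec = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            vec[ord(ch) - ord("a")] += 1.0
    return vec

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a)) or 1.0
    nb = math.sqrt(sum(y * y for y in b)) or 1.0
    return dot / (na * nb)

# 1. Preprocessing: split documents into chunks, embed each chunk, store the pairs.
chunks = ["Cherry Studio supports WebDAV backup.", "Ollama runs models locally."]
vector_db = [(embed(c), c) for c in chunks]

# 2. Query: embed the question, retrieve the most similar chunk,
#    then pass it to the LLM as context (the LLM call itself is omitted here).
question = "How do I back up my data?"
q_vec = embed(question)
best = max(vector_db, key=lambda item: cosine(q_vec, item[0]))
print("Most relevant chunk:", best[1])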


What is MCP (Model Context Protocol)?

MCP is an open-source protocol designed to provide contextual information to Large Language Models (LLMs) in a standardized way.

  • Analogy: You can think of MCP as the "USB drive" of the AI world. We know that a USB drive can store various files and be used directly after being plugged into a computer. Similarly, various "plugins" that provide context can be "plugged" into an MCP Server. An LLM can request these plugins from the MCP Server as needed to obtain richer contextual information and enhance its capabilities.

  • Comparison with Function Tools: Traditional Function Tools can also provide external functionalities for LLMs, but MCP is more like a higher-dimensional abstraction. A Function Tool is more of a tool for specific tasks, whereas MCP provides a more general, modular mechanism for acquiring context.

Core Advantages of MCP

  1. Standardization: MCP provides a unified interface and data format, allowing different LLMs and context providers to collaborate seamlessly.

  2. Modularity: MCP allows developers to break down contextual information into independent modules (plugins), making them easier to manage and reuse.

  3. Flexibility: LLMs can dynamically select the required context plugins based on their needs, enabling more intelligent and personalized interactions.

  4. Extensibility: MCP's design supports the future addition of more types of context plugins, offering limitless possibilities for expanding the capabilities of LLMs.


Knowledge Base Document Preprocessing

This document was translated from Chinese by AI and has not yet been reviewed.

Knowledge Base Document Preprocessing

Knowledge base document preprocessing requires upgrading Cherry Studio to v1.5.0 or higher.

Configure OCR Service Provider

After clicking 'Get API KEY', the application URL will open in your browser. Click 'Apply Now', fill out the form to get the API KEY, and then enter it into the API KEY field.

Configure Knowledge Base Document Preprocessing

Configure the created knowledge base as shown above to complete the knowledge base document preprocessing setup.

Upload Documents

You can check the knowledge base results by using the search in the upper right corner.

Use in Conversation

Knowledge Base Tips: When using a more capable model, you can change the knowledge base search mode to intent recognition. Intent recognition can describe your questions more accurately and broadly.

Enable Knowledge Base Intent Recognition

Authorization Steps (skip if already authorized)
  1. After entering the link page from step (2), follow the prompts to go to the authorization page (Click IAM Sub-user → New Delegation → Normal User).

  2. After clicking create, return to the link page from step (2).

  3. You will be prompted with 'Insufficient access permissions'. Click the "Click here" in the prompt.

  4. Append existing authorizations and confirm.

Note: This method is suitable for beginners. You don't need to read too much content, just click according to the prompts. If you can successfully authorize in one go using your own method, feel free to do so.

You can also synchronize data across multiple devices by following the process: Computer A → (Backup) → WebDAV → (Restore) → Computer B.

Recap of the verification steps shown in the screenshots: sign up for an account (many users get stuck at this step; don't panic); open the WeChat Mini Program and search for the authenticator; use it to scan the QR code shown on the Tavily page; enter the string of numbers it generates back on the Tavily page; when Tavily prompts you to copy a backup code to a safe place, it's best to follow the advice, although you might not use it often.

Install vLLM: follow the official vLLM documentation ().

For specific steps, please refer to:

For more theme variables, please refer to the source code:

Cherry Studio Theme Library:

Share some Chinese-style Cherry Studio theme skins:

VolcEngine Web Search Integration
Tavily Web Search Login & Registration Tutorial
pip install vllm # If you use pip
uv pip install vllm # If you use uv
python -m vllm.entrypoints.openai.api_server --model gpt2
vllm --model gpt2 --served-model-name gpt2
PS D:\CherryStudio> dir


    Directory: D:\CherryStudio


Mode                 LastWriteTime         Length Name
----                 -------------         ------ ----
d-----         2025/4/18     14:05                user-data-dir
-a----         2025/4/14     23:05       94987175 Cherry-Studio-1.2.4-x64-portable.exe
-a----         2025/4/18     14:05            701 init_cherry_studio.bat
@title CherryStudio Initialization
@echo off

set current_path_dir=%~dp0
@echo Current Path:%current_path_dir%
set user_data_dir=%current_path_dir%user-data-dir
@echo CherryStudio Data Path:%user_data_dir%

@echo Searching for Cherry-Studio-*-portable.exe in the current path
setlocal enabledelayedexpansion

for /f "delims=" %%F in ('dir /b /a-d "Cherry-Studio-*-portable*.exe" 2^>nul') do ( #This code is adapted for versions downloaded from GitHub and the official website. Please modify it for other versions.
    set "target_file=!cd!\%%F"
    goto :break
)
:break
if defined target_file (
    echo Found file: %target_file%
) else (
    echo No matching file found, exiting the script
    pause
    exit
)

@echo Press any key to continue
pause

@echo Starting CherryStudio
start "" "%target_file%" --user-data-dir="%user_data_dir%"

@echo Operation finished
@echo on
exit
PS D:\CherryStudio> dir .\user-data-dir\


    Directory: D:\CherryStudio\user-data-dir


Mode                 LastWriteTime         Length Name
----                 -------------         ------ ----
d-----         2025/4/18     14:29                blob_storage
d-----         2025/4/18     14:07                Cache
d-----         2025/4/18     14:07                Code Cache
d-----         2025/4/18     14:07                Data
d-----         2025/4/18     14:07                DawnGraphiteCache
d-----         2025/4/18     14:07                DawnWebGPUCache
d-----         2025/4/18     14:07                Dictionaries
d-----         2025/4/18     14:07                GPUCache
d-----         2025/4/18     14:07                IndexedDB
d-----         2025/4/18     14:07                Local Storage
d-----         2025/4/18     14:07                logs
d-----         2025/4/18     14:30                Network
d-----         2025/4/18     14:07                Partitions
d-----         2025/4/18     14:29                Session Storage
d-----         2025/4/18     14:07                Shared Dictionary
d-----         2025/4/18     14:07                WebStorage
-a----         2025/4/18     14:07             36 .updaterId
-a----         2025/4/18     14:29             20 config.json
-a----         2025/4/18     14:07            434 Local State
-a----         2025/4/18     14:29             57 Preferences
-a----         2025/4/18     14:09           4096 SharedStorage
-a----         2025/4/18     14:30            140 window-state.json
:root {
  --color-background: #1a462788;
  --color-background-soft: #1a4627aa;
  --color-background-mute: #1a462766;
  --navbar-background: #1a4627;
  --chat-background: #1a4627;
  --chat-background-user: #28b561;
  --chat-background-assistant: #1a462722;
}

#content-container {
  background-color: #2e5d3a !important;
}
:root {
  font-family: "HanYiTangMeiRen" !important; /* Font */
}

/* Font color for expanded deep thinking section */
.ant-collapse-content-box .markdown {
  color: red;
}

/* Theme Variables */
:root {
  --color-black-soft: #2a2b2a; /* Dark background color */
  --color-white-soft: #f8f7f2; /* Light background color */
}

/* Dark Theme */
body[theme-mode="dark"] {
  /* Colors */
  --color-background: #2b2b2b; /* Dark background color */
  --color-background-soft: #303030; /* Light background color */
  --color-background-mute: #282c34; /* Neutral background color */
  --navbar-background: var(--color-black-soft); /* Navbar background color */
  --chat-background: var(--color-black-soft); /* Chat background color */
  --chat-background-user: #323332; /* User chat background color */
  --chat-background-assistant: #2d2e2d; /* Assistant chat background color */
}

/* Dark Theme Specific Styles */
body[theme-mode="dark"] {
  #content-container {
    background-color: var(--chat-background-assistant) !important; /* Content container background color */
  }

  #content-container #messages {
    background-color: var(--chat-background-assistant); /* Messages background color */
  }

  .inputbar-container {
    background-color: #3d3d3a; /* Input bar background color */
    border: 1px solid #5e5d5940; /* Input bar border color */
    border-radius: 8px; /* Input bar border radius */
  }

  /* Code Style */
  code {
    background-color: #e5e5e20d; /* Code background color */
    color: #ea928a; /* Code text color */
  }

  pre code {
    color: #abb2bf; /* Preformatted code text color */
  }
}

/* Light Theme */
body[theme-mode="light"] {
  /* Colors */
  --color-white: #ffffff; /* White */
  --color-background: #ebe8e2; /* Light background color */
  --color-background-soft: #cbc7be; /* Light background color */
  --color-background-mute: #e4e1d7; /* Neutral background color  */
  --navbar-background: var(--color-white-soft); /* Navbar background color */
  --chat-background: var(--color-white-soft); /* Chat background color */
  --chat-background-user: #f8f7f2; /* User chat background color */
  --chat-background-assistant: #f6f4ec; /* Assistant chat background color */
}

/* Light Theme Specific Styles */
body[theme-mode="light"] {
  #content-container {
    background-color: var(--chat-background-assistant) !important; /* Content container background color */
  }

  #content-container #messages {
    background-color: var(--chat-background-assistant); /* Messages background color */
  }

  .inputbar-container {
    background-color: #ffffff; /* Input bar background color */
    border: 1px solid #87867f40; /* Input bar border color */
    border-radius: 8px; /* Input bar border radius, change to your preferred size */
  }

  /* Code Style */
  code {
    background-color: #3d39290d; /* Code background color */
    color: #7c1b13; /* Code text color */
  }

  pre code {
    color: #000000; /* Preformatted code text color */
  }
}
Input: "Hello, world!"
GPT-3: 4 tokens
BERT: 3 tokens
Claude: 3 tokens
https://docs.vllm.ai/en/latest/getting_started/quickstart.html
https://github.com/CherryHQ/cherry-studio/issues/621#issuecomment-2588652880
https://github.com/CherryHQ/cherry-studio/tree/main/src/renderer/src/assets/styles
https://github.com/boilcy/cherrycss
https://linux.do/t/topic/325119/129

Cherry Studio Commercial License Agreement

This document was translated from Chinese by AI and has not yet been reviewed.

Cherry Studio License Agreement

By using or distributing any part or element of the Cherry Studio Materials, you will be deemed to have acknowledged and accepted the content of this Agreement, which shall become effective immediately.

I. Definitions

  1. This Cherry Studio License Agreement (hereinafter referred to as the “Agreement”) shall mean the terms and conditions for use, reproduction, distribution, and modification of the Materials as defined by this Agreement.

  2. “We” (or “Us”) shall mean Shanghai Qianhui Technology Co., Ltd.

  3. “You” (or “Your”) shall mean a natural person or legal entity exercising the rights granted by this Agreement, and/or using the Materials for any purpose and in any field of use.

  4. “Third Party” shall mean an individual or legal entity that does not have common control with either Us or You.

  5. “Cherry Studio” shall mean this software suite, including but not limited to [e.g., core libraries, editors, plugins, sample projects], as well as source code, documentation, sample code, and other elements of the foregoing distributed by Us. (Please describe in detail according to the actual composition of Cherry Studio)

  6. “Materials” shall collectively refer to the proprietary Cherry Studio and documentation (and any part thereof) of Shanghai Qianhui Technology Co., Ltd., provided under this Agreement.

  7. “Source” form shall mean the preferred form for making modifications, including but not limited to source code, documentation source files, and configuration files.

  8. “Object” form shall mean any form resulting from mechanical transformation or translation of a Source form, including but not limited to compiled object code, generated documentation, and conversions to other media types.

  9. “Commercial Use” means for the purpose of direct or indirect commercial gain or commercial advantage, including but not limited to sales, licensing, subscriptions, advertising, marketing, training, consulting services, etc.

  10. “Modification” means any change, adjustment, derivation, or secondary development of the Source form of the Materials, including but not limited to modifying the application name, logo, code, functionality, interface, etc.

II. Grant of Rights

  1. Free Commercial Use (Limited to Unmodified Code): We hereby grant You a non-exclusive, worldwide, non-transferable, royalty-free license, under the intellectual property or other rights owned by Us or embodied in the Materials, to use, reproduce, distribute, copy, and distribute the unmodified Materials, including for Commercial Use, subject to the terms and conditions of this Agreement.

  2. Commercial License (When Required): When the conditions described in Section III “Commercial License” are met, you must obtain an explicit written commercial license from Us to exercise the rights under this Agreement.

III. Commercial License

In any of the following situations, you must contact Us and obtain an explicit written commercial license before you can continue to use the Cherry Studio Materials:

  1. Modification and Derivation: You modify the Cherry Studio Materials or develop derivatives based on them (including but not limited to modifying the application name, logo, code, functionality, interface, etc.).

  2. Enterprise Services: Providing services based on Cherry Studio within your enterprise or to enterprise customers, where such service supports 10 or more cumulative users.

  3. Hardware Bundling: You pre-install or integrate Cherry Studio into hardware devices or products for bundled sales.

  4. Large-Scale Procurement by Government or Educational Institutions: Your use case is part of a large-scale procurement project by a government or educational institution, especially when it involves sensitive requirements such as security and data privacy.

  5. Public-Facing Cloud Services: Providing public-facing cloud services based on Cherry Studio.

IV. Redistribution

You may distribute copies of the unmodified Materials, or provide them as part of a product or service that includes the unmodified Materials, in Source or Object form, provided that You meet the following conditions:

  1. You must provide a copy of this Agreement to any other recipient of the Materials;

  2. You must, in all copies of the Materials that you distribute, retain the following attribution notice and place it in a “NOTICE” or similar text file distributed as part of such copies: `"Cherry Studio is licensed under the Cherry Studio LICENSE AGREEMENT, Copyright (c) 上海千彗科技有限公司. All Rights Reserved."` (Cherry Studio is licensed under the Cherry Studio License Agreement, Copyright (c) 上海千彗科技有限公司. All rights reserved.)

V. Rules of Use

  1. The Materials may be subject to export controls or restrictions. You shall comply with applicable laws and regulations when using the Materials.

  2. If You use the Materials or any of their outputs or results to create, train, fine-tune, or improve software or models that will be distributed or provided, We encourage You to prominently display “Built with Cherry Studio” or “Powered by Cherry Studio” in the relevant product documentation.

VI. Intellectual Property

  1. We retain all intellectual property rights in and to the Materials and derivative works made by or for Us. Subject to the terms and conditions of this Agreement, the ownership of intellectual property rights for modifications and derivative works of the Materials made by You will be stipulated in a specific commercial license agreement. Without obtaining a commercial license, You do not own the rights to your modifications and derivative works of the Materials, and their intellectual property rights remain with Us.

  2. No trademark license is granted to use Our trade names, trademarks, service marks, or product names, except as required for reasonable and customary use in describing and redistributing the Materials or as required to fulfill the notice obligations under this Agreement.

  3. If You initiate a lawsuit or other legal proceeding (including a counterclaim or cross-claim in a lawsuit) against Us or any entity, alleging that the Materials or any of its outputs, or any portion of the foregoing, infringes any intellectual property or other rights owned or licensable by You, then all licenses granted to You under this Agreement shall terminate as of the date such lawsuit or other legal proceeding is initiated or filed.

VII. Disclaimer and Limitation of Liability

  1. We have no obligation to support, update, provide training for, or develop any further versions of the Cherry Studio Materials, nor to grant any related licenses.

  2. The Materials are provided "as is" without any warranty of any kind, either express or implied, including warranties of merchantability, non-infringement, or fitness for a particular purpose. We make no warranty and assume no responsibility for the security or stability of the Materials and their outputs.

  3. In no event shall We be liable to You for any damages, including but not limited to any direct, indirect, special, or consequential damages, arising out of your use or inability to use the Materials or any of their outputs, however caused.

  4. You will defend, indemnify, and hold Us harmless from any claims by any third party arising out of or related to your use or distribution of the Materials.

VIII. Survival and Termination

  1. The term of this Agreement shall commence upon your acceptance of this Agreement or your access to the Materials and will continue in full force and effect until terminated in accordance with the terms and conditions of this Agreement.

  2. We may terminate this Agreement if You breach any of its terms or conditions. Upon termination of this Agreement, You must cease using the Materials. Section VII, Section IX, and "II. Contributor Agreement" shall survive the termination of this Agreement.

IX. Applicable Law and Jurisdiction

  1. This Agreement and any dispute arising from or related to this Agreement shall be governed by the laws of China.

  2. The Shanghai People's Court shall have exclusive jurisdiction over any dispute arising from this Agreement.

Obsidian Configuration Tutorial

Data Settings → Obsidian Configuration

This document was translated from Chinese by AI and has not yet been reviewed.

Obsidian Configuration Tutorial

Cherry Studio supports integration with Obsidian, allowing you to export entire conversations or single messages to your Obsidian vault.

This process does not require installing any additional Obsidian plugins. However, since Cherry Studio's import mechanism is similar to the Obsidian Web Clipper, it's recommended to upgrade Obsidian to the latest version (at least greater than 1.7.2) to avoid import failures with long conversations.

Latest Tutorial

Compared to the old export feature, the new version can automatically select the vault path, so you no longer need to manually enter the vault name and folder name.

Step 1: Configure Cherry Studio

Open Cherry Studio's Settings → Data Settings → Obsidian Configuration menu. The dropdown will automatically list the Obsidian vaults that have been opened on your machine. Select your target Obsidian vault:

Step 2: Export Conversation

Exporting an Entire Conversation

Go back to the conversation interface in Cherry Studio, right-click on the conversation, select Export, and click Export to Obsidian:

A window will pop up, allowing you to adjust the Properties, the folder location in Obsidian, and the handling method for the exported note:

  • Vault: Click the dropdown menu to select other Obsidian vaults

  • Path: Click the dropdown menu to select the folder where the exported note will be stored

  • As Obsidian note properties (Properties):

    • Tags (tags)

    • Creation time (created)

    • Source (source)

  • There are three handling methods for exporting to Obsidian:

    • Create new (overwrite if exists): Creates a new note in the folder specified in the Path. If a note with the same name already exists, it will be overwritten.

    • Prepend: If a note with the same name exists, the selected conversation content will be prepended to the beginning of that note.

    • Append: If a note with the same name exists, the selected conversation content will be appended to the end of that note.

Only the first method will include Properties; the other two methods will not.

After selecting all options, click OK to export the entire conversation to the corresponding folder in the specified Obsidian vault.

Exporting a Single Message

To export a single message, click the three-bar menu below the message, select Export, and click Export to Obsidian:

A window similar to the one for exporting an entire conversation will appear, asking you to configure the note properties and handling method. Follow the tutorial above to complete the process.

Export Successful

🎉 Congratulations! You have now completed all the configurations for integrating Cherry Studio with Obsidian and have gone through the entire export process. Enjoy!


Old Tutorial (for Cherry Studio < v1.1.13)

Step 1: Prepare Obsidian

Open your Obsidian vault and create a folder to save the exported conversations (the example in the image uses a folder named "Cherry Studio"):

Take note of the text in the bottom-left corner; this is your vault name.

Step 2: Configure Cherry Studio

In Cherry Studio's Settings → Data Settings → Obsidian Configuration menu, enter the vault name and folder name you noted in Step 1:

The Global Tags field is optional. You can set tags that will be applied to all exported conversations in Obsidian. Fill it in as needed.

Step 3: Export Conversation

Exporting an Entire Conversation

Go back to the conversation interface in Cherry Studio, right-click on the conversation, select Export, and click Export to Obsidian.

A window will pop up, allowing you to adjust the Properties for the exported note and the handling method. There are three handling methods for exporting to Obsidian:

  • Create new (overwrite if exists): Creates a new note in the folder you specified in Step 2. If a note with the same name already exists, it will be overwritten.

  • Prepend: If a note with the same name exists, the selected conversation content will be prepended to the beginning of that note.

  • Append: If a note with the same name exists, the selected conversation content will be appended to the end of that note.

Only the first method will include Properties; the other two methods will not.

Exporting a Single Message

To export a single message, click the three-bar menu below the message, select Export, and click Export to Obsidian.

A window similar to the one for exporting an entire conversation will appear, asking you to configure the note properties and handling method. Follow the tutorial above to complete the process.

Export Successful

🎉 Congratulations! You have now completed all the configurations for integrating Cherry Studio with Obsidian and have gone through the entire export process. Enjoy!

Google Gemini

This document was translated from Chinese by AI and has not yet been reviewed.

Google Gemini

Get API Key

  • Before obtaining a Gemini API key, you need to have a Google Cloud project (if you already have one, you can skip this process).

  • Go to Google Cloud to create a project, fill in the project name, and click Create Project.

  • On the official API Key page, click Create API key.

  • Copy the generated key and open the Provider Settings in CherryStudio.

  • Find the Gemini provider and paste the key you just obtained.

  • Click Manage or Add at the bottom, add the supported models, and enable the provider switch in the top right corner to start using it.

  • Google Gemini services cannot be used directly from mainland China; you will need to resolve the proxy issue on your own.

Font Recommendations

This document was translated from Chinese by AI and has not yet been reviewed.

Recommended Fonts


Monaspace

English Font Commercial Use

GitHub has launched an open-source font family called Monaspace, which offers five styles: Neon (modern), Argon (humanist), Xenon (serif), Radon (handwriting), and Krypton (mechanical).

MiSans Global

Multilingual Commercial Use

MiSans Global is a global font customization project led by Xiaomi, in collaboration with Monotype and Hanyi Fonts.

This is a vast font family, covering over 20 writing systems and supporting more than 600 languages.


Translation

This document was translated from Chinese by AI and has not yet been reviewed.

Translation

Cherry Studio's translation feature provides you with fast and accurate text translation services, supporting mutual translation between multiple languages.

Interface Overview

The translation interface mainly consists of the following parts:

  1. Source Language Selection Area:

    • Any Language: Cherry Studio will automatically detect the source language and translate it.

  2. Target Language Selection Area:

    • Dropdown Menu: Select the language you want to translate the text into.

  3. Settings Button:

    • Clicking it will take you to the Default Model Settings.

  4. Scroll Sync:

    • Click to toggle scroll sync (scrolling on one side will also scroll the other side).

  5. Text Input Box (Left):

    • Enter or paste the text you need to translate.

  6. Translation Result Box (Right):

    • Displays the translated text.

    • Copy Button: Click the button to copy the translation result to the clipboard.

  7. Translate Button:

    • Click this button to start the translation.

  8. Translation History (Top Left):

    • Click to view the translation history.

How to Use

  1. Select the Target Language:

    • In the target language selection area, choose the language you want to translate into.

  2. Enter or Paste Text:

    • Enter or paste the text you want to translate into the text input box on the left.

  3. Start Translation:

    • Click the Translate button.

  4. View and Copy the Result:

    • The translation result will be displayed in the result box on the right.

    • Click the copy button to copy the translation result to the clipboard.

Frequently Asked Questions (FAQ)

  • Q: What should I do if the translation is inaccurate?

    • A: While AI translation is powerful, it is not perfect. For professional fields or texts with complex context, manual proofreading is recommended. You can also try switching to different models.

  • Q: Which languages are supported?

    • A: The Cherry Studio translation feature supports various major languages. For a specific list of supported languages, please refer to the official Cherry Studio website or in-app instructions.

  • Q: Can I translate an entire file?

    • A: The current interface is primarily for text translation. For file translation, you may need to go to the Cherry Studio chat page and add the file to translate it.

  • Q: What if the translation speed is slow?

    • A: Translation speed can be affected by factors such as network connection, text length, and server load. Please ensure your network connection is stable and wait patiently.

FAQ

This document was translated from Chinese by AI and has not yet been reviewed.

Frequently Asked Questions

Common Error Codes

  • 4xx (Client Error Status Codes): Generally indicate that the request cannot be completed due to a syntax error, authorization failure, or authentication failure.

  • 5xx (Server Error Status Codes): Generally indicate a server-side error, such as the server being down, or the request processing timing out.


How to Check Console Errors

  • After clicking the Cherry Studio client window, press the shortcut key Ctrl + Shift + I (for Mac: Command + Option + I)

  • The currently active window must be the Cherry Studio client window to open the console;

  • You need to open the console first, and then click test or initiate a conversation or other requests to collect request information.

  • In the pop-up console window, click Network → click to view the last item in section ② marked with a red ×, which will be completions (for errors in conversations, translations, model connectivity checks, etc.) or generations (for errors in painting) → click Response to view the full returned content (area ④ in the figure).

This inspection method can be used not only to obtain error information during conversations, but also during model testing, adding knowledge bases, painting, etc. In any case, you need to open the debugging window first, and then perform the request operation to obtain the request information.

The name in the Name column (② in the image above) will vary depending on the scenario

Conversation, Translation, Model Check: completions

Painting: generations

Knowledge Base Creation: embeddings


Formulas Not Rendered / Formula Rendering Errors

  • If the formula code is displayed directly instead of being rendered, check if the formula has delimiters.

Delimiter Usage

Inline formulas

  • Use single dollar signs: $formula$

  • Or use \( and \), like: \(formula\)

Block formulas

  • Use double dollar signs: $$formula$$

  • Or use \[formula\]

  • Formula rendering errors/garbled text are common when the formula contains Chinese content. Try switching the formula engine to KaTeX.
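For example, the same expression written with both kinds of delimiters:

Inline: the sum $\sum_{i=1}^n x_i$ sits inside the sentence.

Block:

$$\sum_{i=1}^n x_i$$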


Unable to Create Knowledge Base / "Failed to get embedding dimensions" prompt

  1. Model status is unavailable

Confirm whether the service provider supports the model or if the model's service status is normal.

  2. A non-embedding model was used.


Model Cannot Recognize Images / Unable to Upload or Select Images

First, you need to confirm if the model supports image recognition. Cherry Studio categorizes popular models; those with a small eye icon next to their name support image recognition.

Image recognition models support uploading image files. If the model's functionality is not correctly matched, you can find the model in the corresponding service provider's model list, click the settings button after its name, and check the image option.

For specific model information, you can check the details from the corresponding service provider. Similar to embedding models, models that do not support vision do not need to have the image function forced on; checking the image option will have no effect.

Ollama

This document was translated from Chinese by AI and has not yet been reviewed.

Ollama

Ollama is an excellent open-source tool that allows you to easily run and manage various Large Language Models (LLMs) locally. Cherry Studio now supports Ollama integration, enabling you to interact directly with locally deployed LLMs in a familiar interface, without relying on cloud services!

What is Ollama?

Ollama is a tool that simplifies the deployment and use of Large Language Models (LLMs). It has the following features:

  • Local Execution: Models run entirely on your local computer, without needing an internet connection, protecting your privacy and data security.

  • Easy to Use: Download, run, and manage various LLMs with simple command-line instructions.

  • Rich Model Library: Supports many popular open-source models like Llama 2, Deepseek, Mistral, and Gemma.

  • Cross-Platform: Supports macOS, Windows, and Linux systems.

  • Open API: Supports an OpenAI-compatible interface, allowing integration with other tools.

Why Use Ollama in Cherry Studio?

  • No Cloud Services Needed: No longer limited by cloud API quotas and fees. Enjoy the full power of local LLMs.

  • Data Privacy: All your conversation data remains on your local machine, eliminating concerns about privacy leaks.

  • Offline Availability: Continue interacting with LLMs even without an internet connection.

  • Customization: Choose and configure the LLMs that best suit your needs.

Configuring Ollama in Cherry Studio

1. Install and Run Ollama

First, you need to install and run Ollama on your computer. Follow these steps:

  • Install Ollama: Follow the installer's instructions to complete the installation.

  • Download a Model: Open your terminal (or command prompt) and use the ollama run command to download the model you want to use. For example, to download the Llama 3.2 model, run ollama run llama3.2.

    Ollama will automatically download and run the model.

  • Keep Ollama Running: Ensure that Ollama remains running while you are interacting with Ollama models through Cherry Studio.

2. Add the Ollama Provider in Cherry Studio

Next, add Ollama as a custom AI provider in Cherry Studio:

  • Open Settings: In the left navigation bar of the Cherry Studio interface, click on "Settings" (the gear icon).

  • Go to Model Services: On the settings page, select the "Model Services" tab.

  • Add Provider: Click on Ollama in the list.

3. Configure the Ollama Provider

Find the newly added Ollama in the provider list and configure its details:

  1. Enable Status:

    • Ensure the switch on the far right of the Ollama provider is turned on, indicating it is enabled.

  2. API Key:

    • Ollama does not require an API key by default. You can leave this field blank or fill it with any content.

  3. API Endpoint:

    • Enter the local API address provided by Ollama. Typically, the address is http://localhost:11434/.

      If you have changed the port, please modify it accordingly.

  4. Keep-Alive Time: This option sets the session keep-alive duration in minutes. If there are no new conversations within the set time, Cherry Studio will automatically disconnect from Ollama to release resources.

  5. Model Management:

    • Click the "+ Add" button to manually add the names of the models you have already downloaded in Ollama.

    • For example, if you have already downloaded the llama3.2 model using ollama run llama3.2, you can enter llama3.2 here.

    • Click the "Manage" button to edit or delete the added models.

Getting Started

Once the configuration is complete, you can select the Ollama provider and your downloaded model in the Cherry Studio chat interface to start conversing with your local LLM!
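If you want to confirm the connection outside Cherry Studio, a minimal sketch against Ollama's OpenAI-compatible endpoint (assuming the llama3.2 model mentioned above has already been downloaded):

# Minimal sketch: one chat request to a local Ollama instance via its
# OpenAI-compatible endpoint. Assumes `ollama run llama3.2` has been executed before.
import requests

resp = requests.post(
    "http://localhost:11434/v1/chat/completions",
    json={
        "model": "llama3.2",
        "messages": [{"role": "user", "content": "Say hello in one sentence."}],
    },
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])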

Tips and Tricks

  • First Model Run: When running a model for the first time, Ollama needs to download the model file, which may take some time. Please be patient.

  • View Available Models: Run the ollama list command in the terminal to see a list of the Ollama models you have downloaded.

  • Hardware Requirements: Running large language models requires certain computing resources (CPU, memory, GPU). Please ensure your computer's configuration meets the model's requirements.

  • Ollama Documentation: You can click the View Ollama documentation and models link on the configuration page to quickly navigate to the official Ollama documentation.

Common error codes, their possible causes, and solutions:

If you cannot determine the cause of the error, please send a screenshot of this interface to the official communication group for help.

Example: $$\sum_{i=1}^n x_i$$

Download Ollama: Visit the official Ollama website () and download the appropriate installer for your operating system. On Linux, you can install Ollama directly with the following command:

400
Possible cause: Incorrect request body format, etc.
Solution: Check the error message returned in the conversation or view the error content in the console, and follow the prompts.
  • Common Case 1: If it is a Gemini model, you may need to link a credit card.
  • Common Case 2: The data size exceeds the limit; this is common with vision models, where the image size exceeds the upstream provider's per-request traffic limit.
  • Common Case 3: Unsupported parameters were added or parameters were filled in incorrectly; try creating a new, clean assistant to test whether it works normally.
  • Common Case 4: The context exceeds the limit; clear the context, start a new conversation, or reduce the number of context messages.

401
Possible cause: Authentication failed; the model is not supported, or the server-side account has been banned, etc.
Solution: Contact the corresponding service provider or check the status of your account with that provider.

403
Possible cause: No permission for the requested operation.
Solution: Follow the error message returned in the conversation or the error message in the console.

404
Possible cause: The requested resource cannot be found.
Solution: Check the request path, etc.

422
Possible cause: The request format is correct, but there is a semantic error.
Solution: The server can parse this type of request but cannot process it. It commonly occurs with JSON semantic errors (e.g., null values, or a value that must be a string written as a number or boolean).

429
Possible cause: The request rate has reached the limit.
Solution: The request rate (TPM or RPM) has reached the limit. Take a break and try again later.

500
Possible cause: Internal server error; the request could not be completed.
Solution: If it persists, contact the upstream service provider.

501
Possible cause: The server does not support the functionality required to fulfill the request.

502
Possible cause: The server, while acting as a gateway or proxy, received an invalid response from an inbound server it accessed while attempting to fulfill the request.

503
Possible cause: The server is temporarily unable to handle the client's request due to overload or system maintenance. The length of the delay may be included in the server's Retry-After header.

504
Possible cause: The server, acting as a gateway or proxy, did not receive a timely response from the upstream server.


Embedding Models Reference Information

This document was translated from Chinese by AI and has not yet been reviewed.

Embedding Model Reference Information

To prevent errors, the 'max input' values for some models in this document are not set to their absolute limits. For instance, when the official documentation states a maximum input of 8k (without specifying the exact number), this document may list the reference value as 8191 or 8000. (Feel free to ignore this note if it's unclear; simply use the reference values provided in the document.)

Volcengine-Doubao

Official Model Information Reference

  • Doubao-embedding: max input 4095
  • Doubao-embedding-vision: max input 8191
  • Doubao-embedding-large: max input 4095

Alibaba

Official Model Information Reference

  • text-embedding-v3: max input 8192
  • text-embedding-v2: max input 2048
  • text-embedding-v1: max input 2048
  • text-embedding-async-v2: max input 2048
  • text-embedding-async-v1: max input 2048

OpenAI

Official Model Information Reference

  • text-embedding-3-small: max input 8191
  • text-embedding-3-large: max input 8191
  • text-embedding-ada-002: max input 8191

Baidu

Official Model Information Reference

  • Embedding-V1: max input 384
  • tao-8k: max input 8192

Zhipu

Official Model Information Reference

  • embedding-2: max input 1024
  • embedding-3: max input 2048

Hunyuan

Official Model Information Reference

  • hunyuan-embedding: max input 1024

Baichuan

Official Model Information Reference

  • Baichuan-Text-Embedding: max input 512

together

Official Model Information Reference

  • M2-BERT-80M-2K-Retrieval: max input 2048
  • M2-BERT-80M-8K-Retrieval: max input 8192
  • M2-BERT-80M-32K-Retrieval: max input 32768
  • UAE-Large-v1: max input 512
  • BGE-Large-EN-v1.5: max input 512
  • BGE-Base-EN-v1.5: max input 512

Jina

Official Model Information Reference

  • jina-embedding-b-en-v1: max input 512
  • jina-embeddings-v2-base-en: max input 8191
  • jina-embeddings-v2-base-zh: max input 8191
  • jina-embeddings-v2-base-de: max input 8191
  • jina-embeddings-v2-base-code: max input 8191
  • jina-embeddings-v2-base-es: max input 8191
  • jina-colbert-v1-en: max input 8191
  • jina-reranker-v1-base-en: max input 8191
  • jina-reranker-v1-turbo-en: max input 8191
  • jina-reranker-v1-tiny-en: max input 8191
  • jina-clip-v1: max input 8191
  • jina-reranker-v2-base-multilingual: max input 8191
  • reader-lm-1.5b: max input 256000
  • reader-lm-0.5b: max input 256000
  • jina-colbert-v2: max input 8191
  • jina-embeddings-v3: max input 8191

SiliconFlow

Official Model Information Reference

  • BAAI/bge-m3: max input 8191
  • netease-youdao/bce-embedding-base_v1: max input 512
  • BAAI/bge-large-zh-v1.5: max input 512
  • BAAI/bge-large-en-v1.5: max input 512
  • Pro/BAAI/bge-m3: max input 8191

Gemini

Official Model Information Reference

  • text-embedding-004: max input 2048

nomic

Official Model Information Reference

  • nomic-embed-text-v1: max input 8192
  • nomic-embed-text-v1.5: max input 8192
  • gte-multilingual-base: max input 8192

console

Official Model Information Reference

  • embedding-query: max input 4000
  • embedding-passage: max input 4000

cohere

Official Model Information Reference

  • embed-english-v3.0: max input 512
  • embed-english-light-v3.0: max input 512
  • embed-multilingual-v3.0: max input 512
  • embed-multilingual-light-v3.0: max input 512
  • embed-english-v2.0: max input 512
  • embed-english-light-v2.0: max input 512
  • embed-multilingual-v2.0: max input 256


Chat Interface

This document was translated from Chinese by AI and has not yet been reviewed.

Chat Interface

Assistants and Topics

Assistant

An Assistant is a personalized configuration of a selected model, including settings like prompt presets and parameter presets. These settings allow the selected model to better meet your expected work requirements.

The System Default Assistant has a fairly general parameter preset (no prompt), which you can use directly or go to the Agent Page to find a preset that suits your needs.

Topic

An Assistant is the parent of Topics: multiple topics (i.e., conversations) can be created under a single assistant. All topics share the assistant's parameter settings, prompt presets, and other model settings.

Buttons in the Chat Box

  • New Topic: Creates a new topic within the current assistant.

  • Upload Image or Document: Uploading images requires model support. Uploaded documents will be automatically parsed into text and provided to the model as context.

  • Web Search: Requires configuring web search-related information in the settings. The search results are returned to the large model as context. For details, see Web Search Mode.

  • Knowledge Base: Enables the knowledge base. For details, see Knowledge Base Tutorial.

  • MCP Server: Enables the MCP server function. For details, see MCP Usage Tutorial.

  • Generate Image: Not displayed by default. For models that support image generation (like Gemini), you need to manually activate it to generate images.

  • Select Model: Switches to the specified model for the subsequent conversation while retaining the context.

  • Quick Phrases: You need to preset common phrases in the settings first. They can be invoked here and entered directly, with support for variables.

  • Clear Messages: Deletes all content under the current topic.

  • Expand: Makes the chat box larger for entering long texts.

  • Clear Context: Truncates the context available to the model without deleting the content, meaning the model will "forget" the previous conversation.

  • Estimate Token Count: Displays the estimated token count. The four data points are Current Context Count, Max Context Count (∞ means infinite context), Character Count in Current Input Box, and Estimated Token Count.

  • Translate: Translates the content in the current input box into English.

For technical reasons, the Generate Image button must be activated manually. It will be removed once this feature is optimized.

The Estimate Token Count feature only provides an estimate; the actual token count varies for each model. Please refer to the data provided by the model provider.

Conversation Settings

Model Settings

Model settings are synchronized with the Model Settings parameters in the assistant settings. For details, see Assistant Settings.

In the conversation settings, only the model settings apply to the current assistant; other settings are global. For example, if you set the message style to bubbles, it will be the bubble style in any topic of any assistant.

Message Settings

Message Divider:

Separates the message body from the action bar with a divider.

Use Serif Font:

Switches the font style. You can now also change the font via custom CSS.

Show Line Numbers in Code:

Displays line numbers in code blocks when the model outputs code snippets.

Collapsible Code Blocks:

When enabled, long code snippets will be automatically collapsed.

Wrap Lines in Code Blocks:

When enabled, long single lines of code (exceeding the window) will automatically wrap.

Auto-collapse Thinking Process:

When enabled, models that support showing their thinking process will automatically collapse it after completion.

Message Style:

You can switch the chat interface to either bubble style or list style.

Code Style:

You can switch the display style of code snippets.

Math Formula Engine:

  • KaTeX renders faster as it is specifically designed for performance optimization.

  • MathJax renders slower but is more feature-complete, supporting more mathematical symbols and commands.

Example: $$\sum_{i=1}^n x_i$$

Message Font Size:

Adjusts the font size of the chat interface.

Input Settings

Show Estimated Token Count:

Displays the estimated token consumption of the input text in the input box (not the actual context token consumption, for reference only).

Paste Long Text as File:

When pasting a long text from another source into the input box, it will automatically be displayed as a file style to reduce interference with subsequent input.

Render Input Messages with Markdown:

When off, only the model's reply messages are rendered, not the messages you send.

Triple-press Space to Translate:

After typing a message in the chat input box, pressing the spacebar three times consecutively will translate the input into English.

Note: This action will overwrite the original text.

Target Language:

Sets the target language for the translate button in the input box and the triple-press space translation feature.

Assistant Settings

In the assistant interface, select the assistant name you want to configure → choose the corresponding setting from the right-click context menu.

Edit Assistant

Assistant settings apply to all topics under that assistant.

Prompt Settings

Name:

You can customize the assistant's name for easy identification.

Prompt:

This is the prompt. You can refer to the prompt writing style on the agent page to edit the content.

Model Settings

Default Model:

You can set a fixed default model for this assistant. When adding from the agent page or copying an assistant, the initial model will be this one. If this is not set, the initial model will be the global initial model (i.e., the Default Assistant Model).

There are two kinds of default models for an assistant: the global default chat model and the assistant's own default model. The assistant's default model takes priority over the global default chat model; when the assistant's default model is not set, it falls back to the global default chat model.

Auto Reset Model:

When on: if you switch to another model during a conversation in a topic, any new topic you create will reset to the assistant's default model. When off: the model for a new topic will follow the model used in the previous topic.

For example, if the assistant's default model is gpt-3.5-turbo, and I create Topic 1 under this assistant, and during the conversation in Topic 1, I switch to gpt-4o, then:

If auto-reset is on: When creating Topic 2, the default model for Topic 2 will be gpt-3.5-turbo.

If auto-reset is off: When creating Topic 2, the default model for Topic 2 will be gpt-4o.

Temperature:

The temperature parameter controls the degree of randomness and creativity in the text generated by the model (default is 0.7). Specifically:

  • Low temperature value (0-0.3):

    • Output is more deterministic and focused

    • Suitable for scenarios requiring accuracy, like code generation and data analysis

    • Tends to select the most likely words

  • Medium temperature value (0.4-0.7):

    • Balances creativity and coherence

    • Suitable for daily conversations and general writing

    • Recommended for chatbot conversations (around 0.5)

  • High temperature value (0.8-1.0):

    • Produces more creative and diverse output

    • Suitable for creative writing, brainstorming, etc.

    • May reduce the coherence of the text

Top P (Nucleus Sampling):

The default value is 1. The smaller the value, the more monotonous and easier to understand the AI-generated content is. The larger the value, the wider and more diverse the vocabulary of the AI's response.

Nucleus sampling affects the output by controlling the probability threshold for vocabulary selection:

  • Smaller value (0.1-0.3):

    • Considers only the highest probability words

    • Output is more conservative and controllable

    • Suitable for code comments, technical documentation, etc.

  • Medium value (0.4-0.6):

    • Balances vocabulary diversity and accuracy

    • Suitable for general conversation and writing tasks

  • Larger value (0.7-1.0):

    • Considers a wider range of vocabulary choices

    • Produces richer and more diverse content

    • Suitable for creative writing and other scenarios requiring diverse expression

  • These two parameters can be used independently or in combination.

  • Choose appropriate parameter values based on the specific task type.

  • It is recommended to experiment to find the best parameter combination for a specific application scenario.

  • The above content is for reference and conceptual understanding only. The given parameter ranges may not be suitable for all models. Please refer to the parameter recommendations in the relevant model documentation.
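
To make the two parameters above concrete, here is a minimal sketch of how temperature and top_p appear in an OpenAI-compatible chat request; the endpoint, API key, and model name are placeholders, not values specific to Cherry Studio or any particular provider:

curl https://api.example.com/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "your-model-name",
    "temperature": 0.5,
    "top_p": 0.9,
    "messages": [{"role": "user", "content": "Write a short product description."}]
  }'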

Context Window

The number of messages to keep in the context. The larger the value, the longer the context and the more tokens are consumed:

  • 5-10: Suitable for normal conversations

  • >10: For complex tasks requiring longer memory (e.g., generating a long article step-by-step according to an outline, which requires ensuring the generated context is logically coherent)

  • Note: The more messages, the greater the token consumption

Enable Message Length Limit (MaxToken)

The maximum number of Tokens for a single response. In large language models, max tokens is a key parameter that directly affects the quality and length of the generated response.

For example: When testing if a model is connected after filling in the key in CherryStudio, you only need to know if the model returns a message correctly without specific content. In this case, setting MaxToken to 1 is sufficient.
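
Such a connectivity test can also be reproduced by hand with an OpenAI-compatible request (a sketch; the endpoint, key, and model name are placeholders). Limiting max_tokens to 1 returns almost immediately and is enough to confirm that the key and model respond:

curl https://api.example.com/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "your-model-name", "max_tokens": 1, "messages": [{"role": "user", "content": "hi"}]}'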

The MaxToken limit for most models is 32k Tokens, but some have 64k or even more. You need to check the corresponding introduction page for specifics.

The specific setting depends on your needs, but you can also refer to the following suggestions.

Suggestions:

  • Normal chat: 500-800

  • Short article generation: 800-2000

  • Code generation: 2000-3600

  • Long article generation: 4000 and above (requires model support)

Generally, the model's response will be limited within the MaxToken range. However, it might be truncated (e.g., when writing long code) or the expression may be incomplete. In special cases, you need to adjust it flexibly according to the actual situation.

Streaming Output (Stream)

Streaming output is a data processing method that allows data to be transmitted and processed as a continuous stream, rather than sending all data at once. This method allows data to be processed and output immediately after it is generated, greatly improving real-time performance and efficiency.

In an environment like the CherryStudio client, it's simply a typewriter effect.

When off (non-streaming): The model outputs the entire message at once after generating it (imagine receiving a message on WeChat).

When on: Word-by-word output. You can think of it as the large model sending you each word as it generates it, until the entire message is sent.

Some models do not support streaming output and require this switch to be turned off, for example o1-mini, which initially supported only non-streaming responses.
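
A rough illustration of the difference at the request level (placeholders again for the endpoint, key, and model; curl's -N flag only disables output buffering so the chunks print as they arrive):

# Streaming on: the reply arrives as a sequence of small "data: {...}" chunks
curl -N https://api.example.com/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "your-model-name", "stream": true, "messages": [{"role": "user", "content": "hi"}]}'

# Streaming off: the same request with "stream": false returns the full reply in one JSON response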

Custom Parameters

Adds extra request parameters to the request body, such as presence_penalty, which are generally not needed by most users.

Parameters mentioned above such as top_p, max_tokens, and stream are examples of such request parameters.

How to fill: Parameter Name—Parameter Type (text, number, etc.)—Value. Refer to the documentation: Click to go

Each model provider has more or less its own unique parameters. You need to find their usage methods in the provider's documentation.

  • Custom parameters have a higher priority than built-in parameters. That is, if a custom parameter conflicts with a built-in parameter, the custom parameter will override the built-in one.

For example: If you set model to gpt-4o in the custom parameters, the gpt-4o model will be used in the conversation regardless of which model is selected.

  • Setting a parameter's value to undefined excludes that parameter from the request.
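
As a rough sketch of what this means at the request level (placeholder endpoint, key, and model; presence_penalty is simply the example parameter named above): a custom parameter is merged into the request body sent to the provider, and it overrides any built-in field with the same name.

curl https://api.example.com/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "your-model-name",
    "presence_penalty": 0.6,
    "messages": [{"role": "user", "content": "hi"}]
  }'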


SearXNG Local Deployment & Configuration

This document was translated from Chinese by AI and has not yet been reviewed.

SearXNG Deployment and Configuration

CherryStudio supports web searches through SearXNG. SearXNG is an open-source project that can be deployed locally or on a server, so its configuration is slightly different from other methods that require an API provider.

SearXNG Project Link: SearXNG

Advantages of SearXNG

  • Open-source and free, no API required

  • Relatively high privacy

  • Highly customizable

Local Deployment

1. Direct Deployment with Docker

Since SearXNG does not require a complex environment setup, you can deploy it without using docker compose. Simply providing an available port is sufficient. Therefore, the quickest method is to directly pull the image and deploy it using Docker.

1. Download, install, and configure docker

After installation, select a path to store images:

2. Search for and pull the SearXNG image

Enter searxng in the search bar:

Pull the image:

3. Run the image

After the pull is successful, go to the images page:

Select the pulled image and click Run:

Open the settings to configure:

Using port 8085 as an example:

After it starts successfully, click the link to open the SearXNG frontend interface:

This page indicates a successful deployment:
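
If you prefer the command line to the Docker Desktop interface, the same deployment can be done with two commands (a sketch; the host port 8085 matches the example above and can be changed, while 8080 is the port SearXNG listens on inside the container):

docker pull docker.io/searxng/searxng:latest
docker run -d --name searxng -p 8085:8080 docker.io/searxng/searxng:latest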

Server Deployment

Given that installing Docker on Windows can be quite troublesome, users can deploy SearXNG on a server, which also allows sharing it with others. Unfortunately, SearXNG itself does not currently support authentication, meaning others could scan for and abuse your deployed instance through technical means.

To address this, Cherry Studio now supports configuring HTTP Basic Authentication (RFC7617). If you plan to expose your self-deployed SearXNG to the public internet, you must configure HTTP Basic Authentication using a reverse proxy software like Nginx. The following is a brief tutorial that requires basic Linux system administration knowledge.

Deploying SearXNG

Similarly, we will still use Docker for deployment. Assuming you have already installed the latest version of Docker CE on your server following the official tutorial, here is a one-stop command for a fresh installation on a Debian system:

sudo apt update
sudo apt install git -y

# Clone the official repository
cd /opt
git clone https://github.com/searxng/searxng-docker.git
cd /opt/searxng-docker

# If your server has low bandwidth, you can set this to false
export IMAGE_PROXY=true

# Modify the configuration file
cat <<EOF > /opt/searxng-docker/searxng/settings.yml
# see https://docs.searxng.org/admin/settings/settings.html#settings-use-default-settings
use_default_settings: true
server:
  # base_url is defined in the SEARXNG_BASE_URL environment variable, see .env and docker-compose.yml
  secret_key: $(openssl rand -hex 32)
  limiter: false  # can be disabled for a private instance
  image_proxy: $IMAGE_PROXY
ui:
  static_use_hash: true
redis:
  url: redis://redis:6379/0
search:
  formats:
    - html
    - json
EOF

If you need to change the local listening port or reuse an existing local nginx, you can edit the docker-compose.yaml file. Refer to the following example:

version: "3.7"

services:
# If you don't need Caddy and want to reuse an existing local Nginx, remove the section below. We don't need Caddy by default.
  caddy:
    container_name: caddy
    image: docker.io/library/caddy:2-alpine
    network_mode: host
    restart: unless-stopped
    volumes:
      - ./Caddyfile:/etc/caddy/Caddyfile:ro
      - caddy-data:/data:rw
      - caddy-config:/config:rw
    environment:
      - SEARXNG_HOSTNAME=${SEARXNG_HOSTNAME:-http://localhost}
      - SEARXNG_TLS=${LETSENCRYPT_EMAIL:-internal}
    cap_drop:
      - ALL
    cap_add:
      - NET_BIND_SERVICE
    logging:
      driver: "json-file"
      options:
        max-size: "1m"
        max-file: "1"
# If you don't need Caddy and want to reuse an existing local Nginx, remove the section above. We don't need Caddy by default.
  redis:
    container_name: redis
    image: docker.io/valkey/valkey:8-alpine
    command: valkey-server --save 30 1 --loglevel warning
    restart: unless-stopped
    networks:
      - searxng
    volumes:
      - valkey-data2:/data
    cap_drop:
      - ALL
    cap_add:
      - SETGID
      - SETUID
      - DAC_OVERRIDE
    logging:
      driver: "json-file"
      options:
        max-size: "1m"
        max-file: "1"

  searxng:
    container_name: searxng
    image: docker.io/searxng/searxng:latest
    restart: unless-stopped
    networks:
      - searxng
    # By default, it maps to port 8080 on the host. If you want to listen on port 8000, change it to "127.0.0.1:8000:8080"
    ports:
      - "127.0.0.1:8080:8080"
    volumes:
      - ./searxng:/etc/searxng:rw
    environment:
      - SEARXNG_BASE_URL=https://${SEARXNG_HOSTNAME:-localhost}/
      - UWSGI_WORKERS=${SEARXNG_UWSGI_WORKERS:-4}
      - UWSGI_THREADS=${SEARXNG_UWSGI_THREADS:-4}
    cap_drop:
      - ALL
    cap_add:
      - CHOWN
      - SETGID
      - SETUID
    logging:
      driver: "json-file"
      options:
        max-size: "1m"
        max-file: "1"

networks:
  searxng:

volumes:
# If you don't need Caddy and want to reuse an existing local Nginx, remove the section below
  caddy-data:
  caddy-config:
# If you don't need Caddy and want to reuse an existing local Nginx, remove the section above
  valkey-data2:

Run docker compose up -d to start. Run docker compose logs -f searxng to view the logs.

Deploying Nginx Reverse Proxy and HTTP Basic Authentication

If you are using a server control panel like Baota Panel or 1Panel, please refer to their documentation to add a website and configure the nginx reverse proxy. Then, find where to modify the nginx configuration file and make changes based on the example below:

server
{
    listen 443 ssl;

    # This line is your hostname
    server_name search.example.com;

    # index index.html;
    # root /data/www/default;

    # If you have configured SSL, you should have these two lines
    ssl_certificate    /path/to/your/cert/fullchain.pem;
    ssl_certificate_key    /path/to/your/cert/privkey.pem;

    # HSTS
    # add_header Strict-Transport-Security "max-age=31536000; includeSubDomains; preload";

    # When configuring a reverse proxy through a panel, the default location block looks like this
    location / {
        # Just add the two lines below to the location block, leaving everything else as is.
        # This example assumes your configuration file is saved in the /etc/nginx/conf.d/ directory.
        # For Baota, it would likely be saved in a directory like /www, so be aware of that.
        auth_basic "Please enter your username and password";
        auth_basic_user_file /etc/nginx/conf.d/search.htpasswd;

        proxy_http_version 1.1;
        proxy_set_header Connection "";
        proxy_redirect off;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_protocol_addr;
        proxy_pass http://127.0.0.1:8000;
        client_max_body_size 0;
    }

    # access_log  ...;
    # error_log  ...;
}

Assuming the Nginx configuration file is saved in /etc/nginx/conf.d, we will save the password file in the same directory.

Execute the command (replace example_name and example_password with the username and password you intend to set):

echo "example_name:$(openssl passwd -5 'example_password')" > /etc/nginx/conf.d/search.htpasswd

Restart Nginx (reloading the configuration also works).

Now, try opening the webpage. You should be prompted to enter a username and password. Enter the credentials you set earlier to see if you can successfully access the SearXNG search page, thereby checking if the configuration is correct.
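
You can also verify it from a terminal (a sketch using the example hostname and credentials from above; the second request additionally checks that the JSON format is enabled):

# Without credentials this should return 401 Unauthorized
curl -I https://search.example.com/

# With credentials it should return search results as JSON
curl -u example_name:example_password "https://search.example.com/search?q=test&format=json"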

Cherry Studio Related Configuration

After successfully deploying SearXNG locally or on a server, the next step is to configure it in CherryStudio.

Go to the Web Search settings page and select Searxng:

If you enter the link for the local deployment directly and validation fails, don't worry:

This is because a direct deployment does not have the json return type configured by default, so data cannot be retrieved. You need to modify the configuration file.

Go back to Docker, and in the Files tab, find the tagged folder within the image:

After expanding it, scroll down further, and you will find another tagged folder:

Expand it again and find the settings.yml configuration file:

Click to open the file editor:

Find line 78. You will see that the only format type listed is html.

Add the json type, save, and restart the container.

Return to Cherry Studio to validate again. Validation successful:

The address can be either local: http://localhost:<port_number> or the Docker address: http://host.docker.internal:<port_number>

If you followed the previous example to deploy on a server and correctly configured the reverse proxy, the json return type will already be enabled. After entering the address and validating, since HTTP Basic Authentication has been configured for the reverse proxy, the validation should now return a 401 error code:

Configure HTTP Basic Authentication in the client, entering the username and password you just set:

Validate, and it should succeed.

Other Configurations

At this point, SearXNG has default web search capabilities. If you need to customize the search engines, you need to configure it yourself.

Note that preferences set in the SearXNG web interface do not affect the engine configuration used when the large model calls the search API.

To configure the search engines that the large model will use, you need to set them in the configuration file:

Language configuration reference:

If the content is too long and inconvenient to edit directly, you can copy it to a local IDE, modify it, and then paste it back into the configuration file.

Common Reasons for Validation Failure

JSON format not added to return formats

Add json to the return formats in the configuration file:
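
For reference, the relevant fragment of settings.yml should end up looking like the search block from the server-deployment example earlier on this page:

search:
  formats:
    - html
    - json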

Search engine not configured correctly

Cherry Studio defaults to selecting engines whose categories include both "web" and "general" for searching. By default, engines like Google are selected, and these fail in mainland China due to access restrictions. Adding the following configuration to force SearXNG to use the Baidu engine can solve the problem:

use_default_settings:
  engines:
    keep_only:
      - baidu
engines:
  - name: baidu
    engine: baidu 
    categories: 
      - web
      - general
    disabled: false

Access rate is too fast

The limiter setting in SearXNG is blocking API access. Try setting limiter to false in settings.yml (the server-deployment example above already sets it to false):

Common Models Reference Information

This document was translated from Chinese by AI and has not yet been reviewed.

Common Model Reference Information

  • The following information is for reference only. If there are any errors, please contact us for correction. The context size and model information may vary for different providers of some models;

  • When inputting data in the client, "k" needs to be converted to its actual numerical value (theoretically 1k=1024 tokens; 1m=1024k tokens), e.g., 8k is 8×1024=8192 tokens. It is recommended to multiply by 1000 in actual use to prevent errors, e.g., 8k as 8×1000=8000, and 1m as 1×1000000=1000000;

  • A max output of "-" indicates that no clear maximum output information for the model was found from official sources.

Each entry below lists, in order: Model Name, Max Input, Max Output, Function Calling, Model Capabilities, Provider, and Introduction.

360gpt-pro

8k

-

Not Supported

Conversation

360AI_360gpt

The flagship hundred-billion-parameter large model in the 360 AI Brain series, with the best performance, widely applicable to complex task scenarios in various fields.

360gpt-turbo

7k

-

Not Supported

Conversation

360AI_360gpt

A ten-billion-parameter large model that balances performance and effectiveness, suitable for scenarios with high requirements for performance/cost.

360gpt-turbo-responsibility-8k

8k

-

Not Supported

Conversation

360AI_360gpt

A ten-billion-parameter large model that balances performance and effectiveness, suitable for scenarios with high requirements for performance/cost.

360gpt2-pro

8k

-

Not Supported

Conversation

360AI_360gpt

The flagship hundred-billion-parameter large model in the 360 AI Brain series, with the best performance, widely applicable to complex task scenarios in various fields.

claude-3-5-sonnet-20240620

200k

16k

Not Supported

Conversation, Vision

Anthropic_claude

A snapshot version released on June 20, 2024. Claude 3.5 Sonnet is a model that balances performance and speed, offering top-tier performance while maintaining high speed, and supports multimodal input.

claude-3-5-haiku-20241022

200k

16k

Not Supported

Conversation

Anthropic_claude

A snapshot version released on October 22, 2024. Claude 3.5 Haiku has improved across various skills, including coding, tool use, and reasoning. As the fastest model in the Anthropic family, it provides rapid response times, suitable for applications requiring high interactivity and low latency, such as user-facing chatbots and instant code completion. It also excels in specialized tasks like data extraction and real-time content moderation, making it a versatile tool for wide application across industries. It does not support image input.

claude-3-5-sonnet-20241022

200k

8K

Not Supported

Conversation, Vision

Anthropic_claude

A snapshot version released on October 22, 2024. Claude 3.5 Sonnet offers capabilities surpassing Opus and faster speeds than Sonnet, while maintaining the same price as Sonnet. Sonnet is particularly adept at programming, data science, visual processing, and agentic tasks.

claude-3-5-sonnet-latest

200K

8k

Not Supported

Conversation, Vision

Anthropic_claude

Dynamically points to the latest Claude 3.5 Sonnet version. Claude 3.5 Sonnet offers capabilities surpassing Opus and faster speeds than Sonnet, while maintaining the same price as Sonnet. Sonnet is particularly adept at programming, data science, visual processing, and agentic tasks. This model points to the latest version.

claude-3-haiku-20240307

200k

4k

Not Supported

Conversation, Vision

Anthropic_claude

Claude 3 Haiku is Anthropic's fastest and most compact model, designed for near-instantaneous responses. It features fast and accurate targeted performance.

claude-3-opus-20240229

200k

4k

Not Supported

Conversation, Vision

Anthropic_claude

Claude 3 Opus is Anthropic's most powerful model for handling highly complex tasks. It excels in performance, intelligence, fluency, and comprehension.

claude-3-sonnet-20240229

200k

8k

Not Supported

Conversation, Vision

Anthropic_claude

A snapshot version released on February 29, 2024. Sonnet is particularly adept at: - Coding: Can autonomously write, edit, and run code, with reasoning and troubleshooting capabilities - Data Science: Enhances human data science expertise; can process unstructured data when using multiple tools to gain insights - Visual Processing: Excels at interpreting charts, graphs, and images, accurately transcribing text to extract insights beyond the text itself - Agentic Tasks: Excellent tool use, making it ideal for handling agentic tasks (i.e., complex, multi-step problem-solving that requires interaction with other systems)

google/gemma-2-27b-it

8k

-

Not Supported

Conversation

Google_gamma

Gemma is a family of lightweight, state-of-the-art open models from Google, built from the same research and technology used to create the Gemini models. They are decoder-only large language models that support English and come with open weights, pre-trained, and instruction-tuned variants. Gemma models are well-suited for a variety of text generation tasks, including question answering, summarization, and reasoning.

google/gemma-2-9b-it

8k

-

Not Supported

Conversation

Google_gamma

Gemma is one of the lightweight, state-of-the-art open model series developed by Google. It is a decoder-only large language model that supports English, with open weights, pre-trained, and instruction-tuned variants available. Gemma models are suitable for various text generation tasks, including question answering, summarization, and reasoning. This 9B model was trained on 8 trillion tokens.

gemini-1.5-pro

2m

8k

Not Supported

Conversation

Google_gemini

The latest stable version of Gemini 1.5 Pro. As a powerful multimodal model, it can handle up to 60,000 lines of code or 2,000 pages of text. It is particularly suitable for tasks requiring complex reasoning.

gemini-1.0-pro-001

33k

8k

Not Supported

Conversation

Google_gemini

This is a stable version of Gemini 1.0 Pro. As an NLP model, it specializes in tasks like multi-turn text and code chat, as well as code generation. This model will be discontinued on February 15, 2025, and it is recommended to migrate to the 1.5 series models.

gemini-1.0-pro-002

32k

8k

Not Supported

Conversation

Google_gemini

This is a stable version of Gemini 1.0 Pro. As an NLP model, it specializes in tasks like multi-turn text and code chat, as well as code generation. This model will be discontinued on February 15, 2025, and it is recommended to migrate to the 1.5 series models.

gemini-1.0-pro-latest

33k

8k

Not Supported

Conversation, Deprecated or soon to be deprecated

Google_gemini

This is the latest version of Gemini 1.0 Pro. As an NLP model, it specializes in tasks like multi-turn text and code chat, as well as code generation. This model will be discontinued on February 15, 2025, and it is recommended to migrate to the 1.5 series models.

gemini-1.0-pro-vision-001

16k

2k

Not Supported

Conversation

Google_gemini

This is the vision version of Gemini 1.0 Pro. This model will be discontinued on February 15, 2025, and it is recommended to migrate to the 1.5 series models.

gemini-1.0-pro-vision-latest

16k

2k

Not Supported

Vision

Google_gemini

This is the latest vision version of Gemini 1.0 Pro. This model will be discontinued on February 15, 2025, and it is recommended to migrate to the 1.5 series models.

gemini-1.5-flash

1m

8k

Not Supported

Conversation, Vision

Google_gemini

This is the latest stable version of Gemini 1.5 Flash. As a balanced multimodal model, it can process audio, image, video, and text inputs.

gemini-1.5-flash-001

1m

8k

Not Supported

Conversation, Vision

Google_gemini

This is a stable version of Gemini 1.5 Flash. It offers the same basic features as gemini-1.5-flash but is version-pinned, making it suitable for production environments.

gemini-1.5-flash-002

1m

8k

Not Supported

Conversation, Vision

Google_gemini

This is a stable version of Gemini 1.5 Flash. It offers the same basic features as gemini-1.5-flash but is version-pinned, making it suitable for production environments.

gemini-1.5-flash-8b

1m

8k

Not Supported

Conversation, Vision

Google_gemini

Gemini 1.5 Flash-8B is Google's latest multimodal AI model, designed for efficient handling of large-scale tasks. With 8 billion parameters, the model supports text, image, audio, and video inputs, making it suitable for various application scenarios such as chat, transcription, and translation. Compared to other Gemini models, Flash-8B is optimized for speed and cost-effectiveness, especially for cost-sensitive users. Its rate limit is doubled, allowing developers to handle large-scale tasks more efficiently. Additionally, Flash-8B uses "knowledge distillation" technology to extract key knowledge from larger models, ensuring it is lightweight and efficient while retaining core capabilities.

gemini-1.5-flash-exp-0827

1m

8k

Not Supported

Conversation, Vision

Google_gemini

This is an experimental version of Gemini 1.5 Flash, which is regularly updated with the latest improvements. It is suitable for exploratory testing and prototyping, but not recommended for production environments.

gemini-1.5-flash-latest

1m

8k

Not Supported

Conversation, Vision

Google_gemini

This is the cutting-edge version of Gemini 1.5 Flash, which is regularly updated with the latest improvements. It is suitable for exploratory testing and prototyping, but not recommended for production environments.

gemini-1.5-pro-001

2m

8k

Not Supported

Conversation, Vision

Google_gemini

This is a stable version of Gemini 1.5 Pro, offering fixed model behavior and performance characteristics. It is suitable for production environments that require stability.

gemini-1.5-pro-002

2m

8k

Not Supported

Conversation, Vision

Google_gemini

This is a stable version of Gemini 1.5 Pro, offering fixed model behavior and performance characteristics. It is suitable for production environments that require stability.

gemini-1.5-pro-exp-0801

2m

8k

Not Supported

Conversation, Vision

Google_gemini

An experimental version of Gemini 1.5 Pro. As a powerful multimodal model, it can handle up to 60,000 lines of code or 2,000 pages of text. It is particularly suitable for tasks requiring complex reasoning.

gemini-1.5-pro-exp-0827

2m

8k

Not Supported

Conversation, Vision

Google_gemini

An experimental version of Gemini 1.5 Pro. As a powerful multimodal model, it can handle up to 60,000 lines of code or 2,000 pages of text. It is particularly suitable for tasks requiring complex reasoning.

gemini-1.5-pro-latest

2m

8k

Not Supported

Conversation, Vision

Google_gemini

This is the latest version of Gemini 1.5 Pro, dynamically pointing to the most recent snapshot version.

gemini-2.0-flash

1m

8k

Not Supported

Conversation, Vision

Google_gemini

Gemini 2.0 Flash is Google's latest model, featuring a faster Time to First Token (TTFT) compared to the 1.5 version, while maintaining a quality level comparable to Gemini Pro 1.5. This model shows significant improvements in multimodal understanding, coding ability, complex instruction following, and function calling, thereby providing a smoother and more powerful intelligent experience.

gemini-2.0-flash-exp

100k

8k

Supported

Conversation, Vision

Google_gemini

Gemini 2.0 Flash introduces a real-time multimodal API, improved speed and performance, enhanced quality, stronger agent capabilities, and adds image generation and voice conversion functions.

gemini-2.0-flash-lite-preview-02-05

1M

8k

Not Supported

Conversation, Vision

Google_gemini

Gemini 2.0 Flash-Lite is Google's latest cost-effective AI model, offering better quality at the same speed as 1.5 Flash. It supports a 1 million token context window and can handle multimodal tasks involving images, audio, and code. As Google's most cost-effective model currently, it uses a simplified single pricing strategy, making it particularly suitable for large-scale application scenarios that require cost control.

gemini-2.0-flash-thinking-exp

40k

8k

Not Supported

Conversation, Reasoning

Google_gemini

gemini-2.0-flash-thinking-exp is an experimental model that can generate the "thinking process" it goes through when formulating a response. Therefore, "thinking mode" responses have stronger reasoning capabilities compared to the basic Gemini 2.0 Flash model.

gemini-2.0-flash-thinking-exp-01-21

1m

64k

Not Supported

Conversation, Reasoning

Google_gemini

Gemini 2.0 Flash Thinking EXP-01-21 is Google's latest AI model, focusing on enhancing reasoning abilities and user interaction experience. The model has strong reasoning capabilities, especially in math and programming, and supports a context window of up to 1 million tokens, suitable for complex tasks and in-depth analysis scenarios. Its unique feature is the ability to generate its thinking process, improving the comprehensibility of AI thinking. It also supports native code execution, enhancing the flexibility and practicality of interactions. By optimizing algorithms, the model reduces logical contradictions, further improving the accuracy and consistency of its answers.

gemini-2.0-flash-thinking-exp-1219

40k

8k

Not Supported

Conversation, Reasoning, Vision

Google_gemini

gemini-2.0-flash-thinking-exp-1219 is an experimental model that can generate the "thinking process" it goes through when formulating a response. Therefore, "thinking mode" responses have stronger reasoning capabilities compared to the basic Gemini 2.0 Flash model.

gemini-2.0-pro-exp-01-28

2m

64k

Not Supported

Conversation, Vision

Google_gemini

Pre-announced model, not yet online.

gemini-2.0-pro-exp-02-05

2m

8k

Not Supported

Conversation, Vision

Google_gemini

Gemini 2.0 Pro Exp 02-05 is Google's latest experimental model released in February 2025, excelling in world knowledge, code generation, and long-text understanding. The model supports an ultra-long context window of 2 million tokens, capable of processing content equivalent to 2 hours of video, 22 hours of audio, over 60,000 lines of code, and more than 1.4 million words. As part of the Gemini 2.0 series, this model adopts a new Flash Thinking training strategy, significantly improving its performance and ranking high on several LLM leaderboards, demonstrating strong comprehensive capabilities.

gemini-exp-1114

8k

4k

Not Supported

Conversation, Vision

Google_gemini

This is an experimental model released on November 14, 2024, primarily focusing on quality improvements.

gemini-exp-1121

8k

4k

Not Supported

Conversation, Vision, Code

Google_gemini

This is an experimental model released on November 21, 2024, with improvements in coding, reasoning, and visual capabilities.

gemini-exp-1206

8k

4k

Not Supported

Conversation, Vision

Google_gemini

This is an experimental model released on December 6, 2024, with improvements in coding, reasoning, and visual capabilities.

gemini-exp-latest

8k

4k

Not Supported

Conversation, Vision

Google_gemini

This is an experimental model, dynamically pointing to the latest version.

gemini-pro

33k

8k

Not Supported

Conversation

Google_gemini

Same as gemini-1.0-pro, it is an alias for gemini-1.0-pro.

gemini-pro-vision

16k

2k

Not Supported

Conversation, Vision

Google_gemini

This is the vision version of Gemini 1.0 Pro. This model will be discontinued on February 15, 2025, and it is recommended to migrate to the 1.5 series models.

grok-2

128k

-

Not Supported

Conversation

Grok_grok

A new version of the grok model released by X.ai on December 12, 2024.

grok-2-1212

128k

-

Not Supported

Conversation

Grok_grok

A new version of the grok model released by X.ai on December 12, 2024.

grok-2-latest

128k

-

Not Supported

Conversation

Grok_grok

A new version of the grok model released by X.ai on December 12, 2024.

grok-2-vision-1212

32k

-

Not Supported

Conversation, Vision

Grok_grok

The grok vision version model released by X.ai on December 12, 2024.

grok-beta

100k

-

Not Supported

Conversation

Grok_grok

Performance comparable to Grok 2, but with improved efficiency, speed, and functionality.

grok-vision-beta

8k

-

Not Supported

Conversation, Vision

Grok_grok

The latest image understanding model can process various visual information, including documents, charts, screenshots, and photos.

internlm/internlm2_5-20b-chat

32k

-

Supported

Conversation

internlm

InternLM2.5-20B-Chat is an open-source large-scale conversational model developed based on the InternLM2 architecture. With 20 billion parameters, this model excels in mathematical reasoning, surpassing comparable models like Llama3 and Gemma2-27B. InternLM2.5-20B-Chat has significantly improved tool-calling capabilities, supporting information collection from hundreds of web pages for analysis and reasoning, and possessing stronger instruction understanding, tool selection, and result reflection abilities.

meta-llama/Llama-3.2-11B-Vision-Instruct

8k

-

Not Supported

Conversation, Vision

Meta_llama

The current Llama series models can not only process text data but also image data. Some models in Llama 3.2 have added visual understanding functions. This model supports simultaneous input of text and image data, understands the image, and outputs text information.

meta-llama/Llama-3.2-3B-Instruct

32k

-

Not Supported

Conversation

Meta_llama

Meta Llama 3.2 multilingual Large Language Models (LLMs), where 1B and 3B are lightweight models that can run on edge and mobile devices. This model is the 3B version.

meta-llama/Llama-3.2-90B-Vision-Instruct

8k

-

Not Supported

Conversation, Vision

Meta_llama

The current Llama series models can not only process text data but also image data. Some models in Llama 3.2 have added visual understanding functions. This model supports simultaneous input of text and image data, understands the image, and outputs text information.

meta-llama/Llama-3.3-70B-Instruct

131k

-

Not Supported

Conversation

Meta_llama

Meta's latest 70B LLM, with performance comparable to Llama 3.1 405B.

meta-llama/Meta-Llama-3.1-405B-Instruct

32k

-

Not Supported

Conversation

Meta_llama

The Meta Llama 3.1 multilingual Large Language Model (LLM) collection is a set of pre-trained and instruction-tuned generative models in 8B, 70B, and 405B sizes. This model is the 405B version. The Llama 3.1 instruction-tuned text models (8B, 70B, 405B) are optimized for multilingual conversations and outperform many available open-source and closed-source chat models on common industry benchmarks.

meta-llama/Meta-Llama-3.1-70B-Instruct

32k

-

Not Supported

Conversation

Meta_llama

Meta Llama 3.1 is a family of multilingual large language models developed by Meta, including pre-trained and instruction-tuned variants in 8B, 70B, and 405B parameter sizes. This 70B instruction-tuned model is optimized for multilingual conversation scenarios and performs excellently on several industry benchmarks. The model was trained on over 15 trillion tokens of public data and uses techniques like supervised fine-tuning and reinforcement learning with human feedback to enhance its usefulness and safety.

meta-llama/Meta-Llama-3.1-8B-Instruct

32k

-

Not Supported

Conversation

Meta_llama

The Meta Llama 3.1 multilingual Large Language Model (LLM) collection is a set of pre-trained and instruction-tuned generative models in 8B, 70B, and 405B sizes. This model is the 8B version. The Llama 3.1 instruction-tuned text models (8B, 70B, 405B) are optimized for multilingual conversations and outperform many available open-source and closed-source chat models on common industry benchmarks.

abab5.5-chat

16k

-

Supported

Conversation

Minimax_abab

Chinese persona conversation scenarios.

abab5.5s-chat

8k

-

Supported

Conversation

Minimax_abab

Chinese persona conversation scenarios.

abab6.5g-chat

8k

-

Supported

Conversation

Minimax_abab

Persona conversation scenarios in English and other languages.

abab6.5s-chat

245k

-

Supported

Conversation

Minimax_abab

General scenarios.

abab6.5t-chat

8k

-

Supported

Conversation

Minimax_abab

Chinese persona conversation scenarios.

chatgpt-4o-latest

128k

16k

Not Supported

Conversation, Vision

OpenAI

The chatgpt-4o-latest model version continuously points to the GPT-4o version used in ChatGPT and is updated the fastest when there are significant changes.

gpt-4o-2024-11-20

128k

16k

Supported

Conversation

OpenAI

The latest gpt-4o snapshot version from November 20, 2024.

gpt-4o-audio-preview

128k

16k

Not Supported

Conversation

OpenAI

OpenAI's real-time voice conversation model.

gpt-4o-audio-preview-2024-10-01

128k

16k

Supported

Conversation

OpenAI

OpenAI's real-time voice conversation model.

o1

128k

32k

Not Supported

Conversation, Reasoning, Vision

OpenAI

OpenAI's new reasoning model for complex tasks that require extensive common sense. The model has a 200k context, is currently the most powerful model in the world, and supports image recognition.

o1-mini-2024-09-12

128k

64k

Not Supported

Conversation, Reasoning

OpenAI

A fixed snapshot version of o1-mini. It is smaller, faster, and 80% cheaper than o1-preview, performing well in code generation and small-context operations.

o1-preview-2024-09-12

128k

32k

Not Supported

Conversation, Reasoning

OpenAI

A fixed snapshot version of o1-preview.

gpt-3.5-turbo

16k

4k

Supported

Conversation

OpenAI_gpt-3

Based on GPT-3.5: GPT-3.5 Turbo is an improved version built on the GPT-3.5 model, developed by OpenAI. Performance Goals: Designed to improve model inference speed, processing efficiency, and resource utilization through optimized model structure and algorithms. Increased Inference Speed: Compared to GPT-3.5, GPT-3.5 Turbo typically offers faster inference speeds on the same hardware, which is particularly beneficial for applications requiring large-scale text processing. Higher Throughput: When processing a large number of requests or data, GPT-3.5 Turbo can achieve higher concurrent processing capabilities, thereby increasing overall system throughput. Optimized Resource Consumption: While maintaining performance, it may have reduced demand for hardware resources (such as memory and computing resources), which helps lower operating costs and improve system scalability. Wide Range of NLP Tasks: GPT-3.5 Turbo is suitable for a variety of natural language processing tasks, including but not limited to text generation, semantic understanding, dialogue systems, and machine translation. Developer Tools and API Support: Provides API interfaces that are easy for developers to integrate and use, supporting rapid application development and deployment.

gpt-3.5-turbo-0125

16k

4k

Supported

Conversation

OpenAI_gpt-3

An updated GPT 3.5 Turbo model with higher accuracy in responding to requested formats and a fix for a bug that caused text encoding issues for non-English language function calls. Returns a maximum of 4,096 output tokens.

gpt-3.5-turbo-0613

16k

4k

Supported

Conversation

OpenAI_gpt-3

Updated fixed snapshot version of GPT 3.5 Turbo. Now deprecated.

gpt-3.5-turbo-1106

16k

4k

Supported

Conversation

OpenAI_gpt-3

Features improved instruction following, JSON mode, reproducible outputs, parallel function calling, and more. Returns a maximum of 4,096 output tokens.

gpt-3.5-turbo-16k

16k

4k

Supported

Conversation, Deprecated or soon to be deprecated

OpenAI_gpt-3

(Deprecated)

gpt-3.5-turbo-16k-0613

16k

4k

Supported

Conversation, Deprecated or soon to be deprecated

OpenAI_gpt-3

A snapshot of gpt-3.5-turbo from June 13, 2023. (Deprecated)

gpt-3.5-turbo-instruct

4k

4k

Supported

Conversation

OpenAI_gpt-3

Capabilities similar to GPT-3 era models. Compatible with the legacy Completions endpoint, not for Chat Completions.

gpt-3.5o

16k

4k

Not Supported

Conversation

OpenAI_gpt-3

Same as gpt-4o-lite.

gpt-4

8k

8k

Supported

Conversation

OpenAI_gpt-4

Currently points to gpt-4-0613.

gpt-4-0125-preview

128k

4k

Supported

Conversation

OpenAI_gpt-4

The latest GPT-4 model, designed to reduce "laziness" where the model does not complete tasks. Returns a maximum of 4,096 output tokens.

gpt-4-0314

8k

8k

Supported

Conversation

OpenAI_gpt-4

A snapshot of gpt-4 from March 14, 2023.

gpt-4-0613

8k

8k

Supported

Conversation

OpenAI_gpt-4

A snapshot of gpt-4 from June 13, 2023, with enhanced function calling support.

gpt-4-1106-preview

128k

4k

Supported

Conversation

OpenAI_gpt-4

A GPT-4 Turbo model with improved instruction following, JSON mode, reproducible outputs, function calling, and more. Returns a maximum of 4,096 output tokens. This is a preview model.

gpt-4-32k

32k

4k

Supported

Conversation

OpenAI_gpt-4

gpt-4-32k will be deprecated on 2025-06-06.

gpt-4-32k-0613

32k

4k

Supported

Conversation, Deprecated or soon to be deprecated

OpenAI_gpt-4

Will be deprecated on 2025-06-06.

gpt-4-turbo

128k

4k

Supported

Conversation

OpenAI_gpt-4

The latest version of the GPT-4 Turbo model adds vision capabilities, supporting visual requests via JSON mode and function calling. The current version of this model is gpt-4-turbo-2024-04-09.

gpt-4-turbo-2024-04-09

128k

4k

Supported

Conversation

OpenAI_gpt-4

GPT-4 Turbo model with vision capabilities. Vision requests can now be made via JSON mode and function calling. gpt-4-turbo currently points to this version.

gpt-4-turbo-preview

128k

4k

Supported

Conversation, Vision

OpenAI_gpt-4

Currently points to gpt-4-0125-preview.

gpt-4o

128k

16k

Supported

Conversation, Vision

OpenAI_gpt-4

OpenAI's highly intelligent flagship model, suitable for complex, multi-step tasks. GPT-4o is cheaper and faster than GPT-4 Turbo.

gpt-4o-2024-05-13

128k

4k

Supported

Conversation, Vision

OpenAI_gpt-4

The original gpt-4o snapshot from May 13, 2024.

gpt-4o-2024-08-06

128k

16k

Supported

Conversation, Vision

OpenAI_gpt-4

The first snapshot to support structured outputs. gpt-4o currently points to this version.

gpt-4o-mini

128k

16k

Supported

Conversation, Vision

OpenAI_gpt-4

OpenAI's affordable version of gpt-4o, suitable for fast, lightweight tasks. GPT-4o mini is cheaper and more powerful than GPT-3.5 Turbo. Currently points to gpt-4o-mini-2024-07-18.

gpt-4o-mini-2024-07-18

128k

16k

Supported

Conversation, Vision

OpenAI_gpt-4

A fixed snapshot version of gpt-4o-mini.

gpt-4o-realtime-preview

128k

4k

Supported

Conversation, Real-time Voice

OpenAI_gpt-4

OpenAI's real-time voice conversation model.

gpt-4o-realtime-preview-2024-10-01

128k

4k

Supported

Conversation, Real-time Voice, Vision

OpenAI_gpt-4

gpt-4o-realtime-preview currently points to this snapshot version.

o1-mini

128k

64k

Not Supported

Conversation, Reasoning

OpenAI_o1

Smaller, faster, and 80% cheaper than o1-preview, performing well in code generation and small-context operations.

o1-preview

128k

32k

Not Supported

Conversation, Reasoning

OpenAI_o1

o1-preview is a new reasoning model for complex tasks that require extensive common sense. The model has a 128K context and a knowledge cutoff of October 2023. It focuses on advanced reasoning and solving complex problems, including mathematical and scientific tasks. It is ideal for applications requiring deep contextual understanding and autonomous workflows.

o3-mini

200k

100k

Supported

Conversation, Reasoning

OpenAI_o1

o3-mini is OpenAI's latest small reasoning model, offering high intelligence while maintaining the same cost and latency as o1-mini. It focuses on science, math, and coding tasks, supports developer features like structured output, function calling, and batch API, with a knowledge cutoff of October 2023, demonstrating a significant balance in reasoning capability and cost-effectiveness.

o3-mini-2025-01-31

200k

100k

Supported

Conversation, Reasoning

OpenAI_o1

o3-mini currently points to this version. o3-mini-2025-01-31 is OpenAI's latest small reasoning model, offering high intelligence while maintaining the same cost and latency as o1-mini. It focuses on science, math, and coding tasks, supports developer features like structured output, function calling, and batch API, with a knowledge cutoff of October 2023, demonstrating a significant balance in reasoning capability and cost-effectiveness.

Baichuan2-Turbo

32k

-

Not Supported

Conversation

Baichuan_baichuan

Compared to similarly sized models in the industry, this model maintains a leading performance while significantly reducing the price.

Baichuan3-Turbo

32k

-

Not Supported

Conversation

Baichuan_baichuan

Compared to similarly sized models in the industry, this model maintains a leading performance while significantly reducing the price.

Baichuan3-Turbo-128k

128k

-

Not Supported

Conversation

Baichuan_baichuan

The Baichuan model processes complex text with a 128k ultra-long context window, is specifically optimized for industries like finance, and significantly reduces costs while maintaining high performance, providing a cost-effective solution for enterprises.

Baichuan4

32k

-

Not Supported

Conversation

Baichuan_baichuan

Baichuan's MoE model provides a highly efficient and cost-effective solution for enterprise applications through specialized optimization, cost reduction, and performance enhancement.

Baichuan4-Air

32k

-

Not Supported

Conversation

Baichuan_baichuan

Baichuan's MoE model provides a highly efficient and cost-effective solution for enterprise applications through specialized optimization, cost reduction, and performance enhancement.

Baichuan4-Turbo

32k

-

Not Supported

Conversation

Baichuan_baichuan

Trained on massive high-quality scenario data, usability in high-frequency enterprise scenarios is improved by 10%+ compared to Baichuan4, information summarization by 50%, multilingual capabilities by 31%, and content generation by 13%. Specially optimized for inference performance, the first token response speed is increased by 51% and token stream speed by 73% compared to Baichuan4.

ERNIE-3.5-128K

128k

4k

Supported

Conversation

Baidu_ernie

Baidu's self-developed flagship large language model, covering massive Chinese and English corpora, with powerful general capabilities to meet most dialogue, Q&A, creative generation, and plugin application requirements. Supports automatic integration with the Baidu search plugin to ensure the timeliness of Q&A information.

ERNIE-3.5-8K

8k

1k

Supported

Conversation

Baidu_ernie

Baidu's self-developed flagship large language model, covering massive Chinese and English corpora, with powerful general capabilities to meet most dialogue, Q&A, creative generation, and plugin application requirements. Supports automatic integration with the Baidu search plugin to ensure the timeliness of Q&A information.

ERNIE-3.5-8K-Preview

8k

1k

Supported

Conversation

Baidu_ernie

Baidu's self-developed flagship large language model, covering massive Chinese and English corpora, with powerful general capabilities to meet most dialogue, Q&A, creative generation, and plugin application requirements. Supports automatic integration with the Baidu search plugin to ensure the timeliness of Q&A information.

ERNIE-4.0-8K

8k

1k

Supported

Conversation

Baidu_ernie

Baidu's self-developed flagship ultra-large-scale language model. Compared to ERNIE 3.5, it has a comprehensive upgrade in model capabilities, widely applicable to complex task scenarios in various fields. Supports automatic integration with the Baidu search plugin to ensure the timeliness of Q&A information.

ERNIE-4.0-8K-Latest

8k

2k

Supported

Conversation

Baidu_ernie

ERNIE-4.0-8K-Latest has fully improved capabilities compared to ERNIE-4.0-8K, with significant enhancements in role-playing and instruction-following abilities. Compared to ERNIE 3.5, it has a comprehensive upgrade in model capabilities, widely applicable to complex task scenarios in various fields. Supports automatic integration with the Baidu search plugin to ensure the timeliness of Q&A information, and supports 5K tokens input + 2K tokens output.

ERNIE-4.0-8K-Preview

8k

1k

Supported

Conversation

Baidu_ernie

Baidu's self-developed flagship ultra-large-scale language model. Compared to ERNIE 3.5, it has a comprehensive upgrade in model capabilities, widely applicable to complex task scenarios in various fields. Supports automatic integration with the Baidu search plugin to ensure the timeliness of Q&A information.

ERNIE-4.0-Turbo-128K

128k

4k

Supported

Conversation

Baidu_ernie

ERNIE 4.0 Turbo is Baidu's self-developed flagship ultra-large-scale language model with outstanding overall performance, widely applicable to complex task scenarios in various fields. Supports automatic integration with the Baidu search plugin to ensure the timeliness of Q&A information. It has better performance compared to ERNIE 4.0. ERNIE-4.0-Turbo-128K is the version with better overall performance on long documents than ERNIE-3.5-128K.

ERNIE-4.0-Turbo-8K

8k

2k

Supported

Conversation

Baidu_ernie

ERNIE 4.0 Turbo is Baidu's self-developed flagship ultra-large-scale language model with outstanding overall performance, widely applicable to complex task scenarios in various fields. Supports automatic integration with the Baidu search plugin to ensure the timeliness of Q&A information. It has better performance compared to ERNIE 4.0. ERNIE-4.0-Turbo-8K is one version of this model.

ERNIE-4.0-Turbo-8K-Latest

8k

2k

Supported

Conversation

Baidu_ernie

ERNIE 4.0 Turbo is Baidu's self-developed flagship ultra-large-scale language model with outstanding overall performance, widely applicable to complex task scenarios in various fields. Supports automatic integration with the Baidu search plugin to ensure the timeliness of Q&A information. It has better performance compared to ERNIE 4.0. ERNIE-4.0-Turbo-8K is a version of the model.

ERNIE-4.0-Turbo-8K-Preview

8k

2k

Supported

Conversation

Baidu_ernie

ERNIE 4.0 Turbo is Baidu's self-developed flagship ultra-large-scale language model with outstanding overall performance, widely applicable to complex task scenarios in various fields. Supports automatic integration with the Baidu search plugin to ensure the timeliness of Q&A information. ERNIE-4.0-Turbo-8K-Preview is a version of the model.

ERNIE-Character-8K

8k

1k

Not Supported

Conversation

Baidu_ernie

Baidu's self-developed vertical large language model, suitable for application scenarios such as game NPCs, customer service dialogues, and dialogue role-playing. It has a more distinct and consistent persona style, stronger instruction-following ability, and better inference performance.

ERNIE-Lite-8K

8k

4k

Not Supported

Conversation

Baidu_ernie

Baidu's self-developed lightweight large language model, balancing excellent model performance with inference efficiency, suitable for inference on low-power AI accelerator cards.

ERNIE-Lite-Pro-128K

128k

2k

Supported

Conversation

Baidu_ernie

Baidu's self-developed lightweight large language model, with better performance than ERNIE Lite, balancing excellent model performance with inference efficiency, suitable for inference on low-power AI accelerator cards. ERNIE-Lite-Pro-128K supports a 128K context length and has better performance than ERNIE-Lite-128K.

ERNIE-Novel-8K

8k

2k

Not Supported

Conversation

Baidu_ernie

ERNIE-Novel-8K is Baidu's self-developed general-purpose large language model, with a significant advantage in novel continuation capabilities. It can also be used in scenarios like short dramas and movies.

ERNIE-Speed-128K

128k

4k

Not Supported

Conversation

Baidu_ernie

Baidu's latest self-developed high-performance large language model released in 2024, with excellent general capabilities. It is suitable as a base model for fine-tuning to better handle specific scenario problems, while also having excellent inference performance.

ERNIE-Speed-8K

8k

1k

Not Supported

Conversation

Baidu_ernie

Baidu's latest self-developed high-performance large language model released in 2024, with excellent general capabilities. It is suitable as a base model for fine-tuning to better handle specific scenario problems, while also having excellent inference performance.

ERNIE-Speed-Pro-128K

128k

4k

Not Supported

Conversation

Baidu_ernie

ERNIE Speed Pro is Baidu's latest self-developed high-performance large language model released in 2024, with excellent general capabilities. It is suitable as a base model for fine-tuning to better handle specific scenario problems, while also having excellent inference performance. ERNIE-Speed-Pro-128K is the initial version released on August 30, 2024, supporting a 128K context length and having better performance than ERNIE-Speed-128K.

ERNIE-Tiny-8K

8k

1k

Not Supported

Conversation

Baidu_ernie

Baidu's self-developed ultra-high-performance large language model, with the lowest deployment and fine-tuning costs in the ERNIE series.

Doubao-1.5-lite-32k

32k

12k

Supported

Conversation

Doubao_doubao

Doubao1.5-lite is also among the world's top-tier lightweight language models, matching or surpassing GPT-4o mini and Claude 3.5 Haiku on authoritative evaluation benchmarks for general knowledge (MMLU_pro), reasoning (BBH), math (MATH), and professional knowledge (GPQA).

Doubao-1.5-pro-256k

256k

12k

Supported

Conversation

Doubao_doubao

Doubao-1.5-Pro-256k, a fully upgraded version based on Doubao-1.5-Pro. Compared to Doubao-pro-256k/241115, the overall performance is significantly improved by 10%. The output length is greatly increased, supporting up to 12k tokens.

Doubao-1.5-pro-32k

32k

12k

Supported

Conversation

Doubao_doubao

Doubao-1.5-pro, a new generation flagship model with comprehensive performance upgrades, excelling in knowledge, code, reasoning, and more. It achieves world-leading performance on multiple public evaluation benchmarks, with the best scores on authoritative knowledge, code, reasoning, and Chinese benchmarks, and a composite score superior to top industry models such as GPT-4o and Claude 3.5 Sonnet.

Doubao-1.5-vision-pro

32k

12k

Not Supported

Conversation, Vision

Doubao_doubao

Doubao-1.5-vision-pro, a newly upgraded multimodal large model, supports image recognition of any resolution and extreme aspect ratios, enhancing visual reasoning, document recognition, detailed information understanding, and instruction-following capabilities.

Doubao-embedding

4k

-

Supported

Embedding

Doubao_doubao

Doubao-embedding is a semantic vectorization model developed by ByteDance, primarily for vector retrieval scenarios. It supports Chinese and English, with a maximum context length of 4K. The following versions are currently available: text-240715: Maximum vector dimension of 2560, supports dimensionality reduction to 512, 1024, and 2048. Chinese and English retrieval performance is significantly improved compared to the text-240515 version, and this version is recommended. text-240515: Maximum vector dimension of 2048, supports dimensionality reduction to 512 and 1024.

Doubao-embedding-large

4k

-

Not Supported

Embedding

Doubao_doubao

Chinese and English retrieval performance is significantly improved compared to the Doubao-embedding/text-240715 version.

Doubao-embedding-vision

8k

-

Not Supported

Embedding

Doubao_doubao

Doubao-embedding-vision, a newly upgraded image-text multimodal vectorization model, is primarily for image-text multi-vector retrieval scenarios. It supports image input and Chinese/English text input, with a maximum context length of 8K.

Doubao-lite-128k

128k

4k

Supported

Conversation

Doubao_doubao

Doubao-lite offers extremely fast response speeds and better cost-effectiveness, providing more flexible choices for customers in different scenarios. Supports inference and fine-tuning with a 128k context window.

Doubao-lite-32k

32k

4k

Supported

Conversation

Doubao_doubao

Doubao-lite offers extremely fast response speeds and better cost-effectiveness, providing more flexible choices for customers in different scenarios. Supports inference and fine-tuning with a 32k context window.

Doubao-lite-4k

4k

4k

Supported

Conversation

Doubao_doubao

Doubao-lite offers extremely fast response speeds and better cost-effectiveness, providing more flexible choices for customers in different scenarios. Supports inference and fine-tuning with a 4k context window.

Doubao-pro-128k

128k

4k

Supported

Conversation

Doubao_doubao

The flagship model with the best performance, suitable for handling complex tasks, with excellent results in reference Q&A, summarization, creation, text classification, role-playing, and other scenarios. Supports inference and fine-tuning with a 128k context window.

Doubao-pro-32k

32k

4k

Supported

Conversation

Doubao_doubao

The flagship model with the best performance, suitable for handling complex tasks, with excellent results in reference Q&A, summarization, creation, text classification, role-playing, and other scenarios. Supports inference and fine-tuning with a 32k context window.

Doubao-pro-4k

4k

4k

Supported

Conversation

Doubao_doubao

The flagship model with the best performance, suitable for handling complex tasks, with excellent results in reference Q&A, summarization, creation, text classification, role-playing, and other scenarios. Supports inference and fine-tuning with a 4k context window.

step-1-128k

128k

-

Supported

Conversation

StepFun

The step-1-128k model is an ultra-large-scale language model capable of processing inputs of up to 128,000 tokens. This capability gives it a significant advantage in generating long-form content and performing complex reasoning, making it suitable for applications that require rich context, such as writing novels and scripts.

step-1-256k

256k

-

Supported

Conversation

StepFun

The step-1-256k model is one of the largest language models available, supporting inputs of 256,000 tokens. It is designed to meet extremely complex task requirements, such as large-scale data analysis and multi-turn dialogue systems, and can provide high-quality output in various domains.

step-1-32k

32k

-

Supported

Conversation

StepFun

The step-1-32k model extends the context window to support 32,000 tokens of input. This makes it perform excellently when handling long articles and complex conversations, suitable for tasks that require deep understanding and analysis, such as legal documents and academic research.

step-1-8k

8k

-

Supported

Conversation

StepFun

The step-1-8k model is an efficient language model designed for processing shorter texts. It can perform reasoning within a context of 8,000 tokens, making it suitable for application scenarios that require quick responses, such as chatbots and real-time translation.

step-1-flash

8k

-

Supported

Conversation

StepFun

The step-1-flash model focuses on rapid response and efficient processing, suitable for real-time applications. Its design allows it to provide high-quality language understanding and generation capabilities even with limited computing resources, making it suitable for mobile devices and edge computing scenarios.

step-1.5v-mini

32k

-

Supported

Conversation, Vision

StepFun

The step-1.5v-mini model is a lightweight version designed to run in resource-constrained environments. Despite its small size, it still retains good language processing capabilities, making it suitable for embedded systems and low-power devices.

step-1v-32k

32k

-

Supported

Conversation, Vision

StepFun

The step-1v-32k model supports inputs of 32,000 tokens, suitable for applications requiring longer context. It performs excellently in handling complex dialogues and long texts, making it suitable for fields such as customer service and content creation.

step-1v-8k

8k

-

Supported

Conversation, Vision

StepFun

The step-1v-8k model is an optimized version designed for 8,000-token inputs, suitable for fast generation and processing of short texts. It strikes a good balance between speed and accuracy, making it suitable for real-time applications.

step-2-16k

16k

-

Supported

Conversation

StepFun

The step-2-16k model is a medium-sized language model supporting 16,000 tokens of input. It performs well in various tasks and is suitable for application scenarios such as education, training, and knowledge management.

yi-lightning

16k

-

Supported

Conversation

01.AI_yi

The latest high-performance model, ensuring high-quality output while significantly increasing inference speed. Suitable for real-time interaction and highly complex reasoning scenarios, its extremely high cost-effectiveness can provide excellent support for commercial products.

yi-vision-v2

16K

-

Supported

Conversation, Vision

01.AI_yi

Suitable for scenarios that require analyzing and interpreting images and charts, such as image Q&A, chart understanding, OCR, visual reasoning, education, research report understanding, or multilingual document reading.

qwen-14b-chat

8k

2k

Supported

Conversation

Qwen_qwen

Alibaba Cloud's official open-source version of Tongyi Qianwen.

qwen-72b-chat

32k

2k

Supported

Conversation

Qwen_qwen

Alibaba Cloud's official open-source version of Tongyi Qianwen.

qwen-7b-chat

7.5k

1.5k

Supported

Conversation

Qwen_qwen

Alibaba Cloud's official open-source version of Tongyi Qianwen.

qwen-coder-plus

128k

8k

Supported

Conversation, Code

Qwen_qwen

Qwen-Coder-Plus is a programming-specific model in the Qwen series, designed to enhance code generation and understanding capabilities. Trained on a large scale of programming data, this model can handle multiple programming languages and supports functions like code completion, error detection, and code refactoring. Its design goal is to provide developers with more efficient programming assistance and improve development efficiency.

qwen-coder-plus-latest

128k

8k

Supported

Conversation, Code

Qwen_qwen

Qwen-Coder-Plus-Latest is the newest version of Qwen-Coder-Plus, incorporating the latest algorithm optimizations and dataset updates. This model shows significant performance improvements, enabling it to understand context more accurately and generate code that better meets developers' needs. It also introduces support for more programming languages, enhancing its multilingual programming capabilities.

qwen-coder-turbo

128k

8k

Supported

Conversation, Code

Qwen_qwen

The Tongyi Qianwen series of code and programming models are language models specifically for programming and code generation, featuring fast inference speed and low cost. This version always points to the latest stable snapshot.

qwen-coder-turbo-latest

128k

8k

Supported

Conversation, Code

Qwen_qwen

The Tongyi Qianwen series of code and programming models are language models specifically for programming and code generation, featuring fast inference speed and low cost. This version always points to the latest snapshot.

qwen-long

10m

6k

Supported

Conversation

Qwen_qwen

Qwen-Long is a large language model from Tongyi Qianwen for ultra-long context processing scenarios. It supports input in different languages such as Chinese and English, and supports ultra-long context dialogues of up to 10 million tokens (about 15 million words or 15,000 pages of documents). Combined with the synchronously launched document service, it can parse and have dialogues on various document formats such as Word, PDF, Markdown, EPUB, and MOBI. Note: For requests submitted directly via HTTP, it supports a length of 1M tokens. For lengths exceeding this, it is recommended to submit via file.

qwen-math-plus

4k

3k

Supported

Conversation

Qwen_qwen

Qwen-Math-Plus is a model focused on solving mathematical problems, designed to provide efficient mathematical reasoning and calculation capabilities. Trained on a large number of math problems, this model can handle complex mathematical expressions and problems, supporting a variety of calculation needs from basic arithmetic to higher mathematics. Its application scenarios include education, scientific research, and engineering.

qwen-math-plus-latest

4k

3k

Supported

Conversation

Qwen_qwen

Qwen-Math-Plus-Latest is the newest version of Qwen-Math-Plus, integrating the latest mathematical reasoning techniques and algorithm improvements. This model performs better in handling complex mathematical problems, providing more accurate solutions and reasoning processes. It also expands its understanding of mathematical symbols and formulas, making it suitable for a wider range of mathematical applications.

qwen-math-turbo

4k

3k

Supported

Conversation

Qwen_qwen

Qwen-Math-Turbo is a high-performance mathematical model designed for fast calculation and real-time inference. This model optimizes calculation speed, enabling it to process a large number of mathematical problems in a very short time, suitable for application scenarios that require quick feedback, such as online education and real-time data analysis. Its efficient algorithms allow users to get instant results in complex calculations.

qwen-math-turbo-latest

4k

3k

Supported

Conversation

Qwen_qwen

Qwen-Math-Turbo-Latest is the newest version of Qwen-Math-Turbo, further improving calculation efficiency and accuracy. This model has undergone multiple algorithmic optimizations, enabling it to handle more complex mathematical problems and maintain high efficiency in real-time inference. It is suitable for mathematical applications that require rapid response, such as financial analysis and scientific computing.

qwen-max

32k

8k

Supported

Conversation

Qwen_qwen

The Tongyi Qianwen 2.5 series hundred-billion-level ultra-large-scale language model supports input in different languages such as Chinese and English. As the model is upgraded, qwen-max will be updated on a rolling basis.

qwen-max-latest

32k

8k

Supported

Conversation

Qwen_qwen

The best-performing model in the Tongyi Qianwen series. This model is a dynamically updated version, and model updates will not be announced in advance. It is suitable for complex, multi-step tasks. The model's comprehensive abilities in Chinese and English are significantly improved, human preference is significantly enhanced, reasoning ability and complex instruction understanding are significantly strengthened, performance on difficult tasks is better, and math and code abilities are significantly improved. It also has enhanced understanding and generation capabilities for structured data like tables and JSON.

qwen-plus

128k

8k

Supported

Conversation

Qwen_qwen

A well-balanced model in the Tongyi Qianwen series, with inference performance and speed between Tongyi Qianwen-Max and Tongyi Qianwen-Turbo, suitable for moderately complex tasks. The model's comprehensive abilities in Chinese and English are significantly improved, human preference is significantly enhanced, reasoning ability and complex instruction understanding are significantly strengthened, performance on difficult tasks is better, and math and code abilities are significantly improved.

qwen-plus-latest

128k

8k

Supported

Conversation

Qwen_qwen

The latest version of Qwen-Plus, the well-balanced model in the Tongyi Qianwen series, with inference performance and speed between Tongyi Qianwen-Max and Tongyi Qianwen-Turbo, suitable for moderately complex tasks.

qwen-turbo

128k

8k

Supported

Conversation

Qwen_qwen

The fastest and most cost-effective model in the Tongyi Qianwen series, suitable for simple tasks. The model's comprehensive abilities in Chinese and English are significantly improved, human preference is significantly enhanced, reasoning ability and complex instruction understanding are significantly strengthened, performance on difficult tasks is better, and math and code abilities are significantly improved.

qwen-turbo-latest

1m

8k

Supported

Conversation

Qwen_qwen

The latest version of Qwen-Turbo, the fastest and most cost-effective model in the Tongyi Qianwen series, suitable for simple tasks and applications with strict response-time requirements, such as real-time Q&A systems.

qwen-vl-max

32k

2k

Supported

Conversation

Qwen_qwen

Tongyi Qianwen VL-Max (qwen-vl-max), the ultra-large-scale visual language model from Tongyi Qianwen. Compared to the enhanced version, it further improves visual reasoning and instruction-following capabilities, providing a higher level of visual perception and cognition. It offers the best performance on more complex tasks.

qwen-vl-max-latest

32k

2k

Supported

Conversation, Vision

Qwen_qwen

Qwen-VL-Max is the most advanced version in the Qwen-VL series, designed to solve complex multimodal tasks. It combines advanced visual and language processing technologies, capable of understanding and analyzing high-resolution images with extremely strong reasoning abilities, suitable for applications requiring deep understanding and complex reasoning.

qwen-vl-ocr

34k

4k

Supported

Conversation, Vision

Qwen_qwen

Only supports OCR, not conversation.

qwen-vl-ocr-latest

34k

4k

Supported

Conversation, Vision

Qwen_qwen

Only supports OCR, not conversation.

qwen-vl-plus

8k

2k

Supported

Conversation, Vision

Qwen_qwen

Tongyi Qianwen VL-Plus (qwen-vl-plus), the enhanced version of the Tongyi Qianwen large-scale visual language model. It significantly improves detail recognition and text recognition capabilities, supports images with resolutions over one million pixels and any aspect ratio. It provides excellent performance on a wide range of visual tasks.

qwen-vl-plus-latest

32k

2k

Supported

Conversation, Vision

Qwen_qwen

Qwen-VL-Plus-Latest is the newest version of Qwen-VL-Plus, enhancing the model's multimodal understanding capabilities. It excels in the combined processing of images and text, making it suitable for applications that need to efficiently handle multiple input formats, such as intelligent customer service and content generation.

Qwen/Qwen2-1.5B-Instruct

32k

6k

Not Supported

Conversation

Qwen_qwen

Qwen2-1.5B-Instruct is an instruction-tuned large language model in the Qwen2 series with a parameter size of 1.5B. Based on the Transformer architecture, the model uses SwiGLU activation functions, attention QKV biases, and group query attention. It excels in multiple benchmark tests for language understanding, generation, multilingual capabilities, coding, math, and reasoning, surpassing most open-source models.

Qwen/Qwen2-72B-Instruct

128k

6k

Not Supported

Conversation

Qwen_qwen

Qwen2-72B-Instruct is an instruction-tuned large language model in the Qwen2 series with a parameter size of 72B. Based on the Transformer architecture, the model uses SwiGLU activation functions, attention QKV biases, and group query attention. It can handle large-scale inputs. The model excels in multiple benchmark tests for language understanding, generation, multilingual capabilities, coding, math, and reasoning, surpassing most open-source models.

Qwen/Qwen2-7B-Instruct

128k

6k

Not Supported

Conversation

Qwen_qwen

Qwen2-7B-Instruct is an instruction-tuned large language model in the Qwen2 series with a parameter size of 7B. Based on the Transformer architecture, the model uses SwiGLU activation functions, attention QKV biases, and group query attention. It can handle large-scale inputs. The model excels in multiple benchmark tests for language understanding, generation, multilingual capabilities, coding, math, and reasoning, surpassing most open-source models.

Qwen/Qwen2-VL-72B-Instruct

32k

2k

Not Supported

Conversation

Qwen_qwen

Qwen2-VL is the latest iteration of the Qwen-VL model, achieving state-of-the-art performance in visual understanding benchmarks, including MathVista, DocVQA, RealWorldQA, and MTVQA. Qwen2-VL can understand videos over 20 minutes long for high-quality video-based Q&A, dialogue, and content creation. It also has complex reasoning and decision-making capabilities, and can be integrated with mobile devices, robots, etc., for automated operations based on visual environments and text instructions.

Qwen/Qwen2-VL-7B-Instruct

32k

-

Not Supported

Conversation

Qwen_qwen

Qwen2-VL-7B-Instruct is the latest iteration of the Qwen-VL model, achieving state-of-the-art performance in visual understanding benchmarks, including MathVista, DocVQA, RealWorldQA, and MTVQA. Qwen2-VL can be used for high-quality video-based Q&A, dialogue, and content creation, and also has complex reasoning and decision-making capabilities, and can be integrated with mobile devices, robots, etc., for automated operations based on visual environments and text instructions.

Qwen/Qwen2.5-72B-Instruct

128k

8k

Not Supported

Conversation

Qwen_qwen

Qwen2.5-72B-Instruct is one of the latest large language model series released by Alibaba Cloud. This 72B model has significantly improved capabilities in areas such as coding and mathematics. It supports inputs of up to 128K tokens and can generate long texts of over 8K tokens.

Qwen/Qwen2.5-72B-Instruct-128K

128k

8k

Not Supported

Conversation

Qwen_qwen

Qwen2.5-72B-Instruct is one of the latest large language model series released by Alibaba Cloud. This 72B model has significantly improved capabilities in areas such as coding and mathematics. It supports inputs of up to 128K tokens and can generate long texts of over 8K tokens.

Qwen/Qwen2.5-7B-Instruct

128k

8k

Not Supported

Conversation

Qwen_qwen

Qwen2.5-7B-Instruct is one of the latest large language model series released by Alibaba Cloud. This 7B model has significantly improved capabilities in areas such as coding and mathematics. The model also provides multilingual support, covering over 29 languages, including Chinese and English. The model has significant improvements in instruction following, understanding structured data, and generating structured output (especially JSON).

Qwen/Qwen2.5-Coder-32B-Instruct

128k

8k

Not Supported

Conversation, Code

Qwen_qwen

Qwen2.5-Coder-32B-Instruct is one of the latest large language model series released by Alibaba Cloud. This 32B model has significantly improved capabilities in areas such as coding and mathematics. The model also provides multilingual support, covering over 29 languages, including Chinese and English. The model has significant improvements in instruction following, understanding structured data, and generating structured output (especially JSON).

Qwen/Qwen2.5-Coder-7B-Instruct

128k

8k

Not Supported

Conversation

Qwen_qwen

Qwen2.5-Coder-7B-Instruct is one of the latest large language model series released by Alibaba Cloud. This 7B model has significantly improved capabilities in areas such as coding and mathematics. The model also provides multilingual support, covering over 29 languages, including Chinese and English. The model has significant improvements in instruction following, understanding structured data, and generating structured output (especially JSON).

Qwen/QwQ-32B-Preview

32k

16k

Not Supported

Conversation, Reasoning

Qwen_qwen

QwQ-32B-Preview is an experimental research model developed by the Qwen team, aimed at enhancing the reasoning capabilities of artificial intelligence. As a preview version, it demonstrates excellent analytical abilities, but also has some important limitations: 1. Language mixing and code-switching: The model may mix languages or switch between languages unexpectedly, affecting the clarity of the response. 2. Recursive reasoning loops: The model may enter a cyclic reasoning mode, leading to lengthy answers without a clear conclusion. 3. Safety and ethical considerations: The model requires strengthened safety measures to ensure reliable and safe performance, and users should exercise caution when using it. 4. Performance and benchmark limitations: The model performs excellently in mathematics and programming, but there is still room for improvement in other areas such as common sense reasoning and nuanced language understanding.

qwen1.5-110b-chat

32k

8k

Not Supported

Conversation

Qwen_qwen

-

qwen1.5-14b-chat

8k

2k

Not Supported

Conversation

Qwen_qwen

-

qwen1.5-32b-chat

32k

2k

Not Supported

Conversation

Qwen_qwen

-

qwen1.5-72b-chat

32k

2k

Not Supported

Conversation

Qwen_qwen

-

qwen1.5-7b-chat

8k

2k

Not Supported

Conversation

Qwen_qwen

-

qwen2-57b-a14b-instruct

65k

6k

Not Supported

Conversation

Qwen_qwen

-

Qwen2-72B-Instruct

-

-

Not Supported

Conversation

Qwen_qwen

-

qwen2-7b-instruct

128k

6k

Not Supported

Conversation

Qwen_qwen

-

qwen2-math-72b-instruct

4k

3k

Not Supported

Conversation

Qwen_qwen

-

qwen2-math-7b-instruct

4k

3k

Not Supported

Conversation

Qwen_qwen

-

qwen2.5-14b-instruct

128k

8k

Not Supported

Conversation

Qwen_qwen

-

qwen2.5-32b-instruct

128k

8k

Not Supported

Conversation

Qwen_qwen

-

qwen2.5-72b-instruct

128k

8k

Not Supported

Conversation

Qwen_qwen

-

qwen2.5-7b-instruct

128k

8k

Not Supported

Conversation

Qwen_qwen

-

qwen2.5-coder-14b-instruct

128k

8k

Not Supported

Conversation, Code

Qwen_qwen

-

qwen2.5-coder-32b-instruct

128k

8k

Not Supported

Conversation, Code

Qwen_qwen

-

qwen2.5-coder-7b-instruct

128k

8k

Not Supported

Conversation, Code

Qwen_qwen

-

qwen2.5-math-72b-instruct

4k

3k

Not Supported

Conversation

Qwen_qwen

-

qwen2.5-math-7b-instruct

4k

3k

Not Supported

Conversation

Qwen_qwen

-

deepseek-ai/DeepSeek-R1

64k

-

Not Supported

Conversation, Reasoning

DeepSeek_deepseek

The DeepSeek-R1 model is an open-source reasoning model based purely on reinforcement learning. It excels in tasks such as mathematics, code, and natural language reasoning, with performance comparable to OpenAI's o1 model and achieving excellent results in several benchmark tests.

deepseek-ai/DeepSeek-V2-Chat

128k

-

Not Supported

Conversation

DeepSeek_deepseek

DeepSeek-V2 is a powerful, cost-effective Mixture-of-Experts (MoE) language model. It was pre-trained on a high-quality corpus of 8.1 trillion tokens and further enhanced with Supervised Fine-Tuning (SFT) and Reinforcement Learning (RL). Compared to DeepSeek 67B, DeepSeek-V2 achieves stronger performance while saving 42.5% in training costs, reducing KV cache by 93.3%, and increasing maximum generation throughput by 5.76 times.

deepseek-ai/DeepSeek-V2.5

32k

-

Supported

Conversation

DeepSeek_deepseek

DeepSeek-V2.5 is an upgraded version of DeepSeek-V2-Chat and DeepSeek-Coder-V2-Instruct, integrating the general and coding capabilities of the two previous versions. This model has been optimized in several aspects, including writing and instruction-following abilities, to better align with human preferences.

deepseek-ai/DeepSeek-V3

128k

4k

Not Supported

Conversation

DeepSeek_deepseek

The open-source version of DeepSeek-V3. Compared to the official version, it offers a longer context and does not suffer from sensitive-word refusals.

deepseek-chat

64k

8k

Supported

Conversation

DeepSeek_deepseek

236B parameters, 64K context (API), top-ranked on the open-source leaderboard for Chinese comprehensive ability (AlignBench), and in the same tier as closed-source models like GPT-4-Turbo and ERNIE 4.0 in evaluations.

deepseek-coder

64k

8k

Supported

Conversation, Code

DeepSeek_deepseek

236B parameters, 64K context (API), top-ranked on the open-source leaderboard for Chinese comprehensive ability (AlignBench), and in the same tier as closed-source models like GPT-4-Turbo and ERNIE 4.0 in evaluations.

deepseek-reasoner

64k

8k

Supported

Conversation, Reasoning

DeepSeek_deepseek

DeepSeek-Reasoner (DeepSeek-R1) is the latest reasoning model from DeepSeek, designed to enhance reasoning capabilities through reinforcement learning training. The model's reasoning process involves a large amount of reflection and validation, enabling it to handle complex logical reasoning tasks, with a chain-of-thought length that can reach tens of thousands of words. DeepSeek-R1 excels in solving mathematical, coding, and other complex problems and has been widely applied in various scenarios, demonstrating its powerful reasoning ability and flexibility. Compared to other models, DeepSeek-R1's reasoning performance is close to that of top-tier closed-source models, showcasing the potential and competitiveness of open-source models in the field of reasoning.

hunyuan-code

4k

4k

Not Supported

Conversation, Code

Tencent_hunyuan

Hunyuan's latest code generation model. The base model was augmented with 200B high-quality code data and trained with high-quality SFT data for half a year. The context window length has been increased to 8K. It ranks at the top in automatic evaluation metrics for code generation in five major languages. In high-quality manual evaluations of 10 comprehensive code tasks across five major languages, its performance is in the top tier.

hunyuan-functioncall

28k

4k

Supported

Conversation

Tencent_hunyuan

Hunyuan's latest MOE architecture FunctionCall model, trained with high-quality FunctionCall data, with a context window of up to 32K, leading in evaluation metrics across multiple dimensions.

hunyuan-large

28k

4k

Not Supported

Conversation

Tencent_hunyuan

The Hunyuan-large model has a total of about 389B parameters, with about 52B activated parameters, making it the open-source MoE model with the largest parameter scale and best performance in the industry.

hunyuan-large-longcontext

128k

6k

Not Supported

Conversation

Tencent_hunyuan

Excels at handling long-text tasks such as document summarization and document Q&A, while also being capable of handling general text generation tasks. It performs excellently in the analysis and generation of long texts, effectively handling complex and detailed long-form content processing needs.

hunyuan-lite

250k

6k

Not Supported

Conversation

Tencent_hunyuan

Upgraded to an MOE structure with a 256k context window, leading many open-source models in NLP, code, math, and industry-specific evaluation sets.

hunyuan-pro

28k

4k

Supported

Conversation

Tencent_hunyuan

A trillion-parameter scale MOE-32K long-text model. It achieves an absolute leading level on various benchmarks, with complex instruction and reasoning capabilities, complex mathematical abilities, and supports functioncall. It is specially optimized for applications in multilingual translation, finance, law, and medicine.

hunyuan-role

28k

4k

Not Supported

Conversation

Tencent_hunyuan

Hunyuan's latest role-playing model. This is a role-playing model officially fine-tuned and launched by Hunyuan, based on the Hunyuan model and augmented with role-playing scenario datasets, providing better foundational performance in role-playing scenarios.

hunyuan-standard

30k

2k

Not Supported

Conversation

Tencent_hunyuan

Adopts a better routing strategy, while also alleviating the problems of load balancing and expert convergence. MOE-32K has a relatively higher cost-performance ratio and can handle long text inputs while balancing performance and price.

hunyuan-standard-256K

250k

6k

Not Supported

Conversation

Tencent_hunyuan

Adopts a better routing strategy, while also alleviating the problems of load balancing and expert convergence. For long texts, the "needle in a haystack" metric reaches 99.9%. MOE-256K further breaks through in length and performance, greatly expanding the input length.

hunyuan-translation-lite

4k

4k

Not Supported

Conversation

Tencent_hunyuan

The Hunyuan translation model supports natural language conversational translation; it supports mutual translation between Chinese and 15 languages including English, Japanese, French, Portuguese, Spanish, Turkish, Russian, Arabic, Korean, Italian, German, Vietnamese, Malay, and Indonesian.

hunyuan-turbo

28k

4k

Supported

Conversation

Tencent_hunyuan

The default version of the Hunyuan-turbo model, which uses a new Mixture-of-Experts (MoE) structure, delivering faster inference and stronger performance than hunyuan-pro.

hunyuan-turbo-latest

28k

4k

Supported

Conversation

Tencent_hunyuan

The dynamically updated version of the Hunyuan-turbo model. It is the best-performing version in the Hunyuan model series, consistent with the consumer-facing version used in Tencent Yuanbao.

hunyuan-turbo-vision

8k

2k

Supported

Vision, Conversation

Tencent_hunyuan

Hunyuan's new generation flagship visual language model, using a new Mixture-of-Experts (MoE) structure. Its capabilities in basic recognition, content creation, knowledge Q&A, and analysis/reasoning related to image-text understanding are comprehensively improved compared to the previous generation model. Max input 6k, max output 2k.

hunyuan-vision

8k

2k

Supported

Conversation, Vision

Tencent_hunyuan

Hunyuan's latest multimodal model, supporting image + text input to generate text content. Basic Image Recognition: Recognizes subjects, elements, scenes, etc., in images. Image Content Creation: Summarizes images, creates advertising copy, social media posts, poems, etc. Multi-turn Image Dialogue: Engages in multi-turn interactive Q&A about a single image. Image Analysis and Reasoning: Performs statistical analysis on logical relationships, math problems, code, and charts in images. Image Knowledge Q&A: Answers questions about knowledge points contained in images, such as historical events, movie posters. Image OCR: Recognizes text in images from natural life scenes and non-natural scenes.

SparkDesk-Lite

4k

-

Not Supported

Conversation

Spark_SparkDesk

Supports online web search function, with fast and convenient responses, suitable for low-power inference and model fine-tuning and other customized scenarios.

SparkDesk-Max

128k

-

Supported

Conversation

Spark_SparkDesk

Quantized from the latest Spark Large Model Engine 4.0 Turbo. It supports multiple built-in plugins such as web search, weather, and date. Core capabilities are fully upgraded, with universal improvements in application effects across various scenarios. Supports System role persona and FunctionCall.

SparkDesk-Max-32k

32k

-

Supported

Conversation

Spark_SparkDesk

Stronger reasoning: Enhanced context understanding and logical reasoning abilities. Longer input: Supports 32K tokens of text input, suitable for long document reading, private knowledge Q&A, and other scenarios.

SparkDesk-Pro

128k

-

Not Supported

Conversation

Spark_SparkDesk

Specially optimized for scenarios such as math, code, medicine, and education. Supports multiple built-in plugins like web search, weather, and date, covering most knowledge Q&A, language understanding, and text creation scenarios.

SparkDesk-Pro-128K

128k

-

Not Supported

Conversation

Spark_SparkDesk

Professional-grade large language model with tens of billions of parameters. It has been specially optimized for scenarios in medicine, education, and code, with lower latency in search scenarios. Suitable for business scenarios that have higher requirements for performance and response speed, such as text and intelligent Q&A.

moonshot-v1-128k

128k

4k

Supported

Conversation

Moonshot AI_moonshot

A model with a 128k context length, suitable for generating ultra-long text.

moonshot-v1-32k

32k

4k

Supported

Conversation

Moonshot AI_moonshot

A model with a 32k context length, suitable for generating long text.

moonshot-v1-8k

8k

4k

Supported

Conversation

Moonshot AI_moonshot

A model with an 8k context length, suitable for generating short text.

codegeex-4

128k

4k

Not Supported

Conversation, Code

Zhipu_codegeex

Zhipu's code model: suitable for automatic code completion tasks.

charglm-3

4k

2k

Not Supported

Conversation

Zhipu_glm

Persona model.

emohaa

8k

4k

Not Supported

Conversation

Zhipu_glm

Psychology model: possesses professional counseling abilities to help users understand emotions and cope with emotional problems.

glm-3-turbo

128k

4k

Not Supported

Conversation

Zhipu_glm

To be deprecated (June 30, 2025).

glm-4

128k

4k

Supported

Conversation

Zhipu_glm

Old flagship: released on January 16, 2024, now replaced by GLM-4-0520.

glm-4-0520

128k

4k

Supported

Conversation

Zhipu_glm

High-intelligence model: suitable for handling highly complex and diverse tasks.

glm-4-air

128k

4k

Supported

Conversation

Zhipu_glm

High cost-performance: the most balanced model between inference capability and price.

glm-4-airx

8k

4k

Supported

Conversation

Zhipu_glm

Extremely fast inference: has ultra-fast inference speed and powerful inference effects.

glm-4-flash

128k

4k

Supported

Conversation

Zhipu_glm

High speed, low price: ultra-fast inference speed.

glm-4-flashx

128k

4k

Supported

Conversation

Zhipu_glm

High speed, low price: Enhanced version of Flash, ultra-fast inference speed.

glm-4-long

1m

4k

Supported

Conversation

Zhipu_glm

Ultra-long input: specially designed for handling ultra-long text and memory-intensive tasks.

glm-4-plus

128k

4k

Supported

Conversation

Zhipu_glm

High-intelligence flagship: comprehensive performance improvement, with significantly enhanced long-text and complex task capabilities.

glm-4v

2k

-

Not Supported

Conversation, Vision

Zhipu_glm

Image understanding: possesses image understanding and reasoning capabilities.

glm-4v-flash

2k

1k

Not Supported

Conversation, Vision

Zhipu_glm

Free model: possesses powerful image understanding capabilities.
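
The "Supported / Not Supported" column in the catalog above appears to indicate whether a model accepts function calling (tool calls). As a rough, hedged illustration of what such a request looks like against an OpenAI-compatible chat-completions endpoint, here is a minimal Python sketch; the base URL, API key, model name, and the `get_weather` tool are placeholders rather than values taken from this document.

```python
# Minimal sketch of a function-calling request against an OpenAI-compatible
# endpoint. The base URL, API key, model name, and tool are placeholders.
from openai import OpenAI

client = OpenAI(
    base_url="https://example-provider.com/v1",  # placeholder endpoint
    api_key="YOUR_API_KEY",                      # placeholder key
)

# Describe one callable tool. Models marked as supporting function calling
# may answer with a structured tool call instead of plain text.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

response = client.chat.completions.create(
    model="gpt-4o-mini",  # substitute any model listed as "Supported" above
    messages=[{"role": "user", "content": "What's the weather in Shanghai?"}],
    tools=tools,
)

message = response.choices[0].message
if message.tool_calls:
    # The model chose to call the tool; arguments arrive as a JSON string.
    call = message.tool_calls[0]
    print(call.function.name, call.function.arguments)
else:
    # The model answered directly without using the tool.
    print(message.content)
```

Whether a given provider actually honors the `tools` parameter depends on how complete its OpenAI compatibility is, so treat this as a sketch rather than a guarantee.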

Model Leaderboard

This document was translated from Chinese by AI and has not yet been reviewed.

LLM Arena Leaderboard (Live Updates)

This is a leaderboard based on data from Chatbot Arena (lmarena.ai), generated through an automated process.

Data Updated: 2025-06-12 11:42:10 UTC / 2025-06-12 19:42:10 CST (Beijing Time)

Click on the Model Name in the leaderboard to go to its detailed information or trial page.

Leaderboard

Rank (UB)
Rank (StyleCtrl)
Model Name
Score
Confidence Interval
Votes
Provider
License
Knowledge Cutoff

1

1

1478

+6/-7

7,343

Google

Proprietary

No data

2

2

1446

+6/-7

12,351

Google

Proprietary

No data

3

2

1425

+4/-5

15,210

OpenAI

Proprietary

No data

3

4

1423

+4/-4

19,762

OpenAI

Proprietary

No data

3

6

1420

+5/-5

12,614

Google

Proprietary

No data

3

8

1417

+4/-4

21,879

xAI

Proprietary

No data

5

4

1411

+4/-5

15,271

OpenAI

Proprietary

No data

8

8

1396

+5/-6

14,148

Google

Proprietary

No data

9

7

1384

+4/-5

13,830

OpenAI

Proprietary

No data

9

11

1382

+4/-4

16,550

DeepSeek

MIT

No data

11

5

1373

+4/-4

13,850

Anthropic

Proprietary

No data

11

16

1372

+6/-7

5,944

Tencent

Proprietary

No data

11

11

1371

+3/-4

19,430

DeepSeek

MIT

No data

11

16

1363

+6/-5

12,003

Mistral

Proprietary

No data

12

44

1361

+8/-6

6,636

xAI

Proprietary

No data

13

11

1363

+4/-3

29,038

OpenAI

Proprietary

No data

13

21

1362

+3/-3

34,240

Google

Proprietary

No data

13

10

1361

+6/-5

13,554

OpenAI

Proprietary

No data

14

28

1360

+4/-5

10,677

Alibaba

Apache 2.0

No data

14

19

1358

+3/-3

29,484

Alibaba

Proprietary

No data

15

21

1355

+4/-5

20,295

Google

Gemma

No data

21

16

1348

+3/-2

33,177

OpenAI

Proprietary

2023/10

21

9

1345

+5/-6

10,740

Anthropic

Proprietary

No data

23

20

1338

+4/-6

19,404

OpenAI

Proprietary

No data

23

16

1336

+5/-5

12,702

OpenAI

Proprietary

No data

23

28

1334

+7/-8

3,976

Google

Gemma

No data

23

38

1330

+11/-11

2,595

Amazon

Proprietary

No data

24

26

1332

+3/-4

22,841

DeepSeek

DeepSeek

No data

24

47

1330

+5/-5

15,930

Alibaba

Apache 2.0

No data

25

28

1324

+8/-7

6,055

Alibaba

Proprietary

No data

26

28

1326

+3/-4

26,104

Google

Proprietary

No data

26

32

1324

+6/-7

6,028

Zhipu

Proprietary

No data

27

28

1323

+4/-3

20,084

Cohere

CC-BY-NC-4.0

No data

27

27

1316

+10/-8

2,452

Tencent

Proprietary

No data

28

35

1318

+7/-6

5,126

StepFun

Proprietary

No data

30

28

1319

+3/-3

32,421

OpenAI

Proprietary

No data

30

35

1310

+11/-8

2,371

Nvidia

Nvidia

No data

30

28

1310

+11/-11

2,510

Tencent

Proprietary

No data

31

35

1317

+2/-2

54,951

OpenAI

Proprietary

2023/10

32

28

1316

+2/-2

58,645

Google

Proprietary

No data

32

18

1313

+4/-4

21,310

Anthropic

Proprietary

No data

38

52

1301

+8/-8

3,913

Google

Gemma

No data

39

17

1306

+3/-3

25,983

Anthropic

Proprietary

No data

40

38

1301

+2/-2

67,084

xAI

Proprietary

2024/3

40

43

1301

+4/-3

28,968

01 AI

Proprietary

No data

42

31

1298

+2/-2

117,747

OpenAI

Proprietary

2023/10

42

53

1296

+5/-6

10,715

Alibaba

Proprietary

No data

43

21

1297

+2/-2

73,327

Anthropic

Proprietary

2024/4

43

46

1293

+7/-5

7,243

DeepSeek

DeepSeek

No data

46

69

1289

+8/-10

4,321

Google

Gemma

No data

47

43

1285

+9/-7

3,856

Tencent

Proprietary

No data

48

55

1289

+3/-3

26,074

NexusFlow

NexusFlow

No data

48

51

1287

+4/-3

27,788

Zhipu AI

Proprietary

No data

48

37

1287

+4/-5

13,750

Meta

Llama 4

No data

48

44

1284

+8/-7

6,302

OpenAI

Proprietary

No data

49

52

1285

+3/-2

72,536

OpenAI

Proprietary

2023/10

49

62

1285

+3/-3

37,021

Google

Proprietary

No data

49

71

1282

+6/-6

7,577

Nvidia

Llama 3.1

2023/12

51

35

1282

+2/-3

43,788

Meta

Llama 3.1 Community

2023/12

52

33

1282

+2/-2

86,159

Anthropic

Proprietary

2024/4

52

51

1274

+10/-9

4,014

Tencent

Proprietary

No data

53

35

1281

+2/-2

63,038

Meta

Llama 3.1 Community

2023/12

53

35

1280

+3/-2

52,144

Google

Proprietary

Online

54

69

1280

+2/-3

55,442

xAI

Proprietary

2024/3

55

37

1279

+3/-3

47,973

OpenAI

Proprietary

2023/10

55

53

1277

+4/-4

17,432

Alibaba

Qwen

No data

65

49

1273

+2/-2

82,435

Google

Proprietary

2023/11

65

64

1272

+2/-5

26,344

DeepSeek

DeepSeek

No data

65

70

1271

+2/-3

41,519

Alibaba

Qwen

2024/9

65

51

1270

+3/-3

44,800

Meta

Llama-3.3

No data

65

69

1263

+10/-11

2,484

Mistral

Apache 2.0

No data

66

47

1270

+2/-2

102,133

OpenAI

Proprietary

2023/12

69

54

1265

+2/-3

48,217

Mistral

Mistral Research

2024/7

69

67

1264

+4/-3

20,580

NexusFlow

CC-BY-NC-4.0

2024/7

71

74

1258

+8/-8

3,010

Ai2

Llama 3.1

No data

72

52

1263

+1/-2

103,748

OpenAI

Proprietary

2023/4

72

70

1262

+2/-2

29,633

Mistral

MRL

No data

72

77

1261

+3/-2

58,637

Meta

Llama 3.1 Community

2023/12

73

49

1261

+1/-2

202,641

Anthropic

Proprietary

2023/8

74

78

1258

+3/-4

26,371

Amazon

Proprietary

No data

75

57

1258

+2/-2

97,079

OpenAI

Proprietary

2023/12

80

52

1251

+3/-3

44,893

Anthropic

Proprietary

No data

80

77

1249

+6/-5

7,948

Reka AI

Proprietary

No data

84

80

1240

+2/-2

65,661

Google

Proprietary

2023/11

84

78

1235

+6/-5

9,125

AI21 Labs

Jamba Open

2024/3

84

88

1231

+7/-6

5,730

Alibaba

Apache 2.0

No data

85

80

1233

+2/-2

79,538

Google

Gemma license

2024/6

85

88

1231

+5/-4

15,321

Mistral

Apache 2.0

No data

85

95

1230

+3/-4

20,646

Amazon

Proprietary

No data

85

82

1230

+6/-5

10,548

Princeton

MIT

2024/7

85

83

1229

+4/-6

10,535

Cohere

CC-BY-NC-4.0

2024/8

85

77

1225

+8/-7

3,889

Nvidia

Llama 3.1

2023/12

87

99

1226

+3/-2

37,697

Google

Proprietary

No data

87

96

1219

+9/-10

3,460

Allen AI

Apache-2.0

No data

89

94

1223

+3/-3

28,768

Cohere

CC-BY-NC-4.0

No data

89

86

1223

+4/-5

20,608

Nvidia

NVIDIA Open Model

2023/6

89

91

1220

+5/-6

10,221

Zhipu AI

Proprietary

No data

89

85

1219

+6/-6

8,132

Reka AI

Proprietary

No data

93

88

1220

+2/-2

163,629

Meta

Llama 3 Community

2023/12

93

101

1219

+3/-3

25,213

Microsoft

MIT

No data

97

86

1214

+2/-2

113,067

Anthropic

Proprietary

2023/8

98

109

1211

+4/-3

20,654

Amazon

Proprietary

No data

99

109

1202

+12/-11

2,901

Tencent

Proprietary

No data

103

98

1206

+3/-2

57,197

Google

Gemma license

2024/6

103

96

1203

+2/-2

80,846

Cohere

CC-BY-NC-4.0

2024/3

103

111

1199

+9/-9

3,074

Ai2

Llama 3.1

No data

104

98

1201

+3/-3

38,872

Alibaba

Qianwen LICENSE

2024/6

104

83

1200

+3/-3

55,962

OpenAI

Proprietary

2021/9

104

109

1196

+7/-7

5,111

Mistral

MRL

No data

105

111

1193

+5/-5

10,391

Cohere

CC-BY-NC-4.0

No data

105

100

1193

+7/-4

10,851

Cohere

CC-BY-NC-4.0

2024/8

107

101

1193

+2/-2

122,309

Anthropic

Proprietary

2023/8

107

95

1192

+4/-4

15,753

DeepSeek AI

DeepSeek License

2024/6

107

109

1189

+5/-6

9,274

AI21 Labs

Jamba Open

2024/3

108

126

1189

+2/-3

52,578

Meta

Llama 3.1 Community

2023/12

116

94

1177

+2/-2

91,614

OpenAI

Proprietary

2021/9

116

111

1175

+3/-3

27,430

Alibaba

Qianwen LICENSE

2024/4

116

144

1166

+11/-9

3,410

Alibaba

Apache 2.0

No data

117

126

1171

+4/-3

25,135

01 AI

Apache-2.0

2024/5

117

111

1171

+2/-3

64,926

Mistral

Proprietary

No data

117

111

1169

+4/-4

16,027

Reka AI

Proprietary

Online

| Rank (UB) | Rank (StyleCtrl) | Model Name | Score | Confidence Interval | Votes | Provider | License | Knowledge Cutoff |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 119 | 120 | Llama-3-8B-Instruct | 1165 | +2/-2 | 109,056 | Meta | Llama 3 Community | 2023/3 |
| 120 | 133 | InternLM2.5-20B-chat | 1162 | +5/-7 | 10,599 | InternLM | Other | 2024/8 |
| 121 | 115 | Command R (04-2024) | 1162 | +2/-3 | 56,398 | Cohere | CC-BY-NC-4.0 | 2024/3 |
| 121 | 120 | Mistral Medium | 1161 | +3/-3 | 35,556 | Mistral | Proprietary | No data |
| 121 | 114 | Mixtral-8x22b-Instruct-v0.1 | 1161 | +2/-2 | 53,751 | Mistral | Apache 2.0 | 2024/4 |
| 121 | 118 | Reka-Flash-21B | 1161 | +3/-3 | 25,803 | Reka AI | Proprietary | 2023/11 |
| 121 | 115 | Qwen1.5-72B-Chat | 1161 | +3/-3 | 40,658 | Alibaba | Qianwen LICENSE | 2024/2 |
| 121 | 121 | Granite-3.1-8B-Instruct | 1156 | +8/-9 | 3,289 | IBM | Apache 2.0 | No data |
| 122 | 133 | Gemma-2-2b-it | 1157 | +2/-3 | 48,892 | Google | Gemma license | 2024/7 |
| 130 | 115 | Gemini-1.0-Pro-001 | 1145 | +4/-4 | 18,800 | Google | Proprietary | 2023/4 |
| 130 | 126 | Zephyr-ORPO-141b-A35b-v0.1 | 1141 | +8/-8 | 4,854 | HuggingFace | Apache 2.0 | 2024/4 |
| 131 | 129 | Qwen1.5-32B-Chat | 1139 | +4/-4 | 22,765 | Alibaba | Qianwen LICENSE | 2024/2 |
| 131 | 135 | Granite-3.1-2B-Instruct | 1133 | +8/-7 | 3,380 | IBM | Apache 2.0 | No data |
| 132 | 133 | Phi-3-Medium-4k-Instruct | 1136 | +3/-4 | 26,105 | Microsoft | MIT | 2023/10 |
| 132 | 143 | Starling-LM-7B-beta | 1132 | +4/-4 | 16,676 | Nexusflow | Apache-2.0 | 2024/3 |
| 135 | 133 | Mixtral-8x7B-Instruct-v0.1 | 1128 | +3/-2 | 76,126 | Mistral | Apache 2.0 | 2023/12 |
| 135 | 138 | Yi-34B-Chat | 1125 | +4/-4 | 15,917 | 01 AI | Yi License | 2023/6 |
| 135 | 125 | Gemini Pro | 1124 | +7/-7 | 6,557 | Google | Proprietary | 2023/4 |
| 136 | 136 | Qwen1.5-14B-Chat | 1122 | +5/-3 | 18,687 | Alibaba | Qianwen LICENSE | 2024/2 |
| 136 | 135 | WizardLM-70B-v1.0 | 1120 | +6/-4 | 8,383 | Microsoft | Llama 2 Community | 2023/8 |
| 138 | 123 | GPT-3.5-Turbo-0125 | 1119 | +3/-2 | 68,867 | OpenAI | Proprietary | 2021/9 |
| 138 | 143 | Meta-Llama-3.2-3B-Instruct | 1116 | +7/-6 | 8,390 | Meta | Llama 3.2 | 2023/12 |
| 139 | 133 | DBRX-Instruct-Preview | 1117 | +3/-3 | 33,743 | Databricks | DBRX LICENSE | 2023/12 |
| 139 | 140 | Phi-3-Small-8k-Instruct | 1116 | +4/-4 | 18,476 | Microsoft | MIT | 2023/10 |
| 139 | 143 | Tulu-2-DPO-70B | 1113 | +7/-5 | 6,658 | AllenAI/UW | AI2 ImpACT Low-risk | 2023/11 |
| 143 | 133 | Granite-3.0-8B-Instruct | 1107 | +8/-7 | 7,002 | IBM | Apache 2.0 | No data |
| 145 | 138 | OpenChat-3.5-0106 | 1105 | +5/-4 | 12,990 | OpenChat | Apache-2.0 | 2024/1 |
| 146 | 152 | Llama-2-70B-chat | 1106 | +3/-3 | 39,595 | Meta | Llama 2 Community | 2023/7 |
| 146 | 144 | Vicuna-33B | 1104 | +3/-4 | 22,936 | LMSYS | Non-commercial | 2023/8 |
| 146 | 148 | Starling-LM-7B-alpha | 1102 | +6/-5 | 10,415 | UC Berkeley | CC-BY-NC-4.0 | 2023/11 |
| 147 | 138 | Snowflake Arctic Instruct | 1103 | +2/-3 | 34,173 | Snowflake | Apache 2.0 | 2024/4 |
| 147 | 156 | Nous-Hermes-2-Mixtral-8x7B-DPO | 1098 | +7/-8 | 3,836 | NousResearch | Apache-2.0 | 2024/1 |
| 147 | 154 | NV-Llama2-70B-SteerLM-Chat | 1094 | +10/-9 | 3,636 | Nvidia | Llama 2 Community | 2023/11 |
| 151 | 140 | Gemma-1.1-7B-it | 1097 | +3/-4 | 25,070 | Google | Gemma license | 2024/2 |
| 152 | 143 | DeepSeek-LLM-67B-Chat | 1090 | +9/-8 | 4,988 | DeepSeek AI | DeepSeek License | 2023/11 |
| 153 | 141 | OpenChat-3.5 | 1090 | +6/-5 | 8,106 | OpenChat | Apache-2.0 | 2023/11 |
| 153 | 143 | OpenHermes-2.5-Mistral-7B | 1088 | +7/-11 | 5,088 | NousResearch | Apache-2.0 | 2023/11 |
| 153 | 148 | Granite-3.0-2B-Instruct | 1087 | +6/-7 | 7,191 | IBM | Apache 2.0 | No data |
| 154 | 159 | Qwen1.5-7B-Chat | 1083 | +8/-10 | 4,872 | Alibaba | Qianwen LICENSE | 2024/2 |
| 155 | 159 | Mistral-7B-Instruct-v0.2 | 1086 | +4/-5 | 20,067 | Mistral | Apache-2.0 | 2023/12 |
| 155 | 159 | Phi-3-Mini-4K-Instruct-June-24 | 1084 | +5/-5 | 12,808 | Microsoft | MIT | 2023/10 |
| 155 | 135 | GPT-3.5-Turbo-1106 | 1081 | +5/-4 | 17,036 | OpenAI | Proprietary | 2021/9 |
| 155 | 154 | Dolphin-2.2.1-Mistral-7B | 1076 | +11/-13 | 1,714 | Cognitive Computations | Apache-2.0 | 2023/10 |
| 157 | 163 | Phi-3-Mini-4k-Instruct | 1080 | +4/-3 | 21,097 | Microsoft | MIT | 2023/10 |
| 157 | 159 | SOLAR-10.7B-Instruct-v1.0 | 1076 | +8/-10 | 4,286 | Upstage AI | CC-BY-NC-4.0 | 2023/11 |
| 160 | 163 | Llama-2-13b-chat | 1077 | +3/-4 | 19,722 | Meta | Llama 2 Community | 2023/7 |
| 161 | 159 | WizardLM-13b-v1.2 | 1072 | +7/-7 | 7,176 | Microsoft | Llama 2 Community | 2023/7 |
| 165 | 169 | Meta-Llama-3.2-1B-Instruct | 1067 | +6/-6 | 8,523 | Meta | Llama 3.2 | 2023/12 |
| 166 | 167 | Zephyr-7B-beta | 1067 | +4/-5 | 11,321 | HuggingFace | MIT | 2023/10 |
| 166 | 162 | SmolLM2-1.7B-Instruct | 1060 | +12/-10 | 2,375 | HuggingFace | Apache 2.0 | No data |
| 166 | 159 | MPT-30B-chat | 1059 | +9/-12 | 2,644 | MosaicML | CC-BY-NC-SA-4.0 | 2023/6 |
| 166 | 168 | CodeLlama-34B-instruct | 1056 | +9/-6 | 7,509 | Meta | Llama 2 Community | 2023/7 |
| 166 | 168 | CodeLlama-70B-instruct | 1055 | +16/-17 | 1,192 | Meta | Llama 2 Community | 2024/1 |
| 166 | 163 | Zephyr-7B-alpha | 1054 | +15/-14 | 1,811 | HuggingFace | MIT | 2023/10 |
| 169 | 159 | falcon-180b-chat | 1048 | +16/-16 | 1,327 | TII | Falcon-180B TII License | 2023/9 |
| 171 | 162 | Vicuna-13B | 1055 | +4/-4 | 19,775 | LMSYS | Llama 2 Community | 2023/7 |
| 171 | 169 | Gemma-7B-it | 1051 | +6/-5 | 9,176 | Google | Gemma license | 2024/2 |
| 171 | 168 | Phi-3-Mini-128k-Instruct | 1050 | +3/-4 | 21,622 | Microsoft | MIT | 2023/10 |
| 171 | 183 | Llama-2-7B-chat | 1050 | +4/-6 | 14,532 | Meta | Llama 2 Community | 2023/7 |
| 171 | 162 | Qwen-14B-Chat | 1048 | +9/-7 | 5,065 | Alibaba | Qianwen LICENSE | 2023/8 |
| 171 | 169 | Guanaco-33B | 1046 | +11/-12 | 2,996 | UW | Non-commercial | 2023/5 |
| 180 | 174 | Gemma-1.1-2b-it | 1034 | +5/-4 | 11,351 | Google | Gemma license | 2024/2 |
| 180 | 176 | StripedHyena-Nous-7B | 1031 | +8/-8 | 5,276 | Together AI | Apache 2.0 | 2023/12 |
| 181 | 189 | OLMo-7B-instruct | 1029 | +6/-8 | 6,503 | Allen AI | Apache-2.0 | 2024/2 |
| 184 | 181 | Mistral-7B-Instruct-v0.1 | 1021 | +7/-6 | 9,142 | Mistral | Apache 2.0 | 2023/9 |
| 184 | 183 | Vicuna-7B | 1018 | +6/-6 | 7,017 | LMSYS | Llama 2 Community | 2023/7 |
| 184 | 172 | PaLM-Chat-Bison-001 | 1017 | +7/-6 | 8,713 | Google | Proprietary | 2021/6 |
| 188 | 187 | Gemma-2B-it | 1003 | +9/-9 | 4,918 | Google | Gemma license | 2024/2 |
| 189 | 185 | Qwen1.5-4B-Chat | 1002 | +5/-6 | 7,816 | Alibaba | Qianwen LICENSE | 2024/2 |
| 191 | 190 | Koala-13B | 978 | +6/-8 | 7,020 | UC Berkeley | Non-commercial | 2023/4 |
| 191 | 191 | ChatGLM3-6B | 968 | +8/-8 | 4,763 | Tsinghua | Apache-2.0 | 2023/10 |
| 193 | 190 | GPT4All-13B-Snoozy | 946 | +14/-15 | 1,788 | Nomic AI | Non-commercial | 2023/3 |
| 193 | 191 | MPT-7B-Chat | 942 | +9/-9 | 3,997 | MosaicML | CC-BY-NC-SA-4.0 | 2023/5 |
| 193 | 196 | ChatGLM2-6B | 938 | +13/-14 | 2,713 | Tsinghua | Apache-2.0 | 2023/6 |
| 193 | 196 | RWKV-4-Raven-14B | 935 | +9/-8 | 4,920 | RWKV | Apache 2.0 | 2023/4 |
| 197 | 191 | Alpaca-13B | 915 | +6/-9 | 5,864 | Stanford | Non-commercial | 2023/3 |
| 197 | 196 | OpenAssistant-Pythia-12B | 906 | +8/-9 | 6,368 | OpenAssistant | Apache 2.0 | 2023/4 |
| 198 | 199 | ChatGLM-6B | 892 | +9/-10 | 4,983 | Tsinghua | Non-commercial | 2023/3 |
| 199 | 199 | FastChat-T5-3B | 881 | +8/-9 | 4,288 | LMSYS | Apache 2.0 | 2023/4 |
| 201 | 201 | StableLM-Tuned-Alpha-7B | 853 | +10/-10 | 3,336 | Stability AI | CC-BY-NC-SA-4.0 | 2023/4 |
| 201 | 199 | Dolly-V2-12B | 836 | +12/-12 | 3,480 | Databricks | MIT | 2023/4 |
| 202 | 200 | LLaMA-13B | 813 | +14/-12 | 2,446 | Meta | Non-commercial | 2023/2 |

Explanation

  • Rank (UB): A ranking calculated from the Bradley-Terry model. It reflects the model's overall performance in the arena and provides an upper-bound estimate of its Elo score, which helps gauge the model's potential competitiveness (an illustrative fit is sketched after this list).

  • Rank (StyleCtrl): The ranking after applying dialogue style control. This ranking aims to reduce preference bias caused by the model's response style (e.g., verbosity, conciseness) to more purely evaluate its core capabilities.

  • Model Name: The name of the Large Language Model (LLM). This column has embedded links to the models; click to navigate.

  • Score: The Elo rating the model received from user votes in the arena. The Elo rating is a relative ranking system where a higher score indicates better performance. This score is dynamic and reflects the model's relative strength in the current competitive environment.

  • Confidence Interval: The 95% confidence interval for the model's Elo rating (e.g., +6/-6). A smaller interval indicates that the model's rating is more stable and reliable; conversely, a larger interval may suggest insufficient data or significant performance fluctuations. It provides a quantitative assessment of the rating's accuracy.

  • Votes: The total number of votes the model has received in the arena. A higher number of votes generally means higher statistical reliability of its rating.

  • Provider: The organization or company that provides the model.

  • License: The type of license for the model, such as Proprietary, Apache 2.0, MIT, etc.

  • Knowledge Cutoff: The knowledge cutoff date of the model's training data. "No data" means this information is not provided or is unknown.
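
The Rank (UB) and Score columns come from fitting a Bradley-Terry model to pairwise user votes and reporting the fitted strengths on an Elo-like scale. The sketch below is illustrative only and is not the actual Chatbot Arena pipeline: the model names, vote counts, iteration count, and the 1000-point anchor are all made-up placeholders.

```python
# Illustrative only: a minimal Bradley-Terry fit on a toy set of pairwise votes,
# reported on an Elo-like scale. This is NOT the exact computation used by
# Chatbot Arena or fboulnois/llm-leaderboard-csv; all data below is made up.
import math

# wins[a][b] = number of head-to-head votes in which model a beat model b (toy data)
wins = {
    "model-A": {"model-B": 60, "model-C": 45},
    "model-B": {"model-A": 40, "model-C": 55},
    "model-C": {"model-A": 30, "model-B": 35},
}

def bradley_terry(wins, iters=200):
    """Fit Bradley-Terry strengths p_i with the standard MM update:
    p_i <- W_i / sum_{j != i} n_ij / (p_i + p_j),
    where W_i is the total wins of i and n_ij the total comparisons of i and j."""
    p = {m: 1.0 for m in wins}
    for _ in range(iters):
        new_p = {}
        for i in wins:
            w_i = sum(wins[i].values())  # total wins of model i
            denom = sum((wins[i][j] + wins[j][i]) / (p[i] + p[j])
                        for j in wins if j != i)
            new_p[i] = w_i / denom
        # normalize so the geometric mean stays at 1 (strengths are scale-free)
        g = math.exp(sum(math.log(v) for v in new_p.values()) / len(new_p))
        p = {m: v / g for m, v in new_p.items()}
    return p

def to_elo_scale(p, anchor=1000.0):
    """Map strengths to an Elo-like scale: anchor + 400 * log10(strength)."""
    return {m: anchor + 400.0 * math.log10(v) for m, v in p.items()}

scores = to_elo_scale(bradley_terry(wins))
for model, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{model}: {score:.0f}")
```

The production leaderboard applies the same kind of fit over the full vote history; the confidence intervals shown in the table quantify the uncertainty of those fitted scores.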

Data Source and Update Frequency

The data for this leaderboard is automatically generated and provided by the fboulnois/llm-leaderboard-csv project, which sources and processes data from lmarena.ai. This leaderboard is updated daily via GitHub Actions.
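
If you want to work with the raw numbers rather than this snapshot, the CSVs published by that project can be loaded and re-sorted locally. The snippet below is a minimal sketch: the file name and the column names ("Model", "Score", "Votes") are assumptions, so check the headers of the CSV you actually download before running it.

```python
# Minimal sketch of loading and re-sorting a leaderboard CSV locally.
# Assumptions: "llm_leaderboard.csv" and the column names "Model", "Score",
# and "Votes" are placeholders; verify them against the real file first.
import pandas as pd

df = pd.read_csv("llm_leaderboard.csv")         # hypothetical local file name
df = df.sort_values("Score", ascending=False)   # highest Elo-style score first
print(df[["Model", "Score", "Votes"]].head(20).to_string(index=False))
```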

Disclaimer

This report is for reference only. The leaderboard data is dynamic and based on user preference votes on Chatbot Arena over a specific period. The completeness and accuracy of the data depend on the upstream data source and the updates and processing from the fboulnois/llm-leaderboard-csv project. Different models may have different license agreements; please refer to the official documentation from the model provider before use.

Gemini-2.5-Pro-Preview-06-05
Gemini-2.5-Pro-Preview-05-06
o3-2025-04-16
ChatGPT-4o-latest (2025-03-26)
Gemini-2.5-Flash-Preview-05-20
Grok-3-Preview-02-24
GPT-4.5-Preview
Gemini-2.5-Flash-Preview-04-17
GPT-4.1-2025-04-14
DeepSeek-V3-0324
Claude Opus 4 (20250514)
Hunyuan-Turbos-20250416
DeepSeek-R1
Mistral Medium 3
Grok-3-Mini-beta
o1-2024-12-17
Gemini-2.0-Flash-001
o4-mini-2025-04-16
Qwen3-235B-A22B
Qwen2.5-Max
Gemma-3-27B-it
o1-preview
Claude Sonnet 4 (20250514)
o3-mini-high
GPT-4.1-mini-2025-04-14
Gemma-3-12B-it
Amazon-Nova-Experimental-Chat-05-14
DeepSeek-V3
QwQ-32B
Qwen-Plus-0125
Gemini-2.0-Flash-Lite
GLM-4-Plus-0111
Command A (03-2025)
Hunyuan-TurboS-20250226
Step-2-16K-Exp
o3-mini
Llama-3.3-Nemotron-Super-49B-v1
Hunyuan-Turbo-0110
o1-mini
Gemini-1.5-Pro-002
Claude 3.7 Sonnet (thinking-32k)
Gemma-3n-e4b-it
Claude 3.7 Sonnet
Grok-2-08-13
Yi-Lightning
GPT-4o-2024-05-13
Qwen2.5-plus-1127
Claude 3.5 Sonnet (20241022)
Deepseek-v2.5-1210
Gemma-3-4B-it
Hunyuan-Large-2025-02-10
Athene-v2-Chat-72B
GLM-4-Plus
Llama-4-Maverick-17B-128E-Instruct
GPT-4.1-nano-2025-04-14
GPT-4o-mini-2024-07-18
Gemini-1.5-Flash-002
Llama-3.1-Nemotron-70B-Instruct
Meta-Llama-3.1-405B-Instruct-bf16
Claude 3.5 Sonnet (20240620)
Hunyuan-Standard-2025-02-10
Meta-Llama-3.1-405B-Instruct-fp8
Gemini Advanced App (2024-05-14)
Grok-2-Mini-08-13
GPT-4o-2024-08-06
Qwen-Max-0919
Gemini-1.5-Pro-001
Deepseek-v2.5
Qwen2.5-72B-Instruct
Llama-3.3-70B-Instruct
Mistral-Small-3.1-24B-Instruct-2503
GPT-4-Turbo-2024-04-09
Mistral-Large-2407
Athene-70B
Llama-3.1-Tulu-3-70B
GPT-4-1106-preview
Mistral-Large-2411
Meta-Llama-3.1-70B-Instruct
Claude 3 Opus
Amazon Nova Pro 1.0
GPT-4-0125-preview
Claude 3.5 Haiku (20241022)
Reka-Core-20240904
Gemini-1.5-Flash-001
Jamba-1.5-Large
Qwen2.5-Coder-32B-Instruct
Gemma-2-27B-it
Mistral-Small-24B-Instruct-2501
Amazon Nova Lite 1.0
Gemma-2-9B-it-SimPO
Command R+ (08-2024)
Llama-3.1-Nemotron-51B-Instruct
Gemini-1.5-Flash-8B-001
OLMo-2-0325-32B-Instruct
Aya-Expanse-32B
Nemotron-4-340B-Instruct
GLM-4-0520
Reka-Flash-20240904
Llama-3-70B-Instruct
Phi-4
Claude 3 Sonnet
Amazon Nova Micro 1.0
Hunyuan-Standard-256K
Gemma-2-9B-it
Command R+ (04-2024)
Llama-3.1-Tulu-3-8B
Qwen2-72B-Instruct
GPT-4-0314
Ministral-8B-2410
Aya-Expanse-8B
Command R (08-2024)
Claude 3 Haiku
DeepSeek-Coder-V2-Instruct
Jamba-1.5-Mini
Meta-Llama-3.1-8B-Instruct
GPT-4-0613
Qwen1.5-110B-Chat
QwQ-32B-Preview
Yi-1.5-34B-Chat
Mistral-Large-2402
Reka-Flash-21B-online