Chat Interface
This document was translated from Chinese by AI and has not yet been reviewed.
Chat Interface
Assistants and Topics
Assistant
An Assistant
is a personalized configuration of a selected model, including settings like prompt presets and parameter presets. These settings allow the selected model to better meet your expected work requirements.
The System Default Assistant
has a fairly general parameter preset (no prompt), which you can use directly or go to the Agent Page to find a preset that suits your needs.
Topic
An Assistant
is a parent set to a Topic
. Multiple topics (i.e., conversations) can be created under a single assistant. All Topics
share the Assistant
's parameter settings, preset words (prompt), and other model settings.


Buttons in the Chat Box

New Topic
Creates a new topic within the current assistant.
Upload Image or Document
Uploading images requires model support. Uploaded documents will be automatically parsed into text and provided to the model as context.
Web Search
Requires configuring web search-related information in the settings. The search results are returned to the large model as context. For details, see Web Search Mode.
Knowledge Base
Enables the knowledge base. For details, see Knowledge Base Tutorial.
MCP Server
Enables the MCP server function. For details, see MCP Usage Tutorial.
Generate Image
Not displayed by default. For models that support image generation (like Gemini), you need to manually activate it to generate images.
Select Model
Switches to the specified model for the subsequent conversation while retaining the context.
Quick Phrases
You need to preset common phrases in the settings first. They can be invoked here and entered directly, with support for variables.
Clear Messages
Deletes all content under the current topic.
Expand
Makes the chat box larger for entering long texts.
Clear Context
Truncates the context available to the model without deleting the content, meaning the model will "forget" the previous conversation.
Estimate Token Count
Displays the estimated token count. The four data points are Current Context Count
, Max Context Count
(∞ means infinite context), Character Count in Current Input Box
, and Estimated Token Count
.
Translate
Translates the content in the current input box into English.
Conversation Settings

Model Settings
Model settings are synchronized with the Model Settings
parameters in the assistant settings. For details, see Assistant Settings.
Message Settings
Message Divider
:
Separates the message body from the action bar with a divider.

Use Serif Font
:
Switches the font style. You can now also change the font via custom CSS.
Show Line Numbers in Code
:
Displays line numbers in code blocks when the model outputs code snippets.

Collapsible Code Blocks
:
When enabled, long code snippets will be automatically collapsed.
Wrap Lines in Code Blocks
:
When enabled, long single lines of code (exceeding the window) will automatically wrap.
Auto-collapse Thinking Process
:
When enabled, models that support showing their thinking process will automatically collapse it after completion.
Message Style
:
You can switch the chat interface to either bubble style or list style.
Code Style
:
You can switch the display style of code snippets.
Math Formula Engine
:
KaTeX renders faster as it is specifically designed for performance optimization.
MathJax renders slower but is more feature-complete, supporting more mathematical symbols and commands.
Message Font Size
:
Adjusts the font size of the chat interface.
Input Settings
Show Estimated Token Count
:
Displays the estimated token consumption of the input text in the input box (not the actual context token consumption, for reference only).
Paste Long Text as File
:
When pasting a long text from another source into the input box, it will automatically be displayed as a file style to reduce interference with subsequent input.
Render Input Messages with Markdown
:
When off, only the model's reply messages are rendered, not the messages you send.

Triple-press Space to Translate
:
After typing a message in the chat input box, pressing the spacebar three times consecutively will translate the input into English.
Note: This action will overwrite the original text.
Target Language
:
Sets the target language for the translate button in the input box and the triple-press space translation feature.
Assistant Settings
In the assistant interface, select the assistant name you want to configure → choose the corresponding setting from the right-click context menu.
Edit Assistant

Prompt Settings
Name
:
You can customize the assistant's name for easy identification.
Prompt
:
This is the prompt
. You can refer to the prompt writing style on the agent page to edit the content.
Model Settings
Default Model
:
You can set a fixed default model for this assistant. When adding from the agent page or copying an assistant, the initial model will be this one. If this is not set, the initial model will be the global initial model (i.e., the Default Assistant Model).
Auto Reset Model
:
When on - If you switch to another model during a conversation in a topic, creating a new topic will reset the model for the new topic to the assistant's default model. When this option is off, the model for a new topic will follow the model used in the previous topic.
For example, if the assistant's default model is gpt-3.5-turbo, and I create Topic 1 under this assistant, and during the conversation in Topic 1, I switch to gpt-4o, then:
If auto-reset is on: When creating Topic 2, the default model for Topic 2 will be gpt-3.5-turbo.
If auto-reset is off: When creating Topic 2, the default model for Topic 2 will be gpt-4o.
Temperature
:
The temperature parameter controls the degree of randomness and creativity in the text generated by the model (default is 0.7). Specifically:
Low temperature value (0-0.3):
Output is more deterministic and focused
Suitable for scenarios requiring accuracy, like code generation and data analysis
Tends to select the most likely words
Medium temperature value (0.4-0.7):
Balances creativity and coherence
Suitable for daily conversations and general writing
Recommended for chatbot conversations (around 0.5)
High temperature value (0.8-1.0):
Produces more creative and diverse output
Suitable for creative writing, brainstorming, etc.
May reduce the coherence of the text
Top P (Nucleus Sampling)
:
The default value is 1. The smaller the value, the more monotonous and easier to understand the AI-generated content is. The larger the value, the wider and more diverse the vocabulary of the AI's response.
Nucleus sampling affects the output by controlling the probability threshold for vocabulary selection:
Smaller value (0.1-0.3):
Considers only the highest probability words
Output is more conservative and controllable
Suitable for code comments, technical documentation, etc.
Medium value (0.4-0.6):
Balances vocabulary diversity and accuracy
Suitable for general conversation and writing tasks
Larger value (0.7-1.0):
Considers a wider range of vocabulary choices
Produces richer and more diverse content
Suitable for creative writing and other scenarios requiring diverse expression
Context Window
The number of messages to keep in the context. The larger the value, the longer the context and the more tokens are consumed:
5-10: Suitable for normal conversations
>10: For complex tasks requiring longer memory (e.g., generating a long article step-by-step according to an outline, which requires ensuring the generated context is logically coherent)
Note: The more messages, the greater the token consumption
Enable Message Length Limit (MaxToken)
The maximum number of Tokens for a single response. In large language models, max tokens is a key parameter that directly affects the quality and length of the generated response.
For example: When testing if a model is connected after filling in the key in CherryStudio, you only need to know if the model returns a message correctly without specific content. In this case, setting MaxToken to 1 is sufficient.
The MaxToken limit for most models is 32k Tokens, but some have 64k or even more. You need to check the corresponding introduction page for specifics.
The specific setting depends on your needs, but you can also refer to the following suggestions.
Suggestions:
Normal chat: 500-800
Short article generation: 800-2000
Code generation: 2000-3600
Long article generation: 4000 and above (requires model support)
Generally, the model's response will be limited within the MaxToken range. However, it might be truncated (e.g., when writing long code) or the expression may be incomplete. In special cases, you need to adjust it flexibly according to the actual situation.
Streaming Output (Stream)
Streaming output is a data processing method that allows data to be transmitted and processed as a continuous stream, rather than sending all data at once. This method allows data to be processed and output immediately after it is generated, greatly improving real-time performance and efficiency.
In an environment like the CherryStudio client, it's simply a typewriter effect.
When off (non-streaming): The model outputs the entire message at once after generating it (imagine receiving a message on WeChat).
When on: Word-by-word output. You can think of it as the large model sending you each word as it generates it, until the entire message is sent.
Custom Parameters
Adds extra request parameters to the request body, such as presence_penalty
, which are generally not needed by most users.
The parameters mentioned above like top-p, maxtokens, stream, etc., are examples of these parameters.
How to fill: Parameter Name—Parameter Type (text, number, etc.)—Value. Refer to the documentation: Click to go
最后更新于
这有帮助吗?