Chat Interface
This document was translated from Chinese by AI and has not yet been reviewed.
Assistants and Topics
Assistant
An assistant allows for personalized settings for a chosen model, such as preset prompts and parameter presets. These settings help the selected model work more in line with your expectations.
The System Default Assistant comes with a relatively general set of parameters (no prompt). You can use it directly or find the presets you need on the Agents page.
Topic
An assistant is a superset of a topic. Multiple topics (i.e., conversations) can be created under a single assistant. All topics share the assistant's parameter settings and model settings, such as preset words (prompts).


Buttons in the Chatbox

New Topic creates a new topic within the current assistant.
Upload Image or Document. Uploading images requires model support. Uploading documents will automatically parse them into text to be provided to the model as context.
Web Search requires configuring web search-related information in the settings. Search results are returned to the large model as context. See Web Search Mode for details.
Knowledge Base enables the knowledge base feature. See Knowledge Base Tutorial for details.
MCP Server enables the MCP server feature. See MCP Usage Tutorial for details.
Generate Image is displayed only when the selected chat model supports image generation. (For non-chat image generation models, please go to Drawing).
Select Model switches to the specified model for the subsequent conversation, while retaining the context.
Quick Phrases requires presetting common phrases in the settings. They can be invoked here, directly input, and support variables.
Clear Messages deletes all content in this topic.
Expand enlarges the chatbox for entering long texts.
Clear Context truncates the context available to the model without deleting content, meaning the model will "forget" previous conversation content.
Estimated Token Count displays the estimated token count. The four values are Current Context Count, Maximum Context Count (∞ means infinite context), Current Input Box Message Character Count, and Estimated Token Count.
Translate translates the content in the current input box into English.
Chat Settings

Model Settings
Model settings are synchronized with the Model Settings in the Assistant settings. See Assistant Settings.
Message Settings
Message Separator:
Message Separator:Use a separator to distinguish the message body from the action bar.


Use Serif Font:
Use Serif Font:Font style switching. You can also change the font via Custom CSS.
Display Line Numbers for Code:
Display Line Numbers for Code:Displays line numbers for code blocks when the model outputs code snippets.


Collapsible Code Blocks:
Collapsible Code Blocks:When enabled, code blocks will automatically collapse if the code snippet is too long.
Code Block Word Wrap:
Code Block Word Wrap:When enabled, single lines of code within code snippets will automatically wrap if they exceed the window width.
Auto-collapse Thinking Content:
Auto-collapse Thinking Content:When enabled, models that support "thinking" will automatically collapse the thinking process after completion.
Message Style:
Message Style:Can switch the chat interface to a bubble style or list style.
Code Style:
Code Style:Can switch the display style of code snippets.
Math Formula Engine:
Math Formula Engine:KaTeX renders faster because it is specifically designed for performance optimization;
MathJax renders slower but is more comprehensive, supporting more mathematical symbols and commands.
Message Font Size:
Message Font Size:Adjusts the font size of the chat interface.
Input Settings
Show Estimated Token Count:
Show Estimated Token Count:Displays the estimated number of tokens consumed by the input text in the input box (not the actual context consumption, for reference only).
Paste Long Text as File:
Paste Long Text as File:When copying and pasting a long passage of text from elsewhere into the input box, it will automatically appear as a file, reducing interference when entering subsequent content.
Markdown Render Input Messages:
Markdown Render Input Messages:When off, only model replies are rendered, not sent messages.


Translate by Tapping Space 3 Times:
Translate by Tapping Space 3 Times:After entering a message in the chat interface input box, tapping the space bar three times consecutively will translate the input content into English.
Note: This operation will overwrite the original text.
Target Language:
Target Language:Sets the target language for the input box translation button and the "Translate by tapping space 3 times" feature.
Assistant Settings
In the assistant interface, select the assistant name to be set → choose the corresponding setting from the right-click menu.
Edit Assistant

Prompt Settings
Name:
Name:Customizable assistant name for easy identification.
Prompt:
Prompt:This is the prompt. You can refer to the prompt writing style on the Agents page to edit the content.
Model Settings
Default Model:
Default Model:You can fix a default model for this assistant. When adding from the Agents page or copying an assistant, the initial model will be this model. If this item is not set, the initial model will be the global initial model (i.e., Default Assistant Model).
Auto-reset Model:
Auto-reset Model:When enabled - if another model is switched to during usage within this topic, creating a new topic will reset the new topic's model to the assistant's default model. When this item is disabled, the model for a new topic will follow the model used in the previous topic.
For example, if the assistant's default model is gpt-3.5-turbo, and I create Topic 1 under this assistant, then switch to gpt-4o during the conversation in Topic 1:
If auto-reset is enabled: when creating Topic 2, Topic 2 will default to gpt-3.5-turbo.
If auto-reset is not enabled: when creating Topic 2, Topic 2 will default to gpt-4o.
Temperature:
Temperature:The temperature parameter controls the degree of randomness and creativity in the text generated by the model (default value is 0.7). Specifically:
Low temperature values (0-0.3):
Output is more deterministic and focused.
Suitable for tasks requiring accuracy, such as code generation and data analysis.
Tends to select the most probable words for output.
Medium temperature values (0.4-0.7):
Balances creativity and coherence.
Suitable for daily conversations and general writing.
Recommended for chatbot conversations (around 0.5).
High temperature values (0.8-1.0):
Produces more creative and diverse outputs.
Suitable for creative writing, brainstorming, and similar scenarios.
May reduce the coherence of the text.
Top P (Nucleus Sampling):
Top P (Nucleus Sampling):The default value is 1. The smaller the value, the more monotonous and easier to understand the AI-generated content; the larger the value, the wider the range of vocabulary the AI uses, making it more diverse.
Nucleus sampling influences the output by controlling the probability threshold for word selection:
Smaller values (0.1-0.3):
Considers only the highest probability words.
Output is more conservative and controlled.
Suitable for code comments, technical documentation, etc.
Medium values (0.4-0.6):
Balances vocabulary diversity and accuracy.
Suitable for general conversations and writing tasks.
Larger values (0.7-1.0):
Considers a wider range of vocabulary choices.
Produces richer and more diverse content.
Suitable for creative writing and other scenarios requiring diverse expression.
Context Window
Context WindowThe number of messages to retain in context. A larger value means longer context and consumes more tokens:
5-10: Suitable for general conversations.
10: For complex tasks requiring longer memory (e.g., generating long texts step-by-step according to an outline, where logical coherence of generated context is needed).
Note: More messages mean higher token consumption.
Enable Message Length Limit (MaxToken)
Enable Message Length Limit (MaxToken)The maximum Token count for a single response. In large language models, max_tokens is a key parameter that directly affects the quality and length of the model's generated response.
For example: When testing if a model is connected in CherryStudio after entering the key, you only need to know if the model returns a message correctly, not specific content. In this case, setting
MaxTokento 1 is sufficient.
Most models have a MaxToken limit of 32k Tokens, but some have 64k or even more. You need to check the corresponding introductory page for details.
The specific setting depends on your needs, but you can also refer to the suggestions below.
Suggestions:
General chat: 500-800
Short text generation: 800-2000
Code generation: 2000-3600
Long text generation: 4000 and above (requires model support)
Generally, the model's response will be limited to the MaxToken range. However, truncation (e.g., when writing long code) or incomplete expressions may occur. In special cases, flexible adjustments need to be made based on actual circumstances.
Stream Output (Stream)
Stream Output (Stream)Stream output is a data processing method that allows data to be transmitted and processed in a continuous stream, rather than sending all data at once. This method allows data to be processed and output immediately after it is generated, greatly improving real-time performance and efficiency.
In environments like the CherryStudio client, it simply means a "typewriter effect".
When off (non-streaming): The model generates the entire piece of information and outputs it all at once (imagine receiving a message on WeChat);
When on: Output character by character. This can be understood as the large model sending you each generated character immediately until all characters are sent.
Custom Parameters
Custom ParametersAdds additional request parameters to the request body, such as presence_penalty, etc. Most people generally do not need to use this.
The
top-p,maxtokens,streamparameters mentioned above are among these parameters.
Filling method: Parameter Name — Parameter Type (text, number, etc.) — Value. Reference documentation: Click to go
Last updated
Was this helpful?