Chat Interface
This document was translated from Chinese by AI and has not yet been reviewed.
Assistants and Topics
Assistants
An Assistant applies personalized settings, such as preset prompts and parameter presets, to a selected model so that it behaves more in line with your expectations.
The System Default Assistant comes with a general parameter preset (no prompt). You can use it directly or find the preset you need on the Agents page.
Topics
An Assistant is the parent of Topics: multiple topics (i.e., conversations) can be created under a single assistant, and all of them share the assistant's parameter settings and prompt presets.


Buttons in the Chat Box

New Topic creates a new topic within the current assistant.
Upload Image or Document requires a model with image (vision) support for image uploads. Uploaded documents are automatically parsed into text and provided to the model as context.
Web Search requires configuring web search-related information in the settings. Search results are returned to the large language model as context. See Web Search for details.
Knowledge Base enables the knowledge base. See Knowledge Base Tutorial for details.
MCP Server enables the MCP server function. See MCP Usage Tutorial for details.
Generate Image is not shown by default. For models that support image generation (e.g., Gemini), you need to manually activate it to generate images.
Select Model switches to the specified model for the subsequent conversation, while preserving context.
Quick Phrase inserts common phrases that you have preconfigured in the settings; variables are supported.
Clear Messages deletes all content under this topic.
Expand makes the chat box larger to accommodate long text input.
Clear Context truncates the context available to the model without deleting content, meaning the model will "forget" previous conversation content.
Estimated Token Count displays the estimated token count. The four values represent Current Context Count, Maximum Context Count (∞ indicates infinite context), Current Input Box Message Count, and Estimated Token Count.
Translate translates the content of the current input box into the configured target language.
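The token count shown in the chat box is only an estimate. A common rough heuristic, shown below, is about 4 characters per token for English text; this is an assumption for illustration, not the tokenizer the app actually uses.

```python
# Rough token estimate: ~4 characters per token for English text.
# This heuristic is an assumption, not the app's real tokenizer.
def estimate_tokens(text):
    """Very rough token estimate for a piece of English text."""
    return max(1, len(text) // 4)

print(estimate_tokens("Hello, how are you today?"))  # → 6
```

Real tokenizers split text differently per model, so the displayed value should only be used as a reference.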
Chat Settings

Model Settings
Model settings are synchronized with the Model Settings in Assistant Settings. See Assistant Settings for details.
Message Settings
Message Separator: Use a separator to separate the message body from the action bar.


Use Serif Font: Switches the font style. You can also change the font via Custom CSS.
Show Line Numbers for Code: Displays line numbers for code blocks when the model outputs code snippets.


Foldable Code Blocks: When enabled, code blocks automatically fold if the code snippet is too long.
Word Wrap Code Blocks: When enabled, single lines of code that exceed the window width automatically wrap.
Auto-fold Thought Content: When enabled, the thinking process of models that support "thinking" is automatically folded once it completes.
Message Style: Switches the chat interface between bubble style and list style.
Code Style: Switches the display style of code snippets.
Math Formula Engine: KaTeX renders faster because it is specifically optimized for performance; MathJax renders slower but is more comprehensive, supporting more mathematical symbols and commands.
Message Font Size: Adjusts the font size of the chat interface.
Input Settings
Show Estimated Token Count: Displays the estimated number of tokens the input text will consume (not the actual context consumption; for reference only).
Paste Long Text as File: When a long block of text is copied and pasted into the input box, it is automatically displayed as a file, reducing interference during subsequent input.
Markdown Render Input Messages: When turned off, only the model's reply messages are rendered, not the messages you send.


Translate by Hitting Space 3 Times Quickly: After typing a message in the chat input box, hitting the spacebar three times in quick succession translates the input into the target language. Note: this operation overwrites the original text.
Target Language: Sets the target language for the translate button in the input box and for the "Translate by Hitting Space 3 Times Quickly" feature.
Assistant Settings
In the assistant interface, right-click the desired assistant name and select the corresponding setting from the context menu.
Edit Assistant

Prompt Settings
Name: A customizable assistant name for easy identification.
Prompt: The prompt text; you can refer to the prompt writing style on the Agents page when editing it.
Model Settings
Default Model: You can pin a default model for this assistant. When adding the assistant from the Agents page or duplicating it, the initial model will be this one. If not set, the initial model will be the global default model (i.e., the Default Assistant Model).
Auto-reset Model: When enabled, if you switch to another model during a conversation, any newly created topic resets to the assistant's default model. When disabled, a new topic follows the model used in the previous topic.
For example, suppose the assistant's default model is gpt-3.5-turbo, you create Topic 1 under this assistant, and during the conversation in Topic 1 you switch to gpt-4o. In this case:
If auto-reset is enabled: a newly created Topic 2 will default to gpt-3.5-turbo;
If auto-reset is disabled: Topic 2 will default to gpt-4o.
Temperature: The temperature parameter controls the degree of randomness and creativity in the text generated by the model (default value is 0.7). Specifically:
Low temperature values (0-0.3):
Output is more deterministic, more focused
Suitable for scenarios requiring accuracy, such as code generation and data analysis
Tends to choose the most probable words for output
Medium temperature values (0.4-0.7):
Balances creativity and coherence
Suitable for daily conversations, general writing
Recommended for chatbot conversations (around 0.5)
High temperature values (0.8-1.0):
Produces more creative and diverse outputs
Suitable for creative writing, brainstorming scenarios
But may reduce text coherence
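The guidance above can be sketched as a request body for an OpenAI-compatible chat API. The endpoint shape and model name here are illustrative assumptions, not Cherry Studio internals.

```python
# Minimal sketch (assuming an OpenAI-compatible chat API) of how the
# temperature setting travels in the request body.
def build_request(messages, temperature=0.7):
    """Return a chat-completion request body with the given temperature."""
    return {
        "model": "gpt-3.5-turbo",    # placeholder model name
        "messages": messages,
        "temperature": temperature,  # 0-0.3 precise, ~0.5 chat, 0.8-1.0 creative
    }

body = build_request([{"role": "user", "content": "Hello"}], temperature=0.5)
```

Lower the value for deterministic tasks such as code generation, raise it for brainstorming.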
Top P (Nucleus Sampling): The default value is 1. The smaller the value, the more conservative and predictable the AI-generated content; the larger the value, the wider and more diverse the range of vocabulary in the AI's response.
Nucleus sampling affects the output by controlling the probability threshold for vocabulary selection:
Smaller values (0.1-0.3):
Only considers the highest probability vocabulary
Output is more conservative, more controllable
Suitable for scenarios like code comments, technical documentation
Medium values (0.4-0.6):
Balances vocabulary diversity and accuracy
Suitable for general conversation and writing tasks
Larger values (0.7-1.0):
Considers a wider range of vocabulary choices
Produces richer and more diverse content
Suitable for creative writing and other scenarios requiring diverse expression
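The threshold behavior described above can be illustrated with a toy version of nucleus sampling: keep the smallest set of candidate tokens whose cumulative probability reaches `top_p`, and sample only from that set. The tokens and probabilities below are made up for illustration.

```python
# Toy illustration of nucleus (top-p) filtering. Tokens and probabilities
# are invented; real models work over full vocabularies.
def nucleus_filter(probs, top_p):
    """probs: {token: probability}. Return the tokens kept for sampling."""
    kept, total = [], 0.0
    for token, p in sorted(probs.items(), key=lambda kv: -kv[1]):
        kept.append(token)
        total += p
        if total >= top_p:
            break
    return kept

probs = {"the": 0.5, "a": 0.3, "one": 0.15, "zebra": 0.05}
print(nucleus_filter(probs, 0.1))  # → ['the'] (only the top token)
print(nucleus_filter(probs, 0.9))  # → ['the', 'a', 'one'] (wider choice)
```

A small `top_p` keeps output focused on the most probable words; a large one admits rarer, more diverse vocabulary.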
Context Window
Context Window: The number of messages to retain in the context. A larger value means a longer context and more token consumption:
5-10: Suitable for general conversations
>10: For complex tasks requiring longer memory (e.g., tasks that generate long texts step-by-step according to a writing outline, needing to ensure logical coherence of the generated context)
Note: More messages consume more tokens
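The effect of this setting can be sketched as follows: only the last N messages are sent to the model and older ones are dropped. This is an illustration of the concept, not Cherry Studio's actual implementation.

```python
# Sketch of a context window: keep only the most recent `window` messages,
# so the model "forgets" anything older. Illustrative, not app source code.
def apply_context_window(messages, window):
    """Return the most recent `window` messages."""
    return messages[-window:] if window else messages

history = [{"role": "user", "content": f"message {i}"} for i in range(20)]
context = apply_context_window(history, 5)
# Only messages 15-19 would be sent to the model.
```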
Enable Message Length Limit (MaxToken)
Enable Message Length Limit (MaxToken): The maximum number of tokens for a single response. In large language models, max_tokens is a key parameter that directly affects the quality and length of the model's generated response.
For example: When testing if a model is connected after filling in the key in CherryStudio, you only need to know if the model returns a message correctly, not specific content. In this case, setting MaxToken to 1 is sufficient.
The MaxToken limit for most models is 32k tokens, but some support 64k or even more; check the corresponding model's introduction page for details.
The exact setting depends on your needs, but you can also refer to the suggestions below.
Suggestions:
General chat: 500-800
Short text generation: 800-2000
Code generation: 2000-3600
Long text generation: 4000 and above (requires model support)
Generally, the model's generated response will be limited to the MaxToken range. However, it may sometimes be truncated (e.g., when writing long code) or incomplete. In special cases, flexible adjustments may be needed based on actual circumstances.
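The connectivity-test trick mentioned above can be sketched as a request body capped at a single token. The model name is a placeholder, and the body shape assumes an OpenAI-compatible API.

```python
import json

# Sketch of a cheap connectivity probe: cap the reply at one token, since
# we only care whether the model answers at all. Model name is a placeholder.
def connectivity_probe(model="gpt-3.5-turbo"):
    """Build a minimal request body whose reply is capped at 1 token."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": "hi"}],
        "max_tokens": 1,  # any reply at all proves the connection works
    }

payload = json.dumps(connectivity_probe())
```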
Stream Output
Stream Output: Stream output is a data processing method that transmits and processes data as a continuous stream rather than sending everything at once. This allows data to be processed and displayed as soon as it is generated, greatly improving real-time responsiveness.
In environments like the CherryStudio client, it essentially means a typewriter effect.
When disabled (non-streaming): The model generates the information and outputs the entire segment at once (imagine receiving a message on WeChat);
When enabled: Outputs character by character. You can think of it as the large model sending you each character immediately after it's generated, until all are sent.
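The typewriter effect can be sketched with a toy generator: chunks are rendered as they arrive, and joining them yields the same text a non-streaming response would deliver all at once. `fake_stream` is a made-up stand-in for a real streamed API response.

```python
# Toy illustration of streaming vs. non-streaming output. `fake_stream`
# is a stand-in for a real streamed API response, not a real client.
def fake_stream(text, chunk_size=4):
    """Yield the reply in small chunks, like a streamed response."""
    for i in range(0, len(text), chunk_size):
        yield text[i:i + chunk_size]

received = []
for chunk in fake_stream("Hello from the model!"):
    received.append(chunk)       # render each chunk immediately (typewriter effect)

full_reply = "".join(received)   # identical to the non-streaming result
```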
Custom Parameters
Custom Parameters: Adds extra parameters to the request body, such as a presence_penalty field. Most users will not need this.
Parameters such as top-p, max_tokens, and stream mentioned above are examples of these request-body parameters.
Fill-in method: Parameter Name — Parameter Type (text, number, etc.) — Value. Reference documentation: Click to go
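Conceptually, a custom parameter is merged into the request body alongside the built-in settings. The sketch below assumes OpenAI-style parameter names; it is illustrative, not Cherry Studio source code.

```python
# Sketch of merging custom parameters (e.g. presence_penalty) into the
# request body. OpenAI-style names assumed; illustrative only.
def build_body(messages, custom_params=None):
    body = {
        "model": "gpt-3.5-turbo",  # placeholder model name
        "messages": messages,
        "temperature": 0.7,
        "stream": True,
    }
    body.update(custom_params or {})  # custom entries extend/override defaults
    return body

body = build_body([{"role": "user", "content": "Hi"}],
                  {"presence_penalty": 0.6})
```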