Chat Interface

Assistants and Topics

Assistants

Assistant refers to applying personalized settings to a selected model, such as preset prompts and parameter presets. These settings allow the chosen model to perform more in line with your expectations.

The System Default Assistant comes with a general parameter preset (no prompt). You can use it directly or find the preset you need on the Agents page.

Topics

Assistant is the parent of Topic. Multiple topics (i.e., conversations) can be created under a single assistant. All Topics share the assistant's parameter settings and model settings like preset words (prompt).

Buttons in the Chat Box

New Topic creates a new topic within the current assistant.

Upload Image or Document requires model support for image uploads. Document uploads will automatically be parsed into text and provided to the model as context.

Web Search requires configuring web search-related information in the settings. Search results are returned to the large language model as context. See Web Search for details.

Knowledge Base enables the knowledge base. See Knowledge Base Tutorial for details.

MCP Server enables the MCP server function. See MCP Usage Tutorial for details.

Generate Image is not shown by default. For models that support image generation (e.g., Gemini), you need to manually activate it to generate images.

Due to technical reasons, you must manually activate the button to generate images. This button will be removed once this feature is optimized.

Select Model switches to the specified model for the subsequent conversation, while preserving context.

Quick Phrase requires pre-setting common phrases in the settings to be called and directly entered here, supporting variables.

Clear Messages deletes all content under this topic.

Expand makes the chat box larger to accommodate long text input.

Clear Context truncates the context available to the model without deleting content, meaning the model will "forget" previous conversation content.

Estimated Token Count displays the estimated token count. The four values represent Current Context Count, Maximum Context Count (∞ indicates infinite context), Current Input Box Message Count, and Estimated Token Count.

This function is only for estimating Token count. The actual Token count varies for each model; please refer to the data provided by the model provider.

Translate translates the content in the current input box into English.

Chat Settings

Model Settings

Model settings are synchronized with the Model Settings in Assistant Settings. See Assistant Settings for details.

In chat settings, only the model settings apply to the current assistant; other settings apply globally. For example, if you set the message style to bubble, it will be a bubble style for any topic in any assistant.

Message Settings

Message Separator:

Use a separator to separate the message body from the action bar.

Use Serif Font:

Font style switching. You can also change the font via Custom CSS.

Show Line Numbers for Code:

Displays line numbers for code blocks when the model outputs code snippets.

Foldable Code Blocks:

When enabled, code blocks will automatically fold if the code snippet is too long.

Word Wrap Code Blocks:

When enabled, single-line code within code snippets will automatically wrap if it exceeds the window width.

Auto-fold Thought Content:

When enabled, models that support "thinking" will automatically fold the thinking process once completed.

Message Style:

Allows switching the chat interface to bubble or list style.

Code Style:

Allows switching the display style of code snippets.

Math Formula Engine:

  • KaTeX renders faster because it is specifically optimized for performance;

  • MathJax renders slower but is more comprehensive, supporting more mathematical symbols and commands.

Message Font Size:

Adjusts the font size of the chat interface.

Input Settings

Show Estimated Token Count:

Displays the estimated number of Tokens consumed by the input text in the input box (not the actual context consumption, for reference only).

Paste Long Text as File:

When copying and pasting a long block of text from elsewhere into the input box, it will automatically display as a file, reducing interference during subsequent input.

Markdown Render Input Messages:

When turned off, only the model's reply messages are rendered, not the sent messages.

Translate by Hitting Space 3 Times Quickly:

After entering a message in the chat interface's input box, hitting the spacebar three times quickly will translate the input content into English.

Target Language:

Sets the target language for the translate button in the input box and for the "Translate by Hitting Space 3 Times Quickly" feature.

Assistant Settings

In the assistant interface, select the desired Assistant Name → select the corresponding settings from the right-click menu.

Edit Assistant

Assistant settings apply to all topics under that assistant.

Prompt Settings

Name:

Customizable assistant name for easy identification.

Prompt:

The prompt, which can be edited by referring to the prompt writing style on the Agents page.

Model Settings

Default Model:

You can fix a default model for this assistant. When adding from the Agents page or duplicating an assistant, the initial model will be this one. If not set, the initial model will be the global initial model (i.e., Default Assistant Model).

There are two types of default models for assistants: one is the Global Default Chat Model, and the other is the assistant's default model. The assistant's default model has higher priority than the global default chat model. If the assistant's default model is not set, then the assistant's default model = the global default chat model.

Auto-reset Model:

When enabled - If you switch to another model during use within this topic, creating a new topic will reset the new topic to the assistant's default model. When this item is disabled, the model for a new topic will follow the model used in the previous topic.

For example, if the assistant's default model is gpt-3.5-turbo, and I create Topic 1 under this assistant, then during the conversation in Topic 1, I switch to using gpt-4o. In this case:

If auto-reset is enabled: When creating Topic 2, the default selected model for Topic 2 will be gpt-3.5-turbo;

If auto-reset is not enabled: When creating Topic 2, the default selected model for Topic 2 will be gpt-4o.

Temperature:

The temperature parameter controls the degree of randomness and creativity in the text generated by the model (default value is 0.7). Specifically:

  • Low temperature values (0-0.3):

    • Output is more deterministic, more focused

    • Suitable for scenarios requiring accuracy, such as code generation and data analysis

    • Tends to choose the most probable words for output

  • Medium temperature values (0.4-0.7):

    • Balances creativity and coherence

    • Suitable for daily conversations, general writing

    • Recommended for chatbot conversations (around 0.5)

  • High temperature values (0.8-1.0):

    • Produces more creative and diverse outputs

    • Suitable for creative writing, brainstorming scenarios

    • But may reduce text coherence

Top P (Nucleus Sampling):

The default value is 1. The smaller the value, the more monotonous and easier to understand the AI-generated content; the larger the value, the wider and more diverse the range of vocabulary in the AI's response.

Nucleus sampling affects the output by controlling the probability threshold for vocabulary selection:

  • Smaller values (0.1-0.3):

    • Only considers the highest probability vocabulary

    • Output is more conservative, more controllable

    • Suitable for scenarios like code comments, technical documentation

  • Medium values (0.4-0.6):

    • Balances vocabulary diversity and accuracy

    • Suitable for general conversation and writing tasks

  • Larger values (0.7-1.0):

    • Considers a wider range of vocabulary choices

    • Produces richer and more diverse content

    • Suitable for creative writing and other scenarios requiring diverse expression

  • These two parameters can be used independently or in combination.

  • Choose appropriate parameter values based on the specific task type.

  • It is recommended to experiment to find the most suitable parameter combination for a particular application scenario.

  • The above content is for reference and conceptual understanding only; the given parameter ranges may not be suitable for all models. Please refer to the model's documentation for specific parameter recommendations.

Context Window

The number of messages to retain in the context. A larger value means a longer context and consumes more tokens:

  • 5-10: Suitable for general conversations

  • >10: For complex tasks requiring longer memory (e.g., tasks that generate long texts step-by-step according to a writing outline, needing to ensure logical coherence of the generated context)

  • Note: More messages consume more tokens

Enable Message Length Limit (MaxToken)

The maximum number of Tokens for a single response. In large language models, max token is a key parameter that directly affects the quality and length of the model's generated response.

For example: When testing if a model is connected after filling in the key in CherryStudio, you only need to know if the model returns a message correctly, not specific content. In this case, setting MaxToken to 1 is sufficient.

The MaxToken limit for most models is 32k Tokens, but some have 64k or even more. Specific details should be checked on the corresponding introduction page.

The exact setting depends on your needs, but you can also refer to the suggestions below.

Stream Output

Stream output is a data processing method that allows data to be transmitted and processed in a continuous stream, rather than sending all data at once. This method allows data to be processed and output immediately after it is generated, greatly improving real-time performance and efficiency.

In environments like the CherryStudio client, it essentially means a typewriter effect.

When disabled (non-streaming): The model generates the information and outputs the entire segment at once (imagine receiving a message on WeChat);

When enabled: Outputs character by character. You can think of it as the large model sending you each character immediately after it's generated, until all are sent.

If certain special models do not support streaming output, this switch needs to be turned off, such as o1-mini and others that initially only supported non-streaming.

Custom Parameters

Adds extra request parameters to the request body, such as presence_penalty fields. Most people generally don't need this.

The top-p, maxtokens, stream, and other parameters mentioned above are some of these parameters.

Fill-in method: Parameter Name — Parameter Type (text, number, etc.) — Value. Reference documentation: Click to go

Each model provider may have its own unique parameters, and you will need to find usage instructions in the provider's documentation.

  • Custom parameters take precedence over built-in parameters. That is, if a custom parameter duplicates a built-in parameter, the custom parameter will overwrite the built-in parameter.

For example: If model is set to gpt-4o in custom parameters, then gpt-4o will be used for conversations regardless of which model is selected.

  • Using the setting Parameter Name:undefined can exclude a parameter.

Last updated

Was this helpful?