Chat Interface

Assistants and Topics

Assistant

An assistant allows for personalized settings for a chosen model, such as preset prompts and parameter presets. These settings help the selected model work more in line with your expectations.

The System Default Assistant comes with a relatively general set of parameters (no prompt). You can use it directly or find the presets you need on the Agents page.

Topic

An assistant sits one level above a topic: multiple topics (i.e., conversations) can be created under a single assistant, and all of them share the assistant's settings, such as the prompt and model parameters.

Buttons in the Chatbox

New Topic creates a new topic within the current assistant.

Upload Image or Document. Uploading images requires model support. Uploading documents will automatically parse them into text to be provided to the model as context.

Web Search requires configuring web search-related information in the settings. Search results are returned to the large model as context. See Web Search Mode for details.

Knowledge Base enables the knowledge base feature. See Knowledge Base Tutorial for details.

MCP Server enables the MCP server feature. See MCP Usage Tutorial for details.

Generate Image is displayed only when the selected chat model supports image generation. (For non-chat image generation models, please go to Drawing).

Select Model switches to the specified model for the subsequent conversation, while retaining the context.

Quick Phrases requires presetting common phrases in the settings. You can invoke them here to insert them directly into the input box; variables are supported.

Clear Messages deletes all content in this topic.

Expand enlarges the chatbox for entering long texts.

Clear Context truncates the context available to the model without deleting content, meaning the model will "forget" previous conversation content.

Estimated Token Count displays the estimated token count. The four values are Current Context Count, Maximum Context Count (∞ means infinite context), Current Input Box Message Character Count, and Estimated Token Count.

This feature only estimates the token count. The actual count varies by model; refer to the data provided by your model provider.
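As a rough illustration of why this can only ever be an estimate, a common heuristic approximates one token as about four characters of English text. This is a hypothetical sketch, not the tokenizer any particular model actually uses:

```python
def estimate_tokens(text: str) -> int:
    # Rough heuristic: ~4 characters per token for English text.
    # Real tokenizers (e.g. tiktoken for OpenAI models) give exact counts
    # and differ between models, which is why the UI value is an estimate.
    return max(1, len(text) // 4)

print(estimate_tokens("Hello, how are you today?"))  # 25 chars -> 6
```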

Translate translates the content of the current input box into the configured target language.

Chat Settings

Model Settings

Model settings are synchronized with the Model Settings in the Assistant settings. See Assistant Settings.

In chat settings, only the model settings apply to the current assistant. Other settings apply globally. For example, if you set the message style to speech bubble, it will be a speech bubble style in any topic of any assistant.

Message Settings

Message Separator:

Use a separator to distinguish the message body from the action bar.

Use Serif Font:

Font style switching. You can also change the font via Custom CSS.

Display Line Numbers for Code:

Displays line numbers for code blocks when the model outputs code snippets.

Collapsible Code Blocks:

When enabled, code blocks will automatically collapse if the code snippet is too long.

Code Block Word Wrap:

When enabled, single lines of code within code snippets will automatically wrap if they exceed the window width.

Auto-collapse Thinking Content:

When enabled, models that support "thinking" will automatically collapse the thinking process after completion.

Message Style:

Can switch the chat interface to a bubble style or list style.

Code Style:

Can switch the display style of code snippets.

Math Formula Engine:

  • KaTeX renders faster because it is specifically designed for performance optimization;

  • MathJax renders slower but is more comprehensive, supporting more mathematical symbols and commands.

Message Font Size:

Adjusts the font size of the chat interface.

Input Settings

Show Estimated Token Count:

Displays the estimated number of tokens consumed by the input text in the input box (not the actual context consumption, for reference only).

Paste Long Text as File:

When copying and pasting a long passage of text from elsewhere into the input box, it will automatically appear as a file, reducing interference when entering subsequent content.

Markdown Render Input Messages:

When disabled, only model replies are rendered as Markdown, not the messages you send.

Translate by Tapping Space 3 Times:

After entering a message in the chat interface input box, tapping the space bar three times in a row translates the input content into the configured target language.

Target Language:

Sets the target language for the input box translation button and the "Translate by tapping space 3 times" feature.

Assistant Settings

In the assistant interface, right-click the name of the assistant you want to configure and choose the corresponding setting from the menu.

Edit Assistant

Assistant settings apply to all topics under that assistant.

Prompt Settings

Name:

Customizable assistant name for easy identification.

Prompt:

The system prompt for this assistant. You can refer to the prompt-writing style on the Agents page when editing the content.

Model Settings

Default Model:

You can set a fixed default model for this assistant. When adding the assistant from the Agents page or copying it, its initial model will be this one. If this item is not set, the initial model will be the global default (i.e., the Default Assistant Model).

There are two types of default models for assistants: the Global Default Chat Model and the assistant's own default model. The assistant's default model takes precedence; if it is not set, the assistant uses the global default chat model.

Auto-reset Model:

When enabled: if you switch to another model during use within a topic, creating a new topic resets the new topic's model to the assistant's default model. When disabled, a new topic inherits the model used in the previous topic.

For example, if the assistant's default model is gpt-3.5-turbo, and I create Topic 1 under this assistant, then switch to gpt-4o during the conversation in Topic 1:

If auto-reset is enabled: when creating Topic 2, Topic 2 will default to gpt-3.5-turbo.

If auto-reset is not enabled: when creating Topic 2, Topic 2 will default to gpt-4o.

Temperature:

The temperature parameter controls the degree of randomness and creativity in the text generated by the model (default value is 0.7). Specifically:

  • Low temperature values (0-0.3):

    • Output is more deterministic and focused.

    • Suitable for tasks requiring accuracy, such as code generation and data analysis.

    • Tends to select the most probable words for output.

  • Medium temperature values (0.4-0.7):

    • Balances creativity and coherence.

    • Suitable for daily conversations and general writing.

    • Recommended for chatbot conversations (around 0.5).

  • High temperature values (0.8-1.0):

    • Produces more creative and diverse outputs.

    • Suitable for creative writing, brainstorming, and similar scenarios.

    • May reduce the coherence of the text.
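Conceptually, temperature rescales the model's token probabilities before sampling. The sketch below uses plain Python and made-up logits purely for illustration; it shows how a low temperature sharpens the distribution toward the most likely token while a high one flattens it:

```python
import math

def softmax_with_temperature(logits, temperature):
    # Divide logits by the temperature, then apply softmax.
    # Low temperature sharpens the distribution (more deterministic);
    # high temperature flattens it (more diverse sampling).
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]  # illustrative scores for three candidate tokens
low = softmax_with_temperature(logits, 0.2)   # near-deterministic
high = softmax_with_temperature(logits, 1.0)  # more spread out
print(round(low[0], 3), round(high[0], 3))  # 0.993 0.659
```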

Top P (Nucleus Sampling):

The default value is 1. Smaller values make the AI-generated content more monotonous and easier to understand; larger values widen the range of vocabulary the AI can draw from, making output more diverse.

Nucleus sampling influences the output by controlling the probability threshold for word selection:

  • Smaller values (0.1-0.3):

    • Considers only the highest probability words.

    • Output is more conservative and controlled.

    • Suitable for code comments, technical documentation, etc.

  • Medium values (0.4-0.6):

    • Balances vocabulary diversity and accuracy.

    • Suitable for general conversations and writing tasks.

  • Larger values (0.7-1.0):

    • Considers a wider range of vocabulary choices.

    • Produces richer and more diverse content.

    • Suitable for creative writing and other scenarios requiring diverse expression.

  • These two parameters can be used independently or in combination.

  • Choose appropriate parameter values based on the specific task type.

  • It is recommended to experiment to find the optimal parameter combination for a particular application scenario.

  • The above content is for reference and conceptual understanding only; the given parameter ranges may not be suitable for all models. Please refer to the model's documentation for specific parameter recommendations.
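To make the "probability threshold" idea concrete, here is a minimal sketch of nucleus sampling's candidate selection: tokens are ranked by probability and kept until their cumulative probability reaches top_p. The probabilities are illustrative, not any real model's distribution:

```python
def nucleus_filter(probs, top_p):
    # Rank token indices by probability, then keep the smallest set
    # whose cumulative probability reaches top_p. Sampling would then
    # happen only among the kept candidates.
    ranked = sorted(enumerate(probs), key=lambda kv: kv[1], reverse=True)
    kept, cum = [], 0.0
    for idx, p in ranked:
        kept.append(idx)
        cum += p
        if cum >= top_p:
            break
    return kept

probs = [0.5, 0.3, 0.15, 0.05]
print(nucleus_filter(probs, 0.75))  # [0, 1]: only the top two tokens survive
print(nucleus_filter(probs, 1.0))   # [0, 1, 2, 3]: full vocabulary considered
```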

Context Window

The number of messages to retain in context. A larger value means longer context and consumes more tokens:

  • 5-10: Suitable for general conversations.

  • More than 10: For complex tasks requiring longer memory (e.g., generating long texts step-by-step from an outline, where logical coherence across the generated context is needed).

  • Note: More messages mean higher token consumption.
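A context-window setting like this can be thought of as trimming the history to the most recent N messages before sending the request. A minimal sketch of that idea (assumed behavior for illustration, not Cherry Studio's actual implementation):

```python
def apply_context_window(messages, window):
    # Keep only the most recent `window` messages; older ones are dropped
    # from the request, so the model no longer "sees" them and fewer
    # tokens are consumed per request.
    return messages[-window:] if window > 0 else list(messages)

history = [{"role": "user", "content": f"message {i}"} for i in range(20)]
trimmed = apply_context_window(history, 5)
print(len(trimmed), trimmed[0]["content"])  # 5 message 15
```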

Enable Message Length Limit (MaxToken)

The maximum token count for a single response. In large language models, max_tokens is a key parameter that directly affects the quality and length of the model's reply.

For example: when testing whether a model is connected in CherryStudio after entering your key, you only need to know that the model returns a message correctly, not its specific content. Setting MaxToken to 1 is sufficient in this case.

Most models cap MaxToken at 32k tokens, but some allow 64k or even more; check the corresponding model's introduction page for details.

The specific setting depends on your needs, but you can also refer to the suggestions below.
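The connectivity-check trick above can be sketched as a request body for an OpenAI-compatible API. The model name here is an assumption for illustration; use any model available under your key:

```python
# A minimal connectivity probe: max_tokens=1 means you pay for and wait on
# only a single generated token, which is enough to confirm the model
# responds at all.
probe = {
    "model": "gpt-4o-mini",  # hypothetical; any available model works
    "messages": [{"role": "user", "content": "hi"}],
    "max_tokens": 1,
}
print(probe["max_tokens"])  # 1
```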

Stream Output (Stream)

Stream output is a data processing method that allows data to be transmitted and processed in a continuous stream, rather than sending all data at once. This method allows data to be processed and output immediately after it is generated, greatly improving real-time performance and efficiency.

In environments like the CherryStudio client, it simply means a "typewriter effect".

When off (non-streaming): The model generates the entire piece of information and outputs it all at once (imagine receiving a message on WeChat);

When on: Output character by character. This can be understood as the large model sending you each generated character immediately until all characters are sent.

If a specific model does not support stream output, this switch needs to be turned off. For example, o1-mini initially supported only non-streaming output.
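The character-by-character behavior can be simulated with a generator. This is only an analogy for the typewriter effect; a real streaming API delivers incremental chunks over server-sent events:

```python
def fake_stream(text):
    # Yield the reply one character at a time, like the typewriter effect.
    # Non-streaming would instead return `text` in one piece.
    for ch in text:
        yield ch

full_reply = "Hello there!"
streamed = "".join(fake_stream(full_reply))  # consume the stream as it arrives
print(streamed == full_reply)  # True: same content, delivered incrementally
```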

Custom Parameters

Adds additional request parameters to the request body, such as presence_penalty, etc. Most people generally do not need to use this.

The top_p, max_tokens, and stream parameters mentioned above are examples of such parameters.

Filling format: Parameter Name — Parameter Type (text, number, etc.) — Value. See the provider's reference documentation for details.

Each model provider has its own unique parameters, more or less. You need to find the usage method in the provider's documentation.

  • Custom parameters take precedence over built-in parameters. That is, if a custom parameter duplicates a built-in parameter, the custom parameter will override the built-in parameter.

For example: if model is set to gpt-4o in custom parameters, then gpt-4o will be used in the conversation regardless of which model is selected.

  • Setting a parameter's value to undefined excludes that parameter from the request.
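The override and exclusion rules above can be sketched as a dictionary merge. This is assumed behavior inferred from the description, not Cherry Studio's actual code; the UNDEFINED sentinel stands in for the UI's "undefined" value:

```python
UNDEFINED = object()  # stands in for the UI's "undefined" value

def merge_params(built_in, custom):
    # Custom parameters override built-in ones with the same name;
    # a value of UNDEFINED removes the parameter from the request entirely.
    merged = dict(built_in)
    for key, value in custom.items():
        if value is UNDEFINED:
            merged.pop(key, None)
        else:
            merged[key] = value
    return merged

built_in = {"model": "gpt-3.5-turbo", "temperature": 0.7, "top_p": 1}
custom = {"model": "gpt-4o", "top_p": UNDEFINED, "presence_penalty": 0.5}
print(merge_params(built_in, custom))
# {'model': 'gpt-4o', 'temperature': 0.7, 'presence_penalty': 0.5}
```

Note how the custom model gpt-4o overrides the selected one, matching the documented precedence, and top_p disappears from the request.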
