Conversation Interface

Assistant and Topic

Assistant

An assistant applies personalized settings, such as prompt presets and parameter presets, to the selected model so that the model better matches your expected work.

The System Default Assistant has fairly general preset parameters (and no prompt). You can use it directly, or go to the Agent page to find a preset that suits your needs.

Topic

An assistant is a superset of topics: multiple topics (i.e., conversations) can be created under a single assistant, and all topics share the assistant's parameter settings, prompt, and other model settings.
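As a mental model only (this is not Cherry Studio's actual internal data structure), the relationship can be pictured like this:

```python
from dataclasses import dataclass, field

# Illustrative sketch only — names and fields are hypothetical.
@dataclass
class Topic:
    title: str
    messages: list = field(default_factory=list)   # one conversation

@dataclass
class Assistant:
    name: str
    prompt: str                                    # shared by every topic below
    model_params: dict = field(default_factory=dict)  # e.g. temperature, top_p
    topics: list = field(default_factory=list)     # many topics per assistant
```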

Buttons in the chat box

New Topic: Create a new topic within the current assistant.

Upload Image or Document: Uploading images requires a model that supports image input. Uploaded documents are automatically parsed into text and provided to the model as context.

Web Search: You must first configure web search in Settings. Search results are returned to the model as context. For details, see Online Mode.

Knowledge Base: Enable the knowledge base. For details, see the Knowledge Base Tutorial.

MCP Server: Enable the MCP server feature. For details, see the MCP Usage Tutorial.

Generate Image: Shown only when the selected chat model supports image generation. (For non-chat image-generation models, go to the Drawing page.)

Select Model: Switch to the specified model for the rest of the conversation while retaining the context.

Quick Phrases: Preset commonly used phrases in Settings first, then invoke them here to input them directly. Variables are supported.

Clear Messages: Delete all content under this topic.

Expand: Enlarge the chat box for long-text input.

Clear Context: Truncate the context available to the model without deleting any content; in other words, the model will “forget” previous conversation content.

Estimated Token Count: Displays estimated token counts. The four figures are, in order: current context count, maximum context count (∞ means unlimited context), word count in the current input box, and estimated token count.

This feature only estimates the token count; the actual count varies by model, so refer to your model provider's data. A sketch of local token estimation follows this list.

Translate: Translate the content of the current input box into English.
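Cherry Studio's own estimator is not documented here, but the sketch below illustrates why local estimates and provider counts differ: each model's tokenizer splits text differently. It uses the tiktoken library with an assumed encoding:

```python
import tiktoken  # pip install tiktoken

enc = tiktoken.get_encoding("cl100k_base")  # assumed encoding; models vary
text = "How many tokens does this sentence use?"
print(len(enc.encode(text)))  # an estimate; the provider's count is authoritative
```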

Conversation Settings

Model Settings

The model settings here are synchronized with the parameters in the assistant's Model Settings. For details, see Assistant Settings.

In Conversation Settings, only the model settings apply to the current assistant; all other settings apply globally. For example, after setting the message style to bubbles, the bubble style is used under every assistant and every topic.

Message Settings

Message Divider:

Use a divider to separate the message body from the action bar.

Use serif font:

Switch font style. You can now also use custom CSS to change the font.

Show line numbers in code:

Display line numbers for code blocks when the model outputs code snippets.

Collapsible code blocks:

When enabled, if the code in a snippet is long, the code block will be automatically collapsed.

Wrappable code blocks:

When enabled, lines in a code snippet that exceed the window width will automatically wrap.

Auto-collapse thinking content:

When enabled, models that support thinking will automatically collapse the thinking process after completion.

Message Style:

You can switch the conversation interface to bubble style or list style.

Code Style:

You can switch the display style of code snippets.

Math Formula Engine:

  • KaTeX renders faster because it is specifically designed for performance optimization;

  • MathJax renders more slowly but is more feature-complete and supports more mathematical symbols and commands.

Message Font Size:

Adjust the font size in the conversation interface.

Input Settings

Show Estimated Token Count:

Display the estimated number of tokens the text in the input box will consume (this is not the actual token count consumed together with context; for reference only).

Paste long text as file:

When you paste long text copied from elsewhere into the input box, it is automatically displayed as a file-style attachment to reduce interference with subsequent input.

Render input messages with Markdown:

When off, only the model’s reply messages are rendered, and sent messages are not rendered.

Translate by hitting space 3 times quickly:

After entering a message in the conversation input box, hitting the spacebar three times in a row will translate the input into English.

Target Language:

Set the target language for the input box translation button and the quick triple-space translation.

Assistant Settings

In the assistant interface, select the assistant name you want to configure → choose the corresponding setting in the context menu.

Edit Assistant

Assistant settings apply to all topics under that assistant.

Prompt Settings

Name:

You can customize an assistant name that is easy to recognize.

Prompt:

That is, the prompt text. You can refer to the prompt writing style on the Agent page when editing the content.

Model Settings

Default Model:

You can fix a default model for this assistant. When you add the assistant from the Agent page or copy it, the initial model will be this model. If not set, the initial model will be the global initial model (i.e., the Default Assistant Model).

An assistant has two kinds of default model: the global default conversation model and the assistant's own default model. The assistant's default model takes priority over the global default; when the assistant's default model is not set, it is equal to the global default conversation model.

Auto Reset Model:

When on: if you switch to another model during a conversation under this topic, newly created topics reset to the assistant's default model. When off: the model for a new topic follows the model used in the previous topic.

For example, if the assistant’s default model is gpt-3.5-turbo, and I create Topic 1 under this assistant, then switch to gpt-4o during the conversation in Topic 1, then:

If auto reset is on: when creating Topic 2, the default model selected for Topic 2 will be gpt-3.5-turbo;

If auto reset is off: when creating Topic 2, the default model selected for Topic 2 will be gpt-4o.

Temperature:

The temperature parameter controls the randomness and creativity of the model’s text generation (default is 0.7). Specifically:

  • Low temperature (0–0.3):

    • More deterministic and focused outputs

    • Suitable for scenarios requiring accuracy, such as code generation and data analysis

    • Tends to choose the most likely words

  • Medium temperature (0.4–0.7):

    • Balances creativity and coherence

    • Suitable for daily conversation and general writing

    • Recommended for chatbot conversations (around 0.5)

  • High temperature (0.8–1.0):

    • Produces more creative and diverse outputs

    • Suitable for creative writing and brainstorming

    • But may reduce textual coherence

Top P (Nucleus Sampling):

The default is 1. The smaller the value, the more monotonous but easier to understand the generated content; the larger the value, the wider and more diverse the vocabulary in the model's replies.

Nucleus sampling affects output by controlling the probability threshold for word selection:

  • Smaller values (0.1–0.3):

    • Consider only the highest-probability words

    • More conservative and controllable outputs

    • Suitable for scenarios like code comments and technical documentation

  • Medium values (0.4–0.6):

    • Balance vocabulary diversity and accuracy

    • Suitable for general conversations and writing tasks

  • Larger values (0.7–1.0):

    • Consider a broader range of word choices

    • Produce richer and more diverse content

    • Suitable for scenarios requiring diverse expression, such as creative writing

  • These two parameters can be used independently or in combination

  • Choose appropriate parameter values based on the specific task type

  • It is recommended to find the parameter combination most suitable for your application scenario through experimentation

  • The above is for reference and conceptual understanding only. The given parameter ranges may not suit every model; refer to the parameter recommendations in the relevant model's documentation.
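As a concrete illustration, with an OpenAI-compatible provider both parameters are plain fields in the chat-completions request body. A minimal sketch, where the endpoint, key, and model name are placeholders rather than Cherry Studio internals:

```python
import requests

API_URL = "https://api.example.com/v1/chat/completions"  # placeholder endpoint
API_KEY = "sk-..."                                       # placeholder key

resp = requests.post(
    API_URL,
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "model": "gpt-4o",
        "messages": [{"role": "user", "content": "Write a two-line poem."}],
        "temperature": 0.9,  # high temperature: more creative, less predictable
        "top_p": 0.95,       # consider a broad slice of the probability mass
    },
)
print(resp.json()["choices"][0]["message"]["content"])
```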

Context Window

The number of messages to keep in context. The larger the value, the longer the context and the more tokens consumed:

  • 5–10: suitable for normal conversations

  • >10: complex tasks that require longer memory (e.g., step-by-step long-form generation according to an outline, where you need to ensure logical coherence in generated context)

  • Note: The more messages, the greater the token consumption
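Conceptually, this setting just trims the message history before it is sent to the model; a minimal sketch (illustrative only, not Cherry Studio's actual implementation):

```python
def build_context(history: list[dict], window: int | None) -> list[dict]:
    """Keep only the most recent `window` messages; None means unlimited (∞)."""
    return history if window is None else history[-window:]

history = [
    {"role": "user", "content": "message 1"},
    {"role": "assistant", "content": "message 2"},
    {"role": "user", "content": "message 3"},
]
# With a context window of 2, only the last two messages are sent;
# everything earlier is "forgotten" by the model.
print(build_context(history, window=2))
```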

Enable message length limit (MaxToken)

The maximum number of tokens per response. In large language models, max tokens is a key parameter that directly affects the quality and length of generated answers.

For example: in CherryStudio, when testing whether a model is connected after filling in the key, you only need to know that the model returns a message correctly, not its specific content. In such cases, set MaxToken to 1.

Most models cap MaxToken at 32k tokens; some support 64k or even more. Check the corresponding model's introduction page for details.

What value to set depends on your needs; you can also refer to the following suggestions.
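The connectivity test described above can be illustrated against an OpenAI-compatible endpoint; a minimal sketch where the URL, key, and model name are placeholders:

```python
import requests

API_URL = "https://api.example.com/v1/chat/completions"  # placeholder endpoint
API_KEY = "sk-..."                                       # placeholder key

resp = requests.post(
    API_URL,
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "model": "gpt-4o",
        "messages": [{"role": "user", "content": "ping"}],
        "max_tokens": 1,  # we only care that a reply comes back at all
    },
)
# A 200 status with a well-formed body means the key and model are reachable.
print(resp.status_code, resp.json()["choices"][0]["finish_reason"])
```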

Streaming Output (Stream)

Streaming output is a data processing method that allows data to be transmitted and processed as a continuous stream, rather than sending all data at once. This allows data to be processed and output immediately upon generation, greatly improving real-time performance and efficiency.

In the CherryStudio client and similar environments, this is simply the typewriter effect.

When off (non-streaming): the model outputs the entire segment at once after finishing generating the message (imagine receiving a message in WeChat);

When on: output character by character; you can understand it as the large model sending each character to you as soon as it generates it, until it’s all sent.

Some special models do not support streaming output and require this to be turned off, such as o1-mini, which initially supported only non-streaming output.
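Under the hood, streaming on an OpenAI-compatible API is typically delivered as server-sent events. A minimal sketch of consuming such a stream (the endpoint, key, and model name are placeholders):

```python
import json
import requests

API_URL = "https://api.example.com/v1/chat/completions"  # placeholder endpoint
API_KEY = "sk-..."                                       # placeholder key

resp = requests.post(
    API_URL,
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "model": "gpt-4o",
        "messages": [{"role": "user", "content": "Tell me a short story."}],
        "stream": True,   # ask the server for incremental chunks
    },
    stream=True,          # let requests yield the body as it arrives
)
for line in resp.iter_lines():
    if not line.startswith(b"data: "):
        continue              # skip keep-alives and blank lines
    data = line[len(b"data: "):]
    if data == b"[DONE]":     # end-of-stream sentinel
        break
    delta = json.loads(data)["choices"][0]["delta"]
    print(delta.get("content") or "", end="", flush=True)  # the typewriter effect
```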

Custom Parameters

Add extra parameters to the request body, such as the presence_penalty field; most users generally won't need these.

Parameters described above, such as top_p, max_tokens, and stream, are themselves request-body parameters of this kind.

Filling format: parameter name — parameter type (text, number, etc.) — value. Reference documentation: Click to go

Each model provider has its own unique parameters; look in the provider's documentation for how to use them.

  • Custom parameters have higher priority than built-in parameters. That is, if a custom parameter duplicates a built-in parameter, the custom parameter will override the built-in parameter.

For example: if you set model to gpt-4o in the custom parameters, then no matter which model you select in the conversation, the gpt-4o model will be used.

  • To exclude a parameter, fill in its name and set the value to undefined.
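A minimal sketch of how such a merge could behave; the dictionaries and the None-filtering are illustrative, not Cherry Studio's actual code:

```python
built_in = {
    "model": "gpt-3.5-turbo",   # the model selected in the conversation
    "temperature": 0.7,
    "stream": True,
}
custom = {
    "presence_penalty": 0.5,    # an extra field most providers accept
    "model": "gpt-4o",          # duplicates a built-in key, so it overrides it
    "top_p": None,              # "undefined" in the UI: drop the parameter
}
merged = {**built_in, **custom}                        # custom wins on conflicts
payload = {k: v for k, v in merged.items() if v is not None}
print(payload)
# {'model': 'gpt-4o', 'temperature': 0.7, 'stream': True, 'presence_penalty': 0.5}
```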
