Chat Interface

Assistants and Topics

Assistants

An Assistant is a way to use the selected model with personalized settings, such as prompt presets and parameter presets. These settings help the selected model better match the work you expect.

System Default Assistant: comes with a relatively general set of preset parameters (no prompt). You can use it directly, or go to the Agent page to find a preset that fits your needs.

Topic

An Assistant is the superset of its Topics. Under a single assistant, multiple topics (i.e., conversations) can be created, and all topics share the assistant's parameter settings, preset prompts, and other model settings.
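The relationship between an assistant and its topics can be sketched as a small data model. This is only an illustration, not how CherryStudio stores its data: the class names `Assistant` and `Topic` and all field names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Assistant:
    """Hypothetical model: the assistant owns the settings its topics share."""
    name: str
    prompt: str = ""
    temperature: float = 0.7
    topics: List["Topic"] = field(default_factory=list)

    def new_topic(self, title: str) -> "Topic":
        topic = Topic(title=title, assistant=self)
        self.topics.append(topic)
        return topic


@dataclass
class Topic:
    """A conversation: it stores messages but has no settings of its own."""
    title: str
    # repr/compare are disabled to avoid recursing back into the assistant.
    assistant: Assistant = field(repr=False, compare=False)
    messages: List[str] = field(default_factory=list)

    @property
    def prompt(self) -> str:
        # Settings are read through the parent assistant,
        # so every topic under it sees the same values.
        return self.assistant.prompt
```

Because each topic reads its settings through the assistant, changing the assistant's prompt in this sketch changes it for every topic at once.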

Buttons in the chat box

New Topic: Create a new topic within the current assistant.

Upload images or documents: Uploading images requires model support. Uploaded documents are automatically parsed into text and provided to the model as context.

Web Search: You must first configure web search in Settings. Search results are passed to the large model as context; see Online mode.

Knowledge Base: Enable the knowledge base; see Knowledge Base Tutorial.

MCP Server: Enable the MCP server feature; see MCP Usage Tutorial.

Generate Images: Shown only when the selected conversation model supports image generation. (For non-conversational image-generation models, go to Drawing.)

Select Model: Switch to the specified model for the rest of the conversation while keeping the context.

Quick Phrases: Preset commonly used phrases in Settings first, then call them here to insert them directly; variables are supported.

Clear Messages: Delete all content under this topic.

Expand: Enlarge the chat box so you can enter long text.

Clear Context: Truncate the context the model can access without deleting content; in other words, the model will “forget” the previous conversation content.

Estimated Token Count: Displays the estimated token count. The four values are the current context count, the maximum context count (∞ means unlimited context), the character count of the message in the current input box, and the estimated token count.

This feature is only for estimating token count. The actual token count differs for each model; please refer to the data provided by the model vendor.

Translate: Translate the content currently in the input box into English.
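To illustrate why the Estimated Token Count button shows only an estimate, here is a minimal sketch of a character-based heuristic. The function name `estimate_tokens` and the ratio of roughly four characters per token are assumptions for illustration; this is not CherryStudio's actual estimator, and real tokenizers differ per model.

```python
def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough estimate only: assumes about `chars_per_token` characters per token."""
    if not text:
        return 0
    # Any non-empty input counts as at least one token.
    return max(1, round(len(text) / chars_per_token))
```

A heuristic like this cannot account for language, punctuation, or the model's vocabulary, which is why the vendor's reported usage is the figure to trust.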

Conversation Settings

Model Settings

The parameters here are synced with the Model Settings in the assistant settings; see Assistant Settings.

In conversation settings, only the model settings apply to the current assistant; all other settings apply globally. For example: if the message style is set to bubble, then any topic under any assistant will use the bubble style.

Message Settings

Message Divider Line:

Use a divider line to separate the message body from the action bar.

Use serif font:

Toggles the font style. You can also change the font via custom CSS.

Show code line numbers:

Show line numbers for code blocks when the model outputs code snippets.

Code blocks collapsible:

When enabled, long code blocks are automatically collapsed.

Code blocks wrap lines:

When enabled, long single-line code snippets (exceeding the window) will automatically wrap.

Automatically collapse thought content:

When enabled, models that support thinking will automatically collapse the thinking process after it is complete.

Message Style:

You can switch the conversation interface to bubble style or list style.

Code Style:

You can switch the display style of code snippets.

Mathematical formula engine:

  • KaTeX renders faster because it is designed with performance as a priority;

  • MathJax renders more slowly, but it is more comprehensive and supports more mathematical symbols and commands.

Message font size:

Adjust the font size of the conversation interface.

Input Settings

Show estimated token count:

Show the estimated token cost of the input text in the input box (not the actual context token usage; for reference only).

Paste long text as file:

When you copy a long section of text from elsewhere and paste it into the input box, it is automatically displayed as a file to reduce interference with later input.

Render input messages with Markdown:

When turned off, only the model's replies are rendered; the messages you send are not rendered.

Press space 3 times quickly to translate:

After entering a message in the conversation interface input box, pressing space three times in a row can translate the input content into English.

Target language:

Set the target language for the input box translation button and the quick triple-space translation.

Assistant Settings

In the assistant interface, right-click the assistant name, then select the corresponding setting from the context menu.

Edit Assistant

Assistant settings apply to all topics under this assistant.

Prompt Settings

Name:

You can customize an assistant name that is easy to identify.

Prompt:

The prompt for this assistant. You can refer to the prompts on the Agent page as a guide to writing style when editing the content.

Model Settings

Default Model:

You can assign a fixed default model to this assistant. When the assistant is added from the Agent page or duplicated, its initial model will be this model. If this item is not set, the initial model will be the global initial model (i.e., the Default assistant model).

An assistant can draw on two default models: the global default chat model and the assistant's own default model. The assistant's default model takes priority over the global default chat model. When the assistant's default model is not set, the global default chat model is used.

Auto Reset Model:

When enabled: if you switch to another model while using a topic under this assistant, a newly created topic is reset to the assistant's default model. When disabled: the new topic follows the model used in the previous topic.

For example, if the assistant's default model is gpt-3.5-turbo, I create Topic 1 under this assistant, and during the conversation in Topic 1 I switch to gpt-4o. At this point:

If auto reset is enabled: when creating Topic 2, the default model selected for Topic 2 is gpt-3.5-turbo;

If auto reset is not enabled: when creating Topic 2, the default model selected for Topic 2 is gpt-4o.
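The priority and reset rules above can be expressed as a short sketch. The function names and parameters are hypothetical and only mirror the behavior described in this section; the fallback when there is no previous topic is an assumption.

```python
from typing import Optional


def resolve_default_model(assistant_default: Optional[str], global_default: str) -> str:
    # The assistant's default model takes priority; otherwise use the global one.
    return assistant_default or global_default


def model_for_new_topic(
    assistant_default: Optional[str],
    global_default: str,
    previous_topic_model: Optional[str],
    auto_reset: bool,
) -> str:
    default = resolve_default_model(assistant_default, global_default)
    # Assumption: with no previous topic there is nothing to follow, so use the default.
    if auto_reset or previous_topic_model is None:
        return default
    return previous_topic_model
```

With the example above (assistant default `gpt-3.5-turbo`, Topic 1 switched to `gpt-4o`), `auto_reset=True` selects `gpt-3.5-turbo` for Topic 2 and `auto_reset=False` selects `gpt-4o`.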

Temperature:

The temperature parameter controls the randomness and creativity of the text generated by the model (default value is 0.7). Specifically:

  • Low temperature value (0-0.3):

    • More deterministic and focused output

    • Suitable for scenarios requiring accuracy, such as code generation and data analysis

    • Tends to choose the most likely words for output

  • Medium temperature value (0.4-0.7):

    • Balances creativity and coherence

    • Suitable for everyday conversations and general writing

    • Recommended for chatbot conversations (around 0.5)

  • High temperature value (0.8-1.0):

    • Produces more creative and diverse output

    • Suitable for creative writing, brainstorming, and similar scenarios

    • May reduce textual coherence
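As a rough illustration of why lower temperatures give more deterministic output, the sketch below applies temperature scaling to a softmax over made-up scores. The function name and the example logits are assumptions; real models work over vocabularies of many thousands of tokens, and providers commonly treat a temperature of 0 as "always pick the most likely token", which this sketch does not model.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities, sharpened or flattened by temperature."""
    if temperature <= 0:
        raise ValueError("this sketch only handles temperature > 0")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.5]  # made-up scores for three candidate words
low = softmax_with_temperature(logits, 0.2)   # probability mass concentrates on the top word
high = softmax_with_temperature(logits, 1.0)  # mass is spread across more words
```

Comparing `low` and `high` shows the top candidate taking a larger share of the probability at the lower temperature, which matches the "more deterministic and focused" behavior described above.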

Top P (nucleus sampling):

The default value is 1. The smaller the value, the more monotonous the AI-generated content and the easier it is to understand; the larger the value, the broader the vocabulary range in the AI's replies, and the more diverse it is.

Nucleus sampling affects the output by controlling the probability threshold for word selection:

  • Smaller values (0.1-0.3):

    • Only consider words with the highest probability

    • More conservative and controllable output

    • Suitable for code comments, technical documentation, and similar scenarios

  • Medium values (0.4-0.6):

    • Balance vocabulary diversity and accuracy

    • Suitable for general conversation and writing tasks

  • Larger values (0.7-1.0):

    • Consider a wider range of word choices

    • Generate richer and more diverse content

    • Suitable for creative writing and other scenarios that require diverse expression

  • These two parameters can be used independently or in combination

  • Choose appropriate parameter values according to the specific task type

  • It is recommended to find the most suitable parameter combination for a specific application scenario through experimentation

  • The above content is for reference only to help understand the concepts. The given parameter ranges may not be suitable for all models; please refer to the parameter recommendations in the relevant model documentation.
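The probability threshold behind Top P can be sketched as follows. This is a simplified illustration of nucleus sampling's filtering step, not any particular provider's implementation; the function name and the example probabilities are made up.

```python
def nucleus_filter(probs, top_p):
    """Keep the most likely tokens until their cumulative probability
    reaches top_p, then renormalize the kept probabilities."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    kept = []
    cumulative = 0.0
    for token, p in ranked:
        kept.append((token, p))
        cumulative += p
        if cumulative >= top_p:
            break
    total = sum(p for _, p in kept)
    return {token: p / total for token, p in kept}


probs = {"the": 0.5, "a": 0.3, "an": 0.15, "zebra": 0.05}
narrow = nucleus_filter(probs, 0.5)  # only the most likely word survives
wide = nucleus_filter(probs, 1.0)    # every word stays available
```

The model then samples only from the surviving tokens, so a smaller Top P narrows the vocabulary and a larger one widens it.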

Context Window:

The number of messages retained in the context; the larger the value, the longer the context and the more tokens consumed:

  • 5-10: Suitable for ordinary conversations

  • >10: Complex tasks that require longer memory (for example: tasks that generate long articles step by step according to an outline, where it is necessary to ensure coherent context logic in the generated content)

  • Note: The more messages, the greater the token consumption
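A minimal sketch of how a message-count context window might select what is sent to the model, combined with the "Clear Context" button described earlier. The function name and the `cleared_at` index are hypothetical; CherryStudio's actual logic may differ.

```python
def build_context(messages, context_window, cleared_at=0):
    """Return the messages that would be sent to the model.

    messages: the full topic history (never modified here)
    context_window: how many of the most recent messages to keep
    cleared_at: hypothetical index recorded by "Clear Context";
                messages before it are ignored but not deleted
    """
    if context_window <= 0:
        return []
    visible = messages[cleared_at:]
    return visible[-context_window:]


history = [f"m{i}" for i in range(12)]
recent = build_context(history, 5)                      # the last five messages
after_clear = build_context(history, 5, cleared_at=10)  # only messages after the clear point
```

The history itself is untouched in both cases, which is the difference between limiting or clearing context and clearing messages.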

Enable message length limit (MaxToken)

The maximum number of tokens per response. In large language models, max tokens is a key parameter that directly affects the length and quality of the model's generated response.

For example: in CherryStudio, when testing whether a model is connected after filling in the key, you only need to know whether the model returns a message correctly, not what the content is; in this case, setting MaxToken to 1 is enough.

The MaxToken upper limit varies by model: for most models it is 32k tokens, and some allow 64k or more. Check the model's introduction page for details.

How much to set depends on your own needs, and you can also refer to the following suggestions.

Streaming Output (Stream)

Streaming output is a data processing method that allows data to be transmitted and processed as a continuous stream rather than sending all data at once. This method allows data to be processed and output immediately after generation, greatly improving real-time performance and efficiency.

In environments such as the CherryStudio client, simply put, it is the typewriter effect.

When turned off (non-streaming): the model outputs the entire generated information at once after it is finished (imagine receiving a message in WeChat);

When turned on: output character by character; you can understand it as the large model sending each character to you immediately after generating it, until everything is sent.

Some models do not support streaming output, and for them you need to turn this switch off; for example, o1-mini initially supported only non-streaming output.
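The difference between the two modes can be sketched with a generator. This is a toy stand-in, not a real API client: `generate_chunks` simulates a model producing output piece by piece, and real services typically stream tokens or small text fragments rather than single characters.

```python
from typing import Iterator


def generate_chunks(text: str, chunk_size: int = 1) -> Iterator[str]:
    """Stand-in for a model that produces its output piece by piece."""
    for i in range(0, len(text), chunk_size):
        yield text[i:i + chunk_size]


def reply(text: str, stream: bool) -> Iterator[str]:
    if stream:
        # Streaming: hand over each chunk as soon as it is produced.
        yield from generate_chunks(text)
    else:
        # Non-streaming: wait for the whole output, then deliver it once.
        yield "".join(generate_chunks(text))
```

Iterating over `reply("Hi!", stream=True)` yields three separate pieces, while `stream=False` yields the full text in a single piece.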

Custom Parameters

Add extra parameters to the request body, such as presence_penalty and other fields; most people do not need them.

The above parameters such as top-p, maxtokens, stream, etc. are among these parameters.

Format: parameter name, parameter type (text, number, etc.), value. Reference documentation: Click to go

Most model providers have some parameters of their own, and you need to look up how to use them in the provider's documentation.

  • Custom parameters have higher priority than built-in parameters. That is, if a custom parameter duplicates a built-in parameter, the custom parameter will override the built-in parameter.

For example: if you set model to gpt-4o in custom parameters, then gpt-4o is used no matter which model is selected in the conversation.

  • To exclude a parameter, set it as parameter name: undefined.
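The override and exclusion rules can be sketched as a simple merge. The function name `build_request_body` and the `UNDEFINED` sentinel are hypothetical stand-ins for illustration, and the built-in values shown are examples only.

```python
UNDEFINED = object()  # hypothetical sentinel standing in for the "undefined" value


def build_request_body(built_in: dict, custom: dict) -> dict:
    """Merge custom parameters over built-in ones without mutating either input."""
    body = dict(built_in)
    for name, value in custom.items():
        if value is UNDEFINED:
            body.pop(name, None)  # exclude the parameter from the request
        else:
            body[name] = value    # the custom value overrides the built-in one
    return body


built_in = {"model": "gpt-3.5-turbo", "temperature": 0.7, "top_p": 1, "stream": True}
custom = {"model": "gpt-4o", "presence_penalty": 0.5, "top_p": UNDEFINED}
body = build_request_body(built_in, custom)
```

In this sketch the resulting body uses `gpt-4o`, carries the extra `presence_penalty` field, and omits `top_p` entirely.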
