Chat Interface

Assistants and Topics

Assistants

An Assistant allows you to personalize the selected model with settings such as prompt presets and parameter presets, making the model work more in line with your expectations.

The System Default Assistant provides a general parameter preset (no prompt). You can use it directly or find the preset you need on the Agents Page.

Topics

An Assistant is the parent set of Topics. Multiple topics (i.e., conversations) can be created under a single assistant. All Topics share the Assistant's parameter settings and prompt presets, among other model settings.

Buttons in the Chatbox

New Topic creates a new topic within the current assistant.

Upload Image or Document requires model support for image uploads. Uploading documents will automatically parse them into text to be provided as context to the model.

Web Search requires configuring web search related information in the settings. Search results are returned to the large model as context. For details, see Web Search Mode.

Knowledge Base enables the knowledge base feature. For details, see Knowledge Base Tutorial.

MCP Server enables the MCP server feature. For details, see MCP Usage Tutorial.

Generate Image is not displayed by default. For models that support image generation (e.g., Gemini), it needs to be manually enabled before images can be generated.

Due to technical reasons, you must manually enable the button to generate images. This button will be removed after this feature is optimized.

Select Model switches to the specified model for subsequent conversations while preserving the context.

Quick Phrases requires pre-setting common phrases in the settings to be called here, allowing direct input and supporting variables.

Clear Messages deletes all content under this topic.

Expand makes the chatbox larger for entering longer texts.

Clear Context truncates the context available to the model without deleting content, meaning the model will "forget" previous conversation content.

Estimated Token Count displays the estimated token count. The four values represent Current Context Count, Maximum Context Count (∞ indicates unlimited context), Current Input Box Message Character Count, and Estimated Token Count.

This function is only for estimating the token count. The actual token count varies for each model, so please refer to the data provided by the model provider.

Translate translates the content in the current input box into English.

Conversation Settings

Model Settings

Model settings are synchronized with the Model Settings parameters in the assistant settings. For details, see Assistant Settings.

In the conversation settings, only the model settings apply to the current assistant; other settings apply globally. For example, if you set the message style to bubbles, it will be in bubble style for any topic of any assistant.

Message Settings

Message Separator:

Uses a separator to divide the message body from the action bar.

Use Serif Font:

Font style toggle. You can now also change fonts via custom CSS.

Display Line Numbers for Code:

Displays line numbers for code blocks when the model outputs code snippets.

Collapsible Code Blocks:

When enabled, code blocks will automatically collapse if the code snippet is long.

Code Block Word Wrap:

When enabled, single lines of code within a code snippet will automatically wrap if they are too long (exceed the window).

Auto-collapse Thinking Content:

When enabled, models that support thinking will automatically collapse the thinking process after completion.

Message Style:

Allows switching the conversation interface to bubble style or list style.

Code Style:

Allows switching the display style of code snippets.

Mathematical Formula Engine:

  • KaTeX renders faster because it is specifically designed for performance optimization.

  • MathJax renders slower but is more comprehensive, supporting more mathematical symbols and commands.

Message Font Size:

Adjusts the font size of the conversation interface.

Input Settings

Display Estimated Token Count:

Displays the estimated token count consumed by the input text in the input box (not the actual context token consumption, for reference only).

Paste Long Text as File:

When copying and pasting a long block of text from elsewhere into the input box, it will automatically display as a file, reducing interference during subsequent input.

Markdown Render Input Messages:

When disabled, only model replies are rendered, not sent messages.

Triple Space for Translation:

After entering a message in the conversation interface input box, pressing the spacebar three times consecutively will translate the input content into English.

Target Language:

Sets the target language for the input box translation button and the triple space translation feature.

Assistant Settings

In the assistant interface, select the assistant name you want to set → select the corresponding setting in the right-click menu.

Edit Assistant

Assistant settings apply to all topics under that assistant.

Prompt Settings

Name:

Customizable assistant name for easy identification.

Prompt:

The prompt itself. You can refer to the prompt writing style on the agent page to edit the content.

Model Settings

Default Model:

You can fix a default model for this assistant. When adding from the agent page or copying an assistant, the initial model will be this model. If this item is not set, the initial model will be the global initial model (i.e., Default Assistant Model).

There are two types of default models for assistants: one is the global default conversation model, and the other is the assistant's default model. The assistant's default model has a higher priority than the global default conversation model. If the assistant's default model is not set, the assistant's default model will be equal to the global default conversation model.

Auto Reset Model:

When enabled - if you switch to another model during use in this topic, creating a new topic will reset the new topic's model to the assistant's default model. When this item is disabled, the model for a new topic will follow the model used in the previous topic.

For example, if the assistant's default model is gpt-3.5-turbo, and I create Topic 1 under this assistant, then switch to gpt-4o during the conversation in Topic 1:

If auto-reset is enabled: When creating Topic 2, the default model selected for Topic 2 will be gpt-3.5-turbo.

If auto-reset is not enabled: When creating Topic 2, the default model selected for Topic 2 will be gpt-4o.

Temperature:

The temperature parameter controls the degree of randomness and creativity in the text generated by the model (default value is 0.7). Specifically:

  • Low temperature values (0-0.3):

    • Output is more deterministic and focused.

    • Suitable for scenarios requiring accuracy, such as code generation and data analysis.

    • Tends to select the most probable words for output.

  • Medium temperature values (0.4-0.7):

    • Balances creativity and coherence.

    • Suitable for daily conversations and general writing.

    • Recommended for chatbot conversations (around 0.5).

  • High temperature values (0.8-1.0):

    • Produces more creative and diverse output.

    • Suitable for creative writing, brainstorming, etc.

    • May reduce text coherence.

Top P (Nucleus Sampling):

The default value is 1. A smaller value makes the AI-generated content more monotonous and easier to understand; a larger value gives the AI a wider and more diverse range of vocabulary for replies.

Nucleus sampling affects output by controlling the probability threshold for vocabulary selection:

  • Smaller values (0.1-0.3):

    • Only consider words with the highest probability.

    • Output is more conservative and controllable.

    • Suitable for scenarios like code comments and technical documentation.

  • Medium values (0.4-0.6):

    • Balances vocabulary diversity and accuracy.

    • Suitable for general conversations and writing tasks.

  • Larger values (0.7-1.0):

    • Considers a wider range of vocabulary.

    • Produces richer and more diverse content.

    • Suitable for creative writing and other scenarios requiring diverse expression.

  • These two parameters can be used independently or in combination.

  • Choose appropriate parameter values based on the specific task type.

  • It is recommended to experiment to find the parameter combination best suited for a particular application scenario.

  • The above content is for reference and conceptual understanding only; the given parameter ranges may not be suitable for all models. Please refer to the parameter recommendations in the model's documentation.

Context Window

The number of messages to keep in the context. A larger value means a longer context and consumes more tokens:

  • 5-10: Suitable for general conversations.

  • 10: For complex tasks requiring longer memory (e.g., generating long texts step-by-step according to an outline, where generated context needs to ensure logical coherence).

  • Note: More messages consume more tokens.

Enable Message Length Limit (MaxToken)

The maximum Token count for a single response. In large language models, max token (maximum token count) is a crucial parameter that directly affects the quality and length of the model's generated responses.

For example, when testing if a model is connected after filling in the key in CherryStudio, if you only need to know if the model returns a message correctly without specific content, you can set MaxToken to 1.

The MaxToken limit for most models is 32k Tokens, but some support 64k or even more; you need to check the relevant introduction page for details.

The specific setting depends on your needs, but you can also refer to the following suggestions.

Streaming Output (Stream)

Streaming output is a data processing method that allows data to be transmitted and processed in a continuous stream, rather than sending all data at once. This method allows data to be processed and output immediately after it is generated, greatly improving real-time performance and efficiency.

In environments like the CherryStudio client, it basically means a "typewriter effect."

When disabled (non-streaming): The model generates information and outputs the entire segment at once (imagine receiving a message on WeChat).

When enabled: Outputs character by character. This can be understood as the large model sending you each character as soon as it generates it, until all characters are sent.

If certain special models do not support streaming output, this switch needs to be turned off, such as o1-mini and others that initially only supported non-streaming.

Custom Parameters

Adds additional request parameters to the request body, such as presence_penalty and other fields. Generally, most users will not need this.

Parameters like top-p, maxtokens, stream, etc., mentioned above are some of these parameters.

Input format: Parameter Name — Parameter Type (text, number, etc.) — Value. Reference documentation: Click to go.

Each model provider may have its own unique parameters. You need to consult the provider's documentation for usage methods.

  • Custom parameters have higher priority than built-in parameters. That is, if a custom parameter conflicts with a built-in parameter, the custom parameter will override the built-in parameter.

For example: if model is set to gpt-4o in custom parameters, then gpt-4o will be used in the conversation regardless of which model is selected.

  • Settings like ParameterName:undefined can be used to exclude parameters.

最后更新于

这有帮助吗?