Free Trial | Zhipu GLM-4.5-Air, a lightweight and efficient new choice!

To allow every developer and user to easily experience the capabilities of cutting-edge large models, Zhipu has made the GLM-4.5-Air model freely available to Cherry Studio users. As an efficient foundational model specifically designed for agent applications, GLM-4.5-Air strikes an excellent balance between performance and cost, making it an ideal choice for building intelligent applications.


🚀 What is GLM-4.5-Air?

GLM-4.5-Air is Zhipu's latest high-performance language model, featuring an advanced Mixture-of-Experts (MoE) architecture. It significantly reduces computational resource consumption while maintaining excellent inference capabilities.

  • Total Parameters: 106 Billion

  • Active Parameters: 12 Billion

Through its streamlined design, GLM-4.5-Air achieves higher inference efficiency, making it suitable for deployment in resource-constrained environments while still capable of handling complex tasks.


📚 Unified Training Process, Solidifying Intelligent Foundations

GLM-4.5-Air shares a consistent training process with its flagship series, ensuring it possesses a solid foundation of general capabilities:

  1. Large-scale Pre-training: Trained on up to 15 trillion tokens of general corpus to build extensive knowledge comprehension abilities;

  2. Specialized Domain Optimization: Enhanced training on key tasks such as code generation, logical reasoning, and agent interaction;

  3. Long Context Support: Context length extended to 128K tokens, capable of processing long documents, complex dialogues, or large code projects;

  4. Reinforcement Learning Enhancement: RL optimization improves the model's decision-making capabilities in inference planning, tool calling, and other aspects.

This training system endows GLM-4.5-Air with excellent generalization and task adaptation capabilities.


⚙️ Core Capabilities Optimized for Agents

GLM-4.5-Air is deeply adapted for agent application scenarios, offering the following practical capabilities:

Tool Calling Support: Can call external tools via standardized interfaces to automate tasks ✅ Web Browsing and Information Extraction: Can work with browser plugins to understand and interact with dynamic content ✅ Software Engineering Assistance: Supports requirements parsing, code generation, defect identification, and repair ✅ Front-end Development Support: Has a good understanding and generation capability for front-end technologies such as HTML, CSS, and JavaScript

This model can be flexibly integrated into code agent frameworks like Claude Code and Roo Code, or used as the core engine for any custom Agent.


💡 Intelligent "Thinking Mode" for Flexible Response to Various Requests

GLM-4.5-Air supports a hybrid inference mode, allowing users to control whether deep thinking is enabled via the thinking.type parameter:

  • ``enabled`: Enables thinking, suitable for complex tasks requiring step-by-step reasoning or planning

  • ``disabled`: Disables thinking, used for simple queries or immediate responses

  • Default setting is dynamic thinking mode, where the model automatically determines if deep analysis is needed

Task Type
Example

Simple Tasks (Thinking recommended to be disabled)

- Query "When was Zhipu AI founded?" - Translate "I love you" into Chinese

Medium Tasks (Thinking recommended to be enabled)

- Compare the pros and cons of taking a plane vs. high-speed rail from Beijing to Shanghai - Explain why Jupiter has many moons

Complex Tasks (Thinking strongly recommended to be enabled)

- Explain how experts collaborate in MoE models - Analyze whether to buy ETFs based on market information


🌟 High Efficiency, Low Cost, Easier Deployment

GLM-4.5-Air achieves an excellent balance between performance and cost, making it particularly suitable for real-world business deployment:

  • Generation speed exceeds 100 tokens/second, offering rapid response and supporting low-latency interaction

  • 💰 Extremely low API cost: Input only 0.8 RMB/million tokens, output 2 RMB/million tokens

  • 🖥️ Fewer active parameters, lower computing power requirements, easy for high-concurrency operation locally or in the cloud

Truly achieving an AI service experience that is "high-performance, low-barrier."


🧠 Focus on Practical Capabilities: Intelligent Code Generation

GLM-4.5-Air performs stably in code generation, supporting:

  • Covering mainstream languages such as Python, JavaScript, and Java

  • Generating clear, maintainable code based on natural language instructions

  • Reducing templated output, aligning closely with real development scenario needs

Applicable to high-frequency development tasks such as rapid prototyping, automated completion, and bug fixing.


Experience GLM-4.5-Air for free now and start your agent development journey! Whether you want to build automated assistants, programming companions, or explore next-generation AI applications, GLM-4.5-Air will be your efficient and reliable AI engine.

📘 Get started now and unleash your creativity!

最后更新于

这有帮助吗?