Zhipu GLM-4.5-Air

To enable every developer and user to easily experience the capabilities of cutting-edge large models, Zhipu has opened up the GLM-4.5-Air model to Cherry Studio users for free. As an efficient foundational model specifically designed for Agent applications, GLM-4.5-Air achieves an excellent balance between performance and cost, making it an ideal choice for building intelligent applications.


🚀 What is GLM-4.5-Air?

GLM-4.5-Air is Zhipu's latest high-performance language model, adopting an advanced Mixture-of-Experts (MoE) architecture, which significantly reduces computational resource consumption while maintaining excellent inference capabilities.

  • Total Parameters: 106 Billion

  • Active Parameters: 12 Billion

Through streamlined design, GLM-4.5-Air achieves higher inference efficiency, suitable for deployment in resource-constrained environments, while still capable of handling complex tasks.


📚 Unified Training Process, Solidifying the Intelligent Foundation

GLM-4.5-Air shares a consistent training process with the flagship series, ensuring it possesses a solid foundation of general capabilities:

  1. Large-scale pre-training: Trained on up to 15 trillion tokens of general corpus to build extensive knowledge understanding capabilities;

  2. Specialized domain optimization: Enhanced training on critical tasks such as code generation, logical reasoning, and agent interaction;

  3. Long context support: Context length extended to 128K tokens, capable of handling long documents, complex conversations, or large code projects;

  4. Reinforcement learning enhancement: RL optimizes the model's decision-making capabilities in areas such as inference planning and tool calling.

This training system endows GLM-4.5-Air with excellent generalization capabilities and task adaptability.


⚙️ Core Capabilities Optimized for Agents

GLM-4.5-Air is deeply adapted for agent application scenarios and possesses the following practical capabilities:

Tool Calling Support: Can call external tools via standardized interfaces to achieve task automation ✅ Web Browsing and Information Extraction: Can work with browser plugins to understand and interact with dynamic content ✅ Software Engineering Assistance: Supports requirements analysis, code generation, defect identification, and repair ✅ Frontend Development Support: Has a good understanding and generation capability for frontend technologies such as HTML, CSS, and JavaScript

This model can be flexibly integrated into code agent frameworks such as Claude Code and Roo Code, and can also be used as the core engine for any custom Agent.


💡 Intelligent "Thinking Mode," Flexible Response to Various Requests

GLM-4.5-Air supports hybrid inference mode, allowing users to control whether deep thinking is enabled via the thinking.type parameter:

  • enabled: Enables thinking, suitable for complex tasks requiring step-by-step reasoning or planning

  • disabled: Disables thinking, used for simple queries or immediate responses

  • Default setting is dynamic thinking mode, where the model automatically determines whether deep analysis is needed

Task Type
Example

Simple Tasks (thinking recommended to be off)

- Query "When was Zhipu AI founded?" - Translate "I love you" to Chinese

Medium Tasks (thinking recommended to be on)

- Compare the pros and cons of planes vs. high-speed trains from Beijing to Shanghai - Explain why Jupiter has many moons

Complex Tasks (thinking strongly recommended to be on)

- Explain how experts collaborate in MoE models - Analyze whether to buy an ETF based on market information


🌟 High Efficiency, Low Cost, Easier Deployment

GLM-4.5-Air achieves an excellent balance between performance and cost, making it particularly suitable for practical business deployment:

  • Generation speed exceeds 100 tokens/sec, rapid response, supports low-latency interaction

  • 💰 Extremely low API cost: Input only 0.8 CNY/million tokens, output 2 CNY/million tokens

  • 🖥️ Fewer active parameters, low computing power requirements, easy for high-concurrency operation locally or in the cloud

Truly achieving an AI service experience with "high performance and low barrier to entry."


🧠 Focus on Practical Capabilities: Intelligent Code Generation

GLM-4.5-Air demonstrates stable performance in code generation, supporting:

  • Covers mainstream languages such as Python, JavaScript, and Java

  • Generates cleanly structured, highly maintainable code from natural language instructions

  • Reduces templated output, closer to real development scenario needs

Suitable for high-frequency development tasks such as rapid prototyping, automated completion, and bug fixing.

Experience GLM-4.5-Air for free now and start your agent development journey! Whether you want to build automated assistants, programming companions, or explore next-generation AI applications, GLM-4.5-Air will be your efficient and reliable AI engine.

📘 Integrate now and unleash your creativity!

Last updated

Was this helpful?