Zhipu GLM-4.5-Air

To make it easy for every developer and user to experience the capabilities of cutting-edge large models,Zhipu has made the GLM-4.5-Air model available for free to Cherry Studio users. As an efficient foundation model built specifically for agent applications, GLM-4.5-Air strikes an excellent balance between performance and cost, making it an ideal choice for building intelligent applications.


🚀 What is GLM-4.5-Air?

GLM-4.5-Air is Zhipu’s latest high-performance language model, using the advancedMixture-of-Experts (MoE) architecture, which significantly reduces computing resource consumption while maintaining outstanding reasoning ability.

  • Total parameters: 106 billion

  • Activated parameters: 12 billion

Through its streamlined design, GLM-4.5-Air achieves higher inference efficiency, making it suitable for deployment in resource-constrained environments while still capable of handling complex tasks.


📚 Unified training process, building a solid intelligent foundation

GLM-4.5-Air shares the same training process as the flagship series, ensuring a solid foundation of general capabilities:

  1. Large-scale pretraining: on up to 150 trillion tokens of general-purpose corpusto build broad knowledge understanding capabilities;

  2. Specialized domain optimization: strengthened training on key tasks such as code generation, logical reasoning, and agent interaction;

  3. Long-context support: context length extended to 128K tokens, enabling it to handle long documents, complex conversations, or large code projects;

  4. Reinforcement learning enhancement: RL is used to optimize the model’s decision-making ability in reasoning, planning, tool calling, and more.

This training system gives GLM-4.5-Air outstanding generalization ability and task adaptability.


⚙️ Core capabilities optimized for agents

GLM-4.5-Air has been deeply adapted for agent application scenarios and offers the following practical capabilities:

Tool calling support: can call external tools through standardized interfaces to automate tasks ✅ Web browsing and information extraction: can work with browser plugins to understand and interact with dynamic content ✅ Software engineering assistance: supports requirement analysis, code generation, bug identification, and fixing ✅ Frontend development support: has a good understanding of and generation ability for frontend technologies such as HTML, CSS, and JavaScript

The model can be flexibly integrated into Claude Code, Roo Code and other code-agent frameworks, and can also be used as the core engine of any custom agent.


💡 Intelligent "thinking mode" to flexibly respond to various requests

GLM-4.5-Air supportshybrid reasoning mode, and users can control whether deep thinking is enabled through the thinking.type parameter:

  • enabled: enables thinking, suitable for complex tasks that require step-by-step reasoning or planning

  • disabled: disables thinking, for simple queries or instant responses

  • The default setting is dynamic thinking mode, where the model automatically determines whether in-depth analysis is needed

Task type
Example

Simple tasks(recommended to turn off thinking)

- Query the founding date of "Zhipu AI" - Translate "I love you" into Chinese

Medium tasks(recommended to enable thinking)

- Compare the advantages and disadvantages of flying versus high-speed rail from Beijing to Shanghai - Explain why Jupiter has so many moons

Complex tasks(strongly recommended to enable thinking)

- Explain how experts collaborate in an MoE model - Analyze whether to buy an ETF based on market information


🌟 High efficiency and low cost, easier deployment

GLM-4.5-Air achieves an excellent balance between performance and cost, making it especially suitable for real-world business deployment:

  • Generation speed exceeds 100 tokens/sec, with fast responses and low-latency interaction support

  • 💰 Extremely low API cost: input only 0.8 RMB/million tokens, output 2 RMB/million tokens

  • 🖥️ Fewer activated parameters, lower compute requirements, and easy to run locally or in the cloud with high concurrency

Truly delivers a "high-performance, low-barrier" AI service experience.


🧠 Focus on practical capabilities: intelligent code generation

GLM-4.5-Air performs steadily in code generation and supports:

  • Covers Python, JavaScript, Java and other mainstream languages

  • Generatesclear and maintainablecode based on natural language instructions

  • Reduces templated output and better matches the needs of real development scenarios

Suitable for high-frequency development tasks such as rapid prototyping, automatic completion, and bug fixing.


Try it for free now GLM-4.5-Air, and start your agent development journey! Whether you want to build an automated assistant, a coding companion, or explore next-generation AI applications, GLM-4.5-Air will be your efficient and reliable AI engine.

📘 Connect now and unleash your creativity!

Last updated

Was this helpful?