# Zhipu GLM-4.5-Air

To make it easy for every developer and user to experience the capabilities of cutting-edge large models,**Zhipu has made the GLM-4.5-Air model available for free to Cherry Studio users**. As an efficient foundation model built specifically for agent applications, GLM-4.5-Air strikes an excellent balance between performance and cost, making it an ideal choice for building intelligent applications.

***

**🚀 What is GLM-4.5-Air?**

GLM-4.5-Air is Zhipu’s latest high-performance language model, using the advanced**Mixture-of-Experts (MoE) architecture**, which significantly reduces computing resource consumption while maintaining outstanding reasoning ability.

* **Total parameters: 106 billion**
* **Activated parameters: 12 billion**

Through its streamlined design, GLM-4.5-Air achieves higher inference efficiency, making it suitable for deployment in resource-constrained environments while still capable of handling complex tasks.

<figure><img src="/files/ab01c751fd5ec714a22ed2bda6b6ce83d96c93fa" alt=""><figcaption></figcaption></figure>

***

**📚 Unified training process, building a solid intelligent foundation**

GLM-4.5-Air shares the same training process as the flagship series, ensuring a solid foundation of general capabilities:

1. **Large-scale pretraining**: on up to **150 trillion tokens of general-purpose corpus**to build broad knowledge understanding capabilities;
2. **Specialized domain optimization**: strengthened training on key tasks such as code generation, logical reasoning, and agent interaction;
3. **Long-context support**: context length extended to **128K tokens**, enabling it to handle long documents, complex conversations, or large code projects;
4. **Reinforcement learning enhancement**: RL is used to optimize the model’s decision-making ability in reasoning, planning, tool calling, and more.

This training system gives GLM-4.5-Air outstanding generalization ability and task adaptability.

<figure><img src="/files/3fe6b72ba59233f0ca6dec4adac8db5e8846ad0d" alt=""><figcaption></figcaption></figure>

***

**⚙️ Core capabilities optimized for agents**

GLM-4.5-Air has been deeply adapted for agent application scenarios and offers the following practical capabilities:

✅ **Tool calling support**: can call external tools through standardized interfaces to automate tasks\
✅ **Web browsing and information extraction**: can work with browser plugins to understand and interact with dynamic content\
✅ **Software engineering assistance**: supports requirement analysis, code generation, bug identification, and fixing\
✅ **Frontend development support**: has a good understanding of and generation ability for frontend technologies such as HTML, CSS, and JavaScript

The model can be flexibly integrated into **Claude Code, Roo Code** and other code-agent frameworks, and can also be used as the core engine of any custom agent.

<figure><img src="/files/fca85411cf5354172c8c7ea3a17be3a48f7b7c1d" alt=""><figcaption></figcaption></figure>

***

**💡 Intelligent "thinking mode" to flexibly respond to various requests**

GLM-4.5-Air supports**hybrid reasoning mode**, and users can control whether deep thinking is enabled through the `thinking.type` parameter:

* `enabled`: enables thinking, suitable for complex tasks that require step-by-step reasoning or planning
* `disabled`: disables thinking, for simple queries or instant responses
* The default setting is **dynamic thinking mode**, where the model automatically determines whether in-depth analysis is needed

| Task type                                                  | Example                                                                                                                                                |
| ---------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------ |
| **Simple tasks**(recommended to turn off thinking)         | <p>- Query the founding date of "Zhipu AI"<br>- Translate "I love you" into Chinese</p>                                                                |
| **Medium tasks**(recommended to enable thinking)           | <p>- Compare the advantages and disadvantages of flying versus high-speed rail from Beijing to Shanghai<br>- Explain why Jupiter has so many moons</p> |
| **Complex tasks**(strongly recommended to enable thinking) | <p>- Explain how experts collaborate in an MoE model<br>- Analyze whether to buy an ETF based on market information</p>                                |

***

**🌟 High efficiency and low cost, easier deployment**

GLM-4.5-Air achieves an excellent balance between performance and cost, making it especially suitable for real-world business deployment:

* ⚡ **Generation speed exceeds 100 tokens/sec**, with fast responses and low-latency interaction support
* 💰 **Extremely low API cost**: input only **0.8 RMB/million tokens**, output **2 RMB/million tokens**
* 🖥️ Fewer activated parameters, lower compute requirements, and easy to run locally or in the cloud with high concurrency

Truly delivers a "high-performance, low-barrier" AI service experience.

<figure><img src="/files/01c1ed46e07af1f951b713c2301a4400475e4515" alt=""><figcaption></figcaption></figure>

***

**🧠 Focus on practical capabilities: intelligent code generation**

GLM-4.5-Air performs steadily in code generation and supports:

* Covers **Python, JavaScript, Java** and other mainstream languages
* Generates**clear and maintainable**code based on natural language instructions
* Reduces templated output and better matches the needs of real development scenarios

Suitable for high-frequency development tasks such as rapid prototyping, automatic completion, and bug fixing.

***

Try it for free now **GLM-4.5-Air**, and start your agent development journey!\
Whether you want to build an automated assistant, a coding companion, or explore next-generation AI applications, GLM-4.5-Air will be your efficient and reliable AI engine.

📘 Connect now and unleash your creativity!


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://docs.cherry-ai.com/docs/en-us/pre-basic/providers/cherryai/mian-fei-ti-yan-zhi-pu-glm4.5air-qing-liang-gao-xiao-xin-xuan-ze.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.