Zhipu GLM-4.5-Air

To let every developer and user easily experience the capabilities of cutting-edge large models,Zhipu offers the GLM-4.5-Air model free of charge to Cherry Studio usersAs an efficient base model specifically designed for agent applications, GLM-4.5-Air achieves an excellent balance between performance and cost, making it an ideal choice for building intelligent applications.


🚀 What is GLM-4.5-Air?

GLM-4.5-Air is Zhipu's newly launched high-performance language model that adopts advancedMixture-of-Experts (MoE) architectureand significantly reduces computational resource consumption while maintaining excellent inference capabilities.

  • Total parameters: 106 billion

  • Activated parameters: 12 billion

Through a streamlined design, GLM-4.5-Air achieves higher inference efficiency, suitable for deployment in resource-constrained environments while still handling complex tasks.


📚 Unified training pipeline, solidifying the intelligence foundation

GLM-4.5-Air shares the same training pipeline as the flagship series, ensuring it has a solid foundation of general capabilities:

  1. Large-scale pretraining: Trained on up to 150 trillion tokens of general corporato build broad knowledge and understanding;

  2. Specialized domain optimization: Enhanced training on key tasks such as code generation, logical reasoning, and agent interaction;

  3. Long context support: Context length extended to 128K tokens, able to handle long documents, complex dialogues, or large codebases;

  4. Reinforcement learning enhancement: Use RL to optimize the model's decision-making abilities in planning, tool invocation, and other aspects.

This training system gives GLM-4.5-Air excellent generalization and task adaptability.


⚙️ Core capabilities optimized for agents

GLM-4.5-Air is deeply adapted for agent application scenarios and has the following practical capabilities:

Tool invocation support: Can call external tools via standardized interfaces to achieve task automation ✅ Web browsing and information extraction: Can work with browser plugins to understand and interact with dynamic content ✅ Software engineering assistance: Supports requirement analysis, code generation, defect identification and fixing ✅ Frontend development support: Has good understanding and generation ability for frontend technologies such as HTML, CSS, JavaScript

This model can be flexibly integrated into Claude Code, Roo Code and other code agent frameworks, and can also be used as the core engine for any custom Agent.


💡 Smart “thinking modes”, flexibly responding to various requests

GLM-4.5-Air supportshybrid reasoning modes, and users can control whether to enable deep thinking via thinking.type parameter:

  • enabled: Enable thinking, suitable for complex tasks that require step-by-step reasoning or planning

  • disabled: Disable thinking, used for simple queries or immediate responses

  • Default setting is dynamic thinking mode, where the model automatically determines whether deep analysis is needed

Task type
Examples

Simple tasks(recommended to disable thinking)

- Query “the founding date of Zhipu AI” - Translate “I love you” into Chinese

Moderate tasks(recommended to enable thinking)

- Compare the pros and cons of flying versus taking a high-speed train from Beijing to Shanghai - Explain why Jupiter has more moons

Complex tasks(strongly recommended to enable thinking)

- Describe how experts collaborate in an MoE model - Analyze whether one should buy an ETF based on market information


🌟 Efficient and low-cost, easier to deploy

GLM-4.5-Air achieves an excellent balance between performance and cost, making it especially suitable for real-world business deployment:

  • Generation speed over 100 tokens/second, responsive and supports low-latency interaction

  • 💰 API costs are very low: Input is only 0.8 CNY per million tokens, output 2 CNY per million tokens

  • 🖥️ Few activated parameters, low compute requirements, easy to run at high concurrency locally or in the cloud

Truly achieves a “high-performance, low-barrier” AI service experience.


🧠 Focused on practical capabilities: intelligent code generation

GLM-4.5-Air performs steadily in code generation and supports:

  • Covering mainstream languages such as Python, JavaScript, Java and others

  • Generate based on natural language instructionsCode that iswell-structured and highly maintainable

  • Reduces templated output and is closer to real development scenario needs

Suitable for high-frequency development tasks such as rapid prototyping, automated completion, and bug fixing.


Try it for free now GLM-4.5-Air, and start your agent development journey! Whether you want to build an automation assistant, a programming companion, or explore next-generation AI applications, GLM-4.5-Air will be your efficient and reliable AI engine.

📘 Connect now and unleash your creativity!

Last updated

Was this helpful?