Z.AI offers a variety of models and agents to meet the needs of different scenarios. Choosing the right model can help you complete tasks more efficiently.

GLM-5

The latest flagship foundation model delivers open-source SOTA capabilities.

GLM-5V-Turbo

Multimodal agent model, specializing in visual programming.

GLM-Image

Supports text-to-image generation, achieving open-source state-of-the-art (SOTA) results in complex scenarios.

Models, Agents and Tools

To help you find the best fit for your use case, we’ve created a table outlining the core features and strengths of each model in the Z.AI family.
For pricing information, see Pricing.

Text Models

Our model matrix includes text models with built-in reasoning capabilities, as well as vision-language models (VLMs) that extend the same reasoning power to multimodal understanding.
| Model | Strength | Language | Context | Resource |
| --- | --- | --- | --- | --- |
| GLM-5 | Programming ability; Agentic Long-Term Planning and Execution; Backend refactoring and in-depth debugging | English & Chinese | 200K | Guide, API Reference |
| GLM-5-Turbo | Optimization of Core Requirements for OpenClaw Tasks; Improved continuity in the execution of complex tasks | English & Chinese | 200K | Guide, API Reference |
| GLM-4.7 | SOTA Performance; Enhanced General Capabilities; Optimized Agentic Coding | English & Chinese | 200K | Guide, API Reference |
| GLM-4.7-FlashX | Enhanced General Capabilities; Optimized Agentic Coding; Lightweight & High-Speed | English & Chinese | 200K | Guide, API Reference |
| GLM-4.6 | High Performance; Strong Coding; More Versatile | English & Chinese | 200K | Guide, API Reference |
| GLM-4.5 | Better Performance; Strong Reasoning; More Versatile | English & Chinese | 128K | Guide, API Reference |
| GLM-4.5-X | Good Performance; Strong Reasoning; Ultra-Fast Response | English & Chinese | 128K | Guide, API Reference |
| GLM-4.5-Air | Cost-Effective; Lightweight; High Performance | English & Chinese | 128K | Guide, API Reference |
| GLM-4.5-AirX | Lightweight; High Performance; Ultra-Fast Response | English & Chinese | 128K | Guide, API Reference |
| GLM-4-32B-0414-128K | High intelligence at unmatched cost-efficiency | English & Chinese | 128K | Guide, API Reference |
| GLM-4.7-Flash | Free, Lightweight; High Performance | English & Chinese | 200K | Guide, API Reference |
| GLM-4.5-Flash | Free, Lightweight; Strong Reasoning | English & Chinese | 200K | Guide, API Reference |
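
As a rough illustration of how the text model table can be used to shortlist a model, the sketch below copies the table into a small Python structure and filters it by context window, free tier, and a keyword in the listed strengths. The `TextModel` class, the `candidates` helper, and the field names are hypothetical and exist only for this example; they are not part of any Z.AI SDK, and the data is taken from the table as published here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TextModel:
    """One row of the text model table (hypothetical helper type)."""
    name: str
    context_k: int   # context window in thousands of tokens, as listed
    free: bool       # True when the table marks the model as "Free"
    strengths: tuple


# Rows copied from the table above, in the same order.
TEXT_MODELS = (
    TextModel("GLM-5", 200, False, ("Programming ability",
              "Agentic Long-Term Planning and Execution",
              "Backend refactoring and in-depth debugging")),
    TextModel("GLM-5-Turbo", 200, False, (
              "Optimization of Core Requirements for OpenClaw Tasks",
              "Improved continuity in the execution of complex tasks")),
    TextModel("GLM-4.7", 200, False, ("SOTA Performance",
              "Enhanced General Capabilities", "Optimized Agentic Coding")),
    TextModel("GLM-4.7-FlashX", 200, False, ("Enhanced General Capabilities",
              "Optimized Agentic Coding", "Lightweight & High-Speed")),
    TextModel("GLM-4.6", 200, False, ("High Performance", "Strong Coding",
              "More Versatile")),
    TextModel("GLM-4.5", 128, False, ("Better Performance", "Strong Reasoning",
              "More Versatile")),
    TextModel("GLM-4.5-X", 128, False, ("Good Performance", "Strong Reasoning",
              "Ultra-Fast Response")),
    TextModel("GLM-4.5-Air", 128, False, ("Cost-Effective", "Lightweight",
              "High Performance")),
    TextModel("GLM-4.5-AirX", 128, False, ("Lightweight", "High Performance",
              "Ultra-Fast Response")),
    TextModel("GLM-4-32B-0414-128K", 128, False, (
              "High intelligence at unmatched cost-efficiency",)),
    TextModel("GLM-4.7-Flash", 200, True, ("Lightweight", "High Performance")),
    TextModel("GLM-4.5-Flash", 200, True, ("Lightweight", "Strong Reasoning")),
)


def candidates(min_context_k=0, free_only=False, keyword=""):
    """Return model names that satisfy all filters, in table order."""
    kw = keyword.lower()
    return [
        m.name
        for m in TEXT_MODELS
        if m.context_k >= min_context_k
        and (m.free or not free_only)
        and (not kw or any(kw in s.lower() for s in m.strengths))
    ]


if __name__ == "__main__":
    print(candidates(free_only=True))
    print(candidates(min_context_k=200, keyword="agentic"))
```

The filter only narrows the list; pricing and latency are not in the table, so the final choice still depends on the Pricing page and each model's guide.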

Vision Models

Vision models process images or videos for recognition and analysis.
| Model | Strength | Language | Context | Resource |
| --- | --- | --- | --- | --- |
| GLM-5V-Turbo | Multimodal Coding Capabilities; Context Size Increased to 200K; Deep Integration with Agent Workflows | English & Chinese | 200K | Guide, API Reference |
| GLM-4.6V | Native Function Call Support; Thinking Mode Switch Support | English & Chinese | 128K | Guide, API Reference |
| GLM-OCR | Document Parsing; Information Extraction | Multiple | / | Guide, API Reference |
| GLM-4.6V-FlashX | Native Function Call Support; Thinking Mode Switch Support; Lightweight & High-Speed | English & Chinese | 128K | Guide, API Reference |
| GLM-4.5V | Multimodal; Flexible Reasoning | English & Chinese | 64K | Guide, API Reference |
| GLM-4.6V-Flash | Free, Native Function Call Support | English & Chinese | 128K | Guide, API Reference |

Built-in Tools

A suite of built-in tools designed to streamline workflows and boost productivity.
| Tool | Capability |
| --- | --- |
| Web Search | Provides real-time, concise, direct answers; Accurately parses complex HTML and converts it into clean Markdown or JSON |

Image Generation Models

Image Generation Models learn from massive image data to automatically generate high-quality images from text.
| Model | Strength | Language | Resolution | Resource |
| --- | --- | --- | --- | --- |
| GLM-Image | Stronger in complex instruction and knowledge-intensive scenarios; Open-source SOTA in text rendering | English & Chinese | Multiple resolutions | Guide, API Reference |
| CogView-4 | High-quality image generation; Diverse styles; Rich in detail | English & Chinese | Multiple resolutions | Guide, API Reference |

Video Generation Models

Video Generation Models turn text, images, or clips into dynamic video content, accelerating creativity for film, virtual avatars, animation, and marketing.
| Model | Strength | Language | Resolution | Resource |
| --- | --- | --- | --- | --- |
| CogVideoX-3 | Significant improvements in image quality, stability, and physical realism simulation | English & Chinese | Multiple resolutions | Guide, API Reference |
| ViduQ1 | Theatrical quality with seamless temporal flow | English & Chinese | 1080P | Guide, API Reference |
| Vidu2 | Fast delivery with smart style preservation | English & Chinese | 720P | Guide, API Reference |

Audio Models

Audio models are a class of multimodal models that process audio and video signals, enabling the understanding, generation, or editing of audiovisual content.
| Model | Strength | Multimodal Support | Resource |
| --- | --- | --- | --- |
| GLM-ASR-2512 | CER as low as 0.0717; Support user-defined vocabularies; Support multiple mainstream languages and dialects | Audio | Guide, API Reference |
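
For readers unfamiliar with the CER figure quoted for GLM-ASR-2512: character error rate is conventionally the character-level edit distance between a reference transcript and the recognizer's output, divided by the length of the reference. The sketch below shows that standard definition. It is not Z.AI's evaluation code, and this page does not say which dataset or text normalization produced the 0.0717 value; the function names are chosen for this example only.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to the empty string
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # delete r from the reference side
                curr[j - 1] + 1,     # insert h
                prev[j - 1] + cost,  # match or substitute
            ))
        prev = curr
    return prev[-1]


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate: edit distance divided by reference length."""
    if not reference:
        raise ValueError("reference transcript must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


if __name__ == "__main__":
    # One missing character out of an 11-character reference.
    print(cer("hello world", "hello word"))
```

Under this definition, a CER of 0.0717 corresponds to roughly 7 character errors per 100 reference characters.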

Agents

A set of ready-made agents that empowers users to create and communicate effortlessly.
| Tool | Capability | Resource |
| --- | --- | --- |
| GLM Slide/Poster Agent (beta) | Combine content generation with professional design | Guide |
| General-Purpose Translation | Support 40+ languages, flexible strategies, and terminology customization | Guide |
| Popular Special Effects Video Templates | Special effects video templates like French_Kiss, BodyShake, and Sexy_Me | Guide |