GLM-4.7

Overview

The GLM Coding Plan is a subscription package designed specifically for AI-powered coding. GLM-4.7 is now available in top coding tools, starting at just $3/month — powering Claude Code, Cline, OpenCode, Roo Code and more. The package is designed to make coding faster, smarter, and more reliable.

GLM-4.7 is Z.AI’s latest flagship model, featuring upgrades in two key areas: enhanced programming capabilities and more stable multi-step reasoning/execution. It demonstrates significant improvements in executing complex agent tasks while delivering more natural conversational experiences and superior front-end aesthetics.

Input Modalities: Text
Output Modalities: Text
Context Length: 200K
Maximum Output Tokens: 128K

Capability

Thinking Mode

Offers multiple thinking modes for different scenarios.
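
As a minimal sketch of how the thinking mode might be selected per request, the helper below builds a chat completions request body with the `thinking` field set to `enabled` or `disabled`, the two values that appear in the Quick Start samples. The helper name `build_chat_body` and its defaults are illustrative assumptions, not part of any SDK.

```python
import json


def build_chat_body(prompt, thinking=True, max_tokens=4096):
    """Build a request body for the chat completions endpoint.

    `build_chat_body` is a hypothetical helper used only for illustration.
    """
    return {
        "model": "glm-4.7",
        "messages": [{"role": "user", "content": prompt}],
        # "enabled" and "disabled" are the two values shown in the Quick Start.
        "thinking": {"type": "enabled" if thinking else "disabled"},
        "max_tokens": max_tokens,
    }


# Build a body with thinking switched off and show it as JSON.
body = build_chat_body("What is 2 + 2?", thinking=False)
print(json.dumps(body, indent=2))
```

Turning thinking off may suit short, latency-sensitive queries, while leaving it on is likely the better choice for multi-step tasks.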

Streaming Output

Supports real-time streaming responses to improve the user interaction experience.

Function Call

Provides tool invocation capabilities, enabling integration with a variety of external toolsets.
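
The sketch below illustrates one way a tool could be declared and a returned tool call dispatched locally. It assumes the OpenAI-style `tools` schema and `tool_calls` shape (a function name plus a JSON string of arguments); this page does not confirm that format, so check it against the API documentation. The `get_weather` tool, the `REGISTRY` mapping, and the simulated call are all made up for illustration, and no network request is made.

```python
import json


# Hypothetical local tool; its name and return value are illustrative only.
def get_weather(city):
    return {"city": city, "forecast": "sunny"}


# Tool declaration in the assumed OpenAI-style schema.
TOOLS = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Look up the weather forecast for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }
]

# Maps tool names to the Python callables that implement them.
REGISTRY = {"get_weather": get_weather}


def run_tool_call(tool_call):
    """Execute one tool call dict and return a `tool` role message."""
    name = tool_call["function"]["name"]
    args = json.loads(tool_call["function"]["arguments"])
    result = REGISTRY[name](**args)
    return {
        "role": "tool",
        "tool_call_id": tool_call["id"],
        "content": json.dumps(result),
    }


# Simulated tool call, shaped like what the model is assumed to return.
example_call = {
    "id": "call_1",
    "type": "function",
    "function": {"name": "get_weather", "arguments": '{"city": "Paris"}'},
}
tool_message = run_tool_call(example_call)
print(tool_message)
```

In a real exchange, `TOOLS` would be sent with the request and the resulting `tool` message appended to `messages` before asking the model to continue.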

Context Caching

Provides a caching mechanism that optimizes performance in long conversations.

Structured Output

Supports structured output formats such as JSON, facilitating system integration.
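
A rough sketch of how JSON output might be requested and then checked on the client side follows. The `response_format` field is assumed to follow the OpenAI-style `{"type": "json_object"}` convention, which this page does not confirm. Both helper names are hypothetical, and the reply text is simulated so the example runs without a network call.

```python
import json


def build_json_request(prompt):
    """Build a request body that asks for a JSON object reply."""
    return {
        "model": "glm-4.7",
        "messages": [
            {"role": "system", "content": "Reply with a single JSON object."},
            {"role": "user", "content": prompt},
        ],
        # Assumed OpenAI-style switch for JSON output; verify in the API docs.
        "response_format": {"type": "json_object"},
    }


def parse_json_reply(text, required_keys):
    """Parse the reply text and check that the expected keys are present."""
    data = json.loads(text)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    missing = [key for key in required_keys if key not in data]
    if missing:
        raise ValueError(f"missing keys: {missing}")
    return data


# Simulated reply text standing in for the model's message content.
sample_reply = '{"name": "GLM-4.7", "context_length": "200K"}'
parsed = parse_json_reply(sample_reply, ["name", "context_length"])
print(parsed["name"])
```

Validating the parsed object before use keeps a malformed or incomplete reply from propagating into downstream systems.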

Usage

Agentic Coding

GLM-4.7 focuses on “task completion” rather than single-point code generation. It autonomously accomplishes requirement comprehension, solution decomposition, and multi-technology stack integration starting from target descriptions. In complex scenarios involving frontend-backend coordination, real-time interaction, and peripheral device calls, it directly generates structurally complete, executable code frameworks. This significantly reduces manual assembly and iterative debugging costs, making it ideal for complex demos, prototype validation, and automated development workflows.

Multimodal Interaction and Real-Time Application Development

In scenarios requiring cameras, real-time input, and interactive controls, GLM-4.7 demonstrates superior system-level comprehension. It integrates visual recognition, logic control, and application code into unified solutions, enabling rapid construction of interactive applications like gesture control and real-time feedback. This accelerates the journey from concept to operational application.

Web UI Generation and Visual Aesthetic Optimization

GLM-4.7 shows a significantly better understanding of visual code and UI specifications. It provides more aesthetically pleasing and consistent defaults for layout structure, color harmony, and component styling, reducing the time spent on repetitive “fine-tuning” of styles. It is well-suited for low-code platforms, AI frontend generation tools, and rapid prototyping scenarios.

High-Quality Dialogue and Complex Problem Collaboration

GLM-4.7 maintains context and constraints more reliably during multi-turn conversations. It responds more directly to simple queries while continuously clarifying objectives and advancing resolution paths for complex issues. It functions as a collaborative “problem-solving partner,” ideal for high-frequency collaboration scenarios like development support, solution discussions, and decision-making assistance.

Immersive Writing & Character-Driven Creation

GLM-4.7 delivers more nuanced, vividly descriptive prose that builds atmosphere through sensory details like scent, sound, and light. In role-playing and narrative creation, it adheres consistently to world-building and character archetypes, advancing plots with natural tension. It is ideal for interactive storytelling, IP content creation, and character-based applications.

Professional-Grade PPT/Poster Generation

In office creation, GLM-4.7 demonstrates significantly enhanced layout consistency and aesthetic stability. It reliably adapts to mainstream aspect ratios like 16:9, minimizes template-like elements in typography hierarchy, white space, and color schemes, and produces near-ready-to-use results. This makes it ideal for AI presentation tools, enterprise office systems, and automated content generation scenarios.

Intelligent Search and Deep Research

GLM-4.7 offers enhanced capabilities in user intent understanding, information retrieval, and result integration. For complex queries and research tasks, it not only returns information but also organizes it and consolidates it across sources. Through multi-round interactions, it progressively narrows in on core conclusions, making it suitable for in-depth research and decision-support scenarios.

Introducing GLM-4.7

Comprehensive Coding Capability Enhancement

GLM-4.7 achieves significant breakthroughs in programming, reasoning, and agent capabilities:

Core Coding: GLM-4.7 brings clear gains, compared to its predecessor GLM-4.6, in multilingual agentic coding and terminal-based tasks, including (73.8%, +5.8%) on SWE-bench, (66.7%, +12.9%) on SWE-bench Multilingual, and (41%, +10.0%) on Terminal Bench. GLM-4.7 also supports thinking before acting, with significant improvements on complex tasks in mainstream agent frameworks such as Claude Code, Kilo Code, Cline, and Roo Code.
Vibe Coding: GLM-4.7 takes a major step forward in UI quality. It produces cleaner, more modern webpages and generates better-looking slides with more accurate layout and sizing.
Tool Use: Tool use is significantly improved. GLM-4.7 achieves open-source SOTA results on multi-step tool-use benchmarks such as τ²-Bench and on web browsing via BrowseComp.
Complex Reasoning: GLM-4.7 delivers a substantial boost in mathematical and reasoning capabilities, achieving (42.8%, +12.4%) on the HLE (Humanity’s Last Exam) benchmark compared to GLM-4.6.

Resources

API Documentation: Learn how to call the API.

Quick Start

The following complete code samples will help you get started with GLM-4.7.

Samples are provided for cURL, the official Python SDK, the official Java SDK, and the OpenAI Python SDK.

cURL

Basic Call

curl -X POST "https://api.z.ai/api/paas/v4/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer your-api-key" \
  -d '{
    "model": "glm-4.7",
    "messages": [
      {
        "role": "user",
        "content": "As a marketing expert, please create an attractive slogan for my product."
      },
      {
        "role": "assistant",
        "content": "Sure, to craft a compelling slogan, please tell me more about your product."
      },
      {
        "role": "user",
        "content": "Z.AI Open Platform"
      }
    ],
    "thinking": {
      "type": "enabled"
    },
    "max_tokens": 4096,
    "temperature": 1.0
  }'

Streaming Call

curl -X POST "https://api.z.ai/api/paas/v4/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer your-api-key" \
  -d '{
    "model": "glm-4.7",
    "messages": [
      {
        "role": "user",
        "content": "As a marketing expert, please create an attractive slogan for my product."
      },
      {
        "role": "assistant",
        "content": "Sure, to craft a compelling slogan, please tell me more about your product."
      },
      {
        "role": "user",
        "content": "Z.AI Open Platform"
      }
    ],
    "thinking": {
      "type": "enabled"
    },
    "stream": true,
    "max_tokens": 4096,
    "temperature": 1.0
  }'
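
For environments without an SDK, the basic cURL call can be approximated with the Python standard library alone. The sketch below builds the same POST request (endpoint, headers, and body fields taken from the cURL sample) and keeps sending separate from building, so the request can be inspected without a valid key. The function names `build_request` and `send` are illustrative assumptions.

```python
import json
import urllib.request

API_URL = "https://api.z.ai/api/paas/v4/chat/completions"


def build_request(api_key, messages, thinking="enabled"):
    """Build the POST request for a basic (non-streaming) chat completion."""
    body = {
        "model": "glm-4.7",
        "messages": messages,
        "thinking": {"type": thinking},
        "max_tokens": 4096,
        "temperature": 1.0,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def send(request, timeout=60):
    """Send the request and return the decoded JSON reply (needs a valid key)."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


request = build_request(
    "your-api-key",
    [{"role": "user", "content": "Z.AI Open Platform"}],
)
print(request.get_method(), request.full_url)
# reply = send(request)  # uncomment once a real API key is in place
```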

Official Python SDK

Install SDK

# Install latest version
pip install zai-sdk

# Or specify version
pip install zai-sdk==0.1.0

Verify Installation

import zai

print(zai.__version__)

Basic Call

from zai import ZaiClient

client = ZaiClient(api_key="your-api-key")  # Your API Key

response = client.chat.completions.create(
    model="glm-4.7",
    messages=[
        {
            "role": "user",
            "content": "As a marketing expert, please create an attractive slogan for my product.",
        },
        {
            "role": "assistant",
            "content": "Sure, to craft a compelling slogan, please tell me more about your product.",
        },
        {"role": "user", "content": "Z.AI Open Platform"},
    ],
    thinking={
        "type": "enabled",
    },
    max_tokens=4096,
    temperature=1.0,
)

# Get complete response
print(response.choices[0].message)

Streaming Call

from zai import ZaiClient

client = ZaiClient(api_key="your-api-key")  # Your API Key

response = client.chat.completions.create(
    model="glm-4.7",
    messages=[
        {
            "role": "user",
            "content": "As a marketing expert, please create an attractive slogan for my product.",
        },
        {
            "role": "assistant",
            "content": "Sure, to craft a compelling slogan, please tell me more about your product.",
        },
        {"role": "user", "content": "Z.AI Open Platform"},
    ],
    thinking={
        "type": "enabled",  # Optional: "disabled" or "enabled", default is "enabled"
    },
    stream=True,
    max_tokens=4096,
    temperature=0.6,
)

# Stream response
for chunk in response:
    if chunk.choices[0].delta.reasoning_content:
        print(chunk.choices[0].delta.reasoning_content, end="", flush=True)

    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
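
When the streamed text is needed as a whole rather than printed piece by piece, the deltas can be accumulated instead. The following sketch assumes chunks shaped like those in the loop above (`choices[0].delta` with optional `reasoning_content` and `content`); `collect_stream` and `fake_chunk` are hypothetical helpers, and the chunks are simulated so the example runs offline.

```python
from types import SimpleNamespace


def collect_stream(chunks):
    """Accumulate streamed deltas into (reasoning_text, answer_text)."""
    reasoning_parts, answer_parts = [], []
    for chunk in chunks:
        if not chunk.choices:
            continue  # skip chunks that carry no choices
        delta = chunk.choices[0].delta
        reasoning = getattr(delta, "reasoning_content", None)
        content = getattr(delta, "content", None)
        if reasoning:
            reasoning_parts.append(reasoning)
        if content:
            answer_parts.append(content)
    return "".join(reasoning_parts), "".join(answer_parts)


def fake_chunk(reasoning=None, content=None):
    """Build a stand-in chunk object for offline testing."""
    delta = SimpleNamespace(reasoning_content=reasoning, content=content)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


chunks = [
    fake_chunk(reasoning="Think. "),
    fake_chunk(content="Hello"),
    fake_chunk(content=" world"),
]
reasoning_text, answer_text = collect_stream(chunks)
print(answer_text)
```

Keeping the reasoning text separate from the answer makes it straightforward to log one and display only the other.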

Official Java SDK

Install SDK

Maven

<dependency>
    <groupId>ai.z.openapi</groupId>
    <artifactId>zai-sdk</artifactId>
    <version>0.1.3</version>
</dependency>

Gradle (Groovy)

implementation 'ai.z.openapi:zai-sdk:0.1.3'

Basic Call

import ai.z.openapi.ZaiClient;
import ai.z.openapi.service.model.ChatCompletionCreateParams;
import ai.z.openapi.service.model.ChatCompletionResponse;
import ai.z.openapi.service.model.ChatMessage;
import ai.z.openapi.service.model.ChatMessageRole;
import ai.z.openapi.service.model.ChatThinking;
import java.util.Arrays;

public class BasicChat {
    public static void main(String[] args) {
        // Initialize client
        ZaiClient client = ZaiClient.builder().apiKey("your-api-key").build();

        // Create chat completion request
        ChatCompletionCreateParams request =
                ChatCompletionCreateParams.builder()
                        .model("glm-4.7")
                        .messages(
                                Arrays.asList(
                                        ChatMessage.builder()
                                                .role(ChatMessageRole.USER.value())
                                                .content(
                                                        "As a marketing expert, please create an attractive slogan for my product.")
                                                .build(),
                                        ChatMessage.builder()
                                                .role(ChatMessageRole.ASSISTANT.value())
                                                .content(
                                                        "Sure, to craft a compelling slogan, please tell me more about your product.")
                                                .build(),
                                        ChatMessage.builder()
                                                .role(ChatMessageRole.USER.value())
                                                .content("Z.AI Open Platform")
                                                .build()))
                        .thinking(ChatThinking.builder().type("enabled").build())
                        .maxTokens(4096)
                        .temperature(1.0f)
                        .build();

        // Send request
        ChatCompletionResponse response = client.chat().createChatCompletion(request);

        // Get response
        if (response.isSuccess()) {
            Object reply = response.getData().getChoices().get(0).getMessage();
            System.out.println("AI Response: " + reply);
        } else {
            System.err.println("Error: " + response.getMsg());
        }
    }
}

Streaming Call

import ai.z.openapi.ZaiClient;
import ai.z.openapi.service.model.ChatCompletionCreateParams;
import ai.z.openapi.service.model.ChatCompletionResponse;
import ai.z.openapi.service.model.ChatMessage;
import ai.z.openapi.service.model.ChatMessageRole;
import ai.z.openapi.service.model.ChatThinking;
import ai.z.openapi.service.model.Delta;
import java.util.Arrays;

public class StreamingChat {
    public static void main(String[] args) {
        // Initialize client
        ZaiClient client = ZaiClient.builder().apiKey("your-api-key").build();

        // Create streaming chat completion request
        ChatCompletionCreateParams request =
                ChatCompletionCreateParams.builder()
                        .model("glm-4.7")
                        .messages(
                                Arrays.asList(
                                        ChatMessage.builder()
                                                .role(ChatMessageRole.USER.value())
                                                .content(
                                                        "As a marketing expert, please create an attractive slogan for my product.")
                                                .build(),
                                        ChatMessage.builder()
                                                .role(ChatMessageRole.ASSISTANT.value())
                                                .content(
                                                        "Sure, to craft a compelling slogan, please tell me more about your product.")
                                                .build(),
                                        ChatMessage.builder()
                                                .role(ChatMessageRole.USER.value())
                                                .content("Z.AI Open Platform")
                                                .build()))
                        .thinking(ChatThinking.builder().type("enabled").build())
                        .stream(true) // Enable streaming output
                        .maxTokens(4096)
                        .temperature(1.0f)
                        .build();

        ChatCompletionResponse response = client.chat().createChatCompletion(request);

        if (response.isSuccess()) {
            response.getFlowable()
                    .subscribe(
                            // Process streaming message data
                            data -> {
                                if (data.getChoices() != null && !data.getChoices().isEmpty()) {
                                    Delta delta = data.getChoices().get(0).getDelta();
                                    System.out.print(delta + "\n");
                                }
                            },
                            // Process streaming response error
                            error -> System.err.println("\nStream error: " + error.getMessage()),
                            // Process streaming response completion event
                            () -> System.out.println("\nStreaming response completed"));
        } else {
            System.err.println("Error: " + response.getMsg());
        }
    }
}

OpenAI Python SDK

Install SDK

# Install or upgrade to latest version
pip install --upgrade 'openai>=1.0'

Verify Installation

python -c "import openai; print(openai.__version__)"

Usage Example

from openai import OpenAI

client = OpenAI(
    api_key="your-Z.AI-api-key",
    base_url="https://api.z.ai/api/paas/v4/",
)

completion = client.chat.completions.create(
    model="glm-4.7",
    messages=[
        {"role": "system", "content": "You are a smart and creative novelist"},
        {
            "role": "user",
            "content": "Please write a short fairy tale story as a fairy tale master",
        },
    ],
)

print(completion.choices[0].message.content)
