Stream Tool Call is a unique feature of Z.ai's latest GLM-4.6 model. It gives real-time access to the model's reasoning process, response content, and tool call information during tool invocation, providing faster feedback and a better user experience.
Features
Tool calling in the latest GLM-4.6 model now supports streaming output. When calling chat.completions, developers can receive tool call arguments incrementally as they are generated, without buffering the full payload or waiting for JSON validation, thereby reducing call latency and providing a better user experience.
Core Parameter Description
stream=True: Enable streaming output; must be set to True
tool_stream=True: Enable tool call streaming output
model: Use a model that supports tool calling; currently limited to glm-4.6
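Taken together, a request that enables streamed tool calls looks like the following skeleton (a minimal sketch only; the tool definition is abbreviated here, and the complete runnable example appears below):

from zai import ZaiClient

client = ZaiClient(api_key='Your API key')
response = client.chat.completions.create(
    model='glm-4.6',   # a model that supports tool calling
    messages=[{'role': 'user', 'content': "How's the weather in Beijing?"}],
    tools=[{'type': 'function', 'function': {
        'name': 'get_weather',  # abbreviated; full definition in the complete example
        'parameters': {'type': 'object', 'properties': {}},
    }}],
    stream=True,       # enable streaming output (required)
    tool_stream=True,  # additionally stream tool call arguments
)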
Response Parameter Description
The delta object in streaming responses contains the following fields:
reasoning_content: Text content of the model's reasoning process
content: Text content of the model's response
tool_calls: Tool call information, including the function name and its arguments
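As a rough picture of how these fields interleave across chunks (illustrative values only, following the OpenAI-style streaming convention that the example below relies on): the function name and call index arrive with the first tool_calls fragment, while the JSON arguments string arrives in pieces at the same index:

# Illustrative delta sequence (made-up values, not real API output):
#   chunk 1: delta.reasoning_content = "User asks about Beijing weather..."
#   chunk 2: delta.content           = "Let me look that up."
#   chunk 3: delta.tool_calls[0] -> index=0, function.name="get_weather",
#            function.arguments='{"loca'
#   chunk 4: delta.tool_calls[0] -> index=0, function.name=None,
#            function.arguments='tion": "Beijing"}'
# Concatenating the argument fragments per index yields the complete
# JSON string: '{"location": "Beijing"}'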
Code Example
By setting the tool_stream=True parameter, you can enable streaming tool call output:
Install SDK
# Install the latest version
pip install zai-sdk
# Or pin a specific version
pip install zai-sdk==0.0.4
Verify Installation
import zai
print(zai.__version__)
Complete Example
from zai import ZaiClient

# Initialize client
client = ZaiClient(api_key='Your API key')

# Create streaming tool call request
response = client.chat.completions.create(
    model="glm-4.6",  # Use a model that supports tool calling
    messages=[
        {"role": "user", "content": "How's the weather in Beijing?"},
    ],
    tools=[
        {
            "type": "function",
            "function": {
                "name": "get_weather",
                "description": "Get current weather conditions for a specified location",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "location": {"type": "string", "description": "City, e.g.: Beijing, Shanghai"},
                        "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]}
                    },
                    "required": ["location"]
                }
            }
        }
    ],
    stream=True,      # Enable streaming output
    tool_stream=True  # Enable tool call streaming output
)

# Initialize variables to collect streaming data
reasoning_content = ""     # Reasoning process content
content = ""               # Response content
final_tool_calls = {}      # Tool call information, keyed by call index
reasoning_started = False  # Reasoning process start flag
content_started = False    # Content output start flag

# Process streaming response
for chunk in response:
    if not chunk.choices:
        continue
    delta = chunk.choices[0].delta

    # Handle streaming reasoning process output
    if hasattr(delta, 'reasoning_content') and delta.reasoning_content:
        if not reasoning_started and delta.reasoning_content.strip():
            print("\n🧠 Thinking Process:")
            reasoning_started = True
        reasoning_content += delta.reasoning_content
        print(delta.reasoning_content, end="", flush=True)

    # Handle streaming response content output
    if hasattr(delta, 'content') and delta.content:
        if not content_started and delta.content.strip():
            print("\n\n💬 Response Content:")
            content_started = True
        content += delta.content
        print(delta.content, end="", flush=True)

    # Handle streaming tool call information
    if delta.tool_calls:
        for tool_call in delta.tool_calls:
            index = tool_call.index
            if index not in final_tool_calls:
                # First fragment of a new tool call: keep the whole object
                final_tool_calls[index] = tool_call
                final_tool_calls[index].function.arguments = tool_call.function.arguments
            else:
                # Subsequent fragments: append the argument text (streamed in pieces)
                final_tool_calls[index].function.arguments += tool_call.function.arguments

# Output final tool call information
if final_tool_calls:
    print("\n🔧 Function Calls Triggered:")
    for index, tool_call in final_tool_calls.items():
        print(f"{index}: Function Name: {tool_call.function.name}, Parameters: {tool_call.function.arguments}")
Use Cases
Intelligent Customer Service System: display query progress in real time and improve the waiting experience.
Code Assistant: show the code analysis process in real time and display the tool call chain.