Price: $0.2 / video
Input Modality: Image / Text / Start and End Frame
Output Modality: Video
  1. Advertising and Marketing: Input product images or copy to quickly generate dynamic ads in multiple styles, supporting scene transitions and realistic lighting rendering.
  2. Short Video Creation: Convert single-frame images or text scripts into smooth, naturally animated short videos, covering both realistic and 3D styles.
  3. Tourism Promotion: Upload scenic spot photos and promotional text to generate immersive tourism short videos with realistic natural landscapes.
  4. Film and TV Production: Input storyboards or character design images to automatically generate dynamic preview clips, simulating seamless camera movements and realistic physical interactions.

Detailed Description

  1. Significantly Improved Subject Clarity

Videos generated by CogVideoX-3 feature clear subjects and stable frames with less distortion. Subjects can move through a wide range of motion, which makes the result look more natural and fluid.
Prompt: The petals were blown by the wind, spinning continuously and transforming into a person.
Prompt: Nezha happily took a sip of wine, then showed off the brand of wine.
  2. Better Instruction Following and Realistic Physical Simulation

The model understands the intent of text instructions in depth and reproduces creative requirements accurately. Whether a character performs a specific action or a natural physical phenomenon is simulated, the result follows real-world logic.
Prompt: A pair of hands holding a fruit knife, slicing a whole red tomato into slices.
Prompt: In an open-plan office, an employee is looking down at his phone. Suddenly, the manager appears and taps him on the shoulder. Startled, he quickly puts his phone away.
  3. Enhanced High-Definition Realistic-Style Scenes and 3D-Style Scene Rendering

For realistic styles, it can create high-definition textures akin to real-life photography; when switching to 3D styles, it can precisely shape three-dimensional forms and scene atmospheres, effortlessly handling multiple styles.
Prompt: A high-angle shot captures Dou E and the sky. Dou E was an innocent woman from ancient China who was wrongfully accused. At this moment, she is looking up and shouting. Under the scorching June sun, white snow falls from the sky, scattering upon contact with bloodstains. Her clothes flutter slightly, accompanied by a 3D particle wind.
Prompt: A stylish anthropomorphic snow leopard wearing a white leopard-print fashion coat, super fluffy, plush, thick, and luxurious, walking the runway in ultra-high definition with a cinematic feel, reminiscent of a blockbuster movie, like a Victoria’s Secret fashion show. The runway is lined with spectators taking photos on both sides.
  4. Added Start and End Frame Generation

Users can provide a start frame and an end frame, and the model automatically generates a seamless transition between them, so that static frames connect naturally into a dynamic narrative that carries a complete creative concept.
Example prompts, each paired with a start frame and an end frame:
Prompt: The Dragon King transforms into Ao Bing, with ink wash-style shading. The main character slowly transforms, highlighting the details of the transformation. The camera rotates smoothly, creating a natural and fluid transition.
Prompt: The character holds a gun in both hands and shoots wildly at the computer screen. The computer catches fire, explodes, and shatters into pieces, sending debris flying everywhere. The office lights flicker.
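
As a rough sketch of how a start-and-end-frame request might be assembled, the helper below builds the keyword arguments for a generation call without sending anything. The function name `build_frame_transition_request` is hypothetical, and passing both frames as a two-item `image_url` list is an assumption that should be checked against the API reference. The `model`, `quality`, and `fps` values mirror the Example section.

```python
from typing import Any, Dict


def build_frame_transition_request(
    prompt: str,
    start_frame_url: str,
    end_frame_url: str,
    fps: int = 30,
    quality: str = "quality",
) -> Dict[str, Any]:
    """Build keyword arguments for a start/end frame generation request.

    Hypothetical helper: it only validates inputs and returns a dict.
    """
    if not start_frame_url or not end_frame_url:
        raise ValueError("both a start frame and an end frame are required")
    if fps not in (30, 60):
        raise ValueError("fps must be 30 or 60")
    if quality not in ("quality", "speed"):
        raise ValueError('quality must be "quality" or "speed"')
    return {
        "model": "cogvideox-3",
        "prompt": prompt,
        # ASSUMPTION: the two frames are passed as an ordered list under
        # "image_url" (start first, end second). Verify the real parameter
        # name and shape before use.
        "image_url": [start_frame_url, end_frame_url],
        "quality": quality,
        "fps": fps,
    }
```

If the assumption about `image_url` holds, the returned dictionary could be unpacked into the call from the Example section, as in `client.videos.generations(**request)`.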

Example

from zai import ZaiClient
client = ZaiClient(api_key="your-api-key")

# Generate video
response = client.videos.generations(
    model="cogvideox-3",
    prompt="A cat is playing with a ball.",
    quality="quality",  # Output mode, "quality" for quality priority, "speed" for speed priority
    with_audio=True,  # Whether to include audio
    size="1920x1080",  # Video resolution, supports up to 4K (e.g., "3840x2160")
    fps=30,  # Frame rate, can be 30 or 60
)
print(response)

# Get video result (the task may still be processing right after submission)
result = client.videos.retrieve_videos_result(id=response.id)
print(result)
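
Because the result is retrieved in a separate call, a single lookup right after submission may not yet contain the finished video. One possible way to wait for it is a small polling loop. The helper below is a generic sketch, not part of the SDK: `wait_for_video`, its parameters, and the `task_status` field shown in the commented usage are assumptions for illustration.

```python
import time
from typing import Any, Callable


def wait_for_video(
    fetch: Callable[[], Any],
    is_finished: Callable[[Any], bool],
    interval: float = 5.0,
    timeout: float = 600.0,
) -> Any:
    """Call fetch() until is_finished(result) is true or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        result = fetch()
        if is_finished(result):
            return result
        # Stop if another sleep would take us past the deadline.
        if time.monotonic() + interval > deadline:
            raise TimeoutError("video generation did not finish in time")
        time.sleep(interval)


# Hypothetical usage with the client from the example above. The
# "task_status" attribute and its values are assumptions; check the
# actual response schema before relying on them.
#
# result = wait_for_video(
#     fetch=lambda: client.videos.retrieve_videos_result(id=response.id),
#     is_finished=lambda r: getattr(r, "task_status", None) in ("SUCCESS", "FAIL"),
# )
```

Passing the fetch and completion checks as callables keeps the loop independent of the response schema, so only the two lambdas need to change if the field names differ.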