POST /paas/v4/videos/generations
curl --request POST \
--url https://api.z.ai/api/paas/v4/videos/generations \
--header 'Authorization: Bearer <token>' \
--header 'Content-Type: application/json' \
--data '{
"model": "cogvideox-3",
"prompt": "A cat is playing with a ball.",
"quality": "quality",
"with_audio": true,
"size": "1920x1080",
"fps": 30
}'
{
  "model": "<string>",
  "id": "<string>",
  "request_id": "<string>",
  "task_status": "<string>"
}
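
The curl call above can also be expressed in Python using only the standard library. This is a minimal sketch rather than an official SDK: the helper name `build_generation_request` is hypothetical, and the request is only constructed here, not sent.

```python
import json
import urllib.request

API_URL = "https://api.z.ai/api/paas/v4/videos/generations"


def build_generation_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request shown in the curl example."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_generation_request(
    "<token>",  # placeholder, replace with a real API key
    {
        "model": "cogvideox-3",
        "prompt": "A cat is playing with a ball.",
        "quality": "quality",
        "with_audio": True,
        "size": "1920x1080",
        "fps": 30,
    },
)
# Passing `request` to urllib.request.urlopen() would submit it;
# the response body is the JSON object shown above.
```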

Authorizations

Authorization
string
header
required

Use the following format for authentication: Bearer <your api key>

Headers

Accept-Language
enum<string>
default:en-US,en

Sets the desired response language for HTTP requests.

Available options:
en-US,en
Example:

"en-US,en"

Body

application/json
  • CogVideoX-3
  • Vidu: Text to Video
  • Vidu: Image to Video
  • Vidu: First & Last Frame to Video
  • Vidu: Ref to Video
model
enum<string>
required

The model code to be called.

Available options:
cogvideox-3
prompt
string

Text description of the video, maximum input length of 512 characters. Either image_url or prompt must be provided, or both.

quality
enum<string>

Output mode, default is speed.

  • quality: Prioritizes quality, higher generation quality.
  • speed: Prioritizes speed, faster generation time, relatively lower quality.
Available options:
speed,
quality
Example:

"speed"

with_audio
boolean

Whether to generate AI sound effects. Default: false (no sound effects).

Example:

false

image_url
(string<uri> | string<byte>)[]

An image on which the generated video is based. If this parameter is passed, the system generates content from the image. Images can be passed as a URL or as Base64-encoded data. Either image_url or prompt must be provided, or both.

  • Image requirements: .png, .jpeg, or .jpg format, no larger than 5 MB.
  • First and last frames: two images can be passed. The first image is treated as the first frame and the second image as the last frame, and the model generates the video from them. First and last frame mode supports only speed mode.
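
The image requirements above can be checked on the client before a file is Base64-encoded. The following is a sketch under assumptions: the helper names `encode_image` and `build_image_url` are hypothetical, the size limit is read as 5 × 1024 × 1024 bytes, and the Base64 string is sent without any data-URI prefix (this section does not say whether one is expected).

```python
import base64
from pathlib import Path
from typing import List, Optional

MAX_IMAGE_BYTES = 5 * 1024 * 1024  # assumption: the documented limit means 5 MiB
ALLOWED_SUFFIXES = {".png", ".jpeg", ".jpg"}


def encode_image(path: str) -> str:
    """Return the file as a Base64 string after checking format and size."""
    file = Path(path)
    if file.suffix.lower() not in ALLOWED_SUFFIXES:
        raise ValueError(f"unsupported image format: {file.suffix}")
    data = file.read_bytes()
    if len(data) > MAX_IMAGE_BYTES:
        raise ValueError("image is larger than the 5 MB limit")
    return base64.b64encode(data).decode("ascii")


def build_image_url(first_frame: str, last_frame: Optional[str] = None) -> List[str]:
    """Build the image_url array: one image, or [first frame, last frame]."""
    images = [first_frame]
    if last_frame is not None:
        images.append(last_frame)
    return images
```

Each entry passed to `build_image_url` can be either an image URL or the output of `encode_image`; the order of the two entries determines which image is the first frame.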

size
enum<string>

Video resolution. If not specified, the short side of the generated video defaults to 1080 and the long side is determined by the aspect ratio of the original image. Resolutions up to 4K are supported. Resolution options: "1280x720", "720x1280", "1024x1024", "1920x1080", "1080x1920", "2048x1080", "3840x2160"

Available options:
1280x720,
720x1280,
1024x1024,
1920x1080,
1080x1920,
2048x1080,
3840x2160
Example:

"1920x1080"

fps
enum<integer>

Video frame rate (FPS), optional values are 30 or 60. Default: 30.

Available options:
30,
60
Example:

30

duration
enum<integer>

Video duration, default is 5 seconds, supports 5 and 10 seconds.

Available options:
5,
10
Example:

5

request_id
string

A unique identifier supplied by the client to distinguish one request from another. If the client does not provide one, the platform generates it.
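
One way to satisfy the uniqueness requirement is to generate a random UUID per request. This is only a sketch: the section does not specify a format for request_id, so the use of a UUID and the helper name `new_request_id` are assumptions.

```python
import uuid


def new_request_id() -> str:
    """Return a random 32-character hex string to use as request_id."""
    return uuid.uuid4().hex


payload = {
    "model": "cogvideox-3",
    "prompt": "A cat is playing with a ball.",
    "request_id": new_request_id(),  # assumption: any unique string is accepted
}
```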

user_id
string

Unique ID of the end user. It helps the platform intervene when an end user commits violations, generates illegal or inappropriate content, or engages in other abusive behavior. ID length requirement: minimum 6 characters, maximum 128 characters.
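
The constraints listed for the body parameters can be checked on the client before a request is sent. The sketch below is illustrative only: `validate_payload` is a hypothetical helper, not part of any Z.AI SDK, and the server remains the authority on what is accepted.

```python
ALLOWED_VALUES = {
    "model": {"cogvideox-3"},
    "quality": {"speed", "quality"},
    "size": {
        "1280x720", "720x1280", "1024x1024", "1920x1080",
        "1080x1920", "2048x1080", "3840x2160",
    },
    "fps": {30, 60},
    "duration": {5, 10},
}


def validate_payload(payload: dict) -> list:
    """Return a list of problems found in the request body (empty if none)."""
    errors = []
    if "model" not in payload:
        errors.append("model is required")
    # Enum fields: reject values outside the documented options.
    for field, allowed in ALLOWED_VALUES.items():
        if field in payload and payload[field] not in allowed:
            errors.append(f"{field} must be one of {sorted(allowed)}")
    prompt = payload.get("prompt")
    images = payload.get("image_url")
    if not prompt and not images:
        errors.append("either prompt or image_url must be provided")
    if prompt and len(prompt) > 512:
        errors.append("prompt exceeds 512 characters")
    # Two images means first and last frame mode, which requires speed mode
    # (speed is also the documented default when quality is omitted).
    if images and len(images) == 2 and payload.get("quality", "speed") != "speed":
        errors.append("first and last frame mode only supports speed mode")
    user_id = payload.get("user_id")
    if user_id is not None and not 6 <= len(user_id) <= 128:
        errors.append("user_id must be 6 to 128 characters long")
    return errors
```

Returning a list of messages instead of raising on the first problem lets a caller report every issue in one pass.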

Response

Processing successful.

model
string

Model name used in this call.

id
string

Task order number generated by Z.AI. Use this order number when calling the request result interface.

request_id
string

Task number submitted by the client with the request, or generated by the platform if the client did not provide one.

task_status
string

Processing status: PROCESSING (in progress), SUCCESS (succeeded), or FAIL (failed). Results must be retrieved with a separate query.
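
A client typically records the returned id and checks task_status to decide whether a later query is needed. The following sketch shows one way to do that; `parse_generation_response` is a hypothetical helper, and the result query interface itself is not described in this section, so the code stops at extracting the values needed for it.

```python
import json

KNOWN_STATUSES = {"PROCESSING", "SUCCESS", "FAIL"}


def parse_generation_response(body: str) -> dict:
    """Extract the task id and status from the JSON response body."""
    data = json.loads(body)
    status = data.get("task_status")
    if status not in KNOWN_STATUSES:
        raise ValueError(f"unexpected task_status: {status!r}")
    return {
        "task_id": data["id"],  # needed later to query the result
        "request_id": data.get("request_id"),
        "status": status,
        "finished": status != "PROCESSING",
    }
```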