POST /v2/videos/generations

Method: POST
Endpoint: /v2/videos/generations

The Tongyi Wanxiang text-to-video model generates a smooth video from a text prompt. Supported capabilities include:

Core capabilities: flexible durations (5s/10s), specified video resolution (480P/720P/1080P), smart prompt rewriting, and watermark support.

Audio capabilities: supports automatic dubbing or a custom audio file for audio-video synchronization. Available only on wan2.5.

Request Parameters

Header Parameters

Authorization
string
Optional
Default Value:
Bearer {{YOUR_API_KEY}}

Body Parameters application/json

prompt
string
Required
The text prompt supports Chinese and English, with a maximum length of 800 characters. Each Chinese character or letter counts as one character. Content that exceeds this limit will be truncated.
Example: A kitten running in the moonlight.
model
enum<string>
Required
Model name. Example: wan2.1-t2v-turbo.
Value:
wan2.5-t2v-preview
duration
enum<integer>
Optional
Duration of the generated video in seconds. Supported values are 5 and 10.
Enum Values:
5
10
audio_url
string
Optional
Supported only by wan2.5-t2v-preview. Audio file URL used by the model to generate the video. See audio settings for usage.
Supports HTTP or HTTPS. Local files can be uploaded first to obtain a temporary URL.
Audio limits:
Formats: wav and mp3.
Duration: 3 to 30 seconds.
File size: up to 15 MB.
Over-limit handling: if the audio is longer than the duration value of 5s or 10s, only the first 5s or 10s are kept and the rest is discarded. If the audio is shorter than the video duration, the remaining part of the video is silent. For example, if the audio is 3s and the video is 5s, the output has sound for the first 3s and is silent for the last 2s.
Example Value:
https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20250923/hbiayh/%E4%BB%8E%E5%86%9B%E8%A1%8C.mp3
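The over-limit rule above can be sketched as a small calculation. This is only an illustration of the documented behavior, not part of the API; the function name `audio_layout` is hypothetical.

```python
from __future__ import annotations


def audio_layout(audio_seconds: float, video_seconds: int) -> tuple[float, float]:
    """Return (seconds_with_sound, seconds_silent) for the generated video.

    Mirrors the documented rule: audio longer than the video is cut off at the
    video duration, and audio shorter than the video leaves the rest silent.
    """
    if video_seconds not in (5, 10):
        raise ValueError("duration must be 5 or 10")
    if not 3 <= audio_seconds <= 30:
        raise ValueError("audio must be 3 to 30 seconds long")
    audible = min(audio_seconds, video_seconds)
    return audible, video_seconds - audible
```

For the example in the text, `audio_layout(3, 5)` gives 3 seconds of sound followed by 2 seconds of silence.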
audio
string
Optional
Supported only by wan2.5-t2v-preview. Specifies whether to add audio automatically. audio_url takes priority over audio: this parameter takes effect only when audio_url is empty.
true: default value. Audio is added to the video automatically.
false: no audio is added.
size
string
Optional
Video resolution, written as width*height.
480P tier: available resolutions and their aspect ratios:
832*480: 16:9.
480*832: 9:16.
624*624: 1:1.
720P tier: available resolutions and their aspect ratios:
1280*720: 16:9.
720*1280: 9:16.
960*960: 1:1.
1088*832: 4:3.
832*1088: 3:4.
1080P tier: available resolutions and their aspect ratios:
1920*1080: 16:9.
1080*1920: 9:16.
1440*1440: 1:1.
1632*1248: 4:3.
1248*1632: 3:4.
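Because size must be one of the exact strings listed above, a lookup table can help avoid typos. The sketch below simply transcribes that list; the names `SIZES` and `pick_size` are hypothetical and not part of the API.

```python
# Resolution tier -> aspect ratio -> size string, transcribed from the list above.
SIZES = {
    "480P": {"16:9": "832*480", "9:16": "480*832", "1:1": "624*624"},
    "720P": {
        "16:9": "1280*720",
        "9:16": "720*1280",
        "1:1": "960*960",
        "4:3": "1088*832",
        "3:4": "832*1088",
    },
    "1080P": {
        "16:9": "1920*1080",
        "9:16": "1080*1920",
        "1:1": "1440*1440",
        "4:3": "1632*1248",
        "3:4": "1248*1632",
    },
}


def pick_size(tier: str, aspect_ratio: str) -> str:
    """Return the size string for a tier and aspect ratio, or raise ValueError."""
    try:
        return SIZES[tier][aspect_ratio]
    except KeyError:
        raise ValueError(f"no size listed for {tier} at {aspect_ratio}") from None
```

Note that the 480P tier lists no 4:3 or 3:4 option, so `pick_size("480P", "4:3")` raises `ValueError`.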
watermark
boolean
Optional
Specifies whether to add a watermark. The watermark appears in the lower-right corner and reads "Generated by AI".
template
string
Optional
negative_prompt
string
Optional
The negative prompt describes content you do not want to appear in the video, helping you constrain the result.
It supports both Chinese and English, with a maximum length of 500 characters. Content beyond that limit is truncated.
Example: low resolution, errors, worst quality, low quality, defects, extra fingers, bad proportions.
prompt_extend
boolean
Optional
Specifies whether prompt rewriting is enabled. When enabled, a large language model (LLM) intelligently rewrites the input prompt. This significantly improves results for shorter prompts but increases processing time.
seed
integer
Optional
A random seed used to control the randomness of the generated content. The value must be in the range [0, 2147483647].
If this parameter is not provided, the algorithm automatically generates a random seed. To keep the generated content relatively stable, reuse the same seed value.
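The limits described for the body parameters can be checked on the client before a request is sent. The following is a minimal sketch under stated assumptions: `build_payload` is a hypothetical helper, and raising an error for over-long prompts (instead of letting the service truncate them) is a design choice made here, not something the API requires.

```python
from typing import Any, Dict, Optional

MAX_PROMPT_CHARS = 800           # documented prompt limit
MAX_NEGATIVE_PROMPT_CHARS = 500  # documented negative_prompt limit
MAX_SEED = 2147483647            # documented upper bound of the seed range
ALLOWED_DURATIONS = (5, 10)      # documented enum values for duration


def build_payload(
    prompt: str,
    model: str = "wan2.5-t2v-preview",
    *,
    duration: Optional[int] = None,
    size: Optional[str] = None,
    negative_prompt: Optional[str] = None,
    prompt_extend: Optional[bool] = None,
    watermark: Optional[bool] = None,
    seed: Optional[int] = None,
    audio_url: Optional[str] = None,
) -> Dict[str, Any]:
    """Build the JSON body, rejecting values outside the documented limits."""
    if not prompt:
        raise ValueError("prompt is required")
    # len() counts Unicode code points, so a Chinese character counts as one.
    if len(prompt) > MAX_PROMPT_CHARS:
        raise ValueError("prompt exceeds 800 characters and would be truncated")
    if negative_prompt is not None and len(negative_prompt) > MAX_NEGATIVE_PROMPT_CHARS:
        raise ValueError("negative_prompt exceeds 500 characters and would be truncated")
    if duration is not None and duration not in ALLOWED_DURATIONS:
        raise ValueError("duration must be 5 or 10")
    if seed is not None and not 0 <= seed <= MAX_SEED:
        raise ValueError("seed must be in the range [0, 2147483647]")

    optional = {
        "duration": duration,
        "size": size,
        "negative_prompt": negative_prompt,
        "prompt_extend": prompt_extend,
        "watermark": watermark,
        "seed": seed,
        "audio_url": audio_url,
    }
    payload: Dict[str, Any] = {"model": model, "prompt": prompt}
    # Leave out anything the caller did not set so the service defaults apply.
    payload.update({key: value for key, value in optional.items() if value is not None})
    return payload
```

Passing the same seed on repeated calls, for example `build_payload(prompt, seed=42)`, follows the advice above for keeping results relatively stable.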
Example
{
    "model": "wan2.5-t2v-preview",
    "prompt": "An epically cute scene. A tiny cartoon kitten general in detailed golden armor and an oversized helmet stands bravely on a cliff. Riding a small but heroic warhorse, he declares: \"Dark clouds gather above the Snow Mountain, and from the lone city we gaze toward Yumenguan. After a hundred battles in yellow sand, the golden armor is worn through; we will not return until Loulan is broken.\" Below the cliff, a vast and endless army of mice charges forward with improvised weapons. It is a dramatic large-scale battle scene inspired by ancient Chinese war epics. Dark clouds hang above the distant Snow Mountain, blending comedy, cuteness, and epic grandeur.",
    "audio_url": "https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20250923/hbiayh/%E4%BB%8E%E5%86%9B%E8%A1%8C.mp3",
    "size": "832*480",
    "prompt_extend": true,
    "duration": 10
}

Example Request

Shell

curl --location --request POST '/v2/videos/generations' \
--header 'Authorization: Bearer {{YOUR_API_KEY}}' \
--header 'Content-Type: application/json' \
--data-raw '{
    "model": "wan2.5-t2v-preview",
    "prompt": "An epically cute scene. A tiny cartoon kitten general in detailed golden armor and an oversized helmet stands bravely on a cliff. Riding a small but heroic warhorse, he declares: \"Dark clouds gather above the Snow Mountain, and from the lone city we gaze toward Yumenguan. After a hundred battles in yellow sand, the golden armor is worn through; we will not return until Loulan is broken.\" Below the cliff, a vast and endless army of mice charges forward with improvised weapons. It is a dramatic large-scale battle scene inspired by ancient Chinese war epics. Dark clouds hang above the distant Snow Mountain, blending comedy, cuteness, and epic grandeur.",
    "audio_url": "https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20250923/hbiayh/%E4%BB%8E%E5%86%9B%E8%A1%8C.mp3",
    "size": "832*480",
    "prompt_extend": true,
    "duration": 10
}'
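The same request can be prepared from Python with only the standard library. This is a sketch, not an official client: `build_request` is a hypothetical helper, and the base URL is a placeholder because this page gives only the path `/v2/videos/generations`, not the host.

```python
import json
import urllib.request

ENDPOINT_PATH = "/v2/videos/generations"


def build_request(base_url: str, api_key: str, payload: dict) -> urllib.request.Request:
    """Prepare (but do not send) the POST request shown in the curl example."""
    return urllib.request.Request(
        url=base_url.rstrip("/") + ENDPOINT_PATH,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Usage (placeholder host; substitute the service's real base URL and your key):
#   req = build_request("https://api.example.com", "YOUR_API_KEY",
#                       {"model": "wan2.5-t2v-preview", "prompt": "...", "duration": 10})
#   with urllib.request.urlopen(req) as resp:
#       body = resp.read().decode("utf-8")
```

Separating request construction from sending keeps the example testable without network access.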

Response

🟢 200 Success

Content Type: application/json

Response Schema

task_id
string
Required

Example

{
    "task_id": "e7bed961-d1b9-4b3f-8ef9-5f441bde28c8"
}
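A client will typically want to pull task_id out of the response body and fail loudly if it is missing. A minimal sketch, assuming the body is the JSON object shown above; `extract_task_id` is a hypothetical helper name.

```python
import json


def extract_task_id(body: str) -> str:
    """Return the task_id from a 200 response body, or raise ValueError."""
    data = json.loads(body)
    if not isinstance(data, dict):
        raise ValueError("response body is not a JSON object")
    task_id = data.get("task_id")
    if not isinstance(task_id, str) or not task_id:
        raise ValueError("response has no task_id")
    return task_id
```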
