Submit a video task
Submit an async video-generation task.
The request body shape depends on the model prefix:
- kling/...: Kling video request. Body fields are prompt + image + metadata.{mode, aspect_ratio, duration, image_list, video_list, sound, multi_shot, ...}. The endpoint variant (text-to-video / image-to-video / Omni-Video) is selected by which metadata fields you supply.
- byteplus/...: Seedance video request. Body fields are prompt + metadata.{content[], ratio, duration, generate_audio, watermark, seed, service_tier, return_last_frame, callback_url, resolution}. The variant (text-to-video / image-to-video / multimodal reference / video editing) is selected by which content[] items and role markers you supply.
Pick the schema variant from the request body dropdown below to see each shape’s fields and try them in the playground.
Authorizations
OrcaRouter API keys look like sk-orca-.... Pass them in the
Authorization: Bearer sk-orca-... header.
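As a rough illustration of the header format, the sketch below builds (but does not send) a submit request with Python's standard library. The base URL and the placeholder key are hypothetical, and the top-level `model` key is an assumption; only the `Authorization: Bearer` header shape and the `/v1/video/generations` path come from this page.

```python
import json
import urllib.request

# Hypothetical host: replace with the real API base URL for your account.
BASE_URL = "https://api.example.com"


def build_submit_request(api_key: str, body: dict) -> urllib.request.Request:
    """Build a POST request for /v1/video/generations without sending it."""
    return urllib.request.Request(
        url=f"{BASE_URL}/v1/video/generations",
        data=json.dumps(body).encode("utf-8"),
        headers={
            # Key format per this page: sk-orca-...
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# "model" as the top-level key name is an assumption, not confirmed here.
request = build_submit_request(
    "sk-orca-PLACEHOLDER",
    {"model": "kling/kling-v3", "prompt": "A lighthouse at dusk"},
)
print(request.get_method(), request.full_url)
```

Passing the returned object to `urllib.request.urlopen` would perform the actual call; that step is left out so the sketch runs without network access.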
Body
- Kling video request
- Seedance video request
Pick the variant matching your model prefix:
- kling/... → Kling video request
- byteplus/... → Seedance video request
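The prefix rule can be sketched as a small lookup. The function name `schema_variant` and the `byteplus/example-model` name used in the usage line are hypothetical; only the two prefixes and their variant names come from this page.

```python
def schema_variant(model: str) -> str:
    """Return the request-body variant implied by the model's namespace prefix."""
    if model.startswith("kling/"):
        return "Kling video request"
    if model.startswith("byteplus/"):
        return "Seedance video request"
    raise ValueError(f"unrecognized model prefix: {model!r}")


print(schema_variant("kling/kling-v3-omni"))
# "byteplus/example-model" is a made-up name used only to show the prefix match.
print(schema_variant("byteplus/example-model"))
```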
Kling video model (customer-facing name with kling/ namespace
prefix). The endpoint Kling actually serves (text2video /
image2video / omni-video) is determined by the metadata fields
you pass, not by which model name you pick — but only
kling/kling-video-o1 and kling/kling-v3-omni accept the
multi-source reference fields (image_list / video_list).
Allowed values: kling/kling-v2-master, kling/kling-v2-1-master, kling/kling-v2-5-turbo, kling/kling-v2-6, kling/kling-v3, kling/kling-video-o1, kling/kling-v3-omni
Example: "kling/kling-v3-omni"
prompt: Required. Kling rejects empty / whitespace-only prompts.
image: Optional first-frame image for image-to-video (URL or base64
data URI). Can be combined with metadata.image_tail (the last-frame
image) for first/last-frame image-to-video.
metadata: Free-form parameter bag honored by Kling.
Universal (all endpoints):
- mode (string): std (720P) / pro (1080P) / 4k. 4k only on kling/kling-v3 and kling/kling-v3-omni. Default is std for text/image-to-video, pro for Omni-Video.
- aspect_ratio (string): 16:9 / 9:16 / 1:1.
- duration (string): Length in seconds, default "5". kling/kling-v3-omni and kling/kling-v3 accept "3"-"15"; the v2 family and kling/kling-video-o1 accept "5" or "10".
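The per-model limits on the universal fields can be checked client-side before submitting. This is a minimal sketch, not the server's validation logic: the helper name is hypothetical, and it assumes that "3"-"15" means every whole number of seconds from 3 to 15 inclusive and that the three listed aspect ratios are the only accepted ones.

```python
VALID_MODES = {"std", "pro", "4k"}
VALID_RATIOS = {"16:9", "9:16", "1:1"}
# Models that accept 4k mode and the wider duration range, per the list above.
V3_MODELS = {"kling/kling-v3", "kling/kling-v3-omni"}


def check_universal_metadata(model: str, metadata: dict) -> list:
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []

    mode = metadata.get("mode")
    if mode is not None and mode not in VALID_MODES:
        problems.append(f"unknown mode {mode!r}")
    elif mode == "4k" and model not in V3_MODELS:
        problems.append(f"mode '4k' is not available on {model}")

    ratio = metadata.get("aspect_ratio")
    if ratio is not None and ratio not in VALID_RATIOS:
        problems.append(f"unknown aspect_ratio {ratio!r}")

    # duration is a string field; the documented default is "5".
    duration = metadata.get("duration", "5")
    if model in V3_MODELS:
        allowed = {str(n) for n in range(3, 16)}  # assumption: whole seconds 3..15
    else:
        allowed = {"5", "10"}
    if duration not in allowed:
        problems.append(f"duration {duration!r} is not accepted by {model}")

    return problems


print(check_universal_metadata("kling/kling-v2-6", {"mode": "4k", "duration": "7"}))
```

Returning a list rather than raising on the first problem lets a caller report every issue in one pass.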
Text-to-video and image-to-video only (NOT Omni-Video):
- negative_prompt (string): Things to avoid. Max 2500 chars.
- cfg_scale (number): Range [0, 1], default 0.5. Higher = stricter prompt adherence. Not supported on v2.x models.
- image_tail (string): Last-frame image for first/last-frame image-to-video.
Multi-source reference (Omni endpoint, kling/kling-video-o1
/ kling/kling-v3-omni only):
- image_list (array): [{image_url, type}], a multi-image reference. Refer to images in the prompt with <<<image_1>>> etc.
- video_list (array): [{video_url, refer_type, keep_original_sound}]. On kling/kling-v3-omni limited to 3-10s and std/pro mode (not 4K).
Advanced features (model-dependent — see Capability Map):
- multi_shot (bool) + shot_type (customize / intelligence) + multi_prompt ([{index, prompt, duration}]): multi-shot mode. Available on kling/kling-v3 and kling/kling-v3-omni.
- sound (string): "on" / "off", native audio. Available on kling/kling-v3 and kling/kling-v3-omni (any mode), and kling/kling-v2-6 (pro mode only).
- watermark_info (object): {enabled: bool}. Universal.
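Putting the fields together, a Kling text-to-video body might be assembled as below. This is a sketch under assumptions: the helper name `build_kling_body` is hypothetical, the top-level `model` key is assumed, and the prompt text is illustrative. The `prompt`, `image`, and `metadata` keys and the metadata field names come from this page.

```python
import json


def build_kling_body(prompt, model="kling/kling-v3-omni", image=None, **metadata):
    """Assemble a Kling request body; keyword arguments become metadata fields."""
    # Kling rejects empty / whitespace-only prompts, so fail early.
    if not prompt or not prompt.strip():
        raise ValueError("prompt must not be empty or whitespace-only")
    body = {"model": model, "prompt": prompt}  # "model" key name is an assumption
    if image is not None:
        body["image"] = image  # URL or base64 data URI (first frame)
    if metadata:
        body["metadata"] = metadata
    return body


body = build_kling_body(
    "A paper boat drifting down a rainy street",
    mode="pro",
    aspect_ratio="16:9",
    duration="5",  # duration is a string, not a number
    sound="on",
    watermark_info={"enabled": False},
)
print(json.dumps(body, indent=2))
```

Because no `image`, `image_list`, or `video_list` is supplied, this body corresponds to the text-to-video case described above.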
Response
Task accepted (async — poll /v1/video/generations/{task_id})
OpenAI-style submit response. Returned by POST /v1/video/generations
(and the OpenAI-symmetric alias POST /v1/videos).
Task ID. Same value as task_id (kept for legacy clients).
video
Model name as the customer sent it (alias / namespace prefix preserved, not the upstream-resolved name).
Always queued on a successful submit.
queued
0 <= x <= 100
Unix timestamp when the task was submitted.
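Since the task is asynchronous, the next step after a successful submit is to poll the path given above. A minimal sketch of deriving that path from the submit response follows; the helper name and the sample payload are hypothetical, and only the `task_id` field and the `/v1/video/generations/{task_id}` path are taken from this page.

```python
from urllib.parse import quote


def poll_path(submit_response: dict) -> str:
    """Return the polling path for a task from its submit response."""
    task_id = submit_response["task_id"]
    # Percent-encode the ID so unexpected characters cannot alter the path.
    return f"/v1/video/generations/{quote(str(task_id), safe='')}"


# Hypothetical response payload for illustration only.
print(poll_path({"task_id": "task_123"}))
```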
