POST /video/generations
Submit a video task
curl --request POST \
  --url https://api.orcarouter.ai/v1/video/generations \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "model": "kling/kling-v3-omni",
  "prompt": "<string>",
  "image": "<string>",
  "metadata": {}
}
'
{
  "id": "<string>",
  "task_id": "<string>",
  "model": "<string>",
  "progress": 50,
  "created_at": 123
}
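
For readers who are not using curl, the same request can be sketched with Python's standard library. This is a minimal sketch, not an official client: the helper names `build_submit_request` and `submit` are hypothetical, and error handling (non-200 responses, timeouts) is left out.

```python
import json
import urllib.request

API_URL = "https://api.orcarouter.ai/v1/video/generations"


def build_submit_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request from the curl example."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def submit(api_key: str, payload: dict) -> dict:
    """Send the request and return the decoded JSON body of the response."""
    with urllib.request.urlopen(build_submit_request(api_key, payload)) as resp:
        return json.load(resp)
```

Keeping request construction separate from sending makes it easy to inspect the headers and body before anything goes over the network.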

Authorizations

Authorization
string
header
required

OrcaRouter API keys look like sk-orca-.... Pass them in the Authorization: Bearer sk-orca-... header.

Body

application/json

Pick the variant matching your model prefix:

  • kling/...: Kling video request
  • byteplus/...: Seedance video request

model
enum<string>
required

Kling video model (customer-facing name with kling/ namespace prefix). The endpoint Kling actually serves (text2video / image2video / omni-video) is determined by the metadata fields you pass, not by which model name you pick — but only kling/kling-video-o1 and kling/kling-v3-omni accept the multi-source reference fields (image_list / video_list).

Available options:
kling/kling-v2-master,
kling/kling-v2-1-master,
kling/kling-v2-5-turbo,
kling/kling-v2-6,
kling/kling-v3,
kling/kling-video-o1,
kling/kling-v3-omni
Example:

"kling/kling-v3-omni"
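
The model list and the multi-source restriction above can be captured in a small client-side lookup. This is an illustrative sketch: the constant and function names are hypothetical, and the server remains the authority on what it accepts.

```python
# Model names copied from the "Available options" list above.
KLING_MODELS = (
    "kling/kling-v2-master",
    "kling/kling-v2-1-master",
    "kling/kling-v2-5-turbo",
    "kling/kling-v2-6",
    "kling/kling-v3",
    "kling/kling-video-o1",
    "kling/kling-v3-omni",
)

# Only these two accept image_list / video_list.
MULTI_SOURCE_MODELS = frozenset({"kling/kling-video-o1", "kling/kling-v3-omni"})


def accepts_multi_source(model: str) -> bool:
    """Return True if the model accepts the multi-source reference fields."""
    if model not in KLING_MODELS:
        raise ValueError(f"unknown Kling model: {model!r}")
    return model in MULTI_SOURCE_MODELS
```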

prompt
string
required

Required. Kling rejects empty / whitespace-only prompts.

image
string

Optional first-frame image for image-to-video (URL or base64 data URI). Combine with metadata.image_tail to supply both the first and the last frame.
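
The top-level fields described so far (model, prompt, image) can be assembled with a small helper that catches an empty prompt before the request is sent. The function name `build_payload` is hypothetical; the check only mirrors the rule stated above and does not replace server-side validation.

```python
from typing import Optional


def build_payload(
    model: str,
    prompt: str,
    image: Optional[str] = None,
    metadata: Optional[dict] = None,
) -> dict:
    """Assemble a request body, rejecting empty or whitespace-only prompts."""
    if not prompt or not prompt.strip():
        raise ValueError("prompt must contain non-whitespace text")
    payload = {"model": model, "prompt": prompt}
    if image is not None:
        payload["image"] = image  # URL or base64 data URI
    if metadata:
        payload["metadata"] = metadata
    return payload
```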

metadata
object

Free-form parameter bag honored by Kling.

Universal (all endpoints):

  • mode (string): std (720P) / pro (1080P) / 4k. 4k only on kling/kling-v3 and kling/kling-v3-omni. Default is std for text/image-to-video, pro for Omni-Video.
  • aspect_ratio (string): 16:9 / 9:16 / 1:1.
  • duration (string): Length in seconds, default "5". kling/kling-v3-omni and kling/kling-v3 accept "3"-"15"; v2 family and kling/kling-video-o1 accept "5" or "10".
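
The universal rules above lend themselves to a pre-flight check. The sketch below is hypothetical (the names are not part of the API) and makes one assumption worth flagging: it reads "3"-"15" as every whole number of seconds from 3 to 15.

```python
V3_FAMILY = frozenset({"kling/kling-v3", "kling/kling-v3-omni"})
MODES = frozenset({"std", "pro", "4k"})
ASPECT_RATIOS = frozenset({"16:9", "9:16", "1:1"})


def allowed_durations(model: str) -> set:
    """Durations (as strings) the model accepts, per the list above."""
    if model in V3_FAMILY:
        # Assumption: every whole second in the 3-15 range is valid.
        return {str(seconds) for seconds in range(3, 16)}
    return {"5", "10"}  # v2 family and kling/kling-video-o1


def check_universal(model: str, metadata: dict) -> None:
    """Raise ValueError if mode, aspect_ratio, or duration breaks a documented rule."""
    mode = metadata.get("mode")
    if mode is not None:
        if mode not in MODES:
            raise ValueError(f"unknown mode: {mode!r}")
        if mode == "4k" and model not in V3_FAMILY:
            raise ValueError("4k is only available on kling/kling-v3 and kling/kling-v3-omni")
    ratio = metadata.get("aspect_ratio")
    if ratio is not None and ratio not in ASPECT_RATIOS:
        raise ValueError(f"unsupported aspect_ratio: {ratio!r}")
    duration = metadata.get("duration", "5")  # documented default
    if duration not in allowed_durations(model):
        raise ValueError(f"duration {duration!r} is not accepted by {model} (must be a string)")
```

Note that duration is a string, so passing the integer 5 fails this check by design.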

Text-to-video and image-to-video only (NOT Omni-Video):

  • negative_prompt (string): Things to avoid. Max 2500 chars.
  • cfg_scale (number): Range [0, 1], default 0.5. Higher = stricter prompt adherence. Not supported on v2.x models.
  • image_tail (string): Last-frame image for first/last-frame image-to-video.
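
A similar check can cover the text-to-video and image-to-video fields. This is a sketch under a stated assumption: "v2.x models" is taken to mean every model name starting with kling/kling-v2, which the text above does not spell out. The function name and the example.com URLs are placeholders.

```python
def check_t2v_i2v(model: str, metadata: dict) -> None:
    """Raise ValueError if negative_prompt or cfg_scale breaks a documented limit."""
    negative = metadata.get("negative_prompt")
    if negative is not None and len(negative) > 2500:
        raise ValueError("negative_prompt is limited to 2500 characters")
    cfg = metadata.get("cfg_scale")
    if cfg is not None:
        # Assumption: "v2.x" covers all names with the kling/kling-v2 prefix.
        if model.startswith("kling/kling-v2"):
            raise ValueError("cfg_scale is not supported on v2.x models")
        if not 0 <= cfg <= 1:
            raise ValueError("cfg_scale must be within [0, 1]")


# First/last-frame request: top-level image plus metadata.image_tail.
first_last_frame = {
    "model": "kling/kling-v3",
    "prompt": "The camera slowly pulls back from the doorway",
    "image": "https://example.com/first.png",  # placeholder URL
    "metadata": {
        "image_tail": "https://example.com/last.png",  # placeholder URL
        "cfg_scale": 0.5,
    },
}
check_t2v_i2v(first_last_frame["model"], first_last_frame["metadata"])
```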

Multi-source reference (Omni endpoint, kling/kling-video-o1 / kling/kling-v3-omni only):

  • image_list (array): [{image_url, type}] — multi-image reference. Refer to images in prompt with <<<image_1>>> etc.
  • video_list (array): [{video_url, refer_type, keep_original_sound}]. On kling/kling-v3-omni limited to 3-10s and std/pro mode (not 4K).
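
A multi-image reference request might be assembled as follows. This is an illustrative sketch with several assumptions: the helper names are hypothetical, the numbering in <<<image_N>>> is assumed to follow list order starting at 1, and the `type` key is omitted from the entries because its accepted values are not listed here.

```python
OMNI_MODELS = frozenset({"kling/kling-video-o1", "kling/kling-v3-omni"})


def image_ref(position: int) -> str:
    """Placeholder for the image at the given 1-based position in image_list."""
    if position < 1:
        raise ValueError("image positions start at 1")
    return f"<<<image_{position}>>>"


def build_omni_payload(model: str, prompt: str, image_list: list) -> dict:
    """Assemble a request that passes reference images via metadata.image_list."""
    if model not in OMNI_MODELS:
        raise ValueError(f"{model} does not accept image_list / video_list")
    return {
        "model": model,
        "prompt": prompt,
        "metadata": {"image_list": list(image_list)},
    }


payload = build_omni_payload(
    "kling/kling-v3-omni",
    f"The person in {image_ref(1)} walks through the street shown in {image_ref(2)}",
    [
        {"image_url": "https://example.com/person.png"},  # placeholder URL
        {"image_url": "https://example.com/street.png"},  # placeholder URL
    ],
)
```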

Advanced features (model-dependent — see Capability Map):

  • multi_shot (bool) + shot_type (customize / intelligence) + multi_prompt ([{index, prompt, duration}]): multi-shot mode. Available on kling/kling-v3 and kling/kling-v3-omni.
  • sound (string): "on" / "off" — native audio. Available on kling/kling-v3 and kling/kling-v3-omni (any mode), and kling/kling-v2-6 (pro mode only).
  • watermark_info (object): {enabled: bool}. Universal.
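
The sound availability rule above depends on both model and mode, so it is easy to get wrong. A minimal sketch of the rule as stated (the function name is hypothetical):

```python
def sound_supported(model: str, mode: str) -> bool:
    """Return True if metadata.sound = "on" is available for this model and mode."""
    if model in ("kling/kling-v3", "kling/kling-v3-omni"):
        return True  # any mode
    if model == "kling/kling-v2-6":
        return mode == "pro"  # pro mode only
    return False
```

Because the default mode for text-to-video and image-to-video is std, a kling/kling-v2-6 request that wants audio has to set mode to "pro" explicitly.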

Response

200 - application/json

Task accepted (async — poll /v1/video/generations/{task_id})

OpenAI-style submit response. Returned by POST /v1/video/generations (and the OpenAI-symmetric alias POST /v1/videos).

id
string

Task ID. Same value as task_id (kept for legacy clients).

task_id
string

object
enum<string>

Available options:
video

model
string

Model name as the customer sent it (alias / namespace prefix preserved, not the upstream-resolved name).

status
enum<string>

Always queued on a successful submit.

Available options:
queued

progress
integer

Required range: 0 <= x <= 100

created_at
integer

Unix timestamp when the task was submitted.
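
After a successful submit, a client can sanity-check the response against the fields above and derive the polling URL. This is a sketch only: the function name is hypothetical, and it stops at building the URL because the shape of the polling response is not described in this section.

```python
from urllib.parse import quote

BASE_URL = "https://api.orcarouter.ai/v1/video/generations"


def poll_url_from_submit(body: dict) -> str:
    """Validate a submit response and return the URL to poll for that task."""
    task_id = body["task_id"]
    if body.get("id") != task_id:
        raise ValueError("id and task_id are documented to carry the same value")
    # status is always "queued" on a successful submit; tolerate its absence.
    if body.get("status", "queued") != "queued":
        raise ValueError(f"unexpected status on submit: {body['status']!r}")
    progress = body.get("progress")
    if progress is not None and not 0 <= progress <= 100:
        raise ValueError("progress must be within 0..100")
    return f"{BASE_URL}/{quote(task_id, safe='')}"
```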