OrcaRouter exposes the Seedance video models on the same submit-and-poll
endpoint as Kling. You send model: byteplus/dreamina-seedance-2-0-260128,
OrcaRouter routes the request to the upstream Ark
/contents/generations/tasks API, and you poll the returned task ID
through OrcaRouter until the task finishes (typically 30 seconds to 5 minutes,
depending on duration, resolution, and generate_audio).
Currently available model. Only Seedance 2.0 is provisioned right now,
under the backend name byteplus/dreamina-seedance-2-0-260128. The
capability table below lists the rest of the Seedance family for
reference, but they are not yet selectable in the playground or routable
through OrcaRouter — use byteplus/dreamina-seedance-2-0-260128 for every
request for now.
The submit endpoint POST /v1/video/generations and the fetch endpoint
GET /v1/video/generations/{task_id} are shared with Kling.
What changes is the request body: Kling uses prompt + image + metadata.{mode, aspect_ratio, image_list, ...}, Seedance uses prompt + metadata.{content[], ratio, duration, generate_audio, watermark, ...}. The prefix on model
selects which schema is honored.
Models
| Model | T2V | I2V (first) | I2V (first+last) | Multimodal ref¹ | Video edit² | Generate audio³ | Duration | Available |
|---|---|---|---|---|---|---|---|---|
| byteplus/dreamina-seedance-2-0-260128 (2.0) | ✓ | ✓ | ✓ | ✓ full | ✓ | ✓ | 4 – 15 s | ✓ |
| byteplus/seedance-2.0-fast | ✓ | ✓ | ✓ | ✓ full | ✓ | ✓ | 4 – 15 s | planned |
| byteplus/seedance-1-5-pro | ✓ | ✓ | ✓ | image only | | ✓ | 4 – 12 s | planned |
| byteplus/seedance-1-0-pro | ✓ | ✓ | ✓ | image only | | | 2 – 12 s | planned |
| byteplus/seedance-1-0-pro-fast | ✓ | ✓ | | image only | | | 2 – 12 s | planned |
| byteplus/seedance-1-0-lite-i2v | | ✓ | ✓ | image only | | | 2 – 12 s | planned |
| byteplus/seedance-1-0-lite-t2v | ✓ | | | image only | | | 2 – 12 s | planned |
¹ Multimodal reference = the metadata.content[] array can carry
image_url / video_url / audio_url items with role markers
(reference_image / reference_video / reference_audio).
“Full” means combinations of image + video + audio are accepted.
² Video editing = pass a video_url content item to apply prompt-driven
edits to the source video (subject swap, region inpainting, etc.).
³ Generate audio = upstream auto-generates a soundtrack matching the video.
Toggle via metadata.generate_audio: true.
The submit endpoint is the same for every model — POST /v1/video/generations.
What changes is which metadata fields the upstream honors per the table
above. See the upstream Seedance capability matrix
for the authoritative per-model feature list.
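If you want to catch unsupported combinations before submitting, the table above can be transcribed into a small lookup. This is a minimal client-side sketch, not part of OrcaRouter: the names SeedanceModel, SEEDANCE_MODELS, and check_request are hypothetical, and the values are copied from the Duration, Generate audio, and Available columns.

```python
from typing import NamedTuple


class SeedanceModel(NamedTuple):
    min_duration: int     # seconds, from the Duration column
    max_duration: int     # seconds, from the Duration column
    generate_audio: bool  # from the Generate audio column
    available: bool       # False means "planned"


# Transcribed by hand from the capability table; update it when the table changes.
SEEDANCE_MODELS = {
    "byteplus/dreamina-seedance-2-0-260128": SeedanceModel(4, 15, True, True),
    "byteplus/seedance-2.0-fast": SeedanceModel(4, 15, True, False),
    "byteplus/seedance-1-5-pro": SeedanceModel(4, 12, True, False),
    "byteplus/seedance-1-0-pro": SeedanceModel(2, 12, False, False),
    "byteplus/seedance-1-0-pro-fast": SeedanceModel(2, 12, False, False),
    "byteplus/seedance-1-0-lite-i2v": SeedanceModel(2, 12, False, False),
    "byteplus/seedance-1-0-lite-t2v": SeedanceModel(2, 12, False, False),
}


def check_request(model, duration, generate_audio=False):
    """Return a list of problems; an empty list means nothing was flagged."""
    spec = SEEDANCE_MODELS.get(model)
    if spec is None:
        return [f"unknown model: {model}"]
    problems = []
    if not spec.available:
        problems.append(f"{model} is planned, not yet routable")
    if not spec.min_duration <= duration <= spec.max_duration:
        problems.append(
            f"duration {duration}s outside {spec.min_duration}-{spec.max_duration}s"
        )
    if generate_audio and not spec.generate_audio:
        problems.append(f"{model} does not generate audio")
    return problems
```

The upstream remains the authority; a check like this only saves a round trip for mistakes the table already rules out.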
Submit a task
Send a POST to /v1/video/generations with model, prompt, and any
upstream-specific parameters under metadata:
curl https://api.orcarouter.ai/v1/video/generations \
-H "Authorization: Bearer sk-orca-..." \
-H "Content-Type: application/json" \
-d '{
"model": "byteplus/dreamina-seedance-2-0-260128",
"prompt": "A girl holding a fox, the girl opens her eyes, looks gently at the camera, the fox hugs affectionately, the camera slowly pulls out, the girl'\''s hair is blown by the wind",
"metadata": {
"content": [
{ "type": "image_url", "image_url": { "url": "https://example.com/foxgirl.png" } }
],
"ratio": "16:9",
"duration": 5,
"generate_audio": true,
"watermark": false
}
}'
The response carries the task ID in the same envelope as Kling, because
OrcaRouter normalizes it across providers:
{
"id": "task_9q9oz6tjtgABYWC1QIqoz3sscgVz7ycw",
"task_id": "task_9q9oz6tjtgABYWC1QIqoz3sscgVz7ycw",
"object": "video",
"model": "byteplus/dreamina-seedance-2-0-260128",
"status": "queued",
"progress": 0,
"created_at": 1777975188
}
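The same submit call can be made from Python with only the standard library. The sketch below mirrors the curl example; build_submit_request and submit are hypothetical helper names, and the 30-second HTTP timeout is an assumption rather than a documented value.

```python
import json
import urllib.request

API_BASE = "https://api.orcarouter.ai/v1"  # base URL taken from the curl example


def build_submit_request(api_key, model, prompt, metadata=None):
    """Build (but do not send) the POST /v1/video/generations request."""
    body = {"model": model, "prompt": prompt}
    if metadata:
        body["metadata"] = metadata
    return urllib.request.Request(
        f"{API_BASE}/video/generations",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def submit(api_key, model, prompt, metadata=None, timeout=30):
    """Send the request and return task_id from the response envelope."""
    request = build_submit_request(api_key, model, prompt, metadata)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)["task_id"]
```

Splitting request construction from sending keeps the body easy to inspect or log before any quota is reserved.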
OrcaRouter wraps your prompt as the text item inside Seedance’s
content[] array automatically — you don’t need to pass a {type: "text"}
item yourself. Any text item you supply in metadata.content[] is replaced
by your top-level prompt. Other content items (image_url, video_url,
audio_url) pass through unchanged.
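To make the wrapping rule concrete, here is a small model of the behavior described above. It is an illustration only, not OrcaRouter's implementation: wrap_prompt is a hypothetical name, and placing the text item first is an assumption, since the position is not documented.

```python
def wrap_prompt(prompt, content=None):
    """Model the described rule: drop caller text items, add one built from prompt."""
    # Non-text items (image_url, video_url, audio_url) pass through unchanged.
    passthrough = [item for item in (content or []) if item.get("type") != "text"]
    # Assumption: the text item goes first; the docs do not state its position.
    return [{"type": "text", "text": prompt}] + passthrough
```

In practice this means a text item in metadata.content[] is wasted bytes: put the prompt at the top level and keep content[] for media references.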
Body fields
These fields go inside metadata. Combine them as shown in the endpoint variants below.
| Field | Type | Notes |
|---|---|---|
| content | array | Multimodal reference items. Each item: {type, image_url? \| video_url? \| audio_url?, role?}. Skip if pure text-to-video. |
| ratio | string | Aspect ratio. 16:9 / 9:16 / 1:1 / 4:3 / 3:4 / 21:9 / adaptive. adaptive infers from input. |
| duration | integer | Seconds. Allowed range depends on model; see the table above. |
| resolution | string | 480p / 720p / 1080p. Default 720p. 1080p only on seedance-2.0 / seedance-2.0-fast / seedance-1-5-pro / seedance-1-0-pro / seedance-1-0-pro-fast. |
| generate_audio | boolean | Auto-generate a synced soundtrack. Default false. Only on seedance-2.0 / 2.0-fast / 1-5-pro. |
| watermark | boolean | Imprint the upstream watermark. Default upstream-defined. |
| seed | integer | Random seed for reproducibility. |
| service_tier | string | default (online) or flex (offline / lower priority, higher quota). Defaults to default. |
| return_last_frame | boolean | Return the final frame as an image alongside the MP4. Default false. |
| callback_url | string | Webhook URL. Receives status changes instead of (or alongside) polling. |
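A client can reject misspelled field names and out-of-range enumerated values before submitting. The following is a minimal sketch under that assumption: build_metadata is a hypothetical helper, and the allowed values are copied from the table above.

```python
RATIOS = {"16:9", "9:16", "1:1", "4:3", "3:4", "21:9", "adaptive"}
RESOLUTIONS = {"480p", "720p", "1080p"}
SERVICE_TIERS = {"default", "flex"}
ALLOWED_FIELDS = {
    "content", "ratio", "duration", "resolution", "generate_audio",
    "watermark", "seed", "service_tier", "return_last_frame", "callback_url",
}


def build_metadata(**fields):
    """Validate field names and enumerated values, then drop unset (None) fields."""
    unknown = set(fields) - ALLOWED_FIELDS
    if unknown:
        raise ValueError(f"unknown metadata fields: {sorted(unknown)}")
    enumerated = {
        "ratio": RATIOS,
        "resolution": RESOLUTIONS,
        "service_tier": SERVICE_TIERS,
    }
    for name, choices in enumerated.items():
        value = fields.get(name)
        if value is not None and value not in choices:
            raise ValueError(f"{name}={value!r} not in {sorted(choices)}")
    # Omitting unset fields lets the documented defaults apply.
    return {key: value for key, value in fields.items() if value is not None}
```

For example, build_metadata(ratio="16:9", duration=5, seed=None) yields {"ratio": "16:9", "duration": 5}, which can be passed as the metadata object of a submit request.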
content[] item shape
Each item in metadata.content is one of four shapes:
{ "type": "image_url", "image_url": { "url": "https://..." }, "role": "first_frame" }
{ "type": "video_url", "video_url": { "url": "https://..." }, "role": "reference_video" }
{ "type": "audio_url", "audio_url": { "url": "https://..." }, "role": "reference_audio" }
{ "type": "text", "text": "..." } // automatically replaced by top-level prompt
role values:
| role | Purpose |
|---|---|
| first_frame | Anchor this image as the first frame of the generated video. |
| end_frame | Anchor this image as the last frame (use with first_frame for first+last frame i2v). |
| reference_image | Style / subject reference (Multimodal reference variant; can pass multiple). |
| reference_video | Style / motion reference, or the source video for editing / extension. |
| reference_audio | Background music or voice reference (audio-video generation). |
Reference items inside the prompt with [Image 1], [Video 1], [Audio 1]
syntax. The index matches the array order (1-based, scoped per type).
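The per-type, 1-based numbering can be derived mechanically from the array order. The sketch below shows one way to compute the placeholder for each item; placeholder_labels is a hypothetical helper, and it assumes every item of a type counts toward the index regardless of its role.

```python
LABEL_BY_TYPE = {"image_url": "Image", "video_url": "Video", "audio_url": "Audio"}


def placeholder_labels(content):
    """Return the prompt placeholder for each content item, in array order."""
    counters = {}
    labels = []
    for item in content:
        kind = LABEL_BY_TYPE.get(item.get("type"))
        if kind is None:
            labels.append(None)  # e.g. text items have no placeholder
            continue
        # Indices are 1-based and counted separately for each type.
        counters[kind] = counters.get(kind, 0) + 1
        labels.append(f"[{kind} {counters[kind]}]")
    return labels
```

Two images followed by a video and an audio clip map to [Image 1], [Image 2], [Video 1], and [Audio 1], in that order.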
Poll for results
Use the task ID returned at submit time:
curl https://api.orcarouter.ai/v1/video/generations/task_9q9oz6tjtgABYWC1QIqoz3sscgVz7ycw \
-H "Authorization: Bearer sk-orca-..."
Response shape is wrapped (identical to Kling):
{
"code": "success",
"message": "",
"data": {
"task_id": "task_9q9oz6tjtgABYWC1QIqoz3sscgVz7ycw",
"status": "SUCCESS",
"progress": "100%",
"result_url": "https://ark-content-generation-ap-southeast-1.tos-ap-southeast-1.volces.com/.../video.mp4",
"submit_time": 1777975188,
"start_time": 1777975241,
"finish_time": 1777975277,
"fail_reason": ""
}
}
Status values are normalized to uppercase across providers:
| Status | Upstream Seedance status | Meaning |
|---|---|---|
| NOT_START | (transient) | Task row created, not yet dispatched |
| SUBMITTED | queued | Sent to upstream, waiting in the queue |
| IN_PROGRESS | running | Upstream is rendering |
| SUCCESS | succeeded | Done. data.result_url carries the MP4 |
| FAILURE | failed | Failed. data.fail_reason has the reason |
In the poll response, data.progress is a percent string ("50%", "100%"), not an integer.
Poll every 5 to 10 seconds. A 5-second 720p clip typically completes in 30 to 60
seconds; 1080p with audio, 15-second clips, or multimodal-reference clips can take
3 to 5 minutes.
The result_url is an upstream-signed TOS URL with a short TTL — download
or rehost promptly if you need long retention.
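A polling loop built on the status values above might look like the sketch below. It is illustrative only: wait_for_video and parse_progress are hypothetical names, fetch is a caller-supplied function that returns the decoded JSON body of GET /v1/video/generations/{task_id}, and the 600-second overall timeout is an assumption, not a documented limit.

```python
import time


def parse_progress(progress):
    """Convert a percent string such as "50%" to the integer 50."""
    return int(progress.rstrip("%"))


def wait_for_video(fetch, task_id, interval=5.0, timeout=600.0, sleep=time.sleep):
    """Poll until SUCCESS or FAILURE, then return data.result_url or raise."""
    deadline = time.monotonic() + timeout
    while True:
        data = fetch(task_id)["data"]
        if data["status"] == "SUCCESS":
            return data["result_url"]
        if data["status"] == "FAILURE":
            raise RuntimeError(f"task {task_id} failed: {data['fail_reason']}")
        # NOT_START, SUBMITTED, and IN_PROGRESS all mean "keep waiting".
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task {task_id} still {data['status']} after {timeout}s")
        sleep(interval)
```

Because the result URL has a short TTL, download or rehost the file as soon as wait_for_video returns instead of storing the URL for later.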
Endpoint variants
All variants share POST /v1/video/generations. Which Seedance feature path
the upstream serves is determined by the metadata.content[] items and
role markers — not by URL.
Text-to-video
Just model + prompt + optional metadata. No content items means pure
text-to-video:
curl https://api.orcarouter.ai/v1/video/generations \
-H "Authorization: Bearer sk-orca-..." \
-H "Content-Type: application/json" \
-d '{
"model": "byteplus/dreamina-seedance-2-0-260128",
"prompt": "Photorealistic style: under a clear blue sky, a vast expanse of white daisy fields stretches out. The camera gradually zooms in on a single daisy with glistening dewdrops on its petals.",
"metadata": {
"ratio": "16:9",
"duration": 5,
"watermark": true
}
}'
Image-to-video — first frame
Pass one image item with role: "first_frame":
curl https://api.orcarouter.ai/v1/video/generations \
-H "Authorization: Bearer sk-orca-..." \
-H "Content-Type: application/json" \
-d '{
"model": "byteplus/dreamina-seedance-2-0-260128",
"prompt": "the cat starts dancing energetically",
"metadata": {
"content": [
{ "type": "image_url", "image_url": { "url": "https://example.com/cat.png" }, "role": "first_frame" }
],
"ratio": "adaptive",
"duration": 5,
"generate_audio": true
}
}'
Image-to-video — first and last frame
Two image items, one each for first_frame and end_frame:
curl https://api.orcarouter.ai/v1/video/generations \
-H "Authorization: Bearer sk-orca-..." \
-H "Content-Type: application/json" \
-d '{
"model": "byteplus/dreamina-seedance-2-0-260128",
"prompt": "Create a 360-degree orbiting camera shot from start to end frame.",
"metadata": {
"content": [
{ "type": "image_url", "image_url": { "url": "https://example.com/start.jpg" }, "role": "first_frame" },
{ "type": "image_url", "image_url": { "url": "https://example.com/end.jpg" }, "role": "end_frame" }
],
"ratio": "16:9",
"duration": 6
}
}'
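Both image-to-video variants differ only in whether an end_frame item is present, so a single helper can build either content[] array. This is a convenience sketch; frame_items is a hypothetical name.

```python
def frame_items(first_url, end_url=None):
    """Build content[] items for first-frame or first+last-frame image-to-video."""
    items = [
        {"type": "image_url", "image_url": {"url": first_url}, "role": "first_frame"}
    ]
    if end_url is not None:
        items.append(
            {"type": "image_url", "image_url": {"url": end_url}, "role": "end_frame"}
        )
    return items
```

Calling frame_items("https://example.com/start.jpg", "https://example.com/end.jpg") reproduces the content array from the curl example above.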
Multimodal reference — image + video + audio
Combine reference_image / reference_video / reference_audio items.
Reference them in the prompt with [Image N] / [Video N] / [Audio N]
indices (1-based, per type):
curl https://api.orcarouter.ai/v1/video/generations \
-H "Authorization: Bearer sk-orca-..." \
-H "Content-Type: application/json" \
-d '{
"model": "byteplus/dreamina-seedance-2-0-260128",
"prompt": "Use the first-person POV framing from [Video 1] throughout, and use [Audio 1] as the background music. First-person POV fruit tea promotional ad: [Image 1] hands pick a dew-covered apple; [Image 2] holds the finished drink up to the camera.",
"metadata": {
"content": [
{ "type": "image_url", "image_url": { "url": "https://example.com/tea_pic1.jpg" }, "role": "reference_image" },
{ "type": "image_url", "image_url": { "url": "https://example.com/tea_pic2.jpg" }, "role": "reference_image" },
{ "type": "video_url", "video_url": { "url": "https://example.com/tea_video1.mp4" }, "role": "reference_video" },
{ "type": "audio_url", "audio_url": { "url": "https://example.com/tea_audio1.mp3" }, "role": "reference_audio" }
],
"ratio": "16:9",
"duration": 11,
"generate_audio": true,
"watermark": false
}
}'
Available on seedance-2.0 and seedance-2.0-fast (full image + video +
audio combinations); seedance-1-5-pro and seedance-1-0-* accept only
reference_image items.
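With several references in play it is easy to mention an index in the prompt that has no matching item. The sketch below scans a prompt for [Image N] / [Video N] / [Audio N] placeholders and reports any without a counterpart in content[]. dangling_references is a hypothetical helper, and it assumes indices count every item of a type in array order.

```python
import re

KIND_BY_TYPE = {"image_url": "Image", "video_url": "Video", "audio_url": "Audio"}
PLACEHOLDER = re.compile(r"\[(Image|Video|Audio) (\d+)\]")


def dangling_references(prompt, content):
    """Return placeholders used in the prompt that no content item backs."""
    counts = {}
    for item in content:
        kind = KIND_BY_TYPE.get(item.get("type"))
        if kind:
            counts[kind] = counts.get(kind, 0) + 1
    missing = []
    for kind, index in PLACEHOLDER.findall(prompt):
        # Valid indices run from 1 to the number of items of that type.
        if not 1 <= int(index) <= counts.get(kind, 0):
            missing.append(f"[{kind} {index}]")
    return missing
```

An empty return value means every placeholder in the prompt resolves; how the upstream treats an unresolved placeholder is not documented here, so catching it client-side is the safer option.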
Video editing / extension
Pass {type: "video_url", role: "reference_video"} and ask the prompt to
modify or extend it:
curl https://api.orcarouter.ai/v1/video/generations \
-H "Authorization: Bearer sk-orca-..." \
-H "Content-Type: application/json" \
-d '{
"model": "byteplus/dreamina-seedance-2-0-260128",
"prompt": "Change all the fruits in [Video 1] into fresh fruits.",
"metadata": {
"content": [
{ "type": "video_url", "video_url": { "url": "https://example.com/source.mp4" }, "role": "reference_video" }
],
"ratio": "adaptive",
"duration": 6
}
}'
Available on seedance-2.0 and seedance-2.0-fast only.
Webhooks
Pass metadata.callback_url: "https://your.domain/webhook" to receive a
POST when the task transitions to SUCCESS or FAILURE. The payload mirrors
the polling response. Callbacks and polling are independent: if you set a
callback and also poll, you receive the result through both.
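A receiver for these callbacks can be sketched with the standard library. This example assumes the callback body uses the same wrapped envelope (code, message, data) as the polling response, and that any 2xx reply acknowledges delivery; neither detail is confirmed above. summarize_callback, CallbackHandler, serve, and port 8000 are hypothetical choices.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def summarize_callback(raw_body):
    """Extract (task_id, status, detail) from a callback body."""
    data = json.loads(raw_body)["data"]
    if data["status"] == "SUCCESS":
        detail = data.get("result_url")
    else:
        detail = data.get("fail_reason")
    return data["task_id"], data["status"], detail


class CallbackHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        task_id, status, detail = summarize_callback(self.rfile.read(length))
        print(f"{task_id}: {status} {detail}")
        self.send_response(200)  # assumption: a 2xx reply acknowledges the callback
        self.end_headers()


def serve(port=8000):
    """Run the receiver until interrupted."""
    HTTPServer(("0.0.0.0", port), CallbackHandler).serve_forever()
```

Call serve() behind whatever public HTTPS endpoint you registered as callback_url. On SUCCESS, fetch the result_url promptly, since it expires after a short TTL.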
Billing
OrcaRouter passes through upstream’s per-task token charge with no markup.
Final cost matches ByteDance Ark’s published rate card (the upstream
completion_tokens / total_tokens from the task result are converted to
quota at the model’s per-token rate set in your Channel Margin config).
A small pre-consume hold is reserved at submit time; the difference settles
on success. See Operations / Billing & Usage.
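The settlement arithmetic can be illustrated as follows. This is a sketch of the description above, not OrcaRouter's billing code: settle is a hypothetical name, the token count, rate, and hold in the usage note are made-up numbers, and using total_tokens as the billed quantity is an assumption.

```python
from decimal import Decimal


def settle(total_tokens, rate_per_token, pre_consume_hold):
    """Return (final_cost, adjustment) in quota units.

    A positive adjustment is charged on top of the hold at success;
    a negative adjustment is the part of the hold that is released.
    """
    # Decimal avoids binary floating-point rounding in quota arithmetic.
    final_cost = Decimal(total_tokens) * Decimal(str(rate_per_token))
    adjustment = final_cost - Decimal(str(pre_consume_hold))
    return final_cost, adjustment
```

With a hypothetical 100,000 tokens at a rate of 0.002 quota per token and a hold of 50, the final cost is 200 and the remaining 150 is charged when the task succeeds.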