Kling AI Provider
The Kling AI provider contains support for Kling AI's video generation models, including text-to-video, image-to-video, motion control, and multi-shot video generation.
Setup
The Kling AI provider is available in the @ai-sdk/klingai module. You can install it with:
pnpm add @ai-sdk/klingai
Provider Instance
You can import the default provider instance klingai from @ai-sdk/klingai:
import { klingai } from '@ai-sdk/klingai';

If you need a customized setup, you can import createKlingAI from @ai-sdk/klingai and create a provider instance with your settings:

import { createKlingAI } from '@ai-sdk/klingai';

const klingai = createKlingAI({
  accessKey: 'your-access-key',
  secretKey: 'your-secret-key',
});

You can use the following optional settings to customize the Kling AI provider instance:
- accessKey string
  Kling AI access key. Defaults to the KLINGAI_ACCESS_KEY environment variable.
- secretKey string
  Kling AI secret key. Defaults to the KLINGAI_SECRET_KEY environment variable.
- baseURL string
  Use a different URL prefix for API calls, e.g. to use proxy servers. The default prefix is https://api-singapore.klingai.com.
- headers Record<string,string>
  Custom headers to include in the requests.
- fetch (input: RequestInfo, init?: RequestInit) => Promise<Response>
  Custom fetch implementation. You can use it as a middleware to intercept requests, or to provide a custom fetch implementation for e.g. testing.
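As an illustration of the middleware use case for the fetch setting, the sketch below wraps a fetch-compatible function and reports each request. The names createLoggingFetch and FetchLike are hypothetical helpers written for this example; they are not part of @ai-sdk/klingai.

```typescript
// Hypothetical helper (not part of @ai-sdk/klingai): wraps a fetch-compatible
// function and reports method, URL, status, and elapsed time for each call.
type FetchLike = (
  input: string | URL | Request,
  init?: RequestInit,
) => Promise<Response>;

function createLoggingFetch(
  baseFetch: FetchLike,
  log: (line: string) => void = console.log,
): FetchLike {
  return async (input, init) => {
    // `input` may be a string, a URL, or a Request object.
    const url =
      typeof input === 'string'
        ? input
        : input instanceof URL
          ? input.href
          : input.url;
    // Simplification: only looks at init.method, not at a Request's own method.
    const method = init?.method ?? 'GET';
    const startedAt = Date.now();

    const response = await baseFetch(input, init);

    log(`${method} ${url} -> ${response.status} (${Date.now() - startedAt}ms)`);
    return response;
  };
}
```

Passing the wrapper as the fetch setting, for example createKlingAI({ fetch: createLoggingFetch(fetch) }), should route the provider's HTTP calls through it.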
Video Models
You can create Kling AI video models using the .video() factory method.
For more on video generation with the AI SDK see generateVideo().
This provider currently supports three video generation modes: text-to-video, image-to-video, and motion control.
Not all options are supported by every model version and mode combination. See the KlingAI Capability Map for detailed compatibility across models.
Text-to-Video
Generate videos from text prompts:
import { klingai, type KlingAIVideoModelOptions } from '@ai-sdk/klingai';
import { experimental_generateVideo as generateVideo } from 'ai';

const { videos } = await generateVideo({
  model: klingai.video('kling-v2.6-t2v'),
  prompt: 'A chicken flying into the sunset in the style of 90s anime.',
  aspectRatio: '16:9',
  duration: 5,
  providerOptions: {
    klingai: {
      mode: 'std',
    } satisfies KlingAIVideoModelOptions,
  },
});

Image-to-Video
Generate videos from a start frame image with an optional text prompt. The popular start+end frame feature is available via the imageTail option:
import { klingai, type KlingAIVideoModelOptions } from '@ai-sdk/klingai';
import { experimental_generateVideo as generateVideo } from 'ai';

const { videos } = await generateVideo({
  model: klingai.video('kling-v2.6-i2v'),
  prompt: {
    image: 'https://example.com/start-frame.png',
    text: 'The cat slowly turns its head and blinks',
  },
  duration: 5,
  providerOptions: {
    klingai: {
      // Pro mode required for start+end frame control
      mode: 'pro',
      // Optional: end frame image
      imageTail: 'https://example.com/end-frame.png',
    } satisfies KlingAIVideoModelOptions,
  },
});

Multi-Shot Video Generation
Generate videos with multiple storyboard shots, each with its own prompt and duration (Kling v3.0+):
import { klingai, type KlingAIVideoModelOptions } from '@ai-sdk/klingai';
import { experimental_generateVideo as generateVideo } from 'ai';

const { videos } = await generateVideo({
  model: klingai.video('kling-v3.0-t2v'),
  prompt: '',
  aspectRatio: '16:9',
  duration: 10,
  providerOptions: {
    klingai: {
      mode: 'pro',
      multiShot: true,
      shotType: 'customize',
      multiPrompt: [
        {
          index: 1,
          prompt: 'A sunrise over a calm ocean, warm golden light.',
          duration: '4',
        },
        {
          index: 2,
          prompt: 'A flock of seagulls take flight from the beach.',
          duration: '3',
        },
        {
          index: 3,
          prompt: 'Waves crash against rocky cliffs at sunset.',
          duration: '3',
        },
      ],
      sound: 'on',
    } satisfies KlingAIVideoModelOptions,
  },
});

Multi-shot also works with image-to-video by combining a start frame image with per-shot prompts.
Motion Control
Generate video by transferring motion from a reference video to a character image:
import { klingai, type KlingAIVideoModelOptions } from '@ai-sdk/klingai';
import { experimental_generateVideo as generateVideo } from 'ai';

const { videos } = await generateVideo({
  model: klingai.video('kling-v2.6-motion-control'),
  prompt: {
    image: 'https://example.com/character.png',
    text: 'The character performs a smooth dance move',
  },
  providerOptions: {
    klingai: {
      videoUrl: 'https://example.com/reference-motion.mp4',
      characterOrientation: 'image',
      mode: 'std',
    } satisfies KlingAIVideoModelOptions,
  },
});

Video Provider Options
The following provider options are available via providerOptions.klingai. Options vary by mode; see the KlingAI Capability Map for per-model support.
Common Options
- mode 'std' | 'pro'
  Video generation mode. 'std' is cost-effective. 'pro' produces higher quality but takes longer.
- pollIntervalMs number
  Polling interval in milliseconds for checking task status. Defaults to 5000.
- pollTimeoutMs number
  Maximum wait time in milliseconds for video generation. Defaults to 600000 (10 minutes).
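To make the relationship between pollIntervalMs and pollTimeoutMs concrete, here is a generic polling loop. It is a sketch of the general technique, not the provider's actual implementation; pollUntilDone, TaskStatus, and the checkStatus callback are hypothetical names, and only the two default values come from the option descriptions above.

```typescript
// Generic polling sketch. `checkStatus` is a hypothetical callback standing in
// for whatever request reports the state of a generation task.
type TaskStatus<T> = { done: false } | { done: true; result: T };

async function pollUntilDone<T>(
  checkStatus: () => Promise<TaskStatus<T>>,
  options: { pollIntervalMs?: number; pollTimeoutMs?: number } = {},
): Promise<T> {
  // Defaults mirror the documented ones: 5 seconds and 10 minutes.
  const { pollIntervalMs = 5000, pollTimeoutMs = 600000 } = options;
  const deadline = Date.now() + pollTimeoutMs;

  while (true) {
    const status = await checkStatus();
    if (status.done) return status.result;

    // Give up if the next check would fall after the deadline.
    if (Date.now() + pollIntervalMs > deadline) {
      throw new Error(`Task did not finish within ${pollTimeoutMs}ms`);
    }
    await new Promise<void>(resolve => setTimeout(resolve, pollIntervalMs));
  }
}
```

In a loop like this, a shorter interval notices completion sooner at the cost of more status requests, while the timeout bounds the total wait.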
Text-to-Video and Image-to-Video Options
- negativePrompt string
  A description of what to avoid in the generated video (max 2500 characters).
- sound 'on' | 'off'
  Whether to generate audio together with the video. Supported only by V2.6 and later models, and requires mode: 'pro'.
- cfgScale number
  Controls flexibility in video generation. Higher values mean stronger prompt adherence. Range: [0, 1]. Not supported by V2.x models.
- cameraControl object
  Camera movement control with a type preset ('simple', 'down_back', 'forward_up', 'right_turn_forward', 'left_turn_forward') and optional config with horizontal, vertical, pan, tilt, roll, zoom values (range: [-10, 10]).
- multiShot boolean
  Enable multi-shot video generation (Kling v3.0+). When true, the video is split into up to 6 storyboard shots with individual prompts and durations.
- shotType 'customize' | 'intelligence'
  Storyboard method for multi-shot generation. 'customize' uses multiPrompt for user-defined shots. 'intelligence' lets the model auto-segment based on the main prompt. Required when multiShot is true.
- multiPrompt Array<{index, prompt, duration}>
  Per-shot details for multi-shot generation. Each shot has an index (number), prompt (string, max 512 characters), and duration (string, in seconds). Shot durations must sum to the total duration. Required when multiShot is true and shotType is 'customize'.
- voiceList Array<{voice_id: string}>
  Voice references for voice control (Kling v3.0+). Up to 2 voices. Reference via <<<voice_1>>> template syntax in the prompt. Requires sound: 'on'. Cannot coexist with elementList on the I2V endpoint.
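The multiPrompt constraints listed above (at most 6 shots, prompts of at most 512 characters, shot durations summing to the total duration) can be checked before sending a request. The following is a minimal sketch; Shot and checkMultiPrompt are hypothetical names for this example, and the provider may perform its own validation.

```typescript
// Shape taken from the multiPrompt option described above.
interface Shot {
  index: number;
  prompt: string;
  duration: string; // seconds, passed as a string
}

// Hypothetical helper: returns a list of problems (empty when none are found).
function checkMultiPrompt(shots: Shot[], totalDuration: number): string[] {
  const problems: string[] = [];

  if (shots.length > 6) {
    problems.push(`too many shots: ${shots.length} (max 6)`);
  }

  let sum = 0;
  for (const shot of shots) {
    if (shot.prompt.length > 512) {
      problems.push(`shot ${shot.index}: prompt exceeds 512 characters`);
    }
    const seconds = Number(shot.duration);
    if (Number.isNaN(seconds)) {
      problems.push(`shot ${shot.index}: duration "${shot.duration}" is not a number`);
    } else {
      sum += seconds;
    }
  }

  if (sum !== totalDuration) {
    problems.push(`shot durations sum to ${sum}, expected ${totalDuration}`);
  }
  return problems;
}
```

For the three-shot example earlier on this page (durations '4', '3', and '3' with duration: 10), the helper reports no problems.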
Image-to-Video Only Options
- imageTail string
  End frame image for start+end frame control. Accepts an image URL or raw base64-encoded data. Requires mode: 'pro' for most models.
- staticMask string
  Static brush mask image for motion brush. Accepts an image URL or raw base64-encoded data.
- dynamicMasks Array
  Dynamic brush configurations for motion brush. Up to 6 groups, each with a mask (image URL or base64) and trajectories (array of {x, y} coordinates).
- elementList Array<{element_id: number}>
  Reference elements for element control (Kling v3.0+ I2V). Supports video character elements and multi-image elements. Up to 3 reference elements. Cannot coexist with voiceList.
Motion Control Only Options
- videoUrl string (required)
  URL of the reference motion video. Supports .mp4/.mov, max 100MB, duration 3–30 seconds.
- characterOrientation 'image' | 'video' (required)
  Orientation of the characters in the generated video. 'image' matches the reference image orientation (max 10s video). 'video' matches the reference video orientation (max 30s video).
- keepOriginalSound 'yes' | 'no'
  Whether to keep the original sound from the reference video. Defaults to 'yes'.
- watermarkEnabled boolean
  Whether to generate watermarked results simultaneously.
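The reference video limits above lend themselves to a pre-flight check. The sketch below assumes that the per-orientation maximum (10s for 'image', 30s for 'video') applies to the reference video length and that 100MB means 100 × 1024 × 1024 bytes; neither is spelled out above. MotionReference and checkMotionReference are hypothetical names, and the caller is assumed to already know the file's size and duration.

```typescript
type CharacterOrientation = 'image' | 'video';

// Hypothetical description of a reference video; size and duration must be
// obtained by the caller (this sketch does not download or probe the file).
interface MotionReference {
  url: string;
  sizeBytes: number;
  durationSeconds: number;
}

// Returns a list of problems (empty when none are found).
// Note: `new URL` throws if `ref.url` is not a valid absolute URL.
function checkMotionReference(
  ref: MotionReference,
  orientation: CharacterOrientation,
): string[] {
  const problems: string[] = [];

  const path = new URL(ref.url).pathname.toLowerCase();
  if (!path.endsWith('.mp4') && !path.endsWith('.mov')) {
    problems.push('reference video must be .mp4 or .mov');
  }

  // Assumption: 1MB = 1024 * 1024 bytes.
  if (ref.sizeBytes > 100 * 1024 * 1024) {
    problems.push('reference video exceeds 100MB');
  }

  // Assumption: the orientation-specific maximum limits the reference video.
  const maxSeconds = orientation === 'image' ? 10 : 30;
  if (ref.durationSeconds < 3 || ref.durationSeconds > maxSeconds) {
    problems.push(`reference video must be between 3 and ${maxSeconds} seconds`);
  }
  return problems;
}
```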
Video generation is an asynchronous process that can take several minutes.
Consider setting pollTimeoutMs to at least 10 minutes (600000ms) for
reliable operation.
Video Model Capabilities
Text-to-Video
| Model | Description |
|---|---|
| kling-v3.0-t2v | Latest v3.0, multi-shot, voice control, sound (3-15s) |
| kling-v2.6-t2v | V2.6, sound in pro mode |
| kling-v2.5-turbo-t2v | Optimized for speed, std and pro |
| kling-v2.1-master-t2v | High-quality generation, pro only |
| kling-v2-master-t2v | Master-quality generation |
| kling-v1.6-t2v | V1.6 generation, std and pro |
| kling-v1-t2v | Original V1 model, supports camera control (std) |
Image-to-Video
| Model | Description |
|---|---|
| kling-v3.0-i2v | Latest v3.0, multi-shot, element/voice control, sound (3-15s) |
| kling-v2.6-i2v | V2.6, sound and end-frame in pro mode |
| kling-v2.5-turbo-i2v | Optimized for speed, end-frame in pro |
| kling-v2.1-master-i2v | High-quality generation, pro only |
| kling-v2.1-i2v | V2.1 generation, end-frame in pro |
| kling-v2-master-i2v | Master-quality generation |
| kling-v1.6-i2v | V1.6 generation, end-frame in pro |
| kling-v1.5-i2v | V1.5 generation, end-frame and motion brush in pro |
| kling-v1-i2v | Original V1 model, end-frame and motion brush in std/pro |
Motion Control
| Model | Description |
|---|---|
| kling-v2.6-motion-control | Transfers motion from a reference video to a character image |
You can also pass any available provider model ID as a string if needed.
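The model IDs in the tables above all end in a suffix that names the generation mode (-t2v, -i2v, or -motion-control). As a sketch of how application code could read that suffix, for example to decide which provider options to offer, consider the helper below. inferVideoMode and KlingVideoMode are hypothetical names; whether the provider itself derives the mode this way is not stated here.

```typescript
type KlingVideoMode = 't2v' | 'i2v' | 'motion-control';

// Hypothetical helper: maps a model ID to its mode based on the ID suffix.
// Returns undefined for IDs that do not follow the naming seen in the tables.
function inferVideoMode(modelId: string): KlingVideoMode | undefined {
  if (modelId.endsWith('-t2v')) return 't2v';
  if (modelId.endsWith('-i2v')) return 'i2v';
  if (modelId.endsWith('-motion-control')) return 'motion-control';
  return undefined;
}
```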