# Kling AI
This page adapts the original AI SDK documentation: Kling AI.
The Kling AI provider contains support for Kling AI’s video generation models, including text-to-video, image-to-video, motion control, and multi-shot video generation.
The Kling AI provider is available in the `KlingAIProvider` module. Add it to your Swift package:

```swift
// Package.swift (excerpt)
dependencies: [
    .package(url: "https://github.com/teunlao/swift-ai-sdk", from: "0.17.5")
],
targets: [
    .target(
        name: "YourTarget",
        dependencies: [
            .product(name: "SwiftAISDK", package: "swift-ai-sdk"),
            .product(name: "KlingAIProvider", package: "swift-ai-sdk")
        ]
    )
]
```

## Provider Instance

You can import the default provider instance `klingai` from `KlingAIProvider`:
```swift
import SwiftAISDK
import KlingAIProvider

let model = klingai.video("kling-v2.6-t2v")
```

If you need a customized setup, you can use `createKlingAI` to create a provider instance with your settings:
```swift
import KlingAIProvider

let klingai = createKlingAI(settings: KlingAIProviderSettings(
    accessKey: "your-access-key", // optional, defaults to KLINGAI_ACCESS_KEY
    secretKey: "your-secret-key"  // optional, defaults to KLINGAI_SECRET_KEY
))
```

You can use the following optional settings to customize the Kling AI provider instance:
- **accessKey** _String_

  Kling AI access key. Defaults to the `KLINGAI_ACCESS_KEY` environment variable.

- **secretKey** _String_

  Kling AI secret key. Defaults to the `KLINGAI_SECRET_KEY` environment variable.

- **baseURL** _String_

  Use a different URL prefix for API calls, e.g. to use proxy servers. The default prefix is `https://api-singapore.klingai.com`.

- **headers** _[String: String]_

  Custom headers to include in the requests.

- **fetch** _FetchFunction_

  Custom fetch implementation. You can use it as middleware to intercept requests, or to provide a custom fetch implementation, e.g. for testing.
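As a sketch of several of these settings used together (the proxy URL and header value below are placeholders, not real endpoints), a customized instance might look like:

```swift
import SwiftAISDK
import KlingAIProvider

// Hypothetical: route all API calls through a proxy
// and attach a custom header to every request.
let customKlingai = createKlingAI(settings: KlingAIProviderSettings(
    accessKey: "your-access-key",
    secretKey: "your-secret-key",
    baseURL: "https://kling-proxy.internal.example.com", // instead of the default https://api-singapore.klingai.com
    headers: ["X-Request-Source": "my-ios-app"]
))

let model = customKlingai.video("kling-v2.6-t2v")
```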
## Video Models

You can create Kling AI video models using the `.video()` factory method.

For more on video generation with the Swift AI SDK, see Video Generation.
This provider currently supports three video generation modes: text-to-video, image-to-video, and motion control.
### Text-to-Video

Generate videos from text prompts:
```swift
import SwiftAISDK
import KlingAIProvider
import Foundation

let result = try await experimental_generateVideo(
    model: klingai.video("kling-v2.6-t2v"),
    prompt: "A chicken flying into the sunset in the style of 90s anime.",
    aspectRatio: "16:9",
    duration: 5,
    providerOptions: ["klingai": [
        "mode": "std"
    ]]
)

try result.video.data.write(to: URL(fileURLWithPath: "video.mp4"))
```

### Image-to-Video
Generate videos from a start frame image with an optional text prompt. The popular start+end frame feature is available via the `imageTail` option:
```swift
import SwiftAISDK
import KlingAIProvider
import Foundation

let result = try await experimental_generateVideo(
    model: klingai.video("kling-v2.6-i2v"),
    prompt: .imageToVideo(
        image: .string("https://example.com/start-frame.png"),
        text: "The cat slowly turns its head and blinks"
    ),
    duration: 5,
    providerOptions: ["klingai": [
        // Pro mode required for start+end frame control (most models)
        "mode": "pro",
        // Optional: end frame image
        "imageTail": "https://example.com/end-frame.png"
    ]]
)

try result.video.data.write(to: URL(fileURLWithPath: "video.mp4"))
```

### Multi-Shot Video Generation
Generate videos with multiple storyboard shots, each with its own prompt and duration (Kling v3.0+):
```swift
import SwiftAISDK
import KlingAIProvider
import Foundation

let result = try await experimental_generateVideo(
    model: klingai.video("kling-v3.0-t2v"),
    prompt: "",
    aspectRatio: "16:9",
    duration: 10,
    providerOptions: ["klingai": [
        "mode": "pro",
        "multiShot": true,
        "shotType": "customize",
        "multiPrompt": [
            [
                "index": 1,
                "prompt": "A sunrise over a calm ocean, warm golden light.",
                "duration": "4"
            ],
            [
                "index": 2,
                "prompt": "A flock of seagulls take flight from the beach.",
                "duration": "3"
            ],
            [
                "index": 3,
                "prompt": "Waves crash against rocky cliffs at sunset.",
                "duration": "3"
            ]
        ],
        "sound": "on"
    ]]
)

try result.video.data.write(to: URL(fileURLWithPath: "video.mp4"))
```

Multi-shot also works with image-to-video by combining a start frame image with per-shot prompts.
### Motion Control

Generate video by transferring motion from a reference video to a character image:
```swift
import SwiftAISDK
import KlingAIProvider
import Foundation

let result = try await experimental_generateVideo(
    model: klingai.video("kling-v2.6-motion-control"),
    prompt: .imageToVideo(
        image: .string("https://example.com/character.png"),
        text: "The character performs a smooth dance move"
    ),
    providerOptions: ["klingai": [
        "videoUrl": "https://example.com/reference-motion.mp4",
        "characterOrientation": "image",
        "mode": "std"
    ]]
)

try result.video.data.write(to: URL(fileURLWithPath: "video.mp4"))
```

## Video Provider Options
The following provider options are available via `providerOptions["klingai"]`. Options vary by mode; see the KlingAI Capability Map for per-model support.
### Common Options

- **mode** _'std' | 'pro'_

  Video generation mode. `'std'` is cost-effective; `'pro'` produces higher quality but takes longer.

- **pollIntervalMs** _number_

  Polling interval in milliseconds for checking task status. Defaults to 5000.

- **pollTimeoutMs** _number_

  Maximum wait time in milliseconds for video generation. Defaults to 600000 (10 minutes).
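For long pro-mode renders, the polling defaults can be loosened. A sketch (the prompt and timing values are illustrative, not recommendations):

```swift
// Poll every 10 seconds and allow up to 15 minutes before timing out.
let result = try await experimental_generateVideo(
    model: klingai.video("kling-v2.6-t2v"),
    prompt: "A slow time-lapse of clouds rolling over a mountain ridge.",
    duration: 10,
    providerOptions: ["klingai": [
        "mode": "pro",              // higher quality, longer generation time
        "pollIntervalMs": 10_000,   // default: 5000
        "pollTimeoutMs": 900_000    // default: 600000 (10 minutes)
    ]]
)
```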
### Text-to-Video and Image-to-Video Options

- **negativePrompt** _string_

  A description of what to avoid in the generated video (max 2500 characters).

- **sound** _'on' | 'off'_

  Whether to generate audio simultaneously. Only V2.6 and later models support this, and it requires `mode: 'pro'`.

- **cfgScale** _number_

  Flexibility in video generation. Higher values mean stronger prompt adherence. Range: [0, 1]. Not supported by V2.x models.

- **cameraControl** _object_

  Camera movement control with a `type` preset (`'simple'`, `'down_back'`, `'forward_up'`, `'right_turn_forward'`, `'left_turn_forward'`) and an optional `config` with `horizontal`, `vertical`, `pan`, `tilt`, `roll`, and `zoom` values (range: [-10, 10]).

- **multiShot** _boolean_

  Enable multi-shot video generation (Kling v3.0+). When true, the video is split into up to 6 storyboard shots with individual prompts and durations.

- **shotType** _'customize' | 'intelligence'_

  Storyboard method for multi-shot generation. `'customize'` uses `multiPrompt` for user-defined shots; `'intelligence'` lets the model auto-segment based on the main prompt. Required when `multiShot` is true.

- **multiPrompt** _Array<{index, prompt, duration}>_

  Per-shot details for multi-shot generation. Each shot has an `index` (number), `prompt` (string, max 512 characters), and `duration` (string, in seconds). Shot durations must sum to the total duration. Required when `multiShot` is true and `shotType` is `'customize'`.

- **voiceList** _Array<{voice_id: string}>_

  Voice references for voice control (Kling v3.0+). Up to 2 voices. Reference them via the `<<<voice_1>>>` template syntax in the prompt. Requires `sound: 'on'`. Cannot coexist with `elementList` on the I2V endpoint.
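To show a few of these options combined, here is a sketch using the V1 text-to-video model, which supports camera control in std mode; the prompt and numeric values are illustrative:

```swift
// Sketch: negative prompt, prompt adherence, and a camera preset together.
let result = try await experimental_generateVideo(
    model: klingai.video("kling-v1-t2v"),
    prompt: "A castle on a hill at dawn, mist in the valley.",
    providerOptions: ["klingai": [
        "mode": "std",
        "negativePrompt": "blur, distortion, low quality",
        "cfgScale": 0.7,              // range [0, 1]; not supported by V2.x models
        "cameraControl": [
            "type": "simple",
            "config": ["zoom": 5]     // range [-10, 10]
        ]
    ]]
)
```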
### Image-to-Video Only Options

- **imageTail** _string_

  End frame image for start+end frame control. Accepts an image URL or raw base64-encoded data. Requires `mode: 'pro'` for most models.

- **staticMask** _string_

  Static brush mask image for motion brush. Accepts an image URL or raw base64-encoded data.

- **dynamicMasks** _Array_

  Dynamic brush configurations for motion brush. Up to 6 groups, each with a `mask` (image URL or base64) and `trajectories` (array of `{x, y}` coordinates).

- **elementList** _Array<{element_id: number}>_

  Reference elements for element control (Kling v3.0+ I2V). Supports video character elements and multi-image elements. Up to 3 reference elements. Cannot coexist with `voiceList`.
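As a sketch of the motion brush options (all URLs are placeholders), masking a single moving region on a v1.5 image-to-video request might look like:

```swift
let result = try await experimental_generateVideo(
    model: klingai.video("kling-v1.5-i2v"),   // motion brush is a pro-mode feature on this model
    prompt: .imageToVideo(
        image: .string("https://example.com/scene.png"),
        text: "The hot air balloon drifts slowly to the right"
    ),
    providerOptions: ["klingai": [
        "mode": "pro",
        // Region that should stay still
        "staticMask": "https://example.com/static-mask.png",
        // Region that should move, with its motion path
        "dynamicMasks": [
            [
                "mask": "https://example.com/balloon-mask.png",
                "trajectories": [
                    ["x": 120, "y": 300],
                    ["x": 220, "y": 290],
                    ["x": 320, "y": 280]
                ]
            ]
        ]
    ]]
)
```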
### Motion Control Only Options

- **videoUrl** _string_ (required)

  URL of the reference motion video. Supports .mp4/.mov, max 100MB, duration 3–30 seconds.

- **characterOrientation** _'image' | 'video'_ (required)

  Orientation of the characters in the generated video. `'image'` matches the reference image orientation (max 10s video); `'video'` matches the reference video orientation (max 30s video).

- **keepOriginalSound** _'yes' | 'no'_

  Whether to keep the original sound from the reference video. Defaults to `'yes'`.

- **watermarkEnabled** _boolean_

  Whether to generate watermarked results simultaneously.
## Video Model Capabilities

### Text-to-Video

| Model | Description |
|---|---|
| `kling-v3.0-t2v` | Latest v3.0, multi-shot, voice control, sound (3-15s) |
| `kling-v2.6-t2v` | V2.6, sound in pro mode |
| `kling-v2.5-turbo-t2v` | Optimized for speed, std and pro |
| `kling-v2.1-master-t2v` | High-quality generation, pro only |
| `kling-v2-master-t2v` | Master-quality generation |
| `kling-v1.6-t2v` | V1.6 generation, std and pro |
| `kling-v1-t2v` | Original V1 model, supports camera control (std) |
### Image-to-Video

| Model | Description |
|---|---|
| `kling-v3.0-i2v` | Latest v3.0, multi-shot, element/voice control, sound (3-15s) |
| `kling-v2.6-i2v` | V2.6, sound and end-frame in pro mode |
| `kling-v2.5-turbo-i2v` | Optimized for speed, end-frame in pro |
| `kling-v2.1-master-i2v` | High-quality generation, pro only |
| `kling-v2.1-i2v` | V2.1 generation, end-frame in pro |
| `kling-v2-master-i2v` | Master-quality generation |
| `kling-v1.6-i2v` | V1.6 generation, end-frame in pro |
| `kling-v1.5-i2v` | V1.5 generation, end-frame and motion brush in pro |
| `kling-v1-i2v` | Original V1 model, end-frame and motion brush in std/pro |
### Motion Control

| Model | Description |
|---|---|
| `kling-v2.6-motion-control` | Transfers motion from a reference video to a character image |