Alibaba

This page adapts the original AI SDK documentation: Alibaba.

Alibaba Cloud Model Studio provides access to the Qwen model series, including advanced reasoning capabilities.

API keys can be obtained from the Alibaba Cloud Model Studio console.

The Alibaba provider is available in the AlibabaProvider module. Add it to your Swift package:

// Package.swift (excerpt)
dependencies: [
    .package(url: "https://github.com/teunlao/swift-ai-sdk", from: "0.17.5")
],
targets: [
    .target(
        name: "YourTarget",
        dependencies: [
            .product(name: "SwiftAISDK", package: "swift-ai-sdk"),
            .product(name: "AlibabaProvider", package: "swift-ai-sdk")
        ]
    )
]

You can import the default provider instance alibaba from AlibabaProvider:

import AlibabaProvider

let model = alibaba("qwen-plus")

For custom configuration, you can import createAlibaba (alias of createAlibabaProvider) and create a provider instance with your settings:

import AlibabaProvider

let customAlibaba = createAlibaba(
    settings: AlibabaProviderSettings(
        apiKey: "your-api-key",
        baseURL: "https://dashscope-intl.aliyuncs.com/compatible-mode/v1",
        videoBaseURL: "https://dashscope-intl.aliyuncs.com"
    )
)

You can use the following optional settings to customize the Alibaba provider instance:

  • baseURL string

    Use a different URL prefix for API calls, e.g. to use proxy servers or regional endpoints. The default prefix is https://dashscope-intl.aliyuncs.com/compatible-mode/v1.

  • videoBaseURL string

    Use a different URL prefix for video generation API calls. The video API uses the DashScope native endpoint (not the OpenAI-compatible endpoint). The default prefix is https://dashscope-intl.aliyuncs.com.

  • apiKey string

    API key that is being sent using the Authorization header. It defaults to the ALIBABA_API_KEY environment variable.

  • headers [String: String]

    Custom headers to include in the requests.

  • fetch

    Custom fetch implementation used for API calls, e.g. to route requests through a proxy or to stub responses in tests.

  • includeUsage boolean

    Include usage information in streaming responses. When enabled, token usage will be included in the final chunk. Defaults to true.
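As a sketch, several of these settings can be combined when creating a custom provider instance. The header name and value below are illustrative, and the parameter names are assumed to match the settings listed above:

```swift
import AlibabaProvider

// Provider instance with custom headers and usage reporting enabled.
// The header name/value is illustrative; includeUsage mirrors the
// option described above.
let configuredAlibaba = createAlibaba(
    settings: AlibabaProviderSettings(
        apiKey: "your-api-key",
        headers: ["X-Request-Source": "my-app"],
        includeUsage: true
    )
)

let model = configuredAlibaba("qwen-plus")
```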

You can create language models using a provider instance:

import SwiftAISDK
import AlibabaProvider

let result = try await generateText(
    model: alibaba("qwen-plus"),
    prompt: "Write a vegetarian lasagna recipe for 4 people."
)
print(result.text)

You can also use the .chatModel() or .languageModel() factory methods:

let model = alibaba.chatModel("qwen-plus")
// or
let model = alibaba.languageModel("qwen-plus")

Alibaba language models can be used in generateText, streamText, generateObject, and streamObject (see AI SDK Core).
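For example, streaming works the same way as text generation; a minimal streamText sketch, assuming the Swift port exposes an async textStream of text deltas like the AI SDK:

```swift
import SwiftAISDK
import AlibabaProvider

// Stream the response incrementally; textStream yields text deltas.
let stream = try streamText(
    model: alibaba("qwen-plus"),
    prompt: "Explain prompt caching in two sentences."
)
for try await delta in stream.textStream {
    print(delta, terminator: "")
}
```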

The following optional provider options are available for Alibaba models:

  • enableThinking boolean

    Enable thinking/reasoning mode for supported models. When enabled, the model generates reasoning content before the response. Defaults to false.

  • thinkingBudget number

    Maximum number of reasoning tokens to generate. Limits the length of thinking content.

  • parallelToolCalls boolean

    Whether to enable parallel function calling during tool use. Defaults to true.

Alibaba’s Qwen models support thinking/reasoning mode for complex problem-solving:

import SwiftAISDK
import AlibabaProvider

let result = try await generateText(
    model: alibaba("qwen3-max"),
    providerOptions: [
        "alibaba": [
            "enableThinking": true,
            "thinkingBudget": 2048
        ]
    ],
    prompt: #"How many "r"s are in the word "strawberry"?"#
)
print("Reasoning:", result.reasoning ?? "")
print("Answer:", result.text)

For models that are thinking-only (like qwen3-235b-a22b-thinking-2507), thinking mode is enabled by default.

Alibaba models support tool calling with parallel execution:

import SwiftAISDK
import AlibabaProvider
import AISDKProvider
import AISDKProviderUtils

let weatherSchema = FlexibleSchema<JSONValue>(jsonSchema([
    "type": "object",
    "properties": [
        "location": ["type": "string"]
    ],
    "required": ["location"]
]))

let weather = tool(
    description: "Get the weather in a location",
    inputSchema: weatherSchema,
    execute: { input, _ in
        guard
            case let .object(obj) = input,
            case let .string(location)? = obj["location"]
        else {
            return .value(["error": "missing location"])
        }
        let temperature = 72 + Int.random(in: -10...10)
        return .value([
            "location": location,
            "temperature": temperature
        ])
    }
)

let result = try await generateText(
    model: alibaba("qwen-plus"),
    tools: ["weather": weather],
    prompt: "What is the weather in San Francisco?"
)
print(result.text)
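The parallelToolCalls provider option described above can be combined with this setup. A sketch that forces sequential tool execution, reusing the weather tool from the example above:

```swift
// Reuses the `weather` tool defined in the previous example.
let sequential = try await generateText(
    model: alibaba("qwen-plus"),
    tools: ["weather": weather],
    providerOptions: [
        "alibaba": [
            "parallelToolCalls": false // issue tool calls one at a time
        ]
    ],
    prompt: "What is the weather in Paris and in Tokyo?"
)
print(sequential.text)
```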

Alibaba supports both implicit and explicit prompt caching to reduce costs for repeated prompts.

Implicit caching works automatically: the provider caches appropriate content without any configuration. For more control, you can use explicit caching by marking specific messages with cacheControl:

import SwiftAISDK
import AlibabaProvider
import AISDKProviderUtils

let result = try await generateText(
    model: alibaba("qwen-plus"),
    messages: [
        .system(SystemModelMessage(
            content: "You are a helpful assistant. [... long system prompt ...]",
            providerOptions: [
                "alibaba": [
                    "cacheControl": ["type": "ephemeral"]
                ]
            ]
        ))
    ]
)
print(result.text)

You can also mark individual message parts for caching, for example a long document inside a user message:

import SwiftAISDK
import AlibabaProvider
import AISDKProviderUtils

let longDocument = "... large document content ..."

let result = try await generateText(
    model: alibaba("qwen-plus"),
    messages: [
        .user(UserModelMessage(content: .parts([
            .text(TextPart(text: "Context: Please analyze this document.")),
            .text(TextPart(
                text: longDocument,
                providerOptions: [
                    "alibaba": [
                        "cacheControl": ["type": "ephemeral"]
                    ]
                ]
            ))
        ])))
    ]
)
print(result.text)

Note: The minimum content length for a cache block is 1,024 tokens.

You can create Wan video models that call the Alibaba Cloud DashScope API using the .video() factory method. For more on video generation with the Swift AI SDK see Video Generation.

Alibaba supports three video generation modes: text-to-video, image-to-video (first frame), and reference-to-video.

Generate videos from text prompts:

import SwiftAISDK
import AlibabaProvider
import Foundation

let result = try await experimental_generateVideo(
    model: alibaba.video("wan2.6-t2v"),
    prompt: "A serene mountain lake at sunset with gentle ripples on the water.",
    resolution: "1280x720",
    duration: 5,
    providerOptions: [
        "alibaba": [
            "promptExtend": true,
            "pollTimeoutMs": 600_000 // 10 minutes
        ]
    ]
)
try result.video.data.write(to: URL(fileURLWithPath: "video.mp4"))

Generate videos from a first-frame image and optional text prompt:

import SwiftAISDK
import AlibabaProvider
import AISDKProviderUtils
import Foundation

let imageData = try Data(contentsOf: URL(string: "https://example.com/landscape.jpg")!)

let result = try await experimental_generateVideo(
    model: alibaba.video("wan2.6-i2v"),
    prompt: .imageToVideo(
        image: .data(imageData),
        text: "Camera slowly pans across the landscape"
    ),
    duration: 5,
    providerOptions: [
        "alibaba": [
            "pollTimeoutMs": 600_000 // 10 minutes
        ]
    ]
)
try result.video.data.write(to: URL(fileURLWithPath: "video.mp4"))

Generate videos using reference images and/or videos for character consistency. Use character identifiers (character1, character2, etc.) in your prompt to reference them:

import SwiftAISDK
import AlibabaProvider
import Foundation

let result = try await experimental_generateVideo(
    model: alibaba.video("wan2.6-r2v-flash"),
    prompt: "character1 walks through a beautiful garden and waves at the camera",
    resolution: "1280x720",
    duration: 5,
    providerOptions: [
        "alibaba": [
            "referenceUrls": ["https://example.com/character-reference.jpg"],
            "pollTimeoutMs": 600_000 // 10 minutes
        ]
    ]
)
try result.video.data.write(to: URL(fileURLWithPath: "video.mp4"))

The following provider options are available via providerOptions.alibaba:

  • negativePrompt string

    A description of what to avoid in the generated video (max 500 characters).

  • audioUrl string

    URL to an audio file for audio-video sync (WAV/MP3, 3-30 seconds, max 15MB).

  • promptExtend boolean

    Enable prompt extension/rewriting for better generation quality. Defaults to true.

  • shotType 'single' | 'multi'

    Shot type for video generation. 'multi' enables multi-shot cinematic narrative (wan2.6 models only).

  • watermark boolean

    Whether to add a watermark to the generated video. Defaults to false.

  • audio boolean

    Whether to generate audio (for I2V and R2V models that support it).

  • referenceUrls string[]

    Array of reference image/video URLs for reference-to-video mode. Supports 0-5 images and 0-3 videos, max 5 total.

  • pollIntervalMs number

    Polling interval in milliseconds for checking task status. Defaults to 5000.

  • pollTimeoutMs number

    Maximum wait time in milliseconds for video generation. Defaults to 600000 (10 minutes).

Note: Video generation is an asynchronous process that can take several minutes. Consider setting pollTimeoutMs to at least 10 minutes (600000 ms) for reliable operation.
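Several of the options above can be combined in a single call. A sketch that sets a negative prompt, disables the watermark, and tunes polling (values are illustrative; the option keys are the ones listed above):

```swift
import SwiftAISDK
import AlibabaProvider
import Foundation

let result = try await experimental_generateVideo(
    model: alibaba.video("wan2.6-t2v"),
    prompt: "A lighthouse on a cliff during a storm.",
    resolution: "1280x720",
    duration: 5,
    providerOptions: [
        "alibaba": [
            "negativePrompt": "blurry, low quality", // what to avoid (max 500 chars)
            "promptExtend": true,
            "watermark": false,
            "pollIntervalMs": 5_000,  // check task status every 5 seconds
            "pollTimeoutMs": 600_000  // give up after 10 minutes
        ]
    ]
)
try result.video.data.write(to: URL(fileURLWithPath: "video.mp4"))
```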

Text-to-video models:

  Model                 Audio      Resolution          Duration
  wan2.6-t2v            Yes        720P, 1080P         2-15s
  wan2.5-t2v-preview    Yes        480P, 720P, 1080P   5s, 10s

Image-to-video models:

  Model                 Audio      Resolution          Duration
  wan2.6-i2v-flash      Optional   720P, 1080P         2-15s
  wan2.6-i2v            Yes        720P, 1080P         2-15s

Reference-to-video models:

  Model                 Audio      Resolution          Duration
  wan2.6-r2v-flash      Optional   720P, 1080P         2-10s
  wan2.6-r2v            Yes        720P, 1080P         2-10s

Note: The tables above list models available in the Singapore International region. Please see the Alibaba Cloud Model Studio docs for a full list of available models. You can also pass any available provider model ID as a string if needed.