Image Generation

This page adapts the original AI SDK documentation: Image Generation.

Warning: Image generation is an experimental feature.

The AI SDK provides the generateImage function to generate images based on a given prompt using an image model.

import SwiftAISDK
import OpenAIProvider

let result = try await generateImage(
    model: openai.image("dall-e-3"),
    prompt: "Santa Claus driving a Cadillac"
)
let image = result.image

You can access the image data using the base64 or data helpers:

let base64 = image.base64 // Base64 image data
let data = image.data // `Data` binary payload
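
As a minimal sketch of what you might do with the base64 payload, the snippet below decodes it and writes the bytes to a temporary file using only Foundation. The helper name `saveImage(base64:fileName:)` and the placeholder payload (just the 8-byte PNG signature) are assumptions for illustration and are not part of the SDK; in real code you would pass `image.base64` instead.

```swift
import Foundation

/// Decodes a base64 string and writes the bytes to a file in the temporary directory.
/// Returns the file URL, or nil if the string is not valid base64.
/// Hypothetical helper, not an SDK function.
func saveImage(base64: String, fileName: String) throws -> URL? {
    guard let data = Data(base64Encoded: base64) else { return nil }
    let url = FileManager.default.temporaryDirectory.appendingPathComponent(fileName)
    try data.write(to: url, options: .atomic)
    return url
}

// Placeholder payload: the 8-byte PNG signature, standing in for `image.base64`.
let placeholderBase64 = "iVBORw0KGgo="
let savedURL = try saveImage(base64: placeholderBase64, fileName: "santa.png")
print(savedURL?.path ?? "invalid base64")
```

If you already hold the `Data` value, you can skip the decoding step and call `write(to:options:)` on it directly.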

Depending on the model, you can either specify the size or the aspect ratio.

The size is specified as a string in the format {width}x{height}. Models only support a few sizes, and the supported sizes are different for each model and provider.

let sized = try await generateImage(
    model: openai.image("dall-e-3"),
    prompt: "Santa Claus driving a Cadillac",
    size: "1024x1024"
)
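
Because the size is passed as a plain string, it can help to validate the format before making a request. The following is a small sketch under that assumption; `parseSize` is a hypothetical helper, not an SDK function, and it checks only the `{width}x{height}` shape, not whether a given model supports the size.

```swift
/// Parses a size string such as "1024x1024" into width and height.
/// Returns nil when the string is not in the {width}x{height} format.
/// Hypothetical helper, not part of the SDK.
func parseSize(_ size: String) -> (width: Int, height: Int)? {
    let parts = size.split(separator: "x", omittingEmptySubsequences: false)
    guard parts.count == 2,
          let width = Int(parts[0]), width > 0,
          let height = Int(parts[1]), height > 0
    else { return nil }
    return (width, height)
}

if let size = parseSize("1792x1024") {
    print("width:", size.width, "height:", size.height)
}
```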

The aspect ratio is specified as a string in the format {width}:{height}. Models only support a few aspect ratios, and the supported aspect ratios are different for each model and provider.

import GoogleProvider

let wide = try await generateImage(
    model: GoogleProvider.image("imagen-3.0-generate-002"),
    prompt: "Santa Claus driving a Cadillac",
    aspectRatio: "16:9"
)
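
If you start from pixel dimensions and need the matching `{width}:{height}` string, you can reduce the dimensions by their greatest common divisor. This is an illustrative sketch only; `aspectRatio(width:height:)` is a hypothetical helper, and the resulting ratio still has to be one the chosen model supports.

```swift
/// Greatest common divisor via the Euclidean algorithm.
func gcd(_ a: Int, _ b: Int) -> Int {
    b == 0 ? a : gcd(b, a % b)
}

/// Reduces pixel dimensions to an aspect ratio string such as "16:9".
/// Hypothetical helper, not part of the SDK.
func aspectRatio(width: Int, height: Int) -> String? {
    guard width > 0, height > 0 else { return nil }
    let divisor = gcd(width, height)
    return "\(width / divisor):\(height / divisor)"
}

print(aspectRatio(width: 1920, height: 1080) ?? "invalid") // 16:9
```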

generateImage also supports generating multiple images at once:

let multiple = try await generateImage(
    model: openai.image("dall-e-2"),
    prompt: "Santa Claus driving a Cadillac",
    n: 4
)
let images = multiple.images

Note: generateImage automatically issues additional calls (in parallel when supported) to produce the number of images requested with n.

Each image model has an internal limit on how many images it can generate in a single API call. The AI SDK manages this automatically by batching requests when you request multiple images using the n parameter. By default, the SDK uses provider-documented limits (for example, DALL-E 3 can only generate 1 image per call, while DALL-E 2 supports up to 10).

If needed, you can override this behavior using the maxImagesPerCall setting when generating your image. This is particularly useful when working with new or custom models where the default batch size might not be optimal:

let forcedBatch = try await generateImage(
    model: openai.image("dall-e-2"),
    prompt: "Santa Claus driving a Cadillac",
    maxImagesPerCall: 5,
    n: 10
)
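
To make the batching arithmetic concrete, the sketch below splits a requested image count into per-call batch sizes. It only illustrates the idea; `batchSizes(total:maxPerCall:)` is a hypothetical function, and the SDK's internal batching may be implemented differently.

```swift
/// Splits a request for `total` images into per-call batch sizes,
/// each no larger than `maxPerCall`.
/// Hypothetical helper; the SDK's own batching logic is not shown here.
func batchSizes(total: Int, maxPerCall: Int) -> [Int] {
    guard total > 0, maxPerCall > 0 else { return [] }
    var remaining = total
    var batches: [Int] = []
    while remaining > 0 {
        let size = min(remaining, maxPerCall)
        batches.append(size)
        remaining -= size
    }
    return batches
}

print(batchSizes(total: 10, maxPerCall: 5)) // [5, 5]
print(batchSizes(total: 4, maxPerCall: 1))  // [1, 1, 1, 1]
```

With these numbers, the call above (n of 10 and maxImagesPerCall of 5) would need two model calls, while four images from a model limited to one image per call would need four.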

You can provide a seed to the generateImage function to control the output of the image generation process. If supported by the model, the same seed will always produce the same image.

let seeded = try await generateImage(
    model: openai.image("dall-e-3"),
    prompt: "Santa Claus driving a Cadillac",
    seed: 1_234_567_890
)

Image models often have provider- or even model-specific settings. You can pass such settings to the generateImage function using the providerOptions parameter. The options for the provider become request body properties.

let vivid = try await generateImage(
    model: openai.image("dall-e-3"),
    prompt: "Santa Claus driving a Cadillac",
    size: "1024x1024",
    providerOptions: ["openai": [
        "style": "vivid",
        "quality": "hd"
    ]]
)
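
One way to picture "options become request body properties" is as a dictionary merge followed by JSON encoding. The sketch below is an assumption about the general shape, not the SDK's actual request-building code: `requestBody(base:providerOptions:)` is hypothetical, and letting the provider option win on a key clash is a choice made here for illustration.

```swift
import Foundation

/// Builds a JSON request body by merging provider options into the base fields.
/// Assumption: on a key clash the provider option wins. Hypothetical helper.
func requestBody(base: [String: Any], providerOptions: [String: Any]) throws -> String {
    let merged = base.merging(providerOptions) { _, option in option }
    let data = try JSONSerialization.data(withJSONObject: merged, options: [.sortedKeys])
    return String(decoding: data, as: UTF8.self)
}

let body = try requestBody(
    base: [
        "model": "dall-e-3",
        "prompt": "Santa Claus driving a Cadillac",
        "size": "1024x1024"
    ],
    providerOptions: ["style": "vivid", "quality": "hd"]
)
print(body)
```

The printed JSON contains style and quality alongside model, prompt, and size, which is the effect the providerOptions example above relies on.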

generateImage accepts an optional abortSignal closure of type @Sendable () -> Bool that you can use to abort the image generation process or set a timeout.

import Foundation

// Abort once one second has elapsed.
let deadline = Date().addingTimeInterval(1)
let timed = try await generateImage(
    model: openai.image("dall-e-3"),
    prompt: "Santa Claus driving a Cadillac",
    abortSignal: { Date() >= deadline }
)
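
Since the abort signal is just a `@Sendable () -> Bool` closure, timeouts and other conditions can be packaged as small factory functions. The helpers below, `timeoutSignal(after:)` and `anySignal(_:)`, are hypothetical names introduced for this sketch and do not come from the SDK.

```swift
import Foundation

/// Returns a closure that reports true once `seconds` have elapsed since creation.
/// Hypothetical helper, not part of the SDK.
func timeoutSignal(after seconds: TimeInterval) -> @Sendable () -> Bool {
    let deadline = Date().addingTimeInterval(seconds)
    return { Date() >= deadline }
}

/// Combines several signals: abort as soon as any one of them fires.
func anySignal(_ signals: [@Sendable () -> Bool]) -> @Sendable () -> Bool {
    return { signals.contains { $0() } }
}

let signal = anySignal([timeoutSignal(after: 30), timeoutSignal(after: 5)])
print(signal()) // false right after creation
```

A closure built this way could be passed as the abortSignal argument in place of the inline closure shown above.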

generateImage accepts an optional headers parameter of type [String: String] that you can use to add custom headers to the image generation request.

let withHeaders = try await generateImage(
    model: openai.image("dall-e-3"),
    prompt: "Santa Claus driving a Cadillac",
    headers: ["X-Custom-Header": "custom-value"]
)

If the model returns warnings, e.g. for unsupported parameters, they will be available in the warnings property of the response.

let warned = try await generateImage(
    model: openai.image("dall-e-3"),
    prompt: "Santa Claus driving a Cadillac"
)
print(warned.warnings ?? [])

Some providers expose additional metadata for the result overall or per image.

let prompt = "Santa Claus driving a Cadillac"
let generated = try await generateImage(
    model: openai.image("dall-e-3"),
    prompt: prompt
)

// Read the provider-specific metadata for the first generated image.
if let openAI = generated.providerMetadata["openai"],
   let first = openAI.images.first,
   case let .object(meta) = first,
   case let .string(revised) = meta["revisedPrompt"] {
    print(["prompt": prompt, "revised": revised])
}

The outer key of the returned providerMetadata is the provider name. The inner values are the metadata. An images key is always present in the metadata and is an array with the same length as the top level images key.
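
Because the metadata images array lines up index by index with the generated images, you can map over it to collect one value per image. The sketch below uses a stand-in `JSONValue` enum that models only the `.string` and `.object` cases seen in the pattern match above, plus a `.null` case; the SDK's real type and the `revisedPrompts(in:)` helper are assumptions made for illustration.

```swift
/// Stand-in for the SDK's JSON value type; only the cases used here are modeled.
enum JSONValue {
    case string(String)
    case object([String: JSONValue])
    case null
}

/// Returns one optional revised prompt per image entry, preserving order and length.
/// Hypothetical helper, not part of the SDK.
func revisedPrompts(in images: [JSONValue]) -> [String?] {
    images.map { entry -> String? in
        guard case let .object(meta) = entry,
              case let .string(revised)? = meta["revisedPrompt"]
        else { return nil }
        return revised
    }
}

// Sample metadata: the first image has a revised prompt, the second has none.
let sample: [JSONValue] = [
    .object(["revisedPrompt": .string("A jolly Santa at the wheel of a vintage Cadillac")]),
    .null
]
let prompts = revisedPrompts(in: sample)
print(prompts.count) // 2
```

Returning `nil` for entries without a revised prompt keeps the result the same length as the images array, so indices stay aligned.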

When generateImage cannot generate a valid image, it throws a NoImageGeneratedError.

This error occurs when the AI provider fails to generate an image. It can arise due to the following reasons:

  • The model failed to generate a response
  • The model generated a response that could not be parsed

The error preserves the following information to help you log the issue:

  • responses: Metadata about the image model responses, including timestamp, model, and headers.
  • cause: The cause of the error. You can use this for more detailed error handling.

import SwiftAISDK
import OpenAIProvider

let promptText = "Santa Claus driving a Cadillac"
do {
    _ = try await generateImage(model: openai.image("dall-e-3"), prompt: promptText)
} catch let error as NoImageGeneratedError {
    // Log what the provider returned before the failure.
    print("NoImageGeneratedError")
    print("Cause:", error.cause ?? "none")
    print("Responses:", error.responses)
}

Some language models such as Google gemini-2.5-flash-image-preview support multi-modal outputs including images. With such models, you can access the generated images using the files property of the response.

import SwiftAISDK
import GoogleProvider

let result = try await generateText(
    model: google("gemini-2.5-flash-image-preview"),
    prompt: "Generate an image of a comic cat"
)
for file in result.files {
    if file.mediaType.starts(with: "image/") {
        // Access image data: file.base64(), file.data(), etc.
    }
}
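
The media-type check in the loop above can also be written as a filter. The sketch below uses a stand-in `GeneratedFile` struct so that it runs on its own; the SDK's actual file type, and the `imageFiles(in:)` helper, are assumptions for illustration.

```swift
import Foundation

/// Stand-in for a generated file; the SDK's real type may differ.
struct GeneratedFile {
    let mediaType: String
    let data: Data
}

/// Keeps only files whose media type is an image type (for example "image/png").
/// Hypothetical helper, not part of the SDK.
func imageFiles(in files: [GeneratedFile]) -> [GeneratedFile] {
    files.filter { $0.mediaType.hasPrefix("image/") }
}

let files = [
    GeneratedFile(mediaType: "image/png", data: Data()),
    GeneratedFile(mediaType: "text/plain", data: Data()),
    GeneratedFile(mediaType: "image/jpeg", data: Data())
]
print(imageFiles(in: files).map(\.mediaType)) // ["image/png", "image/jpeg"]
```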

Provider | Model | Supported sizes (width x height) or aspect ratios (width : height)
--- | --- | ---
xAI Grok | grok-2-image | 1024x768 (default)
OpenAI | gpt-image-1 | 1024x1024, 1536x1024, 1024x1536
OpenAI | dall-e-3 | 1024x1024, 1792x1024, 1024x1792
OpenAI | dall-e-2 | 256x256, 512x512, 1024x1024
Amazon Bedrock | amazon.nova-canvas-v1:0 | 320-4096 (multiples of 16), 1:4 to 4:1, max 4.2M pixels
Fal | fal-ai/flux/dev | 1:1, 3:4, 4:3, 9:16, 16:9, 9:21, 21:9
Fal | fal-ai/flux-lora | 1:1, 3:4, 4:3, 9:16, 16:9, 9:21, 21:9
Fal | fal-ai/fast-sdxl | 1:1, 3:4, 4:3, 9:16, 16:9, 9:21, 21:9
Fal | fal-ai/flux-pro | 1:1, 3:4, 4:3, 9:16, 16:9, 9:21, 21:9
Fal | fal-ai/flux-pro-1.1 | 1:1, 3:4, 4:3, 9:16, 16:9, 9:21, 21:9
Google Vertex AI | imagen-3.0-generate-002 | 16:9, 9:16, 1:1, 4:3, 3:4, 2:3, 3:2, 5:4, 4:5
Google Vertex AI | imagen-3.0-fast-generate-002 | 16:9, 9:16, 1:1, 4:3, 3:4, 2:3, 3:2, 5:4, 4:5
Google Vertex AI | imagen-2.0-generate-001 | 1024x1024, 512x512
Google Vertex AI | imagen-2.0-fast-generate-001 | 1024x1024, 512x512
Stability AI | stable-image-ultra | 1:1, 2:3, 3:2, 3:4, 4:3, 5:4, 4:5, 9:16, 16:9, 1:2
Stability AI | stable-image-core | 1:1, 2:3, 3:2, 3:4, 4:3, 5:4, 4:5, 9:16, 16:9, 1:2
Stability AI | sd3.5-large | 1:1, 16:9, 9:16, 3:4, 4:3, 5:4, 4:5
Stability AI | sd3.5-large-turbo | 1:1, 16:9, 9:16, 3:4, 4:3, 5:4, 4:5
Stability AI | sd3.5-medium | 1:1, 16:9, 9:16, 3:4, 4:3, 5:4, 4:5
Stability AI | sd3.5-medium-turbo | 1:1, 16:9, 9:16, 3:4, 4:3, 5:4, 4:5
Stability AI | sd3-large | 1:1, 16:9, 9:16, 3:4, 4:3, 5:4, 4:5
Stability AI | sd3-large-turbo | 1:1, 16:9, 9:16, 3:4, 4:3, 5:4, 4:5
Stability AI | stable-diffusion-xl-lightning | 1:1, 16:9, 9:16
Stability AI | stable-diffusion-xl-base-1.0 | 1:1, 16:9, 9:16
Mistral | mistral-small-latest | 1:1, 3:4, 4:3, 9:16, 16:9
Mistral | mistral-medium-latest | 1:1, 3:4, 4:3, 9:16, 16:9
Mistral | mistral-large-latest | 1:1, 3:4, 4:3, 9:16, 16:9
Replicate | black-forest-labs/flux-schnell | 1:1, 3:4, 4:3, 9:16, 16:9, 9:21, 21:9
Replicate | black-forest-labs/flux-pro | 1:1, 3:4, 4:3, 9:16, 16:9, 9:21, 21:9
Replicate | stability-ai/stable-diffusion-xl | 1:1, 16:9, 9:16
Replicate | stability-ai/stable-diffusion-3 | 1:1, 16:9, 9:16
Replicate | recraft-ai/recraft-v3 | 1:1, 3:4, 4:3, 9:16, 16:9, 9:21, 21:9
Replicate | invoke-ai/invokeai | 1:1, 16:9, 9:16
Perplexity | llama-3.1-70b-versatile | 1:1, 3:4, 4:3, 9:16, 16:9
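
Because supported sizes vary by model, a local lookup can catch an unsupported size before a request is sent. The sketch below copies the three OpenAI rows from the table above into a dictionary; `supportedSizes` and `isSupported(size:model:)` are hypothetical names, and the data would need updating whenever a provider changes its limits.

```swift
/// Supported sizes for the OpenAI models, copied from the table above.
/// Hypothetical lookup table, not part of the SDK.
let supportedSizes: [String: Set<String>] = [
    "gpt-image-1": ["1024x1024", "1536x1024", "1024x1536"],
    "dall-e-3": ["1024x1024", "1792x1024", "1024x1792"],
    "dall-e-2": ["256x256", "512x512", "1024x1024"],
]

/// Returns true when `size` is listed for `model`; unknown models return false.
func isSupported(size: String, model: String) -> Bool {
    supportedSizes[model]?.contains(size) ?? false
}

print(isSupported(size: "1792x1024", model: "dall-e-3")) // true
print(isSupported(size: "1792x1024", model: "dall-e-2")) // false
```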