# Image Generation
This page adapts the original AI SDK documentation: Image Generation.
> **Warning:** Image generation is an experimental feature.
The AI SDK provides the `generateImage` function to generate images based on a given prompt using an image model.
```swift
import SwiftAISDK
import OpenAIProvider

let result = try await generateImage(
  model: openai.image("dall-e-3"),
  prompt: "Santa Claus driving a Cadillac"
)

let image = result.image
```

You can access the image data using the `base64` or `data` helpers:

```swift
let base64 = image.base64 // Base64 image data
let data = image.data     // `Data` binary payload
```
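Since `data` is a Foundation `Data` value, you can, for example, write the generated image straight to disk. The output path below is purely illustrative:

```swift
import Foundation

// Persist the generated image; the file name is an arbitrary example.
let outputURL = URL(fileURLWithPath: "santa-cadillac.png")
try image.data.write(to: outputURL)
```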
## Settings

### Size and Aspect Ratio

Depending on the model, you can either specify the size or the aspect ratio.
The size is specified as a string in the format `{width}x{height}`.
Models only support a few sizes, and the supported sizes are different for each model and provider.
```swift
let sized = try await generateImage(
  model: openai.image("dall-e-3"),
  prompt: "Santa Claus driving a Cadillac",
  size: "1024x1024"
)
```

#### Aspect Ratio

The aspect ratio is specified as a string in the format `{width}:{height}`.
Models only support a few aspect ratios, and the supported aspect ratios are different for each model and provider.
```swift
import GoogleProvider

let wide = try await generateImage(
  model: GoogleProvider.image("imagen-3.0-generate-002"),
  prompt: "Santa Claus driving a Cadillac",
  aspectRatio: "16:9"
)
```

## Generating Multiple Images

`generateImage` also supports generating multiple images at once:
```swift
let multiple = try await generateImage(
  model: openai.image("dall-e-2"),
  prompt: "Santa Claus driving a Cadillac",
  n: 4
)

let images = multiple.images
```

> **Note:** `generateImage` automatically issues additional calls (in parallel when supported) to satisfy `numberOfImages`.
Each image model has an internal limit on how many images it can generate in a single API call. The AI SDK manages this automatically by batching requests appropriately when you request multiple images using the `numberOfImages` parameter. By default, the SDK uses provider-documented limits (for example, DALL-E 3 can only generate 1 image per call, while DALL-E 2 supports up to 10).
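For illustration, requesting several images from DALL-E 3 (documented as one image per call) needs no special handling at the call site; the SDK is expected to fan the request out into parallel calls:

```swift
// DALL-E 3 is limited to one image per call, so the SDK should issue
// three parallel requests behind the scenes to satisfy this call.
let batched = try await generateImage(
  model: openai.image("dall-e-3"),
  prompt: "Santa Claus driving a Cadillac",
  n: 3
)
```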
If needed, you can override this behavior using the `maxImagesPerCall` setting when generating your image. This is particularly useful when working with new or custom models where the default batch size might not be optimal:
```swift
let forcedBatch = try await generateImage(
  model: openai.image("dall-e-2"),
  prompt: "Santa Claus driving a Cadillac",
  maxImagesPerCall: 5,
  n: 10
)
```

## Providing a Seed

You can provide a seed to the `generateImage` function to control the output of the image generation process.
If supported by the model, the same seed will always produce the same image.
```swift
let seeded = try await generateImage(
  model: openai.image("dall-e-3"),
  prompt: "Santa Claus driving a Cadillac",
  seed: 1_234_567_890
)
```

## Provider-specific Settings

Image models often have provider- or even model-specific settings. You can pass such settings to the `generateImage` function using the `providerOptions` parameter. The options for the provider become request body properties.
```swift
let vivid = try await generateImage(
  model: openai.image("dall-e-3"),
  prompt: "Santa Claus driving a Cadillac",
  size: "1024x1024",
  providerOptions: [
    "openai": [
      "style": "vivid",
      "quality": "hd"
    ]
  ]
)
```

## Abort Signals and Timeouts

`generateImage` accepts an optional `abortSignal` closure of type `@Sendable () -> Bool` that you can use to abort the image generation process or set a timeout.
```swift
let deadline = Date().addingTimeInterval(1)

let timed = try await generateImage(
  model: openai.image("dall-e-3"),
  prompt: "Santa Claus driving a Cadillac",
  abortSignal: { Date() >= deadline }
)
```
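Because the signal is an ordinary closure, it can also reflect any external cancellation state. Below is a minimal sketch; the `CancellationFlag` type is illustrative and not part of the SDK:

```swift
import Foundation

// Illustrative helper: a thread-safe flag that something else
// (e.g. a UI "Cancel" button) can flip while the request is in flight.
final class CancellationFlag: @unchecked Sendable {
  private let lock = NSLock()
  private var cancelled = false

  func cancel() {
    lock.lock(); defer { lock.unlock() }
    cancelled = true
  }

  var isCancelled: Bool {
    lock.lock(); defer { lock.unlock() }
    return cancelled
  }
}

let flag = CancellationFlag()

let cancellable = try await generateImage(
  model: openai.image("dall-e-3"),
  prompt: "Santa Claus driving a Cadillac",
  abortSignal: { flag.isCancelled }
)
```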
## Custom Headers

`generateImage` accepts an optional `headers` parameter of type `[String: String]` that you can use to add custom headers to the image generation request.
```swift
let withHeaders = try await generateImage(
  model: openai.image("dall-e-3"),
  prompt: "Santa Claus driving a Cadillac",
  headers: ["X-Custom-Header": "custom-value"]
)
```

## Warnings

If the model returns warnings, e.g. for unsupported parameters, they will be available in the `warnings` property of the response.
```swift
let warned = try await generateImage(
  model: openai.image("dall-e-3"),
  prompt: "Santa Claus driving a Cadillac"
)

print(warned.warnings ?? [])
```

## Additional provider-specific metadata

Some providers expose additional metadata for the result overall or per image.
```swift
let prompt = "Santa Claus driving a Cadillac"

let generated = try await generateImage(
  model: openai.image("dall-e-3"),
  prompt: prompt
)

if let openAI = generated.providerMetadata["openai"],
   let first = openAI.images.first,
   case let .object(meta) = first,
   case let .string(revised) = meta["revisedPrompt"] {
  print(["prompt": prompt, "revised": revised])
}
```

The outer key of the returned `providerMetadata` is the provider name. The inner values are the metadata. An `images` key is always present in the metadata and is an array with the same length as the top-level `images` key.
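Because the metadata `images` array lines up one-to-one with the generated images, you can read the per-image entries by index. A sketch assuming the same JSON-style value enum as above:

```swift
if let openAI = generated.providerMetadata["openai"] {
  // `openAI.images` has the same length as the top-level `images` array,
  // so index `i` here refers to the i-th generated image.
  for (index, metadata) in openAI.images.enumerated() {
    if case let .object(meta) = metadata,
       case let .string(revised) = meta["revisedPrompt"] {
      print("Image \(index): revised prompt \(revised)")
    }
  }
}
```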
## Error Handling

When `generateImage` cannot generate a valid image, it throws a `NoImageGeneratedError`.
This error occurs when the AI provider fails to generate an image. It can arise due to the following reasons:
- The model failed to generate a response
- The model generated a response that could not be parsed
The error preserves the following information to help you log the issue:
- `responses`: Metadata about the image model responses, including timestamp, model, and headers.
- `cause`: The cause of the error. You can use this for more detailed error handling.
```swift
import SwiftAISDK

let promptText = "Santa Claus driving a Cadillac"

do {
  _ = try await generateImage(model: openai.image("dall-e-3"), prompt: promptText)
} catch let error as NoImageGeneratedError {
  print("NoImageGeneratedError")
  print("Cause:", error.cause ?? "none")
  print("Responses:", error.responses)
}
```

## Generating Images with Language Models

Some language models, such as Google `gemini-2.5-flash-image-preview`, support multi-modal outputs including images. With such models, you can access the generated images using the `files` property of the response.
```swift
import SwiftAISDK
import GoogleProvider

let result = try await generateText(
  model: google("gemini-2.5-flash-image-preview"),
  prompt: "Generate an image of a comic cat"
)

for file in result.files {
  if file.mediaType.starts(with: "image/") {
    // Access image data: file.base64(), file.data(), etc.
  }
}
```
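For example, you could persist each returned image file to disk. This sketch assumes `file.data()` returns a Foundation `Data` value, as suggested by the comment above; the directory and file names are arbitrary:

```swift
import Foundation

// Write every image file in the response to the temporary directory.
for (index, file) in result.files.enumerated() where file.mediaType.starts(with: "image/") {
  let url = FileManager.default.temporaryDirectory
    .appendingPathComponent("generated-\(index).png")
  try file.data().write(to: url)
}
```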
## Image Models

| Provider | Model | Supported sizes (width x height) or aspect ratios (width : height) |
|---|---|---|
| xAI Grok | grok-2-image | 1024x768 (default) |
| OpenAI | gpt-image-1 | 1024x1024, 1536x1024, 1024x1536 |
| OpenAI | dall-e-3 | 1024x1024, 1792x1024, 1024x1792 |
| OpenAI | dall-e-2 | 256x256, 512x512, 1024x1024 |
| Amazon Bedrock | amazon.nova-canvas-v1:0 | 320-4096 (multiples of 16), 1:4 to 4:1, max 4.2M pixels |
| Fal | fal-ai/flux/dev | 1:1, 3:4, 4:3, 9:16, 16:9, 9:21, 21:9 |
| Fal | fal-ai/flux-lora | 1:1, 3:4, 4:3, 9:16, 16:9, 9:21, 21:9 |
| Fal | fal-ai/fast-sdxl | 1:1, 3:4, 4:3, 9:16, 16:9, 9:21, 21:9 |
| Fal | fal-ai/flux-pro | 1:1, 3:4, 4:3, 9:16, 16:9, 9:21, 21:9 |
| Fal | fal-ai/flux-pro-1.1 | 1:1, 3:4, 4:3, 9:16, 16:9, 9:21, 21:9 |
| Google Vertex AI | imagen-3.0-generate-002 | 16:9, 9:16, 1:1, 4:3, 3:4, 2:3, 3:2, 5:4, 4:5 |
| Google Vertex AI | imagen-3.0-fast-generate-002 | 16:9, 9:16, 1:1, 4:3, 3:4, 2:3, 3:2, 5:4, 4:5 |
| Google Vertex AI | imagen-2.0-generate-001 | 1024x1024, 512x512 |
| Google Vertex AI | imagen-2.0-fast-generate-001 | 1024x1024, 512x512 |
| Stability AI | stable-image-ultra | 1:1, 2:3, 3:2, 3:4, 4:3, 5:4, 4:5, 9:16, 16:9, 1:2 |
| Stability AI | stable-image-core | 1:1, 2:3, 3:2, 3:4, 4:3, 5:4, 4:5, 9:16, 16:9, 1:2 |
| Stability AI | sd3.5-large | 1:1, 16:9, 9:16, 3:4, 4:3, 5:4, 4:5 |
| Stability AI | sd3.5-large-turbo | 1:1, 16:9, 9:16, 3:4, 4:3, 5:4, 4:5 |
| Stability AI | sd3.5-medium | 1:1, 16:9, 9:16, 3:4, 4:3, 5:4, 4:5 |
| Stability AI | sd3.5-medium-turbo | 1:1, 16:9, 9:16, 3:4, 4:3, 5:4, 4:5 |
| Stability AI | sd3-large | 1:1, 16:9, 9:16, 3:4, 4:3, 5:4, 4:5 |
| Stability AI | sd3-large-turbo | 1:1, 16:9, 9:16, 3:4, 4:3, 5:4, 4:5 |
| Stability AI | stable-diffusion-xl-lightning | 1:1, 16:9, 9:16 |
| Stability AI | stable-diffusion-xl-base-1.0 | 1:1, 16:9, 9:16 |
| Mistral | mistral-small-latest | 1:1, 3:4, 4:3, 9:16, 16:9 |
| Mistral | mistral-medium-latest | 1:1, 3:4, 4:3, 9:16, 16:9 |
| Mistral | mistral-large-latest | 1:1, 3:4, 4:3, 9:16, 16:9 |
| Replicate | black-forest-labs/flux-schnell | 1:1, 3:4, 4:3, 9:16, 16:9, 9:21, 21:9 |
| Replicate | black-forest-labs/flux-pro | 1:1, 3:4, 4:3, 9:16, 16:9, 9:21, 21:9 |
| Replicate | stability-ai/stable-diffusion-xl | 1:1, 16:9, 9:16 |
| Replicate | stability-ai/stable-diffusion-3 | 1:1, 16:9, 9:16 |
| Replicate | recraft-ai/recraft-v3 | 1:1, 3:4, 4:3, 9:16, 16:9, 9:21, 21:9 |
| Replicate | invoke-ai/invokeai | 1:1, 16:9, 9:16 |
| Perplexity | llama-3.1-70b-versatile | 1:1, 3:4, 4:3, 9:16, 16:9 |