# OpenAI
The OpenAI provider contains language model support for the OpenAI responses, chat, and completions APIs, as well as embedding model support for the OpenAI embeddings API.

The OpenAI provider is available in the `OpenAIProvider` module. Add it to your Swift package:
```swift
dependencies: [
    .package(url: "https://github.com/teunlao/swift-ai-sdk", from: "1.0.0")
],
targets: [
    .target(
        name: "YourTarget",
        dependencies: [
            .product(name: "SwiftAISDK", package: "swift-ai-sdk"),
            .product(name: "OpenAIProvider", package: "swift-ai-sdk")
        ]
    )
]
```

## Provider Instance

You can import the default provider instance `openai` from `OpenAIProvider`:
```swift
import SwiftAISDK
import OpenAIProvider

// Uses the OPENAI_API_KEY environment variable automatically
let model = openai("gpt-4o")
```

If you need a customized setup, you can use `createOpenAIProvider` with your settings:
```swift
import OpenAIProvider

let provider = createOpenAIProvider(
    OpenAIProviderSettings(
        apiKey: "sk-...",
        organization: "org-...",
        headers: ["Custom-Header": "value"]
    )
)

let model = provider("gpt-4o")
```

You can use the following optional settings to customize the OpenAI provider instance:
- `baseURL` *String*

  Use a different URL prefix for API calls, e.g. to use proxy servers (see the example after this list). The default prefix is `https://api.openai.com/v1`.

- `apiKey` *String*

  API key that is sent using the `Authorization` header. It defaults to the `OPENAI_API_KEY` environment variable.

- `name` *String*

  The provider name. You can set this when using OpenAI-compatible providers to change the model provider property. Defaults to `openai`.

- `organization` *String*

  OpenAI organization.

- `project` *String*

  OpenAI project.

- `headers` *[String: String]*

  Custom headers to include in the requests.

- `fetch` *FetchFunction*

  Custom fetch implementation. You can use it as a middleware to intercept requests, or to provide a custom fetch implementation for testing.
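For example, a provider routed through a proxy could look like the following. This is a minimal sketch: the proxy URL and custom header are placeholders, and it assumes `OpenAIProviderSettings` exposes the `baseURL` setting listed above.

```swift
import OpenAIProvider

// Route API calls through a proxy (placeholder URL) and tag
// requests with a custom header (placeholder value).
let proxiedProvider = createOpenAIProvider(
    OpenAIProviderSettings(
        baseURL: "https://my-openai-proxy.example.com/v1",
        apiKey: "sk-...",
        headers: ["X-Request-Source": "ios-app"]
    )
)

let model = proxiedProvider("gpt-4o")
```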
## Language Models

The OpenAI provider instance is a function that you can invoke to create a language model:
let model = openai("gpt-5")It automatically selects the correct API based on the model id. You can also pass additional settings in the second argument:
let model = openai("gpt-5", /* additional settings */)The available options depend on the API that’s automatically chosen for the model (see below).
If you want to explicitly select a specific model API, you can use .responses, .chat, or .completion.
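A quick comparison of the three factory methods (model ids taken from the sections that follow; the completion API currently supports only gpt-3.5-turbo-instruct):

```swift
// Explicitly pick an API instead of relying on automatic selection:
let responsesModel = openai.responses(modelId: "gpt-5")
let chatModel = openai.chat(modelId: "gpt-5")
let completionModel = openai.completion("gpt-3.5-turbo-instruct")
```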
### Example

You can use OpenAI language models to generate text with the `generateText` function:
```swift
import SwiftAISDK
import OpenAIProvider

let result = try await generateText(
    model: openai("gpt-5"),
    prompt: "Write a vegetarian lasagna recipe for 4 people."
)

print(result.text)
```

OpenAI language models can also be used in the `streamText`, `generateObject`, and `streamObject` functions (see AI SDK Core).
## Responses Models

You can use the OpenAI responses API with the `openai(modelId)` or `openai.responses(modelId)` factory methods. It is the default API used by the OpenAI provider (since AI SDK 5).

```swift
let model = openai("gpt-5")
```

Further configuration can be done using OpenAI provider options.
```swift
import SwiftAISDK
import OpenAIProvider

let result = try await generateText(
    model: openai("gpt-5"), // or openai.responses(modelId: "gpt-5")
    providerOptions: [
        "openai": [
            "parallelToolCalls": false,
            "store": false,
            "user": "user_123"
        ]
    ],
    prompt: "..."
)
```

The following provider options are available:
- `parallelToolCalls` *Bool*

  Whether to use parallel tool calls. Defaults to `true`.

- `store` *Bool*

  Whether to store the generation. Defaults to `true`.

- `maxToolCalls` *Int*

  The maximum number of total calls to built-in tools that can be processed in a response. This maximum applies across all built-in tool calls, not per individual tool. Any further attempts by the model to call a tool will be ignored.

- `metadata` *[String: String]*

  Additional metadata to store with the generation.

- `previousResponseId` *String*

  The ID of the previous response. You can use it to continue a conversation. Defaults to `nil`.

- `instructions` *String*

  Instructions for the model. They can be used to change the system or developer message when continuing a conversation using the `previousResponseId` option. Defaults to `nil`.

- `user` *String*

  A unique identifier representing your end user, which can help OpenAI monitor and detect abuse. Defaults to `nil`.

- `reasoningEffort` *'minimal' | 'low' | 'medium' | 'high'*

  Reasoning effort for reasoning models. Defaults to `medium`. If you use `providerOptions` to set the `reasoningEffort` option, this model setting will be ignored.

- `reasoningSummary` *'auto' | 'detailed'*

  Controls whether the model returns its reasoning process. Set to `'auto'` for a condensed summary, `'detailed'` for more comprehensive reasoning. Defaults to `nil` (no reasoning summaries). When enabled, reasoning summaries appear in the stream as events with type `'reasoning'` and in non-streaming responses within the `reasoning` field.

- `strictJsonSchema` *Bool*

  Whether to use strict JSON schema validation. Defaults to `false`.

- `serviceTier` *'auto' | 'flex' | 'priority' | 'default'*

  Service tier for the request. Set to `'flex'` for 50% cheaper processing at the cost of increased latency (available for o3, o4-mini, and gpt-5 models). Set to `'priority'` for faster processing with Enterprise access (available for gpt-4, gpt-5, gpt-5-mini, o3, and o4-mini; gpt-5-nano is not supported). Defaults to `'auto'`.

- `textVerbosity` *'low' | 'medium' | 'high'*

  Controls the verbosity of the model's response. Lower values result in more concise responses, while higher values result in more verbose responses. Defaults to `'medium'`.

- `include` *[String]*

  Specifies additional content to include in the response. Supported values: `["file_search_call.results"]` for including file search results in responses, `["message.output_text.logprobs"]` for logprobs. Defaults to `nil`.

- `promptCacheKey` *String*

  A cache key for manual prompt caching control. Used by OpenAI to cache responses for similar requests to optimize your cache hit rates.

- `safetyIdentifier` *String*

  A stable identifier used to help detect users of your application who may be violating OpenAI's usage policies. The ID should be a string that uniquely identifies each user.
The OpenAI responses provider also returns provider-specific metadata:

```swift
let result = try await generateText(
    model: openai.responses(modelId: "gpt-5"),
    prompt: "..."
)

let openaiMetadata = result.providerMetadata?.openai
```

The following OpenAI-specific metadata is returned:
- `responseId` *String*

  The ID of the response. Can be used to continue a conversation (see the example after this list).

- `cachedPromptTokens` *Int*

  The number of prompt tokens that were a cache hit.

- `reasoningTokens` *Int*

  The number of reasoning tokens that the model generated.
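Combining `responseId` with the `previousResponseId` provider option described above, you can continue a conversation across calls. This is a sketch; the metadata access mirrors the snippet above:

```swift
// First turn: capture the response id from the provider metadata.
let first = try await generateText(
    model: openai.responses(modelId: "gpt-5"),
    prompt: "Suggest a name for a hiking app."
)

if let responseId = first.providerMetadata?.openai?.responseId {
    // Second turn: pass the id back to continue the conversation.
    let followUp = try await generateText(
        model: openai.responses(modelId: "gpt-5"),
        providerOptions: [
            "openai": ["previousResponseId": responseId]
        ],
        prompt: "Now suggest a tagline for it."
    )
    print(followUp.text)
}
```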
### Reasoning Output

For reasoning models like gpt-5, you can enable reasoning summaries to see the model's thought process. Different models support different summarizers; for example, o4-mini supports detailed summaries. Set `reasoningSummary: "auto"` to automatically receive the richest level available.
```swift
import SwiftAISDK
import OpenAIProvider

let result = try streamText(
    model: openai("gpt-5"),
    prompt: "Tell me about the Mission burrito debate in San Francisco.",
    providerOptions: [
        "openai": [
            "reasoningSummary": "detailed" // 'auto' for condensed or 'detailed' for comprehensive
        ]
    ]
)

for try await part in result.fullStream {
    switch part {
    case .reasoning(let delta):
        print("Reasoning: \(delta)")
    case .textDelta(let delta):
        print(delta, terminator: "")
    default:
        break
    }
}
```

For non-streaming calls with `generateText`, the reasoning summaries are available in the `reasoning` field of the response:
```swift
import SwiftAISDK
import OpenAIProvider

let result = try await generateText(
    model: openai("gpt-5"),
    prompt: "Tell me about the Mission burrito debate in San Francisco.",
    providerOptions: [
        "openai": [
            "reasoningSummary": "auto"
        ]
    ]
)

print("Reasoning:", result.reasoning ?? "")
```

Learn more about reasoning summaries in the OpenAI documentation.
### Verbosity Control

You can control the length and detail of model responses using the `textVerbosity` parameter:
```swift
import SwiftAISDK
import OpenAIProvider

let result = try await generateText(
    model: openai("gpt-5-mini"),
    prompt: "Write a poem about a boy and his first pet dog.",
    providerOptions: [
        "openai": [
            "textVerbosity": "low" // 'low' for concise, 'medium' (default), or 'high' for verbose
        ]
    ]
)
```

The `textVerbosity` parameter scales output length without changing the underlying prompt:

- `'low'`: produces terse, minimal responses
- `'medium'`: balanced detail (default)
- `'high'`: verbose responses with comprehensive detail
### Web Search Tool

The OpenAI responses API supports web search through the `openai.tools.webSearch` tool.

```swift
let result = try await generateText(
    model: openai("gpt-5"),
    prompt: "What happened in San Francisco last week?",
    tools: [
        "web_search": openai.tools.webSearch(
            OpenAIWebSearchArgs(
                searchContextSize: "high",
                userLocation: OpenAIWebSearchArgs.UserLocation(
                    city: "San Francisco",
                    region: "California"
                )
            )
        )
    ],
    // Force web search tool (optional):
    toolChoice: ["type": "tool", "toolName": "web_search"]
)

// URL sources
let sources = result.sources
```

### File Search Tool
The OpenAI responses API supports file search through the `openai.tools.fileSearch` tool.

You can force the use of the file search tool by setting the `toolChoice` parameter to `["type": "tool", "toolName": "file_search"]`.

```swift
let result = try await generateText(
    model: openai("gpt-5"),
    prompt: "What does the document say about user authentication?",
    tools: [
        "file_search": openai.tools.fileSearch(
            OpenAIFileSearchArgs(
                vectorStoreIds: ["vs_123"],
                maxNumResults: 5,
                ranking: OpenAIFileSearchArgs.RankingOptions(
                    ranker: "auto",
                    scoreThreshold: 0.5
                ),
                filters: .object([
                    "key": .string("author"),
                    "type": .string("eq"),
                    "value": .string("Jane Smith")
                ])
            )
        )
    ],
    providerOptions: [
        "openai": [
            // optional: include results
            "include": ["file_search_call.results"]
        ]
    ]
)
```

### Image Generation Tool
OpenAI's Responses API supports multi-modal image generation as a provider-defined tool. Availability is restricted to specific models (for example, gpt-5 variants).

You can use the image tool with either `generateText` or `streamText`:

```swift
import SwiftAISDK
import OpenAIProvider

let result = try await generateText(
    model: openai("gpt-5"),
    prompt: "Generate an image of an echidna swimming across the Mozambique channel.",
    tools: [
        "image_generation": openai.tools.imageGeneration(
            OpenAIImageGenerationArgs(outputFormat: "webp")
        )
    ]
)

for toolResult in result.staticToolResults {
    if toolResult.toolName == "image_generation" {
        let base64Image = toolResult.output.result
    }
}
```

```swift
import SwiftAISDK
import OpenAIProvider

let result = try streamText(
    model: openai("gpt-5"),
    prompt: "Generate an image of an echidna swimming across the Mozambique channel.",
    tools: [
        "image_generation": openai.tools.imageGeneration(
            OpenAIImageGenerationArgs(
                outputFormat: "webp",
                quality: "low"
            )
        )
    ]
)

for try await part in result.fullStream {
    if case .toolResult(let toolResult) = part, !toolResult.dynamic {
        let base64Image = toolResult.output.result
    }
}
```

For complete details on model availability, image quality controls, supported sizes, and tool-specific parameters, refer to the OpenAI documentation:
- Image generation overview and models: OpenAI Image Generation
- Image generation tool parameters (background, size, quality, format, etc.): Image Generation Tool Options
### Code Interpreter Tool

The OpenAI responses API supports the code interpreter tool through the `openai.tools.codeInterpreter` tool. This allows models to write and execute Python code.

```swift
import SwiftAISDK
import OpenAIProvider

let result = try await generateText(
    model: openai("gpt-5"),
    prompt: "Write and run Python code to calculate the factorial of 10",
    tools: [
        "code_interpreter": openai.tools.codeInterpreter(
            OpenAICodeInterpreterArgs(
                container: .auto(fileIds: ["file-123", "file-456"])
            )
        )
    ]
)
```

The code interpreter tool can be configured with:
- `container`: either a container ID string or an object with `fileIds` to specify uploaded files that should be available to the code interpreter (see the sketch after this list)
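For instance, reusing an existing container by its id might look like the following. This is an unverified sketch: the container id is a placeholder, and the `.id` case name is an assumption mirroring the documented `.auto(fileIds:)` case.

```swift
// Reuse an existing code interpreter container by id
// (placeholder id; the `.id` case name is assumed, not confirmed).
let tool = openai.tools.codeInterpreter(
    OpenAICodeInterpreterArgs(container: .id("cntr_abc123"))
)
```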
### Local Shell Tool

The OpenAI responses API supports the local shell tool for Codex models through the `openai.tools.localShell` tool. Local shell is a tool that allows agents to run shell commands locally on a machine you or the user provides.

```swift
import SwiftAISDK
import OpenAIProvider

let result = try await generateText(
    model: openai.responses(modelId: "gpt-5-codex"),
    tools: [
        "local_shell": openai.tools.localShell()
    ],
    prompt: "List the files in my home directory.",
    stopWhen: stepCountIs(2)
)
```

### Image Inputs
The OpenAI Responses API supports image inputs for appropriate models. You can pass image files as part of the message content using the `image` type:

```swift
let result = try await generateText(
    model: openai("gpt-5"),
    messages: [
        [
            "role": "user",
            "content": [
                [
                    "type": "text",
                    "text": "Please describe the image."
                ],
                [
                    "type": "image",
                    "image": try Data(contentsOf: URL(fileURLWithPath: "./data/image.png"))
                ]
            ]
        ]
    ]
)
```

The model will have access to the image and will respond to questions about it. The image should be passed using the `image` field.

You can also pass a file id from the OpenAI Files API:

```swift
[
    "type": "image",
    "image": "file-8EFBcWHsQxZV7YGezBC1fq"
]
```

You can also pass the URL of an image:

```swift
[
    "type": "image",
    "image": "https://sample.edu/image.png"
]
```

### PDF Inputs
The OpenAI Responses API supports reading PDF files. You can pass PDF files as part of the message content using the `file` type:

```swift
let result = try await generateText(
    model: openai("gpt-5"),
    messages: [
        [
            "role": "user",
            "content": [
                [
                    "type": "text",
                    "text": "What is an embedding model?"
                ],
                [
                    "type": "file",
                    "data": try Data(contentsOf: URL(fileURLWithPath: "./data/ai.pdf")),
                    "mediaType": "application/pdf",
                    "filename": "ai.pdf" // optional
                ]
            ]
        ]
    ]
)
```

You can also pass a file id from the OpenAI Files API:

```swift
[
    "type": "file",
    "data": "file-8EFBcWHsQxZV7YGezBC1fq",
    "mediaType": "application/pdf"
]
```

You can also pass the URL of a PDF:

```swift
[
    "type": "file",
    "data": "https://sample.edu/example.pdf",
    "mediaType": "application/pdf",
    "filename": "ai.pdf" // optional
]
```

The model will have access to the contents of the PDF file and respond to questions about it. The PDF file should be passed using the `data` field, and the `mediaType` should be set to `"application/pdf"`.
### Structured Outputs

The OpenAI Responses API supports structured outputs. You can enforce structured outputs using `generateObject` or `streamObject`, which expose a `schema` option. Additionally, you can pass a schema to the `experimental_output` option when using `generateText` or `streamText`.

```swift
// Using generateObject
import SwiftAISDK
import OpenAIProvider

struct Ingredient: Codable, Sendable {
    let name: String
    let amount: String
}

struct Recipe: Codable, Sendable {
    let name: String
    let ingredients: [Ingredient]
    let steps: [String]
}

let result = try await generateObject(
    model: openai("gpt-4.1"),
    schema: Recipe.self,
    prompt: "Generate a lasagna recipe.",
    schemaName: "recipe",
    schemaDescription: "A recipe for lasagna."
).object
```

## Chat Models
You can create models that call the OpenAI chat API using the `.chat()` factory method. The first argument is the model id, e.g. `gpt-4`. The OpenAI chat models support tool calls, and some have multi-modal capabilities.

```swift
let model = openai.chat(modelId: "gpt-5")
```

OpenAI chat models also support some model-specific provider options that are not part of the standard call settings. You can pass them in the `providerOptions` argument:

```swift
import SwiftAISDK
import OpenAIProvider

let model = openai.chat(modelId: "gpt-5")

let result = try await generateText(
    model: model,
    prompt: "Hello!",
    providerOptions: [
        "openai": [
            "logitBias": [
                "50256": -100 // optional likelihood for specific tokens
            ],
            "user": "test-user" // optional unique user identifier
        ]
    ]
)
```

The following optional provider options are available for OpenAI chat models:
- `logitBias` *[String: Int]*

  Modifies the likelihood of specified tokens appearing in the completion.

  Accepts a dictionary that maps tokens (specified by their token ID in the GPT tokenizer) to an associated bias value from -100 to 100. You can use OpenAI's tokenizer tool to convert text to token IDs. Mathematically, the bias is added to the logits generated by the model prior to sampling. The exact effect will vary per model, but values between -1 and 1 should decrease or increase the likelihood of selection; values like -100 or 100 should result in a ban or exclusive selection of the relevant token.

  As an example, you can pass `["50256": -100]` to prevent the token from being generated.

- `logprobs` *Bool | Int*

  Return the log probabilities of the tokens. Including logprobs will increase the response size and can slow down response times. However, it can be useful to better understand how the model is behaving.

  Setting to `true` will return the log probabilities of the tokens that were generated. Setting to a number will return the log probabilities of the top n tokens that were generated.

- `parallelToolCalls` *Bool*

  Whether to enable parallel function calling during tool use. Defaults to `true`.

- `user` *String*

  A unique identifier representing your end user, which can help OpenAI monitor and detect abuse.

- `reasoningEffort` *'minimal' | 'low' | 'medium' | 'high'*

  Reasoning effort for reasoning models. Defaults to `medium`. If you use `providerOptions` to set the `reasoningEffort` option, this model setting will be ignored.

- `structuredOutputs` *Bool*

  Whether to use structured outputs. Defaults to `true`. When enabled, tool calls and object generation will be strict and follow the provided schema.

- `maxCompletionTokens` *Int*

  Maximum number of completion tokens to generate. Useful for reasoning models.

- `store` *Bool*

  Whether to enable persistence in the Responses API.

- `metadata` *[String: String]*

  Metadata to associate with the request.

- `prediction` *[String: Any]*

  Parameters for prediction mode.

- `serviceTier` *'auto' | 'flex' | 'priority' | 'default'*

  Service tier for the request. Set to `'flex'` for 50% cheaper processing at the cost of increased latency (available for o3, o4-mini, and gpt-5 models). Set to `'priority'` for faster processing with Enterprise access (available for gpt-4, gpt-5, gpt-5-mini, o3, and o4-mini; gpt-5-nano is not supported). Defaults to `'auto'` (see the example after this list).

- `strictJsonSchema` *Bool*

  Whether to use strict JSON schema validation. Defaults to `false`.

- `textVerbosity` *'low' | 'medium' | 'high'*

  Controls the verbosity of the model's responses. Lower values will result in more concise responses, while higher values will result in more verbose responses.

- `promptCacheKey` *String*

  A cache key for manual prompt caching control. Used by OpenAI to cache responses for similar requests to optimize your cache hit rates.

- `safetyIdentifier` *String*

  A stable identifier used to help detect users of your application who may be violating OpenAI's usage policies. The ID should be a string that uniquely identifies each user.
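For example, you can request the flex tier for a supported model. A minimal sketch based on the `serviceTier` option above:

```swift
import SwiftAISDK
import OpenAIProvider

// Opt into the cheaper, higher-latency flex tier
// (supported by o3, o4-mini, and gpt-5 models).
let result = try await generateText(
    model: openai.chat(modelId: "o4-mini"),
    prompt: "Summarize the plot of Hamlet in three sentences.",
    providerOptions: [
        "openai": ["serviceTier": "flex"]
    ]
)
print(result.text)
```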
### Reasoning

OpenAI has introduced the o1, o3, and o4 series of reasoning models. Currently, o4-mini, o3, o3-mini, and o1 are available via both the chat and responses APIs. The models codex-mini-latest and computer-use-preview are available only via the responses API.

Reasoning models currently only generate text, have several limitations, and are only supported using `generateText` and `streamText`.

They support additional settings and response metadata:
- You can use `providerOptions` to set the `reasoningEffort` option (or alternatively the `reasoningEffort` model setting), which determines the amount of reasoning the model performs.

- You can use the response `providerMetadata` to access the number of reasoning tokens that the model generated.
```swift
import SwiftAISDK
import OpenAIProvider

let result = try await generateText(
    model: openai.chat(modelId: "gpt-5"),
    prompt: "Invent a new holiday and describe its traditions.",
    providerOptions: [
        "openai": [
            "reasoningEffort": "low"
        ]
    ]
)

print(result.text)
print("Usage:", result.usage)
print("Reasoning tokens:", result.providerMetadata?["openai"]?["reasoningTokens"] ?? 0)
```

### Structured Outputs
Structured outputs are enabled by default. You can disable them by setting the `structuredOutputs` option to `false`.

```swift
import SwiftAISDK
import OpenAIProvider
import AISDKProviderUtils

let result = try await generateObject(
    model: openai.chat(modelId: "gpt-4o-2024-08-06"),
    schema: Recipe.self, // reuse struct Recipe from the snippet above
    prompt: "Generate a lasagna recipe.",
    schemaName: "recipe",
    schemaDescription: "A recipe for lasagna.",
    providerOptions: [
        "openai": [
            "structuredOutputs": false
        ]
    ]
)

print(result.object)
```

### Logprobs
OpenAI provides logprobs information for completion and chat models. You can access it in the `providerMetadata` object.

```swift
import SwiftAISDK
import OpenAIProvider

let result = try await generateText(
    model: openai.chat(modelId: "gpt-5"),
    prompt: "Write a vegetarian lasagna recipe for 4 people.",
    providerOptions: [
        "openai": [
            // this can also be a number;
            // refer to the logprobs provider option above for more
            "logprobs": true
        ]
    ]
)

let openaiMetadata = result.providerMetadata?["openai"]
let logprobs = openaiMetadata?["logprobs"]
```

### Image Support
The OpenAI Chat API supports image inputs for appropriate models. You can pass image files as part of the message content using the `image` type:

```swift
let result = try await generateText(
    model: openai.chat(modelId: "gpt-5"),
    messages: [
        [
            "role": "user",
            "content": [
                [
                    "type": "text",
                    "text": "Please describe the image."
                ],
                [
                    "type": "image",
                    "image": try Data(contentsOf: URL(fileURLWithPath: "./data/image.png"))
                ]
            ]
        ]
    ]
)
```

The model will have access to the image and will respond to questions about it. The image should be passed using the `image` field.

You can also pass the URL of an image:

```swift
[
    "type": "image",
    "image": "https://sample.edu/image.png"
]
```

### PDF Support
The OpenAI Chat API supports reading PDF files. You can pass PDF files as part of the message content using the `file` type:

```swift
let result = try await generateText(
    model: openai.chat(modelId: "gpt-5"),
    messages: [
        [
            "role": "user",
            "content": [
                [
                    "type": "text",
                    "text": "What is an embedding model?"
                ],
                [
                    "type": "file",
                    "data": try Data(contentsOf: URL(fileURLWithPath: "./data/ai.pdf")),
                    "mediaType": "application/pdf",
                    "filename": "ai.pdf" // optional
                ]
            ]
        ]
    ]
)
```

The model will have access to the contents of the PDF file and respond to questions about it. The PDF file should be passed using the `data` field, and the `mediaType` should be set to `"application/pdf"`.

You can also pass a file id from the OpenAI Files API:

```swift
[
    "type": "file",
    "data": "file-8EFBcWHsQxZV7YGezBC1fq",
    "mediaType": "application/pdf"
]
```

You can also pass the URL of a PDF:

```swift
[
    "type": "file",
    "data": "https://sample.edu/example.pdf",
    "mediaType": "application/pdf",
    "filename": "ai.pdf" // optional
]
```

### Predicted Outputs
OpenAI supports predicted outputs for gpt-4o and gpt-4o-mini. Predicted outputs help you reduce latency by allowing you to specify a base text that the model should modify. You can enable predicted outputs by adding the `prediction` option to the `providerOptions.openai` object:

```swift
let result = try streamText(
    model: openai.chat(modelId: "gpt-5"),
    messages: [
        [
            "role": "user",
            "content": "Replace the Username property with an Email property."
        ],
        [
            "role": "user",
            "content": existingCode
        ]
    ],
    providerOptions: [
        "openai": [
            "prediction": [
                "type": "content",
                "content": existingCode
            ]
        ]
    ]
)
```

OpenAI provides usage information for predicted outputs (`acceptedPredictionTokens` and `rejectedPredictionTokens`). You can access it in the `providerMetadata` object.

```swift
let openaiMetadata = try await result.providerMetadata?.openai

let acceptedPredictionTokens = openaiMetadata?.acceptedPredictionTokens
let rejectedPredictionTokens = openaiMetadata?.rejectedPredictionTokens
```

### Image Detail
You can use the `openai` provider option to set the image input detail to `high`, `low`, or `auto`:

```swift
let result = try await generateText(
    model: openai.chat(modelId: "gpt-5"),
    messages: [
        [
            "role": "user",
            "content": [
                ["type": "text", "text": "Describe the image in detail."],
                [
                    "type": "image",
                    "image": "https://github.com/vercel/ai/blob/main/examples/ai-core/data/comic-cat.png?raw=true",
                    // OpenAI-specific options - image detail:
                    "providerOptions": [
                        "openai": ["imageDetail": "low"]
                    ]
                ]
            ]
        ]
    ]
)
```

### Distillation
OpenAI supports model distillation for some models. If you want to store a generation for use in the distillation process, you can add the `store` option to the `providerOptions.openai` object. This will save the generation to the OpenAI platform for later use in distillation.

```swift
import SwiftAISDK
import OpenAIProvider

func main() async throws {
    let result = try await generateText(
        model: openai.chat(modelId: "gpt-4o-mini"),
        prompt: "Who worked on the original macintosh?",
        providerOptions: [
            "openai": [
                "store": true,
                "metadata": [
                    "custom": "value"
                ]
            ]
        ]
    )

    print(result.text)
    print()
    print("Usage:", result.usage)
}

try await main()
```

### Prompt Caching
OpenAI has introduced prompt caching for supported models, including gpt-4o and gpt-4o-mini.

- Prompt caching is automatically enabled for these models when the prompt is 1024 tokens or longer. It does not need to be explicitly enabled.
- You can use the response `providerMetadata` to access the number of prompt tokens that were a cache hit.
- Note that caching behavior is dependent on load on OpenAI's infrastructure. Prompt prefixes generally remain in the cache following 5-10 minutes of inactivity before they are evicted, but during off-peak periods they may persist for up to an hour.
```swift
import SwiftAISDK
import OpenAIProvider

let result = try await generateText(
    model: openai.chat(modelId: "gpt-4o-mini"),
    prompt: "A 1024-token or longer prompt..."
)

print("usage:", [
    "promptTokens": result.usage.promptTokens,
    "completionTokens": result.usage.completionTokens,
    "totalTokens": result.usage.totalTokens,
    "cachedPromptTokens": result.providerMetadata?.openai?.cachedPromptTokens as Any
])
```

To improve cache hit rates, you can manually control caching using the `promptCacheKey` option:

```swift
import SwiftAISDK
import OpenAIProvider

let result = try await generateText(
    model: openai.chat(modelId: "gpt-5"),
    prompt: "A 1024-token or longer prompt...",
    providerOptions: [
        "openai": [
            "promptCacheKey": "my-custom-cache-key-123"
        ]
    ]
)

print("usage:", [
    "promptTokens": result.usage.promptTokens,
    "completionTokens": result.usage.completionTokens,
    "totalTokens": result.usage.totalTokens,
    "cachedPromptTokens": result.providerMetadata?.openai?.cachedPromptTokens as Any
])
```

### Audio Input
With the gpt-4o-audio-preview model, you can pass audio files to the model.

```swift
import SwiftAISDK
import OpenAIProvider

let result = try await generateText(
    model: openai.chat(modelId: "gpt-4o-audio-preview"),
    messages: [
        [
            "role": "user",
            "content": [
                ["type": "text", "text": "What is the audio saying?"],
                [
                    "type": "file",
                    "mediaType": "audio/mpeg",
                    "data": try Data(contentsOf: URL(fileURLWithPath: "./data/galileo.mp3"))
                ]
            ]
        ]
    ]
)
```

## Completion Models
You can create models that call the OpenAI completions API using the `.completion()` factory method. The first argument is the model id. Currently only gpt-3.5-turbo-instruct is supported.

```swift
let model = openai.completion("gpt-3.5-turbo-instruct")
```

OpenAI completion models also support some model-specific settings that are not part of the standard call settings. You can pass them as an options argument:

```swift
let model = openai.completion("gpt-3.5-turbo-instruct")

try await model.doGenerate(
    providerOptions: [
        "openai": [
            "echo": true, // optional, echo the prompt in addition to the completion
            "logitBias": [ // optional likelihood for specific tokens
                "50256": -100
            ],
            "suffix": "some text", // optional suffix that comes after a completion of inserted text
            "user": "test-user" // optional unique user identifier
        ]
    ]
)
```

The following optional provider options are available for OpenAI completion models:
- `echo` *Bool*

  Echo back the prompt in addition to the completion.

- `logitBias` *[String: Int]*

  Modifies the likelihood of specified tokens appearing in the completion.

  Accepts a dictionary that maps tokens (specified by their token ID in the GPT tokenizer) to an associated bias value from -100 to 100. You can use OpenAI's tokenizer tool to convert text to token IDs. Mathematically, the bias is added to the logits generated by the model prior to sampling. The exact effect will vary per model, but values between -1 and 1 should decrease or increase the likelihood of selection; values like -100 or 100 should result in a ban or exclusive selection of the relevant token.

  As an example, you can pass `["50256": -100]` to prevent the `<|endoftext|>` token from being generated.

- `logprobs` *Bool | Int*

  Return the log probabilities of the tokens. Including logprobs will increase the response size and can slow down response times. However, it can be useful to better understand how the model is behaving.

  Setting to `true` will return the log probabilities of the tokens that were generated. Setting to a number will return the log probabilities of the top n tokens that were generated.

- `suffix` *String*

  The suffix that comes after a completion of inserted text.

- `user` *String*

  A unique identifier representing your end user, which can help OpenAI monitor and detect abuse.
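Beyond the low-level `doGenerate` call shown above, completion models plug into the same high-level functions as the other APIs:

```swift
import SwiftAISDK
import OpenAIProvider

// Completion models work with the high-level generateText function too.
let result = try await generateText(
    model: openai.completion("gpt-3.5-turbo-instruct"),
    prompt: "Write a haiku about autumn."
)
print(result.text)
```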
## Model Capabilities

| Model | Image Input | Audio Input | Object Generation | Tool Usage |
|---|---|---|---|---|
| `gpt-5-pro` | | | | |
| `gpt-5` | | | | |
| `gpt-5-mini` | | | | |
| `gpt-5-nano` | | | | |
| `gpt-5-codex` | | | | |
| `gpt-5-chat-latest` | | | | |
| `gpt-4.1` | | | | |
| `gpt-4.1-mini` | | | | |
| `gpt-4.1-nano` | | | | |
| `gpt-4o` | | | | |
| `gpt-4o-mini` | | | | |
## Embedding Models

You can create models that call the OpenAI embeddings API using the `.textEmbedding()` factory method.

```swift
let model = openai.textEmbedding("text-embedding-3-large")
```

OpenAI embedding models support several additional provider options. You can pass them as an options argument:

```swift
import SwiftAISDK
import OpenAIProvider

let result = try await embed(
    model: openai.textEmbedding("text-embedding-3-large"),
    value: "sunny day at the beach",
    providerOptions: [
        "openai": [
            "dimensions": 512, // optional, number of dimensions for the embedding
            "user": "test-user" // optional unique user identifier
        ]
    ]
)

let embedding = result.embedding
```

The following optional provider options are available for OpenAI embedding models:
- `dimensions` *Int*

  The number of dimensions the resulting output embeddings should have. Only supported in text-embedding-3 and later models.

- `user` *String*

  A unique identifier representing your end user, which can help OpenAI monitor and detect abuse.
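To embed several values in one call, an AI SDK-style `embedMany` function can be used. Treat its availability and exact signature in this package as an assumption; this is a sketch:

```swift
import SwiftAISDK
import OpenAIProvider

// Batch-embed several values in one call (assumes an AI SDK-style
// embedMany function is exposed by this package).
let result = try await embedMany(
    model: openai.textEmbedding("text-embedding-3-small"),
    values: [
        "sunny day at the beach",
        "rainy afternoon in the city"
    ]
)
print(result.embeddings.count) // one embedding per input value
```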
### Model Capabilities

| Model | Default Dimensions | Custom Dimensions |
|---|---|---|
| `text-embedding-3-large` | 3072 | Yes |
| `text-embedding-3-small` | 1536 | Yes |
| `text-embedding-ada-002` | 1536 | No |
## Image Models

You can create models that call the OpenAI image generation API using the `.image()` factory method.

```swift
let model = openai.image("dall-e-3")
```

### Model Capabilities
| Model | Sizes |
|---|---|
| `gpt-image-1-mini` | 1024x1024, 1536x1024, 1024x1536 |
| `gpt-image-1` | 1024x1024, 1536x1024, 1024x1536 |
| `dall-e-3` | 1024x1024, 1792x1024, 1024x1792 |
| `dall-e-2` | 256x256, 512x512, 1024x1024 |
You can pass optional `providerOptions` to the image model. These are subject to change by OpenAI and are model-dependent. For example, the gpt-image-1 model supports the `quality` option:

```swift
let result = try await generateImage(
    model: openai.image("gpt-image-1"),
    prompt: "A salamander at sunrise in a forest pond in the Seychelles.",
    providerOptions: [
        "openai": ["quality": "high"]
    ]
)

let image = result.image
let providerMetadata = result.providerMetadata
```

For more on `generateImage()`, see Image Generation.

OpenAI's image models may return a revised prompt for each image. It can be accessed at `providerMetadata.openai.images[0]?.revisedPrompt`.

For more information on the available OpenAI image model options, see the OpenAI API reference.
## Transcription Models

You can create models that call the OpenAI transcription API using the `.transcription()` factory method. The first argument is the model id, e.g. `whisper-1`.

```swift
let model = openai.transcription(modelId: "whisper-1")
```

You can also pass additional provider-specific options using the `providerOptions` argument. For example, supplying the input language in ISO-639-1 format (e.g. `en`) will improve accuracy and latency.

```swift
import SwiftAISDK
import OpenAIProvider

let result = try await transcribe(
    model: openai.transcription(modelId: "whisper-1"),
    audio: Data([1, 2, 3, 4]),
    providerOptions: ["openai": ["language": "en"]]
)
```

To get word-level timestamps, specify the granularity:
```swift
import SwiftAISDK
import OpenAIProvider

let result = try await transcribe(
    model: openai.transcription(modelId: "whisper-1"),
    audio: Data([1, 2, 3, 4]),
    providerOptions: [
        "openai": [
            "timestampGranularities": ["word"] // or ["segment"], or ["word", "segment"]
        ]
    ]
)

// Access timestamps
print(result.segments) // array of segments with startSecond/endSecond
```

The following provider options are available:
- `timestampGranularities` *[String]*

  The granularity of the timestamps in the transcription. Defaults to `["segment"]`. Possible values are `["word"]`, `["segment"]`, and `["word", "segment"]`. Note: there is no additional latency for segment timestamps, but generating word timestamps incurs additional latency.

- `language` *String*

  The language of the input audio. Supplying the input language in ISO-639-1 format (e.g. `en`) will improve accuracy and latency. Optional.

- `prompt` *String*

  An optional text to guide the model's style or continue a previous audio segment. The prompt should match the audio language. Optional.

- `temperature` *Double*

  The sampling temperature, between 0 and 1. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. If set to 0, the model will use log probability to automatically increase the temperature until certain thresholds are hit. Defaults to 0. Optional.

- `include` *[String]*

  Additional information to include in the transcription response.
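For example, several of these options can be combined in one call. A sketch following the `transcribe` examples above (the audio path is borrowed from the audio input example earlier in this guide):

```swift
import SwiftAISDK
import OpenAIProvider

// Guide the transcript style with a prompt and keep decoding deterministic.
let result = try await transcribe(
    model: openai.transcription(modelId: "whisper-1"),
    audio: try Data(contentsOf: URL(fileURLWithPath: "./data/galileo.mp3")),
    providerOptions: [
        "openai": [
            "language": "en",
            "prompt": "A lecture about Galileo.",
            "temperature": 0
        ]
    ]
)
print(result.segments)
```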
### Model Capabilities

| Model | Transcription | Duration | Segments | Language |
|---|---|---|---|---|
| `whisper-1` | | | | |
| `gpt-4o-mini-transcribe` | | | | |
| `gpt-4o-transcribe` | | | | |
## Speech Models

You can create models that call the OpenAI speech API using the `.speech()` factory method. The first argument is the model id, e.g. `tts-1`.

```swift
let model = openai.speech(modelId: "tts-1")
```

You can also pass additional provider-specific options using the `providerOptions` argument, for example to supply a voice for the generated audio.

```swift
import SwiftAISDK
import OpenAIProvider

let result = try await generateSpeech(
    model: openai.speech(modelId: "tts-1"),
    text: "Hello, world!",
    providerOptions: ["openai": [:]]
)
```

The following provider options are available:
- `instructions` *String*

  Control the voice of your generated audio with additional instructions, e.g. "Speak in a slow and steady tone". Does not work with `tts-1` or `tts-1-hd` (see the example after this list). Optional.

- `response_format` *String*

  The format of the generated audio. Supported formats are `mp3`, `opus`, `aac`, `flac`, `wav`, and `pcm`. Defaults to `mp3`. Optional.

- `speed` *Double*

  The speed of the generated audio. Select a value from 0.25 to 4.0. Defaults to 1.0. Optional.
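For example, the `instructions` option can steer delivery on a model that supports it. A minimal sketch, using gpt-4o-mini-tts since instructions do not work with tts-1:

```swift
import SwiftAISDK
import OpenAIProvider

// Steer the delivery of the generated audio with `instructions`
// (not supported by tts-1 or tts-1-hd).
let result = try await generateSpeech(
    model: openai.speech(modelId: "gpt-4o-mini-tts"),
    text: "Hello, world!",
    providerOptions: [
        "openai": ["instructions": "Speak in a slow and steady tone."]
    ]
)
```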
### Model Capabilities

| Model | Instructions |
|---|---|
| `tts-1` | No |
| `tts-1-hd` | No |
| `gpt-4o-mini-tts` | Yes |