The OpenAI provider contains language model support for the OpenAI responses, chat, and completion APIs, as well as embedding model support for the OpenAI embeddings API.
The OpenAI provider instance is a function that you can invoke to create a language model:
let model = openai("gpt-5")
It automatically selects the correct API based on the model id.
You can also pass additional settings in the second argument:
let model = openai("gpt-5", /* additional settings */)
The available options depend on the API that’s automatically chosen for the model (see below).
If you want to explicitly select a specific model API, you can use .responses, .chat, or .completion.
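For example:

let responsesModel = openai.responses("gpt-5")
let chatModel = openai.chat("gpt-5")
let completionModel = openai.completion("gpt-3.5-turbo-instruct")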
You can use the OpenAI responses API with the openai(modelId) or openai.responses(modelId) factory methods. It is the default API that is used by the OpenAI provider (since AI SDK 5).
let model = openai("gpt-5")
Further configuration can be done using OpenAI provider options.
import SwiftAISDK
import OpenAIProvider
let result = try await generateText(
    model: openai("gpt-5"), // or openai.responses("gpt-5")
    providerOptions: [
        "openai": [
            "parallelToolCalls": false,
            "store": false,
            "user": "user_123"
        ]
    ],
    prompt: "..."
)
The following provider options are available:
parallelToolCalls: boolean
Whether to use parallel tool calls. Defaults to true.
store: boolean
Whether to store the generation. Defaults to true.
maxToolCalls: integer
The maximum number of total calls to built-in tools that can be processed in a response. This maximum applies across all built-in tool calls, not per individual tool. Any further attempts by the model to call a tool will be ignored.
metadata: Record<string, string>
Additional metadata to store with the generation.
previousResponseId: string
The ID of the previous response. You can use it to continue a conversation. Defaults to undefined.
instructions: string
Instructions for the model. They can be used to change the system or developer message when continuing a conversation using the previousResponseId option. Defaults to undefined.
user: string
A unique identifier representing your end-user, which can help OpenAI to monitor and detect abuse. Defaults to undefined.
reasoningEffort: 'minimal' | 'low' | 'medium' | 'high'
Reasoning effort for reasoning models. Defaults to 'medium'. If you use providerOptions to set the reasoningEffort option, this model setting will be ignored.
reasoningSummary: 'auto' | 'detailed'
Controls whether the model returns its reasoning process. Set to 'auto' for a condensed summary, 'detailed' for more comprehensive reasoning. Defaults to undefined (no reasoning summaries). When enabled, reasoning summaries appear in the stream as events with type 'reasoning' and in non-streaming responses within the reasoning field.
strictJsonSchema: boolean
Whether to use strict JSON schema validation. Defaults to false.
serviceTier: 'auto' | 'flex' | 'priority' | 'default'
Service tier for the request. Set to 'flex' for 50% cheaper processing at the cost of increased latency (available for o3, o4-mini, and gpt-5 models). Set to 'priority' for faster processing with Enterprise access (available for gpt-4, gpt-5, gpt-5-mini, o3, o4-mini; gpt-5-nano is not supported). Defaults to 'auto'.
textVerbosity: 'low' | 'medium' | 'high'
Controls the verbosity of the model's response. Lower values result in more concise responses, while higher values result in more verbose responses. Defaults to 'medium'.
include: Array<string>
Specifies additional content to include in the response. Supported values:
['file_search_call.results'] for including file search results in responses.
['message.output_text.logprobs'] for logprobs.
Defaults to undefined (see the example after this list).
promptCacheKey: string
A cache key for manual prompt caching control. Used by OpenAI to cache responses for similar requests to optimize your cache hit rates.
safetyIdentifier: string
A stable identifier used to help detect users of your application that may be violating OpenAI's usage policies. The ID should be a string that uniquely identifies each user.
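For example, to request logprobs and concise output, a minimal sketch combining the options documented above:

let result = try await generateText(
    model: openai("gpt-5"),
    prompt: "...",
    providerOptions: [
        "openai": [
            "include": ["message.output_text.logprobs"],
            "textVerbosity": "low"
        ]
    ]
)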
The OpenAI responses provider also returns provider-specific metadata:
let result = try await generateText(
    model: openai.responses("gpt-5")
)
let openaiMetadata = result.providerMetadata?.openai
The following OpenAI-specific metadata is returned:
responseId: string
The ID of the response. Can be used to continue a conversation (see the example after this list).
cachedPromptTokens: number
The number of prompt tokens that were a cache hit.
reasoningTokens: number
The number of reasoning tokens that the model generated.
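You can combine responseId and previousResponseId to continue a conversation across calls. A minimal sketch, assuming the metadata value can be passed back directly as a provider option:

let first = try await generateText(
    model: openai("gpt-5"),
    prompt: "Suggest a name for a hiking app."
)

// responseId comes from the provider metadata documented above.
if let responseId = first.providerMetadata?.openai?.responseId {
    let followUp = try await generateText(
        model: openai("gpt-5"),
        prompt: "Now write a tagline for it.",
        providerOptions: [
            "openai": ["previousResponseId": responseId]
        ]
    )
    print(followUp.text)
}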
For reasoning models like gpt-5, you can enable reasoning summaries to see the model’s thought process. Different models support different summarizers—for example, o4-mini supports detailed summaries. Set reasoningSummary: "auto" to automatically receive the richest level available.
import SwiftAISDK
import OpenAIProvider
let result = try streamText(
    model: openai("gpt-5"),
    prompt: "Tell me about the Mission burrito debate in San Francisco.",
    providerOptions: [
        "openai": [
            "reasoningSummary": "detailed" // 'auto' for condensed or 'detailed' for comprehensive
        ]
    ]
)
for try await part in result.fullStream {
    switch part {
    case .reasoning(let delta):
        print("Reasoning: \(delta)")
    case .textDelta(let delta):
        print(delta, terminator: "")
    default:
        break
    }
}
For non-streaming calls with generateText, the reasoning summaries are available in the reasoning field of the response:
import SwiftAISDK
import OpenAIProvider
let result = try await generateText(
    model: openai("gpt-5"),
    prompt: "Tell me about the Mission burrito debate in San Francisco.",
    providerOptions: [
        "openai": ["reasoningSummary": "auto"]
    ]
)
print(result.reasoning) // reasoning summaries, per the reasoning field described above
OpenAI’s Responses API supports multi-modal image generation as a provider-defined tool.
Availability is restricted to specific models (for example, gpt-5 variants).
You can use the image tool with either generateText or streamText:
import SwiftAISDK
import OpenAIProvider
let result = try await generateText(
    model: openai("gpt-5"),
    prompt: "Generate an image of an echidna swimming across the Mozambique channel.",
    tools: [
        "image_generation": openai.tools.imageGeneration(
            OpenAIImageGenerationArgs(outputFormat: "webp")
        )
    ]
)
for toolResult in result.staticToolResults {
    if toolResult.toolName == "image_generation" {
        let base64Image = toolResult.output.result
        // base64Image contains the generated image as base64-encoded data
    }
}
import SwiftAISDK
import OpenAIProvider

let result = try streamText(
    model: openai("gpt-5"),
    prompt: "Generate an image of an echidna swimming across the Mozambique channel.",
    tools: [
        "image_generation": openai.tools.imageGeneration(
            OpenAIImageGenerationArgs(outputFormat: "webp")
        )
    ]
)
The OpenAI responses API supports the code interpreter tool through the openai.tools.codeInterpreter tool.
This allows models to write and execute Python code.
import SwiftAISDK
import OpenAIProvider

let result = try await generateText(
    model: openai("gpt-5"),
    prompt: "Write and run Python code to calculate the factorial of 10",
    tools: [
        "code_interpreter": openai.tools.codeInterpreter()
    ]
)
The OpenAI responses API supports the local shell tool for Codex models through the openai.tools.localShell tool.
Local shell is a tool that allows agents to run shell commands locally on a machine you or the user provides.
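A minimal sketch, assuming openai.tools.localShell accepts an execute closure that receives the requested shell action and returns its output (the parameter shapes and the runCommand helper here are assumptions for illustration):

import SwiftAISDK
import OpenAIProvider

let result = try await generateText(
    model: openai("codex-mini-latest"),
    prompt: "List the files in my home directory.",
    tools: [
        "local_shell": openai.tools.localShell(
            execute: { action in
                // Hypothetical handler: run the requested command locally.
                // A real implementation should sandbox and validate commands.
                let output = try runCommand(action.command) // runCommand is a hypothetical helper
                return ["output": output]
            }
        )
    ]
)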
The Responses API also accepts PDF files as input. To reference a file uploaded to the OpenAI Files API, pass its file id:
[
    "type": "file",
    "data": "file-8EFBcWHsQxZV7YGezBC1fq",
    "mediaType": "application/pdf"
]
You can also pass the URL of a PDF:
[
    "type": "file",
    "data": "https://sample.edu/example.pdf",
    "mediaType": "application/pdf",
    "filename": "ai.pdf" // optional
]
The model will have access to the contents of the PDF file and
respond to questions about it.
The PDF file should be passed using the data field,
and the mediaType should be set to 'application/pdf'.
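Putting it together, a minimal sketch of a message that attaches a PDF, assuming messages accept the content-part dictionaries shown above:

import SwiftAISDK
import OpenAIProvider

let result = try await generateText(
    model: openai("gpt-5"),
    messages: [
        [
            "role": "user",
            "content": [
                ["type": "text", "text": "What is this paper about?"],
                [
                    "type": "file",
                    "data": "https://sample.edu/example.pdf",
                    "mediaType": "application/pdf"
                ]
            ]
        ]
    ]
)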
The OpenAI Responses API supports structured outputs. You can enforce structured outputs using generateObject or streamObject, which expose a schema option. Additionally, you can pass a schema to the experimental_output option when using generateText or streamText.
// Using generateObject
import SwiftAISDK
import OpenAIProvider
struct Ingredient: Codable, Sendable { let name: String; let amount: String }
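A minimal sketch of the rest of the call; the exact generateObject signature (a Codable type as the schema argument) is an assumption of this example:

struct Recipe: Codable, Sendable {
    let name: String
    let ingredients: [Ingredient]
    let steps: [String]
}

let result = try await generateObject(
    model: openai("gpt-5"),
    schema: Recipe.self, // assumed: schema derived from a Codable type
    prompt: "Generate a lasagna recipe."
)
print(result.object.name)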
You can create models that call the OpenAI chat API using the .chat() factory method.
The first argument is the model id, e.g. gpt-4.
The OpenAI chat models support tool calls and some have multi-modal capabilities.
let model = openai.chat("gpt-5")
OpenAI chat models also support some model-specific provider options that are not part of the standard call settings.
You can pass them in the providerOptions argument:
import SwiftAISDK
import OpenAIProvider
let model = openai.chat("gpt-5")
let result = try await generateText(
    model: model,
    prompt: "Hello!",
    providerOptions: [
        "openai": [
            "logitBias": [
                "50256": -100 // optional likelihood for specific tokens
            ],
            "user": "test-user" // optional unique user identifier
        ]
    ]
)
The following optional provider options are available for OpenAI chat models:
logitBias: Record<number, number>
Modifies the likelihood of specified tokens appearing in the completion. Accepts a JSON object that maps tokens (specified by their token ID in the GPT tokenizer) to an associated bias value from -100 to 100. You can use a tokenizer tool to convert text to token IDs. Mathematically, the bias is added to the logits generated by the model prior to sampling. The exact effect will vary per model, but values between -1 and 1 should decrease or increase the likelihood of selection; values like -100 or 100 should result in a ban or exclusive selection of the relevant token. As an example, you can pass {"50256": -100} to prevent the <|endoftext|> token from being generated.
logprobs: boolean | number
Return the log probabilities of the tokens. Including logprobs will increase the response size and can slow down response times. However, it can be useful to better understand how the model is behaving. Setting to true will return the log probabilities of the tokens that were generated. Setting to a number will return the log probabilities of the top n tokens that were generated.
parallelToolCalls: boolean
Whether to enable parallel function calling during tool use. Defaults to true.
user: string
A unique identifier representing your end-user, which can help OpenAI to monitor and detect abuse.
reasoningEffort: 'minimal' | 'low' | 'medium' | 'high'
Reasoning effort for reasoning models. Defaults to 'medium'. If you use providerOptions to set the reasoningEffort option, this model setting will be ignored.
structuredOutputs: boolean
Whether to use structured outputs. Defaults to true. When enabled, tool calls and object generation will be strict and follow the provided schema.
maxCompletionTokens: number
Maximum number of completion tokens to generate. Useful for reasoning models.
serviceTier: 'auto' | 'flex' | 'priority' | 'default'
Service tier for the request. Set to 'flex' for 50% cheaper processing at the cost of increased latency (available for o3, o4-mini, and gpt-5 models). Set to 'priority' for faster processing with Enterprise access (available for gpt-4, gpt-5, gpt-5-mini, o3, o4-mini; gpt-5-nano is not supported). Defaults to 'auto'.
strictJsonSchema: boolean
Whether to use strict JSON schema validation. Defaults to false.
textVerbosity: 'low' | 'medium' | 'high'
Controls the verbosity of the model's responses. Lower values will result in more concise responses, while higher values will result in more verbose responses.
Tip: Instead of manually building providerOptions, you can use the Swift helper openai.options.responses(...) to produce the same dictionary. For example:
let options = openai.options.responses(
include: [.fileSearchCallResults],
serviceTier: "auto",
reasoningEffort: "high"
)
promptCacheKey: string
A cache key for manual prompt caching control. Used by OpenAI to cache responses for similar requests to optimize your cache hit rates (see the example after this list).
safetyIdentifier: string
A stable identifier used to help detect users of your application that may be violating OpenAI's usage policies. The ID should be a string that uniquely identifies each user.
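For example, to improve cache hit rates across similar requests, a minimal sketch using the two options above:

let result = try await generateText(
    model: openai.chat("gpt-5"),
    prompt: "...",
    providerOptions: [
        "openai": [
            "promptCacheKey": "customer-support-v1",
            "safetyIdentifier": "user_123"
        ]
    ]
)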
OpenAI has introduced the o1, o3, and o4 series of reasoning models.
Currently, o4-mini, o3, o3-mini, and o1 are available via both the chat and responses APIs. The
models codex-mini-latest and computer-use-preview are available only via the responses API.
Reasoning models currently only generate text, have several limitations, and are only supported using generateText and streamText.
They support additional settings and response metadata:
You can use providerOptions to set
the reasoningEffort option (or alternatively the reasoningEffort model setting), which determines the amount of reasoning the model performs.
You can use response providerMetadata to access the number of reasoning tokens that the model generated.
import SwiftAISDK
import OpenAIProvider

let result = try await generateText(
    model: openai.chat("gpt-5"),
    prompt: "Invent a new holiday and describe its traditions.",
    providerOptions: [
        "openai": ["reasoningEffort": "low"]
    ]
)

let openaiMetadata = result.providerMetadata?.openai
let reasoningTokens = openaiMetadata?.reasoningTokens
OpenAI supports predicted outputs for gpt-4o and gpt-4o-mini.
Predicted outputs help you reduce latency by allowing you to specify a base text that the model should modify.
You can enable predicted outputs by adding the prediction option to the providerOptions.openai object:
let result = try streamText(
    model: openai.chat("gpt-4o"),
    messages: [
        [
            "role": "user",
            "content": "Replace the Username property with an Email property."
        ],
        [
            "role": "user",
            "content": existingCode
        ]
    ],
    providerOptions: [
        "openai": [
            "prediction": [
                "type": "content",
                "content": existingCode
            ]
        ]
    ]
)
OpenAI provides usage information for predicted outputs (acceptedPredictionTokens and rejectedPredictionTokens).
You can access it in the providerMetadata object.
let openaiMetadata = try await result.providerMetadata?.openai
let acceptedPredictionTokens = openaiMetadata?.acceptedPredictionTokens
let rejectedPredictionTokens = openaiMetadata?.rejectedPredictionTokens
OpenAI supports model distillation for some models.
If you want to store a generation for use in the distillation process, you can add the store option to the providerOptions.openai object.
This will save the generation to the OpenAI platform for later use in distillation.
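For example (a minimal sketch; the model id is illustrative, and store and metadata are the provider options documented above):

import SwiftAISDK
import OpenAIProvider

let result = try await generateText(
    model: openai.chat("gpt-4o-mini"),
    prompt: "Who worked on the original macintosh?",
    providerOptions: [
        "openai": [
            "store": true,
            "metadata": ["custom": "value"]
        ]
    ]
)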
OpenAI has introduced Prompt Caching for supported models
including gpt-4o and gpt-4o-mini.
Prompt caching is automatically enabled for these models when the prompt is 1024 tokens or longer. It does not need to be explicitly enabled.
You can use response providerMetadata to access the number of prompt tokens that were a cache hit.
Note that caching behavior is dependent on load on OpenAI’s infrastructure. Prompt prefixes generally remain in the
cache following 5-10 minutes of inactivity before they are evicted, but during off-peak periods they may persist for up
to an hour.
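For example, following the metadata pattern shown earlier (longPrompt is a placeholder for a prompt of 1024 tokens or more):

let result = try await generateText(
    model: openai("gpt-4o-mini"),
    prompt: longPrompt // placeholder: a prompt of 1024+ tokens
)

let openaiMetadata = result.providerMetadata?.openai
let cachedPromptTokens = openaiMetadata?.cachedPromptTokens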
You can create models that call the OpenAI completions API using the .completion() factory method.
The first argument is the model id.
Currently only gpt-3.5-turbo-instruct is supported.
let model = openai.completion("gpt-3.5-turbo-instruct")
OpenAI completion models also support some model-specific settings that are not part of the standard call settings.
You can pass them as an options argument:
let model = openai.completion("gpt-3.5-turbo-instruct")
try await model.doGenerate(
    providerOptions: [
        "openai": [
            "echo": true, // optional, echo the prompt in addition to the completion
            "logitBias": [
                // optional likelihood for specific tokens
                "50256": -100
            ],
            "suffix": "some text", // optional suffix that comes after a completion of inserted text
            "user": "test-user" // optional unique user identifier
        ]
    ]
)
The following optional provider options are available for OpenAI completion models:
echo: boolean
Echo back the prompt in addition to the completion.
logitBias: Record<number, number>
Modifies the likelihood of specified tokens appearing in the completion. Accepts a JSON object that maps tokens (specified by their token ID in the GPT tokenizer) to an associated bias value from -100 to 100. You can use a tokenizer tool to convert text to token IDs. Mathematically, the bias is added to the logits generated by the model prior to sampling. The exact effect will vary per model, but values between -1 and 1 should decrease or increase the likelihood of selection; values like -100 or 100 should result in a ban or exclusive selection of the relevant token. As an example, you can pass {"50256": -100} to prevent the <|endoftext|> token from being generated.
logprobs: boolean | number
Return the log probabilities of the tokens. Including logprobs will increase the response size and can slow down response times. However, it can be useful to better understand how the model is behaving. Setting to true will return the log probabilities of the tokens that were generated. Setting to a number will return the log probabilities of the top n tokens that were generated.
suffix: string
The suffix that comes after a completion of inserted text.
user: string
A unique identifier representing your end-user, which can help OpenAI to monitor and detect abuse.
You can pass optional providerOptions to the image model. These are prone to change by OpenAI and are model dependent. For example, the gpt-image-1 model supports the quality option:
let result = try await generateImage(
    model: openai.image("gpt-image-1"),
    prompt: "A salamander at sunrise in a forest pond in the Seychelles.",
    providerOptions: [
        "openai": ["quality": "high"]
    ]
)
You can create models that call the OpenAI transcription API
using the .transcription() factory method.
The first argument is the model id, e.g. whisper-1.
let model = openai.transcription("whisper-1")
You can also pass additional provider-specific options using the providerOptions argument. For example, supplying the input language in ISO-639-1 (e.g. en) format will improve accuracy and latency.
import SwiftAISDK
import OpenAIProvider
let result = try await transcribe(
    model: openai.transcription("whisper-1"),
    audio: Data([1, 2, 3, 4]),
    providerOptions: ["openai": ["language": "en"]]
)
To get word-level timestamps, specify the granularity:
import SwiftAISDK
import OpenAIProvider
let result = try await transcribe(
    model: openai.transcription("whisper-1"),
    audio: Data([1, 2, 3, 4]),
    providerOptions: [
        "openai": [
            "timestampGranularities": ["word"]
        ]
    ]
)

// Access word-level timestamps
print(result.segments) // Array of segments with startSecond/endSecond
The following provider options are available:
timestampGranularities: string[]
The granularity of the timestamps in the transcription. Defaults to ['segment']. Possible values are ['word'], ['segment'], and ['word', 'segment'].
Note: There is no additional latency for segment timestamps, but generating word timestamps incurs additional latency.
language: string
The language of the input audio. Supplying the input language in ISO-639-1 format (e.g. 'en') will improve accuracy and latency. Optional.
prompt: string
An optional text to guide the model's style or continue a previous audio segment. The prompt should match the audio language. Optional.
temperature: number
The sampling temperature, between 0 and 1. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. If set to 0, the model will use log probability to automatically increase the temperature until certain thresholds are hit. Defaults to 0. Optional.
include: string[]
Additional information to include in the transcription response.
You can create models that call the OpenAI speech API
using the .speech() factory method.
The first argument is the model id, e.g. tts-1.
let model = openai.speech("tts-1")
You can also pass additional provider-specific options using the providerOptions argument. For example, supplying a voice to use for the generated audio.
import SwiftAISDK
import OpenAIProvider
let result = try await generateSpeech(
model: openai.speech("tts-1"),
text: "Hello, world!",
providerOptions: ["openai": [:]]
)
The following provider options are available (see the example after the list):
instructions: string
Control the voice of your generated audio with additional instructions, e.g. "Speak in a slow and steady tone". Does not work with tts-1 or tts-1-hd. Optional.
response_format: string
The format of the output audio. Supported formats are mp3, opus, aac, flac, wav, and pcm. Defaults to mp3. Optional.
speed: number
The speed of the generated audio. Select a value from 0.25 to 4.0. Defaults to 1.0. Optional.
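Putting these together, a minimal sketch; gpt-4o-mini-tts is assumed here because instructions does not work with tts-1 or tts-1-hd:

let result = try await generateSpeech(
    model: openai.speech("gpt-4o-mini-tts"), // assumed model id that supports instructions
    text: "Hello, world!",
    providerOptions: [
        "openai": [
            "instructions": "Speak in a slow and steady tone",
            "response_format": "wav",
            "speed": 1.2
        ]
    ]
)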