
Building Agents

This page adapts the original AI SDK documentation: Building Agents.

The Agent class provides a structured way to encapsulate LLM configuration, tools, and behavior into reusable components. It handles the agent loop for you, allowing the LLM to call tools multiple times in sequence to accomplish complex tasks. Define agents once and use them across your application.

When building AI applications, you often need to:

  • Reuse configurations - Same model settings, tools, and prompts across different parts of your application
  • Maintain consistency - Ensure the same behavior and capabilities throughout your codebase
  • Simplify API routes - Reduce boilerplate in your endpoints
  • Type safety - Get full Swift type checking for your agent’s tools and outputs

The Agent class provides a single place to define your agent’s behavior.

Define an agent by instantiating the Swift Agent with your configuration:

import SwiftAISDK
import OpenAIProvider
let myAgent = Agent<Never, Never>(settings: .init(
    model: .v3(openai("gpt-4o")),
    system: "You are a helpful assistant.",
    tools: [:]
))

The Agent class accepts all the same settings as generateText and streamText. For example, configure the model and system prompt:

let agent = Agent<Never, Never>(settings: .init(
    model: .v3(openai("gpt-4o")),
    system: "You are an expert software engineer."
))

Provide tools that the agent can use to accomplish tasks:

struct CodeInput: Codable, Sendable { let code: String }
struct CodeResult: Codable, Sendable { let output: String }
let runCode = tool(
    description: "Execute Python code",
    inputSchema: CodeInput.self
) { payload, _ in
    // Execute code (omitted) and return result
    CodeResult(output: "Code executed successfully")
}
let codeAgent = Agent<Never, Never>(settings: .init(
    model: .v3(openai("gpt-4o")),
    tools: ["runCode": runCode.eraseToTool()]
))
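To illustrate why the tool input and result are Codable structs, here is a minimal, SDK-independent sketch. It assumes the model supplies tool arguments as JSON text. The `runTool` helper is hypothetical and only models the decode, execute, encode round trip; it is not a Swift AI SDK function.

```swift
import Foundation

// Same shapes as the CodeInput / CodeResult structs in the example above.
struct CodeInput: Codable, Sendable { let code: String }
struct CodeResult: Codable, Sendable { let output: String }

// Hypothetical helper (not an SDK function): decode raw JSON arguments into the
// typed input, run the handler, and encode the typed result back to JSON.
func runTool(argumentsJSON: String, handler: (CodeInput) -> CodeResult) throws -> String {
    let input = try JSONDecoder().decode(CodeInput.self, from: Data(argumentsJSON.utf8))
    let result = handler(input)
    let encoded = try JSONEncoder().encode(result)
    return String(decoding: encoded, as: UTF8.self)
}

let toolOutput = try runTool(argumentsJSON: #"{"code": "print(1 + 1)"}"#, handler: { input in
    CodeResult(output: "Received \(input.code.count) characters of code")
})
print(toolOutput)
```

Arguments that do not match the input type fail at the decoding step, so the handler only ever sees well-formed input.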

By default, agents run for a single step (stopWhen: stepCountIs(1)). In each step, the model either generates text or calls a tool. If it generates text, the agent completes. If it calls a tool, the AI SDK executes that tool.

To let agents call multiple tools in sequence, configure stopWhen to allow more steps. After each tool execution, the agent triggers a new generation where the model can call another tool or generate text:

let agent = Agent<Never, Never>(settings: .init(
    model: .v3(openai("gpt-4o")),
    stopWhen: [stepCountIs(20)] // Allow up to 20 steps
))

Each step represents one generation (which results in either text or a tool call). The loop continues until:

  • The model generates text instead of calling a tool, or
  • A stop condition is met
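The loop described above can be modeled without the SDK. The following is a conceptual sketch only: `StepResult`, `runAgentLoop`, and the scripted model are hypothetical names for illustration, not the library's implementation. It shows how a step limit decides whether the model gets a second generation after a tool call.

```swift
// Conceptual model of one generation: the model either answers or asks for a tool.
enum StepResult {
    case text(String)
    case toolCall(name: String)
}

// Hypothetical loop, not the SDK's implementation. `generate` stands in for the
// model and receives the results of earlier tool calls.
func runAgentLoop(
    maxSteps: Int,
    tools: [String: () -> String],
    generate: ([String]) -> StepResult
) -> (text: String?, steps: Int) {
    var toolResults: [String] = []
    var steps = 0
    while steps < maxSteps {              // stop condition: step limit reached
        steps += 1
        switch generate(toolResults) {
        case .text(let answer):
            return (answer, steps)        // text ends the loop
        case .toolCall(let name):
            // Execute the tool; the next iteration is a new generation.
            toolResults.append(tools[name]?() ?? "unknown tool: \(name)")
        }
    }
    return (nil, steps)                   // stopped before the model produced text
}

// Scripted stand-in model: call the weather tool once, then answer.
let scriptedModel: ([String]) -> StepResult = { results in
    if results.isEmpty {
        return StepResult.toolCall(name: "weather")
    }
    return StepResult.text("Forecast: \(results[0])")
}
let demoTools: [String: () -> String] = ["weather": { "sunny" }]

let single = runAgentLoop(maxSteps: 1, tools: demoTools, generate: scriptedModel)
let multi = runAgentLoop(maxSteps: 20, tools: demoTools, generate: scriptedModel)
print(single.text ?? "no text", single.steps)
print(multi.text ?? "no text", multi.steps)
```

With a limit of one step, the loop ends right after the tool call and no text is produced. With a higher limit, the second generation sees the tool result and answers.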

You can combine multiple conditions:

let agent = Agent<Never, Never>(settings: .init(
    model: .v3(openai("gpt-4o")),
    stopWhen: [
        stepCountIs(20)
        // Add custom conditions as needed
    ]
))
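One way to think about combined conditions is as a list of predicates over the steps taken so far. The sketch below is a standalone model, not the SDK's types: `StepRecord`, `StopCondition`, `stepLimit`, `calledTool`, and `shouldStop` are hypothetical names. It also assumes that the loop stops as soon as any one condition is met.

```swift
// Hypothetical record of one completed step (not an SDK type).
struct StepRecord {
    let toolCalls: [String]
}

// A stop condition inspects the steps so far and says whether to stop.
typealias StopCondition = ([StepRecord]) -> Bool

// Stand-in for a step-count condition, for illustration only.
func stepLimit(_ count: Int) -> StopCondition {
    return { steps in steps.count >= count }
}

// Hypothetical custom condition: stop once the latest step called a given tool.
func calledTool(_ name: String) -> StopCondition {
    return { steps in steps.last?.toolCalls.contains(name) ?? false }
}

// Assumption: the loop stops as soon as any one condition is met.
func shouldStop(_ conditions: [StopCondition], after steps: [StepRecord]) -> Bool {
    return conditions.contains { condition in condition(steps) }
}

let conditions: [StopCondition] = [stepLimit(20), calledTool("finalAnswer")]
let finished = [StepRecord(toolCalls: ["webSearch"]), StepRecord(toolCalls: ["finalAnswer"])]
let unfinished = [StepRecord(toolCalls: ["webSearch"])]
print(shouldStop(conditions, after: finished))
print(shouldStop(conditions, after: unfinished))
```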

Learn more about loop control and stop conditions.

Control how the agent uses tools:

let agent = Agent<Never, Never>(settings: .init(
    model: .v3(openai("gpt-4o")),
    tools: [:], // replace with your tools
    toolChoice: .required // or .none / .auto (default)
))

You can also force the use of a specific tool:

let agent = Agent<Never, Never>(settings: .init(
    model: .v3(openai("gpt-4o")),
    tools: [
        "weather": weatherTool,
        "cityAttractions": attractionsTool
    ],
    toolChoice: .tool(name: "weather") // Force the weather tool to be used
))
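The four tool-choice options can be summarized as "which tools may the model call, and must it call one?". The following standalone sketch encodes that reading. The `ToolChoice` enum here is a hypothetical mirror of the options shown above, and `allowedTools` is an illustrative helper, not part of the SDK.

```swift
// Hypothetical mirror of the tool-choice options shown above (not the SDK type).
enum ToolChoice {
    case auto
    case none
    case required
    case tool(name: String)
}

// Illustrative helper: which tools may be called, and is a tool call mandatory?
func allowedTools(for choice: ToolChoice, available: [String]) -> (names: [String], mustCall: Bool) {
    switch choice {
    case .auto:
        return (available, false)          // model decides whether to call a tool
    case .none:
        return ([], false)                 // tool calls are disabled
    case .required:
        return (available, true)           // some tool must be called
    case .tool(let name):
        return (available.filter { $0 == name }, true) // this tool must be called
    }
}

let available = ["weather", "cityAttractions"]
let forced = allowedTools(for: .tool(name: "weather"), available: available)
let disabled = allowedTools(for: ToolChoice.none, available: available)
print(forced.names, forced.mustCall)
print(disabled.names, disabled.mustCall)
```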

Define structured output schemas:

// Define your structured output spec (example):
// let outputSpec = Output.Specification<MyOutput, Never>.object(...)
let analysisAgent = Agent<MyOutput, Never>(settings: .init(
    model: .v3(openai("gpt-4o")),
    experimentalOutput: outputSpec,
    stopWhen: [stepCountIs(10)]
))
let result: DefaultGenerateTextResult<MyOutput> = try await analysisAgent.generate(
    prompt: .text("Analyze customer feedback from the last quarter")
)
let output = try result.experimentalOutput
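The example above leaves MyOutput undefined. As a rough sketch, a structured output type could be a Codable struct, and the model's JSON would be validated by decoding into it. The fields below are hypothetical. The decoding uses Foundation's JSONDecoder purely to illustrate the idea, and makes no claim about how the SDK parses output internally.

```swift
import Foundation

// Hypothetical shape for MyOutput; the original example leaves it undefined.
struct MyOutput: Codable, Sendable {
    let sentiment: String
    let topIssues: [String]
}

// JSON as a model might return it for the feedback-analysis prompt (made-up sample).
let modelJSON = #"{"sentiment": "mixed", "topIssues": ["shipping delays", "checkout errors"]}"#
let parsed = try JSONDecoder().decode(MyOutput.self, from: Data(modelJSON.utf8))
print(parsed.sentiment, parsed.topIssues.count)

// Output that does not match the type fails to decode instead of passing through.
let invalidJSON = #"{"sentiment": 3}"#
let invalid = try? JSONDecoder().decode(MyOutput.self, from: Data(invalidJSON.utf8))
print(invalid == nil)
```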

System prompts define your agent’s behavior, personality, and constraints. They set the context for all interactions and guide how the agent responds to user queries and uses tools.

Set the agent’s role and expertise:

let agent = Agent<Never, Never>(settings: .init(
    model: .v3(openai("gpt-4o")),
    system: "You are an expert data analyst. You provide clear insights from complex data."
))

Provide specific guidelines for agent behavior:

let codeReviewAgent = Agent<Never, Never>(settings: .init(
    model: .v3(openai("gpt-4o")),
    system: """
    You are a senior software engineer conducting code reviews.
    Your approach:
    - Focus on security vulnerabilities first
    - Identify performance bottlenecks
    - Suggest improvements for readability and maintainability
    - Be constructive and educational in your feedback
    - Always explain why something is an issue and how to fix it
    """
))

Set boundaries and ensure consistent behavior:

let customerSupportAgent = Agent<Never, Never>(settings: .init(
    model: .v3(openai("gpt-4o")),
    system: """
    You are a customer support specialist for an e-commerce platform.
    Rules:
    - Never make promises about refunds without checking the policy
    - Always be empathetic and professional
    - If you don't know something, say so and offer to escalate
    - Keep responses concise and actionable
    - Never share internal company information
    """,
    tools: [
        "checkOrderStatus": checkOrderStatus,
        "lookupPolicy": lookupPolicy,
        "createTicket": createTicket
    ]
))

Guide how the agent should use available tools:

let researchAgent = Agent<Never, Never>(settings: .init(
    model: .v3(openai("gpt-4o")),
    system: """
    You are a research assistant with access to search and document tools.
    When researching:
    1. Always start with a broad search to understand the topic
    2. Use document analysis for detailed information
    3. Cross-reference multiple sources before drawing conclusions
    4. Cite your sources when presenting information
    5. If information conflicts, present both viewpoints
    """,
    tools: [
        "webSearch": webSearch,
        "analyzeDocument": analyzeDocument,
        "extractQuotes": extractQuotes
    ]
))

Control the output format and communication style:

let technicalWriterAgent = Agent<Never, Never>(settings: .init(
    model: .v3(openai("gpt-4o")),
    system: """
    You are a technical documentation writer.
    Writing style:
    - Use clear, simple language
    - Avoid jargon unless necessary
    - Structure information with headers and bullet points
    - Include code examples where relevant
    - Write in second person ("you" instead of "the user")
    Always format responses in Markdown.
    """
))

Once defined, you can use your agent in three ways:

Use generate() for one-time text generation:

let result = try await myAgent.generate(prompt: .text("What is the weather like?"))
print(result.text)

Use stream() for streaming responses:

let stream = try myAgent.stream(prompt: .text("Tell me a story"))
for try await chunk in stream.textStream {
    print(chunk, terminator: "")
}
print()
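The consumption pattern above works with any async sequence of text chunks. To try it without a model or network access, the sketch below substitutes a hand-built AsyncThrowingStream for the agent's text stream. `makeTextStream` and `collect` are hypothetical helpers for illustration, not SDK functions.

```swift
// Stand-in for a text stream: yields fixed chunks the way a streaming response would.
func makeTextStream(_ chunks: [String]) -> AsyncThrowingStream<String, Error> {
    return AsyncThrowingStream { continuation in
        for chunk in chunks {
            continuation.yield(chunk)
        }
        continuation.finish()
    }
}

// Print chunks as they arrive and also accumulate the full text.
func collect(_ stream: AsyncThrowingStream<String, Error>) async throws -> String {
    var fullText = ""
    for try await chunk in stream {
        print(chunk, terminator: "")
        fullText += chunk
    }
    print()
    return fullText
}

let story = try await collect(makeTextStream(["Once ", "upon ", "a ", "time."]))
```

Accumulating the chunks while printing them is useful when the complete text is also needed after the stream ends, for example to store it.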

Use respond() to produce a UI message stream response on the server:

// Example concept for a server route handler
let response = try myAgent.respond(messages: uiMessages)
// Pipe to an HTTP response using `pipeUIMessageStreamToResponse` if desired

Note: React hook examples (useChat) are JS‑only and not part of the Swift AI SDK.

Now that you understand building agents, you can: