Building Agents

This page adapts the original AI SDK documentation: Building Agents.

The Agent class provides a structured way to encapsulate LLM configuration, tools, and behavior into reusable components. It handles the agent loop for you, allowing the LLM to call tools multiple times in sequence to accomplish complex tasks. Define agents once and use them across your application.

Why Use the Agent Class?

When building AI applications, you often need to:

Reuse configurations - Same model settings, tools, and prompts across different parts of your application
Maintain consistency - Ensure the same behavior and capabilities throughout your codebase
Simplify API routes - Reduce boilerplate in your endpoints
Type safety - Get full Swift type checking for your agent’s tools and outputs

The Agent class provides a single place to define your agent’s behavior.

Creating an Agent

Define an agent by instantiating the Swift Agent with your configuration:

import SwiftAISDK
import OpenAIProvider

let myAgent = Agent<Never, Never>(settings: .init(
  model: .v3(openai("gpt-4o")),
  system: "You are a helpful assistant.",
  tools: [:]
))

Configuration Options

The Agent class accepts the same settings as generateText and streamText. The most commonly configured options are described below.

Model and System Prompt

let agent = Agent<Never, Never>(settings: .init(
  model: .v3(openai("gpt-4o")),
  system: "You are an expert software engineer."
))

Tools

Provide tools that the agent can use to accomplish tasks:

struct CodeInput: Codable, Sendable { let code: String }
struct CodeResult: Codable, Sendable { let output: String }

let runCode = tool(
  description: "Execute Python code",
  inputSchema: CodeInput.self
) { payload, _ in
  // Execute code (omitted) and return result
  CodeResult(output: "Code executed successfully")
}

let codeAgent = Agent<Never, Never>(settings: .init(
  model: .v3(openai("gpt-4o")),
  tools: ["runCode": runCode.eraseToTool()]
))
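
A tool's Codable input type describes the shape of the arguments the model supplies as JSON. As a rough sketch using only Foundation, the snippet below decodes a sample argument payload into the CodeInput struct from above and encodes a CodeResult back to JSON. The sample JSON string is made up for illustration, and how the SDK performs this conversion internally is an assumption rather than something this page documents.

```swift
import Foundation

// Same shapes as the tool's input and result types above.
struct CodeInput: Codable, Sendable { let code: String }
struct CodeResult: Codable, Sendable { let output: String }

// Hypothetical arguments, as a model might emit them for the runCode tool.
let rawArguments = #"{"code": "print(1 + 1)"}"#

// Decode the JSON arguments into the typed input.
let input = try JSONDecoder().decode(CodeInput.self, from: Data(rawArguments.utf8))

// Encode a typed result back to JSON (sorted keys keep the output stable).
let encoder = JSONEncoder()
encoder.outputFormatting = [.sortedKeys]
let resultData = try encoder.encode(CodeResult(output: "2"))
let resultJSON = String(decoding: resultData, as: UTF8.self)

print(input.code)
print(resultJSON)
```

If the arguments do not match the input type (for example, a missing code field), decoding throws instead of handing the tool a partially filled value.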

Loop Control

By default, agents run for 20 steps (stopWhen: stepCountIs(20)). In each step, the model either generates text or calls a tool. If it generates text, the agent completes. If it calls a tool, the AI SDK executes that tool.

To change how many tools an agent can call in sequence, set stopWhen to a different step limit. After each tool execution, the agent triggers a new generation in which the model can call another tool or generate text:

let agent = Agent<Never, Never>(settings: .init(
  model: .v3(openai("gpt-4o")),
  stopWhen: [stepCountIs(20)] // Allow up to 20 steps
))

Each step represents one generation (which results in either text or a tool call). The loop continues until:

The model generates text instead of calling a tool, or
A stop condition is met
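
The loop described above can be sketched in plain Swift. This is a simplified illustration, not the SDK's implementation: StepResult, runLoop, the scripted model closure, and the echo tool are hypothetical names introduced here, and a step limit stands in for the stop condition.

```swift
// Hypothetical outcome of one generation step: final text or a tool call.
enum StepResult {
  case text(String)
  case toolCall(name: String, input: String)
}

// Runs steps until the model produces text or maxSteps is reached.
// `model` sees the transcript so far; `executeTool` runs a named tool.
func runLoop(
  maxSteps: Int,
  model: ([String]) -> StepResult,
  executeTool: (String, String) -> String
) -> (text: String?, steps: Int) {
  var transcript: [String] = []
  var steps = 0
  while steps < maxSteps {
    steps += 1
    switch model(transcript) {
    case .text(let answer):
      // Text ends the loop immediately.
      return (answer, steps)
    case .toolCall(let name, let input):
      // A tool call is executed and its result feeds the next step.
      transcript.append("\(name) -> \(executeTool(name, input))")
    }
  }
  // The step limit was hit before the model produced text.
  return (nil, steps)
}

// Scripted stand-in for a model: call a tool twice, then answer.
let scriptedModel: ([String]) -> StepResult = { transcript in
  transcript.count < 2
    ? .toolCall(name: "lookup", input: "query \(transcript.count + 1)")
    : .text("Done after \(transcript.count) tool calls")
}
let echoTool: (String, String) -> String = { _, input in "result for \(input)" }

let finished = runLoop(maxSteps: 20, model: scriptedModel, executeTool: echoTool)
let cutOff = runLoop(maxSteps: 2, model: scriptedModel, executeTool: echoTool)
print(finished.text ?? "no text", finished.steps)
print(cutOff.text ?? "no text", cutOff.steps)
```

With a limit of 20 the scripted model finishes on its third step. With a limit of 2 the loop ends after the second tool call and no final text is produced, which is the trade-off to keep in mind when choosing a step limit.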

You can combine multiple conditions:

let agent = Agent<Never, Never>(settings: .init(
  model: .v3(openai("gpt-4o")),
  stopWhen: [
    stepCountIs(20)
    // Add custom conditions as needed
  ]
))
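
To illustrate how several conditions might interact, here is a minimal sketch that assumes the loop stops as soon as any one condition in the array is satisfied. Step, StopCondition, stepLimit, calledTool, and shouldStop are hypothetical stand-ins written for this example; they are not the SDK's types, and the SDK's actual signature for custom conditions may differ.

```swift
// Hypothetical record of a completed step: the names of the tools it called.
struct Step {
  let toolNames: [String]
}

// Hypothetical stop condition: inspects the steps so far, returns true to stop.
typealias StopCondition = ([Step]) -> Bool

// Stop once a given number of steps has run.
func stepLimit(_ limit: Int) -> StopCondition {
  return { steps in steps.count >= limit }
}

// Stop once the most recent step called a specific tool.
func calledTool(_ name: String) -> StopCondition {
  return { steps in steps.last?.toolNames.contains(name) ?? false }
}

// Any-of semantics (assumed): stop when at least one condition holds.
func shouldStop(_ conditions: [StopCondition], after steps: [Step]) -> Bool {
  return conditions.contains { $0(steps) }
}

let conditions = [stepLimit(3), calledTool("submitAnswer")]

let searching = [Step(toolNames: ["webSearch"])]
let submitted = searching + [Step(toolNames: ["submitAnswer"])]
let exhausted = Array(repeating: Step(toolNames: ["webSearch"]), count: 3)

print(shouldStop(conditions, after: searching))  // false
print(shouldStop(conditions, after: submitted))  // true
print(shouldStop(conditions, after: exhausted))  // true
```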

Learn more about loop control and stop conditions.

Tool Choice

Control how the agent uses tools:

let agent = Agent<Never, Never>(settings: .init(
  model: .v3(openai("gpt-4o")),
  tools: [:], // add your tools here
  toolChoice: .required // or .none / .auto (default)
))

You can also force the use of a specific tool:

let agent = Agent<Never, Never>(settings: .init(
  model: .v3(openai("gpt-4o")),
  tools: [
    "weather": weatherTool,
    "cityAttractions": attractionsTool
  ],
  toolChoice: .tool(name: "weather") // Force the weather tool to be used
))
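
The effect of each tool choice option can be summarized with a small model. This sketch reflects the usual meaning of these options (auto lets the model decide, none disables tool calls, required forces some tool call, and a named tool forces that tool); ToolChoiceMode and constraints(for:available:) are hypothetical names for illustration, not part of the SDK.

```swift
// Hypothetical mirror of the tool choice options shown above.
enum ToolChoiceMode {
  case auto                 // model may call a tool or answer with text
  case none                 // model must answer with text
  case required             // model must call some tool
  case tool(name: String)   // model must call the named tool
}

// Returns the tools the model may call and whether it may reply with plain text.
func constraints(
  for choice: ToolChoiceMode,
  available: [String]
) -> (callable: [String], textAllowed: Bool) {
  switch choice {
  case .auto: return (available, true)
  case .none: return ([], true)
  case .required: return (available, false)
  case .tool(let name): return (available.filter { $0 == name }, false)
  }
}

let toolNames = ["weather", "cityAttractions"]
let forced = constraints(for: .tool(name: "weather"), available: toolNames)
print(forced.callable, forced.textAllowed)
```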

Structured Output

Define structured output schemas:

// Define your structured output spec (example):
// let outputSpec = Output.Specification<MyOutput, Never>.object(...)

let analysisAgent = Agent<MyOutput, Never>(settings: .init(
  model: .v3(openai("gpt-4o")),
  experimentalOutput: outputSpec,
  stopWhen: [stepCountIs(10)]
))

let result: DefaultGenerateTextResult<MyOutput> = try await analysisAgent.generate(
  prompt: .text("Analyze customer feedback from the last quarter")
)
let output = try result.experimentalOutput
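
MyOutput is not defined on this page, so the following sketch shows one shape it could take and why reading the output is marked with try. The fields and the sample JSON are invented for illustration, and the idea that a mismatched response surfaces as a thrown error from the SDK is an assumption; the snippet only demonstrates the equivalent behavior with Foundation's JSONDecoder.

```swift
import Foundation

// Hypothetical shape for MyOutput; the real fields depend on your use case.
struct MyOutput: Codable, Sendable {
  let sentiment: String
  let topIssues: [String]
}

// Made-up JSON standing in for what a model might return for this schema.
let modelJSON = #"{"sentiment": "mixed", "topIssues": ["shipping delays", "billing errors"]}"#

let decoder = JSONDecoder()

// A response that matches the schema decodes into a typed value.
let parsed = try? decoder.decode(MyOutput.self, from: Data(modelJSON.utf8))

// A response missing a required field fails to decode, so try? yields nil.
let mismatched = try? decoder.decode(
  MyOutput.self,
  from: Data(#"{"sentiment": "mixed"}"#.utf8)
)

print(parsed?.sentiment ?? "decode failed")
print(mismatched == nil)
```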

Define Agent Behavior with System Prompts

System prompts define your agent’s behavior, personality, and constraints. They set the context for all interactions and guide how the agent responds to user queries and uses tools.

Basic System Prompts

Set the agent’s role and expertise:

let agent = Agent<Never, Never>(settings: .init(
  model: .v3(openai("gpt-4o")),
  system: "You are an expert data analyst. You provide clear insights from complex data."
))

Detailed Behavioral Instructions

Provide specific guidelines for agent behavior:

let codeReviewAgent = Agent<Never, Never>(settings: .init(
  model: .v3(openai("gpt-4o")),
  system: """
  You are a senior software engineer conducting code reviews.

  Your approach:
  - Focus on security vulnerabilities first
  - Identify performance bottlenecks
  - Suggest improvements for readability and maintainability
  - Be constructive and educational in your feedback
  - Always explain why something is an issue and how to fix it
  """
))

Constrain Agent Behavior

Set boundaries and ensure consistent behavior:

let customerSupportAgent = Agent<Never, Never>(settings: .init(
  model: .v3(openai("gpt-4o")),
  system: """
  You are a customer support specialist for an e-commerce platform.

  Rules:
  - Never make promises about refunds without checking the policy
  - Always be empathetic and professional
  - If you don't know something, say so and offer to escalate
  - Keep responses concise and actionable
  - Never share internal company information
  """,
  tools: [
    "checkOrderStatus": checkOrderStatus,
    "lookupPolicy": lookupPolicy,
    "createTicket": createTicket
  ]
))

Tool Usage Instructions

Guide how the agent should use available tools:

let researchAgent = Agent<Never, Never>(settings: .init(
  model: .v3(openai("gpt-4o")),
  system: """
  You are a research assistant with access to search and document tools.

  When researching:
  1. Always start with a broad search to understand the topic
  2. Use document analysis for detailed information
  3. Cross-reference multiple sources before drawing conclusions
  4. Cite your sources when presenting information
  5. If information conflicts, present both viewpoints
  """,
  tools: [
    "webSearch": webSearch,
    "analyzeDocument": analyzeDocument,
    "extractQuotes": extractQuotes
  ]
))

Format and Style Instructions

Control the output format and communication style:

let technicalWriterAgent = Agent<Never, Never>(settings: .init(
  model: .v3(openai("gpt-4o")),
  system: """
  You are a technical documentation writer.

  Writing style:
  - Use clear, simple language
  - Avoid jargon unless necessary
  - Structure information with headers and bullet points
  - Include code examples where relevant
  - Write in second person ("you" instead of "the user")

  Always format responses in Markdown.
  """
))

Using an Agent

Once defined, you can use your agent in three ways:

Generate Text

Use generate() for one-time text generation:

let result = try await myAgent.generate(prompt: .text("What is the weather like?"))
print(result.text)

Stream Text

Use stream() for streaming responses:

let stream = try myAgent.stream(prompt: .text("Tell me a story"))
for try await chunk in stream.textStream {
  print(chunk, terminator: "")
}
print()
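
The same consumption pattern can be tried without a model by substituting a standard AsyncThrowingStream. In this sketch, makeTextStream is a hypothetical helper that plays the role of stream.textStream; whether the SDK's text stream is implemented with AsyncThrowingStream is an assumption. The loop also accumulates the chunks, which is a common way to keep the full text after streaming it.

```swift
// Hypothetical stand-in for a text stream: yields chunks, then finishes.
func makeTextStream(_ chunks: [String]) -> AsyncThrowingStream<String, Error> {
  return AsyncThrowingStream { continuation in
    for chunk in chunks {
      continuation.yield(chunk)
    }
    continuation.finish()
  }
}

// Consume the stream chunk by chunk while also keeping the full text.
var fullText = ""
for try await chunk in makeTextStream(["Once ", "upon ", "a ", "time."]) {
  print(chunk, terminator: "")
  fullText += chunk
}
print()
```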

Respond to UI Messages (server)

Use respond() to produce a UI message stream response on the server:

// Example concept for a server route handler
let response = try myAgent.respond(messages: uiMessages)
// Pipe to an HTTP response using `pipeUIMessageStreamToResponse` if desired

Note: React hook examples (useChat) are JS‑only and not part of the Swift AI SDK.

Next Steps

Now that you understand building agents, you can:

Explore workflow patterns for structuring agents with core functions
Learn about loop control for advanced execution control
See manual loop examples for custom workflow implementations