Agents
Agents are large language models (LLMs) using tools in a loop to accomplish tasks.
Each component plays a distinct role:
- LLMs process input (text) and decide what action to take next
- Tools extend what the model can do beyond text generation (e.g. reading files, calling APIs, writing to databases)
- Loop orchestrates execution through:
  - Context management - Maintaining conversation history and deciding what the model sees (input) at each step
  - Stopping conditions - Determining when the loop (task) is complete
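To make the three roles concrete, here is a minimal sketch of such a loop that runs without any SDK. Everything in it is an assumption made for illustration: `ChatMessage`, `ModelStep`, `stubModel`, and `runLoop` are hypothetical names, and `stubModel` is a hard-coded stand-in for an LLM. It is kept synchronous so the sketch stays self-contained; a real model call would be awaited.

```typescript
// Hypothetical types and helpers for illustration; none come from the AI SDK.
type ChatMessage = { role: 'user' | 'assistant' | 'tool'; content: string };

type ModelStep =
  | { type: 'tool-call'; toolName: string; input: string }
  | { type: 'text'; text: string };

type Tools = { [toolName: string]: ((input: string) => string) | undefined };

// Stand-in for an LLM: requests a tool once, then answers from its result.
function stubModel(messages: ChatMessage[]): ModelStep {
  const last = messages[messages.length - 1];
  if (last.role === 'tool') {
    return { type: 'text', text: 'The file says: ' + last.content };
  }
  return { type: 'tool-call', toolName: 'readFile', input: 'notes.txt' };
}

function runLoop(
  prompt: string,
  tools: Tools,
  maxSteps: number,
): { text: string; steps: number; messages: ChatMessage[] } {
  // Context management: this history is the model's input at every step.
  const messages: ChatMessage[] = [{ role: 'user', content: prompt }];

  for (let step = 1; step <= maxSteps; step++) {
    const decision = stubModel(messages);

    // Stopping condition 1: the model produced a final text answer.
    if (decision.type === 'text') {
      messages.push({ role: 'assistant', content: decision.text });
      return { text: decision.text, steps: step, messages: messages };
    }

    // Otherwise run the requested tool and feed its output back as context.
    const toolFn = tools[decision.toolName];
    const output = toolFn
      ? toolFn(decision.input)
      : 'unknown tool: ' + decision.toolName;
    messages.push({
      role: 'assistant',
      content: 'call ' + decision.toolName + '(' + decision.input + ')',
    });
    messages.push({ role: 'tool', content: output });
  }

  // Stopping condition 2: the step budget ran out before a final answer.
  return { text: '', steps: maxSteps, messages: messages };
}

const tools: Tools = {
  readFile: (path: string) => 'contents of ' + path,
};

const outcome = runLoop('Summarize notes.txt', tools, 5);
console.log(outcome.text); // The file says: contents of notes.txt
```

With a budget of 5 steps the stub finishes in 2: one tool call, then one text answer. With a budget of 1, the loop stops on the step limit and returns an empty answer.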
Building Blocks
You combine these fundamental components to create increasingly sophisticated systems:
Single-Step Generation
One call to an LLM to get a response. Use this for straightforward tasks like classification or text generation.
import { generateText } from 'ai';
const result = await generateText({
  model: 'openai/gpt-4o',
  prompt: 'Classify this sentiment: "I love this product!"',
});
Tool Usage
Enhance LLM capabilities through tools that provide access to external systems. Tools can read data to augment context (like fetching files or querying databases) or write data to take actions (like sending emails or updating records).
import { generateText, tool } from 'ai';
import { z } from 'zod';

const result = await generateText({
  model: 'openai/gpt-4o',
  prompt: 'What is the weather in San Francisco?',
  tools: {
    weather: tool({
      description: 'Get the weather in a location',
      // Schema the model must follow when calling this tool
      inputSchema: z.object({
        location: z.string().describe('The location to get the weather for'),
      }),
      // Stubbed implementation: returns a random temperature from 62 to 82
      execute: async ({ location }) => ({
        location,
        temperature: 72 + Math.floor(Math.random() * 21) - 10,
      }),
    }),
  },
});
console.log(result.toolResults);
Multi-Step Tool Usage (Agents)
For complex problems, an LLM can make multiple tool calls across multiple steps. The model decides the order and number of tool calls based on the task.
import { generateText, stepCountIs, tool } from 'ai';
import { z } from 'zod';

const result = await generateText({
  model: 'openai/gpt-4o',
  prompt: 'What is the weather in San Francisco in Celsius?',
  tools: {
    weather: tool({
      description: 'Get the weather in a location (in Fahrenheit)',
      inputSchema: z.object({
        location: z.string().describe('The location to get the weather for'),
      }),
      // Stubbed implementation: returns a random temperature from 62 to 82
      execute: async ({ location }) => ({
        location,
        temperature: 72 + Math.floor(Math.random() * 21) - 10,
      }),
    }),
    convertFahrenheitToCelsius: tool({
      description: 'Convert temperature from Fahrenheit to Celsius',
      inputSchema: z.object({
        temperature: z.number().describe('Temperature in Fahrenheit'),
      }),
      execute: async ({ temperature }) => {
        const celsius = Math.round((temperature - 32) * (5 / 9));
        return { celsius };
      },
    }),
  },
  stopWhen: stepCountIs(10), // Stop after maximum 10 steps
});
console.log(result.text); // Output: The weather in San Francisco is currently _°C.
The LLM might:
- Call the weather tool to get the temperature in Fahrenheit
- Call the convertFahrenheitToCelsius tool to convert it
- Generate a text response with the converted temperature
This behavior is flexible - the LLM determines the approach based on its understanding of the task.
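Whichever order the model picks, the numbers it can report are bounded by the two tools. As a worked check, the sketch below reuses the arithmetic from the example tools outside the SDK. The function name `fahrenheitToCelsius` is introduced here for illustration; the formulas are copied from the tool definitions above.

```typescript
// Same conversion as the convertFahrenheitToCelsius tool's execute function.
function fahrenheitToCelsius(temperature: number): number {
  return Math.round((temperature - 32) * (5 / 9));
}

// The weather tool returns 72 + Math.floor(Math.random() * 21) - 10.
// Math.floor(Math.random() * 21) ranges from 0 to 20, so:
const lowestF = 72 + 0 - 10; // 62
const highestF = 72 + 20 - 10; // 82

const lowestC = fahrenheitToCelsius(lowestF);
const highestC = fahrenheitToCelsius(highestF);
console.log(lowestC, highestC); // 17 28
```

So a correct final answer for this example always falls between 17°C and 28°C, which gives a quick sanity check on the generated text.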
Implementation Approaches
The AI SDK provides two approaches to build agents:
Agent Class
Object-oriented abstraction that handles the loop for you. Best when you want to:
- Reuse agent configurations
- Minimize boilerplate code
- Build consistent agent behaviors
import { Experimental_Agent as Agent, stepCountIs } from 'ai';

const myAgent = new Agent({
  model: 'openai/gpt-4o',
  tools: {
    // your tools here
  },
  stopWhen: stepCountIs(10), // Continue for up to 10 steps
});

const result = await myAgent.generate({
  prompt: 'Analyze the latest sales data and create a summary report',
});
console.log(result.text);
Learn more about the Agent class.
Core Functions
Use generateText or streamText with tools. Choose between:
Built-in Loop - Let the SDK manage the execution cycle:
import { generateText, stepCountIs } from 'ai';
const result = await generateText({
  model: 'openai/gpt-4o',
  prompt: 'Research machine learning trends and provide key insights',
  tools: {
    // your tools here
  },
  stopWhen: stepCountIs(10),
  prepareStep: ({ stepNumber }) => {
    // Modify settings between steps
  },
  onStepFinish: step => {
    // Monitor or save progress
  },
});
Learn more about loop control.
Manual Loop - Full control over execution:
import { generateText, ModelMessage } from 'ai';
const messages: ModelMessage[] = [{ role: 'user', content: '...' }];
let step = 0;
const maxSteps = 10;

while (step < maxSteps) {
  const result = await generateText({
    model: 'openai/gpt-4o',
    messages,
    tools: {
      // your tools here
    },
  });

  // Append this step's messages so the next call sees them
  messages.push(...result.response.messages);

  if (result.text) {
    break; // Stop when model generates text
  }

  step++;
}
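One reason to run the loop yourself is to decide what the model sees at each step. The sketch below shows one possible policy: keep the original task plus the most recent messages. `LoopMessage` and `trimHistory` are hypothetical names, not part of the AI SDK, and the message shape is simplified to plain strings.

```typescript
// Hypothetical, simplified message shape for this sketch only.
type LoopMessage = {
  role: 'system' | 'user' | 'assistant' | 'tool';
  content: string;
};

// Keep the first message (the task) plus the last `keepLast` messages.
function trimHistory(messages: LoopMessage[], keepLast: number): LoopMessage[] {
  if (messages.length <= keepLast + 1) {
    return messages.slice(); // Nothing to drop; return a copy
  }
  const recent = messages.slice(messages.length - keepLast);
  return [messages[0]].concat(recent);
}

const pastMessages: LoopMessage[] = [
  { role: 'user', content: 'What is the weather in San Francisco in Celsius?' },
  { role: 'assistant', content: 'call weather' },
  { role: 'tool', content: '{"temperature":70}' },
  { role: 'assistant', content: 'call convertFahrenheitToCelsius' },
  { role: 'tool', content: '{"celsius":21}' },
];

const trimmed = trimHistory(pastMessages, 2);
console.log(trimmed.map((m) => m.role).join(', ')); // user, assistant, tool
```

In a manual loop, a helper like this would be applied to the history before each model call. A real policy would also need to keep every tool result together with the call that produced it, which this sketch does not check.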
When You Need More Control
Agents are powerful but non-deterministic. When you need reliable, repeatable outcomes, combine tool calling with standard programming patterns:
- Conditional statements for explicit branching
- Functions for reusable logic
- Error handling for robustness
- Explicit control flow for predictability
This approach gives you the benefits of AI while maintaining control over critical paths.
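As a rough sketch of this pattern, the example below routes a support ticket with ordinary code and uses a model only for one narrow decision. All names are hypothetical, and `classifyStub` is a keyword check standing in for a single model call so the sketch runs without an SDK.

```typescript
type Category = 'refund' | 'question';

// Hypothetical stand-in for one model call that classifies a message.
function classifyStub(message: string): Category {
  return message.toLowerCase().indexOf('refund') >= 0 ? 'refund' : 'question';
}

// Reusable function with error handling: retry a step a fixed number of times.
function withRetry<T>(attempts: number, run: (attempt: number) => T): T {
  let lastError: unknown = new Error('withRetry needs at least one attempt');
  for (let attempt = 1; attempt <= attempts; attempt++) {
    try {
      return run(attempt);
    } catch (error) {
      lastError = error; // Remember the failure and try again
    }
  }
  throw lastError;
}

// Explicit control flow: code, not the model, decides the critical path.
function handleTicket(message: string, amount: number): string {
  const category = withRetry(3, () => classifyStub(message));
  if (category === 'refund') {
    // Conditional statement guarding a sensitive action
    if (amount > 100) {
      return 'escalate-to-human';
    }
    return 'auto-refund';
  }
  return 'answer-with-llm';
}

console.log(handleTicket('I want a refund', 50)); // auto-refund
console.log(handleTicket('Refund please', 500)); // escalate-to-human
console.log(handleTicket('How do I log in?', 0)); // answer-with-llm
```

The model's output only selects a branch here. The refund limit and the escalation rule live in code, so they behave the same way on every run.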
Next Steps
- Agent Class - Build reusable agents with the object-oriented API
- Loop Control - Control agent execution with stopWhen and prepareStep
- Workflow Patterns - Build reliable multi-agent systems
- Manual Loop Example - See a complete example of custom loop management