Subagents
A subagent is an agent that a parent agent can invoke. The parent delegates work via a tool, and the subagent executes autonomously before returning a result.
How It Works
- Define a subagent with its own model, instructions, and tools
- Create a tool that calls it for the main agent to use
- Subagent runs independently with its own context window
- Return a result (optionally streaming progress to the UI)
- Control what the model sees using
toModelOutputto summarize
When to Use Subagents
Subagents add latency and complexity. Use them when the benefits outweigh the costs:
| Use Subagents When | Avoid Subagents When |
|---|---|
| Tasks require exploring large amounts of tokens | Tasks are simple and focused |
| You need to parallelize independent research | Sequential processing suffices |
| Context would grow beyond model limits | Context stays manageable |
| You want to isolate tool access by capability | All tools can safely coexist |
Why Use Subagents?
Offloading Context-Heavy Tasks
Some tasks require exploring large amounts of information—reading files, searching codebases, or researching topics. Running these in the main agent consumes context quickly, making the agent less coherent over time.
With subagents, you can:
- Spin up a dedicated agent that uses hundreds of thousands of tokens
- Have it return only a focused summary (perhaps 1,000 tokens)
- Keep your main agent's context clean and coherent
The subagent does the heavy lifting while the main agent stays focused on orchestration.
Parallelizing Independent Work
For tasks like exploring a codebase, you can spawn multiple subagents to research different areas simultaneously. Each returns a summary, and the main agent synthesizes the findings—without paying the context cost of all that exploration.
Specialized Orchestration
A less common but valid pattern is using a main agent purely for orchestration, delegating to specialized subagents for different types of work. For example:
- An exploration subagent with read-only tools for researching codebases
- A coding subagent with file editing tools
- An integration subagent with tools for a specific platform or API
This creates a clear separation of concerns, though context offloading and parallelization are the more common motivations for subagents.
Basic Subagent Without Streaming
The simplest subagent pattern requires no special machinery. Your main agent has a tool that calls another agent in its execute function:
import { ToolLoopAgent, tool } from 'ai';import { z } from 'zod';
// Define a subagent for research tasksconst researchSubagent = new ToolLoopAgent({ model: "anthropic/claude-sonnet-4.5", instructions: `You are a research agent.Summarize your findings in your final response.`, tools: { read: readFileTool, // defined elsewhere search: searchTool, // defined elsewhere },});
// Create a tool that delegates to the subagentconst researchTool = tool({ description: 'Research a topic or question in depth.', inputSchema: z.object({ task: z.string().describe('The research task to complete'), }), execute: async ({ task }, { abortSignal }) => { const result = await researchSubagent.generate({ prompt: task, abortSignal, }); return result.text; },});
// Main agent uses the research toolconst mainAgent = new ToolLoopAgent({ model: "anthropic/claude-sonnet-4.5", instructions: 'You are a helpful assistant that can delegate research tasks.', tools: { research: researchTool, },});This works well when you don't need to show the subagent's progress in the UI. The tool call blocks until the subagent completes, then returns the final text response.
Handling Cancellation
When the user cancels a request, the abortSignal propagates to the subagent. Always pass it through to ensure cleanup:
execute: async ({ task }, { abortSignal }) => { const result = await researchSubagent.generate({ prompt: task, abortSignal, // Cancels subagent if main request is aborted }); return result.text;},If you abort the signal, the subagent stops executing and throws an AbortError. The main agent's tool execution fails, which stops the main loop.
To avoid errors about incomplete tool calls in subsequent messages, use convertToModelMessages with ignoreIncompleteToolCalls:
import { convertToModelMessages } from 'ai';
const modelMessages = await convertToModelMessages(messages, { ignoreIncompleteToolCalls: true,});This filters out tool calls that don't have corresponding results. Learn more in the convertToModelMessages reference.
Streaming Subagent Progress
When you want to show incremental progress as the subagent works, use preliminary tool results. This pattern uses a generator function that yields partial updates to the UI.
How Preliminary Tool Results Work
Change your execute function from a regular function to an async generator (async function*). Each yield sends a preliminary result to the frontend:
execute: async function* ({ /* input */ }) { // ... do work ... yield partialResult; // ... do more work ... yield updatedResult;}Building the Complete Message
Each yield replaces the previous output entirely (it does not append). This means you need a way to accumulate the subagent's response into a complete message that grows over time.
The readUIMessageStream utility handles this. It reads each chunk from the stream and builds an ever-growing UIMessage containing all parts received so far:
import { readUIMessageStream, tool } from 'ai';import { z } from 'zod';
const researchTool = tool({ description: 'Research a topic or question in depth.', inputSchema: z.object({ task: z.string().describe('The research task to complete'), }), execute: async function* ({ task }, { abortSignal }) { // Start the subagent with streaming const result = await researchSubagent.stream({ prompt: task, abortSignal, });
// Each iteration yields a complete, accumulated UIMessage for await (const message of readUIMessageStream({ stream: result.toUIMessageStream(), })) { yield message; } },});Each yielded message is a complete UIMessage containing all the subagent's parts up to that point (text, tool calls, and tool results). The frontend simply replaces its display with each new message.
Controlling What the Model Sees
Here's where subagents become powerful for context management. The full UIMessage with all the subagent's work is stored in the message history and displayed in the UI. But you can control what the main agent's model actually sees using toModelOutput.
How It Works
The toModelOutput function maps the tool's output to the tokens sent to the model:
const researchTool = tool({ description: 'Research a topic or question in depth.', inputSchema: z.object({ task: z.string().describe('The research task to complete'), }), execute: async function* ({ task }, { abortSignal }) { const result = await researchSubagent.stream({ prompt: task, abortSignal, });
for await (const message of readUIMessageStream({ stream: result.toUIMessageStream(), })) { yield message; } }, toModelOutput: ({ output: message }) => { // Extract just the final text as a summary const lastTextPart = message?.parts.findLast(p => p.type === 'text'); return { type: 'text', value: lastTextPart?.text ?? 'Task completed.', }; },});With this setup:
- Users see: The full subagent execution—every tool call, every intermediate step
- The model sees: Just the final summary text
The subagent might use 100,000 tokens exploring and reasoning, but the main agent only consumes the summary. This keeps the main agent coherent and focused.
Write Subagent Instructions for Summarization
For toModelOutput to extract a useful summary, your subagent must produce one. Add explicit instructions like this:
const researchSubagent = new ToolLoopAgent({ model: "anthropic/claude-sonnet-4.5", instructions: `You are a research agent. Complete the task autonomously.
IMPORTANT: When you have finished, write a clear summary of your findings as your final response.This summary will be returned to the main agent, so include all relevant information.`, tools: { read: readFileTool, search: searchTool, },});Without this instruction, the subagent might not produce a comprehensive summary. It could simply say "Done", leaving toModelOutput with nothing useful to extract.
Rendering Subagents in the UI (with useChat)
To display streaming progress, check the tool part's state and preliminary flag.
Tool Part States
| State | Description |
|---|---|
input-streaming | Tool input being generated |
input-available | Tool ready to execute |
output-available | Tool produced output (check preliminary) |
output-error | Tool execution failed |
Detecting Streaming vs Complete
const hasOutput = part.state === 'output-available';const isStreaming = hasOutput && part.preliminary === true;const isComplete = hasOutput && !part.preliminary;Type Safety for Subagent Output
Export types alongside your agents for use in UI components:
import { ToolLoopAgent, InferAgentUIMessage } from 'ai';
export const mainAgent = new ToolLoopAgent({ // ... configuration with researchTool});
// Export the main agent message type for the chat UIexport type MainAgentMessage = InferAgentUIMessage<typeof mainAgent>;Render Messages and Subagent Output
This example uses the types defined above to render both the main agent's messages and the subagent's streamed output:
'use client';
import { useChat } from '@ai-sdk/react';import type { MainAgentMessage } from '@/lib/agents';
export function Chat() { const { messages } = useChat<MainAgentMessage>();
return ( <div> {messages.map(message => message.parts.map((part, i) => { switch (part.type) { case 'text': return <p key={i}>{part.text}</p>; case 'tool-research': return ( <div> {part.state !== 'input-streaming' && ( <div>Research: {part.input.task}</div> )} {part.state === 'output-available' && ( <div> {part.output.parts.map((nestedPart, i) => { switch (nestedPart.type) { case 'text': return <p key={i}>{nestedPart.text}</p>; default: return null; } })} </div> )} </div> ); default: return null; } }), )} </div> );}Caveats
No Tool Approvals in Subagents
Subagent tools cannot use needsApproval. All tools must execute automatically without user confirmation.
Subagent Context is Isolated
Each subagent invocation starts with a fresh context window. This is one of the key benefits of subagents: they don't inherit the accumulated context from the main agent, which is exactly what allows them to do heavy exploration without bloating the main conversation.
If you need to give a subagent access to the conversation history, the messages are available in the tool's execute function alongside abortSignal:
execute: async ({ task }, { abortSignal, messages }) => { const result = await researchSubagent.generate({ messages: [ ...messages, // The main agent's conversation history { role: 'user', content: task }, // The specific task for this invocation ], abortSignal, }); return result.text;},Use this sparingly since passing full history defeats some of the context isolation benefits.
Streaming Adds Complexity
The basic pattern (no streaming) is simpler to implement and debug. Only add streaming when you need to show real-time progress in the UI.