Subagents

A subagent is an agent that a parent agent can invoke. The parent delegates work via a tool, and the subagent executes autonomously before returning a result.

How It Works

  1. Define a subagent with its own model, instructions, and tools
  2. Create a tool that calls it for the main agent to use
  3. Subagent runs independently with its own context window
  4. Return a result (optionally streaming progress to the UI)
  5. Control what the model sees using toModelOutput to summarize

When to Use Subagents

Subagents add latency and complexity. Use them when the benefits outweigh the costs:

| Use Subagents When | Avoid Subagents When |
| --- | --- |
| Tasks require exploring large amounts of information | Tasks are simple and focused |
| You need to parallelize independent research | Sequential processing suffices |
| Context would grow beyond model limits | Context stays manageable |
| You want to isolate tool access by capability | All tools can safely coexist |

Why Use Subagents?

Offloading Context-Heavy Tasks

Some tasks require exploring large amounts of information—reading files, searching codebases, or researching topics. Running these in the main agent consumes context quickly, making the agent less coherent over time.

With subagents, you can:

  • Spin up a dedicated agent that uses hundreds of thousands of tokens
  • Have it return only a focused summary (perhaps 1,000 tokens)
  • Keep your main agent's context clean and coherent

The subagent does the heavy lifting while the main agent stays focused on orchestration.

Parallelizing Independent Work

For tasks like exploring a codebase, you can spawn multiple subagents to research different areas simultaneously. Each returns a summary, and the main agent synthesizes the findings—without paying the context cost of all that exploration.
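A minimal sketch of this pattern, using the researchSubagent and tool helper shown later in this guide: each area gets its own subagent call, and the calls run concurrently with Promise.all. The parallelResearchTool name and input shape here are illustrative, not part of the SDK.

import { tool } from 'ai';
import { z } from 'zod';

// A sketch: fan out one subagent call per area and combine the summaries.
// Assumes a researchSubagent like the one defined in the basic example below.
const parallelResearchTool = tool({
  description: 'Research several independent areas at once.',
  inputSchema: z.object({
    areas: z.array(z.string()).describe('Independent areas to research in parallel'),
  }),
  execute: async ({ areas }, { abortSignal }) => {
    // Each subagent runs in its own context window; only the summaries come back.
    const results = await Promise.all(
      areas.map(area =>
        researchSubagent.generate({ prompt: area, abortSignal }),
      ),
    );
    return results.map((r, i) => `## ${areas[i]}\n${r.text}`).join('\n\n');
  },
});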

Specialized Orchestration

A less common but valid pattern is using a main agent purely for orchestration, delegating to specialized subagents for different types of work. For example:

  • An exploration subagent with read-only tools for researching codebases
  • A coding subagent with file editing tools
  • An integration subagent with tools for a specific platform or API

This creates a clear separation of concerns, though context offloading and parallelization are the more common motivations for subagents.
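As a rough sketch, such an orchestrator exposes one delegation tool per subagent. The subagents and wrapping tools named here (explorationSubagent, exploreTool, codingSubagent, codeTool) are hypothetical and assumed to be defined elsewhere:

import { ToolLoopAgent } from 'ai';

// A sketch of an orchestration-only main agent. It never edits files or
// reads code itself; it only decides which specialist to delegate to.
const orchestratorAgent = new ToolLoopAgent({
  model: 'anthropic/claude-sonnet-4.5',
  instructions:
    'You coordinate work by delegating: use explore for research and code for edits. Do not do the work yourself.',
  tools: {
    explore: exploreTool, // wraps explorationSubagent (read-only tools)
    code: codeTool, // wraps codingSubagent (file editing tools)
  },
});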

Basic Subagent Without Streaming

The simplest subagent pattern requires no special machinery. Your main agent has a tool that calls another agent in its execute function:

import { ToolLoopAgent, tool } from 'ai';
import { z } from 'zod';

// Define a subagent for research tasks
const researchSubagent = new ToolLoopAgent({
  model: 'anthropic/claude-sonnet-4.5',
  instructions: `You are a research agent.
Summarize your findings in your final response.`,
  tools: {
    read: readFileTool, // defined elsewhere
    search: searchTool, // defined elsewhere
  },
});

// Create a tool that delegates to the subagent
const researchTool = tool({
  description: 'Research a topic or question in depth.',
  inputSchema: z.object({
    task: z.string().describe('The research task to complete'),
  }),
  execute: async ({ task }, { abortSignal }) => {
    const result = await researchSubagent.generate({
      prompt: task,
      abortSignal,
    });
    return result.text;
  },
});

// Main agent uses the research tool
const mainAgent = new ToolLoopAgent({
  model: 'anthropic/claude-sonnet-4.5',
  instructions: 'You are a helpful assistant that can delegate research tasks.',
  tools: {
    research: researchTool,
  },
});

This works well when you don't need to show the subagent's progress in the UI. The tool call blocks until the subagent completes, then returns the final text response.

Handling Cancellation

When the user cancels a request, the abortSignal propagates to the subagent. Always pass it through to ensure cleanup:

execute: async ({ task }, { abortSignal }) => {
  const result = await researchSubagent.generate({
    prompt: task,
    abortSignal, // Cancels subagent if main request is aborted
  });
  return result.text;
},

When the signal is aborted, the subagent stops executing and throws an AbortError. The main agent's tool execution then fails, which stops the main loop.
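As a rough illustration, the sketch below wires an AbortController to the main agent. It assumes mainAgent.generate accepts an abortSignal in the same way the subagent call above does; the prompt and timeout are illustrative only.

// A minimal sketch, assuming mainAgent.generate accepts abortSignal
// the same way researchSubagent.generate does above.
const controller = new AbortController();

// Illustrative only: cancel the whole run after 30 seconds.
const timeout = setTimeout(() => controller.abort(), 30_000);

try {
  const result = await mainAgent.generate({
    prompt: 'Research how refunds are handled in the billing module.',
    abortSignal: controller.signal,
  });
  console.log(result.text);
} catch (error) {
  // The AbortError from the cancelled subagent surfaces here.
  console.error('Run was cancelled or failed:', error);
} finally {
  clearTimeout(timeout);
}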

To avoid errors about incomplete tool calls in subsequent messages, use convertToModelMessages with ignoreIncompleteToolCalls:

import { convertToModelMessages } from 'ai';

const modelMessages = await convertToModelMessages(messages, {
  ignoreIncompleteToolCalls: true,
});

This filters out tool calls that don't have corresponding results. Learn more in the convertToModelMessages reference.
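For example, a hypothetical chat route handler might apply this conversion before re-running the main agent on a stored history, possibly one cut short by an abort. The handler shape and the JSON response here are assumptions; mainAgent is the agent exported from lib/agents.ts later in this guide.

import { convertToModelMessages } from 'ai';
import { mainAgent } from '@/lib/agents';

// Hypothetical route handler: convert stored UI messages (possibly from an
// aborted run) before handing them to the main agent.
export async function POST(req: Request) {
  const { messages } = await req.json();

  // Drop tool calls that never received a result.
  const modelMessages = await convertToModelMessages(messages, {
    ignoreIncompleteToolCalls: true,
  });

  const result = await mainAgent.generate({ messages: modelMessages });
  return Response.json({ text: result.text });
}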

Streaming Subagent Progress

When you want to show incremental progress as the subagent works, use preliminary tool results. This pattern uses a generator function that yields partial updates to the UI.

How Preliminary Tool Results Work

Change your execute function from a regular function to an async generator (async function*). Each yield sends a preliminary result to the frontend:

execute: async function* ({ /* input */ }) {
  // ... do work ...
  yield partialResult;
  // ... do more work ...
  yield updatedResult;
}

Building the Complete Message

Each yield replaces the previous output entirely (it does not append). This means you need a way to accumulate the subagent's response into a complete message that grows over time.

The readUIMessageStream utility handles this. It reads each chunk from the stream and builds an ever-growing UIMessage containing all parts received so far:

import { readUIMessageStream, tool } from 'ai';
import { z } from 'zod';

const researchTool = tool({
  description: 'Research a topic or question in depth.',
  inputSchema: z.object({
    task: z.string().describe('The research task to complete'),
  }),
  execute: async function* ({ task }, { abortSignal }) {
    // Start the subagent with streaming
    const result = await researchSubagent.stream({
      prompt: task,
      abortSignal,
    });

    // Each iteration yields a complete, accumulated UIMessage
    for await (const message of readUIMessageStream({
      stream: result.toUIMessageStream(),
    })) {
      yield message;
    }
  },
});

Each yielded message is a complete UIMessage containing all the subagent's parts up to that point (text, tool calls, and tool results). The frontend simply replaces its display with each new message.

Controlling What the Model Sees

Here's where subagents become powerful for context management. The full UIMessage with all the subagent's work is stored in the message history and displayed in the UI. But you can control what the main agent's model actually sees using toModelOutput.

How It Works

The toModelOutput function maps the tool's output to the tokens sent to the model:

const researchTool = tool({
  description: 'Research a topic or question in depth.',
  inputSchema: z.object({
    task: z.string().describe('The research task to complete'),
  }),
  execute: async function* ({ task }, { abortSignal }) {
    const result = await researchSubagent.stream({
      prompt: task,
      abortSignal,
    });

    for await (const message of readUIMessageStream({
      stream: result.toUIMessageStream(),
    })) {
      yield message;
    }
  },
  toModelOutput: ({ output: message }) => {
    // Extract just the final text as a summary
    const lastTextPart = message?.parts.findLast(p => p.type === 'text');
    return {
      type: 'text',
      value: lastTextPart?.text ?? 'Task completed.',
    };
  },
});

With this setup:

  • Users see: The full subagent execution—every tool call, every intermediate step
  • The model sees: Just the final summary text

The subagent might use 100,000 tokens exploring and reasoning, but the main agent only consumes the summary. This keeps the main agent coherent and focused.

Write Subagent Instructions for Summarization

For toModelOutput to extract a useful summary, your subagent must produce one. Add explicit instructions like this:

const researchSubagent = new ToolLoopAgent({
  model: 'anthropic/claude-sonnet-4.5',
  instructions: `You are a research agent. Complete the task autonomously.
IMPORTANT: When you have finished, write a clear summary of your findings as your final response.
This summary will be returned to the main agent, so include all relevant information.`,
  tools: {
    read: readFileTool,
    search: searchTool,
  },
});

Without this instruction, the subagent might not produce a comprehensive summary. It could simply say "Done", leaving toModelOutput with nothing useful to extract.

Rendering Subagents in the UI (with useChat)

To display streaming progress, check the tool part's state and preliminary flag.

Tool Part States

| State | Description |
| --- | --- |
| input-streaming | Tool input being generated |
| input-available | Tool ready to execute |
| output-available | Tool produced output (check preliminary) |
| output-error | Tool execution failed |

Detecting Streaming vs Complete

const hasOutput = part.state === 'output-available';
const isStreaming = hasOutput && part.preliminary === true;
const isComplete = hasOutput && !part.preliminary;

Type Safety for Subagent Output

Export types alongside your agents for use in UI components:

lib/agents.ts
import { ToolLoopAgent, InferAgentUIMessage } from 'ai';

export const mainAgent = new ToolLoopAgent({
  // ... configuration with researchTool
});

// Export the main agent message type for the chat UI
export type MainAgentMessage = InferAgentUIMessage<typeof mainAgent>;

Render Messages and Subagent Output

This example uses the types defined above to render both the main agent's messages and the subagent's streamed output:

'use client';

import { useChat } from '@ai-sdk/react';
import type { MainAgentMessage } from '@/lib/agents';

export function Chat() {
  const { messages } = useChat<MainAgentMessage>();

  return (
    <div>
      {messages.map(message =>
        message.parts.map((part, i) => {
          switch (part.type) {
            case 'text':
              return <p key={i}>{part.text}</p>;
            case 'tool-research':
              return (
                <div key={i}>
                  {part.state !== 'input-streaming' && (
                    <div>Research: {part.input.task}</div>
                  )}
                  {part.state === 'output-available' && (
                    <div>
                      {part.output.parts.map((nestedPart, j) => {
                        switch (nestedPart.type) {
                          case 'text':
                            return <p key={j}>{nestedPart.text}</p>;
                          default:
                            return null;
                        }
                      })}
                    </div>
                  )}
                </div>
              );
            default:
              return null;
          }
        }),
      )}
    </div>
  );
}

Caveats

No Tool Approvals in Subagents

Subagent tools cannot use needsApproval. All tools must execute automatically without user confirmation.

Subagent Context is Isolated

Each subagent invocation starts with a fresh context window. This is one of the key benefits of subagents: they don't inherit the accumulated context from the main agent, which is exactly what allows them to do heavy exploration without bloating the main conversation.

If you need to give a subagent access to the conversation history, the messages are available in the tool's execute function alongside abortSignal:

execute: async ({ task }, { abortSignal, messages }) => {
  const result = await researchSubagent.generate({
    messages: [
      ...messages, // The main agent's conversation history
      { role: 'user', content: task }, // The specific task for this invocation
    ],
    abortSignal,
  });
  return result.text;
},

Use this sparingly, since passing the full history defeats some of the context-isolation benefit.

Streaming Adds Complexity

The basic pattern (no streaming) is simpler to implement and debug. Only add streaming when you need to show real-time progress in the UI.