In this recipe, you'll learn how to build an AI agent that can interact with a knowledge base using Upstash Search. The agent will be able to both retrieve information from the knowledge base and add new resources to it, leveraging AI SDK tools.
Upstash Search offers input enrichment, reranking, semantic search, and full-text search for highly accurate results. It also provides a built-in embedding service, eliminating the need for a separate embedding provider. This makes it convenient for building and managing simple knowledge bases.
This example uses the following essay as input data (essay.txt).
For a more in-depth guide, check out the RAG Agent Guide, which shows you how to build a RAG Agent with Next.js, Drizzle ORM, and Postgres.
Getting Started
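Create an Upstash Search database on Upstash Console. Once created, you will get a REST URL and a token. Set these in your environment variables, for example in a .env file, which both scripts load through dotenv/config. The OpenAI provider used by the agent also reads its API key from OPENAI_API_KEY:
UPSTASH_SEARCH_REST_URL="***"
UPSTASH_SEARCH_REST_TOKEN="***"
OPENAI_API_KEY="***"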
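The scripts in this recipe read these variables with a non-null assertion (`!`), which satisfies the type checker but only fails later, with a less obvious error, if a variable is missing. A minimal sketch of a fail-fast check follows. The helper name `requireEnv` is hypothetical and not part of Upstash Search or the AI SDK, and the example uses an in-memory object in place of `process.env` so it runs without any configuration:

```typescript
// Hypothetical helper (not part of any SDK): returns the value of an
// environment variable, or throws a descriptive error if it is unset or blank.
type EnvLike = Record<string, string | undefined>;

function requireEnv(name: string, env: EnvLike): string {
  const value = env[name];
  if (value === undefined || value.trim() === '') {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// In-memory stand-in for process.env; the URL is a placeholder, not a real endpoint.
const exampleEnv: EnvLike = {
  UPSTASH_SEARCH_REST_URL: 'https://example.invalid',
  UPSTASH_SEARCH_REST_TOKEN: '',
};

// Present and non-empty: the value is returned unchanged.
const searchUrl = requireEnv('UPSTASH_SEARCH_REST_URL', exampleEnv);

// Empty string: the helper throws, and the message names the missing variable.
let tokenError = '';
try {
  requireEnv('UPSTASH_SEARCH_REST_TOKEN', exampleEnv);
} catch (error) {
  tokenError = error instanceof Error ? error.message : String(error);
}
```

In a real script you would pass `process.env` as the second argument and use the returned strings when constructing the `Search` client.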
Project Setup
Create a new empty directory for your project and initialize pnpm:
mkdir knowledge-base-agent
cd knowledge-base-agent
pnpm init
Install the AI SDK, the OpenAI provider, Upstash Search, and dotenv (both scripts import dotenv/config), plus tsx as a dev dependency:
pnpm i ai zod @ai-sdk/openai @upstash/search dotenv
pnpm i -D tsx
Finally, download and save the input essay:
curl -o essay.txt https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/paul_graham/paul_graham_essay.txt
Setting Up the Knowledge Base
Next, let's set up the initial knowledge base by reading a file and uploading its content to Upstash Search. Create a script called setup.ts:
import fs from 'fs';
import path from 'path';
import 'dotenv/config';
import { Search } from '@upstash/search';

type KnowledgeContent = {
  text: string;
  section: string;
  title?: string;
};

// Initialize Upstash Search client
const search = new Search({
  url: process.env.UPSTASH_SEARCH_REST_URL!,
  token: process.env.UPSTASH_SEARCH_REST_TOKEN!,
});

const index = search.index<KnowledgeContent>('knowledge-base');

async function setupKnowledgeBase() {
  // Read and process the source file
  const content = fs.readFileSync(path.join(__dirname, 'essay.txt'), 'utf8');

  // Split content into meaningful chunks
  const chunks = content
    .split(/\n\s*\n/) // Split by double line breaks (paragraphs)
    .map(chunk => chunk.trim())
    .filter(chunk => chunk.length > 50); // Only keep substantial chunks

  // Upload chunks to Upstash Search in batches of 100
  const batchSize = 100;
  for (let i = 0; i < chunks.length; i += batchSize) {
    const batch = chunks.slice(i, i + batchSize).map((chunk, j) => ({
      id: `chunk-${i + j}`,
      content: {
        text: chunk,
        section: `section-${Math.floor((i + j) / 10)}`,
        title: chunk.split('\n')[0] || `Chunk ${i + j + 1}`,
      },
    }));
    await index.upsert(batch);
    console.log(
      `Upserted ${Math.min(i + batch.length, chunks.length)} chunks out of ${chunks.length} chunks`,
    );
  }
}

// Run setup
setupKnowledgeBase().catch(console.error);
Run the setup script to populate your knowledge base:
pnpm tsx setup.ts
Navigate to the Upstash Console and check the data browser of your Search database. You should see the essay has been indexed.
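To see what the chunking and batching steps in setup.ts do without calling Upstash, the same logic can be pulled out into pure functions and run on a small string. This is only a sketch: the names `splitIntoChunks` and `toBatches` and the sample text are made up for illustration, while the regular expression and the 50-character threshold are taken from the script above.

```typescript
// Mirrors the paragraph-splitting step in setup.ts as a pure function.
function splitIntoChunks(content: string, minLength = 50): string[] {
  return content
    .split(/\n\s*\n/) // a blank (or whitespace-only) line separates paragraphs
    .map(chunk => chunk.trim())
    .filter(chunk => chunk.length > minLength); // drop headings and fragments
}

// Groups items into consecutive batches of at most `size` elements.
function toBatches<T>(items: T[], size: number): T[][] {
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Made-up sample text: one short heading and two longer paragraphs,
// with a whitespace-only line between the paragraphs.
const sampleEssay = [
  'Short heading',
  '',
  'A first paragraph that is comfortably longer than fifty characters in total.',
  '',
  '   ',
  'A second paragraph, also long enough to pass the minimum length filter here.',
].join('\n');

// The heading is filtered out, leaving the two paragraphs.
const sampleChunks = splitIntoChunks(sampleEssay);

// Five items in batches of two: [[1, 2], [3, 4], [5]].
const sampleBatches = toBatches([1, 2, 3, 4, 5], 2);
```

Note that short paragraphs are discarded entirely rather than merged with their neighbours, so a lower `minLength` may suit documents with many brief sections.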
Building the Knowledge Base Agent
Now let's create an agent that can interact with this knowledge base. Create a new file called agent.ts:
import { openai } from '@ai-sdk/openai';
import { tool, stepCountIs, generateText, generateId } from 'ai';
import { z } from 'zod';
import { Search } from '@upstash/search';

import 'dotenv/config';

const search = new Search({
  url: process.env.UPSTASH_SEARCH_REST_URL!,
  token: process.env.UPSTASH_SEARCH_REST_TOKEN!,
});

type KnowledgeContent = {
  text: string;
  section: string;
  title?: string;
};

const index = search.index<KnowledgeContent>('knowledge-base');

async function main(prompt: string) {
  const { text } = await generateText({
    model: openai('gpt-4o'),
    prompt,
    stopWhen: stepCountIs(5),
    tools: {
      addResource: tool({
        description:
          'Add a new resource or piece of information to the knowledge base',
        inputSchema: z.object({
          resource: z
            .string()
            .describe('The content or resource to add to the knowledge base'),
          title: z
            .string()
            .optional()
            .describe('Optional title for the resource'),
        }),
        execute: async ({ resource, title }) => {
          const id = generateId();
          await index.upsert({
            id,
            content: {
              text: resource,
              section: 'user-added',
              title: title || `Resource ${id.slice(0, 8)}`,
            },
          });
          return `Successfully added resource "${title || 'Untitled'}" to knowledge base with ID: ${id}`;
        },
      }),
      searchKnowledge: tool({
        description:
          'Search the knowledge base to find relevant information for answering questions',
        inputSchema: z.object({
          query: z
            .string()
            .describe('The search query to find relevant information'),
          limit: z
            .number()
            .optional()
            .describe('Maximum number of results to return (default: 3)'),
        }),
        execute: async ({ query, limit = 3 }) => {
          const results = await index.search({
            query,
            limit,
            reranking: true,
          });

          if (results.length === 0) {
            return 'No relevant information found in the knowledge base.';
          }

          return results.map((hit, i) => ({
            resourceId: hit.id,
            rank: i + 1,
            title: hit.content.title || 'Untitled',
            content: hit.content.text || '',
            section: hit.content.section || 'unknown',
            score: hit.score,
          }));
        },
      }),
      deleteResource: tool({
        description: 'Delete a resource from the knowledge base',
        inputSchema: z.object({
          resourceId: z.string().describe('The ID of the resource to delete'),
        }),
        execute: async ({ resourceId }) => {
          try {
            await index.delete({ ids: [resourceId] });
            return `Successfully deleted resource with ID: ${resourceId}`;
          } catch (error) {
            return `Failed to delete resource: ${error instanceof Error ? error.message : 'Unknown error'}`;
          }
        },
      }),
    },
    // log out intermediate steps
    onStepFinish: ({ toolResults }) => {
      if (toolResults.length > 0) {
        console.log('Tool results:');
        console.dir(toolResults, { depth: null });
      }
    },
  });

  return text;
}

const question =
  'What are the two main things I worked on before college? (utilize knowledge base)';

main(question).then(console.log).catch(console.error);
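The searchKnowledge tool reshapes each search hit into a ranked record before handing it back to the model. Moving that mapping into a standalone function makes it possible to check the fallback values without calling Upstash or OpenAI. The sketch below assumes a simplified `Hit` type containing only the fields that agent.ts reads (`id`, `score`, `content`); the function name `formatHits` and the sample hits are invented for illustration.

```typescript
type KnowledgeContent = { text: string; section: string; title?: string };

// Assumed hit shape: only the fields that agent.ts reads from each result.
// Content fields are treated as possibly missing, matching the `||` fallbacks.
type Hit = { id: string; score: number; content: Partial<KnowledgeContent> };

// Same mapping as in the searchKnowledge tool, extracted as a pure function.
function formatHits(hits: Hit[]) {
  return hits.map((hit, i) => ({
    resourceId: hit.id,
    rank: i + 1, // 1-based position in the result list
    title: hit.content.title || 'Untitled',
    content: hit.content.text || '',
    section: hit.content.section || 'unknown',
    score: hit.score,
  }));
}

// Invented sample hits: the second one lacks a title and a section.
const formattedHits = formatHits([
  {
    id: 'chunk-0',
    score: 0.92,
    content: { text: 'First sample passage', section: 'section-0', title: 'Sample title' },
  },
  { id: 'abc123', score: 0.41, content: { text: 'Second sample passage' } },
]);
```

Here `formattedHits[1]` falls back to the title `'Untitled'` and the section `'unknown'`, which is what the model would see for a resource stored without those fields.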
Running the Agent
Now let's run the agent:
pnpm tsx agent.ts
The agent will utilize the knowledge base to answer questions, add new resources, and delete existing ones as needed. You can modify the question variable to test different queries and interactions with the knowledge base.
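Instead of editing the file for every query, the question could also come from the command line. The following is a rough sketch, not part of the original recipe: `questionFromArgs` is a hypothetical helper, and it assumes the usual Node.js layout where user-supplied arguments start at index 2 of the argument array.

```typescript
// Hypothetical helper: joins the user-supplied CLI arguments into one question,
// or returns the fallback when no arguments were given.
function questionFromArgs(argv: string[], fallback: string): string {
  const joined = argv.slice(2).join(' ').trim();
  return joined.length > 0 ? joined : fallback;
}

const defaultQuestion =
  'What are the two main things I worked on before college? (utilize knowledge base)';

// Simulated argument arrays (in a real script this would be process.argv).
const fromCli = questionFromArgs(
  ['node', 'agent.ts', 'Add', 'a', 'note', 'about', 'Lisp'],
  defaultQuestion,
);
const fromDefault = questionFromArgs(['node', 'agent.ts'], defaultQuestion);
```

With this in place, agent.ts could call `main(questionFromArgs(process.argv, question))` and be invoked as `pnpm tsx agent.ts "your question here"`, assuming tsx forwards the trailing arguments to the script.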