Baseten Provider

Baseten is an inference platform for serving frontier, enterprise-grade open-source AI models.

Setup

The Baseten provider is available via the @ai-sdk/baseten module. You can install it with

pnpm add @ai-sdk/baseten

Provider Instance

You can import the default provider instance baseten from @ai-sdk/baseten:

import { baseten } from '@ai-sdk/baseten';

If you need a customized setup, you can import createBaseten from @ai-sdk/baseten and create a provider instance with your settings:

import { createBaseten } from '@ai-sdk/baseten';
const baseten = createBaseten({
  apiKey: process.env.BASETEN_API_KEY ?? '',
});

You can use the following optional settings to customize the Baseten provider instance (a combined example follows this list):

  • baseURL string

    Use a different URL prefix for API calls, e.g. to use proxy servers. The default prefix is https://inference.baseten.co/v1.

  • apiKey string

    API key that is sent using the Authorization header. It defaults to the BASETEN_API_KEY environment variable. It is recommended to set the environment variable (for example via export) so you do not need to pass the key every time. You can create a Baseten API key from your Baseten account settings.

  • modelURL string

    Custom model URL for specific models (chat or embeddings). If not provided, the default Models API will be used.

  • headers Record<string,string>

    Custom headers to include in the requests.

  • fetch (input: RequestInfo, init?: RequestInit) => Promise<Response>

    Custom fetch implementation.
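
Putting these together, here is a sketch of a customized instance. The proxy URL and the custom header are illustrative values, not defaults:

import { createBaseten } from '@ai-sdk/baseten';

const baseten = createBaseten({
  // Illustrative proxy URL; the default is https://inference.baseten.co/v1
  baseURL: 'https://my-proxy.example.com/v1',
  apiKey: process.env.BASETEN_API_KEY ?? '',
  // Hypothetical header for request tracing
  headers: { 'x-request-source': 'docs-example' },
});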

Models API

You can select Baseten models using a provider instance. The first argument is the model ID, e.g. 'deepseek-ai/DeepSeek-V3-0324'. The complete list of models supported by the Models API can be found in the Baseten documentation.

const model = baseten('deepseek-ai/DeepSeek-V3-0324');

Example

You can use Baseten language models to generate text with the generateText function:

import { baseten } from '@ai-sdk/baseten';
import { generateText } from 'ai';
const { text } = await generateText({
  model: baseten('deepseek-ai/DeepSeek-V3-0324'),
  prompt: 'What is the meaning of life? Answer in one sentence.',
});

Baseten language models can also be used in the streamText function (see AI SDK Core).
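
For example, a minimal streaming sketch (the prompt is illustrative):

import { baseten } from '@ai-sdk/baseten';
import { streamText } from 'ai';

const result = streamText({
  model: baseten('deepseek-ai/DeepSeek-V3-0324'),
  prompt: 'Write a haiku about inference.',
});

// Print the response incrementally as it streams in
for await (const textPart of result.textStream) {
  process.stdout.write(textPart);
}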

Dedicated Models

Baseten supports dedicated model URLs for both chat and embedding models. You have to specify a modelURL when creating the provider:

OpenAI-Compatible Endpoints (/sync/v1)

For models deployed with Baseten's OpenAI-compatible endpoints:

import { createBaseten } from '@ai-sdk/baseten';
import { generateText } from 'ai';

const baseten = createBaseten({
  modelURL: 'https://model-{MODEL_ID}.api.baseten.co/sync/v1',
});

// No model ID is needed because we specified modelURL
const model = baseten();

const { text } = await generateText({
  model,
  prompt: 'Say hello from the OpenAI-compatible chat model!',
});

/predict Endpoints

/predict endpoints are currently NOT supported for chat models. You must use /sync/v1 endpoints for chat functionality.

Embedding Models

You can create models that call the Baseten embeddings API using the .textEmbeddingModel() factory method. The Baseten provider uses the high-performance @basetenlabs/performance-client for optimal embedding performance.

import { createBaseten } from '@ai-sdk/baseten';
import { embed, embedMany } from 'ai';
const baseten = createBaseten({
  modelURL: 'https://model-{MODEL_ID}.api.baseten.co/sync',
});

const embeddingModel = baseten.textEmbeddingModel();

// Single embedding
const { embedding } = await embed({
  model: embeddingModel,
  value: 'sunny day at the beach',
});

// Batch embeddings
const { embeddings } = await embedMany({
  model: embeddingModel,
  values: [
    'sunny day at the beach',
    'rainy afternoon in the city',
    'snowy mountain peak',
  ],
});
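
The resulting vectors can be compared with the AI SDK's cosineSimilarity helper; a small sketch building on the embedMany call above:

import { cosineSimilarity } from 'ai';

// Compare the first two embeddings from the embedMany call above
console.log('similarity:', cosineSimilarity(embeddings[0], embeddings[1]));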

Endpoint Support for Embeddings

Supported:

  • /sync endpoints (Performance Client automatically adds /v1/embeddings)
  • /sync/v1 endpoints (automatically strips /v1 before passing to Performance Client)

Not Supported:

  • /predict endpoints (not compatible with Performance Client)

Performance Features

The embedding implementation includes:

  • High-performance client: Uses @basetenlabs/performance-client for optimal performance
  • Automatic batching: Efficiently handles multiple texts in a single request
  • Connection reuse: Performance Client is created once and reused for all requests
  • Built-in retries: Automatic retry logic for failed requests

Error Handling

The Baseten provider includes built-in error handling for common API errors:

import { baseten } from '@ai-sdk/baseten';
import { generateText } from 'ai';
try {
  const { text } = await generateText({
    model: baseten('deepseek-ai/DeepSeek-V3-0324'),
    prompt: 'Hello, world!',
  });
} catch (error) {
  console.error('Baseten API error:', error.message);
}
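
If you need to branch on the failure type, the AI SDK exports APICallError, which carries HTTP details. A minimal sketch, assuming the failure is an HTTP-level API error:

import { baseten } from '@ai-sdk/baseten';
import { generateText, APICallError } from 'ai';

try {
  await generateText({
    model: baseten('deepseek-ai/DeepSeek-V3-0324'),
    prompt: 'Hello, world!',
  });
} catch (error) {
  if (APICallError.isInstance(error)) {
    // HTTP-level failure, e.g. an invalid BASETEN_API_KEY or a rate limit
    console.error('Status:', error.statusCode, 'URL:', error.url);
  } else {
    throw error;
  }
}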

Common Error Scenarios

import { baseten, createBaseten } from '@ai-sdk/baseten';

// Embeddings require a modelURL
try {
  baseten.textEmbeddingModel();
} catch (error) {
  // Error: "No model URL provided for embeddings. Please set modelURL option for embeddings."
}

// /predict endpoints are not supported for chat models
try {
  const predictProvider = createBaseten({
    modelURL:
      'https://model-{MODEL_ID}.api.baseten.co/environments/production/predict',
  });
  predictProvider(); // This will throw an error
} catch (error) {
  // Error: "Not supported. You must use a /sync/v1 endpoint for chat models."
}

// /sync/v1 endpoints are supported for embeddings
const syncProvider = createBaseten({
  modelURL:
    'https://model-{MODEL_ID}.api.baseten.co/environments/production/sync/v1',
});
const embeddingModel = syncProvider.textEmbeddingModel(); // This works fine!

// /predict endpoints are not supported for embeddings
try {
  const predictEmbeddings = createBaseten({
    modelURL:
      'https://model-{MODEL_ID}.api.baseten.co/environments/production/predict',
  });
  predictEmbeddings.textEmbeddingModel(); // This will throw an error
} catch (error) {
  // Error: "Not supported. You must use a /sync or /sync/v1 endpoint for embeddings."
}

// Image models are not supported
try {
  baseten.imageModel('test-model');
} catch (error) {
  // Error: NoSuchModelError for imageModel
}

For more information about Baseten models and deployment options, see the Baseten documentation.