# Baseten Provider
Baseten is an inference platform for serving frontier, enterprise-grade open source AI models via an API.
## Setup
The Baseten provider is available via the `@ai-sdk/baseten` module. You can install it with:

```bash
pnpm add @ai-sdk/baseten
```
## Provider Instance
You can import the default provider instance `baseten` from `@ai-sdk/baseten`:

```ts
import { baseten } from '@ai-sdk/baseten';
```
If you need a customized setup, you can import `createBaseten` from `@ai-sdk/baseten` and create a provider instance with your settings:

```ts
import { createBaseten } from '@ai-sdk/baseten';

const baseten = createBaseten({
  apiKey: process.env.BASETEN_API_KEY ?? '',
});
```
You can use the following optional settings to customize the Baseten provider instance:
- **baseURL** _string_

  Use a different URL prefix for API calls, e.g. to use proxy servers. The default prefix is `https://inference.baseten.co/v1`.

- **apiKey** _string_

  API key that is sent using the `Authorization` header. It defaults to the `BASETEN_API_KEY` environment variable. It is recommended that you set the environment variable with `export` so you do not need to include the field every time. You can grab your Baseten API key here.

- **modelURL** _string_

  Custom model URL for specific models (chat or embeddings). If not provided, the default Models API will be used.

- **headers** _Record<string, string>_

  Custom headers to include in the requests.

- **fetch** _(input: RequestInfo, init?: RequestInit) => Promise<Response>_

  Custom fetch implementation.
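Several of these options can be combined. The following is a sketch only; the proxy URL and header name are placeholders for illustration, not Baseten requirements:

```ts
import { createBaseten } from '@ai-sdk/baseten';

const baseten = createBaseten({
  apiKey: process.env.BASETEN_API_KEY ?? '',
  // Hypothetical proxy in front of the default Models API endpoint:
  baseURL: 'https://my-proxy.example.com/v1',
  // Example custom header; use whatever names your infrastructure expects:
  headers: { 'X-Request-Source': 'my-app' },
  // Custom fetch, e.g. to log outgoing requests:
  fetch: async (input, init) => {
    console.log('Baseten request:', input);
    return fetch(input, init);
  },
});
```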
## Models API
You can select Baseten models using a provider instance. The first argument is the model ID, e.g. `'deepseek-ai/DeepSeek-V3-0324'`. The complete list of models supported by the Models API can be found here.

```ts
const model = baseten('deepseek-ai/DeepSeek-V3-0324');
```
### Example
You can use Baseten language models to generate text with the `generateText` function:

```ts
import { baseten } from '@ai-sdk/baseten';
import { generateText } from 'ai';

const { text } = await generateText({
  model: baseten('deepseek-ai/DeepSeek-V3-0324'),
  prompt: 'What is the meaning of life? Answer in one sentence.',
});
```
Baseten language models can also be used in the `streamText` function (see AI SDK Core).
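A minimal streaming sketch using the same model:

```ts
import { baseten } from '@ai-sdk/baseten';
import { streamText } from 'ai';

const result = streamText({
  model: baseten('deepseek-ai/DeepSeek-V3-0324'),
  prompt: 'Write a one-paragraph summary of what an inference platform does.',
});

// Print the response as it streams in.
for await (const textPart of result.textStream) {
  process.stdout.write(textPart);
}
```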
## Dedicated Models
Baseten supports dedicated model URLs for both chat and embedding models. You have to specify a `modelURL` when creating the provider:
### OpenAI-Compatible Endpoints (`/sync/v1`)

For models deployed with Baseten's OpenAI-compatible endpoints:
```ts
import { createBaseten } from '@ai-sdk/baseten';
import { generateText } from 'ai';

const baseten = createBaseten({
  modelURL: 'https://model-{MODEL_ID}.api.baseten.co/sync/v1',
});

// No modelId is needed because we specified modelURL
const model = baseten();

const { text } = await generateText({
  model,
  prompt: 'Say hello from the OpenAI-compatible chat model!',
});
```
### `/predict` Endpoints

`/predict` endpoints are currently **not** supported for chat models. You must use `/sync/v1` endpoints for chat functionality.
## Embedding Models
You can create models that call the Baseten embeddings API using the `.textEmbeddingModel()` factory method. The Baseten provider uses the high-performance `@basetenlabs/performance-client` for optimal embedding performance.
```ts
import { createBaseten } from '@ai-sdk/baseten';
import { embed, embedMany } from 'ai';

const baseten = createBaseten({
  modelURL: 'https://model-{MODEL_ID}.api.baseten.co/sync',
});

const embeddingModel = baseten.textEmbeddingModel();

// Single embedding
const { embedding } = await embed({
  model: embeddingModel,
  value: 'sunny day at the beach',
});

// Batch embeddings
const { embeddings } = await embedMany({
  model: embeddingModel,
  values: [
    'sunny day at the beach',
    'rainy afternoon in the city',
    'snowy mountain peak',
  ],
});
```
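The returned embeddings are plain number arrays, so they work directly with the AI SDK's `cosineSimilarity` helper, for example to compare two of the batch results above:

```ts
import { cosineSimilarity } from 'ai';

// Compare the first two embeddings from the embedMany call above.
console.log(
  'beach vs. city:',
  cosineSimilarity(embeddings[0], embeddings[1]),
);
```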
### Endpoint Support for Embeddings
**Supported:**

- `/sync` endpoints (the Performance Client automatically appends `/v1/embeddings`)
- `/sync/v1` endpoints (the provider automatically strips `/v1` before passing the URL to the Performance Client)

**Not supported:**

- `/predict` endpoints (not compatible with the Performance Client)
### Performance Features
The embedding implementation includes:
- High-performance client: Uses
@basetenlabs/performance-client
for optimal performance - Automatic batching: Efficiently handles multiple texts in a single request
- Connection reuse: Performance Client is created once and reused for all requests
- Built-in retries: Automatic retry logic for failed requests
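A practical consequence of the connection-reuse behavior is to create the provider and embedding model once and reuse them across calls, rather than recreating them per request. This is a sketch, not a prescribed pattern, and `embedBatch` is a hypothetical helper:

```ts
import { createBaseten } from '@ai-sdk/baseten';
import { embedMany } from 'ai';

// Created once at module scope; the underlying Performance Client
// and its connections are reused for every call below.
const baseten = createBaseten({
  modelURL: 'https://model-{MODEL_ID}.api.baseten.co/sync',
});
const embeddingModel = baseten.textEmbeddingModel();

// Hypothetical helper: each call sends its texts as a single batched request.
export async function embedBatch(texts: string[]) {
  const { embeddings } = await embedMany({
    model: embeddingModel,
    values: texts,
  });
  return embeddings;
}
```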
## Error Handling
The Baseten provider includes built-in error handling for common API errors:
```ts
import { baseten } from '@ai-sdk/baseten';
import { generateText } from 'ai';

try {
  const { text } = await generateText({
    model: baseten('deepseek-ai/DeepSeek-V3-0324'),
    prompt: 'Hello, world!',
  });
} catch (error) {
  console.error('Baseten API error:', error.message);
}
```
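If you need to branch on the failure type, the AI SDK exposes `APICallError`. The following is a sketch that assumes the underlying request failure surfaces as an `APICallError`; provider internals may wrap errors differently:

```ts
import { baseten } from '@ai-sdk/baseten';
import { generateText, APICallError } from 'ai';

try {
  await generateText({
    model: baseten('deepseek-ai/DeepSeek-V3-0324'),
    prompt: 'Hello, world!',
  });
} catch (error) {
  if (APICallError.isInstance(error)) {
    // Status code and raw response body from the failed request.
    console.error('status:', error.statusCode);
    console.error('body:', error.responseBody);
  } else {
    throw error;
  }
}
```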
### Common Error Scenarios
```ts
// Embeddings require a modelURL
try {
  baseten.textEmbeddingModel();
} catch (error) {
  // Error: "No model URL provided for embeddings. Please set modelURL option for embeddings."
}
```

```ts
// /predict endpoints are not supported for chat models
try {
  const baseten = createBaseten({
    modelURL:
      'https://model-{MODEL_ID}.api.baseten.co/environments/production/predict',
  });
  baseten(); // This will throw an error
} catch (error) {
  // Error: "Not supported. You must use a /sync/v1 endpoint for chat models."
}
```

```ts
// /sync/v1 endpoints are supported for embeddings
const baseten = createBaseten({
  modelURL:
    'https://model-{MODEL_ID}.api.baseten.co/environments/production/sync/v1',
});
const embeddingModel = baseten.textEmbeddingModel(); // This works fine!
```

```ts
// /predict endpoints are not supported for embeddings
try {
  const baseten = createBaseten({
    modelURL:
      'https://model-{MODEL_ID}.api.baseten.co/environments/production/predict',
  });
  baseten.textEmbeddingModel(); // This will throw an error
} catch (error) {
  // Error: "Not supported. You must use a /sync or /sync/v1 endpoint for embeddings."
}
```

```ts
// Image models are not supported
try {
  baseten.imageModel('test-model');
} catch (error) {
  // Error: NoSuchModelError for imageModel
}
```
For more information about Baseten models and deployment options, see the Baseten documentation.