LlamaGate

LlamaGate is an OpenAI-compatible API gateway providing access to 26+ open-source LLMs with competitive pricing. Perfect for indie developers and startups who want affordable access to models like Llama, Qwen, DeepSeek, and Mistral.

  • 26+ Open-Source Models: Access Llama, Mistral, DeepSeek R1, Qwen, and more
  • OpenAI-Compatible API: Drop-in replacement for existing OpenAI integrations
  • Competitive Pricing: $0.02-$0.55 per 1M tokens
  • Vision Models: Qwen VL, LLaVA for multimodal tasks
  • Reasoning Models: DeepSeek R1 for complex problem-solving
  • Code Models: CodeLlama, DeepSeek Coder, Qwen Coder
  • Embedding Models: Nomic Embed Text, Qwen 3 Embedding
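Because the gateway is OpenAI-compatible, requests use the standard chat-completions shape. A minimal sketch of such a request body in plain TypeScript (no SDK required; the field names follow the OpenAI API, and the model ID is taken from the tables below):

```typescript
// Shape of an OpenAI-style chat completion request, as an
// OpenAI-compatible gateway accepts it.
interface ChatMessage {
  role: 'system' | 'user' | 'assistant';
  content: string;
}

interface ChatCompletionRequest {
  model: string;
  messages: ChatMessage[];
  temperature?: number;
  stream?: boolean;
}

// Build a request body exactly as an OpenAI client would.
const body: ChatCompletionRequest = {
  model: 'llama-3.1-8b',
  messages: [
    { role: 'system', content: 'You are a helpful assistant.' },
    { role: 'user', content: 'Hello!' },
  ],
  temperature: 0.7,
};

console.log(JSON.stringify(body));
```

Any client that can speak this protocol can point at LlamaGate; the provider package below wires this up for the AI SDK specifically.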

Learn more about LlamaGate's capabilities in the LlamaGate Documentation.

Setup

The LlamaGate provider is available in the @llamagate/ai-sdk-provider module. You can install it with:

pnpm add @llamagate/ai-sdk-provider

Provider Instance

To create a LlamaGate provider instance, use the createLlamaGate function:

import { createLlamaGate } from '@llamagate/ai-sdk-provider';

const llamagate = createLlamaGate({
  apiKey: 'YOUR_LLAMAGATE_API_KEY',
});

You can obtain your LlamaGate API key from the LlamaGate Dashboard.

Alternatively, you can use the default instance which reads from the LLAMAGATE_API_KEY environment variable:

import { llamagate } from '@llamagate/ai-sdk-provider';

Language Models

LlamaGate provides chat models via the llamagate() function or llamagate.chatModel():

// Default usage
const model = llamagate('llama-3.1-8b');
// Explicit chat model
const chatModel = llamagate.chatModel('qwen3-8b');

Available Models

Model ID              Description                    Context
llama-3.1-8b          Llama 3.1 8B Instruct          131K
llama-3.2-3b          Llama 3.2 3B                   131K
qwen3-8b              Qwen 3 8B                      32K
mistral-7b-v0.3       Mistral 7B v0.3                32K
deepseek-r1-8b        DeepSeek R1 8B (Reasoning)     64K
deepseek-r1-7b-qwen   DeepSeek R1 Distill Qwen 7B    131K
openthinker-7b        OpenThinker 7B                 32K
dolphin3-8b           Dolphin 3 8B                   128K
qwen2.5-coder-7b      Qwen 2.5 Coder 7B              32K
codellama-7b          CodeLlama 7B                   16K
qwen3-vl-8b           Qwen 3 VL 8B (Vision)          32K
llava-7b              LLaVA 1.5 7B (Vision)          4K
gemma3-4b             Gemma 3 4B (Vision)            128K

You can find the full list of available models in the LlamaGate Models documentation.
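When an application serves several kinds of requests, it can be convenient to map a task to a default model ID. A small illustrative lookup built from the table above (the helper itself is not part of the SDK; the IDs are the documented ones):

```typescript
// Illustrative task → model lookup using model IDs from the table above.
// This helper is a local convenience, not an SDK feature.
const MODELS_BY_TASK: Record<string, string> = {
  chat: 'llama-3.1-8b',
  reasoning: 'deepseek-r1-8b',
  code: 'qwen2.5-coder-7b',
  vision: 'qwen3-vl-8b',
};

function pickModel(task: string): string {
  const id = MODELS_BY_TASK[task];
  if (!id) throw new Error(`No default model for task: ${task}`);
  return id;
}

console.log(pickModel('vision')); // → qwen3-vl-8b
```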

Embedding Models

LlamaGate provides text embedding models via llamagate.textEmbeddingModel():

const embeddingModel = llamagate.textEmbeddingModel('nomic-embed-text');

Available Embedding Models

Model ID              Description            Context
nomic-embed-text      Nomic Embed Text       8K
embeddinggemma-300m   EmbeddingGemma 300M    2K
qwen3-embedding-8b    Qwen 3 Embedding 8B    40K

Examples

Here are examples of using LlamaGate with the AI SDK:

generateText

import { createLlamaGate } from '@llamagate/ai-sdk-provider';
import { generateText } from 'ai';

const llamagate = createLlamaGate({
  apiKey: 'YOUR_LLAMAGATE_API_KEY',
});

const { text } = await generateText({
  model: llamagate('llama-3.1-8b'),
  prompt: 'Explain quantum computing in simple terms.',
});

console.log(text);

streamText

import { createLlamaGate } from '@llamagate/ai-sdk-provider';
import { streamText } from 'ai';

const llamagate = createLlamaGate({
  apiKey: 'YOUR_LLAMAGATE_API_KEY',
});

const result = streamText({
  model: llamagate('qwen3-8b'),
  prompt: 'Write a short story about a robot.',
});

// Iterate the text stream to print each delta as it arrives.
for await (const chunk of result.textStream) {
  process.stdout.write(chunk);
}

embed

import { createLlamaGate } from '@llamagate/ai-sdk-provider';
import { embed } from 'ai';

const llamagate = createLlamaGate({
  apiKey: 'YOUR_LLAMAGATE_API_KEY',
});

const { embedding } = await embed({
  model: llamagate.textEmbeddingModel('nomic-embed-text'),
  value: 'The quick brown fox jumps over the lazy dog.',
});

console.log(embedding);
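Embedding vectors are typically compared with cosine similarity, e.g. for semantic search over documents. A dependency-free helper in plain TypeScript (independent of the SDK):

```typescript
// Cosine similarity between two embedding vectors.
// Values near 1 indicate semantically similar texts.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error('Vector length mismatch');
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

console.log(cosineSimilarity([1, 0, 1], [1, 0, 1])); // → 1
```

To rank documents against a query, embed each text once, then sort by similarity to the query embedding.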

Vision

import { createLlamaGate } from '@llamagate/ai-sdk-provider';
import { generateText } from 'ai';

const llamagate = createLlamaGate({
  apiKey: 'YOUR_LLAMAGATE_API_KEY',
});

const { text } = await generateText({
  model: llamagate('qwen3-vl-8b'),
  messages: [
    {
      role: 'user',
      content: [
        { type: 'text', text: 'What is in this image?' },
        { type: 'image', image: new URL('https://example.com/image.jpg') },
      ],
    },
  ],
});

console.log(text);

Additional Resources