# DeepInfra Provider

The [DeepInfra](https://deepinfra.com) provider supports state-of-the-art models through the DeepInfra API, including Llama 3, Mixtral, Qwen, and many other popular open-source models.

## Setup

The DeepInfra provider is available via the `@ai-sdk/deepinfra` module. You can install it with your package manager, for example `npm install @ai-sdk/deepinfra`.

## Provider Instance

You can import the default provider instance `deepinfra` from `@ai-sdk/deepinfra`:

```ts
import { deepinfra } from '@ai-sdk/deepinfra';
```

If you need a customized setup, you can import `createDeepInfra` from `@ai-sdk/deepinfra` and create a provider instance with your settings:

```ts
import { createDeepInfra } from '@ai-sdk/deepinfra';

const deepinfra = createDeepInfra({
  apiKey: process.env.DEEPINFRA_API_KEY ?? '',
});
```

You can use the following optional settings to customize the DeepInfra provider instance:

- **baseURL** _string_

  Use a different URL prefix for API calls, e.g. to use proxy servers.
  The default prefix is `https://api.deepinfra.com/v1`.

  Note: Language models and embeddings use OpenAI-compatible endpoints at `{baseURL}/openai`, while image models use `{baseURL}/inference`.

- **apiKey** _string_

  API key that is sent in the `Authorization` header.
  It defaults to the `DEEPINFRA_API_KEY` environment variable.

- **headers** _Record<string,string>_

  Custom headers to include in the requests.

- **fetch** _(input: RequestInfo, init?: RequestInit) => Promise<Response>_

  Custom [fetch](https://developer.mozilla.org/en-US/docs/Web/API/fetch) implementation.
  Defaults to the global `fetch` function.
  You can use it as middleware to intercept requests, or to provide a custom fetch implementation, e.g. for testing.

## Language Models

You can create language models using a provider instance.
The first argument is the model ID, for example:

```ts
import { deepinfra } from '@ai-sdk/deepinfra';
import { generateText } from 'ai';

const { text } = await generateText({
  model: deepinfra('meta-llama/Meta-Llama-3.1-70B-Instruct'),
  prompt: 'Write a vegetarian lasagna recipe for 4 people.',
});
```

DeepInfra language models can also be used in the `streamText` function (see [AI SDK Core](/docs/ai-sdk-core)).

## Model Capabilities

| Model                                               | Image Input | Object Generation | Tool Usage | Tool Streaming |
| --------------------------------------------------- | ----------- | ----------------- | ---------- | -------------- |
| `meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8` |             |                   |            |                |
| `meta-llama/Llama-4-Scout-17B-16E-Instruct`         |             |                   |            |                |
| `meta-llama/Llama-3.3-70B-Instruct-Turbo`           |             |                   |            |                |
| `meta-llama/Llama-3.3-70B-Instruct`                 |             |                   |            |                |
| `meta-llama/Meta-Llama-3.1-405B-Instruct`           |             |                   |            |                |
| `meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo`      |             |                   |            |                |
| `meta-llama/Meta-Llama-3.1-70B-Instruct`            |             |                   |            |                |
| `meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo`       |             |                   |            |                |
| `meta-llama/Meta-Llama-3.1-8B-Instruct`             |             |                   |            |                |
| `meta-llama/Llama-3.2-11B-Vision-Instruct`          |             |                   |            |                |
| `meta-llama/Llama-3.2-90B-Vision-Instruct`          |             |                   |            |                |
| `mistralai/Mixtral-8x7B-Instruct-v0.1`              |             |                   |            |                |
| `deepseek-ai/DeepSeek-V3`                           |             |                   |            |                |
| `deepseek-ai/DeepSeek-R1`                           |             |                   |            |                |
| `deepseek-ai/DeepSeek-R1-Distill-Llama-70B`         |             |                   |            |                |
| `deepseek-ai/DeepSeek-R1-Turbo`                     |             |                   |            |                |
| `nvidia/Llama-3.1-Nemotron-70B-Instruct`            |             |                   |            |                |
| `Qwen/Qwen2-7B-Instruct`                            |             |                   |            |                |
| `Qwen/Qwen2.5-72B-Instruct`                         |             |                   |            |                |
| `Qwen/Qwen2.5-Coder-32B-Instruct`                   |             |                   |            |                |
| `Qwen/QwQ-32B-Preview`                              |             |                   |            |                |
| `google/codegemma-7b-it`                            |             |                   |            |                |
| `google/gemma-2-9b-it`                              |             |                   |            |                |
| `microsoft/WizardLM-2-8x22B`                        |             |                   |            |                |

The table above lists popular models. Please see the [DeepInfra docs](https://deepinfra.com) for a full list of available models.
You can also pass any available provider model ID as a string if needed.

## Image Models

You can create DeepInfra image models using the `.image()` factory method.
For more on image generation with the AI SDK see [generateImage()](/docs/reference/ai-sdk-core/generate-image).

```ts
import { deepinfra } from '@ai-sdk/deepinfra';
import { experimental_generateImage as generateImage } from 'ai';

const { image } = await generateImage({
  model: deepinfra.image('stabilityai/sd3.5'),
  prompt: 'A futuristic cityscape at sunset',
  aspectRatio: '16:9',
});
```

Support for the `size` and `aspectRatio` parameters varies by model. Please check the individual model documentation on [DeepInfra's models page](https://deepinfra.com/models/text-to-image) for supported options and additional parameters.

### Model-specific options

You can pass model-specific parameters using the `providerOptions.deepinfra` field:

```ts
import { deepinfra } from '@ai-sdk/deepinfra';
import { experimental_generateImage as generateImage } from 'ai';

const { image } = await generateImage({
  model: deepinfra.image('stabilityai/sd3.5'),
  prompt: 'A futuristic cityscape at sunset',
  aspectRatio: '16:9',
  providerOptions: {
    deepinfra: {
      num_inference_steps: 30, // Control the number of denoising steps (1-50)
    },
  },
});
```

### Model Capabilities

For models supporting aspect ratios, the following ratios are typically supported:
`1:1 (default), 16:9, 21:9, 3:2, 2:3, 4:5, 5:4, 9:16, 9:21`

For models supporting size parameters, dimensions must typically be:

- Multiples of 32
- Width and height between 256 and 1440 pixels
- Default size is 1024x1024

| Model                              | Dimensions Specification | Notes                                                        |
| ---------------------------------- | ------------------------ | ------------------------------------------------------------ |
| `stabilityai/sd3.5`                | Aspect Ratio             | Premium quality base model, 8B parameters                    |
| `black-forest-labs/FLUX-1.1-pro`   | Size                     | Latest state-of-the-art model with superior prompt following |
| `black-forest-labs/FLUX-1-schnell` | Size                     | Fast generation in 1-4 steps                                 |
| `black-forest-labs/FLUX-1-dev`     | Size                     | Optimized for anatomical accuracy                            |
| `black-forest-labs/FLUX-pro`       | Size                     | Flagship Flux model                                          |
| `stabilityai/sd3.5-medium`         | Aspect Ratio             | Balanced 2.5B parameter model                                |
| `stabilityai/sdxl-turbo`           | Aspect Ratio             | Optimized for fast generation                                |

For more details and pricing information, see the [DeepInfra text-to-image models page](https://deepinfra.com/models/text-to-image).

## Embedding Models

You can create DeepInfra embedding models using the `.textEmbedding()` factory method.
For more on embedding models with the AI SDK see [embed()](/docs/reference/ai-sdk-core/embed).

```ts
import { deepinfra } from '@ai-sdk/deepinfra';
import { embed } from 'ai';

const { embedding } = await embed({
  model: deepinfra.textEmbedding('BAAI/bge-large-en-v1.5'),
  value: 'sunny day at the beach',
});
```

### Model Capabilities

| Model                                                 | Dimensions | Max Tokens |
| ----------------------------------------------------- | ---------- | ---------- |
| `BAAI/bge-base-en-v1.5`                               | 768        | 512        |
| `BAAI/bge-large-en-v1.5`                              | 1024       | 512        |
| `BAAI/bge-m3`                                         | 1024       | 8192       |
| `intfloat/e5-base-v2`                                 | 768        | 512        |
| `intfloat/e5-large-v2`                                | 1024       | 512        |
| `intfloat/multilingual-e5-large`                      | 1024       | 512        |
| `sentence-transformers/all-MiniLM-L12-v2`             | 384        | 256        |
| `sentence-transformers/all-MiniLM-L6-v2`              | 384        | 256        |
| `sentence-transformers/all-mpnet-base-v2`             | 768        | 384        |
| `sentence-transformers/clip-ViT-B-32`                 | 512        | 77         |
| `sentence-transformers/clip-ViT-B-32-multilingual-v1` | 512        | 77         |
| `sentence-transformers/multi-qa-mpnet-base-dot-v1`    | 768        | 512        |
| `sentence-transformers/paraphrase-MiniLM-L6-v2`       | 384        | 128        |
| `shibing624/text2vec-base-chinese`                    | 768        | 512        |
| `thenlper/gte-base`                                   | 768        | 512        |
| `thenlper/gte-large`                                  | 1024       | 512        |

For a complete list of available embedding models, see the [DeepInfra embeddings page](https://deepinfra.com/models/embeddings).
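
Embedding vectors such as the one returned by `embed` above are typically compared with cosine similarity. The following is a minimal sketch of that comparison, not part of the provider itself: `cosine` and `rankBySimilarity` are hypothetical helpers written for illustration, and the toy 3-dimensional vectors stand in for real model output (which has 384 to 1024 dimensions for the models in the table). The `ai` package also exports a `cosineSimilarity` helper that you may prefer in practice.

```ts
// Result shape for the hypothetical ranking helper below.
type Scored = { text: string; score: number };

// Cosine similarity: dot(a, b) / (|a| * |b|).
// Returns 0 when either vector has zero length to avoid dividing by zero.
function cosine(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error(`Dimension mismatch: ${a.length} vs ${b.length}`);
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Hypothetical helper: rank candidate texts by similarity to a query vector,
// most similar first.
function rankBySimilarity(
  query: number[],
  candidates: { text: string; embedding: number[] }[],
): Scored[] {
  return candidates
    .map(c => ({ text: c.text, score: cosine(query, c.embedding) }))
    .sort((x, y) => y.score - x.score);
}

// Toy vectors for illustration only; in real use these would come from
// `embed` or `embedMany` with a DeepInfra embedding model.
const ranked = rankBySimilarity(
  [1, 0, 0],
  [
    { text: 'rainy night in the city', embedding: [0, 1, 0] },
    { text: 'sunny day at the beach', embedding: [0.9, 0.1, 0] },
  ],
);

console.log(ranked[0].text);
```

Vectors from different models are not comparable, and the dimension check in `cosine` catches the most obvious case of mixing them, for example a 768-dimensional vector against a 1024-dimensional one.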