LLM Providers
Configure LLM providers for Anthropic, OpenAI, Azure OpenAI, and Azure AI Foundry, or build your own custom provider.
Each provider is a separate npm package. Install only what you need.
Anthropic
@inferagraph/anthropic-provider
import { InferaGraph } from '@inferagraph/core/react';
import { anthropicProvider } from '@inferagraph/anthropic-provider';
// Factory function returns an LLMProvider instance.
const llm = anthropicProvider({
  apiKey: process.env.ANTHROPIC_API_KEY,
  model: 'claude-sonnet-4-20250514', // default
  maxTokens: 1024, // default
});
function App() {
  return <InferaGraph data={data} llm={llm} />;
}
OpenAI
@inferagraph/openai-provider
import { InferaGraph } from '@inferagraph/core/react';
import { openaiProvider } from '@inferagraph/openai-provider';
// Factory function. Handles public OpenAI by default; pass an OpenAI
// SDK instance via the 'client' option for Azure OpenAI / OpenRouter / GitHub Models.
const llm = openaiProvider({
  apiKey: process.env.OPENAI_API_KEY,
  model: 'gpt-4o-mini', // default
  embeddingModel: 'text-embedding-3-small', // default
  organization: 'org-...', // optional
});
function App() {
  return <InferaGraph data={data} llm={llm} />;
}
Azure OpenAI
@inferagraph/openai-provider — azureOpenaiProvider
Use the dedicated azureOpenaiProvider factory for Azure OpenAI deployments. It targets the Azure OpenAI v1 API, so there is no api-version query string and you never construct an OpenAI / AzureOpenAI client by hand. Pass an optional embeddingDeployment to enable semantic search; omit it for chat-only deployments and embed stays undefined.
import { InferaGraph } from '@inferagraph/core/react';
import { azureOpenaiProvider } from '@inferagraph/openai-provider';
// Azure OpenAI v1 API: no api-version query string. The factory
// encapsulates SDK construction; you never build an OpenAI client by hand.
const llm = azureOpenaiProvider({
  apiKey: process.env.AZURE_OPENAI_API_KEY,
  endpoint: process.env.AZURE_OPENAI_ENDPOINT, // bare resource URL
  deployment: process.env.AZURE_OPENAI_DEPLOYMENT, // chat deployment name
  embeddingDeployment: process.env.AZURE_OPENAI_EMBEDDING_DEPLOYMENT, // optional
});
function App() {
  return <InferaGraph data={data} llm={llm} />;
}
- apiKey: set from AZURE_OPENAI_API_KEY.
- endpoint: set from AZURE_OPENAI_ENDPOINT; the bare resource URL. Trailing slashes are trimmed.
- deployment: set from AZURE_OPENAI_DEPLOYMENT; the chat model deployment name (sent as model).
- embeddingDeployment: optional, set from AZURE_OPENAI_EMBEDDING_DEPLOYMENT. When omitted, the provider has no embed capability.
- client: optional pre-built OpenAI SDK client; primarily for tests and mocks.
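The mapping from environment variables to options listed above can be gathered in one place. The following is a minimal sketch rather than part of the package: `readAzureOpenAIOptions`, `requireVar`, and `trimTrailingSlashes` are hypothetical helpers, and since the factory is described as trimming trailing slashes itself, the normalization here only illustrates that behavior.

```typescript
// Hypothetical helpers for illustration. Only the four environment variable
// names and the four option keys come from the documentation above.
interface AzureOpenAIOptions {
  apiKey: string;
  endpoint: string;
  deployment: string;
  embeddingDeployment?: string;
}

type Env = Record<string, string | undefined>;

// Mirrors the documented behavior: trailing slashes on the bare resource
// URL are removed.
function trimTrailingSlashes(url: string): string {
  return url.replace(/\/+$/, '');
}

// Fail fast when a required variable is missing or empty.
function requireVar(env: Env, name: string): string {
  const value = env[name];
  if (value === undefined || value === '') {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

function readAzureOpenAIOptions(env: Env): AzureOpenAIOptions {
  const options: AzureOpenAIOptions = {
    apiKey: requireVar(env, 'AZURE_OPENAI_API_KEY'),
    endpoint: trimTrailingSlashes(requireVar(env, 'AZURE_OPENAI_ENDPOINT')),
    deployment: requireVar(env, 'AZURE_OPENAI_DEPLOYMENT'),
  };
  // Leave the key out entirely for chat-only deployments.
  const embedding = env['AZURE_OPENAI_EMBEDDING_DEPLOYMENT'];
  if (embedding) {
    options.embeddingDeployment = embedding;
  }
  return options;
}
```

Under these assumptions the result could be handed straight to the factory, for example `azureOpenaiProvider(readAzureOpenAIOptions(process.env))`. Failing fast on a missing variable surfaces configuration mistakes at startup instead of at request time.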
Azure AI Foundry
@inferagraph/azure-foundry-provider
Targets Azure AI Foundry deployments via @azure-rest/ai-inference. Chat (complete, stream, streamMessages) only — embed is intentionally omitted because the underlying SDK targets chat-completion routes only. Pair with azureOpenaiProvider (with an embeddingDeployment) when you need embeddings.
import { InferaGraph } from '@inferagraph/core/react';
import { AzureFoundryProvider } from '@inferagraph/azure-foundry-provider';
// Azure AI Foundry catalog. Chat-only: embed is intentionally omitted.
// Pair with a separate embedding-capable provider (e.g. azureOpenaiProvider
// with an embeddingDeployment) when you need embeddings.
const llm = new AzureFoundryProvider({
  endpoint: 'https://your-endpoint.inference.ai.azure.com',
  apiKey: '...', // or pass 'credential' (TokenCredential)
  deploymentName: 'gpt-4o', // optional
});
function App() {
  return <InferaGraph data={data} llm={llm} />;
}
Back-compat: stream vs streamMessages
Every provider keeps the legacy single-string stream(prompt) signature for backwards compatibility. New consumers should call streamMessages(messages, opts) with a structured [{role, content}] array, so that system instructions stay separate from user input end to end and multi-turn chat works without manual prompt concatenation. See Chat API for the LLMMessage / LLMRole types.
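To make the difference concrete, the sketch below lifts a legacy single-string prompt into the structured form and then reads a message stream to completion. The `LLMMessage` and `LLMRole` declarations are local stand-ins inferred from the `[{role, content}]` shape, not the types exported by the Chat API. The chunk shape is assumed from the custom provider example on this page, and `fromPrompt`, `collectText`, and the echoing `streamMessages` stub are hypothetical.

```typescript
// Local stand-ins for the Chat API types; the real definitions may differ.
type LLMRole = 'system' | 'user' | 'assistant';
interface LLMMessage {
  role: LLMRole;
  content: string;
}

// Chunk shape assumed from the custom provider example on this page.
type StreamChunk =
  | { type: 'text'; delta: string }
  | { type: 'done'; reason: string };

// Hypothetical helper: lift a single-string prompt into structured messages,
// keeping the system instruction in its own message.
function fromPrompt(prompt: string, system?: string): LLMMessage[] {
  const messages: LLMMessage[] = [];
  if (system) messages.push({ role: 'system', content: system });
  messages.push({ role: 'user', content: prompt });
  return messages;
}

// Stub provider method for illustration only: echoes the last user message.
async function* streamMessages(messages: LLMMessage[]): AsyncGenerator<StreamChunk> {
  const lastUser = [...messages].reverse().find((m) => m.role === 'user');
  yield { type: 'text', delta: 'You asked: ' };
  yield { type: 'text', delta: lastUser ? lastUser.content : '' };
  yield { type: 'done', reason: 'stop' };
}

// Hypothetical helper: concatenate text deltas until the stream reports done.
async function collectText(stream: AsyncIterable<StreamChunk>): Promise<string> {
  let text = '';
  for await (const chunk of stream) {
    if (chunk.type === 'text') text += chunk.delta;
    else break;
  }
  return text;
}
```

Because the system instruction travels as its own message, a follow-up turn only needs another `{ role, content }` entry appended to the array; no prompt string has to be rebuilt.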
Custom Provider
Implement the LLMProvider interface to add any model.
import { InferaGraph } from '@inferagraph/core/react';
import type { LLMProvider, LLMMessage } from '@inferagraph/core/data';
// LLMProvider is an interface; return any object that implements it.
// stream(prompt) is back-compat only; new providers implement streamMessages.
const llm: LLMProvider = {
  name: 'my-provider',
  async complete(request) {
    // Call your model API
    return { content: '...' };
  },
  async *streamMessages(messages: LLMMessage[], opts) {
    // messages: [{role:'system'|'user'|'assistant', content}, ...]
    yield { type: 'text', delta: '...' };
    yield { type: 'done', reason: 'stop' };
  },
};
function App() {
  return <InferaGraph data={data} llm={llm} />;
}
Embeddings
Semantic search needs an embedding model. The OpenAI provider ships with embeddings out of the box. Anthropic has no native embeddings API, so the Anthropic provider proxies to Voyage AI internally—pass a Voyage key to opt in. Omit the embeddings block and chat-only callers are unaffected.
- @inferagraph/openai-provider — chat and embeddings via the same client.
- @inferagraph/anthropic-provider — chat via Anthropic; pass embeddings.apiKey to enable Voyage AI for embeddings.
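Since embed is simply absent on chat-only providers, calling code can feature-detect it before relying on semantic search. A minimal sketch of that check follows. The `embed` signature (an array of strings in, an array of vectors out) is an assumption, as this page does not specify it, and `ProviderLike`, `supportsEmbeddings`, and `embedIfSupported` are hypothetical names.

```typescript
// Assumed signature; the real embed method may take different arguments.
type EmbedFn = (texts: string[]) => Promise<number[][]>;

// Minimal structural view of a provider: only the parts this check needs.
interface ProviderLike {
  name: string;
  embed?: EmbedFn;
}

// Type guard: true only when the provider exposes an embed function.
function supportsEmbeddings(p: ProviderLike): p is ProviderLike & { embed: EmbedFn } {
  return typeof p.embed === 'function';
}

// Returns undefined for chat-only providers instead of throwing.
async function embedIfSupported(
  p: ProviderLike,
  texts: string[],
): Promise<number[][] | undefined> {
  if (!supportsEmbeddings(p)) return undefined;
  return p.embed(texts);
}
```

A guard like this lets an application decide up front whether to offer semantic search, rather than discovering a missing method at call time.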
import { anthropicProvider } from '@inferagraph/anthropic-provider';
// Anthropic handles chat. Anthropic has no native embeddings API,
// so the provider proxies to Voyage AI when a Voyage key is present.
// Omit the embeddings block and chat-only callers are unaffected.
const llm = anthropicProvider({
  apiKey: process.env.ANTHROPIC_API_KEY,
  embeddings: {
    apiKey: process.env.VOYAGE_API_KEY,
    model: 'voyage-3.5', // optional; default 'voyage-3.5'
  },
});
Cache Providers
A pluggable CacheProvider sits between InferaGraph and the LLM. It caches completions and embeddings, and is bound to the provider instance so swapping models invalidates stale entries automatically.
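One way to picture why swapping models invalidates stale entries: if every cache key includes the provider and model identity, entries written under one model are never read under another. The sketch below illustrates that idea only. The key format InferaGraph actually uses is not documented on this page, and `ProviderIdentity`, `cacheKey`, and the choice of an FNV-1a hash are assumptions made for the example.

```typescript
// Hypothetical identity of a provider instance for keying purposes.
interface ProviderIdentity {
  name: string;
  model: string;
}

// 32-bit FNV-1a over UTF-16 code units, used here only to keep keys short.
// A real cache would likely want a stronger hash to avoid collisions.
function fnv1a(input: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return ('00000000' + hash.toString(16)).slice(-8);
}

// The provider name and model are part of the key, so changing either one
// produces a different key for the same input.
function cacheKey(
  provider: ProviderIdentity,
  kind: 'completion' | 'embedding',
  input: string,
): string {
  return [provider.name, provider.model, kind, fnv1a(input)].join(':');
}
```

With keys built this way nothing has to be deleted when the model changes; old entries simply stop matching and can expire on their own.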
In-memory (default)
lruCache() from @inferagraph/core/data. Zero-config; lives for the page session.
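For readers unfamiliar with the eviction policy the name suggests, here is a minimal least-recently-used cache built on `Map` insertion order. It is an independent illustration, not the implementation behind `lruCache()`; the `TinyLru` class and its `capacity` parameter are assumptions.

```typescript
// Minimal LRU cache: a Map iterates in insertion order, so the first key
// is always the least recently used one.
class TinyLru<V> {
  private readonly entries = new Map<string, V>();
  private readonly capacity: number;

  constructor(capacity: number) {
    if (capacity < 1) throw new RangeError('capacity must be at least 1');
    this.capacity = capacity;
  }

  get(key: string): V | undefined {
    if (!this.entries.has(key)) return undefined;
    const value = this.entries.get(key) as V;
    // Re-insert to mark the entry as most recently used.
    this.entries.delete(key);
    this.entries.set(key, value);
    return value;
  }

  set(key: string, value: V): void {
    // Delete first so an existing key moves to the most recent position.
    this.entries.delete(key);
    this.entries.set(key, value);
    if (this.entries.size > this.capacity) {
      const oldest = this.entries.keys().next().value as string;
      this.entries.delete(oldest);
    }
  }

  get size(): number {
    return this.entries.size;
  }
}
```

Like any in-memory store, a structure of this kind is emptied on reload, which matches the page-session lifetime described above.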
Redis / Cosmos DB / SQL
@inferagraph/redis, @inferagraph/cosmosdb (cosmosCacheProvider), and @inferagraph/sql (sqlCacheProvider) — production-ready, survive restarts, shareable across instances. Cache both completions and (when used as Tier 2) embeddings.
import { InferaGraph } from '@inferagraph/core/react';
import { redisCacheProvider } from '@inferagraph/redis';
// Production-ready cache for completions and embeddings
const cache = redisCacheProvider({
  url: process.env.REDIS_URL,
  prefix: 'inferagraph:', // optional
  ttl: 60 * 60 * 24, // optional, seconds (24 hours)
});
// llm is any provider created as in the sections above.
<InferaGraph data={data} llm={llm} cache={cache} />
See Caching for the full contract, invalidation semantics, and how to write a custom cache backend.