Caching
Pluggable response + embedding cache. Built-in lruCache, external Redis, or roll your own.
A CacheProvider sits between the AIEngine and your LLM. It stores both completions and (optionally) embeddings, keyed by provider + model + content hash. Every entry is bound to the provider that wrote it — when you swap providers or models, the engine invalidates the old prefix automatically. No stale answers from yesterday's model.
In-memory (lruCache)
Built into core. Bounded LRU with optional TTL. Perfect for dev, demos, and any deployment where a single browser session's worth of cache is enough.
import { lruCache } from '@inferagraph/core/data';
// Built-in. Lives for the page session; cleared on reload.
const cache = lruCache({
  maxEntries: 500, // optional, default 1000
  ttl: '1h', // optional; accepts ms / '30s' / '1h' / '1d'
});
<InferaGraph data={data} llm={llm} cache={cache} />
Redis (production)
Install @inferagraph/redis for a Redis-backed cache that survives restarts and is shareable across processes. Honors the same maxEntries + TTL semantics as the in-memory cache, so swapping between them is a one-line change. The factory redisCacheProvider({...}) owns the Redis SDK construction; pass an existing client only if you need to share connections.
import { redisCacheProvider } from '@inferagraph/redis';
// External. Survives restarts, shareable across instances.
// Suitable for production; runs server-side only.
const cache = redisCacheProvider({
  url: process.env.REDIS_URL,
  prefix: 'inferagraph:', // optional
  ttl: '1d', // optional
  maxEntries: 10_000, // optional; -1 disables cap
});
Run server-side only: Redis clients can't talk to a Redis server from a browser. Pair with an httpTransport so cached completions live next to the engine.
Cosmos DB
Install @inferagraph/cosmosdb for an Azure-native cache backed by a Cosmos DB container. Use this when your stack already runs on Azure and you'd rather not stand up a separate Redis. The factory cosmosCacheProvider({...}) owns the SDK construction; pass endpoint + key (or supply your own CosmosClient via the class form).
import { cosmosCacheProvider } from '@inferagraph/cosmosdb';
// Azure-native cache backed by a Cosmos DB container.
const cache = cosmosCacheProvider({
  endpoint: process.env.COSMOS_ENDPOINT,
  key: process.env.COSMOS_KEY,
  database: 'inferagraph',
  container: 'cache',
});
SQL (Postgres / MySQL / SQLite / MSSQL)
Install @inferagraph/sql for a SQL-backed cache. Useful when your app already has a relational database and you want completions persisted in the same durability domain. The factory sqlCacheProvider({...}) takes a Knex connection config and provisions/uses a single inferagraph_cache table. Pair with provisionSqlSchemas to bring up vector + cache + conversation tables together.
import { sqlCacheProvider } from '@inferagraph/sql';
// Postgres / MySQL / SQLite / MSSQL cache table via Knex.
const cache = sqlCacheProvider({
  dialect: 'postgres',
  connection: process.env.DATABASE_URL,
  table: 'inferagraph_cache', // optional, default 'inferagraph_cache'
});
The class form (new SqlCacheProvider({ knex })) is the escape hatch when you want to share an existing Knex instance.
CacheProvider contract
Four methods. The engine never assumes a specific backend — implement these and you have a cache. Core 0.9.0 widened the interface: set() now accepts an optional opts.ttlSeconds for per-call expiry, and delete(key) is a first-class method (no longer a prefix sweep).
// Widened in core 0.9.0: per-call ttlSeconds on set(), explicit delete().
interface CacheProvider {
  get(key: string): Promise<string | undefined>;
  set(
    key: string,
    value: string,
    opts?: { ttlSeconds?: number }
  ): Promise<void>;
  delete(key: string): Promise<void>;
  clear(): Promise<void>;
}
In-memory reference impl (inMemoryCacheProvider)
Core 0.9.0 ships inMemoryCacheProvider, a TTL-aware CacheProvider for dev, tests, and single-process deployments. Expiry is lazy — entries are dropped on the next get() that observes them past their deadline. Construction-time default TTL is overridden by the per-call opts.ttlSeconds on set(). Production impls live in the storage packages.
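To make the lazy-expiry and TTL-precedence rules concrete, here is a minimal sketch of a cache that behaves the way this paragraph describes. It is not the library's implementation: `lazyTtlCache`, its `defaults` argument, and the injectable `now` clock are hypothetical names chosen for illustration, and the interface is copied from the contract above so the block stands alone.

```typescript
// Mirrors the CacheProvider contract documented on this page.
interface CacheProvider {
  get(key: string): Promise<string | undefined>;
  set(key: string, value: string, opts?: { ttlSeconds?: number }): Promise<void>;
  delete(key: string): Promise<void>;
  clear(): Promise<void>;
}

type Entry = { value: string; expiresAt: number | undefined };

// Hypothetical helper, not the shipped inMemoryCacheProvider.
// `now` is injectable so expiry can be exercised without real waiting.
function lazyTtlCache(
  defaults: { ttlSeconds?: number } = {},
  now: () => number = Date.now,
): CacheProvider {
  const entries = new Map<string, Entry>();
  return {
    async get(key) {
      const entry = entries.get(key);
      if (entry === undefined) return undefined;
      // Lazy expiry: a stale entry is only dropped when a read observes it.
      if (entry.expiresAt !== undefined && now() > entry.expiresAt) {
        entries.delete(key);
        return undefined;
      }
      return entry.value;
    },
    async set(key, value, opts) {
      // The per-call TTL wins over the construction-time default.
      const ttlSeconds = opts?.ttlSeconds ?? defaults.ttlSeconds;
      const expiresAt =
        ttlSeconds === undefined ? undefined : now() + ttlSeconds * 1000;
      entries.set(key, { value, expiresAt });
    },
    async delete(key) {
      entries.delete(key);
    },
    async clear() {
      entries.clear();
    },
  };
}
```

Because nothing sweeps in the background, an expired entry that is never read again stays in memory until `delete()` or `clear()` removes it. That trade-off is usually fine for dev and tests, which is the use case this section targets.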
import { inMemoryCacheProvider } from '@inferagraph/core/data';
// TTL-aware in-memory CacheProvider. Reference impl for dev / tests.
// Lazy expiry on get(). Construction-time default + per-call override.
const cache = inMemoryCacheProvider({
  ttlSeconds: 3600, // optional default, applied when set() omits opts
});
await cache.set('k1', 'value'); // uses default 3600s
await cache.set('k2', 'value', { ttlSeconds: 60 }); // per-call override
await cache.delete('k1');
await cache.clear();
The bounded-size InMemoryLRUCache (used by lruCache) implements the same widened interface and gained delete(key) in 0.9.0 for completeness. Its TTL policy stays construction-time only: the opts.ttlSeconds argument is accepted but ignored, because per-entry expiry interacts awkwardly with bounded-size eviction. Reach for inMemoryCacheProvider when you need real per-call TTLs and lruCache when you need a hard memory cap.
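For readers curious what a hard memory cap looks like behind this interface, the following is a rough sketch of one common approach: relying on `Map` insertion order to track recency. This is an assumption about technique, not a description of how InMemoryLRUCache is actually built. `boundedLru` is a hypothetical name, and the construction-time TTL is left out to keep the sketch short.

```typescript
// Mirrors the CacheProvider contract documented on this page.
interface CacheProvider {
  get(key: string): Promise<string | undefined>;
  set(key: string, value: string, opts?: { ttlSeconds?: number }): Promise<void>;
  delete(key: string): Promise<void>;
  clear(): Promise<void>;
}

// Hypothetical sketch; the real InMemoryLRUCache may be implemented differently.
function boundedLru(maxEntries: number): CacheProvider {
  // Map iterates in insertion order, so the first key is the least recently used.
  const entries = new Map<string, string>();
  return {
    async get(key) {
      const value = entries.get(key);
      if (value === undefined) return undefined;
      // Re-insert to mark the key as most recently used.
      entries.delete(key);
      entries.set(key, value);
      return value;
    },
    // opts is accepted for interface compatibility and intentionally ignored.
    async set(key, value, _opts) {
      entries.delete(key);
      entries.set(key, value);
      if (entries.size > maxEntries) {
        const oldest = entries.keys().next().value;
        if (oldest !== undefined) entries.delete(oldest);
      }
    },
    async delete(key) {
      entries.delete(key);
    },
    async clear() {
      entries.clear();
    },
  };
}
```

Eviction here depends only on recency and size, which is why mixing in per-entry deadlines gets awkward: an entry could be both "recently used" and "expired", and the cache would need a second policy to decide which wins.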
Key shape and invalidation
Keys are namespaced by provider id and model id. When you call setProvider() with a different provider or model, the engine drops the old namespace before writing new entries. You never see cross-model bleed.
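As a rough illustration of why namespaced keys make invalidation cheap, the sketch below composes keys in the documented `<providerId>:<modelId>:...` shape and drops one namespace from a plain `Map`. Everything here is hypothetical: the engine composes keys itself, its real hash function is not specified on this page (a 32-bit FNV-1a stands in), and `completionKey`, `embeddingKey`, and `dropNamespace` are illustrative names, not library exports.

```typescript
// Stand-in content hash (32-bit FNV-1a over UTF-16 code units).
// Assumption: the engine's real hash is unspecified; this keeps the sketch dependency-free.
function fnv1a(text: string): string {
  let h = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    h ^= text.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h.toString(16).padStart(8, "0");
}

// Hypothetical helpers mirroring the documented key shapes.
function completionKey(providerId: string, modelId: string, prompt: string): string {
  return `${providerId}:${modelId}:completion:${fnv1a(prompt)}`;
}

function embeddingKey(
  providerId: string,
  modelId: string,
  nodeId: string,
  text: string,
): string {
  return `${providerId}:${modelId}:embedding:${nodeId}:${fnv1a(text)}`;
}

// Removes every entry under "<providerId>:<modelId>:" and reports how many were dropped.
function dropNamespace(
  store: Map<string, string>,
  providerId: string,
  modelId: string,
): number {
  // The trailing colon keeps "model-1" from matching "model-1-mini".
  const prefix = `${providerId}:${modelId}:`;
  let removed = 0;
  for (const key of [...store.keys()]) {
    if (key.startsWith(prefix)) {
      store.delete(key);
      removed++;
    }
  }
  return removed;
}
```

If you build a custom backend, a prefix that ends on the separator is the detail worth copying: it lets one provider or model be invalidated without touching a sibling whose id merely starts with the same characters.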
// Cache keys are composed by the engine. You don't construct
// them yourself; this is the shape so you know what to expect
// when inspecting Redis or designing eviction.
// Completion key
"<providerId>:<modelId>:completion:<hash(prompt)>"
// Embedding key (Tier 2 — cache acts as embedding store)
"<providerId>:<modelId>:embedding:<nodeId>:<hash(text)>"
// Provider switch invalidates everything under the old prefix.
Custom caches
Wrap any backend — Memcached, Cloudflare KV, DynamoDB, a SQLite file — by implementing the four methods.
import type { CacheProvider } from '@inferagraph/core/data';
// Implement the four methods against any backend you like.
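// MemcacheClient is a placeholder for your client library's type; adapt the
// method names and option shapes below to whatever that client actually exposes.
function memcacheProvider(client: MemcacheClient): CacheProvider {
  return {
    async get(key) {
      // Normalize a null miss to undefined, as the contract expects.
      return (await client.get(key)) ?? undefined;
    },
    async set(key, value, opts) {
      await client.set(key, value, {
        ttl: opts?.ttlSeconds, // undefined when no per-call TTL is given
      });
    },
    async delete(key) { await client.delete(key); },
    async clear() { await client.flush(); },
  };
}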
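When you write your own provider, a small smoke test can catch contract mistakes early. The sketch below is one possible check, not something shipped by the library: `checkCacheContract` and `mapCache` are hypothetical names, and the interface is repeated from the contract section so the block runs on its own.

```typescript
// Mirrors the CacheProvider contract documented on this page.
interface CacheProvider {
  get(key: string): Promise<string | undefined>;
  set(key: string, value: string, opts?: { ttlSeconds?: number }): Promise<void>;
  delete(key: string): Promise<void>;
  clear(): Promise<void>;
}

// Hypothetical smoke test. Returns a list of failures; empty means all checks passed.
async function checkCacheContract(cache: CacheProvider): Promise<string[]> {
  const failures: string[] = [];
  await cache.clear();
  if ((await cache.get("missing")) !== undefined) {
    failures.push("get() on a missing key should resolve to undefined");
  }
  await cache.set("a", "1");
  await cache.set("b", "2", { ttlSeconds: 60 });
  if ((await cache.get("a")) !== "1") {
    failures.push("set() then get() should round-trip the value");
  }
  await cache.delete("a");
  if ((await cache.get("a")) !== undefined) {
    failures.push("delete() should remove the key");
  }
  if ((await cache.get("b")) !== "2") {
    failures.push("delete() should leave other keys alone");
  }
  await cache.clear();
  if ((await cache.get("b")) !== undefined) {
    failures.push("clear() should remove every entry");
  }
  return failures;
}

// Minimal Map-backed provider, used here only to exercise the checks.
// It accepts ttlSeconds and ignores it.
function mapCache(): CacheProvider {
  const entries = new Map<string, string>();
  return {
    async get(key) { return entries.get(key); },
    async set(key, value, _opts) { entries.set(key, value); },
    async delete(key) { entries.delete(key); },
    async clear() { entries.clear(); },
  };
}
```

Calling `await checkCacheContract(mapCache())` should resolve to an empty array. Note that the check calls `clear()`, so point it at a scratch instance or test namespace rather than a cache that holds live data.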