Caching

Pluggable response + embedding cache. Built-in lruCache, external Redis, or roll your own.

A CacheProvider sits between the AIEngine and your LLM. It stores both completions and (optionally) embeddings, keyed by provider + model + content hash. Every entry is bound to the provider that wrote it — when you swap providers or models, the engine invalidates the old prefix automatically. No stale answers from yesterday's model.

In-memory (lruCache)

Built into core. Bounded LRU with optional TTL. Perfect for dev, demos, and any deployment where a single browser session's worth of cache is enough.

import { lruCache } from '@inferagraph/core/data';

// Built-in. Lives for the page session; cleared on reload.
const cache = lruCache({
  maxEntries: 500,    // optional, default 1000
  ttl: '1h',          // optional; accepts ms / '30s' / '1h' / '1d'
});

<InferaGraph data={data} llm={llm} cache={cache} />
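
The ttl option takes either a number of milliseconds or a short duration string. As an illustration only, the sketch below shows one way such a value could be normalized to milliseconds. ttlToMs is a hypothetical helper, not an export of @inferagraph/core, and it handles only the suffixes listed above (s, h, d).

```typescript
// Hypothetical helper, not part of @inferagraph/core: converts a ttl
// option (milliseconds, or a string such as '30s' / '1h' / '1d') to ms.
type Ttl = number | string;

const UNIT_MS: Record<string, number> = {
  s: 1_000,
  h: 3_600_000,
  d: 86_400_000,
};

function ttlToMs(ttl: Ttl): number {
  if (typeof ttl === 'number') {
    // Numbers are taken as milliseconds already.
    if (!Number.isFinite(ttl) || ttl < 0) {
      throw new RangeError(`invalid ttl: ${ttl}`);
    }
    return ttl;
  }
  const match = /^(\d+)([shd])$/.exec(ttl.trim());
  if (!match) {
    throw new RangeError(`invalid ttl: ${ttl}`);
  }
  return Number(match[1]) * UNIT_MS[match[2]];
}
```

Rejecting unknown strings early, rather than silently treating them as "no expiry", keeps a typo in configuration from producing a cache that never expires.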

Redis (production)

Install @inferagraph/redis for a Redis-backed cache that survives restarts and is shareable across processes. Honors the same maxEntries + TTL semantics as the in-memory cache, so swapping between them is a one-line change. The factory redisCacheProvider({...}) owns the Redis SDK construction; pass an existing client only if you need to share connections.

import { redisCacheProvider } from '@inferagraph/redis';

// External. Survives restarts, shareable across instances.
// Suitable for production; runs server-side only.
const cache = redisCacheProvider({
  url: process.env.REDIS_URL,
  prefix: 'inferagraph:',  // optional
  ttl: '1d',              // optional
  maxEntries: 10_000,      // optional; -1 disables cap
});

Run this server-side only: browsers cannot open the raw TCP connections a Redis client needs. Pair with an httpTransport so cached completions live next to the engine.

Cosmos DB

Install @inferagraph/cosmosdb for an Azure-native cache backed by a Cosmos DB container. Use this when your stack already runs on Azure and you'd rather not stand up a separate Redis. The factory cosmosCacheProvider({...}) owns the SDK construction; pass endpoint + key (or supply your own CosmosClient via the class form).

import { cosmosCacheProvider } from '@inferagraph/cosmosdb';

// Azure-native cache backed by a Cosmos DB container.
const cache = cosmosCacheProvider({
  endpoint: process.env.COSMOS_ENDPOINT,
  key: process.env.COSMOS_KEY,
  database: 'inferagraph',
  container: 'cache',
});

SQL (Postgres / MySQL / SQLite / MSSQL)

Install @inferagraph/sql for a SQL-backed cache. Useful when your app already has a relational database and you want completions persisted in the same durability domain. The factory sqlCacheProvider({...}) takes a Knex connection config and provisions/uses a single inferagraph_cache table. Pair with provisionSqlSchemas to bring up vector + cache + conversation tables together.

import { sqlCacheProvider } from '@inferagraph/sql';

// Postgres / MySQL / SQLite / MSSQL cache table via Knex.
const cache = sqlCacheProvider({
  dialect: 'postgres',
  connection: process.env.DATABASE_URL,
  table: 'inferagraph_cache',  // optional, default 'inferagraph_cache'
});

Class form (new SqlCacheProvider({ knex })) is the escape hatch when you want to share an existing Knex instance.

CacheProvider contract

Four methods. The engine never assumes a specific backend — implement these and you have a cache. Core 0.9.0 widened the interface: set() now accepts an optional opts.ttlSeconds for per-call expiry, and delete(key) is a first-class method (no longer a prefix sweep).

// Widened in core 0.9.0: per-call ttlSeconds on set(), explicit delete().
interface CacheProvider {
  get(key: string): Promise<string | undefined>;
  set(
    key: string,
    value: string,
    opts?: { ttlSeconds?: number }
  ): Promise<void>;
  delete(key: string): Promise<void>;
  clear(): Promise<void>;
}
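
To make the contract concrete, here is a minimal sketch of a read-through lookup written against those four methods. getOrCompute is a hypothetical helper for illustration and is not how the engine is documented to call the cache. The interface is repeated locally so the snippet stands alone.

```typescript
interface CacheProvider {
  get(key: string): Promise<string | undefined>;
  set(key: string, value: string, opts?: { ttlSeconds?: number }): Promise<void>;
  delete(key: string): Promise<void>;
  clear(): Promise<void>;
}

// Hypothetical helper: return the cached value if present, otherwise
// compute it, store it (optionally with a per-call TTL), and return it.
async function getOrCompute(
  cache: CacheProvider,
  key: string,
  compute: () => Promise<string>,
  ttlSeconds?: number,
): Promise<string> {
  const hit = await cache.get(key);
  if (hit !== undefined) return hit; // an empty string still counts as a hit

  const value = await compute();
  await cache.set(key, value, ttlSeconds === undefined ? undefined : { ttlSeconds });
  return value;
}
```

Comparing against undefined, instead of testing truthiness, matters because get() may legitimately return an empty string.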

In-memory reference impl (inMemoryCacheProvider)

Core 0.9.0 ships inMemoryCacheProvider, a TTL-aware CacheProvider for dev, tests, and single-process deployments. Expiry is lazy: an entry is dropped on the next get() that observes it past its deadline. The construction-time default TTL is overridden by a per-call opts.ttlSeconds on set(). Production implementations live in the storage packages.

import { inMemoryCacheProvider } from '@inferagraph/core/data';

// TTL-aware in-memory CacheProvider. Reference impl for dev / tests.
// Lazy expiry on get(). Construction-time default + per-call override.
const cache = inMemoryCacheProvider({
  ttlSeconds: 3600,   // optional default, applied when set() omits opts.ttlSeconds
});

await cache.set('k1', 'value');                        // uses default 3600s
await cache.set('k2', 'value', { ttlSeconds: 60 });    // per-call override
await cache.delete('k1');
await cache.clear();
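
For readers curious what lazy expiry looks like in practice, the following is a minimal sketch under stated assumptions, not the actual inMemoryCacheProvider source. lazyTtlCache and its injectable now parameter are hypothetical names introduced here so the clock can be controlled in tests.

```typescript
interface CacheProvider {
  get(key: string): Promise<string | undefined>;
  set(key: string, value: string, opts?: { ttlSeconds?: number }): Promise<void>;
  delete(key: string): Promise<void>;
  clear(): Promise<void>;
}

// Sketch only: the shipped implementation may differ internally.
// `now` is an assumption added for testability, not a documented option.
function lazyTtlCache(
  defaults: { ttlSeconds?: number } = {},
  now: () => number = Date.now,
): CacheProvider {
  const entries = new Map<string, { value: string; expiresAt: number }>();

  return {
    async get(key) {
      const entry = entries.get(key);
      if (!entry) return undefined;
      // Lazy expiry: the entry is only removed when a read notices
      // that its deadline has passed. No background timer is involved.
      if (now() >= entry.expiresAt) {
        entries.delete(key);
        return undefined;
      }
      return entry.value;
    },
    async set(key, value, opts) {
      // The per-call TTL wins over the construction-time default.
      const ttl = opts?.ttlSeconds ?? defaults.ttlSeconds;
      const expiresAt = ttl === undefined ? Infinity : now() + ttl * 1000;
      entries.set(key, { value, expiresAt });
    },
    async delete(key) {
      entries.delete(key);
    },
    async clear() {
      entries.clear();
    },
  };
}
```

One consequence of this design is that an expired entry nobody reads again stays in memory until clear() is called, which is why a lazily expiring cache suits short-lived processes better than long-running ones.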

The bounded-size InMemoryLRUCache (used by lruCache) implements the same widened interface and gained delete(key) in 0.9.0 for completeness. Its TTL policy stays construction-time only — the opts.ttlSeconds argument is accepted but ignored, because per-entry expiry interacts awkwardly with bounded-size eviction. Reach for inMemoryCacheProvider when you need real per-call TTLs and lruCache when you need a hard memory cap.
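
The trade-off is easier to see in code. Below is a minimal sketch of a bounded LRU with a single construction-time TTL. It is not the InMemoryLRUCache source: tinyLru and its parameters are hypothetical, and the methods are synchronous for brevity where the real interface returns Promises.

```typescript
// Sketch only, not the InMemoryLRUCache source. `now` is injectable
// purely so the behaviour can be exercised in tests.
function tinyLru(
  maxEntries: number,
  ttlMs: number = Infinity,
  now: () => number = Date.now,
) {
  // A Map iterates in insertion order, so its first key is always
  // the least recently used entry.
  const entries = new Map<string, { value: string; expiresAt: number }>();

  return {
    get(key: string): string | undefined {
      const entry = entries.get(key);
      if (!entry) return undefined;
      entries.delete(key);
      if (now() >= entry.expiresAt) return undefined; // expired: stays deleted
      entries.set(key, entry); // re-insert to mark as most recently used
      return entry.value;
    },
    // opts.ttlSeconds is accepted but ignored, mirroring the policy
    // described above: every entry gets the construction-time TTL.
    set(key: string, value: string, _opts?: { ttlSeconds?: number }): void {
      entries.delete(key);
      entries.set(key, { value, expiresAt: now() + ttlMs });
      if (entries.size > maxEntries) {
        const oldest = entries.keys().next().value;
        if (oldest !== undefined) entries.delete(oldest);
      }
    },
    size: (): number => entries.size,
  };
}
```

With one TTL for all entries, eviction only has to reason about recency. Mixing per-entry deadlines into the same structure would force a choice between evicting the oldest entry and evicting the one closest to expiry.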

Key shape and invalidation

Keys are namespaced by provider id and model id. When you call setProvider() with a different provider or model, the engine drops the old namespace before writing new entries. You never see cross-model bleed.

// Cache keys are composed by the engine. You don't construct
// them yourself; this is the shape so you know what to expect
// when inspecting Redis or designing eviction.

// Completion key
"<providerId>:<modelId>:completion:<hash(prompt)>"

// Embedding key (Tier 2 — cache acts as embedding store)
"<providerId>:<modelId>:embedding:<nodeId>:<hash(text)>"

// Provider switch invalidates everything under the old prefix.
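
As a rough illustration of that shape, the sketch below composes both key types and drops a namespace from a plain Map. Everything here is an assumption made for the example: the engine's hash() is not specified in this page, so SHA-256 from Node's crypto module stands in for it, and completionKey, embeddingKey, and dropNamespace are hypothetical names.

```typescript
import { createHash } from 'node:crypto';

// Assumption: SHA-256 hex stands in for the engine's unspecified hash().
const hash = (text: string): string =>
  createHash('sha256').update(text).digest('hex');

function completionKey(providerId: string, modelId: string, prompt: string): string {
  return `${providerId}:${modelId}:completion:${hash(prompt)}`;
}

function embeddingKey(
  providerId: string,
  modelId: string,
  nodeId: string,
  text: string,
): string {
  return `${providerId}:${modelId}:embedding:${nodeId}:${hash(text)}`;
}

// Remove every key under "<providerId>:<modelId>:" and report how many
// were dropped. A plain Map stands in for the real backing store.
function dropNamespace(
  store: Map<string, string>,
  providerId: string,
  modelId: string,
): number {
  const prefix = `${providerId}:${modelId}:`;
  let removed = 0;
  for (const key of Array.from(store.keys())) {
    if (key.startsWith(prefix)) {
      store.delete(key);
      removed++;
    }
  }
  return removed;
}
```

Because the provider and model ids lead the key, a single prefix match covers completions and embeddings alike, which is what makes invalidation on a provider switch a one-pass operation.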

Custom caches

Wrap any backend — Memcached, Cloudflare KV, DynamoDB, a SQLite file — by implementing the four methods.

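import type { CacheProvider } from '@inferagraph/core/data';

// Placeholder for your Memcached client's type. The method names are
// illustrative; adapt them to the client library you actually use.
interface MemcacheClient {
  get(key: string): Promise<string | null | undefined>;
  set(key: string, value: string, opts?: { ttl?: number }): Promise<unknown>;
  delete(key: string): Promise<unknown>;
  flush(): Promise<unknown>;
}

// Implement the four methods against any backend you like.
function memcacheProvider(client: MemcacheClient): CacheProvider {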
  return {
    async get(key) {
      return (await client.get(key)) ?? undefined;
    },
    async set(key, value, opts) {
      await client.set(key, value, {
        ttl: opts?.ttlSeconds,
      });
    },
    async delete(key) { await client.delete(key); },
    async clear() { await client.flush(); },
  };
}
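
When writing a custom provider, a small smoke test can catch a mis-wired method before the engine does. The sketch below is one possible check, not an official conformance suite. checkCacheProvider is a hypothetical helper, and the interface is repeated locally so the snippet stands alone.

```typescript
interface CacheProvider {
  get(key: string): Promise<string | undefined>;
  set(key: string, value: string, opts?: { ttlSeconds?: number }): Promise<void>;
  delete(key: string): Promise<void>;
  clear(): Promise<void>;
}

// Hypothetical smoke test: exercises the four methods and returns the
// names of any checks that failed (an empty array means all passed).
// It calls clear(), so point it at a scratch instance only.
async function checkCacheProvider(cache: CacheProvider): Promise<string[]> {
  const failures: string[] = [];
  const key = 'contract-check:k';

  await cache.set(key, 'v1');
  if ((await cache.get(key)) !== 'v1') failures.push('get after set');

  await cache.set(key, 'v2', { ttlSeconds: 60 });
  if ((await cache.get(key)) !== 'v2') failures.push('overwrite');

  await cache.delete(key);
  if ((await cache.get(key)) !== undefined) failures.push('delete');

  await cache.set(key, 'v3');
  await cache.clear();
  if ((await cache.get(key)) !== undefined) failures.push('clear');

  return failures;
}
```

The check deliberately does not wait for a TTL to elapse, so it says nothing about expiry behaviour. That part is backend-specific and is better covered by a test with a controllable clock.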