Embeddings
Three tiers of embedding storage with content-hash provenance. Pick the one that matches your traffic and budget.
Embedding storage
Semantic search needs vectors somewhere. InferaGraph supports three storage tiers behind a single contract — start simple, upgrade in place. Embeddings are generated on demand the first time a node is searched semantically, then reused indefinitely until the model or the source text changes.
- Tier 1 — None. Keyword search only. Zero LLM calls for embeddings.
- Tier 2 — Cache as store. Pass a cache with a provider that supports embed(). Embeddings live next to completions.
- Tier 3 — Dedicated store. Pass embeddingStore for a vector-native backend with its own eviction policy.
Tier 1 — None
Default. useInferaGraphSearch() falls back to keyword scoring (name 10, alias 3, tag 2, content 1). Free, fast, predictable — and often plenty for single-token lookups.
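The weights above can be read as a simple additive score. The sketch below is illustrative only: the `KeywordNode` shape, the `keywordScore` helper, and the case-insensitive substring matching are assumptions made for this example, not InferaGraph's actual implementation. Only the four weights come from this page.

```typescript
// Hypothetical node shape for illustration; not InferaGraph's real type.
interface KeywordNode {
  name: string;
  aliases: string[];
  tags: string[];
  content: string;
}

// Weights quoted in the text: name 10, alias 3, tag 2, content 1.
const WEIGHTS = { name: 10, alias: 3, tag: 2, content: 1 } as const;

// Assumption: a field "matches" when it contains the query, ignoring case.
function keywordScore(node: KeywordNode, query: string): number {
  const q = query.trim().toLowerCase();
  if (q === '') return 0;
  const has = (s: string): boolean => s.toLowerCase().includes(q);

  let score = 0;
  if (has(node.name)) score += WEIGHTS.name;
  if (node.aliases.some(has)) score += WEIGHTS.alias;
  if (node.tags.some(has)) score += WEIGHTS.tag;
  if (has(node.content)) score += WEIGHTS.content;
  return score;
}

const example: KeywordNode = {
  name: 'Ada Lovelace',
  aliases: ['Augusta Ada King'],
  tags: ['mathematician'],
  content: 'Ada wrote notes on the Analytical Engine.',
};

// Matches name (10), one alias (3), and content (1).
console.log(keywordScore(example, 'ada')); // 14
```

Under this reading, a hit on the name outweighs hits on every other field combined, which is why keyword search tends to work well for short, name-like queries.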
// Tier 1 — keyword search only, zero embedding calls
<InferaGraph data={data} llm={llm} />
Tier 2 — Cache as store
Pass any CacheProvider; the engine writes embeddings under composite keys and re-runs cosine similarity on reads. This is the easiest upgrade — switch between lruCache and @inferagraph/redis without touching anything else.
import { lruCache } from '@inferagraph/core/data';
// Tier 2 — reuse the response cache as embedding storage.
// Easiest upgrade path. Embeddings live alongside completions
// and are evicted by the same LRU/TTL rules.
<InferaGraph
  data={data}
  llm={llm}
  cache={lruCache({ maxEntries: 1000 })}
/>
Caveat: similarity is computed in-process, so very large vector sets benefit from Tier 3. For graphs with a few thousand nodes, Tier 2 is usually sufficient.
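To make the caveat concrete, here is a minimal sketch of what in-process similarity search amounts to: a brute-force scan that scores every stored vector against the query. The `cosineSimilarity` and `topK` helpers are hypothetical names for this example, and the engine's real code may differ. The cost grows linearly with the number of stored vectors, which is why very large sets are better served by a dedicated store.

```typescript
// Cosine similarity between two equal-length vectors.
// Returns 0 when either vector has zero magnitude.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error('vector length mismatch');
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

interface ScoredEntry {
  key: string;
  score: number;
}

// Brute-force scan: every stored vector is scored on every query,
// so the work is proportional to (entries x dimensions).
function topK(query: number[], entries: Map<string, number[]>, k: number): ScoredEntry[] {
  const scored: ScoredEntry[] = [];
  entries.forEach((vector, key) => {
    scored.push({ key, score: cosineSimilarity(query, vector) });
  });
  return scored.sort((x, y) => y.score - x.score).slice(0, k);
}
```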
Tier 3 — Dedicated EmbeddingStore
An EmbeddingStore is a typed contract that decouples vector storage from the response cache. Core ships a built-in inMemoryEmbeddingStore(). To wrap your favorite vector database, implement the contract directly.
import { inMemoryEmbeddingStore } from '@inferagraph/core/data';
// Tier 3 — dedicated EmbeddingStore. Same API, but separate
// from the completion cache so eviction policies can differ.
<InferaGraph
  data={data}
  llm={llm}
  embeddingStore={inMemoryEmbeddingStore()}
/>
A vector-native external store (e.g. a Redis Search adapter) is on the roadmap; track progress in the core repo.
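Until such an adapter exists, a custom store has to be written against the contract described in the EmbeddingStore contract section of this page. The following is a rough sketch of a Map-backed implementation, meant only to show the shape of the work. The `EmbeddingStore` and `StoredEmbedding` types are restated locally so the block is self-contained. The `EmbeddingHit` shape, the `MapEmbeddingStore` class name, and the default `topK` of 10 are assumptions, since this page does not define them.

```typescript
// Restated from the contract on this page.
interface StoredEmbedding {
  nodeId: string;
  vector: number[];
  model: string;
  modelVersion: string;
  sourceHash: string;
  createdAt: number;
}

// Assumption: the page does not show EmbeddingHit; this shape is a guess.
interface EmbeddingHit {
  key: string;
  embedding: StoredEmbedding;
  score: number;
}

interface EmbeddingStore {
  get(key: string): Promise<StoredEmbedding | undefined>;
  set(key: string, value: StoredEmbedding): Promise<void>;
  delete(key: string): Promise<void>;
  listForNode(nodeId: string): Promise<StoredEmbedding[]>;
  search(
    vector: number[],
    opts?: { topK?: number; minScore?: number },
  ): Promise<EmbeddingHit[]>;
}

// Cosine similarity; returns 0 for zero-magnitude vectors.
function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let na = 0;
  let nb = 0;
  for (let i = 0; i < Math.min(a.length, b.length); i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  const denom = Math.sqrt(na) * Math.sqrt(nb);
  return denom === 0 ? 0 : dot / denom;
}

// Hypothetical Map-backed store. A real adapter would forward these
// calls to the vector database instead of scanning in memory.
class MapEmbeddingStore implements EmbeddingStore {
  private readonly entries = new Map<string, StoredEmbedding>();

  async get(key: string): Promise<StoredEmbedding | undefined> {
    return this.entries.get(key);
  }

  async set(key: string, value: StoredEmbedding): Promise<void> {
    this.entries.set(key, value);
  }

  async delete(key: string): Promise<void> {
    this.entries.delete(key);
  }

  async listForNode(nodeId: string): Promise<StoredEmbedding[]> {
    return Array.from(this.entries.values()).filter((e) => e.nodeId === nodeId);
  }

  async search(
    vector: number[],
    opts: { topK?: number; minScore?: number } = {},
  ): Promise<EmbeddingHit[]> {
    // Assumed defaults: top 10 hits, no score floor.
    const { topK = 10, minScore = -1 } = opts;
    const hits: EmbeddingHit[] = [];
    this.entries.forEach((embedding, key) => {
      const score = cosine(vector, embedding.vector);
      if (score >= minScore) hits.push({ key, embedding, score });
    });
    return hits.sort((x, y) => y.score - x.score).slice(0, topK);
  }
}
```

An instance of such a class would presumably be passed as the `embeddingStore` prop in place of `inMemoryEmbeddingStore()`.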
EmbeddingStore contract
Five methods, all async. Implement them and you can plug in any vector store: pgvector, Pinecone, Qdrant, a remote service, anything.
interface EmbeddingStore {
  // Look up by composite key: nodeId + ':' + sourceHash
  get(key: string): Promise<StoredEmbedding | undefined>;

  // Persist a new embedding with provenance metadata
  set(key: string, value: StoredEmbedding): Promise<void>;

  // Remove a stale entry (model version mismatch, content change)
  delete(key: string): Promise<void>;

  // Iterate all stored vectors for a given node (rare; used by
  // cleanup tooling)
  listForNode(nodeId: string): Promise<StoredEmbedding[]>;

  // Cosine-similarity search across all stored vectors
  search(
    vector: number[],
    opts?: { topK?: number; minScore?: number },
  ): Promise<EmbeddingHit[]>;
}

interface StoredEmbedding {
  nodeId: string;
  vector: number[];
  model: string;        // e.g. 'text-embedding-3-small'
  modelVersion: string; // provider-stamped version
  sourceHash: string;   // hash of the embedded text
  createdAt: number;
}
Provenance and invalidation
Every stored embedding carries the model id, model version, and a hash of the source text. The engine never trusts a stored vector blindly — on every read it checks that the model and content hash still match. If anything has drifted, the entry is dropped and regenerated.
// When AIEngine generates an embedding it stamps:
// - model — the embedding model name
// - modelVersion — provider-reported version (or model name)
// - sourceHash — SHA of the canonicalized embedding text
//
// On read, the engine verifies the stored hash matches the
// current node text. If the node was edited or the embedding
// model changed, the entry is invalidated and regenerated.
// Unchanged content is never re-embedded.
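The read-time check described above can be sketched in a few lines. This is an illustration under stated assumptions, not the engine's code: the page says only "SHA" and "canonicalized", so the choice of SHA-256 and the trim-and-collapse-whitespace canonicalization are guesses, and `canonicalize`, `sourceHash`, `embeddingKey`, and `isFresh` are hypothetical helper names.

```typescript
import { createHash } from 'node:crypto';

// Restated from the contract on this page.
interface StoredEmbedding {
  nodeId: string;
  vector: number[];
  model: string;
  modelVersion: string;
  sourceHash: string;
  createdAt: number;
}

// Assumption: canonicalization trims and collapses whitespace.
// The page does not specify the actual rules.
function canonicalize(text: string): string {
  return text.trim().replace(/\s+/g, ' ');
}

// Assumption: SHA-256, hex-encoded. The page only says "SHA".
function sourceHash(text: string): string {
  return createHash('sha256').update(canonicalize(text), 'utf8').digest('hex');
}

// Composite key as documented in the contract: nodeId + ':' + sourceHash.
function embeddingKey(nodeId: string, hash: string): string {
  return nodeId + ':' + hash;
}

interface CurrentModel {
  model: string;
  modelVersion: string;
}

// A stored vector is reusable only if the model, the model version,
// and the hash of the current node text all still match.
function isFresh(
  stored: StoredEmbedding,
  currentText: string,
  current: CurrentModel,
): boolean {
  return (
    stored.model === current.model &&
    stored.modelVersion === current.modelVersion &&
    stored.sourceHash === sourceHash(currentText)
  );
}
```

When `isFresh` returns false, the caller would delete the stale entry and regenerate the embedding. Because the hash is taken over canonicalized text, a whitespace-only edit under this particular canonicalization would not trigger a re-embed.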