Chat API
Stream conversational answers from your graph: text deltas to your UI, tool calls applied to the graph internally.
Chat with the graph
useInferaGraphChat() returns a chat(message, options) function; each call returns an AsyncIterable of ChatEvents. Text deltas surface to your UI; tool-call events (such as apply_filter, highlight, focus, annotate) are dispatched internally and update the graph without leaking into host code.
import { useInferaGraphChat } from '@inferagraph/core/react';
import { useState, useRef } from 'react';
function ChatBox() {
  const { chat } = useInferaGraphChat();
  const [text, setText] = useState('');
  const abort = useRef<AbortController | null>(null);
  // One conversationId for the lifetime of this component, so multi-turn
  // memory works across sends. Persist it in sessionStorage for a per-tab id.
  const convId = useRef(crypto.randomUUID());
  async function send(message: string) {
    // Cancel any in-flight stream before starting a new one.
    abort.current?.abort();
    abort.current = new AbortController();
    setText('');
    for await (const event of chat(message, {
      signal: abort.current.signal,
      conversationId: convId.current,
    })) {
      if (event.type === 'text') {
        setText(prev => prev + event.delta);
      }
      // Tool calls are handled internally; the graph updates on its own.
    }
  }
  // Minimal placeholder UI so the component renders; swap in your own input.
  return (
    <div>
      <button onClick={() => send("Who was Abraham's wife?")}>Ask</button>
      <p>{text}</p>
    </div>
  );
}
ChatEvent shape
The stream is a discriminated union of typed events. Host code typically inspects text for the answer body and debug for diagnostic badges; tool calls fire on the graph automatically. See Diagnostics for the canonical debug.phase values.
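As a rough sketch of the host-side handling described above, the reducer below folds events into view state: text deltas build the answer, debug phases are collected for badges, and tool calls pass through untouched. The ChatEvent type here is a trimmed local copy of the union in this section, and ChatViewState, initialChatView, and reduceChat are hypothetical names for illustration, not library exports.

```typescript
// Trimmed local copy of the event union. The visual-reset events are omitted,
// and `phase` is typed as string because DebugPhase is not defined here.
type ChatEvent =
  | { type: 'text'; delta: string }
  | { type: 'tool_call'; name: string; args: unknown }
  | { type: 'debug'; phase: string; detail?: unknown }
  | { type: 'done'; reason: 'stop' | 'aborted' | 'error' }
  | { type: 'error'; message: string };

// Hypothetical host-side view state; not a library type.
interface ChatViewState {
  answer: string;
  phases: string[];
  finished: 'stop' | 'aborted' | 'error' | null;
  error: string | null;
}

const initialChatView: ChatViewState = {
  answer: '',
  phases: [],
  finished: null,
  error: null,
};

// Pure reducer: narrowing on `type` gives each branch its own payload.
function reduceChat(state: ChatViewState, event: ChatEvent): ChatViewState {
  switch (event.type) {
    case 'text':
      return { ...state, answer: state.answer + event.delta };
    case 'debug':
      return { ...state, phases: [...state.phases, event.phase] };
    case 'error':
      return { ...state, error: event.message };
    case 'done':
      return { ...state, finished: event.reason };
    default:
      // tool_call: applied to the graph internally, nothing to render here.
      return state;
  }
}

// Usage: fold a recorded stream (the event payloads are made up).
const recordedEvents: ChatEvent[] = [
  { type: 'text', delta: 'Sarah ' },
  { type: 'tool_call', name: 'highlight', args: {} },
  { type: 'text', delta: 'was his wife.' },
  { type: 'done', reason: 'stop' },
];
const finalView = recordedEvents.reduce(reduceChat, initialChatView);
console.log(finalView.answer); // "Sarah was his wife."
```

Keeping the reducer pure means the same function can back a React useReducer call or a transcript logger.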
type ChatEvent =
| { type: 'text'; delta: string }
| { type: 'tool_call'; name: string; args: unknown; spec?: FilterSpec }
| { type: 'clear_visual_state' } // 0.10.0 — fresh canvas (highlights + annotations + filter + camera)
| { type: 'reset_view' } // 0.10.0 — camera home only
| { type: 'clear_annotations' } // 0.10.0 — drop every annotation
| { type: 'debug'; phase: DebugPhase; detail?: unknown; counters?: Record<string, number>; conversationId?: string }
| { type: 'done'; reason: 'stop' | 'aborted' | 'error' }
| { type: 'error'; message: string };
// Tool names dispatched internally:
// apply_filter — narrow visible nodes by attributes / tags / predicate
// highlight — emphasize a set of node ids
// focus — center the camera on one or more nodes
// annotate — attach a note to a node (cleared on next chat)
// set_inferred_visibility — toggle the dashed inferred-edge overlay
// 0.10.0 — host-driven visual-reset events. Parameterless. Servers MAY emit
// them on the wire, but the primary use case is host UI (dispatched via
// useInferaGraphCommands — see Visualization → Host-driven dispatch).
// clear_visual_state — comprehensive "fresh canvas" reset
// reset_view — camera home only (radius preserved)
// clear_annotations — drop every annotation
Provider messages: LLMMessage / streamMessages
Providers expose a streamMessages(messages, opts) method that takes a structured [{role, content}] array. The legacy single-string stream(prompt) is kept for back-compat — new consumers should pass system instructions and user input as separate messages so multi-turn chat works without manual prompt concatenation.
// Structured roles: system instructions stay separate from user input
// end-to-end. Multi-turn chat works without manual prompt concatenation.
type LLMRole = 'system' | 'user' | 'assistant';
type LLMMessage = { role: LLMRole; content: string };
// Every provider implements streamMessages. stream(prompt) is back-compat only.
interface LLMProvider {
  streamMessages?(messages: LLMMessage[], opts?: LLMStreamOptions): AsyncIterable<LLMStreamEvent>;
  stream?(prompt: string, opts?: LLMStreamOptions): AsyncIterable<LLMStreamEvent>;
  complete(request: LLMRequest): Promise<LLMResponse>;
  embed?(texts: string[], opts?: EmbedOptions): Promise<Vector[]>;
}
// Direct provider call: system + user roles passed structurally.
for await (const ev of llm.streamMessages([
  { role: 'system', content: 'You answer in one sentence.' },
  { role: 'user', content: "Who was Abraham's wife?" },
])) {
  if (ev.type === 'text') process.stdout.write(ev.delta);
}
Multi-turn memory: conversationId
Pass conversationId in ChatOptions to opt into multi-turn conversation memory. The engine fetches prior turns from a configured ConversationStore, builds the LLM messages array as [system, ...priorTurns, user], and after the stream closes appends the user turn + assistant turn (plus the retrievedNodeIds) for the next call. Pronoun resolution ("tell me more about him") uses the prior turn's retrieved set as a candidate group.
- When omitted, every call is a one-shot — no prior context is fetched or written.
- Pass a stable per-tab id (for example, crypto.randomUUID() persisted in sessionStorage).
- Default priorTurnLimit is 8; configure via AIEngineConfig.
- Conversation backing store ships separately — inMemoryConversationStore() in @inferagraph/core for dev, RedisConversationStore in @inferagraph/redis for production.
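The message assembly described in this section can be sketched as follows. This is a minimal illustration, not the engine's code: SketchConversationStore, StoredTurn, buildMessages, and recordTurn are hypothetical names, and the sketch assumes each stored user or assistant message counts as one turn toward priorTurnLimit.

```typescript
type LLMRole = 'system' | 'user' | 'assistant';
type LLMMessage = { role: LLMRole; content: string };

// Hypothetical stored-turn shape: one entry per user or assistant message.
interface StoredTurn {
  role: 'user' | 'assistant';
  content: string;
  retrievedNodeIds?: string[];
}

// Hypothetical in-memory stand-in for a ConversationStore.
class SketchConversationStore {
  private byId = new Map<string, StoredTurn[]>();

  // Return at most `limit` of the most recent turns.
  load(conversationId: string, limit: number): StoredTurn[] {
    const all = this.byId.get(conversationId) ?? [];
    return limit > 0 ? all.slice(-limit) : [];
  }

  append(conversationId: string, turns: StoredTurn[]): void {
    const all = this.byId.get(conversationId) ?? [];
    this.byId.set(conversationId, all.concat(turns));
  }
}

// Build [system, ...priorTurns, user]. Without a conversationId the call is
// one-shot: nothing is fetched.
function buildMessages(
  store: SketchConversationStore,
  system: string,
  userMessage: string,
  conversationId?: string,
  priorTurnLimit = 8,
): LLMMessage[] {
  const prior = conversationId ? store.load(conversationId, priorTurnLimit) : [];
  return [
    { role: 'system', content: system },
    ...prior.map((t): LLMMessage => ({ role: t.role, content: t.content })),
    { role: 'user', content: userMessage },
  ];
}

// After the stream closes, append the user turn and the assistant turn
// (with the retrieved node ids) for the next call. One-shot calls write nothing.
function recordTurn(
  store: SketchConversationStore,
  conversationId: string | undefined,
  userMessage: string,
  assistantText: string,
  retrievedNodeIds: string[],
): void {
  if (!conversationId) return;
  store.append(conversationId, [
    { role: 'user', content: userMessage },
    { role: 'assistant', content: assistantText, retrievedNodeIds },
  ]);
}
```

Storing retrievedNodeIds on the assistant turn is what would let a follow-up such as "tell me more about him" look at the previous turn's retrieved set.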
Cancellation
Pass an AbortSignal in the options. When the signal aborts, the stream emits a final done event with reason: 'aborted' and stops. Re-issuing a chat call should always abort the previous controller first.
const controller = new AbortController();
// Cancel mid-stream
setTimeout(() => controller.abort(), 5000);
for await (const event of chat(message, { signal: controller.signal })) {
  // loop terminates with a 'done' event whose reason is 'aborted'
}
Observing chat globally
Pass onChat to <InferaGraph /> to observe every event from any caller — useful for analytics, transcripts, or audit logs.
import { InferaGraph } from '@inferagraph/core/react';
// Top-level <InferaGraph> can also receive an onChat callback
// invoked for every event, regardless of who originated the chat.
<InferaGraph
  data={data}
  llm={llm}
  onChat={(event) => {
    if (event.type === 'tool_call') {
      analytics.track('graph_tool_call', { name: event.name });
    }
  }}
/>
Citations: citationKey + the [[slug|matched-text]] wire
@inferagraph/core@^0.12.0 handles citations server-side. The model writes naturally; after the stream completes, the engine scans the assistant text against every node in the store and rewrites each title occurrence into [[token|matched-text]]. Both segments are required: token is the value of the node attribute named by citationKey (or node.id when that attribute is missing), and matched-text is the model's exact wording, so the rendered anchor preserves casing and articles ("the Fall" keeps its lowercase article, "ADAM" stays uppercase). Every occurrence is rewritten, from the first mention to the last.
0.12.0 hard break. Through 0.11.0 the wire was bare [[slug]] and the prompt asked the model to emit it inline. Production gpt-4o-class models routinely ignored the rule, leaving the host with uncited streaming text; even when they followed it, the result was an anchor adjacent to the entity name ("Adam Adam") on every mention. 0.12.0 makes the engine the single source of truth: the model writes naturally, the engine rewrites every entity-name occurrence in place, and the wire format mandates the matched-text segment so the anchor renders the model's exact casing once. Tokens without the |matched-text portion are not recognized, so hosts upgrading from 0.11.x must parse the new shape.
import { AIEngine } from '@inferagraph/core/data';
// Without citationKey: the engine never injects citations. The model
// writes naturally; the bubble renders the answer as-is.
const engineDefault = new AIEngine(store, queryEngine);
// With citationKey: 'slug', after the model stream completes, the
// engine scans the assistant text against EVERY node in the store and
// rewrites each title occurrence into [[token|matched-text]]:
//   "Cain was Adam's firstborn son."
//   → "[[cain|Cain]] was [[adam|Adam]]'s firstborn son."
// The token is node.attributes[citationKey] (or node.id when missing).
// matched-text is the model's exact casing, passed through to the
// host's renderCitation so the anchor preserves it.
const engineSlugCites = new AIEngine(store, queryEngine, {
  citationKey: 'slug',
});
Resolving citations in the host
Hosts parse [[token|matched-text]] on the frontend and resolve each token against their slug index — see RAG → Citations for the rendering pattern. renderCitation receives both arguments; render matchedText as the anchor's text content so the model's casing wins.
- citationKey is optional. When unset, the engine ships the assistant text untouched — the model writes naturally and no citations are inserted.
- Citations resolve against the whole store, not just the per-turn rerank top-K — entities outside that turn's relevant set still cite when their title appears in the response.
- If the named attribute is missing or non-string on a node, that node's citation falls back to node.id — no crashes on partial data.
- highlight() / focus() tool calls always take the canonical (first-column) id. Only the citation tokens shift.
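To make the rewrite rules above concrete, here is a minimal sketch of a whole-store title scan with the node.id fallback. It is an illustration under assumptions, not the engine's implementation: SketchNode and rewriteCitations are hypothetical names, the title field is assumed to hold the display name, and matching is a simple case-insensitive whole-word regex.

```typescript
// Hypothetical node shape; `title` is assumed to hold the display name.
interface SketchNode {
  id: string;
  title: string;
  attributes: Record<string, unknown>;
}

function escapeRegExp(s: string): string {
  return s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Rewrite every title occurrence into [[token|matched-text]].
// With no citationKey the text is returned untouched.
function rewriteCitations(
  text: string,
  nodes: SketchNode[],
  citationKey?: string,
): string {
  if (!citationKey) return text;

  const byTitle = new Map<string, SketchNode>();
  for (const node of nodes) {
    if (node.title.length > 0) byTitle.set(node.title.toLowerCase(), node);
  }
  if (byTitle.size === 0) return text;

  // Longest titles first so a longer name wins over a shorter one it contains.
  const alternation = Array.from(byTitle.keys())
    .sort((a, b) => b.length - a.length)
    .map(escapeRegExp)
    .join('|');
  const pattern = new RegExp(`\\b(?:${alternation})\\b`, 'gi');

  return text.replace(pattern, (matched) => {
    const node = byTitle.get(matched.toLowerCase());
    if (!node) return matched;
    // Missing or non-string attribute falls back to node.id.
    const attr = node.attributes[citationKey];
    const token = typeof attr === 'string' ? attr : node.id;
    return `[[${token}|${matched}]]`;
  });
}

// Usage, mirroring the example in this section (node ids are made up).
const sketchNodes: SketchNode[] = [
  { id: 'n1', title: 'Cain', attributes: { slug: 'cain' } },
  { id: 'n2', title: 'Adam', attributes: { slug: 'adam' } },
];
console.log(rewriteCitations("Cain was Adam's firstborn son.", sketchNodes, 'slug'));
// "[[cain|Cain]] was [[adam|Adam]]'s firstborn son."
```

A single pass over one combined regex avoids re-scanning text that has already been wrapped, so a token is never rewritten twice.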
Rendering chat text: <ChatText> (0.12.0+)
Through 0.10.1, every host wrote its own inline markdown parser plus a citation tokenizer to render the assistant's streamed text. Core 0.10.2 centralized that into <ChatText>; 0.12.0 changed the wire format to [[token|matched-text]] so the anchor's text content carries the model's exact casing per mention. Hosts supply only a renderCitation(token, matchedText) callback that turns each citation into a link (or any other React node), and override CSS custom properties to style the surrounding text. The architectural rule is "library renders text + structure; host only styles."
Component signature
import { ChatText } from '@inferagraph/core/react';
interface ChatTextProps {
  text: string; // streamed assistant text with markdown + [[token|matched-text]] tokens
  renderCitation?: (token: string, matchedText: string) => React.ReactNode; // host's link renderer; library renders matchedText plain if omitted
  className?: string; // wrapper element class; defaults to 'ig-chat-text'
}
Behavior
- Splits text on the citation regex /\[\[([a-z0-9][a-z0-9_-]*)\|([^\]]+)\]\]/gi. Tokens without the |matched-text segment are not recognized and fall through to plain markdown.
- For each non-citation segment: runs marked.parseInline() with raw HTML escaped (no DOM parser dependency), then renders the result through <span dangerouslySetInnerHTML>. Bold, italic, inline code, and links are preserved; block-level constructs are not — chat answers stay inline.
- For each citation: invokes renderCitation(token, matchedText) when provided; otherwise renders the matched text as plain prose.
- The wrapper is a single <span> with class ig-chat-text by default — pass className to override.
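The splitting step listed above can be sketched with the documented regex. This is an illustrative tokenizer only; Segment and splitCitations are hypothetical names, and the markdown pass over the plain-text segments is left out.

```typescript
// Hypothetical segment type: plain text or a parsed citation.
type Segment =
  | { kind: 'text'; value: string }
  | { kind: 'citation'; token: string; matchedText: string };

function splitCitations(text: string): Segment[] {
  // The citation regex quoted in the Behavior list above.
  const citationRe = /\[\[([a-z0-9][a-z0-9_-]*)\|([^\]]+)\]\]/gi;
  const segments: Segment[] = [];
  let last = 0;
  let match: RegExpExecArray | null;

  while ((match = citationRe.exec(text)) !== null) {
    // Plain text between the previous citation and this one.
    if (match.index > last) {
      segments.push({ kind: 'text', value: text.slice(last, match.index) });
    }
    segments.push({ kind: 'citation', token: match[1], matchedText: match[2] });
    last = match.index + match[0].length;
  }
  // Trailing plain text after the final citation.
  if (last < text.length) {
    segments.push({ kind: 'text', value: text.slice(last) });
  }
  return segments;
}

// Usage: the bare [[eve]] token has no |matched-text part, so it stays text.
const demoSegments = splitCitations("[[cain|Cain]] was [[adam|Adam]]'s son, see [[eve]].");
console.log(demoSegments.length); // 4
```

A renderer would then map citation segments through renderCitation(token, matchedText) and pass the text segments to the inline markdown step.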
Theme tokens
The library ships sensible defaults in themes/default.css and themes/dark.css. Hosts override via the standard CSS custom-property cascade.
/* Library defaults shipped in themes/default.css + themes/dark.css. */
.ig-chat-text strong { font-weight: 600; }
.ig-chat-text em { font-style: italic; }
.ig-chat-text code {
  font-family: monospace;
  padding: 0 0.25em;
  background: var(--ig-code-bg);
  border-radius: 3px;
}
/* Hosts override via the standard CSS custom-property cascade:
   either set --ig-code-bg, or scope rules under .ig-chat-text. */
Worked example
A host wires its slug index and label/type resolvers into renderCitation; everything else (parsing, sanitization, DOM structure) is library-owned.
import { ChatText } from '@inferagraph/core/react';
import { slugResolver, typeResolver } from '@/lib/resolvers';
function ChatBubble({ message }: { message: ChatMessage }) {
  return (
    <div className="bubble">
      <ChatText
        text={message.text}
        renderCitation={(token, matchedText) => {
          const slug = slugResolver(token);
          if (!slug) return <span className="unknown-citation">{matchedText}</span>;
          const type = typeResolver(slug);
          // matchedText carries the model's exact casing (e.g. "the Fall"
          // stays lowercase article); slug feeds the URL.
          return <a href={`/${type}/${slug}`}>{matchedText}</a>;
        }}
      />
    </div>
  );
}
0.10.3+: SSR-safe, no DOM parser dependency
<ChatText> is a 'use client' component, but the only runtime dependency is marked. 0.10.3 dropped isomorphic-dompurify (and its transitive jsdom) by routing raw HTML through a per-instance Marked renderer that escapes <script>, <img onerror>, and other dangerous markup to plain text. Next.js SSR / serverless function runtimes can import @inferagraph/core/react directly with no webpack.externals dance required.
Where the LLM runs
By default, the chat hook runs the AIEngine in-process. For production deployments where keys must stay on the server, pair the hook with an HTTP transport — see Transports for the full pattern and a Next.js example.