Chat API

Stream conversational answers from your graph: text deltas to your UI, tool calls applied to the graph internally.

Chat with the graph

useInferaGraphChat() returns a chat(message) function. Each call returns an AsyncIterable of ChatEvents. Text deltas surface to your UI; tool-call events (apply_filter, highlight, focus, annotate, set_inferred_visibility) are dispatched internally and update the graph without leaking into host code.

import { useInferaGraphChat } from '@inferagraph/core/react';
import { useState, useRef } from 'react';

function ChatBox() {
  const { chat } = useInferaGraphChat();
  const [text, setText] = useState('');
  const abort = useRef<AbortController | null>(null);
  // Keep one conversationId for the lifetime of this component so multi-turn
  // memory works across sends. (Store it in sessionStorage to survive remounts.)
  const convId = useRef(crypto.randomUUID());

  async function send(message: string) {
    abort.current?.abort();
    abort.current = new AbortController();
    setText('');

    for await (const event of chat(message, {
      signal: abort.current.signal,
      conversationId: convId.current,
    })) {
      if (event.type === 'text') {
        setText(prev => prev + event.delta);
      }
      // tool calls handled internally — graph updates on its own
    }
  }
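
  // Render the streamed answer; the button sends a sample question.
  return (
    <div>
      <button onClick={() => send('Who was Abraham?')}>Ask</button>
      <p>{text}</p>
    </div>
  );
}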

ChatEvent shape

The stream is a discriminated union of typed events. Host code typically inspects text for the answer body and debug for diagnostic badges; tool calls fire on the graph automatically. See Diagnostics for the canonical debug.phase values.

type ChatEvent =
  | { type: 'text'; delta: string }
  | { type: 'tool_call'; name: string; args: unknown; spec?: FilterSpec }
  | { type: 'clear_visual_state' }   // 0.10.0 — fresh canvas (highlights + annotations + filter + camera)
  | { type: 'reset_view' }           // 0.10.0 — camera home only
  | { type: 'clear_annotations' }    // 0.10.0 — drop every annotation
  | { type: 'debug'; phase: DebugPhase; detail?: unknown; counters?: Record<string, number>; conversationId?: string }
  | { type: 'done'; reason: 'stop' | 'aborted' | 'error' }
  | { type: 'error'; message: string };

// Tool names dispatched internally:
//   apply_filter           — narrow visible nodes by attributes / tags / predicate
//   highlight              — emphasize a set of node ids
//   focus                  — center the camera on one or more nodes
//   annotate               — attach a note to a node (cleared on next chat)
//   set_inferred_visibility — toggle the dashed inferred-edge overlay

// 0.10.0 — host-driven visual-reset events. Parameterless. Servers MAY emit
// them on the wire, but the primary use case is host UI (dispatched via
// useInferaGraphCommands — see Visualization → Host-driven dispatch).
//   clear_visual_state — comprehensive "fresh canvas" reset
//   reset_view         — camera home only (radius preserved)
//   clear_annotations  — drop every annotation
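
Hosts that want more than the answer body can fold the stream into a small state object. The following is a minimal sketch, assuming the ChatEvent union shown above. The local type mirror is simplified (DebugPhase and FilterSpec are replaced with loose stand-ins), and ChatState, initialChatState, and reduceChatEvent are hypothetical names that are not part of the library.

```typescript
// Simplified local mirror of the ChatEvent union documented above.
// Assumption: DebugPhase is treated as string and FilterSpec as unknown.
type ChatEvent =
  | { type: 'text'; delta: string }
  | { type: 'tool_call'; name: string; args: unknown; spec?: unknown }
  | { type: 'clear_visual_state' }
  | { type: 'reset_view' }
  | { type: 'clear_annotations' }
  | { type: 'debug'; phase: string; detail?: unknown; counters?: Record<string, number>; conversationId?: string }
  | { type: 'done'; reason: 'stop' | 'aborted' | 'error' }
  | { type: 'error'; message: string };

// Hypothetical host-side state: answer text, last debug phase, terminal status.
interface ChatState {
  text: string;
  lastPhase: string | null;
  finished: 'stop' | 'aborted' | 'error' | null;
  error: string | null;
}

const initialChatState: ChatState = { text: '', lastPhase: null, finished: null, error: null };

function reduceChatEvent(state: ChatState, event: ChatEvent): ChatState {
  switch (event.type) {
    case 'text':
      return { ...state, text: state.text + event.delta };
    case 'debug':
      return { ...state, lastPhase: event.phase };
    case 'error':
      return { ...state, error: event.message };
    case 'done':
      return { ...state, finished: event.reason };
    case 'tool_call':
    case 'clear_visual_state':
    case 'reset_view':
    case 'clear_annotations':
      // Graph-side events; this sketch keeps no host state for them.
      return state;
    default: {
      // Exhaustiveness check: stops type-checking if a new variant is unhandled.
      const unhandled: never = event;
      return unhandled;
    }
  }
}
```

Inside a for await loop, the host would call reduceChatEvent once per event and render from the resulting state. The never assignment in the default branch turns a newly added event type into a compile error rather than a silent no-op.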

Provider messages: LLMMessage / streamMessages

Providers expose a streamMessages(messages, opts) method that takes a structured [{role, content}] array. The legacy single-string stream(prompt) is kept for back-compat — new consumers should pass system instructions and user input as separate messages so multi-turn chat works without manual prompt concatenation.

// Structured roles — system instructions stay separate from user input
// end-to-end. Multi-turn chat works without manual prompt concatenation.
type LLMRole    = 'system' | 'user' | 'assistant';
type LLMMessage = { role: LLMRole; content: string };

// Every provider implements streamMessages. stream(prompt) is back-compat only.
interface LLMProvider {
  streamMessages?(messages: LLMMessage[], opts?: LLMStreamOptions): AsyncIterable<LLMStreamEvent>;
  stream?(prompt: string, opts?: LLMStreamOptions): AsyncIterable<LLMStreamEvent>;
  complete(request: LLMRequest): Promise<LLMResponse>;
  embed?(texts: string[], opts?: EmbedOptions): Promise<Vector[]>;
}

// Direct provider call — system + user roles passed structurally.
for await (const ev of llm.streamMessages([
  { role: 'system', content: 'You answer in one sentence.' },
  { role: 'user',   content: "Who was Abraham's wife?" },
])) {
  if (ev.type === 'text') process.stdout.write(ev.delta);
}

Multi-turn memory: conversationId

Pass conversationId in ChatOptions to opt into multi-turn conversation memory. The engine fetches prior turns from a configured ConversationStore, builds the LLM messages array as [system, ...priorTurns, user], and after the stream closes appends the user turn + assistant turn (plus the retrievedNodeIds) for the next call. Pronoun resolution ("tell me more about him") uses the prior turn's retrieved set as a candidate group.

  • When omitted, every call is a one-shot — no prior context is fetched or written.
  • Pass a stable per-tab id (for example, crypto.randomUUID() persisted in sessionStorage).
  • Default priorTurnLimit is 8; configure via AIEngineConfig.
  • Conversation backing store ships separately — inMemoryConversationStore() in @inferagraph/core for dev, RedisConversationStore in @inferagraph/redis for production.
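
The [system, ...priorTurns, user] assembly described above can be sketched as a pure function. This is an illustration only, not the engine's implementation: StoredTurn and buildMessages are hypothetical names, and the sketch assumes priorTurnLimit counts user/assistant pairs and keeps the most recent ones.

```typescript
type LLMRole = 'system' | 'user' | 'assistant';
type LLMMessage = { role: LLMRole; content: string };

// Hypothetical stored turn: one user message plus the assistant's reply.
interface StoredTurn {
  user: string;
  assistant: string;
}

// Assumption: priorTurnLimit counts user/assistant pairs; the newest are kept.
function buildMessages(
  system: string,
  priorTurns: StoredTurn[],
  userMessage: string,
  priorTurnLimit = 8,
): LLMMessage[] {
  // slice(-0) would return the whole array, so a zero limit is handled explicitly.
  const recent = priorTurnLimit > 0 ? priorTurns.slice(-priorTurnLimit) : [];

  const messages: LLMMessage[] = [{ role: 'system', content: system }];
  for (const turn of recent) {
    messages.push({ role: 'user', content: turn.user });
    messages.push({ role: 'assistant', content: turn.assistant });
  }
  messages.push({ role: 'user', content: userMessage });
  return messages;
}
```

With ten stored turns and the default limit of 8, the two oldest turns are dropped and the array holds one system message, sixteen history messages, and the new user message.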

Cancellation

Pass an AbortSignal in the options. When the signal aborts, the stream emits a final done event with reason: 'aborted' and stops. Re-issuing a chat call should always abort the previous controller first.

const controller = new AbortController();

// Cancel mid-stream
setTimeout(() => controller.abort(), 5000);

for await (const event of chat(message, { signal: controller.signal })) {
  // loop terminates with a 'done' event whose reason is 'aborted'
}
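
The "abort the previous controller first" rule can be captured in a small helper so each call site does not have to repeat it. This is a minimal sketch; createLatestOnly is a hypothetical name and not part of the library. It relies only on the standard AbortController API.

```typescript
// Hypothetical helper: hands out a fresh AbortSignal per request and
// aborts the signal it handed out on the previous call.
function createLatestOnly(): () => AbortSignal {
  let current: AbortController | null = null;
  return () => {
    if (current) current.abort(); // cancel the in-flight request, if any
    current = new AbortController();
    return current.signal;
  };
}
```

A host would create one instance per chat surface and pass its result as the signal option, for example chat(message, { signal: nextSignal() }), so only the most recent request keeps streaming.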

Observing chat globally

Pass onChat to <InferaGraph /> to observe every event from any caller — useful for analytics, transcripts, or audit logs.

import { InferaGraph } from '@inferagraph/core/react';

// Top-level <InferaGraph> can also receive an onChat callback
// invoked for every event, regardless of who originated the chat.
<InferaGraph
  data={data}
  llm={llm}
  onChat={(event) => {
    if (event.type === 'tool_call') {
      analytics.track('graph_tool_call', { name: event.name });
    }
  }}
/>

Citations: citationKey + the [[slug|matched-text]] wire

@inferagraph/core@^0.12.0 handles citations server-side. The model writes naturally; after the stream completes, the engine scans the assistant text against every node in the store and rewrites each title occurrence into [[token|matched-text]]. Both segments are required: token is the citation key value (or node.id when the named attribute is missing), and matched-text is the model's exact wording, so the rendered anchor preserves casing and articles ("the Fall" keeps its lowercase article, "ADAM" stays uppercase). Every occurrence is rewritten, whether it is the first mention or the fifth.

0.12.0 hard break. Through 0.11.0 the wire format was a bare [[slug]], and the prompt asked the model to emit it inline. Production gpt-4o-class models routinely ignored the rule, leaving the host with uncited streaming text. Even when they followed it, the anchor landed next to the entity name ("Adam Adam") on every mention. 0.12.0 makes the engine the single source of truth: the model writes naturally, the engine rewrites every entity-name occurrence in place, and the wire format mandates the matched-text segment so the anchor renders the model's exact casing once. Tokens without the |matched-text portion are not recognized, so hosts upgrading from 0.11.x must parse the new shape.

import { AIEngine } from '@inferagraph/core/data';

// Without citationKey: the engine never injects citations. The model
// writes naturally; the bubble renders the answer as-is.
const engineDefault = new AIEngine(store, queryEngine);

// With citationKey: 'slug' — after the model stream completes, the
// engine scans the assistant text against EVERY node in the store and
// rewrites each title occurrence into [[token|matched-text]]:
//   "Cain was Adam's firstborn son."
//   → "[[cain|Cain]] was [[adam|Adam]]'s firstborn son."
// The token is node.attributes[citationKey] (or node.id when missing).
// matched-text is the model's exact casing — passed through to the
// host's renderCitation so the anchor preserves it.
const engineSlugCites = new AIEngine(store, queryEngine, {
  citationKey: 'slug',
});
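
To make the rewrite step concrete, here is a rough sketch of how title occurrences could be turned into [[token|matched-text]]. It is not the engine's actual code. GraphNode, citationToken, and rewriteCitations are hypothetical names, the node shape with a title field is assumed, and the matching strategy (case-insensitive, whole-word, longest title first) is a guess at reasonable behavior.

```typescript
// Hypothetical node shape; the real store's node type may differ.
interface GraphNode {
  id: string;
  title: string;
  attributes: Record<string, unknown>;
}

// Token is attributes[citationKey] when it is a string, otherwise node.id.
function citationToken(node: GraphNode, citationKey: string): string {
  const value = node.attributes[citationKey];
  return typeof value === 'string' ? value : node.id;
}

// Escape characters that have special meaning inside a regular expression.
function escapeRegExp(s: string): string {
  return s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

function rewriteCitations(text: string, nodes: GraphNode[], citationKey: string): string {
  // Index nodes by lowercased title so lookups ignore the model's casing.
  const byTitle = new Map<string, GraphNode>();
  for (const node of nodes) {
    if (node.title.length > 0) byTitle.set(node.title.toLowerCase(), node);
  }
  if (byTitle.size === 0) return text;

  // Longest titles first, so a longer title wins over a shorter one it contains.
  const titles = Array.from(byTitle.keys()).sort((a, b) => b.length - a.length);
  const pattern = new RegExp(`\\b(?:${titles.map(escapeRegExp).join('|')})\\b`, 'gi');

  // Every occurrence is rewritten; matched keeps the model's exact casing.
  return text.replace(pattern, (matched) => {
    const node = byTitle.get(matched.toLowerCase());
    return node ? `[[${citationToken(node, citationKey)}|${matched}]]` : matched;
  });
}
```

Running this over "Cain was Adam's firstborn son." with nodes whose slug attributes are cain and adam reproduces the rewritten string from the comment above. A node without a string slug falls back to its id as the token.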

Resolving citations in the host

Hosts parse [[token|matched-text]] on the frontend and resolve each token against their slug index — see RAG → Citations for the rendering pattern. renderCitation receives both arguments; render matchedText as the anchor's text content so the model's casing wins.

  • citationKey is optional. When unset, the engine ships the assistant text untouched — the model writes naturally and no citations are inserted.
  • Citations resolve against the whole store, not just the per-turn rerank top-K — entities outside that turn's relevant set still cite when their title appears in the response.
  • If the named attribute is missing or non-string on a node, that node's citation falls back to node.id — no crashes on partial data.
  • highlight() / focus() tool calls always take the canonical (first-column) id. Only the citation tokens shift.

Rendering chat text: <ChatText> (0.12.0+)

Through 0.10.1, every host wrote its own inline markdown parser plus a citation tokenizer to render the assistant's streamed text. Core 0.10.2 centralized that into <ChatText>; 0.12.0 changed the wire format to [[token|matched-text]] so the anchor's text content carries the model's exact casing per mention. Hosts supply only a renderCitation(token, matchedText) callback that turns each citation into a link (or any other React node), and override CSS custom properties to style the surrounding text. The architectural rule is "library renders text + structure; host only styles."

Component signature

import { ChatText } from '@inferagraph/core/react';

interface ChatTextProps {
  text: string;                                                                  // streamed assistant text with markdown + [[token|matched-text]] tokens
  renderCitation?: (token: string, matchedText: string) => React.ReactNode;   // host's link renderer; library renders matchedText plain if omitted
  className?: string;                                                             // wrapper element class; defaults to 'ig-chat-text'
}

Behavior

  • Splits text on the citation regex /\[\[([a-z0-9][a-z0-9_-]*)\|([^\]]+)\]\]/gi. Tokens without the |matched-text segment are not recognized and fall through to plain markdown.
  • For each non-citation segment: runs marked.parseInline() with raw HTML escaped (no DOM parser dependency), then renders the result through <span dangerouslySetInnerHTML>. Bold, italic, inline code, and links are preserved; block-level constructs are not — chat answers stay inline.
  • For each citation: invokes renderCitation(token, matchedText) when provided; otherwise renders the matched text as plain prose.
  • The wrapper is a single <span> with class ig-chat-text by default — pass className to override.
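
The splitting step can be illustrated with a small tokenizer built on the citation regex quoted above. This is a sketch of the idea, not the component's source; Segment and splitCitations are hypothetical names.

```typescript
type Segment =
  | { kind: 'markdown'; text: string }
  | { kind: 'citation'; token: string; matchedText: string };

function splitCitations(text: string): Segment[] {
  // Created per call: a shared /g regex would carry lastIndex between calls.
  const citation = /\[\[([a-z0-9][a-z0-9_-]*)\|([^\]]+)\]\]/gi;
  const segments: Segment[] = [];
  let cursor = 0;
  let match: RegExpExecArray | null;

  while ((match = citation.exec(text)) !== null) {
    // Text between the previous citation and this one is plain markdown.
    if (match.index > cursor) {
      segments.push({ kind: 'markdown', text: text.slice(cursor, match.index) });
    }
    segments.push({ kind: 'citation', token: match[1], matchedText: match[2] });
    cursor = match.index + match[0].length;
  }

  // Trailing text after the last citation.
  if (cursor < text.length) {
    segments.push({ kind: 'markdown', text: text.slice(cursor) });
  }
  return segments;
}
```

A bare [[eve]] has no |matched-text segment, so it never matches and stays inside the surrounding markdown segment, consistent with the first bullet above.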

Theme tokens

The library ships sensible defaults in themes/default.css and themes/dark.css. Hosts override via the standard CSS custom-property cascade.

/* Library defaults shipped in themes/default.css + themes/dark.css. */
.ig-chat-text strong { font-weight: 600; }
.ig-chat-text em     { font-style: italic; }
.ig-chat-text code   {
  font-family: monospace;
  padding: 0 0.25em;
  background: var(--ig-code-bg);
  border-radius: 3px;
}

/* Hosts override via the standard CSS custom-property cascade —
   either set --ig-code-bg, or scope rules under .ig-chat-text. */

Worked example

A host wires its slug index and label/type resolvers into renderCitation; everything else (parsing, sanitization, DOM structure) is library-owned.

import { ChatText } from '@inferagraph/core/react';
import { slugResolver, typeResolver } from '@/lib/resolvers';

function ChatBubble({ message }: { message: ChatMessage }) {
  return (
    <div className="bubble">
      <ChatText
        text={message.text}
        renderCitation={(token, matchedText) => {
          const slug = slugResolver(token);
          if (!slug) return <span className="unknown-citation">{matchedText}</span>;
          const type = typeResolver(slug);
          // matchedText carries the model's exact casing (e.g. "the Fall"
          // keeps its lowercase article); slug feeds the URL.
          return <a href={`/${type}/${slug}`}>{matchedText}</a>;
        }}
      />
    </div>
  );
}

0.10.3+ — SSR-safe. No DOM parser dependency

<ChatText> is a 'use client' component, but the only runtime dependency is marked. 0.10.3 dropped isomorphic-dompurify (and its transitive jsdom) by routing raw HTML through a per-instance Marked renderer that escapes <script>, <img onerror>, and other dangerous markup to plain text. Next.js SSR / serverless function runtimes can import @inferagraph/core/react directly with no webpack.externals dance required.
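
The escaping idea itself needs no DOM. The sketch below shows a plain string-level escape of the kind such a renderer could apply to raw HTML; escapeHtml is a hypothetical helper, and this does not reproduce the library's Marked renderer configuration.

```typescript
// Hypothetical helper: neutralizes raw HTML by escaping the five characters
// that are significant in HTML text and attribute values.
function escapeHtml(raw: string): string {
  const entities: Record<string, string> = {
    '&': '&amp;',
    '<': '&lt;',
    '>': '&gt;',
    '"': '&quot;',
    "'": '&#39;',
  };
  return raw.replace(/[&<>"']/g, (ch) => entities[ch]);
}
```

Because the output contains no literal angle brackets, markup such as a script tag or an img with an onerror handler is displayed as text rather than interpreted by the browser.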

Where the LLM runs

By default, the chat hook runs the AIEngine in-process. For production deployments where keys must stay on the server, pair the hook with an HTTP transport — see Transports for the full pattern and a Next.js example.