Conversational RAG
Multi-turn conversations with context-aware responses and full message history.
Coming soon: Chat endpoints (`/v1/chats/*`) and SDK methods (`.sendMessage()`, `.streamMessage()`) are not available in the V1 release. Until then, use `/v1/memories/search` and the `.search()` SDK method for retrieval-based search. Conversational RAG will be available in a future release.
Chats provide a conversational interface on top of RAG. Create a session, send messages, and get context-aware responses that consider the full conversation history. Each message triggers retrieval + LLM generation, so the assistant always answers from your knowledge base.
How it works
Create a session
Start a chat tied to a user. The session stores all messages.
Send messages
Each message triggers RAG retrieval. The LLM sees the conversation history + retrieved context.
Get context-aware answers
Responses draw on both the previous messages and your knowledge base, so users do not need to repeat context.
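The three steps above can be sketched with a small in-memory model. This is an illustration only, not the SDK: `createSession`, `retrieve`, `buildPrompt`, and `recordTurn` are hypothetical names, and the keyword match is a stand-in for real RAG retrieval.

```typescript
// Hypothetical in-memory model of a chat session; none of these names come from the SDK.
type Role = "user" | "assistant";

interface ChatMessage {
  role: Role;
  content: string;
}

interface ChatSession {
  id: string;
  userId: string;
  messages: ChatMessage[];
}

// Step 1: create a session tied to a user. The session stores all messages.
function createSession(id: string, userId: string): ChatSession {
  return { id, userId, messages: [] };
}

// Stand-in for retrieval: naive keyword overlap against a tiny knowledge base.
function retrieve(query: string, knowledgeBase: string[], maxSources = 2): string[] {
  const words = query.toLowerCase().split(/\W+/).filter((w) => w.length > 3);
  return knowledgeBase
    .filter((doc) => words.some((w) => doc.toLowerCase().includes(w)))
    .slice(0, maxSources);
}

// Step 2: each message triggers retrieval; the prompt combines context and history.
function buildPrompt(session: ChatSession, message: string, knowledgeBase: string[]): string {
  const context = retrieve(message, knowledgeBase);
  const history = session.messages.map((m) => `${m.role}: ${m.content}`);
  return ["Context:", ...context, "History:", ...history, `user: ${message}`].join("\n");
}

// Step 3: store both turns so that follow-up messages can see them.
function recordTurn(session: ChatSession, message: string, answer: string): void {
  session.messages.push({ role: "user", content: message });
  session.messages.push({ role: "assistant", content: answer });
}
```

In this sketch a follow-up question's prompt contains the earlier turns, which is the property that lets the assistant answer without the user repeating context.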
Create a chat
Send a message
Send a user message and get a RAG-powered response. The service retrieves relevant memories, and the LLM generates an answer from them.
Message parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| `message` | string | Yes | The user's message |
| `mode` | string | No | `fast`, `balanced` (default), or `precise` |
| `temperature` | number | No | LLM temperature (0.0–1.0) |
| `maxSources` | number | No | Max sources to include in context |
| `instructions` | string | No | System instructions for the LLM |
| `filters` | object | No | Filter which memories to retrieve |
| `responseFormat` | string | No | `text` (default) or `json` |
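As a rough sketch, the parameters in the table could be modeled as a TypeScript type with client-side validation and defaults. The field names, allowed values, and defaults come from the table. The `MessageParams` type, the `normalizeMessageParams` helper, the shape of `filters`, and the rule that `maxSources` must be a positive integer are assumptions made for illustration, not part of the SDK.

```typescript
// Allowed values taken from the parameter table.
type Mode = "fast" | "balanced" | "precise";
type ResponseFormat = "text" | "json";

// Hypothetical type name; the real SDK may expose a different shape.
interface MessageParams {
  message: string;
  mode?: Mode;
  temperature?: number;
  maxSources?: number;
  instructions?: string;
  filters?: Record<string, unknown>; // assumed shape; the table only says "object"
  responseFormat?: ResponseFormat;
}

// Validates the documented ranges and fills in the documented defaults.
function normalizeMessageParams(params: MessageParams): MessageParams {
  if (params.message.trim() === "") {
    throw new Error("message is required");
  }
  const t = params.temperature;
  if (t !== undefined && (t < 0 || t > 1)) {
    throw new RangeError("temperature must be between 0.0 and 1.0");
  }
  const n = params.maxSources;
  // Assumption: maxSources is a positive integer.
  if (n !== undefined && (!Number.isInteger(n) || n < 1)) {
    throw new RangeError("maxSources must be a positive integer");
  }
  return {
    ...params,
    mode: params.mode ?? "balanced",
    responseFormat: params.responseFormat ?? "text",
  };
}
```

Validating before sending is a design choice that surfaces mistakes such as an out-of-range `temperature` locally instead of as a server error.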
Stream a response
For real-time UIs, stream the response token-by-token:
See the Streaming guide for React hooks and Next.js integration.
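To illustrate token-by-token consumption, here is a minimal sketch that accumulates tokens from an async iterable and reports the partial answer after each one. It does not call `.streamMessage()`, whose return type is not documented here. `fakeTokenStream` and `renderStream` are hypothetical helpers, and the assumption is that a stream can be consumed as an `AsyncIterable<string>`.

```typescript
// Hypothetical token source standing in for a streamed response.
async function* fakeTokenStream(text: string): AsyncGenerator<string> {
  // Split into words, keeping trailing whitespace attached to each token.
  for (const token of text.match(/\S+\s*/g) ?? []) {
    yield token;
  }
}

// Appends each token as it arrives and hands the partial answer to the UI callback.
async function renderStream(
  tokens: AsyncIterable<string>,
  onPartial: (partial: string) => void,
): Promise<string> {
  let answer = "";
  for await (const token of tokens) {
    answer += token;
    onPartial(answer);
  }
  return answer;
}
```

In a UI, `onPartial` would typically update component state so the answer grows on screen while generation is still in progress.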
Get chat history
Retrieve the full message history for a chat session.
List chat sessions
Delete a chat
Deletes the chat session and all its messages.
Chat vs Query
| Feature | Query | Chat |
|---|---|---|
| Conversation history | No | Yes |
| Multi-turn context | No | Yes |
| Streaming | Yes | Yes |
| User scoping | Per-request | Per-session |
| Use case | Single Q&A | Ongoing conversation |
Use query for one-shot questions. Use chats when users need to ask follow-up questions and the assistant should remember previous messages.