Conversational RAG

Multi-turn conversations with context-aware responses and full message history.

Coming Soon: Chat endpoints (/v1/chats/*) and SDK methods (.sendMessage(), .streamMessage()) are not available in the V1 release. Use /v1/memories/search and the .search() SDK method for retrieval in the meantime. Conversational RAG will be available in a future release.
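Until then, retrieval goes through the search endpoint named above. As a minimal sketch, the helper below builds a request body for POST /v1/memories/search; the field names (query, userId, limit) are assumptions for illustration and are not confirmed by this page:

```typescript
// Hypothetical request-body builder for POST /v1/memories/search.
// The endpoint path comes from the notice above; the body fields
// (query, userId, limit) are assumptions, not a documented contract.
interface SearchRequestBody {
  query: string;
  userId?: string;
  limit?: number;
}

function buildSearchBody(
  query: string,
  opts: { userId?: string; limit?: number } = {},
): SearchRequestBody {
  if (!query.trim()) {
    throw new Error("query must be non-empty");
  }
  return { query, ...opts };
}

const body = buildSearchBody("password reset steps", { userId: "user_123", limit: 5 });
console.log(JSON.stringify(body));
```

The body would then be sent with your HTTP client of choice (or replaced by the .search() SDK call once you have its signature).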

Chats provide a conversational interface on top of RAG. Create a session, send messages, and get context-aware responses that consider the full conversation history. Each message triggers retrieval + LLM generation, so the assistant always answers from your knowledge base.

How it works

Create a session

Start a chat tied to a user. The session stores all messages.

Send messages

Each message triggers RAG retrieval. The LLM sees the conversation history + retrieved context.

Get context-aware answers

Responses consider both previous messages and your knowledge base — no need to repeat context.
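The three steps above can be sketched end to end. Since the chat SDK is not yet released, the stub client below stands in for it; the method names and shapes mirror the examples on this page, but the stub only echoes messages instead of doing retrieval and generation:

```typescript
// Minimal sketch of the session flow, using a stub in place of the real SDK.
// Method names mirror the examples on this page; behavior is illustrative only.
interface ChatMessage { role: "user" | "assistant"; content: string; }

class StubChats {
  private sessions = new Map<string, ChatMessage[]>();

  // Step 1: create a session tied to a user; it stores all messages.
  async create(opts: { userId: string; title?: string }): Promise<{ id: string }> {
    const id = `chat_${this.sessions.size + 1}`;
    this.sessions.set(id, []);
    return { id };
  }

  // Step 2: each message is appended to history and answered.
  async sendMessage(chatId: string, opts: { message: string }) {
    const history = this.sessions.get(chatId);
    if (!history) throw new Error(`unknown chat: ${chatId}`);
    history.push({ role: "user", content: opts.message });
    // A real backend would run RAG retrieval + LLM generation here.
    const reply: ChatMessage = { role: "assistant", content: `You said: ${opts.message}` };
    history.push(reply);
    return { message: reply, sources: [] };
  }

  // Step 3: the full history is available for context-aware follow-ups.
  async getHistory(chatId: string) {
    return { messages: this.sessions.get(chatId) ?? [] };
  }
}

(async () => {
  const chats = new StubChats();
  const chat = await chats.create({ userId: "user_123", title: "Support Chat" });
  await chats.sendMessage(chat.id, { message: "How do I reset my password?" });
  const history = await chats.getHistory(chat.id);
  console.log(history.messages.length); // 2: the user message plus the stub reply
})();
```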

Create a chat

const chat = await mk.chats.create({
  userId: "user_123",
  title: "Support Chat",
});
 
console.log(chat.id); // "chat_abc123"

Send a message

Send a user message and get a RAG-powered response. The LLM retrieves relevant memories and generates an answer.

const response = await mk.chats.sendMessage(chat.id, {
  message: "How do I reset my password?",
  mode: "balanced",
});
 
console.log(response.message.content);
console.log(response.sources); // memories used to generate the answer

Message parameters

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| message | string | Yes | The user's message |
| mode | string | No | fast, balanced (default), or precise |
| temperature | number | No | LLM temperature (0.0–1.0) |
| maxSources | number | No | Max sources to include in context |
| instructions | string | No | System instructions for the LLM |
| filters | object | No | Filter which memories to retrieve |
| responseFormat | string | No | text (default) or json |
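As a sketch of how these options compose, the helper below applies the documented defaults (mode balanced, responseFormat text) and checks the temperature range from the table. The helper itself is hypothetical, not part of the SDK:

```typescript
// Hypothetical helper that normalizes sendMessage options against the
// documented defaults and validates the temperature range (0.0–1.0).
type Mode = "fast" | "balanced" | "precise";

interface MessageOptions {
  message: string;
  mode?: Mode;
  temperature?: number;
  maxSources?: number;
  instructions?: string;
  filters?: Record<string, unknown>;
  responseFormat?: "text" | "json";
}

function withDefaults(opts: MessageOptions) {
  if (opts.temperature !== undefined && (opts.temperature < 0 || opts.temperature > 1)) {
    throw new Error("temperature must be between 0.0 and 1.0");
  }
  // Caller-supplied values override the defaults.
  return { mode: "balanced" as Mode, responseFormat: "text" as const, ...opts };
}

const resolved = withDefaults({ message: "How do I reset my password?", temperature: 0.2 });
console.log(resolved.mode); // "balanced"
```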

Stream a response

For real-time UIs, stream the response token-by-token:

for await (const event of mk.chats.streamMessage(chat.id, {
  message: "Can you explain step by step?",
})) {
  if (event.event === "text") {
    process.stdout.write(event.data.content);
  }
}
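To show how the event stream composes, the async generator below stands in for streamMessage, yielding events in the shape used above ({ event: "text", data: { content } }); collectText concatenates the text chunks into the full response. The "done" event is an assumption for illustration:

```typescript
// Stand-in for streamMessage: an async generator yielding events in the
// shape shown above. The terminal "done" event is hypothetical.
interface StreamEvent {
  event: string;
  data: { content?: string };
}

async function* fakeStream(chunks: string[]): AsyncGenerator<StreamEvent> {
  for (const content of chunks) {
    yield { event: "text", data: { content } };
  }
  yield { event: "done", data: {} };
}

// Concatenate text chunks, ignoring non-text events.
async function collectText(stream: AsyncIterable<StreamEvent>): Promise<string> {
  let out = "";
  for await (const event of stream) {
    if (event.event === "text" && event.data.content) {
      out += event.data.content;
    }
  }
  return out;
}

(async () => {
  console.log(await collectText(fakeStream(["Step 1: ", "open settings."])));
  // "Step 1: open settings."
})();
```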

See the Streaming guide for React hooks and Next.js integration.

Get chat history

Retrieve the full message history for a chat session.

const history = await mk.chats.getHistory(chat.id);
for (const msg of history.messages) {
  console.log(`${msg.role}: ${msg.content}`);
}
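For display, the messages returned by getHistory can be rendered into a transcript. A small hypothetical formatter, assuming the { role, content } shape used in the loop above:

```typescript
// Hypothetical formatter for the { role, content } message shape
// returned by getHistory in the example above.
interface HistoryMessage {
  role: string;
  content: string;
}

function formatTranscript(messages: HistoryMessage[]): string {
  return messages.map((m) => `${m.role}: ${m.content}`).join("\n");
}

const transcript = formatTranscript([
  { role: "user", content: "How do I reset my password?" },
  { role: "assistant", content: "Open Settings and choose Reset Password." },
]);
console.log(transcript);
```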

List chat sessions

const chats = await mk.chats.list({ userId: "user_123", limit: 20 });
for (const c of chats.data) {
  console.log(`${c.id}: ${c.title}`);
}

Delete a chat

Deletes the chat session and all its messages.

await mk.chats.delete(chat.id);

Chat vs Query

| Feature | Query | Chat |
| --- | --- | --- |
| Conversation history | No | Yes |
| Multi-turn context | No | Yes |
| Streaming | Yes | Yes |
| User scoping | Per-request | Per-session |
| Use case | Single Q&A | Ongoing conversation |

Use query for one-shot questions. Use chats when users need to ask follow-up questions and the assistant should remember previous messages.
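The rule of thumb above can be written as a tiny routing helper; the helper and its inputs are hypothetical, shown only to make the decision explicit:

```typescript
// Hypothetical router for the rule of thumb above: one-shot questions go
// to query, follow-ups or existing sessions go to chat.
function chooseInterface(hasSession: boolean, expectsFollowUps: boolean): "query" | "chat" {
  return hasSession || expectsFollowUps ? "chat" : "query";
}

console.log(chooseInterface(false, false)); // "query"
console.log(chooseInterface(false, true)); // "chat"
```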
