MemoryKit

Build a knowledge base

End-to-end guide: centralize documents from multiple sources, enable hybrid search, and keep content fresh.

Build an internal knowledge base that lets your team search and ask questions across all your documentation — Notion pages, Confluence wikis, Google Docs, and PDFs — from a single interface.

Prerequisites

  • A MemoryKit API key (get one here)
  • Node.js 18+ or Python 3.8+
  • Documents to ingest (any format: PDF, DOCX, XLSX, TXT, Markdown, HTML)

Step 1: Ingest documents from multiple sources

Tag each document with its source so you can filter later.

import { MemoryKit } from "memorykit";
import fs from "fs";
 
const mk = new MemoryKit({ apiKey: process.env.MEMORYKIT_API_KEY! });
 
// Ingest Notion pages as text
const notionPages = await fetchNotionPages(); // your function
await mk.memories.batchIngest({
  items: notionPages.map((page) => ({
    content: page.markdown,
    title: page.title,
    tags: ["knowledge-base"],
    metadata: { source: "notion", page_id: page.id, team: page.team },
  })),
});
 
// Upload PDF handbooks
for (const file of fs.readdirSync("./handbooks").filter((f) => f.endsWith(".pdf"))) {
  await mk.memories.upload({
    file: new Blob([fs.readFileSync(`./handbooks/${file}`)], { type: "application/pdf" }),
    title: file.replace(".pdf", ""),
    tags: ["knowledge-base", "handbook"],
    metadata: { source: "local", department: "engineering" },
  });
}
 
console.log("All documents ingested");

Step 2: Build the search API

Hybrid search returns ranked results. Feed these into your own LLM if you need answer generation.

import express from "express";
 
const app = express();
app.use(express.json());
 
// Search — returns ranked snippets
app.get("/api/search", async (req, res) => {
  const results = await mk.memories.search({
    query: req.query.q as string,
    limit: 10,
    tags: "knowledge-base",
  });
  res.json(results);
});
 
app.listen(3000);
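Clients hit this route with a URL-encoded query string. A minimal sketch of a client-side helper for building that URL (searchUrl is our name, not part of MemoryKit):

```typescript
// Build the query URL the /api/search route above expects.
// encodeURIComponent keeps multi-word queries safe.
function searchUrl(base: string, q: string): string {
  return `${base}/api/search?q=${encodeURIComponent(q)}`;
}
```

For example, `fetch(searchUrl("http://localhost:3000", "deploy checklist"))` queries the server started above.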

Step 3: Filter by team or source

Use tag and date filters to scope queries to specific teams or document sources.

// Only search engineering docs
const results = await mk.memories.search({
  query: "How do we deploy to production?",
  tags: "knowledge-base,engineering",
  limit: 10,
});
 
// Only search product docs created this year
const productResults = await mk.memories.search({
  query: "Q3 roadmap",
  tags: "product",
  createdAfter: "2025-01-01T00:00:00Z",
});
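Because each memory carries the source metadata set in Step 1, you can also narrow results after the fact on the client. A sketch, assuming each search result exposes the metadata it was ingested with (check your response shape); bySource is a hypothetical helper:

```typescript
// Keep only results whose ingest-time metadata matches a given source
// (e.g. "notion" vs "local", as tagged in Step 1).
function bySource<T extends { metadata?: { source?: string } }>(
  items: T[],
  source: string
): T[] {
  return items.filter((item) => item.metadata?.source === source);
}
```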

Step 4: Keep content fresh

Set up a sync pipeline that re-ingests updated documents. Use reprocess to re-chunk and re-embed a memory after its content changes, without re-uploading the file.

// Daily sync: update changed Notion pages
async function syncNotionPages() {
  const updatedPages = await fetchNotionPages({ since: yesterday() });
 
  for (const page of updatedPages) {
    // Find existing memory by metadata
    const existing = await mk.memories.list({
      // filter by metadata.page_id to find the right memory
    });
 
    if (existing.data.length > 0) {
      // Update existing memory
      await mk.memories.update(existing.data[0].id, {
        content: page.markdown,
        title: page.title,
      });
      // Re-process to re-chunk and re-embed
      await mk.memories.reprocess(existing.data[0].id);
    } else {
      // Create new memory
      await mk.memories.create({
        content: page.markdown,
        title: page.title,
        tags: ["knowledge-base"],
        metadata: { source: "notion", page_id: page.id },
      });
    }
  }
}
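The snippet above leaves two pieces to you. One possible yesterday() helper, plus the simplest way to run the sync daily (an assumption: a cron job or scheduled serverless function is usually a better fit in production than a long-lived process):

```typescript
// ISO-8601 timestamp for 24 hours ago, suitable as a `since` filter.
function yesterday(): string {
  return new Date(Date.now() - 24 * 60 * 60 * 1000).toISOString();
}

// Run the sync once a day from a long-lived process:
// setInterval(syncNotionPages, 24 * 60 * 60 * 1000);
```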

Use webhooks to get notified when reprocessing completes: listen for memory.completed events.
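A minimal sketch of such a listener, assuming the webhook POSTs a JSON body shaped like { type, data: { id } } — confirm the exact schema against your webhook settings. handleMemoryKitWebhook is our name; mount it on the Express app from Step 2, e.g. app.post("/webhooks/memorykit", handleMemoryKitWebhook):

```typescript
// Assumed event shape — verify against the actual webhook payload.
type WebhookEvent = { type: string; data: { id: string } };

function handleMemoryKitWebhook(
  req: { body: WebhookEvent },
  res: { sendStatus: (code: number) => void }
): void {
  if (req.body.type === "memory.completed") {
    // Reprocessing finished: safe to invalidate caches, notify users, etc.
    console.log(`Memory ${req.body.data.id} is ready to search`);
  }
  res.sendStatus(200); // acknowledge promptly so the delivery isn't retried
}
```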


Summary

What             How
Ingest text      batchIngest() with source metadata
Ingest files     upload() for PDF, DOCX, XLSX
Search           search() — ranked results with relevance scores
Filter by team   tags and createdAfter / createdBefore params
Keep fresh       update() + reprocess() on schedule

Key features used: Batch ingest, File upload, Search, Reprocess
