Limits & Quotas
Plan limits, rate limits, and file size constraints for the MemoryKit API.
MemoryKit enforces limits at three levels: plan quotas (monthly/total), rate limits (per-minute), and request constraints (per-request).
Plan quotas
| | Free | Starter | Pro | Enterprise |
|---|---|---|---|---|
| Monthly queries | 100 | 5,000 | 50,000 | Unlimited |
| Total documents | 10 | 500 | 5,000 | Unlimited |
| Storage | 50 MB | 500 MB | 5 GB | Unlimited |
| Price | $0 | — | — | Custom |
Query quota resets on the 1st of each month (UTC). Document quota is a hard count across all memories in your project.
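To show remaining time until the query quota resets, a client can compute the next reset instant locally. The sketch below assumes the reset happens at 00:00 UTC on the 1st; the text above states only the date and time zone. The function name `next_query_quota_reset` is hypothetical and not part of any MemoryKit SDK.

```python
from datetime import datetime, timezone


def next_query_quota_reset(now: datetime) -> datetime:
    """Return the start of the next calendar month in UTC.

    `now` should be timezone-aware. Assumption: the quota resets at
    midnight UTC on the 1st of the month.
    """
    now_utc = now.astimezone(timezone.utc)
    if now_utc.month == 12:
        # December rolls over into January of the following year.
        return datetime(now_utc.year + 1, 1, 1, tzinfo=timezone.utc)
    return datetime(now_utc.year, now_utc.month + 1, 1, tzinfo=timezone.utc)


if __name__ == "__main__":
    reset = next_query_quota_reset(datetime.now(timezone.utc))
    print("Query quota resets at", reset.isoformat())
```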
Check your usage
What happens when you hit a quota
- Query quota exceeded: Query and chat message endpoints return `429` with a message. Search still works.
- Document quota exceeded: Ingest, upload, and batch endpoints return `429`. Existing memories remain searchable.
- Storage exceeded: Upload and ingest return `429`. Delete old memories to free space.
Rate limits
Rate limits apply per API key and per endpoint, measured over a 60-second sliding window.
| Endpoint | Requests/min |
|---|---|
| `POST /v1/memories/query` | 30 |
| `GET /v1/memories/search` | 60 |
| `POST /v1/memories` | 100 |
| `POST /v1/memories/upload` | 100 |
| `POST /v1/memories/batch` | 100 |
| `POST /v1/chats` | 60 |
| `POST /v1/chats/:id/messages` | 30 |
| `POST /v1/chats/:id/messages/stream` | 30 |
| `POST /v1/users/:id/events` | 100 |
| All GET endpoints | 300 |
| All DELETE endpoints | 100 |
| All PUT endpoints | 100 |
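A client that wants to stay under a per-endpoint limit can keep its own sliding-window counter. The following is a minimal client-side sketch; it is not how the MemoryKit server necessarily counts requests, and the class name `SlidingWindowLimiter` is hypothetical. The `clock` parameter exists only so the behavior can be tested without waiting.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `max_requests` calls in any `window_seconds` span."""

    def __init__(self, max_requests: int, window_seconds: float = 60.0,
                 clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._clock = clock
        self._sent = deque()  # timestamps of requests still inside the window

    def try_acquire(self) -> bool:
        """Record a request and return True, or return False if over the limit."""
        now = self._clock()
        # Discard timestamps that have aged out of the window.
        while self._sent and now - self._sent[0] >= self.window_seconds:
            self._sent.popleft()
        if len(self._sent) < self.max_requests:
            self._sent.append(now)
            return True
        return False


if __name__ == "__main__":
    # One limiter per endpoint, e.g. 30 requests/min for POST /v1/memories/query.
    query_limiter = SlidingWindowLimiter(max_requests=30)
    print(query_limiter.try_acquire())
```

Because limits are per endpoint, a caller would typically keep one limiter instance for each endpoint it uses.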
Rate limit headers
Every response includes rate limit headers:
| Header | Description |
|---|---|
| `X-RateLimit-Limit` | Max requests per window |
| `X-RateLimit-Remaining` | Remaining requests in current window |
| `X-RateLimit-Reset` | Unix timestamp when limit resets |
| `Retry-After` | Seconds to wait (only on 429 responses) |
The SDKs automatically retry on 429 with exponential backoff. You don't need to implement retry logic yourself.
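If you call the API over raw HTTP instead of through an SDK, the headers above are enough to decide how long to pause. The sketch below is one possible approach, not the SDKs' actual logic: it prefers `Retry-After`, falls back to `X-RateLimit-Reset`, and otherwise waits one second (that fallback value is an assumption). The function name `seconds_until_retry` is hypothetical, and it expects header names with the exact casing shown in the table.

```python
import time
from typing import Mapping, Optional


def seconds_until_retry(status: int, headers: Mapping[str, str],
                        now: Optional[float] = None) -> float:
    """Return how many seconds to wait before retrying a request."""
    if status != 429:
        return 0.0  # not rate limited, no wait needed
    retry_after = headers.get("Retry-After")
    if retry_after is not None:
        # Documented as a number of seconds on 429 responses.
        return max(0.0, float(retry_after))
    reset = headers.get("X-RateLimit-Reset")
    if reset is not None:
        # Documented as a Unix timestamp; wait until that moment.
        current = time.time() if now is None else now
        return max(0.0, float(reset) - current)
    return 1.0  # assumed fallback when neither header is present


if __name__ == "__main__":
    print(seconds_until_retry(429, {"Retry-After": "7"}))
```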
Request constraints
File uploads
| Constraint | Limit |
|---|---|
| Max file size | 100 MB |
| Supported formats | PDF, DOCX, XLSX, PPTX, TXT, CSV, Markdown, HTML, JSON |
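Checking a file locally before uploading avoids sending a request that is bound to fail. This sketch makes two assumptions that the table does not confirm: that 100 MB means 100 × 1024 × 1024 bytes, and that the supported formats map to the file extensions listed in `ALLOWED_EXTENSIONS`. The function name `upload_problem` is hypothetical.

```python
import os
from typing import Optional

MAX_FILE_BYTES = 100 * 1024 * 1024  # assumption: 1 MB = 1024 * 1024 bytes
# Assumed extension mapping for the documented formats.
ALLOWED_EXTENSIONS = {
    ".pdf", ".docx", ".xlsx", ".pptx", ".txt", ".csv", ".md", ".html", ".json",
}


def upload_problem(filename: str, size_bytes: int) -> Optional[str]:
    """Return a description of the problem, or None if the file looks acceptable."""
    extension = os.path.splitext(filename)[1].lower()
    if extension not in ALLOWED_EXTENSIONS:
        return f"unsupported format: {extension or '(no extension)'}"
    if size_bytes > MAX_FILE_BYTES:
        return f"file is {size_bytes} bytes, limit is {MAX_FILE_BYTES}"
    return None


if __name__ == "__main__":
    print(upload_problem("notes.md", 2048))
    print(upload_problem("clip.mp4", 2048))
```

For a file on disk, `os.path.getsize(path)` supplies the `size_bytes` argument.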
Batch ingest
| Constraint | Limit |
|---|---|
| Max items per batch | 100 |
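To ingest more than 100 items, split them into several batch requests. A minimal sketch of the splitting step follows; the helper name `batches` is hypothetical, and sending each chunk to the batch endpoint is left to your HTTP client or SDK.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")
MAX_BATCH_ITEMS = 100  # documented maximum items per batch


def batches(items: Sequence[T], size: int = MAX_BATCH_ITEMS) -> Iterator[List[T]]:
    """Yield consecutive chunks of at most `size` items, preserving order."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


if __name__ == "__main__":
    documents = [f"doc-{i}" for i in range(250)]
    for chunk in batches(documents):
        print(len(chunk))  # each chunk would be one batch request
```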
Pagination
| Constraint | Limit |
|---|---|
| Max limit per page | 100 (memories, chats, users, events) |
| Max limit for chat messages | 200 |
| Default limit | 20 (50 for chat messages) |
Content limits
| Constraint | Limit |
|---|---|
| Max content length (text ingest) | No hard limit (processed in chunks) |
| Max tags per memory | 50 |
| Max metadata size | 10 KB (JSON) |
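The tag and metadata limits can be checked before a request is sent. This sketch assumes that 10 KB means 10 × 1024 bytes and that size is measured on the UTF-8 encoded, compactly serialized JSON; the table does not say how the server measures it. The function name `content_limit_problems` is hypothetical.

```python
import json
from typing import Any, Dict, List, Sequence

MAX_TAGS = 50
MAX_METADATA_BYTES = 10 * 1024  # assumption: 1 KB = 1024 bytes


def content_limit_problems(tags: Sequence[str],
                           metadata: Dict[str, Any]) -> List[str]:
    """Return a list of limit violations; an empty list means none were found."""
    problems = []
    if len(tags) > MAX_TAGS:
        problems.append(f"{len(tags)} tags, limit is {MAX_TAGS}")
    # Assumed measurement: compact JSON, UTF-8 encoded.
    encoded = json.dumps(metadata, separators=(",", ":"),
                         ensure_ascii=False).encode("utf-8")
    if len(encoded) > MAX_METADATA_BYTES:
        problems.append(
            f"metadata is {len(encoded)} bytes, limit is {MAX_METADATA_BYTES}"
        )
    return problems


if __name__ == "__main__":
    print(content_limit_problems(["invoice", "2024"], {"source": "email"}))
```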
Webhook limits
| Constraint | Limit |
|---|---|
| Delivery timeout | 10 seconds |
| Max retry attempts | 3 |
| Max webhooks per project | 10 |
Need higher limits?
Enterprise plans include custom limits, dedicated support, and SLAs. Contact us to discuss your requirements.