Integration overview

Developer docs

This page is a public overview. The full reference — method signatures, endpoint shapes, request/response schemas — ships to pilot teams under mutual NDA.

Quickstart

From zero to scored memory in three conceptual steps. KAPEX is middleware — your app calls KAPEX, KAPEX returns context, you inject it into any LLM.

1. Get a pilot key

Start the pilot and we'll send you a key plus the configured SDK and MCP packages. Pilot keys are scoped, rate-limited, and revocable.

2. Send a conversation turn

Pass the user's message and your LLM's reply to KAPEX through whichever surface you prefer (SDK, MCP, or REST). KAPEX scores the turn against 12 signal dimensions, updates the memory graph, and applies the decay model.

3. Inject memory into your LLM

Request a salient-memory context block at query time and inject it into your system prompt. The model can now reference prior turns without any change to the model itself, and the approach works with any LLM provider.

import kapex

# 1. Send the message to KAPEX
memory_context = kapex.process(user_id, message, session_id)

# 2. Inject memory into your prompt
enhanced_prompt = system_prompt + "\n\n" + memory_context

# 3. Call your LLM as usual
response = your_llm.chat(enhanced_prompt, message)
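The injection step can be wrapped in a small helper so that an empty memory block does not leave stray whitespace in the prompt. This is a minimal sketch, assuming the context block arrives as a plain string (the quickstart concatenates it as one); `build_prompt` is a hypothetical name, not part of the KAPEX SDK.

```python
from typing import Optional


def build_prompt(system_prompt: str, memory_context: Optional[str]) -> str:
    """Append the memory block to the system prompt, skipping it when empty."""
    # Assumption: memory_context is plain text; None or whitespace-only
    # means there is no memory to inject yet (e.g. a brand-new user).
    if not memory_context or not memory_context.strip():
        return system_prompt
    return system_prompt + "\n\n" + memory_context.strip()
```

With this helper, the second quickstart line would become `enhanced_prompt = build_prompt(system_prompt, memory_context)`.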

Integration pattern

Three steps, in order: deploy the container, connect your app, inject memory into your prompt. KAPEX runs in your infrastructure — your app talks to your KAPEX endpoint, never to ours.

Step 01: Deploy KAPEX container (your VPC · your DB · your LLM key)
Step 02: your app → KAPEX (SDK / MCP / REST) → memory_context
Step 03: your LLM (system_prompt + memory_context) → response

Typical integration takes under one day for a senior engineer. No model changes, no prompt rewriting, no vendor lock-in. User data never leaves your VPC.

Three surfaces, one engine

Pick whichever surface fits your stack. They're functionally equivalent: the same scoring engine, decay model, retrieval, and safety pipeline run behind all three.

Python SDK

async-first

Native async client for Python applications. Covers the full memory lifecycle — ingest, recall, node diagnostics, per-tenant configuration.

  • Context-manager friendly (works inside async with)
  • Type-checked end-to-end
  • Automatic retry with exponential backoff on rate-limit responses
  • Per-node diagnostics including baseline components and decay projections
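To illustrate what "context-manager friendly" means in practice, the sketch below shows the general `async with` shape using a stand-in class. `StandInClient` is hypothetical and is not the KAPEX client: the real class name, constructor, and method signatures ship with the pilot bundle. Only the `process(user_id, message, session_id)` argument order is borrowed from the quickstart.

```python
import asyncio


class StandInClient:
    """Hypothetical stand-in; the real SDK client is documented under NDA."""

    def __init__(self) -> None:
        self.is_open = False

    async def __aenter__(self) -> "StandInClient":
        # A real client would typically open its HTTP session here.
        self.is_open = True
        return self

    async def __aexit__(self, exc_type, exc, tb) -> None:
        # Runs even if the body raised, so the session is always released.
        self.is_open = False

    async def process(self, user_id: str, message: str, session_id: str) -> str:
        # Placeholder: returns a fake context block instead of calling a server.
        await asyncio.sleep(0)
        return f"[memory for {user_id}/{session_id}]"


async def main() -> str:
    async with StandInClient() as client:
        return await client.process("user-1", "hello", "session-1")


result = asyncio.run(main())
```

The benefit of the pattern is that connection cleanup is tied to the block's scope rather than to a manual `close()` call.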

MCP server

3 transports

Model Context Protocol native. Connects to any MCP-compatible agent — Claude Desktop, Cursor, Windsurf, Claude Code, or your own client.

  • Memory tools for store / query / process / graph inspection / deletion
  • Three transports: stdio (local), SSE (cloud streaming), HTTP (request/response)
  • Tenant-scoped JWT auth for hosted deployments
  • Drop-in configuration; no Python needed on the agent side
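For the stdio transport, MCP clients such as Claude Desktop register a local server through an `mcpServers` entry in their JSON configuration file. The sketch below generates such an entry. The command name `kapex-mcp`, the `--transport` flag, and the `KAPEX_API_KEY` variable are placeholders chosen for illustration; the real values come with the pilot packages.

```python
import json


def stdio_server_entry(command: str, args: list, env: dict) -> dict:
    """Build one server entry in the shape MCP desktop clients expect."""
    return {"command": command, "args": list(args), "env": dict(env)}


config = {
    "mcpServers": {
        # All three values below are hypothetical placeholders.
        "kapex": stdio_server_entry(
            command="kapex-mcp",
            args=["--transport", "stdio"],
            env={"KAPEX_API_KEY": "<your pilot key>"},
        )
    }
}

# Paste the printed JSON into the client's MCP configuration file.
print(json.dumps(config, indent=2))
```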

REST API

OpenAPI 3.1

Standard HTTPS endpoints for any language or framework. The base layer everything else sits on.

  • OpenAPI 3.1 specification (shared on pilot signup)
  • Process, recall, ingest, node management, graph inspection, configuration, deletion
  • Webhooks for memory-event subscriptions
  • Idempotency keys on write operations
  • API-key authentication via standard header
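Idempotency keys let a client retry a write safely: the same key is sent on every attempt so the server can recognize a duplicate. The sketch below builds the headers and body for one write call. The header names (`Authorization` with a bearer token, `Idempotency-Key`) and the payload fields are assumptions based on common REST conventions, not taken from the KAPEX OpenAPI specification.

```python
import json
import uuid
from typing import Optional


def build_write_request(
    api_key: str, payload: dict, idempotency_key: Optional[str] = None
) -> dict:
    """Return headers and body for one write call.

    Pass the key from the first attempt back in when retrying that call.
    """
    key = idempotency_key or str(uuid.uuid4())
    return {
        "headers": {
            "Authorization": f"Bearer {api_key}",  # assumed header scheme
            "Idempotency-Key": key,  # assumed header name
            "Content-Type": "application/json",
        },
        "body": json.dumps(payload),
    }


payload = {"user_id": "user-1", "message": "hello"}  # hypothetical fields
first = build_write_request("pilot-key", payload)
retry = build_write_request(
    "pilot-key", payload, idempotency_key=first["headers"]["Idempotency-Key"]
)
```

Generating the key once per logical operation, rather than once per HTTP attempt, is what makes the retry safe.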

Auth & rate limits

Authentication

Rate limits

Plan        Per minute  Per day    Monthly
Pilot       60          10,000     100 users
Build       120         100,000    100K
Scale       300         1,000,000  1M
Enterprise  Custom      Custom     Custom

When you hit the cap, the API returns a 429 with a Retry-After header. No overage charges. The SDK auto-retries with exponential backoff.
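For REST callers who are not using the SDK, the same behavior can be approximated by hand. This is a minimal sketch of retrying on 429, preferring the server's `Retry-After` value and falling back to exponential backoff. The function names, base delay, cap, and attempt count are illustrative choices, not the SDK's actual settings, and jitter is left out to keep the example deterministic.

```python
import time
from typing import Callable, Optional, Tuple


def retry_delay(
    attempt: int, retry_after: Optional[str], base: float = 0.5, cap: float = 30.0
) -> float:
    """Seconds to wait before the next try; `attempt` is 0-based."""
    if retry_after is not None:
        try:
            return max(0.0, float(retry_after))
        except ValueError:
            pass  # the HTTP-date form of Retry-After is not parsed here
    return min(cap, base * (2 ** attempt))


def call_with_retry(
    send: Callable[[], Tuple[int, dict, str]],
    max_attempts: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str]:
    """Call send() until it returns a non-429 status or attempts run out.

    send() must return (status, headers, body); header lookup below is
    case-sensitive, so normalize header names in send() if needed.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, headers, body = send()
        if status != 429 or attempt == max_attempts - 1:
            return status, body
        sleep(retry_delay(attempt, headers.get("Retry-After")))
    raise AssertionError("unreachable")
```

The `sleep` parameter is injectable so the loop can be exercised in tests without real waiting.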

Errors

Standard HTTP status codes. JSON body with a stable error shape — error message, code for programmatic handling, and a request_id for support. Always include request_id when escalating.
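A client can parse that error shape once and carry the `request_id` through its logs. The sketch below assumes the JSON keys are `error`, `code`, and `request_id`; only `request_id` is named on this page, so the other two key names are guesses to be checked against the reference.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class ErrorInfo:
    status: int
    message: str
    code: str
    request_id: str


def parse_error(status: int, raw_body: str) -> ErrorInfo:
    """Turn an error response into a structured record, tolerating bad bodies."""
    try:
        data = json.loads(raw_body)
    except json.JSONDecodeError:
        data = {}  # e.g. an HTML error page from a proxy
    if not isinstance(data, dict):
        data = {}
    return ErrorInfo(
        status=status,
        message=str(data.get("error", "unknown error")),  # assumed key
        code=str(data.get("code", "unknown")),  # assumed key
        request_id=str(data.get("request_id", "")),
    )


info = parse_error(
    429,
    '{"error": "rate limit exceeded", "code": "rate_limited", "request_id": "req_123"}',
)
```

Branching on `info.code` rather than on the message text keeps handling stable if the wording of messages changes.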

Deployment

Two paths, depending on your data-sovereignty needs:

Full reference under NDA

Method signatures, endpoint shapes, and configuration schemas ship to pilot teams.

To protect IP during the patent prosecution window, complete API and configuration reference material is shared under mutual NDA when you start a pilot. We'll send the NDA template, then the SDK, MCP, and OpenAPI bundles. Most pilot teams are integrated within a business day of signing.

Experience KAPEX live

Sign up for the free KAPEX beta and see salience-scored memory in action — no NDA, no commitment.

Patent pending

Get the keys. Wire it in.

Start the 30-day pilot. We send the SDK, MCP, and OpenAPI bundles within a business day of NDA signature.