KAPEX Docs — Integration overview

Quickstart

From zero to scored memory in three conceptual steps. KAPEX is middleware — your app calls KAPEX, KAPEX returns context, you inject it into any LLM.

1. Get a pilot key

Start the pilot and we'll send you a key plus the configured SDK and MCP packages. Pilot keys are scoped, rate-limited, and revocable.

2. Send a conversation turn

Pass the user's message and your LLM's reply to KAPEX through whichever surface you prefer (SDK, MCP, or REST). KAPEX scores the turn against 12 signal dimensions, updates the memory graph, and applies the decay model.

3. Inject memory into your LLM

Request a salient-memory context block at query time and inject it into your system prompt. The model now references prior turns — without you changing the model. Provider-agnostic.

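import kapex

# user_id, message, session_id, and system_prompt come from your application;
# your_llm stands for whichever LLM client you already use.

# 1. Send the user's message to KAPEX and get back a memory context block
memory_context = kapex.process(user_id, message, session_id)

# 2. Inject the memory context into your system prompt
enhanced_prompt = system_prompt + "\n\n" + memory_context

# 3. Call your LLM as usual
response = your_llm.chat(enhanced_prompt, message)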

Integration pattern

Three steps, in order: deploy the container, connect your app, inject memory into your prompt. In a customer-hosted deployment, KAPEX runs in your infrastructure: your app talks to your KAPEX endpoint, never to ours.

Step 01 Deploy KAPEX container → your VPC · your DB · your LLM key

Step 02 your app → KAPEX (SDK / MCP / REST) → memory_context

Step 03 your LLM (system_prompt + memory_context) → response

Typical integration takes under one day for a senior engineer. No model changes, no prompt rewriting, no vendor lock-in. In a customer-hosted deployment, user data never leaves your VPC.
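
Steps 02 and 03 above can be sketched without tying the code to any one provider. This is a minimal sketch, not the KAPEX SDK: `fetch_memory` and `call_llm` are hypothetical callables standing in for your KAPEX client and your LLM client, and `build_prompt` and `answer` are illustrative names.

```python
from typing import Callable

# Hypothetical signatures; neither callable is part of the KAPEX SDK.
FetchMemory = Callable[[str, str, str], str]  # (user_id, message, session_id) -> memory_context
CallLLM = Callable[[str, str], str]           # (system_prompt, message) -> response


def build_prompt(system_prompt: str, memory_context: str) -> str:
    """Append the memory block to the system prompt, skipping empty context."""
    if not memory_context.strip():
        return system_prompt
    return system_prompt + "\n\n" + memory_context


def answer(
    user_id: str,
    message: str,
    session_id: str,
    system_prompt: str,
    fetch_memory: FetchMemory,
    call_llm: CallLLM,
) -> str:
    """Step 02: fetch memory_context. Step 03: call the LLM with the enhanced prompt."""
    memory_context = fetch_memory(user_id, message, session_id)
    return call_llm(build_prompt(system_prompt, memory_context), message)
```

Because both dependencies are passed in as plain callables, swapping the LLM provider changes only the `call_llm` argument; the prompt-building step stays the same.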

Three surfaces, one engine

Pick whichever surface fits your stack. They're functionally equivalent: the same scoring engine, decay model, retrieval, and safety pipeline run behind all three.

Python SDK

async-first

Native async client for Python applications. Covers the full memory lifecycle — ingest, recall, node diagnostics, per-tenant configuration.

Context-manager friendly (works inside async with)
Type-checked end-to-end
Automatic retry with exponential backoff on rate-limit responses
Per-node diagnostics including baseline components and decay projections
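
The real method signatures ship under NDA, so the sketch below uses a stand-in class to show only the `async with` usage shape mentioned above. `FakeMemoryClient`, its `process` method, and the returned string are all hypothetical and are not the KAPEX SDK.

```python
import asyncio


class FakeMemoryClient:
    """Stand-in for an async memory client. NOT the KAPEX SDK."""

    def __init__(self, api_key: str) -> None:
        self.api_key = api_key
        self.is_open = False

    async def __aenter__(self) -> "FakeMemoryClient":
        self.is_open = True   # a real client would open its connections here
        return self

    async def __aexit__(self, exc_type, exc, tb) -> None:
        self.is_open = False  # runs even if the body raised, so resources are released

    async def process(self, user_id: str, message: str, session_id: str) -> str:
        if not self.is_open:
            raise RuntimeError("client used outside 'async with'")
        await asyncio.sleep(0)  # placeholder for network I/O
        return f"[memory for {user_id}]"


async def main() -> str:
    async with FakeMemoryClient(api_key="pilot-key-placeholder") as client:
        return await client.process("user-1", "hello", "session-1")


if __name__ == "__main__":
    print(asyncio.run(main()))
```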

MCP server

3 transports

Model Context Protocol native. Connects to any MCP-compatible agent — Claude Desktop, Cursor, Windsurf, Claude Code, or your own client.

Memory tools for store / query / process / graph inspection / deletion
Three transports: stdio (local), SSE (cloud streaming), HTTP (request/response)
Tenant-scoped JWT auth for hosted deployments
Drop-in configuration; no Python needed on the agent side
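
As an illustration of drop-in configuration over the stdio transport, the sketch below generates an entry in the `mcpServers` map that Claude Desktop reads from its JSON config file. The server name `kapex`, the command `kapex-mcp`, the `--transport` flag, and the `KAPEX_API_KEY` variable are placeholders; the actual values come with the pilot MCP package, and other MCP clients may use a different config layout.

```python
import json
from typing import Dict, List


def stdio_server_entry(command: str, args: List[str], env: Dict[str, str]) -> dict:
    """Build one entry of the "mcpServers" map for a stdio-launched MCP server."""
    return {"command": command, "args": args, "env": env}


config = {
    "mcpServers": {
        # All names and values below are placeholders, not the real package settings.
        "kapex": stdio_server_entry(
            command="kapex-mcp",
            args=["--transport", "stdio"],
            env={"KAPEX_API_KEY": "<your-pilot-key>"},
        )
    }
}

# Print the JSON to paste into the client's MCP configuration file.
print(json.dumps(config, indent=2))
```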

REST API

OpenAPI 3.1

Standard HTTPS endpoints for any language or framework. The base layer everything else sits on.

OpenAPI 3.1 specification (shared on pilot signup)
Process, recall, ingest, node management, graph inspection, configuration, deletion
Webhooks for memory-event subscriptions
Idempotency keys on write operations
API-key authentication via standard header
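
A minimal sketch of a write request that combines API-key authentication with an idempotency key, using only the Python standard library. Everything specific here is an assumption: the base URL, the `/v1/process` path, the `Authorization: Bearer` scheme, and the `Idempotency-Key` header name. Take the real values from the OpenAPI specification shared on pilot signup.

```python
import json
import urllib.request
import uuid
from typing import Optional

BASE_URL = "https://kapex.example.internal"  # hypothetical endpoint


def build_process_request(
    api_key: str,
    payload: dict,
    idempotency_key: Optional[str] = None,
) -> urllib.request.Request:
    """Build (but do not send) a POST request for a write operation."""
    key = idempotency_key or str(uuid.uuid4())  # fresh key per logical write
    return urllib.request.Request(
        url=f"{BASE_URL}/v1/process",           # assumed path
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",  # assumed auth header
            "Idempotency-Key": key,                # assumed header name
        },
    )
```

When retrying the same logical write, pass the original `idempotency_key` again rather than generating a new one, so the server can recognize the retry as a duplicate.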

Auth & rate limits

Authentication

API keys are hashed at rest, never stored in plaintext. Rotation supported.
Hosted MCP servers use Cognito-issued RS256 JWTs with short TTLs and rotating signing keys.
Enterprise plans support SSO via SAML 2.0 (Okta, Azure AD, Google Workspace) and SCIM 2.0 provisioning.
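
Because hosted MCP tokens have short TTLs, a client typically checks the `exp` claim and refreshes before the token lapses. The sketch below reads the claims with the standard library only. It does not verify the RS256 signature, so use it for refresh scheduling and never for trust decisions. The function names and the 60-second leeway are illustrative assumptions.

```python
import base64
import json
import time
from typing import Optional, Tuple


def _b64url_decode(segment: str) -> bytes:
    """Decode a base64url segment, restoring the padding that JWTs strip."""
    padding = "=" * (-len(segment) % 4)
    return base64.urlsafe_b64decode(segment + padding)


def jwt_claims_unverified(token: str) -> Tuple[dict, dict]:
    """Return (header, payload) WITHOUT verifying the signature."""
    header_b64, payload_b64, _signature = token.split(".")
    return json.loads(_b64url_decode(header_b64)), json.loads(_b64url_decode(payload_b64))


def needs_refresh(token: str, now: Optional[float] = None, leeway_s: int = 60) -> bool:
    """True if the token expires within leeway_s seconds (or already has)."""
    _header, claims = jwt_claims_unverified(token)
    current = time.time() if now is None else now
    return claims["exp"] - current <= leeway_s
```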

Rate limits

Plan         Per minute   Per day      Monthly
Pilot                     10,000       100 users
Build        120          100,000      100K
Scale        300          1,000,000
Enterprise   Custom

When you hit the cap, the API returns a 429 with a Retry-After header. No overage charges. The SDK auto-retries with exponential backoff.
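
If you call the REST API directly rather than through the SDK, the retry behavior described above can be approximated as follows. This is a sketch of one common approach (honor `Retry-After` when present, otherwise exponential backoff with full jitter) and not the SDK's actual implementation. The `send` callable, the response tuple shape, and the default timings are assumptions; only the delta-seconds form of `Retry-After` is handled, and the header lookup is case-sensitive.

```python
import random
import time
from typing import Callable, Mapping, Tuple

Response = Tuple[int, Mapping[str, str], bytes]  # (status, headers, body)


def retry_delay(
    attempt: int,
    headers: Mapping[str, str],
    base_s: float = 0.5,
    cap_s: float = 30.0,
) -> float:
    """Seconds to wait before the next try."""
    retry_after = headers.get("Retry-After")
    if retry_after is not None and retry_after.strip().isdigit():
        return float(retry_after)  # the server told us how long to wait
    backoff = min(cap_s, base_s * (2 ** attempt))
    return random.uniform(0, backoff)  # full jitter spreads out retries


def send_with_retry(
    send: Callable[[], Response],
    max_attempts: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call send() up to max_attempts times, retrying only on HTTP 429."""
    response = send()
    for attempt in range(max_attempts - 1):
        if response[0] != 429:
            break
        sleep(retry_delay(attempt, response[1]))
        response = send()
    return response
```

If every attempt returns 429, the last 429 response is returned to the caller rather than raised, so the caller decides how to surface it.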

Errors

Errors use standard HTTP status codes and return a JSON body with a stable shape: an error message, a code for programmatic handling, and a request_id for support. Always include the request_id when escalating.
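
A small sketch of parsing that error shape on the client side. Only `request_id` is named in the text above; the JSON keys `error` and `code` are assumed field names for illustration, and `ErrorInfo` and `parse_error` are hypothetical helpers.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class ErrorInfo:
    status: int
    message: str
    code: str
    request_id: str


def parse_error(status: int, body: bytes) -> ErrorInfo:
    """Parse an error body, tolerating non-JSON responses (e.g. from a proxy)."""
    try:
        data = json.loads(body)
    except ValueError:  # covers invalid JSON and undecodable bytes
        data = {}
    if not isinstance(data, dict):
        data = {}
    return ErrorInfo(
        status=status,
        message=str(data.get("error", "")),         # assumed key
        code=str(data.get("code", "unknown")),      # assumed key
        request_id=str(data.get("request_id", "")),
    )
```

Branch on `code` rather than on the message text, and log `request_id` with every failure so it is on hand when escalating.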

Deployment

Two paths, depending on your data-sovereignty needs:

KAPEX Cloud — hosted in AWS us-east-1. Default on Pilot, Build, and Scale plans.
Customer-hosted — deploys into your AWS, GCP, or Azure VPC, or on-prem. Ships as a CloudFormation template or signed Docker package. Sandstone handles deployment. Available on Enterprise. See Deployment model.

Full reference under NDA

Method signatures, endpoint shapes, and configuration schemas ship to pilot teams.

To protect IP during the patent prosecution window, complete API and configuration reference material is shared under mutual NDA when you start a pilot. We'll send the NDA template, then the SDK, MCP, and OpenAPI bundles. Most pilot teams are integrated within a business day of signing.
