Quickstart
From zero to scored memory in three conceptual steps. KAPEX is middleware — your app calls KAPEX, KAPEX returns context, you inject it into any LLM.
1. Get a pilot key
Start the pilot and we'll send you a key plus the configured SDK and MCP packages. Pilot keys are scoped, rate-limited, and revocable.
2. Send a conversation turn
Pass the user's message and your LLM's reply to KAPEX through whichever surface you prefer (SDK, MCP, or REST). KAPEX scores the turn against 12 signal dimensions, updates the memory graph, and applies the decay model.
3. Inject memory into your LLM
Request a salient-memory context block at query time and inject it into your system prompt. The model now references prior turns — without you changing the model. Provider-agnostic.
import kapex # 1. Send the message to KAPEX memory_context = kapex.process(user_id, message, session_id) # 2. Inject memory into your prompt enhanced_prompt = system_prompt + "\n\n" + memory_context # 3. Call your LLM as usual response = your_llm.chat(enhanced_prompt, message)
Integration pattern
Three steps, in order: deploy the container, connect your app, inject memory into your prompt. KAPEX runs in your infrastructure — your app talks to your KAPEX endpoint, never to ours.
Typical integration takes under one day for a senior engineer. No model changes, no prompt rewriting, no vendor lock-in. User data never leaves your VPC.
Three surfaces, one engine
Pick whichever surface fits your stack. They're functionally equivalent — the same scoring engine, decay model, retrieval, and safety pipeline runs behind all three.
Python SDK
async-firstNative async client for Python applications. Covers the full memory lifecycle — ingest, recall, node diagnostics, per-tenant configuration.
- Context-manager friendly (works inside
async with) - Type-checked end-to-end
- Automatic retry with exponential backoff on rate-limit responses
- Per-node diagnostics including baseline components and decay projections
MCP server
3 transportsModel Context Protocol native. Connects to any MCP-compatible agent — Claude Desktop, Cursor, Windsurf, Claude Code, or your own client.
- Memory tools for store / query / process / graph inspection / deletion
- Three transports:
stdio(local),SSE(cloud streaming),HTTP(request/response) - Tenant-scoped JWT auth for hosted deployments
- Drop-in configuration; no Python needed on the agent side
REST API
OpenAPI 3.1Standard HTTPS endpoints for any language or framework. The base layer everything else sits on.
- OpenAPI 3.1 specification (shared on pilot signup)
- Process, recall, ingest, node management, graph inspection, configuration, deletion
- Webhooks for memory-event subscriptions
- Idempotency keys on write operations
- API-key authentication via standard header
Auth & rate limits
Authentication
- API keys are hashed at rest, never stored in plaintext. Rotation supported.
- Hosted MCP servers use Cognito-issued RS256 JWTs with short TTLs and rotating signing keys.
- Enterprise plans support SSO via SAML 2.0 (Okta, Azure AD, Google Workspace) and SCIM 2.0 provisioning.
Rate limits
When you hit the cap, the API returns a 429 with a Retry-After header. No overage charges. The SDK auto-retries with exponential backoff.
Errors
Standard HTTP status codes. JSON body with a stable error shape — error message, code for programmatic handling, and a request_id for support. Always include request_id when escalating.
Deployment
Two paths, depending on your data-sovereignty needs:
- KAPEX Cloud — hosted in AWS us-east-1. Default on Pilot, Build, and Scale plans.
- Customer-hosted — deploys into your AWS, GCP, or Azure VPC, or on-prem. Ships as a CloudFormation template or signed Docker package. Sandstone handles deployment. Available on Enterprise. See Deployment model.
Method signatures, endpoint shapes, and configuration schemas ship to pilot teams.
To protect IP during the patent prosecution window, complete API and configuration reference material is shared under mutual NDA when you start a pilot. We'll send the NDA template, then the SDK, MCP, and OpenAPI bundles. Most pilot teams are integrated within a business day of signing.