KAPEX is the memory layer between your app and your LLM. That means we're handling personal data, conversation history, and salience scores that reveal what matters to a user. We treat that responsibility seriously.
All infrastructure runs on AWS. Single-region by default (us-east-1, Virginia). Multi-region available on Enterprise. EU residency (Frankfurt) is a paid add-on on Scale and Enterprise.
us-east-1 (Virginia) is the default region. Multi-region failover and EU residency are available as add-ons. All compute runs in private subnets behind AWS WAF.
Structured data (memory nodes, entities, edges, audit logs). Encryption at rest (AES-256). Point-in-time recovery enabled.
Caching layer for hot retrievals and rate-limit counters. Per-tenant key isolation. No shared cache between namespaces.
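One common way to get per-tenant key isolation is to prefix every cache key with the tenant and namespace. The sketch below illustrates that idea only; the function name `cache_key`, the key layout, and the use of SHA-256 are assumptions for illustration, not KAPEX's actual scheme.

```python
import hashlib


def cache_key(tenant_id: str, namespace: str, key: str) -> str:
    """Build a cache key scoped to one tenant and one namespace (hypothetical layout)."""
    for part in (tenant_id, namespace):
        if not part or ":" in part:
            raise ValueError("tenant_id and namespace must be non-empty and contain no ':'")
    # Hash the caller-supplied key so it cannot smuggle in a separator
    # and collide with another tenant's prefix.
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return f"{tenant_id}:{namespace}:{digest}"
```

With this layout, the same logical key requested by two tenants maps to two different cache entries.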
Archival storage with configurable retention. Server-side encryption (AES-256). Bucket policies restrict access to the production VPC.
CDN with managed-rule WAF for OWASP top-10 protections. Custom rules for rate-limit enforcement and prompt-injection patterns.
Database and compute live in private subnets. No public ingress. Outbound traffic is restricted to known LLM providers and our embedding endpoints.
KAPEX ships as a Docker container that runs in your infrastructure. You own the compute, the database, and all memory data. Sandstone provides the container, the license, and the support.
Deploys into your AWS, GCP, or Azure VPC. Memory data never leaves your environment. Ships as a signed Docker container with a CloudFormation template for one-click AWS deployment.
The container validates its license key once every 24 hours. The heartbeat sends only a hash of the key: zero user data, zero memory data, zero conversation content. If the container cannot reach the license server, it keeps running for 7 days before switching to read-only mode. Enterprise offers air-gapped deployment.
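The license behavior described above can be sketched as a small state check. This is a minimal illustration under stated assumptions: SHA-256 as the hash, the `key_hash` field name, the function names, and the treatment of the exact 7-day boundary are all hypothetical and not taken from the product.

```python
import hashlib
from datetime import datetime, timedelta, timezone

HEARTBEAT_INTERVAL = timedelta(hours=24)  # from the text: one validation per 24 hours
OFFLINE_GRACE = timedelta(days=7)         # from the text: 7 days offline before read-only


def heartbeat_payload(license_key: str) -> dict:
    """Return the only data sent on a heartbeat: a hash of the key (hash choice assumed)."""
    return {"key_hash": hashlib.sha256(license_key.encode("utf-8")).hexdigest()}


def heartbeat_due(last_attempt: datetime, now: datetime) -> bool:
    """True once a full interval has passed since the last validation attempt."""
    return now - last_attempt >= HEARTBEAT_INTERVAL


def license_mode(last_success: datetime, now: datetime) -> str:
    """Stay read-write inside the offline grace period, then drop to read-only."""
    return "read-write" if now - last_success <= OFFLINE_GRACE else "read-only"
```

The payload builder takes nothing but the license key, so there is no code path through which memory or conversation data could end up in the heartbeat.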
Our deployment team handles the initial spin-up, configuration, validation, and cutover. You inherit a running, preconfigured cluster — not a Helm chart and a wiki page. Ongoing upgrades arrive as signed releases.
Releases ship quarterly on Starter, monthly on Growth, weekly on Scale, and on a custom schedule on Enterprise. Security patches ship within 48 hours regardless of tier. Every release is signed and versioned.
API keys are first-class. JWTs are reserved for hosted MCP servers. Keys are stored only as hashes; JWTs are signed.
API keys are hashed (bcrypt) and never stored in plaintext. Pass via a standard API-key header on every authenticated request. Rotation supported; we can revoke any key in under a minute.
Hosted MCP servers use Cognito-issued RS256 JWTs with short TTLs and rotating signing keys. Per-tenant scope claims.
SSO via SAML 2.0 with major IdPs (Okta, Azure AD, Google Workspace). SCIM 2.0 for user provisioning. Coming on the Enterprise plan.
Layered enforcement. Per-key limits are plan-driven. Per-IP limits catch abuse patterns. Hit the cap and you get a standard rate-limit response with a retry-after header.
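To make the layering concrete, here is a minimal sketch of a fixed-window limiter applied first per IP and then per key. It assumes the "standard rate-limit response" is HTTP 429 with a `Retry-After` header in seconds; the class name, the window size, and the limits are hypothetical, and a production limiter would also expire old counters.

```python
import math
import time
from collections import defaultdict
from typing import Optional


class FixedWindowLimiter:
    """Count hits per identity within fixed time windows (illustrative only)."""

    def __init__(self, limit: int, window_seconds: int = 60):
        self.limit = limit
        self.window = window_seconds
        self.counts = defaultdict(int)  # (identity, window index) -> hits so far

    def check(self, identity: str, now: Optional[float] = None):
        now = time.time() if now is None else now
        index = int(now // self.window)
        bucket = (identity, index)
        if self.counts[bucket] >= self.limit:
            # Seconds until the next window opens, rounded up, at least 1.
            retry_after = math.ceil((index + 1) * self.window - now)
            return False, max(retry_after, 1)
        self.counts[bucket] += 1
        return True, 0


def handle(api_key, ip, key_limiter, ip_limiter, now=None):
    """Apply the per-IP layer, then the per-key layer; return (status, headers)."""
    for limiter, identity in ((ip_limiter, ip), (key_limiter, api_key)):
        allowed, retry_after = limiter.check(identity, now)
        if not allowed:
            return 429, {"Retry-After": str(retry_after)}
    return 200, {}
```

A client that receives the 429 should wait the number of seconds in `Retry-After` before sending the request again.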
Encryption at every layer. PII handled at ingestion. Per-node and per-user deletion as first-class API calls.
All client-server traffic uses TLS 1.3. HSTS enforced. Internal service-to-service traffic uses mutual TLS within the VPC.
Database, object storage, and backups are all encrypted at rest with AES-256. KMS-managed keys with annual rotation.
Module 03 of the safety layer detects and scrubs emails, phone numbers, SSNs, and addresses before they're stored. Configurable scrub policies per tenant.
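As a rough illustration of scrubbing at ingestion, the sketch below redacts emails, US-style phone numbers, and SSNs with regular expressions before text is stored. It is not Module 03: the patterns, the placeholder labels, and the `enabled` parameter (standing in for a per-tenant policy) are assumptions, and street addresses are left out because they are not reliably matched by a simple pattern.

```python
import re

# Hypothetical, US-centric patterns for illustration only.
PATTERNS = {
    "EMAIL": re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(
        r"(?<!\d)(?:\+?1[ .-]?)?\(?\d{3}\)?[ .-]?\d{3}[ .-]?\d{4}(?!\d)"
    ),
}


def scrub(text: str, enabled=("EMAIL", "SSN", "PHONE")) -> str:
    """Replace each enabled PII type with a bracketed placeholder."""
    for label in enabled:
        text = PATTERNS[label].sub(f"[{label}]", text)
    return text


print(scrub("Email jane@example.com or call 415-555-0199, SSN 123-45-6789"))
# prints: Email [EMAIL] or call [PHONE], SSN [SSN]
```

A tenant that only wants emails scrubbed would pass `enabled=("EMAIL",)`, which leaves phone numbers and SSNs untouched.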
Deletion is a single API call. It cascades across nodes, entities, edges, and history. GDPR Article 17 right-to-erasure compliant.
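The cascade can be pictured with a small in-memory model. The sketch below is an assumption-laden illustration, not the KAPEX API: the `MemoryStore` class, its field shapes, and the `delete_user` method are hypothetical stand-ins for whatever the real service does behind its deletion call.

```python
from dataclasses import dataclass, field


@dataclass
class MemoryStore:
    """Toy store with the four collections named in the text (shapes assumed)."""
    nodes: dict = field(default_factory=dict)     # node_id -> {"user_id": ...}
    entities: dict = field(default_factory=dict)  # entity_id -> {"user_id": ...}
    edges: list = field(default_factory=list)     # (source_id, target_id) tuples
    history: list = field(default_factory=list)   # {"user_id": ..., "text": ...}

    def delete_user(self, user_id: str) -> dict:
        """Remove everything owned by user_id and report how much was deleted."""
        node_ids = {i for i, n in self.nodes.items() if n["user_id"] == user_id}
        entity_ids = {i for i, e in self.entities.items() if e["user_id"] == user_id}
        owned = node_ids | entity_ids
        edges_before, history_before = len(self.edges), len(self.history)

        self.nodes = {i: n for i, n in self.nodes.items() if i not in node_ids}
        self.entities = {i: e for i, e in self.entities.items() if i not in entity_ids}
        # An edge goes if either endpoint belonged to the deleted user.
        self.edges = [e for e in self.edges if e[0] not in owned and e[1] not in owned]
        self.history = [h for h in self.history if h["user_id"] != user_id]

        return {
            "nodes": len(node_ids),
            "entities": len(entity_ids),
            "edges": edges_before - len(self.edges),
            "history": history_before - len(self.history),
        }
```

Returning per-collection counts gives the caller something concrete to record as evidence that an erasure request was carried out.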
Every memory operation logged with timestamp, actor, operation type, and request ID. Cryptographic deployment verification. Logs retained per plan.
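An audit entry with the four fields listed above might look like the following. This is a sketch only: the `AuditRecord` class, the JSON encoding, the ISO 8601 UTC timestamp, and the example operation name are assumptions rather than the actual log format.

```python
import json
import uuid
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class AuditRecord:
    timestamp: str   # ISO 8601, UTC (format assumed)
    actor: str       # who performed the operation, e.g. an API key ID
    operation: str   # operation type, e.g. "memory.delete" (name hypothetical)
    request_id: str  # correlates the entry with the originating request


def audit(actor: str, operation: str, request_id: Optional[str] = None) -> str:
    """Serialize one audit entry as a JSON line."""
    record = AuditRecord(
        timestamp=datetime.now(timezone.utc).isoformat(),
        actor=actor,
        operation=operation,
        request_id=request_id or str(uuid.uuid4()),
    )
    return json.dumps(asdict(record), sort_keys=True)
```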
Continuous RDS backups with point-in-time recovery to any second within the retention window. Cross-region copies on Enterprise.
An independent layer that cannot be overridden by memory state, user input, or operator configuration. Runs identically regardless of LLM. See Features → Safety layer for module-level detail.
Lexical scanning, pattern matching, behavioral deviation, temporal analysis. Routes flagged conversations to safe-handling protocols including 988 Lifeline routing.
Memory validator cross-checks LLM output against the actual graph. Prevents the model from inventing memories the user never disclosed.
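The cross-check can be illustrated with a deliberately simple model. The sketch assumes the memories a model relies on have already been extracted as (entity, attribute, value) triples and that the graph can be queried as a set of such triples; how KAPEX actually extracts and matches claims is not described in this section, and the function name is hypothetical.

```python
def validate_claims(claims, graph):
    """Split claimed memories into those backed by the graph and those that are not.

    claims: iterable of (entity, attribute, value) triples taken from LLM output
    graph:  set of triples the user actually disclosed
    """
    supported, unsupported = [], []
    for claim in claims:
        (supported if claim in graph else unsupported).append(claim)
    return supported, unsupported


graph = {("user", "pet", "dog"), ("user", "city", "Austin")}
claims = [("user", "pet", "dog"), ("user", "sibling", "Maya")]
supported, unsupported = validate_claims(claims, graph)
# unsupported holds the invented sibling; a caller could drop or regenerate that output.
```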
Detects and redacts SSNs, credit cards, bank accounts, passports, and driver's licenses before they reach the memory graph. Configurable per tenant.
Always injected into context, regardless of memory state. Cannot be overridden by operators, users, or memory scores. Platform-level governance.
Built for regulated industries from day one. Architecture supports HIPAA, GDPR, CCPA, and emerging AI regulation (SB-243). BAA and evidence packages available on Enterprise.
A memory system that knows what matters to a user has more responsibility than a chatbot. Our ethical framework covers four risks; each has explicit mitigations in the platform.
Long-lived memory makes AI products stickier, and that stickiness can become unhealthy attachment. The safety layer monitors interaction-frequency patterns and surfaces use-rate notifications to operators when patterns suggest dependency.
The system knows more about the user than the user realizes. Mitigation: every user has the right to inspect their memory graph, edit individual nodes, and delete anything — including their entire history.
Consent isn't binary. Users can opt in to memory at the entity level (remember relationships but not health context, or vice versa), and consent is revocable per-node. The default is opt-in, not opt-out.
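A consent model with these properties can be sketched in a few lines. The class and field names below are hypothetical; the sketch only shows the three behaviors stated above: nothing is remembered until the user opts in, opt-in is per category, and individual nodes can be revoked later.

```python
from dataclasses import dataclass, field


@dataclass
class Consent:
    """Per-user consent state (illustrative; names are assumptions)."""
    allowed_categories: set = field(default_factory=set)  # empty by default: opt-in
    revoked_nodes: set = field(default_factory=set)

    def may_store(self, category: str) -> bool:
        """New memories are stored only for categories the user opted in to."""
        return category in self.allowed_categories

    def may_use(self, node_id: str, category: str) -> bool:
        """An existing node is usable unless its category or the node itself is off."""
        return self.may_store(category) and node_id not in self.revoked_nodes


consent = Consent()
consent.allowed_categories.add("relationships")  # remember relationships...
# ...but "health" was never added, so health context is not stored.
consent.revoked_nodes.add("node-17")             # later, one node is revoked
```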
Operators get a lot of configurability — decay coefficients, domain weights, retention. But platform-level safety floors (minimum decay rates, mandatory safety modules, crisis routing) cannot be weakened. This is governance-enforced, not policy-suggested.
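One way to enforce floors like these is to clamp every operator configuration before it takes effect. The sketch below is illustrative: the floor value, the module names, and the `OperatorConfig` fields are hypothetical, chosen only to show that operator input is bounded by the platform rather than trusted.

```python
from dataclasses import dataclass, replace

PLATFORM_MIN_DECAY = 0.01  # hypothetical floor value
MANDATORY_MODULES = frozenset({"crisis_routing", "pii_scrub"})  # hypothetical names


@dataclass(frozen=True)
class OperatorConfig:
    decay_rate: float
    retention_days: int
    disabled_modules: frozenset = frozenset()


def apply_floors(cfg: OperatorConfig) -> OperatorConfig:
    """Return a copy of cfg with platform-level safety floors enforced."""
    return replace(
        cfg,
        # Operators may raise the decay rate but never push it below the floor.
        decay_rate=max(cfg.decay_rate, PLATFORM_MIN_DECAY),
        # Requests to disable mandatory modules are ignored.
        disabled_modules=cfg.disabled_modules - MANDATORY_MODULES,
    )
```

Fields without a floor, such as retention in this sketch, pass through unchanged, so operators keep the configurability the text describes.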
Security questionnaires, evidence packages, and SOC 2 progress documents are available under NDA. Email us and we'll respond within a business day.