KAPEX FAQ — Questions, technical, business, compliance

General

The basics

What is KAPEX?

Memory middleware for LLM applications. It sits between your app and your LLM, maintains a salience-scored memory graph per user, and injects the most relevant memories into each query. Patent pending.

How is this different from Mem0 / Zep / RAG?

Those systems store and retrieve memories. KAPEX scores them by significance, models their decay over time, and surfaces what matters right now — not what was said most recently or most often. The decay direction is the key: in KAPEX, memories you've processed resolve faster, while unresolved content persists. Every other system does the opposite. See Why KAPEX.
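To make the decay direction concrete, here is a toy sketch. It is not KAPEX's scoring model (the actual formulas are not public); the function name, the exponential form, and the `base_rate` and `boost` parameters are all assumptions chosen only to show the idea that processed memories fade faster than unresolved ones.

```python
import math


def toy_salience(initial: float, days: float, processed: float,
                 base_rate: float = 0.01, boost: float = 9.0) -> float:
    """Illustrative only, not the KAPEX formula.

    `processed` runs from 0.0 (unresolved) to 1.0 (fully processed).
    A higher value raises the decay rate, so the memory fades sooner.
    """
    rate = base_rate * (1.0 + boost * processed)
    return initial * math.exp(-rate * days)


# Same starting salience, same elapsed time, different processing state.
unresolved = toy_salience(1.0, days=30, processed=0.0)
resolved = toy_salience(1.0, days=30, processed=1.0)
```

With these made-up parameters, the unresolved memory keeps about 0.74 of its salience after 30 days, while the processed one drops to about 0.05.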

What LLMs does KAPEX work with?

Any LLM. Claude, GPT, Gemini, Llama, Mistral, or your own fine-tuned model. KAPEX is model-agnostic middleware — it returns a context block you paste into any system prompt.

How long does integration take?

Under a day for a senior engineer. Three lines of code in the typical integration pattern: (1) send user message to KAPEX, (2) get memory context back, (3) inject into your LLM prompt. See the integration pattern.
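The three steps can be sketched as follows. This is a minimal illustration, not the real SDK: `FakeKapexClient`, `process`, `MemoryContext`, and `build_system_prompt` are hypothetical names, and the client is a local stub so the example runs without network access.

```python
from dataclasses import dataclass


@dataclass
class MemoryContext:
    """Hypothetical shape of the response: a ready-to-inject text block."""
    text: str


class FakeKapexClient:
    """Local stand-in for a real client; actual names and signatures may differ."""

    def process(self, user_id: str, message: str) -> MemoryContext:
        # Steps 1 and 2: send the user message, get memory context back.
        return MemoryContext(text=f"[memories for {user_id}]")


def build_system_prompt(base_prompt: str, context: MemoryContext) -> str:
    # Step 3: inject the context block into the system prompt.
    return f"{base_prompt}\n\n{context.text}"


kapex = FakeKapexClient()
context = kapex.process(user_id="user-123", message="How did my interview go?")
system_prompt = build_system_prompt("You are a helpful assistant.", context)
# system_prompt is then passed to whichever LLM provider you already use.
```

Because the context block is plain text, the LLM call itself does not change; only the system prompt does.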

Is there vendor lock-in?

No. KAPEX handles memory. You keep your LLM provider, your prompts, your application logic. If you stop using KAPEX, your app still works — it just doesn't remember.

Who built this?

Sandstone Cloud, co-founded by Oliver La Roche and Jemell Sanders. Tosin Oyewole leads business development. Based in San Francisco. Get in touch at support@sandstonecloud.com.

Technical

For engineers

What's the latency impact?

The process endpoint adds roughly 800–1500ms to a typical conversation turn (this includes entity extraction and scoring inference). In most architectures it runs in parallel with your LLM call, so the user-perceived latency is dominated by the LLM. For lower-latency use cases, separate the write-time ingestion call from the read-time recall call — they're independent operations.
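One way to read the split between ingestion and recall is sketched below. The function names (`ingest`, `recall`, `call_llm`) and the sleep durations are placeholders, not KAPEX endpoints or measured timings. The sketch assumes recall is the fast read path and ingestion is the slower write path.

```python
import asyncio


async def ingest(user_id: str, message: str) -> None:
    """Hypothetical write-time call: entity extraction and scoring."""
    await asyncio.sleep(0.05)  # stands in for the ingestion round trip


async def recall(user_id: str, message: str) -> str:
    """Hypothetical read-time call: fetch already-scored memories."""
    await asyncio.sleep(0.01)
    return f"[memories for {user_id}]"


async def call_llm(system_prompt: str, message: str) -> str:
    """Placeholder for your existing LLM call."""
    await asyncio.sleep(0.05)
    return f"reply to: {message}"


async def handle_turn(user_id: str, message: str) -> str:
    # Read path first: the LLM needs the recalled context in its prompt.
    context = await recall(user_id, message)
    # Write path overlaps with the LLM call, so ingestion adds no extra wait.
    reply, _ = await asyncio.gather(
        call_llm(f"You are a helpful assistant.\n\n{context}", message),
        ingest(user_id, message),
    )
    return reply


reply = asyncio.run(handle_turn("user-123", "hello"))
```

In this arrangement the user waits for recall plus the LLM call, and the current message is available as memory from the next turn onward.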

What data do you store?

Memory nodes (topic, summary, scores, metadata), entities (names, types, relationships), and the conversation history needed for ongoing scoring. All scoped to your tenant namespace. Per-node and full-user deletion are first-class operations. On Enterprise (customer-hosted), Sandstone stores nothing — everything sits in your VPC; we only see read-only health telemetry.

Can I self-host?

Yes, on Enterprise. KAPEX deploys into your AWS, GCP, or Azure VPC, or fully on-prem. It ships as a CloudFormation template or signed Docker package. Memory data never leaves your environment. Sandstone runs the initial deployment and validation; our health-monitoring endpoint carries read-only telemetry, with no memory data, no PII, and no conversation content. See Deployment model for the full picture.

What about rate limits?

Trial: 60 req/min, 10K req/day. Build: 120 req/min, 100K req/day. Scale: 300 req/min, 1M req/day. Enterprise: custom. The SDK automatically retries rate-limited requests with exponential backoff.
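If you call the API without the SDK, a retry loop along these lines gives similar behavior. It is a generic sketch of exponential backoff with jitter, not the SDK's implementation: the `RateLimited` exception, the base delay, the cap, and the attempt count are all assumptions.

```python
import random
import time

# Per-minute and per-day limits as listed above; Enterprise is custom.
RATE_LIMITS = {
    "trial": (60, 10_000),
    "build": (120, 100_000),
    "scale": (300, 1_000_000),
}


class RateLimited(Exception):
    """Hypothetical error raised when the service signals a rate limit."""


def backoff_delay(attempt: int, base: float = 0.5, cap: float = 30.0) -> float:
    # Delay doubles each attempt (0.5s, 1s, 2s, ...) up to the cap.
    return min(cap, base * (2 ** attempt))


def call_with_retry(fn, max_attempts: int = 5, sleep=time.sleep):
    for attempt in range(max_attempts):
        try:
            return fn()
        except RateLimited:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            # Full jitter: sleep a random fraction of the computed delay.
            sleep(random.uniform(0, backoff_delay(attempt)))
```

The jitter spreads retries from many clients over time, so they do not all hit the limit again at the same instant.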

How does the safety layer work?

Fourteen independent modules run in parallel. They cover crisis detection, anti-fabrication guards, PII scrubbing, trigger management, prompt-injection detection, and governance enforcement. The safety pipeline runs identically regardless of LLM and cannot be disabled by operators. See Features → Safety layer.

How do you handle multi-tenant isolation?

Per-tenant memory graphs, scoring parameters, and decay coefficients. Strict isolation at storage, computation, and retrieval layers — there's no shared embedding pool or shared cache that could leak between tenants. Each namespace has its own audit log.

Can I use my own embedding model?

On Enterprise, yes — we can configure custom embedding endpoints. On Build and Scale, we use platform-managed embeddings to keep latency predictable.

Business

Pilot, pricing, IP

Is KAPEX patent-pending?

Yes. Patent applications have been filed covering the core architecture — the salience scoring framework, the processing-modulated decay model, and the legitimacy gap detection mechanism. Detailed claim language is shared under mutual NDA.

Can I see the scoring formulas?

Detailed technical architecture — signal weights, decay derivations, and the legitimacy gap mechanism — is shared under mutual NDA during pilot onboarding. The public documentation covers what each signal measures and how the system behaves, not the specific mathematical implementation.

What's the pilot?

30 days free. Full feature access. Up to 100 users. Direct Slack support from the founders. Our only ask: a 5-minute feedback form at Day 15 and Day 30. See Pricing.

What happens at the end of the pilot?

Three paths. (1) Upgrade to Build, Scale, or Enterprise — your data and configuration carry over with no migration. (2) Request a 30-day extension if your team needs more eval time. (3) Walk away with a per-node export and we delete your namespace within 30 days.

Do you offer non-profit, research, or academic discounts?

Yes. Verified non-profits, academic researchers, and qualifying open-source projects get Scale-tier access at the Build price. Email support@sandstonecloud.com with a brief description.

Compliance

HIPAA, GDPR, SOC 2

Is KAPEX HIPAA-compliant?

The architecture supports HIPAA requirements — per-node deletion, audit logging, encryption at rest (AES-256) and in transit (TLS 1.3), access controls. BAA is available on the Enterprise plan.

GDPR?

Yes. Per-node deletion (Article 17 — right to erasure), full user deletion, data portability via export API, and append-only audit logging. Per-node and per-user deletion are single API calls.

SOC 2?

SOC 2 Type II is on the roadmap and in progress. The current evidence package is available under NDA; availability follows the Enterprise plan timeline.

Where is data stored?

AWS us-east-1 (Virginia) by default. PostgreSQL on RDS for structured data, S3 for archival, Redis on ElastiCache for caching. EU residency (Frankfurt) is available as a paid add-on on Scale and Enterprise. See Security.

How do you handle SB 243 (California's companion chatbot law)?

The safety layer enforces AI disclosure, crisis resources surfacing, and 988 Lifeline routing for crisis-flagged conversations. These are governance-enforced — operators cannot disable them at the tenant level. See the Safety modules in Features.
