Inside the engine.
This is how Altadore processes a message. Local classification, privacy scrubbing, structured memory. Type something and watch the real pipeline work.
Real names replaced before anything reaches the cloud.
It doesn't guess. It scores.
Every message runs through a deterministic local gate before a single cloud API call is made. The gate answers 22 binary yes/no questions: 14 resolved instantly by regex and keyword matching, 8 deferred to the cloud. It classifies what you're actually asking, scores how complex it is, detects PII, and routes to the correct pipeline.
The split is structural, not dynamic: the gate handles the cheap stuff locally in milliseconds, and only the questions that need real language understanding ride along on the first cloud call.
Simple stuff — greetings, confirmations — gets pattern-matched at zero cost. No API call. No latency. Real questions get real compute.
How the gate works
The 22 questions split into two tiers. Tier 1 (14 questions) runs locally — regex, keyword matching, pattern detection. Done in under 5ms. Tier 2 (8 questions) requires language understanding and goes to the first cloud call alongside classification.
The split is fixed — always 14 local, always 8 deferred. A greeting gets pattern-matched by Tier 1 and never reaches the cloud at all. A complex question still gets its 14 local answers instantly, then sends the remaining 8 to the cloud. The system scales cloud usage to match complexity, not message length.
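The Tier 1 idea can be sketched as a fixed table of locally answerable questions. The question names and patterns below are illustrative stand-ins, not Altadore's actual question set:

```python
import re

# Hypothetical Tier 1 questions: each answered by a regex, no API call.
# The real gate runs 14 such checks; four are shown here for illustration.
TIER1_QUESTIONS = {
    "is_greeting": re.compile(r"^\s*(hi|hey|hello|good (morning|evening))\b", re.I),
    "is_confirmation": re.compile(r"^\s*(yes|no|ok|okay|sure|thanks)\b", re.I),
    "contains_date": re.compile(r"\b\d{4}-\d{2}-\d{2}\b", re.I),
    "contains_question": re.compile(r"\?"),
}

def run_tier1(message: str) -> dict:
    """Answer every local question deterministically, in one pass."""
    return {name: bool(rx.search(message)) for name, rx in TIER1_QUESTIONS.items()}

answers = run_tier1("Hey, are we still on for 2025-03-14?")
# A greeting that also carries a date and a question -- all resolved locally.
```

Because every check is a precompiled pattern, the whole tier costs microseconds and produces the same answers every time, which is what makes the gate auditable.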
Nothing leaves the building unless it has to.
Before any message reaches the cloud, a 3-layer PII scanner (word list, regex, NER) finds names, phone numbers, emails, addresses, and sensitive identifiers. Names become realistic pseudonyms — not bracket tokens. The cloud models see natural language they were trained on, not synthetic [PERSON_1] syntax. Real names never reach an API.
Real data stays here     ── sanitized text ──▸     Cloud sees only
Phil Henderson           ──────────────────▸       Michael Chen
403-555-0192             ──────────────────▸       403-555-0147
phil@altadore.ai         ──────────────────▸       [EMAIL_1]
            ◄──────────────── RESTORE ────────────────
            rehydrate pseudonyms back to real values
The token map lives in local process memory. Never serialized. Never sent to any API. The cloud generates a response using pseudonyms, then the restore pass swaps them back before the user sees it.
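The scrub-then-restore round trip can be sketched in a few lines. The pseudonym pool, map shape, and email rule below are invented for illustration; the real scanner layers a word list, regex, and NER, and handles far more identifier types:

```python
import re

# Illustrative pool of realistic stand-in names (not Altadore's actual pool).
PSEUDONYMS = iter(["Michael Chen", "Sarah Park"])

def scrub(text: str, known_names: list[str]) -> tuple[str, dict]:
    token_map = {}                       # lives only in process memory
    for name in known_names:
        pseudo = next(PSEUDONYMS)
        token_map[pseudo] = name         # pseudonym -> real value
        text = text.replace(name, pseudo)
    # Emails fall back to bracket tokens (restore for these omitted for brevity).
    text = re.sub(r"\b[\w.+-]+@[\w-]+\.\w+\b", "[EMAIL_1]", text)
    return text, token_map

def restore(response: str, token_map: dict) -> str:
    # The cloud reply mentions pseudonyms; swap them back before display.
    for pseudo, real in token_map.items():
        response = response.replace(pseudo, real)
    return response

sanitized, tmap = scrub("Email Phil Henderson at phil@altadore.ai", ["Phil Henderson"])
reply = restore("I drafted the note to Michael Chen.", tmap)
```

The key property is that `token_map` is a plain in-process dict: it is never serialized, logged, or attached to any API request.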
Every fact is scored, not stored.
Each piece of information in Altadore carries ten numerical scores — weight, depth, domain, expiry, sensitivity, confidence, urgency, valence, feedback, scope. The system doesn't search a text file. It runs vector math against a SQLite table and pulls exactly what matters.
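Score-based retrieval like this can be expressed as a single local SQL query. The schema below uses four of the ten scores and an invented weighting formula, purely to show the shape of the idea, not Altadore's actual retrieval math:

```python
import sqlite3

db = sqlite3.connect(":memory:")
# One row per fact, one column per score (four of the ten shown).
db.execute("""CREATE TABLE facts (
    text TEXT, weight REAL, confidence REAL, urgency REAL, expiry REAL)""")
db.executemany("INSERT INTO facts VALUES (?,?,?,?,?)", [
    ("prefers morning meetings", 0.9, 0.8, 0.2, 1.0),
    ("mentioned a trip once",    0.3, 0.4, 0.1, 0.2),
])

# "Vector math" as a weighted dot product over the score columns,
# computed inside SQLite so retrieval is one local query, no text search.
rows = db.execute("""
    SELECT text, (0.5*weight + 0.3*confidence + 0.2*urgency) * expiry AS score
    FROM facts ORDER BY score DESC LIMIT 5""").fetchall()
```

Because the scores are plain numeric columns, "what matters right now" is just an ORDER BY, and tuning relevance means changing coefficients, not rewriting a search index.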
Four pipelines, one gate
DEEP — Full pipeline. 3 cloud calls: classify and plan (fast model), generate the response (reasoning model), enforce voice and format (fast model). External data pulled before the model thinks. For complex, multi-domain questions.
DEEP_LITE — Same pipeline, minus the reasoning model: 2 cloud calls instead of 3. The first call both classifies and drafts the response in one pass.
QUICK — Fast cloud model. One or two calls. Quick answers, casual questions, low-stakes lookups.
SNAP — Zero API calls. Zero LLM calls. Pattern-matched responses for greetings, confirmations, one-word replies. Instant. Free.
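The routing decision above can be sketched as one function. The thresholds and the `complexity` score are invented for illustration; only the four pipeline names come from the text:

```python
def route(is_snap_pattern: bool, complexity: float) -> str:
    """Pick a pipeline from the gate's outputs (thresholds are hypothetical)."""
    if is_snap_pattern:
        return "SNAP"        # zero LLM calls: pattern-matched reply
    if complexity < 0.3:
        return "QUICK"       # fast cloud model, one or two calls
    if complexity < 0.7:
        return "DEEP_LITE"   # classify + draft in one pass, 2 calls
    return "DEEP"            # plan, reason, enforce voice: 3 calls
```

The point of the shape: SNAP is checked first, so a matched greeting short-circuits before any complexity scoring matters, which is how simple messages stay free.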
What's inside
The engine is modular. Each piece does one thing. Green border means zero API cost — pure logic, math, and local ops. Amber means cloud model calls.