Documentation

Overview

mnemur is an AI-native system for people and organizations: AI that knows everything you teach it and never shares it in the wrong room. The homepage tells that story; this page is the architecture — the layers of the stack, what each one does, and where every component sits. If you read nothing else, read this page — it defines the vocabulary every reference uses.

Three terms carry most of the weight. A context is one room of your life or business — work, personal, each client — declared as a plain config file. The canon is the assembled set of values, priorities, and rules your AI loads at the start of every session. The firewall is each context's declared list of things that must never leave that room, checked on every call.

The architecture

The stack has three working layers (experience on top, trust in the middle, the brain underneath), with five supporting libraries around them and your own model providers and storage at the bottom. Every model call enters at the top and is governed on the way down and again on the way back up.

Experience layer

Apps & clients

Anything that already speaks the model APIs — chat clients, IDEs, agents, your internal tools. No SDK required. Consumer app: in design. See it: consumer app demo · business console demo.

↓ every model call ↓

Trust layer

m-gateway

The self-hosted checkpoint on every call: resolve the context, firewall-check the request, check the budget, forward with your key, firewall-check the response — mid-stream included — and write the audit record.

↓ enforces what the brain declares ↓

Brain layer

m-core

What your AI knows, obeys, and remembers: contexts as config · canon assembly · attestation · disciplined memory · model routing — plain files you own.

↓ operated and measured by ↓

Supporting libraries

m-evals: did a change actually make the system better?
m-loops: unattended operation (locks, heartbeats, backoff)
m-telemetry: corrections, cost & cross-system comparison
m-agent-mq: file-backed messaging between local agents
m-voice: local-only voice-note transcription

↓

Your model providers & storage

Bring your own API keys (Anthropic, OpenAI-compatible, or local models); audit, memory, and config land on disks you control. Your keys, your data plane: prompts and memory never transit anyone else's servers.

Contexts are rooms — the same hard walls a clinician keeps between patients, a firm keeps between matters.

The division of labor is deliberate. m-core defines what your AI knows, obeys, and remembers; m-gateway is the checkpoint that enforces it on every call. The core is source-visible so the guardrails can be verified, not merely promised. The gateway is the product. Full references: m-core library API · m-gateway REST API.
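The order of checks on a single call can be sketched in a few lines of Python. This is an illustration of the sequence described above, not m-gateway's actual code: the class names, the substring-based firewall match, the `est_tokens` argument, and the verdict strings are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Context:
    name: str
    firewall: List[str]           # things that must never leave this room
    daily_token_budget: int
    tokens_used: int = 0


@dataclass
class Gateway:
    contexts: Dict[str, Context]
    forward: Callable[[str], str]  # stands in for the provider call made with your key
    audit: List[dict] = field(default_factory=list)

    @staticmethod
    def _violates(ctx: Context, text: str) -> bool:
        # Hypothetical matcher: case-insensitive substring search.
        lowered = text.lower()
        return any(term.lower() in lowered for term in ctx.firewall)

    def handle(self, context_name: str, prompt: str, est_tokens: int) -> Optional[str]:
        ctx = self.contexts[context_name]                  # 1. resolve the context
        verdict, reply = "ok", None
        if self._violates(ctx, prompt):                    # 2. firewall-check the request
            verdict = "request_blocked"
        elif ctx.tokens_used + est_tokens > ctx.daily_token_budget:
            verdict = "budget_exceeded"                    # 3. check the budget
        else:
            ctx.tokens_used += est_tokens
            reply = self.forward(prompt)                   # 4. forward to the provider
            if self._violates(ctx, reply):                 # 5. firewall-check the response
                verdict, reply = "response_blocked", None
        # 6. audit record: metadata only, never the prompt or the reply
        self.audit.append({"context": ctx.name, "est_tokens": est_tokens, "verdict": verdict})
        return reply
```

A refused call returns `None` and still produces an audit entry, so the record covers every call rather than only the successful ones.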

Connectors

Every connector rides the gateway pipeline: the same firewall, budget, and audit checks as any model call, with no side door around them. Each grant carries a direction grade (read, draft-only, or act) that is a hard cap enforced at the gateway, not a convention.

Tier 1 — how people talk. WhatsApp, SMS, email, social DMs.
Tier 2 — productivity. Google Calendar, Google Docs/Drive, Gmail, Asana, Slack, Teams, Notion.
Tier 3 — the web. Playwright-driven web actions, news sources, maps & places.
Tier 4 — sensitive, later and gated. Health data and banking — sequenced after the rest, gated like the packages they serve.
B2B domain connectors. EHR/FHIR, legal document management (DMS), CRM, market data, GitHub, Linear, CI.

Ecosystem MCP servers curated behind the gateway; consumer-messaging connectors built by us. Status: design / curation — see the business console demo for the operator view.
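One way to picture a direction grade as a hard cap is an ordered enum, where a connector action is allowed only if it does not exceed the grant. The ordering read < draft-only < act, and the names `Grade`, `Grant`, and `permits`, are assumptions for this sketch; the connectors themselves are still in design.

```python
from dataclasses import dataclass
from enum import IntEnum


class Grade(IntEnum):
    # Assumed ordering: each grade includes the ones below it.
    READ = 1
    DRAFT_ONLY = 2
    ACT = 3


@dataclass(frozen=True)
class Grant:
    connector: str   # hypothetical connector name, e.g. "gmail"
    grade: Grade     # the cap held at the gateway

    def permits(self, requested: Grade) -> bool:
        # A request passes only if it is at or below the granted grade.
        return requested <= self.grade


gmail = Grant("gmail", Grade.DRAFT_ONLY)
can_read = gmail.permits(Grade.READ)    # reading is within the cap
can_send = gmail.permits(Grade.ACT)     # sending exceeds a draft-only grant
```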

Where your data goes

“You route to frontier APIs — so can you see my data?” The first question a serious buyer asks. The straight answer: it lives in your perimeter, either way. Run it yourself, or let us run it for you inside your boundary — in both cases the gateway, the rooms, the policy, and the audit hold your data and learning, and you own them. Only the governed, minimized model call ever leaves.

Who runs the gateway? You, or us in your boundary. Self-host on-prem or in your VPC, or have us run it fully managed inside your perimeter. There is no shared mnemur server your data flows through.
Where does the learning live? In your system, owned by you. Context, policies, learning store, and audit log stay inside your walls. Even when we host it, it’s yours — the same thing that makes it portable.
What leaves, and to whom? Only the governed model call — the policy-allowed request, to your provider under your key. Minimized and bounded, not the raw context, keys, or audit. Logged on the way out.

Honest about where this is: running in your own perimeter is real and self-hostable today, whether you self-host or we manage it. That covers the gateway, the per-context firewall, policy, your keys, data that never leaves, and a tamper-evident audit log (hash-chained so edits are detectable; not tamper-proof). Owned, cross-context, cross-provider portable learning is the direction we’re building: a firm commitment, stated as such, not a finished feature.
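To make "hash-chained, edits detectable" concrete, here is a minimal sketch of a hash chain over audit records. It shows the general technique only; the record fields, the SHA-256 choice, and the function names are assumptions, not the gateway's actual audit format.

```python
import hashlib
import json
from typing import List, Optional

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(prev_hash: str, record: dict) -> str:
    # Canonical JSON so the same record always hashes the same way.
    payload = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append(log: List[dict], record: dict) -> None:
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({"record": record, "prev": prev_hash, "hash": _digest(prev_hash, record)})


def verify(log: List[dict]) -> Optional[int]:
    """Return the index of the first entry that fails the chain, or None if intact."""
    prev_hash = GENESIS
    for i, entry in enumerate(log):
        if entry["prev"] != prev_hash or entry["hash"] != _digest(prev_hash, entry["record"]):
            return i
        prev_hash = entry["hash"]
    return None


log: List[dict] = []
append(log, {"context": "client-a", "model": "example-model", "tokens": 120, "verdict": "ok"})
append(log, {"context": "client-a", "model": "example-model", "tokens": 80, "verdict": "blocked"})
intact = verify(log)            # chain verifies end to end
log[0]["record"]["tokens"] = 1  # an in-place edit...
broken_at = verify(log)         # ...is detected at the edited entry
```

This is why the wording is tamper-evident rather than tamper-proof: an edit changes the hash and breaks the chain, but someone able to rewrite every later entry could rebuild a consistent chain unless the latest hash is also recorded somewhere they cannot reach.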

mnemur inside Claude

The experience layer is not only standalone apps — the same mnemur experience also runs inside Claude chat, Cowork, and Claude Code, delivered as a downloadable bundle of three parts that work together:

A skill — the brain side: canon, values, priorities, and format rules that shape how Claude behaves in every session.
An MCP server — the hands: memory, contexts, budgets, audit, and gateway-routed model calls.
Your canon documents — the values, priorities, and rules the other two parts load and enforce; plain files you own.

The skill makes Claude yours; the MCP server keeps it governed; the canon is what both answer to.

Status: in design — the same pattern our own system runs on today.

m-gateway — the trust layer

A self-hosted server that sits between any AI application and the model providers. It speaks the same API your tools already use, so adopting it is pointing them at a different address — and from that moment, every call is governed.

Per-call enforcement. Each request resolves its context, gets firewall-scanned before the model sees it and again before the answer reaches the app — including mid-stream, where a violating streamed response is cut before the offending words leave the building.
Budgets. Daily token budgets per context; exceeded means a clean refusal, not a surprise invoice.
Your keys, your data plane. Bring-your-own API keys; the gateway runs in your infrastructure; prompts and memory never transit anyone else's servers.
Audit, always. An append-only record of every call — context, model, token counts, firewall verdicts, latency — that never logs message content or keys.
Deployment. One container; Docker Compose quickstart; works standalone or with m-core installed for the full brain.
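Cutting a streamed response mid-stream has one subtlety: a forbidden phrase can be split across two chunks. A common way to handle that is to hold back the last few characters of the stream until the next chunk arrives. The sketch below shows that technique under simple assumptions (plain case-insensitive substring matching, text chunks); `guarded_stream` and `FirewallViolation` are hypothetical names, not the gateway's API.

```python
from typing import Iterable, Iterator, List


class FirewallViolation(Exception):
    """Raised when the stream is cut because a forbidden term appeared."""


def guarded_stream(chunks: Iterable[str], forbidden: List[str]) -> Iterator[str]:
    terms = [t.lower() for t in forbidden if t]
    # Hold back enough characters that a term straddling two chunks
    # is always seen whole before any part of it is released.
    holdback = max((len(t) for t in terms), default=1) - 1
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        lowered = buffer.lower()
        if any(t in lowered for t in terms):
            raise FirewallViolation("stream cut before the violating text was released")
        release = len(buffer) - holdback
        if release > 0:
            yield buffer[:release]      # no forbidden term can start in this part
            buffer = buffer[release:]
    if buffer:
        yield buffer                    # stream ended cleanly; flush the tail


def collect(chunks: Iterable[str], forbidden: List[str]) -> str:
    """Join whatever was released before a cut (or the whole stream if clean)."""
    released: List[str] = []
    try:
        for piece in guarded_stream(chunks, forbidden):
            released.append(piece)
    except FirewallViolation:
        pass
    return "".join(released)
```

The cost of the holdback is a small delay: the client always trails the provider by at most the length of the longest forbidden term, minus one character.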

Status: working code · 48 tests · CI green · BUSL-1.1 · repository private until public launch. Full reference →

m-core — the brain engine

Everything about how a person's (or an organization's) AI is configured lives here, as plain files you own.

Contexts as config. Each context is one file declaring its register (voice and tone), default role, allowed tools, and firewall — the list of things that must never leave that room. Hard-isolated contexts block; everything else warns.
Canon assembly. Your values, priorities, and rules are markdown files assembled in strict hierarchy order into the canon the AI loads at the start of every session — with a byte budget, and a degraded mode that announces itself when something is missing instead of silently running lawless.
Memory with discipline. One fact per file, typed and write-gated; per-context indexes plus a shared bus that respects isolation (a client context never receives another's facts); byte budgets that gracefully archive instead of growing forever.
Model routing. Classifies each task's shape and picks the cheapest model tier that can actually do it — suggestion-only by default, with explicit opt-in for active routing and a kill switch.
Setup wizard. m init seeds a new brain on a neutral safety floor — your own values and rules on top of non-negotiable do-no-harm ground rules. Onboarding is two-sided: a personal wizard and an organization wizard (m init --org), and individuals join an org with m join, inheriting the org canon by reference — org → team → personal.
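Canon assembly, as described in the list above, combines ordered markdown parts under a byte budget and announces when it is running degraded. The sketch below is one possible reading of that behavior, not m-core's implementation: the part names, the notice wording, and the choice to skip (rather than truncate) a part that would exceed the budget are all assumptions.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Canon:
    text: str
    missing: List[str] = field(default_factory=list)  # parts that could not be loaded
    dropped: List[str] = field(default_factory=list)  # parts skipped for the byte budget

    @property
    def degraded(self) -> bool:
        return bool(self.missing or self.dropped)


def assemble(parts: List[Tuple[str, Optional[str]]], byte_budget: int) -> Canon:
    """parts: (name, markdown or None if the file is missing), highest priority first."""
    kept: List[str] = []
    missing: List[str] = []
    dropped: List[str] = []
    used = 0
    for name, body in parts:
        if body is None:
            missing.append(name)
            continue
        size = len(body.encode("utf-8")) + 1  # +1 for the joining newline
        if used + size > byte_budget:
            dropped.append(name)              # assumed policy: skip, do not truncate
            continue
        kept.append(body)
        used += size
    text = "\n".join(kept)
    if missing or dropped:
        # Degraded mode announces itself at the top of what the model loads.
        # The notice is not counted against the budget in this sketch.
        notice = "[DEGRADED CANON] missing: %s; over budget: %s" % (
            ", ".join(missing) or "none",
            ", ".join(dropped) or "none",
        )
        text = notice + "\n" + text
    return Canon(text, missing, dropped)
```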

Status: working code · 244 tests · CI green · BUSL-1.1 · repository private until public launch. Full reference →

The component family

Two products carry the promise; five focused libraries carry the operations around it. Each library is a standalone Python package — stdlib-only at runtime (m-evals adds PyYAML) — usable with or without the rest of mnemur.

m-core — the brain engine: contexts, firewall, canon, lints, attestation, memory, routing, and the m CLI.
m-gateway — the trust layer: the self-hosted checkpoint every model call passes through.
m-evals — the eval harness: did a configuration change actually make the system better? Position-debiased judging, no false zeros.
m-loops — the loop engine for unattended operation: singleton locks, dual-axis heartbeats, per-step isolation, backoff, kill-switch.
m-telemetry — the measurement layer: corrections per session, cost, and cross-system comparison — content-free by construction.
m-agent-mq — the brokerless file-backed message queue agents on one machine use to talk to each other.
m-voice — privacy-first voice-note ingestion: local-only transcription, never opens a network connection.
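For a sense of how a brokerless, file-backed queue can work on one machine, here is a minimal stdlib-only sketch. It illustrates the general technique (write to a temp file, publish with an atomic rename, claim with another rename) and is not m-agent-mq's actual design; the file naming scheme and the `send` and `receive` functions are assumptions.

```python
import json
import os
import tempfile
import time
import uuid
from pathlib import Path
from typing import Optional


def send(queue_dir: Path, message: dict) -> Path:
    queue_dir.mkdir(parents=True, exist_ok=True)
    # Write to a temp file first so readers never see a half-written message.
    fd, tmp = tempfile.mkstemp(dir=queue_dir, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        json.dump(message, fh)
    final = queue_dir / f"{time.time_ns():020d}-{uuid.uuid4().hex}.json"
    os.replace(tmp, final)  # atomic publish on the same filesystem
    return final


def receive(queue_dir: Path) -> Optional[dict]:
    if not queue_dir.is_dir():
        return None
    for path in sorted(queue_dir.glob("*.json")):  # roughly oldest first
        claimed = path.with_suffix(".claimed")
        try:
            os.rename(path, claimed)  # only one consumer can win this rename
        except FileNotFoundError:
            continue                  # another consumer claimed it first
        try:
            return json.loads(claimed.read_text(encoding="utf-8"))
        finally:
            claimed.unlink()          # sketch only: no retry if processing fails later
    return None
```

A usage example: two agents share a directory, one calls `send(Path("/tmp/agents/inbox"), {"task": "summarize"})` and the other polls `receive` on the same path until it returns a message instead of `None`.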

The full references — m-core and m-gateway — mirror docs/API.md in each repository, derived from the source at 0.1.0. Questions? Request early access.