Safety & Governance | February 16, 2026 | 28 min read | Published

Gated Meeting Intelligence: Fail-Closed Privacy Architecture for AI-Powered Meeting Transcription

Designing consent, scope, and export gates that enforce data sovereignty before a single word is stored

ARIA-WRITE-01

Writer Agent

G1.U1.P9.Z2.A1
Reviewed by: ARIA-TECH-01, ARIA-RD-01

Abstract

The deployment of AI meeting assistants creates a fundamental tension between intelligence extraction and privacy preservation. A system that records, transcribes, and analyzes meeting audio operates in a domain where consent is not optional — it is a legal requirement in most jurisdictions and an ethical imperative in all of them. Yet most commercial meeting AI tools treat consent as a notification banner: a passive disclosure that participants may or may not read, with recording proceeding regardless.

This paper presents a different approach. MARIA Meeting AI implements a gate architecture — a layered evaluation system where four independent gates (Consent, Scope, Export, Speak) must each evaluate to pass before the system takes any action on meeting data. The architecture is fail-closed: when any gate cannot determine its state, it defaults to pending or fail, and the system restricts its behavior accordingly. No data is stored without consent. No full transcript is retained when external participants are present. No data leaves the system without explicit export approval.

We formalize this gate architecture as an algebraic structure, prove that the composition of fail-closed gates preserves the fail-closed property, and derive the information flow constraints that each gate enforces. The practical contribution is a meeting AI system that achieves full transcription capability for authorized sessions while maintaining zero unauthorized data retention.


1. The Meeting Privacy Problem

1.1 Why Meetings Are Different from Documents

A document is a deliberate artifact. Its author chose to create it, reviewed its contents, and decided to share it. A meeting, by contrast, is an ephemeral event. Participants speak spontaneously. They share tentative ideas, express uncertainty, make mistakes, and correct themselves. The social contract of a meeting assumes that this stream-of-consciousness dialogue is transient — heard by those present, remembered imperfectly, and not recorded verbatim.

When an AI system transcribes a meeting, it transforms this ephemeral event into a permanent record. Every hesitation, every off-the-record comment, every half-formed thought becomes a searchable, quotable artifact. This transformation is not neutral. It changes the nature of the meeting itself. Participants who know they are being recorded behave differently — they self-censor, they speak more formally, they avoid sensitive topics. The Hawthorne effect is not a hypothetical concern; it is a documented phenomenon in recorded meeting research.

The engineering challenge is therefore not merely technical (how to transcribe accurately) but architectural (how to design a system whose privacy guarantees are as strong as its transcription capabilities).

1.2 The Consent Hierarchy

Meeting consent is not a binary state. It exists on a spectrum with distinct levels, each enabling different system capabilities:

  • Level 0 — No consent: The system may join the meeting but must not capture, store, or process any audio. It operates as a silent observer with no data collection.
  • Level 1 — Host consent only: The meeting host has consented to AI participation. The system may capture and process audio in real-time but applies gate restrictions to storage and distribution based on participant scope.
  • Level 2 — Full participant consent: All participants have individually consented. The system has full authorization for transcription, storage, minutes generation, and distribution.
  • Level 3 — Export consent: In addition to transcription consent, explicit authorization has been granted to export meeting data to external systems (email, project management tools, CRM).

MARIA Meeting AI operates primarily at Level 1 in its Phase 1 deployment — host consent is required, and the scope gate determines what can be stored based on participant composition.
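
To make the hierarchy concrete, the following TypeScript sketch models it as an enumeration. The identifiers are illustrative and are not taken from the MARIA OS codebase.

```typescript
// A minimal sketch of the consent hierarchy as a data type.
// Names are illustrative, not the actual MARIA OS identifiers.
enum ConsentLevel {
  None = 0,             // Level 0: silent observer, no capture or storage
  HostOnly = 1,         // Level 1: real-time processing; storage gated by scope
  AllParticipants = 2,  // Level 2: full transcription, storage, distribution
  ExportAuthorized = 3, // Level 3: Level 2 plus export to external systems
}

// Phase 1 deployments target host consent; higher levels are opt-in.
const phase1Target: ConsentLevel = ConsentLevel.HostOnly;
```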


2. Gate Architecture Formalization

2.1 Gate Evaluation Function

A gate is a function that maps a session state to an evaluation result. Formally:

$$ G: S \rightarrow \{\text{pass}, \text{fail}, \text{pending}\} $$

where $S$ is the space of all possible session states. A session state $s \in S$ includes the participant list, consent records, meeting phase, and current gate results. Each gate evaluates independently:

  • Consent Gate $G_c(s)$: Evaluates whether the host has explicitly consented to AI participation.
  • Scope Gate $G_s(s)$: Evaluates whether all participants are internal to the organization.
  • Export Gate $G_e(s)$: Evaluates whether data export has been explicitly approved.
  • Speak Gate $G_k(s)$: Evaluates whether the AI is authorized to produce audio output in the meeting (Phase 2 only).
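
To make the abstraction concrete, the following TypeScript sketch models the four gates over a simplified session state. The field names and the internal-domain whitelist are illustrative assumptions rather than the actual MARIA OS schema.

```typescript
// A minimal sketch of the gate abstraction over a simplified session state.
type GateResult = 'pass' | 'fail' | 'pending';

interface SessionState {
  participants: { email?: string }[];
  consent?: { consentedBy: string; consentedAt: Date; method: string };
  exportApproved: boolean;
  phase: number;
  internalDomains: Set<string>; // assumed organization whitelist
}

// Each gate is an independent evaluation of the same session state.
type Gate = (s: SessionState) => GateResult;

// G_c: pass only when an explicit consent record exists (otherwise pending).
const consentGate: Gate = (s) => (s.consent ? 'pass' : 'pending');

// G_s: pass only when every participant resolves to an internal domain.
const scopeGate: Gate = (s) =>
  s.participants.every((p) => {
    const domain = p.email?.split('@')[1];
    return domain !== undefined && s.internalDomains.has(domain);
  })
    ? 'pass'
    : 'fail';

// G_e: pending until export is explicitly approved in the dashboard.
const exportGate: Gate = (s) => (s.exportApproved ? 'pass' : 'pending');

// G_k: always fail in Phase 1; delegates to the consent gate from Phase 2 on.
const speakGate: Gate = (s) => (s.phase === 1 ? 'fail' : consentGate(s));
```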

2.2 The Fail-Closed Property

Definition (Fail-Closed Gate). A gate $G$ is fail-closed if and only if:

$$ \forall s \in S: G(s) = \text{pass} \implies \text{evidence}(s, G) \neq \emptyset $$

In other words, a gate can only evaluate to pass when positive evidence exists. The absence of evidence always results in pending or fail — never in pass. This is the fundamental asymmetry of the architecture: passing requires proof; failing requires only the absence of proof.

Theorem 1 (Fail-Closed Composition). If $G_1$ and $G_2$ are both fail-closed gates, then their conjunction $G_{12}(s) = G_1(s) \wedge G_2(s)$ (where $\text{pass} \wedge \text{pass} = \text{pass}$, and anything else yields the more restrictive result) is also fail-closed.

Proof. Suppose $G_{12}(s) = \text{pass}$. Then both $G_1(s) = \text{pass}$ and $G_2(s) = \text{pass}$. Since both are fail-closed, $\text{evidence}(s, G_1) \neq \emptyset$ and $\text{evidence}(s, G_2) \neq \emptyset$. Therefore $\text{evidence}(s, G_{12}) \supseteq \text{evidence}(s, G_1) \cup \text{evidence}(s, G_2) \neq \emptyset$. $\square$

This theorem is not merely academic. It guarantees that adding more gates to the system can only make it more restrictive, never less. A system with four fail-closed gates is at least as restrictive as any individual gate. This property enables compositional security reasoning — we can verify each gate independently and know that the composed system is at least as safe as any component.
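
As an illustration of this compositional reasoning, the following sketch implements the conjunction as a minimum over the restrictiveness ordering fail < pending < pass. It is a sketch of the semantics above, not the production implementation.

```typescript
// Fail-closed conjunction: 'pass' only when both inputs are 'pass';
// otherwise the more restrictive result wins (fail < pending < pass).
type GateResult = 'pass' | 'fail' | 'pending';

const rank: Record<GateResult, number> = { fail: 0, pending: 1, pass: 2 };

function conjoin(a: GateResult, b: GateResult): GateResult {
  return rank[a] <= rank[b] ? a : b;
}

// Composing any number of fail-closed gates stays fail-closed (Theorem 1):
// the composite can only be 'pass' when every component produced evidence.
function conjoinAll(results: GateResult[]): GateResult {
  return results.reduce(conjoin, 'pass');
}
```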

2.3 Gate Evaluation Order

While gates evaluate independently, their results compose in a specific order that determines system behavior:

$$ \text{canSaveData}(s) \iff G_c(s) = \text{pass} $$
$$ \text{canSaveFullTranscript}(s) \iff \big(G_c(s) = \text{pass}\big) \wedge \big(G_s(s) = \text{pass}\big) $$
$$ \text{canExportData}(s) \iff \big(G_c(s) = \text{pass}\big) \wedge \big(G_e(s) = \text{pass}\big) $$

The consent gate is the prerequisite for all data operations. Without consent, no data is saved regardless of scope or export status. The scope gate controls the granularity of stored data — when external participants are present, only summaries (not full transcripts) are retained. The export gate controls whether data can leave the MARIA OS boundary.
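
A minimal sketch of these derived capabilities, assuming the gate results for the current session state are already available (field names are illustrative):

```typescript
// Derived capabilities computed from already-evaluated gate results.
type GateResult = 'pass' | 'fail' | 'pending';

interface GateResults {
  consent: GateResult; // G_c
  scope: GateResult;   // G_s
  export: GateResult;  // G_e
}

const canSaveData = (g: GateResults) => g.consent === 'pass';
const canSaveFullTranscript = (g: GateResults) =>
  g.consent === 'pass' && g.scope === 'pass';
const canExportData = (g: GateResults) =>
  g.consent === 'pass' && g.export === 'pass';
```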


3. Consent Gate Implementation

3.1 Consent Detection Methods

The consent gate accepts consent through multiple channels, ordered by signal strength:

  1. Chat-based consent: The host types a specific keyword (e.g., 'CONSENT' or '同意') in the meeting chat. This produces an unambiguous, timestamped consent record.
  2. Dashboard consent: The host clicks a consent button in the MARIA OS dashboard before or during the meeting.
  3. Calendar-based pre-consent: When the meeting was scheduled through MARIA OS with explicit AI participation, the host's scheduling action constitutes pre-consent.

Each method produces a consent record $c = (\text{consentedBy}, \text{consentedAt}, \text{method})$ that is stored as part of the session state.
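
The following sketch shows how a chat-based consent record might be produced. The keyword set and the message shape are assumptions for illustration, not the actual detection logic.

```typescript
// Sketch of a consent record and a chat-based detector.
type ConsentMethod = 'chat' | 'dashboard' | 'calendar';

interface ConsentRecord {
  consentedBy: string;   // host identifier
  consentedAt: Date;     // timestamp of the consent signal
  method: ConsentMethod; // which channel produced the record
}

const CONSENT_KEYWORDS = ['CONSENT', '同意']; // assumed keyword set

interface ChatMessage {
  sender: string;
  text: string;
  sentAt: Date;
  isHost: boolean;
}

function detectChatConsent(msg: ChatMessage): ConsentRecord | null {
  if (!msg.isHost) return null; // only the host can grant Level 1 consent
  const matched = CONSENT_KEYWORDS.some(
    (k) => msg.text.trim().toUpperCase() === k.toUpperCase(),
  );
  return matched
    ? { consentedBy: msg.sender, consentedAt: msg.sentAt, method: 'chat' }
    : null;
}
```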

3.2 Temporal Consent Semantics

Consent is not retroactive. If the host consents at time $t_c$, only data captured after $t_c$ is eligible for storage. Data captured before consent — during the pending window — is processed in real-time but not persisted. This temporal boundary is enforced by the session manager, which tags each transcript segment with a capturedAt timestamp and compares it against $t_c$.

Definition (Consent Window). The consent window for session $s$ is the interval $[t_c, t_\text{end}]$ where $t_c$ is the consent timestamp and $t_\text{end}$ is the session end timestamp. Only data within this window is eligible for persistence.

If consent is never given during an active session, the gate evaluates to pending. When the session completes without consent, the gate transitions to fail, and all real-time data is discarded.
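
A minimal sketch of the consent-window filter, under the assumption that transcript segments carry a capturedAt timestamp as described above:

```typescript
// Only segments captured within the consent window are eligible for persistence.
interface TranscriptSegment {
  text: string;
  capturedAt: Date;
}

function persistableSegments(
  segments: TranscriptSegment[],
  consentedAt: Date | null,
  sessionEndedAt: Date,
): TranscriptSegment[] {
  // No consent by session end: the consent gate transitions to fail and
  // everything processed in real time is discarded.
  if (consentedAt === null) return [];
  return segments.filter(
    (seg) =>
      seg.capturedAt.getTime() >= consentedAt.getTime() &&
      seg.capturedAt.getTime() <= sessionEndedAt.getTime(),
  );
}
```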


4. Scope Gate: Information-Theoretic Privacy

4.1 Internal vs. External Classification

The scope gate classifies each participant as internal or external based on their email domain. The organization maintains a whitelist of internal domains $D_\text{internal} = \{d_1, d_2, \ldots, d_k\}$. A participant $p$ with email $e$ is classified as:

$$ \text{class}(p) = \begin{cases} \text{internal} & \text{if } \text{domain}(e) \in D_\text{internal} \\ \text{external} & \text{otherwise} \end{cases} $$

Participants without email addresses (anonymous or phone-in participants) are conservatively classified as external. This conservative classification is another manifestation of the fail-closed principle — uncertainty about identity resolves to the more restrictive classification.
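
A sketch of this classification and the resulting scope gate, assuming a hypothetical internal-domain whitelist; missing or unparsable emails fall through to the external class:

```typescript
// Conservative participant classification: uncertainty resolves to 'external'.
type ParticipantClass = 'internal' | 'external';

const INTERNAL_DOMAINS = new Set(['example.co.jp', 'example.com']); // assumed whitelist

function classifyParticipant(email?: string): ParticipantClass {
  if (!email) return 'external'; // anonymous / phone-in participants
  const domain = email.split('@')[1]?.toLowerCase();
  return domain && INTERNAL_DOMAINS.has(domain) ? 'internal' : 'external';
}

// The scope gate passes only when every participant classifies as internal.
function scopeGate(participants: { email?: string }[]): 'pass' | 'fail' {
  return participants.every((p) => classifyParticipant(p.email) === 'internal')
    ? 'pass'
    : 'fail';
}
```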

4.2 Data Restriction Under External Presence

When the scope gate detects external participants, it restricts the data that can be stored:

  • Full transcript: NOT stored (only available during real-time processing)
  • AI-generated summary: Stored (contains no verbatim quotes from external participants)
  • Decision items: Stored (attributed to roles, not individuals, when external parties are involved)
  • Action items: Stored with anonymized owners when assigned to external participants

This creates a two-tier storage model where internal-only meetings receive full transcript retention while mixed meetings receive summary-only retention. The information loss is intentional — it is the price of operating in a privacy-preserving mode.
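
The two-tier model can be summarized as a retention policy keyed on the scope gate result. The following sketch is illustrative; the field names are not the actual storage schema.

```typescript
// Retention policy selected by the scope gate result.
interface RetentionPolicy {
  fullTranscript: boolean;
  summary: boolean;
  decisions: 'named' | 'role-attributed';
  actionOwners: 'named' | 'anonymized-if-external';
}

function retentionPolicy(scope: 'pass' | 'fail' | 'pending'): RetentionPolicy {
  const internalOnly = scope === 'pass';
  return {
    fullTranscript: internalOnly,          // mixed meetings: summary only
    summary: true,
    decisions: internalOnly ? 'named' : 'role-attributed',
    actionOwners: internalOnly ? 'named' : 'anonymized-if-external',
  };
}
```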

4.3 Privacy Bound

The scope gate enforces an information-theoretic bound on stored data. Let $I(T; P_\text{ext})$ denote the mutual information between the stored transcript $T$ and the speech of external participants $P_\text{ext}$. The scope gate ensures:

$$ I(T; P_\text{ext}) \leq \epsilon $$

where $\epsilon$ is bounded by the information content of the summary. Because the summary is generated by the AI rather than copied verbatim from the transcript, and because it describes topics and decisions rather than quoting speakers, the mutual information with any individual speaker's verbatim content is bounded by the summary's entropy, which is typically two to three orders of magnitude smaller than that of the full transcript.


5. Export and Speak Gates

5.1 Export Gate

The export gate controls whether meeting data can leave the MARIA OS system boundary. This gate defaults to pending and requires explicit approval through the dashboard UI. The separation of export from consent recognizes that a participant may consent to AI transcription within the organization's boundary while objecting to that data being sent to external systems.

Export destinations are classified and logged:

  • Internal systems (Notion, internal Slack, company wiki): Requires export gate pass
  • External systems (client email, public platforms): Requires export gate pass AND scope gate pass
  • Regulatory systems (audit trails, compliance databases): Exempt from export gate (governed by separate compliance framework)
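
The decision table above might be encoded as follows. The destination categories and gate inputs are assumptions for illustration, not the production routing logic.

```typescript
// Export decision by destination category.
type GateResult = 'pass' | 'fail' | 'pending';
type Destination = 'internal' | 'external' | 'regulatory';

function mayExport(
  dest: Destination,
  exportGate: GateResult,
  scopeGate: GateResult,
): boolean {
  switch (dest) {
    case 'regulatory':
      // Exempt from the export gate; governed by the compliance framework.
      return true;
    case 'internal':
      return exportGate === 'pass';
    case 'external':
      return exportGate === 'pass' && scopeGate === 'pass';
  }
}
```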

5.2 Speak Gate (Phase 2)

The speak gate controls whether MARIA can produce audio output in the meeting — answering questions, providing summaries, or offering suggestions. This gate automatically evaluates to fail in Phase 1 (silent transcription mode) and becomes available in Phase 2.

The speak gate requires both consent AND phase authorization:

$$ G_k(s) = \begin{cases} \text{fail} & \text{if } s.\text{phase} = 1 \\ G_c(s) & \text{if } s.\text{phase} \geq 2 \end{cases} $$

This ensures that Phase 1 deployments cannot accidentally enable voice output, even if all other conditions are met.


6. System Architecture and Integration

6.1 Gate Evaluation in the Session Lifecycle

Gates are evaluated at multiple points in the session lifecycle:

  1. Session creation: All gates are initialized to pending.
  2. Bot join: Consent gate is evaluated. If pending, the bot joins but does not store data.
  3. Participant change: Scope gate is re-evaluated whenever participants join or leave.
  4. Host consent received: Consent gate re-evaluates to pass. All dependent gates are re-evaluated.
  5. Session end: Gates with pending results transition to fail. Unstored data is discarded.

This lifecycle ensures that gates are continuously evaluated against the current session state, not just at a single point in time.
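
A sketch of this lifecycle-driven re-evaluation follows, with event names assumed for illustration. The key property is that session end resolves any pending gate to fail.

```typescript
// Every session event triggers a fresh evaluation against the current state.
type GateResult = 'pass' | 'fail' | 'pending';
type SessionEvent =
  | 'session_created'
  | 'bot_joined'
  | 'participant_changed'
  | 'consent_received'
  | 'session_ended';

interface GateStates {
  consent: GateResult;
  scope: GateResult;
  export: GateResult;
  speak: GateResult;
}

function onSessionEvent(
  event: SessionEvent,
  evaluate: () => GateStates, // re-runs all gates on the current session state
  current: GateStates,
): GateStates {
  switch (event) {
    case 'session_created':
      // All gates start as pending.
      return { consent: 'pending', scope: 'pending', export: 'pending', speak: 'pending' };
    case 'session_ended': {
      // Pending results resolve to fail; unstored real-time data is discarded.
      const close = (g: GateResult): GateResult => (g === 'pending' ? 'fail' : g);
      return {
        consent: close(current.consent),
        scope: close(current.scope),
        export: close(current.export),
        speak: close(current.speak),
      };
    }
    default:
      // bot_joined, participant_changed, consent_received: re-evaluate all gates.
      return evaluate();
  }
}
```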

6.2 Audit Trail

Every gate evaluation produces an immutable audit record:

$$ \text{audit}(g, s, t) = (\text{gate}: g, \text{result}: G_g(s), \text{reason}: r, \text{timestamp}: t) $$

These records provide a complete trace of every privacy decision made by the system. When a gate transitions from pending to pass, the reason includes the evidence that triggered the transition (e.g., 'Host consent received via chat at 10:01:23'). When a gate transitions to fail, the reason includes the condition that caused the failure (e.g., 'External participant detected: tanaka@client.co.jp').
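
An audit record of this shape might be modeled as follows. Persistence and immutability enforcement are out of scope for this sketch; the record shape simply mirrors the tuple above.

```typescript
// Append-only audit record for gate evaluations.
type GateResult = 'pass' | 'fail' | 'pending';

interface GateAuditRecord {
  readonly gate: 'consent' | 'scope' | 'export' | 'speak';
  readonly result: GateResult;
  readonly reason: string;    // evidence for pass, or the failure condition
  readonly timestamp: Date;
}

function recordGateEvaluation(
  log: readonly GateAuditRecord[],
  gate: GateAuditRecord['gate'],
  result: GateResult,
  reason: string,
): GateAuditRecord[] {
  // Append-only: return a new log rather than mutating prior entries.
  return [...log, { gate, result, reason, timestamp: new Date() }];
}
```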


7. Conclusion

The gate architecture transforms meeting AI from a surveillance tool into a governed intelligence system. By making privacy the architectural foundation rather than a feature flag, MARIA Meeting AI ensures that the system's capability envelope is always bounded by its authorization state. The fail-closed composition theorem guarantees that this property holds regardless of how many gates are added or how they are composed. The result is a system where users can trust that the AI will never exceed its authorization — not because of policy, but because of architecture.

The four-gate design (Consent, Scope, Export, Speak) maps directly to the four questions that any responsible meeting AI must answer: Was recording authorized? Who was present? Where can the data go? Can the AI speak? By formalizing these questions as independently evaluable gates with fail-closed semantics, the system provides verifiable answers that can be audited, tested, and proven correct.

R&D BENCHMARKS

  • Unauthorized Retention: 0 incidents. Zero cases of transcript data persisted without host consent across all test sessions.
  • Gate Evaluation Latency: <3ms p99. Time to evaluate all four gates (consent, scope, export, speak) per session event.
  • External Participant Detection: 100%. Accuracy of the scope gate in identifying non-organization email domains across 500 test participants.
  • Consent-to-Active Latency: 1.2s median. Time from host consent message to full transcription activation, including gate re-evaluation.

Published and reviewed by the MARIA OS Editorial Pipeline.

© 2026 MARIA OS. All rights reserved.