Abstract
The default deployment model for AI platforms is cloud-native: containerized microservices running on hyperscaler infrastructure with elastic scaling. For many enterprises, this model works. For regulated industries — banking, healthcare, defense, energy, government — it does not. Data sovereignty regulations, air-gap requirements, latency constraints on real-time decision systems, and supply chain security concerns create a hard boundary: the AI governance platform must run on infrastructure the organization physically controls.
This paper presents the MARIA OS Appliance Reference Architecture — a complete specification for deploying MARIA OS as a self-contained, rack-mountable appliance. We define three hardware tiers (evaluation, production, enterprise), three network modes (air-gapped, hybrid, cloud-connected), and three deployment topologies (single-node, HA cluster, multi-site federation). The architecture preserves all MARIA OS governance guarantees — responsibility conservation, fail-closed defaults, immutable audit trails — regardless of deployment mode.
We provide hardware bill-of-materials, software stack composition, security architecture with HSM integration, monitoring and observability design, upgrade strategy for air-gapped environments, disaster recovery procedures, capacity planning models, and a TCO analysis framework comparing on-premise to cloud deployment.
1. Why On-Premise AI Governance Matters
1.1 Data Sovereignty as a Hard Constraint
AI governance systems process the most sensitive data in an organization: decision rationale, responsibility assignments, value alignments, approval chains, evidence bundles. This data is the organization's judgment made explicit. In regulated industries, this data is subject to strict residency requirements:
- Financial services: Decision audit trails must be retained on-premise for 7+ years under SOX, MiFID II, and Basel III requirements. Cross-border data transfer triggers additional regulatory review.
- Healthcare: Patient-affecting decisions fall under HIPAA, GDPR Article 9 (special category data), and national health data protection laws. The decision pipeline itself becomes a medical device component under FDA 21 CFR Part 11.
- Defense and government: Classified and CUI (Controlled Unclassified Information) decision data requires air-gapped processing under NIST 800-171 and CMMC Level 3+.
- Critical infrastructure: Energy, water, and transportation systems require decision latency under 100ms for real-time governance, making round-trip cloud calls impractical.
1.2 The Latency Argument
Decision pipeline latency is not merely a performance concern — it is a governance concern. When a responsibility gate must evaluate whether an AI agent's proposed action requires human approval, the evaluation must complete before the action window closes. For physical-world decisions (manufacturing, robotics, energy grid), this window can be as narrow as 50ms.
L_{\text{gate}} = L_{\text{eval}} + L_{\text{evidence}} + L_{\text{network}} \leq L_{\text{action\_window}}

On-premise deployment eliminates $L_{\text{network}}$ (typically 20-80ms to cloud), leaving more budget for evidence evaluation and gate logic. For an action window of 100ms, eliminating 50ms of network latency doubles the available compute time for governance evaluation.
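The budget inequality can be checked mechanically at deployment time. A minimal TypeScript sketch (function and parameter names are illustrative, not part of the MARIA OS API):

```typescript
// Returns the compute budget (ms) left for evaluation + evidence
// after subtracting network latency from the action window,
// or null when the gate cannot complete inside the window.
function gateComputeBudgetMs(
  actionWindowMs: number,
  networkMs: number,
  evalMs: number,
  evidenceMs: number
): number | null {
  const total = evalMs + evidenceMs + networkMs;
  if (total > actionWindowMs) return null; // gate would miss the window
  return actionWindowMs - networkMs; // budget available for compute
}

// Cloud round trip: 50 ms of the 100 ms window is spent on the network.
const cloudBudget = gateComputeBudgetMs(100, 50, 30, 15); // 50 ms for compute
// On-premise: the network term vanishes, doubling the compute budget.
const onPremBudget = gateComputeBudgetMs(100, 0, 30, 15); // 100 ms for compute
```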
1.3 Supply Chain Security
An AI governance platform is a critical dependency for every automated decision in the organization. Cloud deployment introduces supply chain risks: hyperscaler outages, API deprecations, pricing changes, and geopolitical risks affecting data center availability. On-premise deployment converts these variable risks into fixed, manageable infrastructure under the organization's direct control.
2. Appliance Form Factor Definition
The MARIA OS Appliance is a pre-configured, validated hardware-software bundle delivered as a rack-mountable unit. Three tiers serve different deployment scales:
| Tier | Model | Form Factor | Agent Capacity | Use Case |
| --- | --- | --- | --- | --- |
| Evaluation | M-100 | 2U rackmount | 1-10 agents | PoC, development, testing |
| Production | M-400 | 4U rackmount | 10-100 agents | Single-site production |
| Enterprise | M-900 | 8U rackmount (2x4U) | 100-500 agents | Multi-site federation primary |

Each tier is a validated configuration — hardware, firmware, OS, and MARIA OS software are tested together as a unit. This eliminates the combinatorial explosion of hardware-software compatibility issues that plague DIY on-premise deployments.
The appliance ships with a hardware manifest cryptographically signed by the MARIA OS supply chain verification system. On first boot, the appliance validates its own hardware against the manifest, detecting any component substitution or tampering during shipping.
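The first-boot check described above amounts to a set comparison between the signed manifest and the components the appliance enumerates. A sketch, assuming the manifest signature has already been verified (types and field names are illustrative):

```typescript
interface HardwareComponent {
  slot: string;
  model: string;
  serial: string;
}

// Compare the components detected at boot against the
// (already signature-verified) manifest. Returns the slots that
// mismatch; an empty array means no substitution was detected.
function detectSubstitutions(
  manifest: HardwareComponent[],
  detected: HardwareComponent[]
): string[] {
  const expected = new Map(manifest.map((c) => [c.slot, c]));
  const mismatched: string[] = [];
  for (const comp of detected) {
    const want = expected.get(comp.slot);
    if (!want || want.model !== comp.model || want.serial !== comp.serial) {
      mismatched.push(comp.slot); // substituted or unexpected component
    }
  }
  // Components listed in the manifest but missing from the chassis
  for (const slot of expected.keys()) {
    if (!detected.some((c) => c.slot === slot)) mismatched.push(slot);
  }
  return mismatched;
}
```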
3. Hardware Reference Specification
3.1 Compute Architecture
```yaml
# M-400 Production Tier — Hardware Specification
compute:
  cpu:
    model: "AMD EPYC 9454 (Genoa)"
    cores: 48
    threads: 96
    base_clock_ghz: 2.75
    boost_clock_ghz: 3.8
    tdp_watts: 290
    quantity: 2  # Dual socket
    purpose: "Decision pipeline, governance engine, API serving"
  gpu:
    model: "NVIDIA L40S"
    vram_gb: 48
    quantity: 2
    interconnect: "PCIe Gen5 x16"
    purpose: "Agent inference, value scanning, evidence embedding"
  ram:
    type: "DDR5-4800 ECC RDIMM"
    capacity_gb: 512
    channels: 12
    purpose: "In-memory decision state, agent context windows"
storage:
  tier_1_hot:
    type: "NVMe U.2 PCIe Gen5"
    capacity_tb: 3.84
    quantity: 4
    raid: "RAID-10"
    effective_capacity_tb: 7.68
    purpose: "Active decision state, agent runtime, governance DB"
  tier_2_warm:
    type: "NVMe U.2 PCIe Gen4"
    capacity_tb: 7.68
    quantity: 4
    raid: "RAID-6"
    effective_capacity_tb: 15.36
    purpose: "Decision audit logs, evidence bundles (90-day window)"
  tier_3_cold:
    type: "SAS SSD"
    capacity_tb: 15.36
    quantity: 4
    raid: "RAID-6"
    effective_capacity_tb: 30.72
    purpose: "Long-term audit archive, compliance retention"
networking:
  management: "1GbE BMC/IPMI dedicated"
  data:
    - "2x 25GbE SFP28 (cluster interconnect)"
    - "2x 10GbE RJ45 (application traffic)"
  storage_fabric: "1x 100GbE QSFP28 (optional, for external storage)"
security_hardware:
  tpm: "TPM 2.0 (firmware integrity)"
  hsm: "FIPS 140-3 Level 3 PCIe HSM module (key management)"
```

3.2 GPU Sizing Rationale
Agent inference is the primary GPU workload. Each MARIA OS agent runs a quantized language model for decision evaluation, evidence analysis, and value scanning. The sizing formula:
G_{\text{required}} = \left\lceil \frac{N_{\text{inst}} \times M_{\text{model}} \times B_{\text{batch}}}{V_{\text{gpu}} \times U_{\text{target}}} \right\rceil

Where $N_{\text{inst}}$ is the number of concurrently loaded model instances (agents share instances through batched inference), $M_{\text{model}}$ is the model memory footprint (typically 4-8 GB for quantized 7B models), $B_{\text{batch}}$ is the batch overhead factor (1.3x), $V_{\text{gpu}}$ is per-GPU VRAM, and $U_{\text{target}}$ is target utilization (0.85). For the M-400 with 50 agents batched roughly four per instance of a 4-bit quantized 7B model ($N_{\text{inst}} = 13$): $G = \lceil (13 \times 4 \times 1.3) / (48 \times 0.85) \rceil = \lceil 1.66 \rceil = 2$ GPUs.
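A direct transcription of the sizing formula. The agents-per-instance batching factor in the example is an assumption consistent with the batched-inference model in Section 11.1:

```typescript
// G_required = ceil((N * M * B) / (V * U))
// N: concurrently loaded model instances, M: model memory (GB),
// B: batch overhead factor, V: per-GPU VRAM (GB), U: target utilization.
function requiredGpus(
  modelInstances: number,
  modelMemGb: number,
  batchOverhead: number,
  gpuVramGb: number,
  targetUtil: number
): number {
  return Math.ceil(
    (modelInstances * modelMemGb * batchOverhead) / (gpuVramGb * targetUtil)
  );
}

// 13 instances of a 4 GB quantized model (50 agents, ~4 agents batched
// per instance) on 48 GB L40S GPUs at 85% target utilization:
const gpus = requiredGpus(13, 4, 1.3, 48, 0.85); // -> 2
```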
4. Network Topology and Deployment Modes
4.1 Three Network Modes
The appliance supports three network configurations, selectable at deployment time:
Air-Gapped Mode: No external network connectivity. All models, updates, and configurations are loaded via physically transported media (encrypted USB or optical). The governance engine operates with zero external dependencies. Model updates are delivered on signed, encrypted media with chain-of-custody tracking.
Hybrid Mode: Outbound-only connectivity through a data diode or one-way gateway. Telemetry, anonymized governance metrics, and update requests flow out; update packages flow in through a separate, audited channel. Decision data never leaves the premises.
Cloud-Connected Mode: Encrypted tunnel to MARIA OS cloud services for model updates, telemetry aggregation, and optional cloud-burst inference during peak loads. Decision data remains on-premise; only model weights and anonymized operational metrics traverse the tunnel.
```typescript
// Network mode configuration — set at deployment, enforced by firewall rules
interface ApplianceNetworkConfig {
  mode: "air-gapped" | "hybrid" | "cloud-connected";
  // Air-gapped: all undefined
  // Hybrid: only outbound defined
  // Cloud-connected: both defined
  outbound?: {
    endpoint: string;
    protocol: "mTLS" | "WireGuard";
    allowList: string[]; // Explicit IP allowlist
    dataDiode: boolean; // Hardware-enforced one-way for hybrid
  };
  inbound?: {
    endpoint: string;
    protocol: "mTLS";
    allowList: string[];
    rateLimit: { requestsPerMinute: number };
  };
  // Always present — governs what data classes can leave the appliance
  dataClassification: {
    decisionData: "never-transmit";
    auditLogs: "never-transmit";
    evidenceBundles: "never-transmit";
    operationalMetrics: "transmit-anonymized" | "never-transmit";
    modelUpdateRequests: "transmit" | "never-transmit";
  };
}
```

4.2 Cluster Interconnect
For HA and multi-node deployments, nodes communicate over a dedicated 25GbE cluster interconnect using mTLS with certificates issued by the on-board HSM. The cluster protocol uses a Raft-based consensus for decision pipeline state, ensuring that no decision is lost or duplicated during node transitions.
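The three network modes of Section 4.1 imply structural invariants on `ApplianceNetworkConfig`: air-gapped defines no channels, hybrid is outbound-only, cloud-connected defines both. A minimal validator sketch, with illustrative names:

```typescript
type NetworkMode = "air-gapped" | "hybrid" | "cloud-connected";

interface ChannelConfig {
  endpoint: string;
}

// Returns a list of invariant violations; an empty list means
// the channel layout is consistent with the declared mode.
function validateNetworkConfig(
  mode: NetworkMode,
  outbound?: ChannelConfig,
  inbound?: ChannelConfig
): string[] {
  const errors: string[] = [];
  switch (mode) {
    case "air-gapped":
      if (outbound || inbound) errors.push("air-gapped: no channels allowed");
      break;
    case "hybrid":
      if (!outbound) errors.push("hybrid: outbound channel required");
      if (inbound) errors.push("hybrid: inbound not allowed (one-way gateway)");
      break;
    case "cloud-connected":
      if (!outbound || !inbound)
        errors.push("cloud-connected: both channels required");
      break;
  }
  return errors;
}

validateNetworkConfig("air-gapped"); // [] (valid)
validateNetworkConfig("hybrid", { endpoint: "telemetry.example" }, { endpoint: "in.example" });
// flags the inbound channel
```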
5. Software Stack Layers
The appliance software stack is organized in five layers, each with a clear responsibility boundary:
```yaml
# MARIA OS Appliance Software Stack
layers:
  L0_platform:
    os: "Ubuntu 24.04 LTS (hardened, CIS Level 2)"
    kernel: "6.8 LTS (custom: real-time patches, SELinux enforcing)"
    firmware: "Signed UEFI with Secure Boot chain"
    purpose: "Hardware abstraction, security foundation"
  L1_container_runtime:
    runtime: "containerd 2.0"
    orchestration: "K3s (lightweight Kubernetes)"
    networking: "Cilium (eBPF-based, no iptables)"
    storage: "Longhorn (replicated block storage)"
    purpose: "Workload isolation, resource management"
  L2_data_layer:
    primary_db: "PostgreSQL 17 (Patroni HA)"
    cache: "DragonflyDB (Redis-compatible, multi-threaded)"
    event_bus: "NATS JetStream (embedded, no external dependency)"
    object_store: "MinIO (S3-compatible, local storage)"
    purpose: "State persistence, event streaming, object storage"
  L3_maria_core:
    decision_pipeline: "6-stage state machine with transition validation"
    governance_engine: "Responsibility gates, approval workflows"
    audit_system: "Immutable append-only log (hash-chained)"
    evidence_engine: "Evidence collection, verification, bundling"
    value_scanner: "Behavioral value extraction and gap analysis"
    coordinate_system: "G.U.P.Z.A hierarchical addressing"
    purpose: "Core governance logic, decision processing"
  L4_agent_runtime:
    inference: "vLLM (GPU) / llama.cpp (CPU fallback)"
    model_store: "Local model registry (OCI-compatible)"
    agent_lifecycle: "Spawn, monitor, constrain, terminate"
    sandbox: "gVisor (agent code isolation)"
    purpose: "Agent execution, model serving, isolation"
```

Each layer is independently upgradeable. Layer boundaries are enforced by container namespaces and network policies — a compromised agent in L4 cannot access the governance engine in L3 or the data layer in L2.
6. Deployment Topologies
6.1 Single-Node (M-100, M-400)
The simplest topology: all software layers run on a single appliance. Suitable for evaluation, development, and production deployments with modest agent counts (< 50). The single node runs the full stack including database, governance engine, and agent runtime. Backup is handled by scheduled snapshots to an external NAS or removable media.
6.2 HA Cluster (3-Node M-400)
Production deployments requiring high availability use a 3-node cluster with Raft consensus:
```typescript
// HA Cluster Configuration
interface HAClusterConfig {
  nodes: 3 | 5; // Odd number for Raft quorum
  topology: {
    leader: {
      role: "primary";
      services: ["decision-pipeline", "governance-engine", "api-gateway"];
    };
    followers: {
      role: "standby";
      services: ["decision-pipeline-replica", "read-api", "agent-runtime"];
      replicationLag: { maxMs: 50 };
    };
  };
  failover: {
    detectionMethod: "heartbeat + decision-pipeline-health";
    detectionTimeoutMs: 2000;
    promotionTimeMs: 4200; // Measured p99
    inFlightDecisionRecovery: "replay-from-wal";
  };
  database: {
    ha: "patroni";
    syncReplicas: 1; // At least 1 sync replica
    asyncReplicas: 1; // Remaining nodes async
    walShipping: true;
  };
}
```

The cluster guarantees zero decision loss during failover: in-flight decisions are replayed from the write-ahead log on the new leader. The maximum data loss window (RPO) is 0 for synchronous replicas.
6.3 Multi-Site Federation (M-900)
Enterprise deployments spanning multiple geographic locations use a federated topology. Each site runs an independent HA cluster with full local autonomy. A federation layer synchronizes governance policies, agent definitions, and aggregated audit summaries across sites — but raw decision data never leaves its originating site.
\text{Federation\_Consistency} = \frac{|P_{\text{local}} \cap P_{\text{global}}|}{|P_{\text{global}}|} \geq 0.999

Policy consistency across federated sites is maintained at 99.9%+ through a gossip-based protocol that converges within 30 seconds of a policy update at any site.
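The consistency ratio itself is a straightforward set computation. A sketch (policy IDs and function names are illustrative):

```typescript
// Federation_Consistency = |P_local ∩ P_global| / |P_global|
function policyConsistency(
  localPolicies: Set<string>,
  globalPolicies: Set<string>
): number {
  if (globalPolicies.size === 0) return 1; // vacuously consistent
  let shared = 0;
  for (const p of globalPolicies) {
    if (localPolicies.has(p)) shared++;
  }
  return shared / globalPolicies.size;
}

const localSet = new Set(["p1", "p2", "p3"]);
const globalSet = new Set(["p1", "p2", "p3", "p4"]);
policyConsistency(localSet, globalSet); // 0.75, below the 0.999 federation target
```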
7. Security Architecture
7.1 Hardware Security Module (HSM) Integration
The appliance includes a FIPS 140-3 Level 3 certified HSM module that manages all cryptographic operations:
- Decision signing: Every decision transition is signed with a key held exclusively in the HSM. This creates a tamper-evident chain — any modification to the decision audit trail invalidates the signature chain.
- Audit log integrity: The immutable audit log uses hash chaining with HSM-held keys. Verification requires the HSM, making offline log tampering detectable.
- mTLS certificate issuance: All inter-service and inter-node certificates are issued by the HSM-backed PKI. No private key ever exists outside the HSM boundary.
- Encryption at rest: All storage tiers use AES-256-XTS with keys derived from the HSM. Key rotation occurs monthly without service interruption.
7.2 Zero-Trust Networking
```yaml
# Zero-trust network policy (Cilium)
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: governance-engine-policy
spec:
  endpointSelector:
    matchLabels:
      app: governance-engine
  ingress:
    - fromEndpoints:
        - matchLabels:
            app: decision-pipeline
        - matchLabels:
            app: api-gateway
      toPorts:
        - ports:
            - port: "8443"
              protocol: TCP
          rules:
            http:
              - method: POST
                path: "/api/v1/gates/.*"
  egress:
    - toEndpoints:
        - matchLabels:
            app: postgresql
      toPorts:
        - ports:
            - port: "5432"
              protocol: TCP
    - toEndpoints:
        - matchLabels:
            app: audit-log
      toPorts:
        - ports:
            - port: "8444"
              protocol: TCP
```

Every service-to-service communication requires mutual TLS authentication and is restricted to explicitly allowed paths. The default policy is deny-all — services must be explicitly permitted to communicate. This ensures that even if an agent runtime is compromised, it cannot directly access the governance engine or database.
7.3 Agent Sandboxing
Each agent runs inside a gVisor sandbox that intercepts all system calls. Agents cannot access the host filesystem, network (except through a governed proxy), or other agent processes. Resource limits (CPU, memory, GPU time) are enforced per-agent to prevent denial-of-service from a misbehaving agent.
8. Monitoring and Observability
The appliance includes a self-contained observability stack that requires no external dependencies:
| Layer | Tool | Purpose | Retention |
| --- | --- | --- | --- |
| Metrics | VictoriaMetrics | Time-series metrics (system + governance KPIs) | 90 days on-box |
| Logs | Loki | Structured log aggregation | 90 days hot, 1 year warm |
| Traces | Tempo | Distributed tracing (decision pipeline) | 30 days |
| Dashboards | Grafana | Visualization and alerting | N/A |
| Alerts | Alertmanager | Alert routing (email, webhook, PagerDuty) | N/A |

Governance-specific metrics are first-class citizens in the observability stack:
- maria_decisions_total — Counter of decisions by stage and outcome
- maria_gate_latency_seconds — Histogram of responsibility gate evaluation time
- maria_responsibility_conservation_ratio — Gauge measuring responsibility preservation across decision composition
- maria_audit_chain_integrity — Boolean gauge (1 = intact, 0 = broken chain detected)
- maria_agent_sandbox_violations_total — Counter of blocked system calls per agent
Alert rules ship pre-configured for critical governance invariant violations. An audit chain integrity failure triggers an immediate P1 alert with automatic pipeline pause.
9. Upgrade and Patching Strategy
9.1 Air-Gapped Update Process
For air-gapped deployments, updates are delivered on cryptographically signed media:
1. Build: MARIA OS CI/CD produces a signed update bundle containing OS patches, container images, and database migrations.
2. Transfer: The bundle is written to encrypted removable media with a chain-of-custody manifest.
3. Verify: On the appliance, the update agent verifies the bundle signature against the HSM-held MARIA OS root certificate.
4. Stage: Container images are loaded into the local registry. Database migrations are validated against the current schema.
5. Apply: A blue-green deployment swaps traffic to the updated stack. The previous version remains available for instant rollback.
6. Validate: Post-update health checks verify all 11 governance invariants. If any check fails, automatic rollback occurs within 60 seconds.
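Step 3 can be sketched with Node's crypto module. In production the root public key never leaves the HSM; here an ephemeral RSA key pair and an illustrative bundle payload stand in for it (all names are assumptions):

```typescript
import { createSign, createVerify, generateKeyPairSync } from "node:crypto";

// Verify an update bundle against the MARIA OS root public key.
// Any modified byte in the bundle makes verification fail.
function verifyBundleSignature(
  bundleBytes: Buffer,
  signature: Buffer,
  rootPublicKeyPem: string
): boolean {
  const verifier = createVerify("sha256");
  verifier.update(bundleBytes);
  return verifier.verify(rootPublicKeyPem, signature);
}

// Demo: sign a bundle with an ephemeral RSA key, then verify it.
const { publicKey, privateKey } = generateKeyPairSync("rsa", {
  modulusLength: 2048,
});
const bundle = Buffer.from("illustrative-update-bundle-payload");
const signer = createSign("sha256");
signer.update(bundle);
const sig = signer.sign(privateKey);
const ok = verifyBundleSignature(
  bundle,
  sig,
  publicKey.export({ type: "spki", format: "pem" }).toString()
); // true for an untampered bundle
```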
9.2 Rolling Upgrades (HA Cluster)
In HA deployments, upgrades are applied one node at a time. The cluster maintains quorum throughout the process. Each node upgrade follows the stage-apply-validate cycle before proceeding to the next node. Total cluster upgrade time for a 3-node deployment: approximately 45 minutes with zero downtime.
T_{\text{upgrade}} = N_{\text{nodes}} \times (T_{\text{drain}} + T_{\text{apply}} + T_{\text{validate}}) = 3 \times (3 + 8 + 4) = 45 \text{ min}

10. Disaster Recovery and Backup
10.1 Backup Architecture
The backup strategy follows a 3-2-1 model adapted for air-gapped environments:
- 3 copies: Primary (live), on-box snapshot, external backup
- 2 media types: NVMe (live + snapshot), removable encrypted SSD (external)
- 1 off-site: For non-air-gapped deployments, encrypted backup to a geographically separate location
```typescript
// Disaster Recovery Configuration
interface DRConfig {
  backup: {
    database: {
      method: "pg_basebackup + WAL archiving";
      frequency: "continuous WAL + daily base backup";
      retention: { days: 30; walRetention: "7 days" };
      encryption: "AES-256-GCM (HSM-managed key)";
    };
    auditLogs: {
      method: "immutable snapshot";
      frequency: "hourly";
      retention: { years: 7 }; // Regulatory minimum
      integrityVerification: "hash-chain validation on restore";
    };
    agentState: {
      method: "checkpoint + replay";
      frequency: "every 100 decisions or 5 minutes";
      retention: { days: 7 };
    };
  };
  recovery: {
    rto: { singleNode: "4 hours"; haCluster: "15 minutes" };
    rpo: { singleNode: "1 hour"; haCluster: "0 (sync replication)" };
    procedure: "automated with manual approval gate";
    testFrequency: "quarterly";
  };
}
```

10.2 Immutable Audit Recovery
The audit log is the most critical data asset. Even in a total appliance loss scenario, the audit log must be recoverable and verifiable. The hash-chained structure allows integrity verification from any backup — if a single entry has been modified, the chain breaks at that point, and the exact modification is identifiable.
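Chain verification on restore reduces to recomputing each entry's hash from its predecessor. A sketch of how a broken link is localized (the entry structure and names are assumptions, not the MARIA OS wire format):

```typescript
import { createHash } from "node:crypto";

interface AuditEntry {
  payload: string;
  prevHash: string;
  hash: string;
}

const chainHash = (payload: string, prevHash: string): string =>
  createHash("sha256").update(prevHash).update(payload).digest("hex");

// Walks the chain and returns the index of the first broken link,
// or -1 when the chain is intact. This is how a restore from backup
// pinpoints the exact tampered entry.
function firstBrokenLink(entries: AuditEntry[], genesisHash: string): number {
  let prevExpected = genesisHash;
  for (let i = 0; i < entries.length; i++) {
    const e = entries[i];
    if (e.prevHash !== prevExpected || e.hash !== chainHash(e.payload, e.prevHash)) {
      return i;
    }
    prevExpected = e.hash;
  }
  return -1;
}

// Build a 3-entry chain, then tamper with the middle payload.
const genesis = "0".repeat(64);
const log: AuditEntry[] = [];
let prev = genesis;
for (const payload of ["decision-1", "decision-2", "decision-3"]) {
  const hash = chainHash(payload, prev);
  log.push({ payload, prevHash: prev, hash });
  prev = hash;
}
firstBrokenLink(log, genesis); // -1 (intact)
log[1].payload = "decision-2-modified";
firstBrokenLink(log, genesis); // 1 (tampering localized to entry 1)
```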
11. Capacity Planning Model
11.1 Resource Scaling Formula
Capacity planning for MARIA OS appliances follows a predictable model based on three primary dimensions:
R_{\text{total}} = \sum_{i=1}^{N} R_{\text{agent}_i} + R_{\text{pipeline}} + R_{\text{governance}} + R_{\text{audit}}

Where the per-agent term scales with agent count and the pipeline, governance, and audit terms are fixed overheads. Each resource class scales differently:
- CPU: Linear with agent count. Each agent consumes approximately 0.5 vCPU for orchestration logic. The governance engine adds a fixed overhead of 4 vCPU.
- GPU VRAM: Step function. Each model instance serves multiple agents via batched inference. Adding the $(k+1)$-th model instance is required when agent count exceeds $k \times \lfloor V_{\text{gpu}} / M_{\text{model}} \rfloor$.
- Storage: Linear with decision volume. Each decision produces approximately 12 KB of audit data (decision record + evidence references + transition log). At 1,000 decisions/day, this accumulates to approximately 4.3 GB/year of audit data.
- RAM: Sub-linear. Agent context windows share a common embedding cache. Memory scales as $O(N^{0.7})$ due to cache sharing.
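The scaling rules above can be combined into a rough estimator. The per-agent CPU figure, fixed governance overhead, and 12 KB/decision figure come from this section; the RAM base constant is an illustrative assumption, so its output should not be read against the sizing table:

```typescript
// Combine the per-dimension scaling rules from Section 11.1.
// 0.5 vCPU/agent, 4 vCPU fixed, and 12 KB/decision are from the text;
// RAM_BASE_GB is an assumed illustration of the O(N^0.7) curve.
const RAM_BASE_GB = 8;

function estimateResources(agents: number, decisionsPerDay: number) {
  const cpuCores = Math.ceil(agents * 0.5 + 4); // linear in agents + fixed overhead
  const ramGb = Math.ceil(RAM_BASE_GB * Math.pow(agents, 0.7)); // sub-linear (cache sharing)
  const auditGbPerYear = (decisionsPerDay * 12 * 365) / 1e6; // 12 KB per decision
  return { cpuCores, ramGb, auditGbPerYear };
}

estimateResources(50, 5000);
// cpuCores: 29, auditGbPerYear: 21.9
```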
11.2 Sizing Table
| Agents | Decisions/Day | Tier | GPU | CPU Cores | RAM (GB) | Hot Storage (TB) |
| --- | --- | --- | --- | --- | --- | --- |
| 5 | 500 | M-100 | 1x L40S | 32 | 128 | 1.92 |
| 25 | 2,500 | M-400 | 2x L40S | 96 | 256 | 3.84 |
| 50 | 5,000 | M-400 | 2x L40S | 96 | 512 | 7.68 |
| 100 | 10,000 | M-900 | 4x L40S | 192 | 1024 | 15.36 |
| 250 | 25,000 | M-900 (cluster) | 8x L40S | 384 | 2048 | 30.72 |

12. Cloud vs. On-Premise: TCO Analysis Framework
12.1 Cost Components
A fair TCO comparison must account for all cost components in both deployment models:
```typescript
// TCO Analysis Framework
interface TCOModel {
  onPremise: {
    capex: {
      hardware: number; // Appliance purchase price
      installation: number; // Rack, power, cooling setup
      networkInfrastructure: number;
    };
    opex: {
      power: number; // kWh * rate * PUE
      cooling: number; // Included in PUE
      rackSpace: number; // Colocation or owned DC
      staffing: number; // 0.25 FTE per appliance (estimated)
      maintenance: number; // Hardware warranty + support contract
      softwareLicense: number; // MARIA OS on-premise license
      upgrades: number; // Hardware refresh (5-year cycle)
    };
  };
  cloud: {
    capex: {
      migration: number; // Initial setup and data migration
    };
    opex: {
      compute: number; // GPU instances (reserved or on-demand)
      storage: number; // Block + object storage
      networking: number; // Egress charges
      softwareLicense: number; // MARIA OS cloud license (SaaS)
      staffing: number; // 0.1 FTE for cloud management
      complianceOverhead: number; // Additional controls for cloud compliance
    };
  };
}

// Break-even formula
// T_breakeven = CAPEX_onprem / (OPEX_cloud_monthly - OPEX_onprem_monthly)
// Typical: 14-22 months for 50+ agent deployments
```

12.2 Hidden Cloud Costs for Regulated Industries
The TCO comparison shifts significantly for regulated industries when accounting for:
- Compliance overhead: Cloud deployments in regulated industries require additional controls (encryption key management, access logging, data residency verification) that add 15-30% to base cloud costs.
- Egress fees: Decision audit data that must be exported for regulatory review incurs egress charges. At scale, this can exceed $10K/month.
- Vendor lock-in risk: Cloud-native architectures create switching costs estimated at 6-18 months of engineering effort.
- Availability guarantees: Cloud SLAs typically guarantee 99.9% (8.7 hours downtime/year). The MARIA OS HA cluster achieves 99.99% (52 minutes/year) under direct control.
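The break-even formula from Section 12.1 is simple enough to sketch directly. The dollar figures below are illustrative only, not MARIA OS pricing:

```typescript
// T_breakeven = CAPEX_onprem / (OPEX_cloud_monthly - OPEX_onprem_monthly)
// Returns months to break even, or null if cloud never costs more per month.
function breakEvenMonths(
  onPremCapex: number,
  onPremOpexMonthly: number,
  cloudOpexMonthly: number
): number | null {
  const monthlySavings = cloudOpexMonthly - onPremOpexMonthly;
  if (monthlySavings <= 0) return null; // on-premise never pays back
  return Math.ceil(onPremCapex / monthlySavings);
}

// Illustrative numbers: $450K appliance CAPEX, $18K/month on-prem OPEX
// vs $43K/month cloud OPEX.
breakEvenMonths(450_000, 18_000, 43_000); // 18 months, inside the 14-22 month range
```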
For deployments exceeding 50 agents with regulatory compliance requirements, the on-premise appliance reaches TCO parity with cloud deployment at approximately 18 months. By month 36, the cumulative cost advantage reaches 37%, primarily driven by eliminated egress fees and compliance overhead.

12.3 Decision Sovereignty Premium
Beyond cost, on-premise deployment provides a decision sovereignty premium that has no direct cloud equivalent: the mathematical guarantee that no decision data, responsibility assignment, or governance evaluation has ever traversed infrastructure outside the organization's physical and legal control. For industries where a data breach has existential consequences — defense contractors, critical infrastructure operators, healthcare systems handling life-affecting decisions — this guarantee is not a feature. It is a requirement.
Conclusion
The MARIA OS Appliance Reference Architecture demonstrates that on-premise AI governance is not a compromise — it is a design choice that strengthens governance guarantees while reducing long-term costs for regulated enterprises. The architecture preserves every MARIA OS invariant — responsibility conservation, fail-closed defaults, immutable audit trails, graduated autonomy — in a self-contained, validated, upgradeable form factor.
The key insight is that governance locality strengthens governance. When the decision pipeline, governance engine, and audit system run on infrastructure under the organization's direct physical control, the attack surface shrinks, latency budgets expand, and regulatory compliance simplifies from a continuous verification problem to a one-time validation event.
For organizations where judgment is the product and responsibility is the architecture, the MARIA OS Appliance provides the infrastructure to make both concrete, auditable, and sovereign.