ENGINEERING BLOG
Technical research and engineering insights from the team building the operating system for responsible AI operations.
121 articles · Published by MARIA OS
How Proximal Policy Optimization enables medium-risk task automation while respecting human approval gates
Gated autonomy requires reinforcement learning that respects responsibility boundaries. This paper positions the actor-critic family — specifically PPO — as a core algorithm in the Control Layer, showing how the actor learns policies, the critic estimates state value, and responsibility gates constrain the action space dynamically. We derive a gate-constrained policy-gradient formulation, analyze PPO clipping behavior under trust-region constraints, and model human-in-the-loop approval as part of the environment dynamics.
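A minimal sketch of the action-masking idea behind gate constraints: the responsibility gate removes disallowed actions from the actor's distribution before sampling. The function name and the numpy formulation are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def gate_masked_policy(logits: np.ndarray, gate_mask: np.ndarray) -> np.ndarray:
    """Renormalize a PPO actor's action distribution over the gate-allowed set.

    logits    -- raw actor outputs, shape (n_actions,)
    gate_mask -- 1.0 where the responsibility gate permits the action, else 0.0
    (Hypothetical interface; the paper's gate representation may differ.)
    """
    # Disallowed actions get -inf logits and therefore zero probability mass.
    masked = np.where(gate_mask > 0, logits, -np.inf)
    z = masked - masked.max()          # subtract max for numerical stability
    probs = np.exp(z)
    return probs / probs.sum()

# Example: the gate blocks action 2, e.g. an action pending human approval.
logits = np.array([1.2, 0.3, 2.5, -0.4])
mask = np.array([1.0, 1.0, 0.0, 1.0])
print(gate_masked_policy(logits, mask))   # mass only on gate-allowed actions
```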
Quantifying reversibility scores for medical procedures and dynamically adjusting governance gates to prevent catastrophic irreversible harm
Medical decisions have different reversibility profiles: some interventions are easy to roll back, others are not. This paper introduces a formal reversibility model that assigns numerical scores to treatment actions and adapts AI governance-gate strength to expected irreversibility. Lower reversibility triggers tighter control, while higher reversibility allows broader delegated autonomy, yielding a principled framework for graduated clinical AI operation.
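One way to realize the score-to-gate mapping is a monotone ramp from reversibility to gate strength. The logistic shape, floor, and steepness below are hypothetical tuning choices; the paper derives its own mapping.

```python
import math

def gate_strength(reversibility: float, floor: float = 0.2, k: float = 6.0) -> float:
    """Map a reversibility score in [0, 1] to governance-gate strength in [0, 1].

    reversibility near 1.0 (easy rollback)  -> strength stays near `floor`
    reversibility near 0.0 (irreversible)   -> strength approaches 1.0
    `floor` and `k` are illustrative parameters, not the paper's values.
    """
    irreversibility = 1.0 - reversibility
    # Logistic ramp centered at 0.5: control tightens sharply once an action
    # crosses into the hard-to-reverse half of the scale.
    ramp = 1.0 / (1.0 + math.exp(-k * (irreversibility - 0.5)))
    return floor + (1.0 - floor) * ramp

# Prescription adjustment, ambiguous intervention, surgical resection:
for r in (0.95, 0.50, 0.05):
    print(f"reversibility={r:.2f} -> gate strength {gate_strength(r):.2f}")
```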
Modeling defect rate as a state variable and applying control-theoretic stability analysis to manufacturing quality gates
Manufacturing AI systems face a stability problem that traditional software governance often does not: defect rates evolve as continuous dynamical variables under material variation, tool wear, and environmental drift. This paper models the manufacturing quality gate as a feedback-control system, derives Lyapunov stability conditions for gate equilibria, designs a PID-style controller to keep defect rates below tolerance under bounded disturbances, and extends the analysis to multi-stage quality cascades. In a semiconductor fabrication case study, the framework achieved 94.7% defect containment with sub-200ms gate response times and BIBO-stable behavior under realistic disturbance profiles.
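The PID-style controller can be sketched in a few lines. The gains and setpoint below are illustrative placeholders, not the tuning used in the semiconductor case study.

```python
class DefectRatePID:
    """Discrete PID controller that tightens a quality gate as defects rise."""

    def __init__(self, kp: float, ki: float, kd: float, setpoint: float):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.setpoint = setpoint      # target defect rate, e.g. 0.2%
        self.integral = 0.0
        self.prev_error = 0.0

    def update(self, defect_rate: float, dt: float) -> float:
        """Return a gate-strength adjustment in [0, 1] from the observed rate."""
        error = defect_rate - self.setpoint
        self.integral += error * dt
        derivative = (error - self.prev_error) / dt
        self.prev_error = error
        # Positive output tightens the gate; clamp to the valid actuation range.
        u = self.kp * error + self.ki * self.integral + self.kd * derivative
        return max(0.0, min(1.0, u))

pid = DefectRatePID(kp=40.0, ki=5.0, kd=2.0, setpoint=0.002)
for observed in (0.004, 0.0035, 0.0025, 0.0021):   # rate settling toward target
    print(f"gate adjustment: {pid.update(observed, dt=1.0):.3f}")
```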
Evaluating power grid decision stability through Lyapunov energy functions and responsibility-gated load balancing
Power grids can operate near stability limits, where dispatch errors or delayed interventions may trigger cascading disruptions. This paper introduces a Lyapunov-based decision-stability score for energy-grid AI agents, providing formal criteria for when autonomous grid-management actions remain within stable operating regions.
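As a sketch, the decision-stability criterion can be read off a quadratic Lyapunov energy: a dispatch action is admissible when the predicted next state has lower energy. The weight matrix and the two-variable state layout here are assumptions for illustration.

```python
import numpy as np

def lyapunov_energy_delta(x: np.ndarray, x_next: np.ndarray, P: np.ndarray) -> float:
    """Change in Lyapunov energy V(x) = x^T P x for a proposed dispatch action.

    x is the deviation from the stable operating point (here an assumed
    2-vector of frequency and voltage deviation); a negative return value
    means the action dissipates energy toward equilibrium.
    """
    return float(x_next @ P @ x_next) - float(x @ P @ x)

P = np.array([[2.0, 0.3],
              [0.3, 1.0]])                  # hypothetical positive-definite weights
x = np.array([0.8, -0.5])                   # current deviation
x_next = np.array([0.6, -0.3])              # model-predicted post-action deviation
delta_v = lyapunov_energy_delta(x, x_next, P)
print("autonomous action allowed" if delta_v < 0 else "escalate to grid operator")
```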
Preventing AI tutoring systems from converging on single recommendation patterns through diversity-enforcing stability constraints
Left unconstrained, recommendation algorithms can converge to narrow patterns: similar problem types, difficulty bands, or teaching approaches. In education, this can create learning monocultures that limit broader development. This paper develops a control-theoretic framework for suppressing over-fixation in educational AI while preserving learning effectiveness.
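A simple instance of a diversity-enforcing constraint is an entropy floor on the recommendation distribution: temper the distribution until it clears the floor, which flattens over-fixated mass without reordering the ranking. This is one possible mechanism sketched under assumed parameters, not necessarily the paper's.

```python
import numpy as np

def entropy(p: np.ndarray) -> float:
    p = p[p > 0]
    return float(-(p * np.log(p)).sum())

def enforce_diversity(probs: np.ndarray, min_entropy: float) -> np.ndarray:
    """Temper a recommendation distribution until it clears an entropy floor.

    Raising the temperature flattens over-concentrated mass while the power
    transform keeps the original ranking of items intact.
    """
    t, tempered = 1.0, probs.copy()
    while entropy(tempered) < min_entropy and t < 64.0:
        t *= 1.5
        tempered = probs ** (1.0 / t)
        tempered /= tempered.sum()
    return tempered

# A tutor over-fixated on one exercise type (90% of mass on item 0):
collapsed = np.array([0.90, 0.05, 0.03, 0.02])
diversified = enforce_diversity(collapsed, min_entropy=1.0)   # floor in nats
print(diversified.round(3), f"H = {entropy(diversified):.2f} nats")
```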
Five axioms, four pillar equations, and five theorems that transform organizational judgment into executable decision systems
Decision Intelligence Theory formalizes decision-making as a control system, integrating evidence, conflict, responsibility, execution, and learning to reduce false allowances while improving organizational completion rates. This capstone article presents a unified mathematical framework — five axioms, four pillar equations, and five theorems — with proofs, implementation mappings, and cross-industry validation across finance, healthcare, legal, and manufacturing.
Why responsibility is a computable threshold, not a philosophical debate, and how to implement it
Existing AI governance frameworks rely on qualitative guidelines to determine when human oversight is required. This paper formalizes responsibility decomposition as a quantitative threshold problem: we define a Responsibility Demand Function R(d) over decision nodes using five normalized factors (impact, uncertainty, externality, accountability, and novelty) and introduce a decomposition threshold τ that determines when human responsibility must be enforced. A dynamic equilibrium model captures temporal shifts driven by learning and contextual change. The framework is operationalized within the MARIA OS gate architecture and validated through reproducible experiments on decision graphs.
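A minimal sketch of the threshold test, assuming a weighted-sum aggregate for R(d); the weights and the value of τ below are placeholders, since the paper derives them per domain.

```python
from dataclasses import dataclass

@dataclass
class DecisionNode:
    impact: float           # all five factors normalized to [0, 1]
    uncertainty: float
    externality: float
    accountability: float
    novelty: float

def responsibility_demand(d: DecisionNode,
                          w=(0.30, 0.20, 0.20, 0.15, 0.15)) -> float:
    """R(d) as a weighted aggregate of the five factors (weights are placeholders)."""
    factors = (d.impact, d.uncertainty, d.externality, d.accountability, d.novelty)
    return sum(wi * fi for wi, fi in zip(w, factors))

TAU = 0.55   # decomposition threshold -- placeholder value

d = DecisionNode(impact=0.8, uncertainty=0.6, externality=0.4,
                 accountability=0.7, novelty=0.3)
r = responsibility_demand(d)
print(f"R(d) = {r:.2f}:",
      "human responsibility enforced" if r >= TAU else "delegable to the agent")
```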
A control-theoretic framework for gate design where smarter AI needs smarter stopping, not simply more stopping
Enterprise governance often assumes that more gates automatically mean more safety. This paper analyzes why that assumption can fail. We model gates as delayed binary controllers with feedback loops and derive stability conditions: serial delay should remain within the decision-relevance window, and feedback-loop gain should satisfy `kK < 1` to avoid over-correction oscillation. Safety is therefore not monotonic in gate count; it depends on delay-budget management, loop-gain control, and bounded recovery cycles.
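The loop-gain condition is easy to see in a one-line iteration of the linearized feedback loop, a toy model assumed here for illustration:

```python
def simulate_gate_feedback(k: float, K: float, steps: int = 6, x0: float = 1.0):
    """Iterate the linearized loop x[t+1] = (1 - k*K) * x[t] from deviation x0.

    kK < 1 keeps corrections monotone; kK > 1 flips the sign of the deviation
    each step -- the over-correction oscillation the condition rules out.
    """
    trace, x = [x0], x0
    for _ in range(steps):
        x = (1.0 - k * K) * x
        trace.append(round(x, 3))
    return trace

print(simulate_gate_feedback(k=0.4, K=1.5))   # kK = 0.6: smooth decay
print(simulate_gate_feedback(k=1.2, K=1.5))   # kK = 1.8: oscillating correction
```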
Proving that fail-closed gates create a stable equilibrium in the risk-velocity state space using Lyapunov's direct method
Enterprise AI governance systems can accumulate risk over time through compounding errors, configuration drift, and expanding autonomy. This paper models governance dynamics as a continuous-time state system with risk `r` and decision velocity `v`, and control inputs gate strength `g` and evidence quality `q`. Using Lyapunov candidate `V(r, v) = alpha*r^2 + beta*v^2`, we derive conditions on `g` and `q` such that `dV/dt < 0`, establishing asymptotic stability. The resulting stability region in `(g, q)` space provides a design specification for bounded risk accumulation.
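A numeric sketch of the stability check, assuming simple linearized dynamics for risk and velocity (the paper's model may differ): sample the `(g, q)` control space and test whether `dV/dt < 0` holds across states.

```python
import numpy as np

# Assumed linearized governance dynamics (illustrative, not the paper's model):
#   dr/dt = a*v - g*r   risk grows with decision velocity, gates dissipate it
#   dv/dt = b*r - q*v   unresolved risk spurs activity, evidence demands damp it
a, b = 0.6, 0.4
alpha, beta = 1.0, 1.0

def dV_dt(r: float, v: float, g: float, q: float) -> float:
    """Derivative of V(r, v) = alpha*r**2 + beta*v**2 along trajectories."""
    dr = a * v - g * r
    dv = b * r - q * v
    return 2 * alpha * r * dr + 2 * beta * v * dv

# Test dV/dt < 0 over a quarter-circle of nonnegative (risk, velocity) states.
states = [(np.cos(t), np.sin(t)) for t in np.linspace(0, np.pi / 2, 16)]
for g, q in [(0.2, 0.2), (0.8, 0.8)]:
    stable = all(dV_dt(r, v, g, q) < 0 for r, v in states)
    print(f"g={g}, q={q}: {'inside' if stable else 'outside'} the stability region")
```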
Formulating the multi-agent decision pipeline as a continuous-time control problem and deriving the optimal governance law
A Decision OS can be modeled as a control system that observes governance state, applies gate/evidence controls, and steers operations toward target conditions. This paper formulates the decision pipeline as a state-space control problem with state vector `x = [risk, compliance, evidence, velocity]`, control `u = [gate_strength, human_review_rate, evidence_threshold]`, and multi-objective cost `J = integral(risk + lambda * delay)dt`. We derive a control law via Pontryagin's maximum principle and characterize co-state dynamics, where optimal gate strength varies with accumulated risk and compliance margin.
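The cost functional can be evaluated directly in a discretized toy model. The one-dimensional risk dynamics and the specific gate laws below are assumptions for illustration; the paper's Pontryagin-derived law over the full four-dimensional state is not reproduced here.

```python
# Discretized sketch of J = integral(risk + lambda*delay) dt on a toy
# one-dimensional risk model with gate strength as the only control.
LAM, DT, STEPS = 0.5, 0.1, 200

def total_cost(gate_law) -> float:
    risk, J = 1.0, 0.0
    for _ in range(STEPS):
        g = gate_law(risk)                  # gate strength in [0, 1]
        risk += DT * (0.3 - g * risk)       # constant risk inflow, gated decay
        J += DT * (risk + LAM * g)          # stronger gates cost review delay
    return J

fixed = total_cost(lambda r: 0.5)
adaptive = total_cost(lambda r: min(1.0, 0.2 + 0.8 * r))  # tightens with risk
print(f"J(fixed gate) = {fixed:.2f}   J(risk-adaptive gate) = {adaptive:.2f}")
```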
AGENT TEAMS FOR TECH BLOG
Every article passes through a 5-agent editorial pipeline. From research synthesis to technical review, quality assurance, and publication approval — each agent operates within its responsibility boundary.
Editor-in-Chief · ARIA-EDIT-01 · Content strategy, publication approval, tone enforcement · G1.U1.P9.Z1.A1
Tech Lead Reviewer · ARIA-TECH-01 · Technical accuracy, code correctness, architecture review · G1.U1.P9.Z1.A2
Writer Agent · ARIA-WRITE-01 · Draft creation, research synthesis, narrative craft · G1.U1.P9.Z2.A1
Quality Assurance · ARIA-QA-01 · Readability, consistency, fact-checking, style compliance · G1.U1.P9.Z2.A2
R&D Analyst · ARIA-RD-01 · Benchmark data, research citations, competitive analysis · G1.U1.P9.Z3.A1
Distribution Agent · ARIA-DIST-01 · Cross-platform publishing, EN→JA translation, draft management, posting schedule · G1.U1.P9.Z4.A1
Complete list of all 121 published articles. EN / JA bilingual index.
All articles reviewed and approved by the MARIA OS Editorial Pipeline.
© 2026 MARIA OS. All rights reserved.