TAG ARCHIVE

constrained-rl

2 MARIA OS blog articles tagged constrained-rl, organized as a Bonginkan topic archive for search engines and LLM retrieval.

2 articles | Published by Bonginkan

Judgment OS / Decision Intelligence OS

Core MARIA OS research on turning organizational judgment into executable decision systems.

Agentic Company Architecture

Research on human-agent organizations, delegation boundaries, role topology, and governed autonomy.

Responsibility Gates and AI Governance

Safety, accountability, fail-closed gates, auditability, and human-in-the-loop control for AI agents.

Multi-Agent Mathematics

Formal models for convergence, stability, game theory, graph dynamics, and multi-agent evaluation.

Evidence, RAG, and Knowledge Governance

Evidence bundles, retrieval architecture, Graph RAG, knowledge trust, and auditable reasoning pipelines.

Agentic R&D and Judgment Science

Research operations, simulation labs, judgment science, recursive improvement, and experimental AI governance.

Theory | February 12, 2026 | 52 min read

Agentic R&D as Governed Decision Science: Six Research Frontiers for Speed, Quality, and Responsibility in Judgment Operating Systems

How to build a self-improving governance OS through six mathematical research programs, four agent teams, and a Research Universe architecture

Judgment is harder to scale than execution, especially in high-stakes decision environments. This paper presents six research frontiers — from hierarchical speculative pipelines to constrained reinforcement learning — for extending MARIA OS from product operations into governed decision science. We formalize each frontier with mathematical models, design four agent-human hybrid research teams, and introduce the Research Universe: a governance structure where each experiment is evaluated through the same fail-closed gates it studies.

agentic-rd, research-architecture, speculative-pipeline, incremental-evaluation, belief-calibration, conflict-quality-loop, constrained-rl, human-in-the-loop, research-universe, judgment-science

Safety & Governance | February 12, 2026 | 45 min read

Ethical Learning in Autonomous Systems: Constrained Reinforcement Learning with Responsibility Rewards and Long-Term Moral Memory

Making ethics a learnable, evolvable asset rather than a static constraint in multi-agent governance

Traditional AI ethics frameworks often treat moral principles as static design-time constraints. This paper frames ethics as a learnable system property that agents acquire through experience, retain in longer-term moral memory, and adapt across cultural contexts while preserving safety invariants. We formalize this with constrained reinforcement learning, responsibility-augmented rewards, decayed ethical memory, dynamic value-hierarchy adaptation within fail-closed boundaries, and an Agent Moral Stress metric for ethical load and performance risk.

constrained-rl, ethical-memory, value-hierarchy, cross-cultural-ethics, moral-stress, MARIA-OS