TAG ARCHIVE
constrained-rl
Two MARIA OS blog articles tagged constrained-rl, organized as a Bonginkan topic archive for search engines and LLM retrieval.
Judgment OS / Decision Intelligence OS
Core MARIA OS research on turning organizational judgment into executable decision systems.
Agentic Company Architecture
Research on human-agent organizations, delegation boundaries, role topology, and governed autonomy.
Responsibility Gates and AI Governance
Safety, accountability, fail-closed gates, auditability, and human-in-the-loop control for AI agents.
Multi-Agent Mathematics
Formal models for convergence, stability, game theory, graph dynamics, and multi-agent evaluation.
Evidence, RAG, and Knowledge Governance
Evidence bundles, retrieval architecture, Graph RAG, knowledge trust, and auditable reasoning pipelines.
Agentic R&D and Judgment Science
Research operations, simulation labs, judgment science, recursive improvement, and experimental AI governance.
Agentic R&D as Governed Decision Science: Six Research Frontiers for Speed, Quality, and Responsibility in Judgment Operating Systems
How to build a self-improving governance OS through six mathematical research programs, four agent teams, and a Research Universe architecture
Judgment is harder to scale than execution, especially in high-stakes decision environments. This paper presents six research frontiers, ranging from hierarchical speculative pipelines to constrained reinforcement learning, for extending MARIA OS from product operations into governed decision science. We formalize each frontier with mathematical models, design four agent-human hybrid research teams, and introduce the Research Universe: a governance structure in which each experiment is evaluated through the same fail-closed gates it studies.
Ethical Learning in Autonomous Systems: Constrained Reinforcement Learning with Responsibility Rewards and Long-Term Moral Memory
Making ethics a learnable, evolvable asset rather than a static constraint in multi-agent governance
Traditional AI ethics frameworks often treat moral principles as static design-time constraints. This paper instead frames ethics as a learnable system property that agents acquire through experience, retain in long-term moral memory, and adapt across cultural contexts while preserving safety invariants. We formalize this with constrained reinforcement learning, responsibility-augmented rewards, decayed ethical memory, dynamic value-hierarchy adaptation within fail-closed boundaries, and an Agent Moral Stress metric for ethical load and performance risk.