CATEGORY ARCHIVE
Intelligence
26 MARIA OS articles in the Intelligence category, covering core research on turning organizational judgment into executable decision systems. This archive strengthens Bonginkan's topical authority across Judgment OS, Agentic Company, and AI governance research.
Judgment OS / Decision Intelligence OS
Core MARIA OS research on turning organizational judgment into executable decision systems.
Agentic Company Architecture
Research on human-agent organizations, delegation boundaries, role topology, and governed autonomy.
Responsibility Gates and AI Governance
Safety, accountability, fail-closed gates, auditability, and human-in-the-loop control for AI agents.
Multi-Agent Mathematics
Formal models for convergence, stability, game theory, graph dynamics, and multi-agent evaluation.
Evidence, RAG, and Knowledge Governance
Evidence bundles, retrieval architecture, Graph RAG, knowledge trust, and auditable reasoning pipelines.
Agentic R&D and Judgment Science
Research operations, simulation labs, judgment science, recursive improvement, and experimental AI governance.
Company Intelligence: Why MARIA OS Is Not an AI Tool but the Operating System for Organizational Judgment
From memory and decision cards to strategic simulation, this is the architecture that turns AI Office from labor automation into an organization that learns
Most AI deployments improve local productivity but fail to compound into institutional intelligence. This article defines Company Intelligence as the closed loop of memory, decision, feedback, and governance, then explains how MARIA OS encodes that loop into company memory, executable decisions, agent performance systems, reflection pipelines, knowledge graphs, and strategic simulation.
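Company Intelligence: Why MARIA OS Is Not an AI Tool but the OS That Builds a Company's Intelligence (in Japanese)
The value of AI Office is determined not by task automation but by whether the company has a closed loop in which it remembers, decides, learns, and improves itself
Many AI deployments improve local productivity yet do not accumulate into company-specific intelligence. This article defines Company Intelligence as a closed loop of Memory, Decision, Feedback, and Governance, and explains how MARIA OS implements that loop in Company Memory, Decision Card, Task Intelligence, Agent Performance, Knowledge Graph, and Strategic Simulation.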
Capability Gap Detection: The Metacognitive Layer That Enables Self-Extending Agents
How agents recognize what they cannot do and trigger autonomous self-extension through formal gap analysis
Self-extending agents require a prerequisite that most architectures ignore: the ability to know what they do not know. This paper formalizes capability gap detection as a metacognitive layer that compares required capabilities against the agent's capability model, classifies detected gaps, prioritizes them by urgency and impact, and decides whether to synthesize, request, delegate, or escalate. We introduce the capability coverage metric, gap entropy measure, and multi-agent gap negotiation protocol. Experimental results show that agents with formal gap detection achieve 4.1x fewer silent failures and 2.8x faster self-extension compared to agents relying on runtime error detection.
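Capability Gap Detection: A Metacognitive Architecture for Agents to Recognize Their Own Capability Shortfalls (in Japanese)
How agents use formal gap analysis to recognize what they cannot do and trigger autonomous self-extension
Self-extending agents have a prerequisite that most architectures ignore: the ability to know what they cannot do. This paper formalizes Capability Gap Detection as a metacognitive layer that compares required capabilities against the agent's capability model, classifies detected gaps, prioritizes them by urgency and impact, and decides whether to synthesize, request, delegate, or escalate. It introduces a capability coverage metric, a gap entropy measure, and a multi-agent gap negotiation protocol.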
CEO Clone as Decision Interface: Persona Layer Design for Delegating Executive Judgment
A formal architecture for encoding executive cognition into an auditable, drift-resistant persona layer that delegates judgment while preserving principal authority
Executive judgment is the highest-leverage bottleneck in any organization. Every strategic decision that waits for the CEO creates queue delay across the entire enterprise. Yet delegation through human hierarchies introduces information loss, preference distortion, and accountability diffusion. This paper presents the CEO Clone — not a chatbot that mimics speech patterns, but a computational decision interface that encodes the CEO's values, risk tolerance, decision patterns, and communication style into a formally verifiable persona layer. We model judgment delegation as a principal-agent problem with information asymmetry, introduce decision fidelity metrics with drift detection, and design calibration loops that maintain clone-principal alignment over time. The architecture operates within MARIA OS governance infrastructure, ensuring every delegated decision produces an immutable audit trail with full traceability to the encoded persona parameters that produced it.
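CEO Clone as a Decision Interface: Persona Layer Design for Delegating Executive Judgment (in Japanese)
A formal architecture that encodes an executive's cognition as an auditable, drift-resistant persona layer and delegates judgment while preserving the principal's authority
Executive judgment is the highest-leverage bottleneck in any organization. Every strategic decision that waits on the CEO creates queue delay across the entire enterprise. Yet delegation through human hierarchies causes information loss, preference distortion, and accountability diffusion. This paper presents the CEO Clone: not a chatbot that mimics the CEO's speech patterns, but a computational decision interface that encodes the CEO's values, risk tolerance, decision patterns, and communication style as a formally verifiable persona layer. We model judgment delegation as a principal-agent problem under information asymmetry, introduce decision fidelity metrics with drift detection, and design calibration loops that maintain clone-principal alignment over time. The architecture operates under the MARIA OS governance infrastructure, so every delegated decision produces an immutable audit trail that is fully traceable to the persona parameters that generated it.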
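Decision Dynamics of CEO OS: A Five-Axis Architecture for Capturing Judgment Mathematically (in Japanese)
A complete design theory for CEO OS that formalizes executive cognition as a five-dimensional decision space X = (L, D, G, I, R) and scales organizational judgment through the physics of judgment gravity, judgment inertia, and layer alignment
Judgment does not scale. Execution does. Yet every organization tries to scale judgment by stacking it through human hierarchies, creating information loss, preference distortion, and accountability diffusion at each layer. CEO OS treats organizational judgment as a physics problem rather than a classification problem: a dynamical system with gravity, inertia, layers, and fields. This paper presents a complete formalization of decision dynamics: a five-axis decision space X = (L, D, G, I, R) that captures cognitive depth, domain specialization, judgment gravity, organizational inertia, and responsibility boundaries. We introduce a 300-question Bayesian elicitation protocol, a layer-alignment algorithm that prevents catastrophic layer mismatch, and a counterfactual simulation engine based on Monte Carlo scenario analysis. The architecture yields a self-calibrating, drift-resistant decision operating system that achieves 8.4x delegation throughput and 94.7% judgment fidelity.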
Metacognition in Agentic Companies: Why AI Systems Must Know What They Don't Know
Latent governance density, observable metacognitive coverage, and the stability bounds of self-governing enterprises
We formalize an agentic company as a graph-augmented constrained Markov decision process G_t = (A_t, E_t, S_t, Pi_t, R_t, D_t), distinguish latent governance density D_t from observable constrained-candidate coverage D_hat_t on router-generated Top-K actions, and define damping via kappa_t = kappa(D_hat_t). The exact local contraction condition is (1 - kappa_t) lambda_max(W_t) < 1, while the buffered operating envelope lambda_max(W_t) < 1 - kappa_t preserves adaptation headroom. Governance constraints thereby function as organizational metacognition: each constraint is a point where the system observes its own behavior. Planet-100 simulations validate that buffered role specialization emerges in the intermediate governance regime.
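The two stability conditions stated in this abstract can be checked numerically. The following is a minimal Python sketch, not the paper's implementation. It assumes W_t is a symmetric, nonnegative matrix (so that plain power iteration recovers lambda_max), and it takes kappa_t as a given number because the abstract does not specify the form of kappa(D_hat_t). All function names are hypothetical.

```python
def lambda_max(W, iters=200):
    """Estimate the largest eigenvalue of W by power iteration.

    Assumption: W is symmetric and nonnegative, so the all-ones start
    vector is not orthogonal to the dominant eigenvector.
    """
    n = len(W)
    v = [1.0 / n ** 0.5] * n  # unit-length start vector
    lam = 0.0
    for _ in range(iters):
        w = [sum(W[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = sum(x * x for x in w) ** 0.5
        if norm == 0.0:
            return 0.0  # W maps v to zero; treat the estimate as 0
        lam = sum(v[i] * w[i] for i in range(n))  # Rayleigh quotient v^T W v
        v = [x / norm for x in w]
    return lam


def exact_contraction(kappa, lam):
    """Exact local contraction condition: (1 - kappa) * lambda_max < 1."""
    return (1.0 - kappa) * lam < 1.0


def buffered_envelope(kappa, lam):
    """Buffered operating envelope: lambda_max < 1 - kappa."""
    return lam < 1.0 - kappa


W = [[0.5, 0.3], [0.3, 0.5]]  # illustrative matrix, eigenvalues 0.8 and 0.2
lam = lambda_max(W)
print(exact_contraction(0.3, lam), buffered_envelope(0.3, lam))
```

For kappa_t in [0, 1) the buffered envelope is the stricter of the two conditions: the illustrative matrix with kappa_t = 0.3 satisfies exact contraction but falls outside the buffered envelope.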
Recursive Adaptation in Action Routing: How MARIA OS Routes Learn from Execution Outcomes
How self-improving routing uses recursive execution feedback to converge toward high-quality policies while preserving Lyapunov stability guarantees
Static action routing — where rules are configured once and applied uniformly — is inadequate for enterprise AI governance. Agent capabilities evolve, workloads shift, and routing quality depends on context that is only observed after execution. This paper introduces a recursive adaptation framework for MARIA OS action routing in which execution outcomes update routing parameters through a formal learning rule. We define θ_{t+1} = θ_t + η∇J(θ_t), where J(θ) is expected routing quality and gradients are estimated from outcome signals. We prove convergence under standard stochastic-approximation assumptions and establish Lyapunov stability guarantees, showing the adaptation process remains bounded while converging toward locally optimal routing policies. Thompson sampling provides principled exploration, and a multi-agent coordination protocol prevents oscillatory conflicts under concurrent adaptation. The quantitative figures in this article should be read as replay and simulation outputs over 14 operating contexts, not as audited production metrics of the current shipping router.
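The learning rule θ_{t+1} = θ_t + η∇J(θ_t) can be illustrated on a toy problem. The sketch below is not the MARIA OS router: it assumes a scalar parameter, a hypothetical concave quality function J(θ) = -(θ - θ*)^2, Gaussian noise on the gradient estimate, and a decaying step size η_t = η_0 / (t + 1), which is one schedule that satisfies the usual stochastic-approximation step-size conditions.

```python
import random


def noisy_gradient(theta, theta_star, rng, noise_sd=0.1):
    """Noisy estimate of dJ/dtheta for the toy objective J = -(theta - theta_star)^2."""
    return -2.0 * (theta - theta_star) + rng.gauss(0.0, noise_sd)


def adapt(theta0, theta_star, steps=2000, eta0=0.5, seed=0):
    """Run theta_{t+1} = theta_t + eta_t * grad_estimate with eta_t = eta0 / (t + 1)."""
    rng = random.Random(seed)  # seeded so the run is reproducible
    theta = theta0
    for t in range(steps):
        eta = eta0 / (t + 1)  # decaying step size (assumed schedule)
        theta = theta + eta * noisy_gradient(theta, theta_star, rng)
    return theta


final_theta = adapt(theta0=0.0, theta_star=2.0)
print(abs(final_theta - 2.0))  # distance from the toy optimum after adaptation
```

In this toy setting the squared distance (θ - θ*)^2 plays the role of a Lyapunov function: it shrinks in expectation at every step, which is the kind of boundedness argument the abstract refers to.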
Collective Calibration Dynamics: How Agent Teams Achieve Shared Epistemic Accuracy in MARIA OS
A formal analysis of how multi-agent teams calibrate collective confidence through structured interaction, showing why individual calibration is necessary but insufficient for team-level epistemic accuracy and how topology governs convergence
Individual calibration error measures how well one agent's stated confidence matches realized accuracy. In collaborative settings, however, a distinct phenomenon appears: collective calibration, where team-level confidence must track team-level accuracy. This paper defines collective calibration error as a metric that cannot be reduced to aggregated individual calibration, proves that individually well-calibrated agents can still form a poorly calibrated team under certain interaction topologies, and derives sufficient graph conditions for convergence. We validate the framework on MARIA OS deployments with 623 agents across 9 zones, showing a 41.7% reduction in collective calibration error via topology-aware reflection scheduling.
Executive Intelligence Synthesis: From Raw Meta-Cognitive Signals to Strategic Decision Support in MARIA OS
How MARIA OS converts low-level meta-cognitive telemetry into executive decision support through information-theoretic compression, relevance filtering, and narrative synthesis
Modern MARIA OS deployments generate tens of thousands of meta-cognitive signals per day, including bias scores, calibration errors, confidence distributions, blind-spot indices, cross-domain insight metrics, and organizational learning rates. Raw dashboards overwhelm executive decision workflows even when the underlying signals contain high-value risk and opportunity patterns. This paper addresses that signal-to-strategy gap by framing executive summarization as a rate-distortion problem: maximize compression while preserving actionable anomalies. We introduce a five-stage synthesis pipeline (hierarchical aggregation, relevance filtering, anomaly surfacing, narrative generation, and latency-accuracy balancing) and evaluate it across 14 MARIA OS deployments. Results show 97.3% information-load reduction with 94.1% anomaly preservation, alongside 2.7x faster and 31% more accurate governance decisions than raw-dashboard workflows.
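Cognitive-Science Foundations of Voice User Interface Design: An Attention-Resource Allocation Model for Multimodal Dialogue (in Japanese)
Integrating Wickens' multiple resource theory, Baddeley's working memory model, and information theory to formalize VUI design principles and validate them in the MARIA VOICE implementation
Voice user interface (VUI) design tends to rely on rules of thumb that do not adequately account for the characteristics of auditory cognitive processing. This article integrates Wickens' multiple resource theory, Baddeley's working memory model, and Shannon's information theory to present a mathematical model of attention-resource allocation in multimodal dialogue. It shows the cognitive optimality of sentence-level streaming TTS, the theoretical basis for a 1.2-second debounce threshold, and the conditions under which barge-in suppression avoids resource contention, giving a theoretical account of MARIA VOICE's design decisions.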
Knowledge Graph Construction from Decision Audit Trails: Entity Resolution and Temporal Edge Weighting for Governance Traceability
Transforming immutable decision records into queryable knowledge structures with principled temporal decay and cross-agent entity resolution
Enterprise governance platforms generate large audit trails that encode organizational decision-making, but those records are often difficult to query across multi-hop relationships. This paper presents a formal framework for constructing knowledge graphs from decision logs, including entity-resolution methods for noisy multi-agent audit data, temporal-decay functions for relevance-aware edge weighting, and compliance-oriented subgraph extraction. Experiments on MARIA OS audit corpora report 91.3% entity-resolution F1 across overlapping agent zones and 2.7x faster compliance-query response than relational baselines.
Knowledge Graph Completion Under Partial Observability: Predicting Missing Responsibility Edges in Enterprise Governance Graphs
Tensor-factorization methods for link prediction in incomplete governance graphs, with theoretical accuracy bounds across observability regimes
Enterprise knowledge graphs are inherently incomplete: undocumented responsibility links, informal decision chains, and cross-zone dependencies leave traceability gaps. This paper formulates governance-graph completion as a tensor-factorization problem under partial observability. We model the graph as a binary three-way tensor X in {0,1}^{n x n x r} (entities x entities x relations), apply CP decomposition to predict missing links, and derive theoretical accuracy bounds as a function of observability rate rho. On MARIA OS governance graphs, CP decomposition recovers 84.2% of withheld responsibility edges at 70% observability and surfaces 31 previously undocumented responsibility gaps in production.
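To make the link-prediction step concrete, the sketch below shows how an already fitted CP model could score a candidate edge: entry (i, j, k) of the tensor is approximated by summing, over latent components, the product of the subject, object, and relation factors. It does not fit the factors, and the factor values, the function names, and the ranking helper are illustrative assumptions rather than the paper's method.

```python
def cp_score(A, B, C, i, j, k):
    """CP reconstruction of entry (i, j, k): sum_c A[i][c] * B[j][c] * C[k][c]."""
    return sum(a * b * c for a, b, c in zip(A[i], B[j], C[k]))


def rank_missing_edges(A, B, C, observed, k):
    """Rank unobserved (i, j) pairs for relation k by CP score, highest first."""
    n = len(A)
    candidates = [
        (i, j)
        for i in range(n)
        for j in range(n)
        if i != j and (i, j, k) not in observed
    ]
    return sorted(
        candidates,
        key=lambda ij: cp_score(A, B, C, ij[0], ij[1], k),
        reverse=True,
    )


# Hypothetical factors: 3 entities, 1 relation, 2 latent components.
A = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]  # entity-as-subject factors
B = [[0.0, 1.0], [0.8, 0.2], [1.0, 0.0]]  # entity-as-object factors
C = [[1.0, 0.5]]                          # relation factors
observed = {(0, 2, 0)}                    # edges already documented

print(rank_missing_edges(A, B, C, observed, k=0)[:2])
```

High-scoring unobserved pairs are the candidates a reviewer would inspect first as possibly undocumented responsibility links.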
Skill Complementarity in Agent Ensembles: A Stable Coverage Metric for Team Composition
Replace brittle convex-hull claims with coverage, dispersion, and backup depth
Selecting the highest-scoring individual agents often yields homogeneous teams that leave important parts of the problem space uncovered. This article replaces an overly brittle convex-hull formulation with a more stable Skill Complementarity Index based on skill coverage, pairwise dispersion, and backup depth. The result is easier to compute, easier to interpret, and better aligned with real team-design decisions.
Detecting Groupthink in Agent Teams: Persistent Homology for Blind-Spot Alerts
Topological signals expose hidden coverage gaps and groupthink risk that pairwise diversity metrics can miss
Persistent homology tracks coverage holes across scales to flag latent team blind spots earlier.
Memory Stratification for AI Governance: A Rate-Distortion Framework for Retention Decisions
Use information theory to decide what enterprise AI systems should remember, summarize, or discard
Rate-distortion memory policy retains high-utility context while limiting latency, privacy risk, and contradiction noise.
Causal-Temporal Knowledge Graph for AI Governance: Path-Specific Responsibility Attribution
A deep research framework for path-specific accountability, time-aware causality, and audit-grade explanation in enterprise AI
A temporal responsibility graph enables path-level causal attribution and faster, more reproducible root-cause analysis.
Gradient Boosting for Enterprise Decision Prediction: XGBoost and LightGBM as the Decision Layer of Agentic Companies
Why enterprise data is often tabular and how gradient boosting ensembles support approval prediction, risk scoring, and outcome estimation
While deep learning dominates many unstructured tasks, enterprise decision data is frequently tabular: structured features describing decisions, agents, contexts, and outcomes. This paper formalizes gradient boosting (XGBoost/LightGBM) as the Decision Layer (Layer 2) of the agentic company stack, details feature-engineering patterns for enterprise decision tables, and introduces SHAP-based explainability workflows for governance audits. Across evaluated datasets, the approach achieved 91.3% approval-prediction accuracy, 0.94 AUC on risk scoring, and full SHAP traceability integrated with MARIA OS responsibility gates.
Random Forest for Interpretable Organizational Decision Trees: Extracting Governance Logic from Ensemble Structure
How bagging-based tree ensembles reveal decision-branch structure, critical governance variables, and auditable policy trees
While gradient boosting often targets predictive accuracy, random forests provide a complementary strength: structural interpretability. This paper positions random forests as an interpretability engine within the Decision Layer (Layer 2), showing how ensemble structure surfaces governance logic, highlights key variables through permutation/impurity importance, and yields auditable policy trees. In evaluated workloads, random-forest feature importance reached 0.93 rank correlation with domain-expert rankings, extracted trees matched 89% of documented governance policies, and out-of-bag error supported validation in data-constrained settings.
Multi-Armed Bandits for Enterprise Strategy Optimization: Thompson Sampling, UCB, and Contextual Bandits in Agentic Organizations
How exploration-exploitation algorithms form the fifth layer of the agentic company architecture
Enterprises continually face the exploration-exploitation dilemma: whether to exploit known strategies or test potentially better alternatives. This paper formalizes multi-armed bandits as the Exploration Layer (Layer 5), covering Thompson sampling with Beta priors, UCB confidence bounds, contextual bandits for personalized decisions, and Bayesian optimization for business hyperparameter tuning. We provide enterprise-oriented regret analysis and describe integration with the MARIA OS strategy engine.
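Thompson sampling with Beta priors, as named in this abstract, can be sketched in a few lines for the Bernoulli-reward case. This is a generic textbook version under stated assumptions (uniform Beta(1, 1) priors, binary success/failure outcomes, hypothetical strategy success rates), not the MARIA OS strategy engine.

```python
import random


def thompson_select(successes, failures, rng):
    """Sample each arm's Beta posterior and pick the arm with the highest draw."""
    samples = [
        rng.betavariate(1 + s, 1 + f)  # Beta(1, 1) prior plus observed counts
        for s, f in zip(successes, failures)
    ]
    return max(range(len(samples)), key=samples.__getitem__)


def run_bandit(true_rates, rounds, seed=0):
    """Simulate Bernoulli arms; return per-arm success and failure counts."""
    rng = random.Random(seed)
    k = len(true_rates)
    successes, failures = [0] * k, [0] * k
    for _ in range(rounds):
        arm = thompson_select(successes, failures, rng)
        if rng.random() < true_rates[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
    return successes, failures


# Three hypothetical strategies with unknown success rates.
s, f = run_bandit([0.2, 0.5, 0.8], rounds=2000)
pulls = [a + b for a, b in zip(s, f)]
print(pulls)  # pulls per strategy
```

Exploration falls out of posterior uncertainty: arms with few observations have wide posteriors and are still sampled occasionally, while pulls concentrate on the arm that keeps succeeding.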
Graph RAG for Causal Structure Extraction: Matrix Methods for Multi-Hop Retrieval with Evidence Cohesion
How organizational knowledge graphs enable responsibility chain tracing and risk concentration detection
Standard RAG often retrieves flat document chunks that under-represent relational structure needed for causal and responsibility reasoning. Graph RAG models documents and entities as nodes in an adjacency matrix, enabling multi-hop retrieval along causal paths in organizational knowledge. We formalize an h-hop diffusion score, derive hop-depth choices from a noise-accuracy tradeoff, and introduce an evidence-cohesion metric that gates response generation by subgraph density. In contract-corpus evaluations, the method reported 73.4% causal-path extraction accuracy at 3 hops, a 31% improvement over flat Top-k RAG for responsibility-chain identification, and `r = 0.87` correlation between cohesion score and response correctness.
Evidence Bundle-Enforced RAG: Mandatory Citation and Refusal Mechanisms for Trustworthy AI Responses
Shifting from 'answering' to 'answering with evidence' through a mathematical framework for hallucination reduction
Enterprise RAG reliability degrades when evidence requirements are weak. This paper introduces Evidence Bundle-Enforced RAG, where responses include mandatory citations, confidence signals, and paragraph-level provenance. When evidence is insufficient, the system can refuse to answer instead of fabricating content. We present a mathematical model for evidence sufficiency scoring, hallucination control, trust dynamics, and recursive improvement loops. In enterprise document-QA evaluations, hallucination rate was reduced from 23.7% to 3.2%.
Conflict Card Generation Algorithm: From Matrix to Explainable Decision Artifacts
Transforming mathematical conflict detection into human-readable governance artifacts with actionable resolution paths
A negative eigenvalue is mathematically precise but difficult to operationalize directly. This paper bridges matrix-level conflict detection and human decision-making through a Conflict Card artifact that translates spectral signals into scored pairs, impact assessments, and recommended resolution paths. We present the generation algorithm, scoring function, and card-template structure.
Why Evidence Bundles Stabilize RAG Accuracy: A Variance Reduction Framework
Proving that bundled evidence reduces hallucination rate exponentially and establishing cohesion-based answer refusal thresholds
RAG reliability depends strongly on evidence quality and cohesion. When retrieved passages are topically scattered, model outputs are more likely to hallucinate to fill coherence gaps. This paper models hallucination rate as `H(e) = H_base * exp(-lambda * density(e))`, analyzes how bundled retrieval reduces answer variance as cohesion increases, and derives cohesion thresholds for refusal behavior under low-evidence conditions. Across 8,400 governance queries, evidence bundles reduced hallucination from 12.3% to 2.1%.
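The model H(e) = H_base * exp(-lambda * density(e)) implies a simple refusal rule: solving H(e) <= H_target for density gives density(e) >= ln(H_base / H_target) / lambda. The sketch below works through that arithmetic. The parameter values and function names are illustrative assumptions, not the paper's fitted values, and the paper's actual threshold derivation may differ.

```python
import math


def hallucination_rate(h_base, lam, density):
    """Modeled hallucination rate H(e) = H_base * exp(-lambda * density(e))."""
    return h_base * math.exp(-lam * density)


def refusal_density_threshold(h_base, lam, h_target):
    """Smallest density at which the modeled rate is at or below h_target."""
    if h_target >= h_base:
        return 0.0  # the target is already met with no evidence cohesion
    return math.log(h_base / h_target) / lam


def should_refuse(h_base, lam, density, h_target):
    """Refuse to answer when evidence density is below the threshold."""
    return density < refusal_density_threshold(h_base, lam, h_target)


# Illustrative values only: H_base = 20%, lambda = 2.0, target rate = 5%.
threshold = refusal_density_threshold(0.20, 2.0, 0.05)
print(round(threshold, 4))                   # ln(4) / 2
print(should_refuse(0.20, 2.0, 0.50, 0.05))  # density below the threshold
print(should_refuse(0.20, 2.0, 0.80, 0.05))  # density above the threshold
```

Because the modeled rate decays exponentially in density, tightening the target rate raises the required density only logarithmically.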
Conflict Visualization vs Integration: A Comparative Experiment on Decision Regret and Correction Rate
Empirical comparison across 1,200 decisions in three organizations
Should governance systems resolve conflicts before human review, or surface conflicts explicitly for human judgment? This paper reports a controlled comparison between Conflict Integration (CI), which resolves conflicts algorithmically before presentation, and Conflict Visualization (CV), which presents conflicts with supporting evidence. Across 1,200 decisions in three organizations, CV reduced decision regret by 34%, increased correction rate by 2.8x, and improved reviewer confidence by 28%.