ENGINEERING BLOG
Technical research and engineering insights from the team building the operating system for responsible AI operations.
188 articles · Published by MARIA OS
Eight papers that form the complete theory-to-operations stack: why organizational judgment needs an OS, structural design, stability laws, algorithm architecture, mission-constrained optimization, survival optimization, workforce transition, and agent lifecycle management.
Series Thesis
Company Intelligence explains why the OS exists. Structure defines responsibility. Stability laws prove when governance holds. Algorithms make it executable. Mission constraints keep optimization aligned. Survival theory determines evolutionary direction. White-collar transition shows who moves first. VITAL keeps the whole system alive.
00
Company Intelligence
Why organizational judgment needs an operating system, not just AI tools.
01
Structural Design
How to decompose responsibility across human-agent boundaries.
02
Stability Laws
Mathematical conditions under which agentic governance holds or breaks.
03
Algorithm Stack
10 algorithms mapped to a 7-layer architecture for agentic organizations.
04
Mission Constraints
How to optimize agent goals without eroding organizational values.
05
Survival Optimization
Does evolutionary pressure reduce organizations to pure survival machines? The math of directed vs. undirected evolution.
06
Workforce Transition
Which white-collar workflows move first, and how fast the shift happens.
How a 7-layer prompt hierarchy, 5 conversation modes, zero-latency knowledge injection, and sentence-level streaming create a voice AI that understands before it speaks
Voice assistants answer questions. MARIA Voice understands people. Built on a 7-layer prompt hierarchy (Constitution, Identity, Response Style, Meta-Cognition, Safety, Persona, Memory), MARIA Voice implements a full cognitive pipeline: keyword-based emotion detection, context-sensitive mode switching, 2-tier knowledge injection, 6-layer persistent memory, and mode-adaptive response generation — all optimized for real-time voice with sub-800ms first-sentence latency. This paper presents the theoretical foundations in cognitive science and therapeutic dialogue, the complete system architecture, the mathematical models underlying emotion and mode detection, and production results from thousands of voice sessions.
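The keyword-based emotion detection named in the abstract can be sketched as a lexicon lookup with a saturating confidence score. The lexicon entries, labels, and scoring rule below are illustrative assumptions, not the MARIA Voice production implementation.

```python
# Hypothetical sketch of keyword-based emotion detection. The lexicon,
# labels, and confidence formula are assumptions for illustration.

EMOTION_LEXICON = {
    "anxious": {"worried", "nervous", "scared", "afraid"},
    "frustrated": {"annoyed", "stuck", "unfair", "tired of"},
    "positive": {"great", "happy", "excited", "thanks"},
}

def detect_emotion(utterance: str) -> tuple[str, float]:
    """Return (label, confidence) from keyword hits; 'neutral' if none."""
    text = utterance.lower()
    scores = {
        label: sum(1 for kw in kws if kw in text)
        for label, kws in EMOTION_LEXICON.items()
    }
    label, hits = max(scores.items(), key=lambda kv: kv[1])
    if hits == 0:
        return ("neutral", 0.0)
    # Confidence grows with hit count but saturates at 1.0.
    return (label, min(1.0, hits / 3))
```

Because detection is pure string matching over a fixed lexicon, it adds effectively zero latency to the voice pipeline, which matters under a sub-800ms first-sentence budget.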
Agents as compilers — a formal framework mapping NL intent through intermediate representation to optimized, type-safe runtime tools
Tool-generating agents are ad-hoc code producers. We reframe tool synthesis as a compilation problem: natural language intent is parsed into an Intent AST, lowered to a Tool IR (intermediate representation), optimized through security hardening and dead code elimination passes, and emitted as type-safe executable code that hot-loads into the agent runtime. This paper presents the Agent Tool Compiler architecture with formal language theory foundations.
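The parse-lower-harden-emit flow described above can be sketched end to end. The node shapes, the single hardening pass, and the emitted template are assumptions made for illustration; the Agent Tool Compiler's actual representations are defined in the paper.

```python
# Minimal sketch of the intent -> AST -> IR -> emit compilation flow.
# Dataclass fields, pass behavior, and the emitted code template are
# illustrative assumptions, not the real compiler's representation.
from dataclasses import dataclass

@dataclass
class IntentAST:            # parsed natural-language intent
    action: str
    target: str

@dataclass
class ToolIR:               # lowered intermediate representation
    op: str
    args: list[str]
    safe: bool = False

def parse_intent(utterance: str) -> IntentAST:
    verb, _, obj = utterance.partition(" ")
    return IntentAST(action=verb, target=obj)

def lower(ast: IntentAST) -> ToolIR:
    return ToolIR(op=ast.action, args=[ast.target])

def harden(ir: ToolIR) -> ToolIR:
    # Security-hardening pass: strip shell metacharacters from args.
    ir.args = [a.replace(";", "").replace("|", "") for a in ir.args]
    ir.safe = True
    return ir

def emit(ir: ToolIR) -> str:
    # Emit a type-annotated Python tool body from the hardened IR.
    assert ir.safe, "refuse to emit unhardened IR"
    return (f"def tool(arg: str = {ir.args[0]!r}) -> str:\n"
            f"    return {ir.op!r} + ' ' + arg")

ir = harden(lower(parse_intent("fetch report.csv; rm -rf /")))
source = emit(ir)  # source holds a type-annotated, hardened tool definition
```

The `emit` stage refusing unhardened IR mirrors the compiler framing: safety is a mandatory pass in the pipeline, not an optional lint on the output.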
From static tool chains to self-extending capability — how MARIA OS agents create the tools they need at runtime
Normal agents wait for humans to build tools. MARIA OS agents create their own. This paper details the 4-phase tool lifecycle — Discovery, Synthesis, Validation, Registration — that enables agents to identify missing capabilities, generate tool implementations, verify correctness and safety in sandboxed environments, and hot-load new tools into the OS runtime. We formalize tool generation rate, quality convergence, and multi-agent tool sharing, and present a case study of an Audit agent creating an OCR extraction tool at runtime.
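The four-phase lifecycle above reduces to a small ordered state machine. The gate predicate and the fail-closed return to Discovery are assumptions made for illustration; each phase's actual checks are detailed in the paper.

```python
# Illustrative sketch of the four-phase tool lifecycle. The single
# boolean gate and the fail-closed fallback are assumed simplifications.
from enum import Enum, auto

class Phase(Enum):
    DISCOVERY = auto()     # a missing capability is identified
    SYNTHESIS = auto()     # an implementation is generated
    VALIDATION = auto()    # sandboxed correctness and safety checks
    REGISTRATION = auto()  # hot-load into the OS runtime

ORDER = [Phase.DISCOVERY, Phase.SYNTHESIS, Phase.VALIDATION, Phase.REGISTRATION]

def advance(phase: Phase, gate_passed: bool) -> Phase:
    """Assumed fail-closed policy: a failed gate restarts at DISCOVERY."""
    if not gate_passed:
        return Phase.DISCOVERY
    i = ORDER.index(phase)
    return ORDER[min(i + 1, len(ORDER) - 1)]
```

A registered tool that later fails revalidation would re-enter the cycle at Discovery rather than stay live, which is the conservative choice for a governed runtime.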
Formal test categories, composite scoring, and continuous evaluation pipelines that transform agent quality from subjective assessment into reproducible engineering measurement
Agent quality cannot be managed if it cannot be measured. Traditional software testing verifies deterministic input-output mappings, but AI agents operate in stochastic, multi-step decision spaces where correctness is contextual, safety is probabilistic, and governance compliance is structural. This paper introduces the MARIA OS Evaluation Harness — a standardized testing infrastructure that defines four test categories (correctness, safety, performance, governance compliance), four primary metrics (decision accuracy, gate compliance rate, evidence quality score, latency under load), and a formal composite scoring framework. We present the harness architecture comprising a test runner, scenario generator, oracle comparator, and regression detector, all scoped through MARIA coordinates for hierarchical test targeting. We prove that the composite agent score is monotonically responsive to genuine quality improvements and demonstrate that continuous evaluation pipelines catch 94.7% of quality regressions before production deployment.
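A composite score over the four primary metrics can be sketched as a weighted mean. The weights and the choice of a weighted arithmetic mean are illustrative assumptions; the paper's actual framework may combine the metrics differently, but any weighted mean with positive weights has the monotonicity property the abstract claims.

```python
# Sketch of a composite agent score over the four primary metrics named
# in the abstract. Weights and aggregation rule are assumptions.

WEIGHTS = {
    "decision_accuracy": 0.35,
    "gate_compliance_rate": 0.30,
    "evidence_quality": 0.20,
    "latency_score": 0.15,   # latency mapped to [0, 1]; higher = faster
}

def composite_score(metrics: dict[str, float]) -> float:
    """Weighted mean over [0, 1] metrics; monotone in every component."""
    assert set(metrics) == set(WEIGHTS)
    return sum(WEIGHTS[k] * metrics[k] for k in WEIGHTS)
```

Because every weight is strictly positive, improving any single metric strictly raises the composite, so a genuine quality improvement can never lower the reported score.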
An agentic R&D team architecture for robot governance research — two lab divisions, eleven specialized agents, and five research themes bridging MARIA OS Multi-Universe evaluation with physical-world robotic systems
Physical-world robots demand governance architectures that digital-only agent systems cannot provide: sub-millisecond fail-closed gates, real-time multi-universe conflict detection, embodied ethical learning under sensor noise, and quantitative human-robot responsibility allocation at every decision node. This paper presents the Robot Judgment OS Lab — an agentic R&D team design embedded within the MARIA OS coordinate system, organized into two divisions (Robot Gate Architecture Lab and Embodied Learning & Conflict Lab) with eleven specialized agents operating under fail-closed research gates. We formalize five research themes: Responsibility-Bounded Robot Decision, Physical-World Conflict Mapping, Embodied Ethical Learning, Human-Robot Responsibility Matrix, and ROS2 Multi-Universe Bridge. Mathematical contributions include a real-time ConflictScore function, constrained RL for embodied ethics calibration, a four-factor responsibility decomposition protocol, safety-bounded action spaces, and a layered architecture formalization from ROS2 base through Multi-Universe, Gate, and Conflict layers. The lab design demonstrates that structured R&D governance — where research teams are themselves governed by the infrastructure they study — produces faster, safer, and more auditable advances in robot judgment than traditional unstructured robotics research.
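One way to make the real-time ConflictScore concrete: treat each universe's proposal as a preference vector over candidate actions and score pairwise divergence. The L1-based formulation below is a hypothetical sketch, not the lab's actual function.

```python
# Hypothetical ConflictScore sketch: mean normalized L1 distance between
# universes' action-preference vectors. The representation and metric
# are assumptions for illustration only.
from itertools import combinations

def conflict_score(proposals: dict[str, dict[str, float]]) -> float:
    """0.0 = full agreement across universes, 1.0 = maximal conflict
    (assumes each preference vector sums to 1)."""
    actions = sorted({a for p in proposals.values() for a in p})
    pairs = list(combinations(proposals.values(), 2))
    if not pairs:
        return 0.0
    total = sum(
        sum(abs(p.get(a, 0.0) - q.get(a, 0.0)) for a in actions) / 2
        for p, q in pairs
    )
    return total / len(pairs)
```

Because the score is a fixed number of dictionary lookups per universe pair, it is cheap enough to evaluate inside a sub-millisecond fail-closed gate.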
How a seven-state machine coordinates browser automation, audio capture, speech recognition, and live streaming into a coherent meeting intelligence pipeline
A meeting AI bot is not a single component — it is an orchestra of subsystems that must start, coordinate, and stop in precise sequence. The browser must launch before audio can be captured. Audio must flow before speech recognition begins. Recognition must produce segments before minutes can be generated. And when the meeting ends, all components must shut down gracefully without losing data. This paper presents the state machine design of MARIA Meeting AI's session manager, which coordinates Playwright browser automation, CDP audio capture, Gemini Live Audio ASR, and incremental minutes generation through a seven-state lifecycle with EventEmitter-based real-time streaming to dashboard clients.
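The ordering constraints above (browser before audio, audio before recognition, recognition before minutes, graceful shutdown from anywhere) can be encoded as a transition table. The seven state names below are assumptions chosen to match those constraints; the paper defines the actual machine.

```python
# Sketch of a seven-state session lifecycle consistent with the ordering
# constraints in the abstract. State names and the transition table are
# assumptions, not MARIA Meeting AI's actual session manager.
from enum import Enum, auto

class SessionState(Enum):
    IDLE = auto()
    BROWSER_LAUNCHING = auto()
    AUDIO_CAPTURING = auto()
    RECOGNIZING = auto()
    GENERATING_MINUTES = auto()
    SHUTTING_DOWN = auto()
    CLOSED = auto()

# Legal transitions: each subsystem may start only after its dependency,
# and every active state can fall through to a graceful shutdown.
TRANSITIONS = {
    SessionState.IDLE: {SessionState.BROWSER_LAUNCHING},
    SessionState.BROWSER_LAUNCHING: {SessionState.AUDIO_CAPTURING,
                                     SessionState.SHUTTING_DOWN},
    SessionState.AUDIO_CAPTURING: {SessionState.RECOGNIZING,
                                   SessionState.SHUTTING_DOWN},
    SessionState.RECOGNIZING: {SessionState.GENERATING_MINUTES,
                               SessionState.SHUTTING_DOWN},
    SessionState.GENERATING_MINUTES: {SessionState.RECOGNIZING,
                                      SessionState.SHUTTING_DOWN},
    SessionState.SHUTTING_DOWN: {SessionState.CLOSED},
    SessionState.CLOSED: set(),
}

def transition(state: SessionState, target: SessionState) -> SessionState:
    """Reject transitions that violate the startup/shutdown ordering."""
    if target not in TRANSITIONS[state]:
        raise ValueError(f"illegal transition {state.name} -> {target.name}")
    return target
```

Making illegal transitions raise, rather than silently no-op, is what guarantees the "stop in precise sequence" property: a shutdown can never skip the state that flushes unsaved minutes.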
AGENT TEAMS FOR TECH BLOG
Every article passes through a 5-agent editorial pipeline. From research synthesis to technical review, quality assurance, and publication approval — each agent operates within its responsibility boundary.
Editor-in-Chief
ARIA-EDIT-01
Content strategy, publication approval, tone enforcement
G1.U1.P9.Z1.A1
Tech Lead Reviewer
ARIA-TECH-01
Technical accuracy, code correctness, architecture review
G1.U1.P9.Z1.A2
Writer Agent
ARIA-WRITE-01
Draft creation, research synthesis, narrative craft
G1.U1.P9.Z2.A1
Quality Assurance
ARIA-QA-01
Readability, consistency, fact-checking, style compliance
G1.U1.P9.Z2.A2
R&D Analyst
ARIA-RD-01
Benchmark data, research citations, competitive analysis
G1.U1.P9.Z3.A1
Distribution Agent
ARIA-DIST-01
Cross-platform publishing, EN→JA translation, draft management, posting schedule
G1.U1.P9.Z4.A1
Complete list of all 188 published articles. EN / JA bilingual index.
All articles reviewed and approved by the MARIA OS Editorial Pipeline.
© 2026 MARIA OS. All rights reserved.