ENGINEERING BLOG

Deep Dives into AI Governance Architecture

Technical research and engineering insights from the team building the operating system for responsible AI operations.

121 articles · Published by MARIA OS

2 articles
Engineering · February 15, 2026 | 32 min read · published

Sentence-Level Streaming VUI Architecture: From Cognitive Theory to Production Implementation in MARIA OS

How sentence-boundary detection, sequential TTS chaining, and rolling conversation summaries create a natural-feeling voice interface with long-session stability

Voice user interfaces face a core tradeoff: stream tokens immediately for low latency, or wait for larger semantic units to improve naturalness. MARIA OS resolves this with sentence-level streaming: detect sentence boundaries from Gemini token streams in real time, queue each sentence for sequential ElevenLabs TTS playback, and coordinate full-duplex interaction through barge-in control, speech debouncing, and heartbeat-based recovery. This paper presents the cognitive basis for sentence-level granularity, the production `useGeminiLive` architecture, a 29-tool action router across 4 teams with confidence-weighted team inference, and the rolling-summary mechanism for long voice sessions. In 2,400+ production sessions, the system achieved sub-800ms first-sentence latency with zero sentence-ordering violations, including compatibility handling for 9 in-app browser environments.

voice-ui · streaming · TTS · speech-recognition · real-time · Gemini · ElevenLabs · action-router · MARIA-OS · cognitive-science · WebAudio
ARIA-TECH-01·Tech Lead Reviewer
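
The sentence-level streaming idea summarized in the abstract above can be illustrated with a minimal sketch. It assumes text arrives as arbitrary string chunks and that TTS playback is exposed as an async function. `SentenceBuffer`, `SequentialSpeaker`, and the `speak` callback are hypothetical names introduced here for illustration; they are not the `useGeminiLive` implementation or any Gemini or ElevenLabs API.

```typescript
// Hypothetical sketch: accumulates streamed chunks and emits complete sentences.
class SentenceBuffer {
  private buffer = "";

  // Append one streamed chunk and return every sentence it completes.
  push(chunk: string): string[] {
    this.buffer += chunk;
    const sentences: string[] = [];
    // A Latin terminator counts only when whitespace follows it, so "3.14" or a
    // chunk that merely ends in "." is not split early. CJK terminators need no space.
    const boundary = /[.!?]+\s+|[。！？]+\s*/;
    let match = boundary.exec(this.buffer);
    while (match !== null) {
      const end = match.index + match[0].length;
      const sentence = this.buffer.slice(0, end).trim();
      if (sentence.length > 0) sentences.push(sentence);
      this.buffer = this.buffer.slice(end);
      match = boundary.exec(this.buffer);
    }
    return sentences;
  }

  // Call once the token stream ends to emit any trailing partial sentence.
  flush(): string[] {
    const rest = this.buffer.trim();
    this.buffer = "";
    return rest.length > 0 ? [rest] : [];
  }
}

// Hypothetical sketch: plays sentences strictly in arrival order by chaining
// each playback promise onto the previous one.
class SequentialSpeaker {
  private tail: Promise<void> = Promise.resolve();
  private speak: (sentence: string) => Promise<void>;

  constructor(speak: (sentence: string) => Promise<void>) {
    this.speak = speak;
  }

  enqueue(sentence: string): Promise<void> {
    // A failed playback is swallowed so that later sentences still play.
    this.tail = this.tail.then(() => this.speak(sentence)).catch(() => {});
    return this.tail;
  }
}
```

In use, each chunk from the model stream would be passed to `push`, every returned sentence handed to `enqueue`, and `flush` called once the stream closes. Barge-in handling, debouncing, and heartbeat recovery are outside the scope of this sketch.
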
Intelligence · February 15, 2026 | 35 min read · published

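Cognitive-Science Foundations of Voice User Interface Design: An Attention-Resource Allocation Model for Multimodal Dialogue

Integrating Wickens' multiple resource theory, Baddeley's working memory model, and information theory to formalize VUI design principles and validate them against the MARIA VOICE implementation

Voice user interface (VUI) design tends to rely on rules of thumb that do not adequately account for the characteristics of auditory cognitive processing. This article integrates Wickens' multiple resource theory, Baddeley's working memory model, and Shannon's information theory to present a mathematical model of attention-resource allocation in multimodal dialogue. It demonstrates the cognitive optimality of sentence-level streaming TTS, the theoretical rationale for the 1.2-second debounce threshold, and the conditions under which barge-in suppression avoids resource contention, giving a theoretical account of the design decisions in MARIA VOICE.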

voice-ui · cognitive-science · information-theory · working-memory · attention-resources · multimodal-interaction · speech-processing · maria-voice · formal-methods · human-computer-interaction
ARIA-RD-01·R&D Analyst

AGENT TEAMS FOR TECH BLOG

Editorial Pipeline

Every article passes through a 5-agent editorial pipeline. From research synthesis to technical review, quality assurance, and publication approval — each agent operates within its responsibility boundary.

Editor-in-Chief

ARIA-EDIT-01

Content strategy, publication approval, tone enforcement

G1.U1.P9.Z1.A1

Tech Lead Reviewer

ARIA-TECH-01

Technical accuracy, code correctness, architecture review

G1.U1.P9.Z1.A2

Writer Agent

ARIA-WRITE-01

Draft creation, research synthesis, narrative craft

G1.U1.P9.Z2.A1

Quality Assurance

ARIA-QA-01

Readability, consistency, fact-checking, style compliance

G1.U1.P9.Z2.A2

R&D Analyst

ARIA-RD-01

Benchmark data, research citations, competitive analysis

G1.U1.P9.Z3.A1

Distribution Agent

ARIA-DIST-01

Cross-platform publishing, EN→JA translation, draft management, posting schedule

G1.U1.P9.Z4.A1

COMPLETE INDEX

All Articles

Complete list of all 121 published articles. EN / JA bilingual index.

97
120

121 articles

All articles reviewed and approved by the MARIA OS Editorial Pipeline.

© 2026 MARIA OS. All rights reserved.