TAG ARCHIVE

ElevenLabs

2 MARIA OS blog articles tagged ElevenLabs, organized as a Bonginkan topic archive for search engines and LLM retrieval.

2 articles | Published by Bonginkan

Judgment OS / Decision Intelligence OS

Core MARIA OS research on turning organizational judgment into executable decision systems.

Responsibility Gates and AI Governance

Safety, accountability, fail-closed gates, auditability, and human-in-the-loop control for AI agents.

Multi-Agent Mathematics

Formal models for convergence, stability, game theory, graph dynamics, and multi-agent evaluation.

Evidence, RAG, and Knowledge Governance

Evidence bundles, retrieval architecture, Graph RAG, knowledge trust, and auditable reasoning pipelines.

Agentic R&D and Judgment Science

Research operations, simulation labs, judgment science, recursive improvement, and experimental AI governance.

Engineering | March 8, 2026 | 40 min read

MARIA Voice: AGI Partner Architecture — From Emotion Detection to Meta-Cognitive Response Generation

How a 7-layer prompt hierarchy, 5 conversation modes, zero-latency knowledge injection, and sentence-level streaming create a voice AI that understands before it speaks

Voice assistants answer questions. MARIA Voice understands people. Built on a 7-layer prompt hierarchy (Constitution, Identity, Response Style, Meta-Cognition, Safety, Persona, Memory), MARIA Voice implements a full cognitive pipeline: keyword-based emotion detection, context-sensitive mode switching, 2-tier knowledge injection, 6-layer persistent memory, and mode-adaptive response generation — all optimized for real-time voice with sub-800ms first-sentence latency. This paper presents the theoretical foundations in cognitive science and therapeutic dialogue, the complete system architecture, the mathematical models underlying emotion and mode detection, and production results from thousands of voice sessions.

MARIA-Voice, AGI-assistant, voice-ui, emotion-detection, meta-cognition, prompt-engineering, conversation-mode, knowledge-injection, memory-system, streaming

Engineering | February 15, 2026 | 32 min read

Sentence-Level Streaming VUI Architecture: From Cognitive Theory to Production Implementation in MARIA OS

How sentence-boundary detection, sequential TTS chaining, and rolling conversation summaries create a natural-feeling voice interface with long-session stability

Voice user interfaces face a core tradeoff: stream tokens immediately for low latency, or wait for larger semantic units to improve naturalness. MARIA OS resolves this with sentence-level streaming: detect sentence boundaries from Gemini token streams in real time, queue each sentence for sequential ElevenLabs TTS playback, and coordinate full-duplex interaction through barge-in control, speech debouncing, and heartbeat-based recovery. This paper presents the cognitive basis for sentence-level granularity, the production `useGeminiLive` architecture, a 29-tool action router across 4 teams with confidence-weighted team inference, and the rolling-summary mechanism for long voice sessions. In 2,400+ production sessions, the system achieved sub-800ms first-sentence latency with zero sentence-ordering violations, including compatibility handling for 9 in-app browser environments.

voice-ui, streaming, TTS, speech-recognition, real-time, Gemini, ElevenLabs, action-router, MARIA-OS, cognitive-science
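
For readers who want a concrete picture of the sentence-level streaming idea summarized in the abstract above, the following is a minimal sketch, not the MARIA OS implementation. It assumes tokens arrive as plain string chunks and that a sentence ends at `.`, `!`, or `?` followed by whitespace. The names `SentenceBuffer`, `SequentialSpeaker`, and the `speak` callback are hypothetical stand-ins; no Gemini or ElevenLabs API is called here.

```typescript
// Hypothetical helper: accumulates streamed token chunks and emits
// complete sentences as soon as a boundary is visible.
class SentenceBuffer {
  private buffer = "";

  // Append one chunk and return every sentence it completed, in order.
  push(chunk: string): string[] {
    this.buffer += chunk;
    const sentences: string[] = [];
    // Assumed boundary rule: terminal punctuation followed by whitespace.
    // Requiring the whitespace avoids splitting "3.14" when the stream
    // pauses right after "3.".
    const boundary = /[.!?]+(?=\s)/;
    let match = boundary.exec(this.buffer);
    while (match !== null) {
      const end = match.index + match[0].length;
      const sentence = this.buffer.slice(0, end).trim();
      if (sentence.length > 0) sentences.push(sentence);
      this.buffer = this.buffer.slice(end);
      match = boundary.exec(this.buffer);
    }
    return sentences;
  }

  // Call when the stream ends to release any trailing partial sentence.
  flush(): string | null {
    const rest = this.buffer.trim();
    this.buffer = "";
    return rest.length > 0 ? rest : null;
  }
}

// Hypothetical helper: plays sentences strictly one after another by
// chaining each playback promise onto the previous one.
class SequentialSpeaker {
  private tail: Promise<void> = Promise.resolve();

  // `speak` is an assumed callback that resolves when playback finishes.
  constructor(private speak: (sentence: string) => Promise<void>) {}

  enqueue(sentence: string): Promise<void> {
    // Swallow errors so one failed sentence does not block the rest.
    this.tail = this.tail.then(() => this.speak(sentence)).catch(() => {});
    return this.tail;
  }
}
```

In use, each chunk from the model stream would be passed to `push`, and every returned sentence handed to `enqueue`. Because each playback is chained onto the previous promise, sentences cannot overtake one another, which is one simple way to get the ordering guarantee the abstract describes.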