Engineering · February 16, 2026 · 30 min read

Real-Time Meeting Session Orchestration: State Machine Design for Multi-Component Bot Systems

How a seven-state machine coordinates browser automation, audio capture, speech recognition, and live streaming into a coherent meeting intelligence pipeline

A meeting AI bot is not a single component; it is an orchestra of subsystems that must start, coordinate, and stop in precise sequence. The browser must launch before audio can be captured. Audio must flow before speech recognition begins. Recognition must produce segments before minutes can be generated. And when the meeting ends, all components must shut down gracefully without losing data. This paper presents the state machine design of MARIA Meeting AI's session manager, which coordinates Playwright browser automation, CDP audio capture, Gemini Live Audio ASR, and incremental minutes generation through a seven-state lifecycle with EventEmitter-based real-time streaming to dashboard clients.
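The ordering constraint described above can be illustrated with a small transition table. This is a minimal sketch, not the MARIA session manager itself: the seven state names, the `TRANSITIONS` table, the `SessionStateMachine` class, and the `"state"` event name are all hypothetical, since the summary does not list the real states. The sketch assumes a Node.js runtime for `EventEmitter`.

```typescript
import { EventEmitter } from "node:events";

// Hypothetical state names: the article's actual seven states are not
// listed in this summary, so these are placeholders for illustration.
type SessionState =
  | "idle"
  | "launching"
  | "joining"
  | "capturing"
  | "recognizing"
  | "stopping"
  | "ended";

// Startup is strictly ordered (browser, then audio, then recognition),
// and every active state can move to "stopping" so that a graceful
// shutdown is always reachable.
const TRANSITIONS: Record<SessionState, readonly SessionState[]> = {
  idle: ["launching"],
  launching: ["joining", "stopping"],
  joining: ["capturing", "stopping"],
  capturing: ["recognizing", "stopping"],
  recognizing: ["stopping"],
  stopping: ["ended"],
  ended: [],
};

class SessionStateMachine extends EventEmitter {
  private current: SessionState = "idle";

  get state(): SessionState {
    return this.current;
  }

  // Move to `next` if the table allows it, then notify subscribers
  // (for example, a handler that forwards events to dashboard clients).
  transition(next: SessionState): void {
    if (!TRANSITIONS[this.current].includes(next)) {
      throw new Error(`Illegal transition: ${this.current} -> ${next}`);
    }
    const previous = this.current;
    this.current = next;
    this.emit("state", { from: previous, to: next });
  }
}
```

Keeping the legal transitions in one table means an out-of-order step, such as starting recognition before audio capture, fails loudly instead of leaving components half-started.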

meeting-ai, state-machine, orchestration, event-driven, sse, real-time, playwright, session-management

Engineering · February 15, 2026 · 32 min read

Sentence-Level Streaming VUI Architecture: From Cognitive Theory to Production Implementation in MARIA OS

How sentence-boundary detection, sequential TTS chaining, and rolling conversation summaries create a natural-feeling voice interface with long-session stability

Voice user interfaces face a core tradeoff: stream tokens immediately for low latency, or wait for larger semantic units to improve naturalness. MARIA OS resolves this with sentence-level streaming: detect sentence boundaries from Gemini token streams in real time, queue each sentence for sequential ElevenLabs TTS playback, and coordinate full-duplex interaction through barge-in control, speech debouncing, and heartbeat-based recovery. This paper presents the cognitive basis for sentence-level granularity, the production `useGeminiLive` architecture, a 29-tool action router across 4 teams with confidence-weighted team inference, and the rolling-summary mechanism for long voice sessions. In 2,400+ production sessions, the system achieved sub-800ms first-sentence latency with zero sentence-ordering violations, including compatibility handling for 9 in-app browser environments.
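The sentence-level granularity described above can be sketched as a buffer that accumulates streamed tokens and releases only complete sentences. This is an illustrative sketch under stated assumptions, not the `useGeminiLive` implementation: the `SentenceBuffer` class and its `push` and `flush` methods are hypothetical names, and the boundary rule (a run of `.`, `!`, or `?` followed by whitespace) is a simplification that ignores abbreviations and languages without spaces after terminators.

```typescript
// Hypothetical helper: accumulates streamed text chunks and returns
// complete sentences in arrival order.
class SentenceBuffer {
  private pending = "";

  // Append a chunk and return any sentences it completed.
  push(chunk: string): string[] {
    this.pending += chunk;
    const sentences: string[] = [];
    // Requiring whitespace after the terminator avoids splitting inside
    // "3.14" and waits for the next chunk to confirm the boundary.
    const boundary = /[.!?]+\s+/;
    let match = boundary.exec(this.pending);
    while (match !== null) {
      const end = match.index + match[0].length;
      sentences.push(this.pending.slice(0, end).trim());
      this.pending = this.pending.slice(end);
      match = boundary.exec(this.pending);
    }
    return sentences;
  }

  // Call when the stream ends to release any trailing partial sentence.
  flush(): string[] {
    const rest = this.pending.trim();
    this.pending = "";
    return rest.length > 0 ? [rest] : [];
  }
}

const buffer = new SentenceBuffer();
const spoken: string[] = [];
for (const chunk of ["Hello wor", "ld. How are", " you? Fine"]) {
  spoken.push(...buffer.push(chunk));
}
spoken.push(...buffer.flush());
// spoken is now ["Hello world.", "How are you?", "Fine"]
```

Each returned sentence would then be handed to a playback queue one at a time, which is what preserves sentence ordering when synthesis calls complete at different speeds.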

voice-ui, streaming, TTS, speech-recognition, real-time, Gemini, ElevenLabs, action-router, MARIA-OS, cognitive-science

Engineering · February 12, 2026 · 36 min read

Engineering Case Study: Quality Gate Control Theory for Manufacturing AI

Applying established control theory, R2R-aware manufacturing practice, and MARIA OS audit gates to simulated semiconductor quality cascades

Manufacturing AI systems face a stability problem that traditional software governance often does not: defect rates evolve as continuous dynamical variables under material variation, tool wear, and environmental drift. This engineering case study applies established PID, Lyapunov, and BIBO analysis to quality gates, positions the approach against semiconductor run-to-run control, and shows how MARIA OS adds fail-closed escalation, evidence bundles, and audit coordinates. The reported 94.7% defect containment, sub-200ms gate response, and 0.12x/stage attenuation are simulation results on a tuned linear model, not production fab measurements.

manufacturing, quality-gate, control-theory, stability-analysis, real-time, defect-rate, governance