MARIA OS

DECISION OS FOR AGENT COMPANIES

Encode human judgment as an OS. AI Agents execute the business.

MARIA OS is a Decision Operating System that defines human decision-making as an OS, enabling AI Agents to execute operations following that judgment structure.

Rather than pursuing automation itself, it designs judgment, responsibility, and boundaries upfront.

The result is an Agent Company: AI agents operate as accountable members of your company, governed by shared values and clear decision boundaries, so the company scales while preserving human authority.

Not running AI. Running decisions.

Most AI tools chain prompts to accelerate automation. What companies actually need is not more automation but a structure for decision-making: where to delegate to AI, where to stop, and where humans take responsibility. MARIA OS defines leader judgment as an operating system and turns it into execution by AI Agents.

Preserve human authority while scaling AI execution

Make implicit judgment reusable as structured decisions

Prevent AI autonomy from becoming organizational risk through governance

This is for

Organizations where AI Agents execute real operations and decisions
Leaders who prioritize responsibility and governance, not just speed
Companies that demand consistent and reusable judgment structures

This is not for

AI tools that just chain prompts faster
Full automation that eliminates human responsibility
AI that replaces leadership judgment

Dynamic Harness / Main Concept

Autonomy needs a harness, not a bigger prompt.

MARIA OS does not judge agents by final output alone. Dynamic Harness tracks each run as a state vector across goal, memory, authority, quality, cost, and responsibility. When drift appears, it tightens gates, records evidence, escalates to humans, and expands autonomy only after stability is proven.

Intent: goal stability

Memory: context integrity

Authority: safe autonomy


[Figure: phase space x(t) and control surface H(t), showing drift, the harness, and the stable path. Steps: detect drift, tighten gates, expand autonomy gradually.]
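The detect, tighten, and expand loop can be sketched in a few lines. This is a minimal illustration, not MARIA OS code: the class name `AutonomyController`, the numeric autonomy levels, the drift threshold, and the "three stable runs" rule are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class AutonomyController:
    """Hypothetical controller; levels and thresholds are illustrative only."""
    level: int = 1                 # current autonomy level (0 = most restricted)
    max_level: int = 3
    drift_threshold: float = 0.2   # drift above this tightens the gates
    stable_runs_required: int = 3  # stable runs needed before expanding
    _stable_streak: int = 0

    def observe(self, drift: float) -> int:
        """Update the autonomy level from one run's drift score in [0, 1]."""
        if drift > self.drift_threshold:
            # Drift detected: tighten immediately and reset the streak.
            self.level = max(0, self.level - 1)
            self._stable_streak = 0
        else:
            self._stable_streak += 1
            if self._stable_streak >= self.stable_runs_required:
                # Stability held for several runs: expand by a single step.
                self.level = min(self.max_level, self.level + 1)
                self._stable_streak = 0
        return self.level
```

The asymmetry is the point of the sketch: one drifting run reduces autonomy at once, while expansion needs several consecutive stable runs.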

Dynamic Harness / Why it works

Static Harness checks whether an agent may start. Dynamic Harness checks whether it is still safe to continue.

Static Harness — activation time

Is the design contract valid before the system runs?

permissions & owners, rollback path, evidence requirements, approval gates

Dynamic Harness — runtime

Given what is happening now, is it still safe to continue?

quality & evidence, heartbeat & retries, cost & latency, authority drift

runtime verdict: pass, warn, block, quarantine, human approval, proposal-only
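The split between the two harnesses can be illustrated as two separate checks. The sketch below is hypothetical: the verdict names come from the list above, but the function names, signal keys, thresholds, and the order of the rules are assumptions chosen for the example.

```python
from enum import Enum


class Verdict(Enum):
    PASS = "pass"
    WARN = "warn"
    BLOCK = "block"
    QUARANTINE = "quarantine"
    HUMAN_APPROVAL = "human approval"
    PROPOSAL_ONLY = "proposal-only"


# Contract fields mirror the static checklist; the key names are illustrative.
REQUIRED_CONTRACT_KEYS = ("permissions", "owners", "rollback_path",
                          "evidence_requirements", "approval_gates")


def static_check(contract: dict) -> bool:
    """Activation time: is the design contract complete before the run starts?"""
    return all(contract.get(key) for key in REQUIRED_CONTRACT_KEYS)


def runtime_verdict(signals: dict) -> Verdict:
    """Runtime: given live signals, is it still safe to continue?"""
    if signals.get("missed_heartbeats", 0) >= 3 or signals.get("retries_exhausted", False):
        return Verdict.QUARANTINE
    if signals.get("authority_drift", False):
        return Verdict.BLOCK
    if signals.get("cost", 0.0) > signals.get("budget", float("inf")):
        return Verdict.HUMAN_APPROVAL
    if not signals.get("evidence_complete", True):
        return Verdict.PROPOSAL_ONLY
    if signals.get("quality", 1.0) < 0.8:
        return Verdict.WARN
    return Verdict.PASS
```

A contract that passes `static_check` can still receive any verdict later, which is the distinction the two harnesses draw.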

Catches post-activation failures

A valid blueprint can still fail live — quality drops, evidence goes missing, a tool starts failing. The harness stops the bad episode anyway.

Blocks runtime financial accidents

Duplicate charges and budget overruns are blocked, and high-cost tool calls escalate to a human, before the side effect lands.

Isolates unhealthy agents

Missing heartbeats, exhausted retries, and permission denials move an agent to quarantine before it amplifies failures.

Stops self-repair from self-harm

Loop guards catch repeated failure fingerprints and repair attempts, degrading to human approval or quarantine instead of looping.

Turns scanners into control signals

Decision and workflow scans become live runtime vectors — observe, constrain, require approval, repair, or quarantine — not passive reports.

Makes autonomy reversible

The allowed action space expands only while runtime evidence stays stable, and shrinks the moment risk rises.

So MARIA OS can say not just that an agent was designed safely, but that it is operating safely now — and if it stops being safe, autonomy is reduced before damage spreads.
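As one example of these runtime controls, a loop guard for self-repair might look like the following. It is a sketch under stated assumptions: the fingerprint format, the limits, and the names `LoopGuard` and `next_step` are invented for illustration and are not taken from MARIA OS.

```python
import hashlib
from collections import Counter


def fingerprint(error_type: str, location: str) -> str:
    """Stable short fingerprint for a failure (illustrative format)."""
    return hashlib.sha256(f"{error_type}|{location}".encode()).hexdigest()[:12]


class LoopGuard:
    """Degrades repeated repair attempts instead of letting them loop."""

    def __init__(self, max_repeats: int = 2, max_repairs: int = 3):
        self.max_repeats = max_repeats  # same fingerprint allowed this many times
        self.max_repairs = max_repairs  # total repair attempts allowed
        self.seen = Counter()
        self.repairs = 0

    def next_step(self, error_type: str, location: str) -> str:
        fp = fingerprint(error_type, location)
        self.seen[fp] += 1
        self.repairs += 1
        if self.repairs > self.max_repairs:
            return "quarantine"        # too many attempts overall
        if self.seen[fp] > self.max_repeats:
            return "human_approval"    # the same failure keeps coming back
        return "retry_repair"
```

With the default limits, four identical failures in a row yield two retries, then a request for human approval, then quarantine.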


Dynamic Harness

Dynamic Harness changes the trajectory before an AI organization fails.

Instead of judging final outputs alone, MARIA OS evaluates goal, memory, identity, trust, quality, latency, cost, and authority as one runtime episode. When the scorecard degrades, the harness shifts the system into rerun, quarantine, human approval, or repair proposal mode.

x(t) = [g,m,i,t,q,l,c,a]

H(t): observe / constrain / repair

x(t) components: Goal (g), Memory (m), Identity (i), Trust (t), Quality (q), Latency (l), Cost (c), Authority (a)

phases: stable, adapt, quarantine

Observe the runtime

Normalize every agent run into a runtime episode with intent, memory, tools, gates, assets, latency, and corrections.

Classify the drift

Map failures to owner, severity, confidence, user visibility, and the verification command that can prove the fix.

Read the gradient

Track completion, pass rate, retry rate, advisory lift, and failure density as a time-varying scorecard.

Control the phase

Convert instability into reruns, quarantine, draft repair PRs, or human approval before autonomy can expand.

Dynamic Harness connects episode extraction, failure taxonomy, scorecards, repair proposals, and controlled self-healing into runtime governance for companies and agentic society.
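A scorecard of this kind can be computed from normalized episodes with very little code. The sketch below is an assumption-heavy illustration: the `Episode` fields, the metric formulas, and the phase thresholds are hypothetical, and "advisory lift" is left out because the page does not define it.

```python
from dataclasses import dataclass


@dataclass
class Episode:
    completed: bool
    gates_passed: int
    gates_total: int
    retries: int
    failures: int


def scorecard(episodes: list[Episode]) -> dict[str, float]:
    """Aggregate a window of episodes into scorecard metrics."""
    n = len(episodes)
    if n == 0:
        return {}
    gates_total = sum(e.gates_total for e in episodes)
    gates_passed = sum(e.gates_passed for e in episodes)
    return {
        "completion": sum(e.completed for e in episodes) / n,
        "pass_rate": gates_passed / gates_total if gates_total else 1.0,
        "retry_rate": sum(e.retries for e in episodes) / n,
        "failure_density": sum(e.failures for e in episodes) / n,
    }


def phase(card: dict[str, float]) -> str:
    """Map a non-empty scorecard to a phase (thresholds are illustrative)."""
    if card["completion"] < 0.5 or card["failure_density"] > 1.0:
        return "quarantine"
    if card["pass_rate"] < 0.9 or card["retry_rate"] > 0.5:
        return "adapt"
    return "stable"
```

Calling `scorecard` on successive windows gives the time-varying view; comparing consecutive windows would give the gradient.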


Implementation Pattern / Spinal Reflex Wiring

How to implement spinal-reflex neural wiring for AI agents

Do not send every known stimulus to a large model. MARIA OS routes routine, bounded, accountable events through reflex arcs, while ambiguous or risky cases rise into deliberation.

Fast path for known work. Deliberative path for unknown work.

layer 1: stimulus
layer 2: reflex arc
layer 3: harness
layer 4: envelope
layer 5: trace

known stimulus → governed action

Normalize every event into a stimulus packet

A reflex cannot fire from raw text alone. First, the system converts messages, forms, workflow changes, API callbacks, and document updates into typed stimulus packets with context, actor, object, risk, and current state.

The key is not intent detection. The key is operational typing.

raw input → stimulus packet

actor: principal
object: resource
risk: score
state: phase
authority: scope
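Operational typing could be sketched as a small typed record plus a normalizer. The field names follow the packet labels above; the `normalize` function, the raw event keys, and the rule that unknown risk is treated as maximal are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class StimulusPacket:
    actor: str                # principal that caused the event
    obj: str                  # resource the event is about
    risk: float               # score clamped to [0, 1]
    state: str                # current workflow phase
    authority: Optional[str]  # scope, or None when no authority is attached


def normalize(event: dict) -> StimulusPacket:
    """Convert a raw event into a typed packet; refuse events that cannot be typed."""
    missing = [key for key in ("actor", "object", "state") if key not in event]
    if missing:
        raise ValueError(f"cannot type event, missing: {missing}")
    risk = float(event.get("risk", 1.0))  # unknown risk is treated as maximal
    return StimulusPacket(
        actor=event["actor"],
        obj=event["object"],
        risk=min(max(risk, 0.0), 1.0),
        state=event["state"],
        authority=event.get("authority"),
    )
```

Nothing here interprets intent; the packet only records who acted, on what, in which state, and with what authority.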

Route known stimuli through bounded reflex arcs

A reflex arc is a predesigned execution path for a known class of work: reject incomplete input, classify a request, stop a prohibited transfer, attach evidence, escalate a high-risk case, or run a deterministic workflow.

A reflex is not a shortcut. It is a decision that has already been designed.

known / low risk → reflex arc (deterministic)
ambiguous / novel → deliberation (LLM + human)
missing authority → fail closed

Reflex selection matrix

The routing decision is explicit: reflex for known and bounded work, deliberation for ambiguous work, fail-closed for missing authority.
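The matrix reduces to a three-way decision. In this sketch the function name, its parameters, and the `risk_limit` value are hypothetical; only the three outcomes come from the text.

```python
def route(known: bool, risk: float, has_authority: bool, risk_limit: float = 0.3) -> str:
    """Pick a path for one stimulus (the risk limit is an illustrative value)."""
    if not has_authority:
        return "fail_closed"   # missing authority always wins
    if known and risk <= risk_limit:
        return "reflex"        # known and bounded: deterministic path
    return "deliberation"      # ambiguous, novel, or risky: LLM + human
```

The authority check comes first, so a familiar low-risk stimulus still fails closed when no authority is attached.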

Wrap each reflex in static and dynamic harnesses

Static harnesses define fixed authority, data, tool, and prohibition boundaries. Dynamic harnesses adjust the allowed range at runtime based on risk, confidence, state, deadline, and audit conditions.

The reflex moves fast only inside a governed action space.

Execute only through Envelope responsibility contracts

If the Envelope is missing or invalid, the action fails closed.

control wrapper: Wrap the reflex

reflex → check static boundary → check runtime risk → check multi-stage control → check stop condition

responsibility contract: Carry accountability

reflex → check owner → check authority scope → check purpose scope → check failure route
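Executing only through an Envelope can be pictured as a guard in front of every action. This is a sketch, not the MARIA OS contract format: the `Envelope` fields follow the four checks named above (owner, authority scope, purpose scope, failure route), while the class, exception, and function names are assumptions.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Envelope:
    owner: str                  # accountable human or team
    authority_scope: frozenset  # actions this envelope permits
    purpose_scope: frozenset    # purposes this envelope permits
    failure_route: str          # where a failure is escalated


class EnvelopeError(PermissionError):
    """Raised when an action must fail closed."""


def execute(action: str, purpose: str, envelope: Optional[Envelope],
            run: Callable[[], str]) -> str:
    """Run `run` only if the envelope is present, valid, and covers the action."""
    if envelope is None or not envelope.owner or not envelope.failure_route:
        raise EnvelopeError("missing or invalid envelope")
    if action not in envelope.authority_scope:
        raise EnvelopeError(f"action outside authority scope: {action}")
    if purpose not in envelope.purpose_scope:
        raise EnvelopeError(f"purpose outside purpose scope: {purpose}")
    return run()
```

Every failure mode raises before `run` is called, so no side effect happens without an accountable owner.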

Observe, tune, and promote field patterns into OS assets

Each reflex produces traces: fired, blocked, escalated, overridden, or rolled back. FDE teams use those traces to tune local reflexes, then promote stable patterns into reusable MARIA OS assets.

Field implementation becomes platform learning.

MARIA OS traces: fired, blocked, returned, tuned, promoted

FDE traces become reusable reflex libraries

Operational Governance

The moat is not autonomy. It is knowing when autonomy must stop.

MARIA OS treats stopping, recovery, evidence, and human escalation as production paths, not exception handling. Internally, recovery paths are stressed aggressively. In customer environments, HITL stays heavier until trust, evidence, and repetition justify more autonomy.


What we measure

Runtime proof

Fail-closed

Stop when authority, evidence, or context is insufficient.

Auto-recovery

Recover internally with causal logs and post-recovery checks.

HITL convergence

Reduce repeated human review only after the workflow proves stable.

Responsibility envelope

No execution path is valid without an accountable owner.

Evidence Hub

Every new section strengthens the Bonginkan trust graph.

MARIA OS content is not isolated marketing copy. Product pages, experiments, architecture notes, and technical articles are connected back to Bonginkan as publisher, builder, and accountable source.


Company source

Product proof

MARIA OS reference

Official Bonginkan case material that ties the operating system back to the company record.

Implementation source

Deployment proof

MARIA OS Appliance

Appliance reference for teams that need a governed AI Agent runtime with a concrete delivery path.

Public archive

Research proof

Engineering blog

Technical articles, experiments, and decision architecture notes published as MARIA OS grows.

Social signal

Live proof

@bongin_ai

Ongoing product notes and founder-level thinking connected to the same product graph.

Features & Products

See the reality. Fix the structure. Run it every day.

We don't connect products with integrations. We align them with judgment.

01–06 Universe

Sales Universe

6 agents + 3 packs

AI agent teams that execute deals with judgment-aware automation. Every deal stage has a specialist.

Audit Universe

6 agents + 4 packs

Not a faster spreadsheet. A reproducible audit engine where every finding carries evidence.

FAQ Universe

4 planets + 17 agents

Auto-generates FAQ from real documents. Every answer cites source, page, and evidence quality.

Auto-Dev Universe

5 agents + 4 gates

Agents build, test, review, and deploy. Not 'AI writes code': the database authorizes changes.

CPA Universe

Knowledge graph + pass gate

AI agents that study for CPA exams using knowledge graphs and spaced repetition, governed by evidence.

Meeting Universe

10 agents + 4 gates

07–10 Service

Value Scanning

Stated values vs. practiced values. See where your organization's behavior contradicts its beliefs.

Workflow Scanning

Scan real processes, identify waste, responsibility gaps, and bottlenecks. Prescribe recomposition.

MVV OS Consulting

Turn your mission, vision, and values into executable governance. Philosophy becomes operating constraints.

Agentic Company Insight

Assess your company's agentic maturity. Find where agents can act, where humans must decide, and where risk gates are missing.

11–16 Platform

MARIA Voice

Speak to your Decision OS. Voice commands become governed actions.

AI Office

A virtual office where AI agents work as departments (HR, Finance, Legal, Dev), governed by MARIA OS.

MARIA BOOKING

CEO Clone OS

Voice interview, Decision OS, 5KB Genome, meeting agents, approval gates, integrations, and Doctor Agent repair.

MARIA VITAL

Life Support OS for Agent Orgs

Continuously monitor agent vitals: behavior health, judgment quality, coordination state, and recoverability.

Agentic Company

The Destination, Not a Feature

From Human Company to Self-Improving: a structured evolution path with governance at every stage.

MARIA OS

See → Fix → Run

Harness Adoption Map

Every LP surface gets a harness placement.

Cross harnesses share episodes, gates, scorecards, and quarantine. Individual harnesses control the failure modes unique to Sales, Audit, Voice, Meeting, and the other surfaces.


Raw harness

Turns inputs, evidence, turns, and diffs into episodes

Cross harness

Shared gates and scorecards across products

Dynamic harness

Adjusts constraints and autonomy from drift

Comprehensive Harness Cycle

Fail-open cycle aggregates every stage.

A Harness Designer-style plan, a cycle report, and stable fingerprints let the cycle keep running every analyzable stage after a failure.

stages: smoke, release, research

Universe runtimes

Sales Universe

G1.U1.P1.Z1.A1

Deal Evidence Intake Harness / Deal Phase Harness

Attach episode scoring to proposal and estimate generation.

Audit Universe

G1.U1.P2.Z1.A1

Evidence Chain Harness / Procedure-Specific Audit Harness

Evaluate every generated finding through evidence completeness and risk-tier gates.

FAQ Universe

G1.U1.P3.Z1.A1

Source Crawl Harness / FAQ Voice Harness

Add source freshness and public-release gates to generated FAQ artifacts.

Auto-Dev Universe

G1.U1.P4.Z1.A1

Diff Episode Harness / Repository-Specific Dev Harness

Attach dynamic harness scoring to CI failure triage and repair proposals.

CPA Universe

G1.U1.P5.Z1.A1

Learning Evidence Harness / Exam Domain Harness

Gate pass readiness with source validity and repeated-correction signals.

Meeting Universe

G1.U1.P6.Z1.A1

Consent Episode Harness / Meeting Phase Harness

Extend gate evaluation with harness interventions and episode severity.

Scanner & service loops

Decision Scanner

G1.U2.P5.Z1.A1

Decision Evidence Harness / Decision Context Harness

Score live decision scans with evidence density, branch risk, and authority-gate pressure.

Value Scanner

G1.U2.P1.Z1.A1

Value Evidence Harness / Executive Values Harness

Add harness confidence and evidence density to value scan summaries.

Workflow Scanner

G1.U2.P2.Z1.A1

Process Evidence Harness / Workflow Domain Harness

Score recompose plans with flow-drift and evidence-density controls.

MVV OS Consulting

G1.U2.P3.Z1.A1

MVV Interview Harness / CEO Clone Harness

Add contradiction and rule-enforceability scoring to CEO Clone outputs.

Agentic Company Insight

G1.U2.P4.Z1.A1

Role Mapping Harness / Department Harness

Add role autonomy confidence and rollback conditions to insight output.

Platform surfaces

MARIA Voice

G1.U3.P1.Z1.A1

Turn Episode Harness / Voice Mode Harness

Attach harness severity to action-chat function-call rounds.

MARIA BOOKING

G1.U3.P6.Z1.A1

Booking Conversation Harness / Reservation Phase Harness

Gate booking voice and calendar-sync episodes with consent, slot, and notification evidence.

AI Office

G1.U3.P2.Z1.A1

Office Event Harness / Agent Lifecycle Harness

Score task-engine events with office-health and handoff-drift signals.

CEO Clone OS

G1.U3.P3.Z1.A1

Judgment Sample Harness / Executive Persona Harness

Add contradiction density and identity-boundary scoring to elicitation outputs.

MARIA VITAL

G1.U3.P4.Z1.A1

Vital Signal Harness / Agent Vital Harness

Unify vital signals with the runtime harness scorecard.

Agentic Company

G1.U3.P5.Z1.A1

Company Phase Harness / Evolution Path Harness

Add phase advancement criteria and rollback triggers to Agentic Company stages.

17 surfaces -> raw intake -> cross gates -> individual dynamic control
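Each surface above carries a coordinate such as G1.U3.P6.Z1.A1. The page does not say what the letters stand for, so the sketch below only parses the five-segment shape and keeps the letters as keys; the function name and the strict format check are assumptions.

```python
import re

# Shape inferred from the coordinates listed above: G<n>.U<n>.P<n>.Z<n>.A<n>
COORD_RE = re.compile(r"^G(\d+)\.U(\d+)\.P(\d+)\.Z(\d+)\.A(\d+)$")


def parse_coordinate(coord: str) -> dict[str, int]:
    """Split a coordinate into its five numbered segments, keyed by letter."""
    match = COORD_RE.match(coord)
    if match is None:
        raise ValueError(f"not a valid coordinate: {coord!r}")
    return dict(zip("GUPZA", map(int, match.groups())))
```

For example, `parse_coordinate("G1.U3.P6.Z1.A1")` returns `{"G": 1, "U": 3, "P": 6, "Z": 1, "A": 1}`, which makes it easy to group surfaces by their U segment.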

Harness Installation Plan

MARIA Self-Healing Runtime turns failures into reviewable repair PRs.

The goal is not raw autonomous repair. It is safe autonomous repair: Failure Analyzer, Meta-Harness, Envelope, Memory Store, Human Approval Gate, and Loop Control collect the episode, classify confidence, plan the smallest repair, re-run local and cross harnesses, and preserve the learning.


collection loop: 1. collect → 2. analyze → 3. plan → 4. repair → 5. re-run → 6. learn

first six mechanisms

Three-Layer Failure Analyzer

Classify failures through deterministic signals, LLM root-cause hypotheses, and historical memory before any repair is attempted.

Start with CI and build logs: classify error type, affected surface, confidence, owner, and escalation route.

KPI: Misclassification rate

Harness Coverage Meta-Harness

Detect new APIs, screens, agents, integrations, permissions, and prompts that lack the required harness coverage.

Compare changed files against route, screen, prompt, permission, and integration coverage rules on every PR.

KPI: Coverage gap rate

Fixer Agent Envelope Router

Route repairs into low, medium, high, or memory-write envelopes so the fixer cannot exceed its authority.

Allow low-risk test, type, copy, and local UI patches as draft PRs; route prompts, DB, permissions, and external actions to review.

KPI: Unauthorized mutation count

Failure Memory Store

Store failure evidence, cause, patch rationale, rerun result, side effects, review notes, human reviewer rationale, and prevention rules as reusable assets.

Persist CI and E2E repair episodes with reviewer rationale signatures that can retrieve similar incidents before future patch planning.

KPI: Repeat failure rate

Risk Calibration Ledger

Compare runtime risk scores, monitor findings, reviewer decisions, and later incidents so expert-prior thresholds can be calibrated from operational evidence.

Persist score basis, calibration version, risk breakdown, reviewer rationale, and incident outcome for every repair or approval episode.

KPI: Calibration error

PR-First Regression Loop

Make the final unit of autonomous repair a reviewable PR with human approval gates, loop controls, and scoped, cross, meta, deploy, and post-deploy harness evidence.

Generate repair PR bodies with failure summary, rationale, harnesses run, side-effect risk, rollback, and memory update.

KPI: Autonomous repair success rate
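Of the mechanisms above, the Fixer Agent Envelope Router is the simplest to sketch. The four envelope names and the low-risk and review-required categories come from the text; how change kinds map to "medium" versus "high", and the names `route_repair` and `next_action`, are assumptions for illustration.

```python
LOW_RISK = {"test", "type", "copy", "local_ui"}
REVIEW_REQUIRED = {"prompt", "db", "permission", "external_action"}


def route_repair(change_kinds: set[str]) -> str:
    """Choose the envelope for a proposed repair from the kinds of change it makes."""
    if not change_kinds:
        raise ValueError("a repair must declare at least one change kind")
    if "memory_write" in change_kinds:
        return "memory-write"
    if change_kinds & REVIEW_REQUIRED:
        return "high"      # assumption: review-required kinds map to the high envelope
    if change_kinds <= LOW_RISK:
        return "low"
    return "medium"        # assumption: anything unclassified is medium


def next_action(envelope: str) -> str:
    """Only the low envelope may open a draft PR without prior review."""
    return "draft_pr" if envelope == "low" else "human_review"
```

A repair that mixes a test fix with a DB change is routed by its riskiest part, so the fixer cannot widen its own authority by bundling changes.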

Spec Contract Harness

G1.U4.P1.Z1.A1

Static / Phase 1

observes: undefined errors, field drift, missing owners

control: Blocks implementation when API, UI fields, DB columns, and acceptance criteria disagree.

analyzer: Deterministic schema diff first, LLM review only for ambiguous requirement language.

envelope: May block implementation and draft spec diffs; may not approve scope changes.

coverage: Flags new API, DB, or screen files that lack a spec-contract episode.

first slice: Generate a schema-to-screen diff for product specs before agent work starts.

owner: Product Architecture
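The deterministic part of such a spec-contract check can be as simple as comparing field sets. This is a hypothetical sketch: the function names are invented, and acceptance criteria are left out because the page does not describe their format.

```python
def spec_disagreements(api: set[str], ui: set[str], db: set[str]) -> dict[str, list[str]]:
    """Return each field that is missing from at least one layer, with the layers missing it."""
    layers = (("api", api), ("ui", ui), ("db", db))
    gaps: dict[str, list[str]] = {}
    for field in sorted(api | ui | db):
        missing = [name for name, fields in layers if field not in fields]
        if missing:
            gaps[field] = missing
    return gaps


def may_implement(api: set[str], ui: set[str], db: set[str]) -> bool:
    """Block implementation while any layer disagrees with the others."""
    return not spec_disagreements(api, ui, db)
```

For instance, if `plan` exists in the API and DB but not in the UI, the result contains `{"plan": ["ui"]}` and implementation stays blocked until the spec is reconciled.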

Prompt Policy Harness

G1.U4.P1.Z2.A1

Static / Phase 3

observes: missing output contract, unsafe delegation, weak evaluation rubric

control: Quarantines prompts that lack prohibited actions, output format, evidence rules, or gate policy.

analyzer: Rule-based prompt checklist with memory lookup for prior prompt failures.

envelope: May quarantine prompts and propose edits; core authority prompts require reviewer approval.

coverage: Detects production prompts without output format, forbidden actions, or evaluation criteria.

first slice: Score production prompts for format, authority boundary, and evaluation coverage.

owner: Agent Governance

Client Data Preflight Harness

G1.U4.P2.Z1.A1

Preflight / Phase 2

observes: tenant scope, PII class, agent permission tier

control: Stops an agent before it reads customer data outside its contract, role, or approval state.

analyzer: Deterministic tenant and role policy evaluation before any LLM reasoning.

envelope: May deny or request approval; may not expand customer-data access grants.

coverage: Finds data retrieval paths without tenant, PII, and permission preflight checks.

first slice: Attach a preflight decision to every customer-data retrieval and exported artifact.

owner: Security

External Action Preflight Harness

G1.U4.P2.Z2.A1

Preflight / Phase 2

observes: blast radius, recipient visibility, business-hour policy

control: Routes public, financial, destructive, or production actions to human approval before execution.

analyzer: Structured action taxonomy with confidence threshold and human fallback.

envelope: May draft outbound actions; public, financial, destructive, and deploy actions require approval.

coverage: Reports external side-effect commands not covered by action preflight policy.

first slice: Gate outbound email, invoice issue, GitHub PR creation, and deploy commands with one policy matrix.

owner: Operations
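A single policy matrix for these four actions might look like the sketch below. The action names echo the first slice; the class assigned to each action, the outcome labels, and the deny-by-default rule for unknown actions are assumptions for illustration.

```python
# Classes that always require a human before execution (from the control rule above).
APPROVAL_CLASSES = {"public", "financial", "destructive", "production"}

# Hypothetical classification of each gated action.
POLICY_MATRIX = {
    "outbound_email": {"public"},
    "invoice_issue": {"financial"},
    "github_pr_create": set(),  # assumption: a PR alone has no external side effect
    "deploy": {"production"},
}


def preflight(action: str) -> str:
    """Decide what an agent may do with an external action before it executes."""
    classes = POLICY_MATRIX.get(action)
    if classes is None:
        return "deny"  # actions missing from the matrix fail closed
    if classes & APPROVAL_CLASSES:
        return "human_approval"
    return "allow_draft"
```

Keeping every action in one table also makes the coverage check straightforward: any side-effect command missing from `POLICY_MATRIX` is an uncovered action.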

Agent Runtime Telemetry Harness

G1.U4.P3.Z1.A1

Runtime / Phase 4

observes: latency, token spend, RAG hit rate, tool-call loop

control: Detects drift during execution and changes route, model, retrieval scope, or escalation state.

analyzer: Metric thresholds plus failure-taxonomy classifier backed by similar runtime episodes.

envelope: May reroute, degrade, retry, or escalate; may not change authority policy while running.

coverage: Finds agent runs missing cost, retrieval, gate, and correction telemetry.

first slice: Normalize every agent run into a runtime episode with cost, retrieval, gate, and correction signals.

owner: Runtime Platform

Voice Call Stability Harness

G1.U4.P3.Z2.A1

Runtime / Phase 4

observes: turn gaps, TTS failure, recognition restart, emotion mismatch

control: Falls back to text, pauses tool execution, or escalates when voice state becomes unstable.

analyzer: Deterministic audio-state checks with LLM review for semantic or emotion mismatch.

envelope: May pause voice execution or switch channels; may not execute irreversible customer actions.

coverage: Flags voice flows without turn continuity, TTS completion, and fallback telemetry.

first slice: Score each voice turn for recognition continuity, TTS completion, and unsafe action pressure.

owner: Voice Platform

Artifact Evidence Review Harness

G1.U4.P4.Z1.A1

Post-run / Phase 3

observes: source match, amount mismatch, missing TODO, unsupported claim

control: Returns generated artifacts for repair when evidence, numbers, deadline, or owner is missing.

analyzer: Structured source comparison first, LLM panel only for semantic support checks.

envelope: May return artifacts for repair; may not send customer-visible artifacts automatically.

coverage: Finds generated artifacts without source episode, owner, or review outcome.

first slice: Review proposal, SOW, estimate, and meeting-minute artifacts against their source episode.

owner: Quality

Model Routing Harness

G1.U4.P5.Z1.A1

Dynamic / Phase 6

observes: confidence slope, retry density, provider failure, cost variance

control: Switches provider, narrows retrieval, downgrades autonomy, or regenerates queries from live signals.

analyzer: Scorecard slope and provider error analysis before model-choice LLM reasoning.

envelope: May switch models within approved tiers; budget or provider-policy changes require approval.

coverage: Detects model routes without confidence, cost, retry, and provider-failure records.

first slice: Add dynamic routing decisions to failed RAG and low-confidence answer episodes.

owner: Model Ops

CI Repair Harness

G1.U4.P6.Z1.A1

Autonomous / Phase 5

observes: failing job, log signature, changed files, verification command

control: Creates scoped repair PRs, reruns failed jobs, and quarantines flaky harness paths.

analyzer: Log-signature classifier, deterministic changed-file mapping, then LLM patch planning.

envelope: May create scoped repair PRs; may not merge, deploy, or weaken required checks.

coverage: Finds CI checks, harness jobs, and changed surfaces missing repair coverage.

first slice: Convert CI failures into repair scope, candidate files, validation commands, and PR body.

owner: Auto-Dev

Company Operating Harness

G1.U4.P7.Z1.A1

Organization / Phase 8

observes: stalled deal, follow-up gap, contract-invoice mismatch, branch drift

control: Turns organizational anomalies into owner alerts, follow-up tasks, policy reviews, or repair workflows.

analyzer: Business-rule anomaly detection with memory lookup for repeated operating patterns.

envelope: May create tasks and escalation briefs; may not alter contracts, invoices, or staffing authority.

coverage: Finds business processes without event source, owner, SLA, or escalation route.

first slice: Connect CRM, contract, invoice, recruiting, and support events into one operating scorecard.

owner: Executive Office

Integration Contract Runtime Harness

G1.U4.P3.Z3.A1

Runtime / Phase 4

observes: schema drift, rate-limit pressure, connector auth decay, partial sync

control: Blocks write paths when connector schema, auth, or idempotency state is unsafe and creates bounded repair work for the owning integration.

analyzer: Connector telemetry and contract snapshots are compared first, then ambiguous partial-sync cases are routed to LLM-assisted impact analysis.

envelope: May pause connector writes, degrade to read-only, or open repair tasks; may not rotate credentials or expand third-party scopes.

coverage: Flags integrations that lack schema snapshots, retry policy, auth expiry telemetry, or partial-write reconciliation.

first slice: Attach runtime contract checks to Salesforce, freee, Google Calendar, and storage sync episodes.

owner: Integration Platform

Approval Latency Review Harness

G1.U4.P4.Z2.A1

Post-run / Phase 5

observes: approval wait, reviewer override, stale escalation, decision reversal

control: Converts slow or unstable approval paths into owner alerts, queue reshaping proposals, and gate-policy repair tickets.

analyzer: SLA and queue metrics are inspected deterministically before LLM review summarizes why approvals are delayed or repeatedly reversed.

envelope: May recommend reviewer reassignment, SLA changes, or gate copy updates; may not bypass approval or approve work on behalf of humans.

coverage: Finds human gates without explicit SLA, reviewer owner, escalation route, reversal tracking, or stale-approval handling.

first slice: Score finance, audit, deploy, and outbound customer approval gates for wait time and reversal patterns.

owner: Risk Operations

Memory Write Harness

G1.U4.P5.Z2.A1

Dynamic / Phase 6

observes: memory mutation, source evidence, retention policy, contradiction

control: Stages learning-store writes until source evidence, retention class, contradiction status, and rollback path are attached.

analyzer: Structured provenance checks and retention rules run before semantic contradiction review decides whether a memory write is safe.

envelope: May stage or reject memory writes and request reviewer rationale; may not permanently mutate shared memory without source evidence.

coverage: Detects memory-writing agents without provenance, retention class, reviewer route, rollback key, or contradiction scan.

first slice: Gate CI repair, workflow repair, and customer-operations memory writes with provenance and contradiction checks.

owner: Memory Platform

Deployment Canary Harness

G1.U4.P6.Z2.A1

Autonomous / Phase 7

observes: canary error rate, feature flag state, rollback path, post-deploy probe

control: Stops rollout and produces a rollback or flag-disable proposal when canary metrics exceed the approved blast-radius envelope.

analyzer: Deployment metrics, smoke probes, and flag diffs are checked first, with LLM analysis limited to summarizing blast-radius evidence.

envelope: May disable feature flags, stop rollout, or open rollback PRs; may not promote canaries to full rollout without approval.

coverage: Finds deployable surfaces without canary probes, flag owner, rollback command, post-deploy observation, or customer-impact tier.

first slice: Add canary probes and rollback evidence to Auto-Dev repair PRs and Vercel preview promotion.

owner: Release Engineering
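The canary decision itself is a small deterministic rule. In this sketch the function name, its parameters, and the outcome labels are hypothetical; the behavior follows the control and envelope lines above, where the harness may stop a rollout but never promotes one on its own.

```python
def canary_decision(error_rate: float, max_error_rate: float,
                    probes_ok: bool, has_rollback_path: bool) -> str:
    """Decide the next step for a canary (thresholds come from the approved envelope)."""
    if not has_rollback_path:
        return "stop_rollout"               # nothing ships without a way back
    if error_rate > max_error_rate or not probes_ok:
        return "stop_and_propose_rollback"  # metrics exceeded the blast-radius envelope
    return "await_promotion_approval"       # healthy, but promotion needs a human
```

Even the healthy path ends at a request for approval, which keeps full rollout a human decision.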

Customer Operations Harness

G1.U4.P7.Z2.A1

Organization / Phase 8

observes: support backlog, SLA breach, renewal risk, incident comms gap

control: Turns backlog, SLA, renewal, and incident-communication gaps into routed owner work with draft evidence packs.

analyzer: Operational thresholds and account-health rules are evaluated first, then LLM review drafts customer-safe escalation summaries.

envelope: May create internal tasks and draft customer updates; may not send incident, renewal, or contractual messages without approval.

coverage: Flags customer operations flows without SLA owner, customer visibility tier, account-risk signal, or approved communication path.

first slice: Join support tickets, account health, renewal dates, and incident events into one customer-ops harness scorecard.

owner: Customer Operations

Frontend Render Contract Harness

G1.U4.P1.Z3.A1

Static / Phase 1

observes: hydration risk, server-client boundary, route metadata, empty state

control: Blocks UI changes when route ownership, hydration boundaries, metadata, or user-visible fallback behavior is incomplete.

analyzer: Static route and component inspection checks client directives, async boundaries, metadata, and empty-state contracts before visual review.

envelope: May block component changes and propose boundary fixes; may not convert server components to client components without owner approval.

coverage: Flags new pages, layouts, or interactive components without render contract, loading state, empty state, or ownership evidence.

first slice: Run render-contract checks on product pages, dashboard panels, and experimental surfaces added in each PR.

owner: Frontend Platform

Responsive I18n Preflight Harness

G1.U4.P2.Z3.A1

Preflight / Phase 2

observes: missing translation key, text overflow, locale route drift, mobile snap break

control: Stops pages from shipping when English and Japanese content, route availability, or mobile layout behavior diverge.

analyzer: Message-key diffs and viewport constraints are evaluated deterministically before visual checks review overflow or layout regressions.

envelope: May block release and propose copy or layout fixes; may not change product messaging intent without content owner review.

coverage: Finds locale-aware pages without message parity, mobile viewport coverage, overflow checks, or translated route validation.

first slice: Attach locale parity and mobile text-fit checks to blog, product, dashboard, and experimental pages.

owner: Frontend Platform

UI Visual Richness Harness

G1.U4.P2.Z2.A2

Post-run / Phase 2

observes: text-only viewport, missing product visual, flat composition, low color variety

control: Blocks market-facing visual acceptance when a route scores below the richness threshold and queues a UI-agent repair plan.

analyzer: Playwright captures first-viewport screenshots and deterministic DOM visual metrics, then emits scoped UI-agent repair tasks for low-scoring routes.

envelope: May draft visual improvement plans and low-risk UI patches; may not ship brand direction changes or remove governance evidence without review.

coverage: Finds public routes without enough primary visual asset density, color variety, layered surfaces, hierarchy, or screenshot evidence.

first slice: Score public routes above the fold and write screenshot-backed repair tasks for any page that feels visually underbuilt.

owner: Frontend Platform

Accessibility Visual Review Harness

G1.U4.P4.Z3.A1

Post-run / Phase 3

observes: focus trap, contrast drift, aria gap, canvas blank

control: Returns UI surfaces for repair when keyboard navigation, focus management, contrast, labels, or visual rendering evidence is missing.

analyzer: Automated accessibility and screenshot checks run first, with LLM review only for ambiguous visual hierarchy or interaction clarity.

envelope: May return UI artifacts for repair; may not waive accessibility regressions on production paths without documented approval.

coverage: Flags interactive screens without keyboard path, contrast check, semantic labels, screenshot evidence, or canvas fallback verification.

first slice: Add post-run accessibility and screenshot review to dense dashboards, voice UI, and canvas-heavy experimental pages.

owner: Design Systems

API Route Contract Harness

G1.U4.P1.Z4.A1

Static / Phase 1

observes: input schema gap, response drift, status mismatch, missing coordinate

control: Blocks backend endpoints when request validation, response shape, error behavior, or governance coordinates are missing.

analyzer: Route-handler AST and schema checks validate methods, input parsing, status codes, and response shape before semantic contract review.

envelope: May block API route changes and draft schema repairs; may not alter public API semantics without product and backend approval.

coverage: Finds route handlers without input validation, typed response envelope, error taxonomy, MARIA coordinate, or test coverage.

first slice: Score new and modified app/api route handlers for validation, typed envelopes, and explicit error outcomes.

owner: Backend Platform
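
As a rough illustration of the contract check, the sketch below assumes the AST pass has already produced a small descriptor per route handler. The `RouteContract` fields and the gap labels are hypothetical; only the coordinate format is taken from this page.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Hypothetical descriptor emitted by a route-handler analysis step.
@dataclass(frozen=True)
class RouteContract:
    path: str
    methods: Tuple[str, ...]
    input_schema: Optional[dict] = None
    response_schema: Optional[dict] = None
    error_statuses: Tuple[int, ...] = ()
    coordinate: Optional[str] = None  # e.g. "G1.U4.P1.Z4.A1"

def contract_gaps(route: RouteContract) -> list:
    """List every missing contract element for one route."""
    gaps = []
    if route.input_schema is None:
        gaps.append("input schema gap")
    if route.response_schema is None:
        gaps.append("response shape undeclared")
    if not any(400 <= s < 600 for s in route.error_statuses):
        gaps.append("error outcomes undeclared")
    if not route.coordinate:
        gaps.append("missing coordinate")
    return gaps

def may_merge(route: RouteContract) -> bool:
    """The gate blocks the endpoint while any gap remains."""
    return not contract_gaps(route)
```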

Auth Permission Preflight Harness

G1.U4.P2.Z4.A1

Preflight, Phase 2

observes: missing session, tenant leak, role mismatch, tool permission drift

control: Stops frontend, API, and agent actions before they cross tenant, role, data, or tool authority boundaries.

analyzer: Deterministic session, tenant, role, and tool-scope policy evaluation runs before any request or agent action mutates state.

envelope: May deny requests, downgrade to read-only, or request approval; may not grant roles, tenants, or tool permissions.

coverage: Flags server actions, API routes, and agent tools without session checks, tenant filters, role policy, or permission envelope.

first slice: Attach auth preflight results to write APIs, customer-data reads, agent tools, and external action routes.

owner: Security
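
The envelope above allows three outcomes besides allowing a request: deny, downgrade to read-only, or request approval. A minimal sketch of such a deterministic evaluation follows; the role table and the approval-gated tool name are invented for the example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional

class Verdict(Enum):
    ALLOW = "allow"
    READ_ONLY = "read_only"
    NEEDS_APPROVAL = "needs_approval"
    DENY = "deny"

@dataclass(frozen=True)
class Request:
    session_tenant: Optional[str]  # None means no session
    resource_tenant: str
    role: str
    action: str                    # "read" or "write"
    tool: Optional[str] = None

# Assumed policy tables for illustration only.
ROLE_ACTIONS = {"viewer": {"read"}, "editor": {"read", "write"}}
APPROVAL_TOOLS = {"external_email"}

def auth_preflight(req: Request) -> Verdict:
    if req.session_tenant is None:
        return Verdict.DENY                  # missing session
    if req.session_tenant != req.resource_tenant:
        return Verdict.DENY                  # tenant leak
    allowed = ROLE_ACTIONS.get(req.role, set())
    if req.action not in allowed:
        # Role mismatch: downgrade when reads are still permitted.
        return Verdict.READ_ONLY if "read" in allowed else Verdict.DENY
    if req.tool in APPROVAL_TOOLS:
        return Verdict.NEEDS_APPROVAL        # tool outside standing authority
    return Verdict.ALLOW
```

Note that no branch widens authority: the function can only narrow or pass through what the session already holds.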

DB Migration Preflight Harness

G1.U4.P2.Z5.A1

Preflight, Phase 2

observes: destructive migration, missing index, rls policy gap, seed drift

control: Stops DB changes when reversibility, tenant policy, data migration, index coverage, or test evidence is incomplete.

analyzer: Schema diff, migration operation, index coverage, and RLS policy checks run before reviewer-guided data-risk analysis.

envelope: May block migrations and draft reversible plans; may not apply destructive DB changes or relax RLS without explicit approval.

coverage: Finds schema changes without rollback, RLS impact, seed update, data backfill, index analysis, or integration-test plan.

first slice: Evaluate db/schema changes for destructive operations, RLS coverage, rollback path, and dependent API surfaces.

owner: Data Platform
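
A simplified sketch of the migration-operation check, assuming migrations arrive as plain SQL with separate up and down scripts. The regular expressions cover only a few obvious statements and are not a substitute for a real schema diff.

```python
import re

# Assumed patterns; a production check would parse SQL rather than grep it.
RISK_PATTERNS = [
    (re.compile(r"\bDROP\s+(TABLE|COLUMN|SCHEMA)\b", re.I), "destructive migration"),
    (re.compile(r"\bTRUNCATE\b", re.I), "destructive migration"),
    (re.compile(r"\bDISABLE\s+ROW\s+LEVEL\s+SECURITY\b", re.I), "rls policy gap"),
    (re.compile(r"\bDROP\s+POLICY\b", re.I), "rls policy gap"),
]
NEEDS_APPROVAL = {"destructive migration", "rls policy gap"}

def migration_findings(up_sql: str, down_sql: str) -> list:
    findings = []
    for pattern, label in RISK_PATTERNS:
        if pattern.search(up_sql) and label not in findings:
            findings.append(label)
    if not down_sql or not down_sql.strip():
        findings.append("missing rollback")
    return findings

def migration_preflight(up_sql: str, down_sql: str, approved: bool = False) -> dict:
    """Block when rollback is missing or a risky change lacks explicit approval."""
    findings = migration_findings(up_sql, down_sql)
    risky = any(f in NEEDS_APPROVAL for f in findings)
    blocked = "missing rollback" in findings or (risky and not approved)
    return {"blocked": blocked, "findings": findings}
```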

Data Provider Runtime Harness

G1.U4.P3.Z4.A1

Runtime, Phase 4

observes: mock-live drift, adapter timeout, shape mismatch, fallback leak

control: Detects live adapter drift and switches views to bounded fallback states while routing repair work to the provider owner.

analyzer: Runtime adapter telemetry and response-shape checks compare mock and live provider contracts before fallback behavior is adjusted.

envelope: May degrade to mock-safe or read-only mode and open adapter repair tasks; may not silently mix tenant data across providers.

coverage: Flags data providers without mock-live parity tests, timeout policy, fallback state, tenant filter, or response-shape contract.

first slice: Monitor dashboard and product data providers for mock-live parity, adapter timeout, and shape mismatch episodes.

owner: Data Platform
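
The response-shape comparison can be sketched as a structural diff between a mock payload and a live payload. The helper names and the "live" and "fallback" mode labels are assumptions; the idea is only that any shape difference moves the view to a bounded fallback state.

```python
def shape_of(value):
    """Reduce a JSON-like value to its structure: key names and type names."""
    if isinstance(value, dict):
        return {k: shape_of(v) for k, v in value.items()}
    if isinstance(value, list):
        # Assumes homogeneous lists; the first element represents the rest.
        return [shape_of(value[0])] if value else []
    return type(value).__name__

def shape_diff(expected, actual, path="$"):
    """Compare two shapes and describe each difference with its path."""
    if isinstance(expected, dict) and isinstance(actual, dict):
        diffs = []
        for key in sorted(expected.keys() | actual.keys()):
            if key not in actual:
                diffs.append(f"{path}.{key}: missing in live")
            elif key not in expected:
                diffs.append(f"{path}.{key}: unexpected in live")
            else:
                diffs.extend(shape_diff(expected[key], actual[key], f"{path}.{key}"))
        return diffs
    if expected != actual:
        return [f"{path}: expected {expected}, got {actual}"]
    return []

def provider_drift(mock, live):
    return shape_diff(shape_of(mock), shape_of(live))

def choose_mode(mock, live) -> str:
    """Serve live data only while its shape matches the mock contract."""
    return "fallback" if provider_drift(mock, live) else "live"
```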

Queue Cron Runtime Harness

G1.U4.P3.Z5.A1

Runtime, Phase 4

observes: missed tick, duplicate job, stale lock, backlog growth

control: Prevents duplicate or stale scheduled execution and routes missed ticks, queue backlogs, and lock failures to bounded recovery.

analyzer: Schedule, idempotency, lock, and backlog telemetry are checked first, then historical incident memory ranks likely repair paths.

envelope: May pause jobs, skip duplicate ticks, or enqueue repair tasks; may not replay side-effecting jobs without approval.

coverage: Finds cron and background workflows without idempotency key, stale-lock handling, backlog metrics, or replay policy.

first slice: Add runtime checks to Civilization daily advancement, intelligence scans, and automation harness jobs.

owner: Runtime Platform
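
A small in-memory sketch of the idempotency and stale-lock checks. The `TickGuard` class, its return labels, and the lock TTL are hypothetical; a real scheduler would keep this state in a shared store rather than in process memory.

```python
from dataclasses import dataclass, field

@dataclass
class TickGuard:
    lock_ttl_s: float = 300.0
    done: set = field(default_factory=set)    # completed (job, tick) keys
    locks: dict = field(default_factory=dict) # job -> time the lock was taken

    def try_start(self, job: str, scheduled_for: str, now: float) -> str:
        # The (job, scheduled tick) pair acts as the idempotency key.
        if (job, scheduled_for) in self.done:
            return "skip_duplicate"
        held_since = self.locks.get(job)
        if held_since is not None:
            if now - held_since < self.lock_ttl_s:
                return "locked"
            # Stale lock: do not replay automatically, open a repair task.
            return "stale_lock_repair"
        self.locks[job] = now
        return "run"

    def finish(self, job: str, scheduled_for: str) -> None:
        self.done.add((job, scheduled_for))
        self.locks.pop(job, None)
```

Returning "stale_lock_repair" instead of silently taking over the lock reflects the envelope: side-effecting jobs are not replayed without approval.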

RAG Index Runtime Harness

G1.U4.P3.Z6.A1

Runtime, Phase 4

observes: index freshness, chunk-source mismatch, retrieval miss, citation gap

control: Blocks answer generation or downgrades confidence when retrieval freshness, source integrity, or citation coverage fails.

analyzer: Index timestamps, source hashes, retrieval hit rates, and citation coverage are checked before semantic answer support review.

envelope: May narrow retrieval, mark sources stale, or request reindex; may not publish unsupported answers or delete source corpora.

coverage: Flags ingestion and RAG paths without source hash, freshness SLA, retrieval metric, citation requirement, or reindex workflow.

first slice: Attach RAG freshness checks to FAQ, CPA, knowledge graph, and document-scanner answer episodes.

owner: Knowledge Platform
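
The deterministic part of this harness might look like the sketch below. The `Chunk` fields, the 24-hour freshness SLA, and the rule that only staleness downgrades confidence while other findings block are all assumptions made for the example.

```python
import hashlib
from dataclasses import dataclass

@dataclass(frozen=True)
class Chunk:
    source_id: str
    source_hash: str   # hash of the source text at indexing time
    indexed_at: float  # epoch seconds

def source_hash(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def review_retrieval(chunks, current_hashes, cited_sources, now,
                     freshness_sla_s=86400.0):
    """Return (decision, findings) for one answer episode."""
    findings = []
    if not chunks:
        findings.append("retrieval miss")
    if any(now - c.indexed_at > freshness_sla_s for c in chunks):
        findings.append("index freshness")
    if any(current_hashes.get(c.source_id) != c.source_hash for c in chunks):
        findings.append("chunk-source mismatch")
    retrieved = {c.source_id for c in chunks}
    if not cited_sources or not set(cited_sources) <= retrieved:
        findings.append("citation gap")
    blocking = {"retrieval miss", "chunk-source mismatch", "citation gap"}
    if blocking & set(findings):
        return "block", findings
    if findings:
        return "downgrade_confidence", findings
    return "pass", findings
```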

Streaming Output Runtime Harness

G1.U4.P3.Z7.A1

Runtime, Phase 4

observes: partial unsafe output, stream abort, tool-call leak, schema fragment

control: Stops or rewrites streamed output when partial content violates schema, authority, safety, or customer-visibility rules.

analyzer: Chunk-level schema, safety, and tool-call guards run during streaming, before post-run review evaluates full artifact quality.

envelope: May stop streams, redact partial chunks, or fall back to safe summary; may not continue unsafe public output after a guard trip.

coverage: Finds streaming endpoints without chunk guard, abort policy, redaction path, final envelope validation, or audit trace.

first slice: Add chunk-level guards to audit chat, voice responses, workflow scans, and model-generated report streams.

owner: Model Ops
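
A chunk-level guard can be sketched as a generator that wraps the model stream. The secret pattern and the `<tool_call>` marker are invented placeholders; the point is the control flow: redact what can be redacted, and stop for good once a guard trips.

```python
import re
from typing import Iterable, Iterator

# Hypothetical patterns, for illustration only.
SECRET = re.compile(r"sk-[A-Za-z0-9]{8,}")
TOOL_CALL = re.compile(r"<tool_call>")

def guarded_stream(chunks: Iterable[str],
                   safe_summary: str = "[output withheld pending review]"
                   ) -> Iterator[str]:
    for chunk in chunks:
        if TOOL_CALL.search(chunk):
            # Guard trip: emit the safe fallback and never resume the stream.
            yield safe_summary
            return
        # Redactable content is masked and the stream continues.
        yield SECRET.sub("[redacted]", chunk)
```

This sketch inspects each chunk in isolation, so a pattern split across two chunks would slip through; a fuller guard would keep a small rolling buffer.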

Trace Observability Harness

G1.U4.P5.Z3.A1

Dynamic, Phase 6

observes: missing trace, coordinate gap, metric blind spot, log pii leak

control: Prevents blind autonomous execution by requiring traceable coordinates, redacted logs, owned metrics, and alert coverage.

analyzer: Trace coverage, coordinate presence, metric completeness, and PII log policy checks run before observability repair planning.

envelope: May add instrumentation tasks and block blind automation; may not expose sensitive logs or weaken retention policy.

coverage: Finds routes, jobs, agents, and UI workflows without trace ID, MARIA coordinate, metric owner, redaction, or alert rule.

first slice: Score new APIs, cron jobs, and agent workflows for trace coverage and coordinate completeness.

owner: Observability
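
The coverage checks can be approximated per event, as in the sketch below. The coordinate pattern is inferred from the coordinates shown on this page (for example G1.U4.P5.Z3.A1); the event field names and the email-only PII pattern are assumptions.

```python
import re

# Pattern inferred from coordinates such as "G1.U4.P5.Z3.A1".
COORDINATE = re.compile(r"^G\d+\.U\d+\.P\d+\.Z\d+\.A\d+$")
# Deliberately narrow PII example: email addresses only.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")

def trace_gaps(event: dict) -> list:
    """Return the observability gaps found on one emitted event."""
    gaps = []
    if not event.get("trace_id"):
        gaps.append("missing trace")
    if not COORDINATE.match(event.get("coordinate") or ""):
        gaps.append("coordinate gap")
    if not event.get("metric_owner"):
        gaps.append("metric blind spot")
    if EMAIL.search(event.get("message", "")):
        gaps.append("log pii leak")
    return gaps

def redact(message: str) -> str:
    """Mask email addresses before a message reaches the log sink."""
    return EMAIL.sub("[email]", message)
```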

E2E Journey Repair Harness

G1.U4.P6.Z3.A1

Autonomous, Phase 5

observes: journey failure, visual diff, selector drift, navigation dead end

control: Creates scoped repair plans when user-critical flows fail because of selector drift, visual regression, navigation dead ends, or data fixture mismatch.

analyzer: Playwright traces, screenshots, selector changes, and route diffs are classified before repair planning proposes the smallest UI or test fix.

envelope: May update scoped selectors, fixtures, and low-risk UI defects; may not delete user-critical assertions or weaken journey coverage.

coverage: Finds product-critical flows without E2E journey, screenshot baseline, responsive coverage, fixture owner, or failure fingerprint.

first slice: Attach E2E journey repair loops to booking, workflow scanner, audit office, and dashboard critical paths.

owner: Quality Engineering
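
A sketch of the classification step that precedes repair planning. It assumes the trace has already been reduced to a few facts per failed journey; the `JourneyFailure` fields, the 1% visual tolerance, and the conservative choice to auto-propose only selector fixes are all hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class JourneyFailure:
    journey: str
    selector_found: bool
    pixel_diff_ratio: float  # 0..1 against the screenshot baseline
    final_url_ok: bool

def classify(f: JourneyFailure, visual_tolerance: float = 0.01) -> str:
    if not f.final_url_ok:
        return "navigation dead end"
    if not f.selector_found:
        return "selector drift"
    if f.pixel_diff_ratio > visual_tolerance:
        return "visual diff"
    return "journey failure"  # unclassified; needs human triage

# Assumed: only this class gets an automatic, scoped proposal.
AUTO_PROPOSABLE = {"selector drift"}

def repair_plan(f: JourneyFailure) -> dict:
    kind = classify(f)
    action = ("propose scoped selector update" if kind in AUTO_PROPOSABLE
              else "escalate to human review")
    return {"journey": f.journey, "kind": kind, "action": action}
```

No branch deletes an assertion or skips a journey, which keeps the plan inside the envelope.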

Edge Cache Runtime Harness

G1.U4.P3.Z8.A1

Runtime, Phase 4

observes: cache poisoning, locale redirect loop, stale page, header drift

control: Detects stale, misrouted, or incorrectly cached responses and routes proposals for safe cache disablement or middleware repair.

analyzer: Header, redirect, locale, and cache-control traces are checked deterministically before impact analysis reviews user-visible fallout.

envelope: May disable caching for affected routes or open middleware repair tasks; may not change global cache policy without approval.

coverage: Flags middleware and cached routes without cache-key policy, locale redirect tests, stale-content SLA, or header verification.

first slice: Monitor locale middleware, product pages, blog pages, and API cache headers for redirect and stale-content incidents.

owner: Web Platform
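
Two of the deterministic checks can be sketched as follows: detecting a locale redirect loop from a captured redirect map, and flagging risky header combinations. The data shapes and the mapping of header conditions to the "cache poisoning" and "header drift" labels are assumptions.

```python
def find_redirect_loop(start: str, redirects: dict, max_hops: int = 10):
    """Follow redirects from start; return the cycle if one is found.

    redirects maps a path to its redirect target (a hypothetical trace).
    """
    seen = [start]
    current = start
    while current in redirects and len(seen) <= max_hops:
        current = redirects[current]
        if current in seen:
            return seen[seen.index(current):] + [current]
        seen.append(current)
    return None

def header_risks(headers: dict) -> list:
    """Flag header combinations worth reviewing on a cached route."""
    h = {k.lower(): v.lower() for k, v in headers.items()}
    cache_control = h.get("cache-control", "")
    risks = []
    # A publicly cacheable response that also sets a cookie can leak state.
    if "set-cookie" in h and "public" in cache_control:
        risks.append("cache poisoning")
    if not cache_control:
        risks.append("header drift")
    return risks
```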

observe -> gate -> review -> route -> repair

Universe Builder

Watch a zone come to life

Build Sequence

Goal > Scope > Team > Responsibility > Skills > Build > Gates > Validate > Test > Deploy

Skills (K1-K8) are dynamically fetched and auto-refilled from the Skill Store.

Autonomy needs a harness, not a bigger prompt.

Static Harness checks whether an agent may start. Dynamic Harness checks whether it is still safe to continue.

Catches post-activation failures

Blocks runtime financial accidents

Isolates unhealthy agents

Stops self-repair from self-harm

Turns scanners into control signals

Makes autonomy reversible

Dynamic Harness changes the trajectory before an AI organization fails.

Observe the runtime

Classify the drift

Read the gradient

Control the phase
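
The four steps above can be read as a small control loop over the state vector named earlier on this page (goal, memory, authority, quality, cost, responsibility). The sketch below is one possible reading: the numeric scores, the 0.2 drift limit, the three-run stability requirement, and the integer phase index are all assumptions.

```python
DIMENSIONS = ("goal", "memory", "authority", "quality", "cost", "responsibility")

def drift(baseline: dict, observed: dict) -> dict:
    """Classify the drift: absolute deviation per dimension."""
    return {d: abs(observed[d] - baseline[d]) for d in DIMENSIONS}

def gradient(previous_drift: dict, current_drift: dict) -> float:
    """Read the gradient: is total drift growing (> 0) or shrinking (< 0)?"""
    return sum(current_drift.values()) - sum(previous_drift.values())

def next_phase(phase: int, current_drift: dict, grad: float, stable_runs: int,
               limit: float = 0.2, required_stable_runs: int = 3) -> int:
    """Control the phase: tighten on drift, expand only after proven stability."""
    if max(current_drift.values()) > limit:
        return max(phase - 1, 0)      # tighten gates
    if grad > 0:
        return phase                  # drift is growing: hold
    if stable_runs >= required_stable_runs:
        return phase + 1              # expand autonomy
    return phase
```

The asymmetry is deliberate: a single drifted run lowers the phase, while raising it takes several consecutive stable runs.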

How to implement spinal-reflex neural wiring for AI agents

Normalize every event into a stimulus packet

Route known stimuli through bounded reflex arcs

Wrap each reflex in static and dynamic harnesses

Observe, tune, and promote field patterns into OS assets
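
The first three steps above might be wired together as in this sketch. The `Stimulus` fields, the two example reflexes, and the severity ceiling are hypothetical; the fourth step (observing and promoting field patterns) is left out.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Stimulus:
    kind: str
    source: str
    severity: int
    payload: dict

def normalize(event: dict) -> Stimulus:
    """Step 1: turn any raw event into the same packet shape."""
    return Stimulus(
        kind=str(event.get("type", "unknown")).lower(),
        source=str(event.get("source", "unknown")),
        severity=int(event.get("severity", 0)),
        payload=dict(event.get("data", {})),
    )

# Step 2: known stimuli map to bounded reflex arcs (assumed examples).
REFLEXES = {
    "stale_lock": lambda s: "pause job",
    "adapter_timeout": lambda s: "switch to fallback",
}
MAX_REFLEX_SEVERITY = 2  # assumed ceiling for unattended reflexes

def route(event: dict) -> str:
    s = normalize(event)
    reflex = REFLEXES.get(s.kind)
    # Step 3: a static check decides whether the reflex may fire at all;
    # unknown or severe stimuli go to a human instead.
    if reflex is None or s.severity > MAX_REFLEX_SEVERITY:
        return "escalate to human"
    return reflex(s)
```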

The moat is not autonomy. It is knowing when autonomy must stop.

Fail-closed

Auto-recovery

HITL convergence

Responsibility envelope

Every new section strengthens the Bonginkan trust graph.

MARIA OS reference

MARIA OS Appliance

Engineering blog

@bongin_ai

See the reality. Fix the structure. Run it every day.

Sales Universe

Audit Universe

FAQ Universe

Auto-Dev Universe

CPA Universe

Value Scanning

Workflow Scanning

MVV OS Consulting

Agentic Company Insight

MARIA Voice

AI Office

CEO Clone OS

MARIA VITAL

Agentic Company

Life Support OS for Agent Orgs

The Destination, Not a Feature

Every LP surface gets a harness placement.

Raw harness

Cross harness

Dynamic harness

Fail-open cycle aggregates every stage.

Universe runtimes

Sales Universe

Audit Universe

FAQ Universe

Auto-Dev Universe

CPA Universe

Meeting Universe

Scanner & service loops

Decision Scanner

Value Scanner

Workflow Scanner

MVV OS Consulting

Agentic Company Insight

Platform surfaces

MARIA Voice

MARIA BOOKING

AI Office

CEO Clone OS

MARIA VITAL

Agentic Company

MARIA Self-Healing Runtime turns failures into reviewable repair PRs.

Three-Layer Failure Analyzer

Harness Coverage Meta-Harness

Fixer Agent Envelope Router

Failure Memory Store

Risk Calibration Ledger

PR-First Regression Loop

Spec Contract Harness

Prompt Policy Harness
