MARIA OS

DECISION OS
FOR AGENT
COMPANIES

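Turn human judgment structures into an OS, and let AI Agents run the company's operations.

MARIA OS is a Decision Operating System: it defines human decision-making as an operating system, and AI Agents carry out work according to that decision structure.

The goal is not automation for its own sake. Judgment, responsibility, and boundaries are designed in advance.

As a result, the company scales as an Agent Company while human authority stays intact.

In an Agent Company, AI agents follow shared values and explicit decision boundaries, and act as accountable members of the organization.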

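Don't run the AI. Run the decisions.

Most AI tools chain prompts together and accelerate automation. What a company needs, however, is more than automation: it needs a decision structure that settles where work is delegated to AI, where execution stops, and where a human takes responsibility. MARIA OS defines a leader's judgment as an operating system and converts it into execution by AI Agents.

Scale AI execution while preserving human authority

Make implicit judgment reusable as structured decisions

Use governance structure to keep AI autonomy from becoming an organizational risk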

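Built for

Organizations where AI Agents execute real work and real decisions
Leaders who value responsibility and governance, not just speed
Companies that want consistent, reusable decision structures

Not built for

AI tools that only chain prompts at high speed
Full automation that removes human responsibility
AI that replaces a leader's judgment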

Dynamic Harness / Main Concept

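Autonomy does not need
a giant prompt.
It needs a harness.

MARIA OS observes an Agent's execution as a state vector of goal, memory, authority, quality, cost, and responsibility. Dynamic Harness detects drift, tightens Gates, evidence trails, and HITL in proportion to risk, and widens autonomy step by step only within the range where stability has been demonstrated.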

Intent

Stable goals

Memory

Consistent context

Authority

Safe autonomy

Read the research article on phase control

x(t) phase space

H(t) control surface

drift

harness

stable path

Detect drift

Tighten the Gate

Grant autonomy in stages

Dynamic Harness / Why it works

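A static harness checks whether it is safe to start. Dynamic Harness decides, during execution, whether it is still safe to continue.

Static harness: at startup

Before startup: are the design contracts satisfied?

Authority and owner · rollback path · evidence requirements · approval Gate

Dynamic Harness: during execution

Given what is happening right now, is it still safe to continue?

Quality and evidence · heartbeat and retry · cost and latency · authority drift

Runtime verdicts: pass · warn · block · quarantine · human approval · proposal-only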

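Catch failures after startup

Even a correct design can break during execution. When quality degrades, evidence goes missing, or a tool fails, the harness still stops the invalid episode.

Stop financial incidents during execution

Duplicate charges and budget overruns are blocked, and high-cost tool calls are routed to human approval. Execution stops before the side effect occurs.

Isolate unhealthy Agents

An Agent showing missed heartbeats, exhausted retries, or permission denials is moved to quarantine before the failure amplifies.

Prevent runaway self-repair

The loop guard detects a repeated failure fingerprint and repeated repair attempts, and drops the episode to human approval or quarantine instead of looping.

Turn the Scanner into a control signal

Decision and Workflow Scans become runtime vectors that drive observation, constraint, approval requests, repair, and quarantine. They do not end as passive reports.

Make autonomy reversible

The permitted action space widens only while runtime evidence stays stable, and shrinks immediately when risk rises.

In short, MARIA OS can claim not only "this Agent was designed safely" but also "it is running safely right now, and if it becomes dangerous its autonomy can be lowered before the damage spreads."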
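The runtime verdicts listed above (pass, warn, block, quarantine, human approval, proposal-only) can be pictured as one function over live signals. The following is a minimal sketch under stated assumptions, not MARIA OS code: the `RuntimeSignals` fields, the three thresholds, and the choice to map missing evidence to proposal-only are all hypothetical.

```python
from dataclasses import dataclass, replace
from enum import Enum


class Verdict(Enum):
    PASS = "pass"
    WARN = "warn"
    BLOCK = "block"
    QUARANTINE = "quarantine"
    HUMAN_APPROVAL = "human approval"
    PROPOSAL_ONLY = "proposal-only"


@dataclass(frozen=True)
class RuntimeSignals:
    """Hypothetical snapshot of one running episode (field names are illustrative)."""
    spend: float               # cost accrued so far
    budget: float              # approved budget for the episode
    next_call_cost: float      # estimated cost of the pending tool call
    duplicate_charge: bool     # pending call repeats a charge already made
    missed_heartbeats: int
    retries_left: int
    permission_denied: bool
    evidence_missing: bool
    same_failure_repeats: int  # repeats of one failure fingerprint during repair


# Assumed thresholds, not values taken from MARIA OS.
HIGH_COST_CALL = 50.0
MAX_MISSED_HEARTBEATS = 3
MAX_SAME_FAILURE = 2


def judge(s: RuntimeSignals) -> Verdict:
    """Return the first verdict that applies, checked from most to least severe."""
    # Unhealthy agents are isolated before the failure amplifies.
    if (s.missed_heartbeats >= MAX_MISSED_HEARTBEATS
            or s.retries_left <= 0 or s.permission_denied):
        return Verdict.QUARANTINE
    # Loop guard: a repair loop on one fingerprint goes to a human instead of looping.
    if s.same_failure_repeats >= MAX_SAME_FAILURE:
        return Verdict.HUMAN_APPROVAL
    # Financial side effects are stopped before they happen.
    if s.duplicate_charge or s.spend + s.next_call_cost > s.budget:
        return Verdict.BLOCK
    if s.next_call_cost >= HIGH_COST_CALL:
        return Verdict.HUMAN_APPROVAL
    # Assumption: without evidence the agent may propose but not act.
    if s.evidence_missing:
        return Verdict.PROPOSAL_ONLY
    if s.spend + s.next_call_cost > 0.8 * s.budget:
        return Verdict.WARN
    return Verdict.PASS


healthy = RuntimeSignals(spend=10, budget=100, next_call_cost=5, duplicate_charge=False,
                         missed_heartbeats=0, retries_left=3, permission_denied=False,
                         evidence_missing=False, same_failure_repeats=0)
print(judge(healthy).value)                            # pass
print(judge(replace(healthy, spend=98)).value)         # block
print(judge(replace(healthy, retries_left=0)).value)   # quarantine
```

Checking the most severe conditions first means a single episode never receives a softer verdict than its worst signal warrants.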

Read the research article

Dynamic Harness

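Dynamic Harness changes the trajectory of an AI organization before it breaks.

It evaluates not only the final output but also goal, memory, identity, quality, latency, cost, and authority as a single runtime episode. When the scorecard degrades, it switches to rerun, quarantine, human approval, or a repair proposal.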

x(t) = [g,m,i,t,q,l,c,a]

H(t): observe / constrain / repair

Goal

Memory

Identity

Trust

Quality

Latency

Cost

Authority

stable

adapt

quarantine

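Observe the runtime

Normalize intent, memory, tools, Gates, artifacts, latency, and correction history into a runtime episode.

Classify the drift

Map each failure to an owner, severity, confidence, user visibility, and a verification command.

Read the gradient

Treat completion, pass rate, retry, advisory lift, and failure density as a scorecard that changes over time.

Control the phase

Convert instability into a rerun, quarantine, a draft repair PR, or human approval, and so control how far autonomy expands.

Dynamic Harness connects episode extraction, failure taxonomy, scorecards, repair proposals, and controlled self-healing to Runtime Governance across the enterprise and the wider Agentic Society.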
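One way to read the state vector x(t) = [g,m,i,t,q,l,c,a] and the stable / adapt / quarantine outcomes is as a drift check between two consecutive observations. The sketch below is an illustration only: the 0.0 to 1.0 health scale, the drift measure (largest per-component drop), and all three thresholds are assumptions, not values published for Dynamic Harness.

```python
from dataclasses import dataclass, astuple, replace


@dataclass(frozen=True)
class StateVector:
    """x(t) = [g, m, i, t, q, l, c, a]; each component assumed scaled 0.0 (bad) to 1.0 (healthy)."""
    goal: float
    memory: float
    identity: float
    trust: float
    quality: float
    latency: float
    cost: float
    authority: float


# Assumed thresholds for illustration.
ADAPT_DROP = 0.10        # a drop this large triggers tighter constraints
QUARANTINE_DROP = 0.30   # a drop this large isolates the episode
FLOOR = 0.40             # any component below this isolates the episode


def drift(prev: StateVector, curr: StateVector) -> float:
    """Largest per-component drop between two observations (0.0 if nothing got worse)."""
    return max(0.0, max(p - c for p, c in zip(astuple(prev), astuple(curr))))


def control(prev: StateVector, curr: StateVector) -> str:
    """Map observed drift to one of the three outcomes shown in the diagram."""
    d = drift(prev, curr)
    if d >= QUARANTINE_DROP or min(astuple(curr)) < FLOOR:
        return "quarantine"
    if d >= ADAPT_DROP:
        return "adapt"
    return "stable"


baseline = StateVector(0.9, 0.9, 0.9, 0.9, 0.9, 0.9, 0.9, 0.9)
print(control(baseline, baseline))                        # stable
print(control(baseline, replace(baseline, quality=0.7)))  # adapt
print(control(baseline, replace(baseline, quality=0.5)))  # quarantine
```

Using the worst single component rather than an average keeps one collapsing dimension, such as authority, from being masked by seven healthy ones.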

Read the research article

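Implementation pattern / Spinal-reflex wiring

How to implement spinal-reflex wiring in an AI Agent

Do not send every known stimulus to the LLM. In MARIA OS, events that are routine, bounded, and clear in their scope of responsibility are handled by a reflex arc; only ambiguous or high-risk events are escalated to deliberation.

Known work goes to the fast path. Unknown work goes to the deliberation path.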

layer 1: Stimulus

layer 2: Reflex arc

layer 3: Harness

layer 4: Envelope

layer 5: Evidence trail

known stimulus -> governed action

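Normalize every event into a stimulus packet

A reflex never fires directly from raw text. Conversations, forms, workflow changes, API callbacks, and document updates are converted into stimulus packets that carry context, actor, object, risk, and current state.

What matters is business typing, not intent classification.

raw input -> stimulus packet

actor: who is acting

object: what is acted on

risk: risk level

state: current state

authority: permissions held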
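The stimulus packet above can be sketched as a small typed record plus a lookup-based normalizer. Everything here beyond the five field names is an assumption for illustration: the `BUSINESS_TYPES` table, the event names, and the rule that untyped events default to high risk are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class StimulusPacket:
    source: str               # channel the raw input arrived on
    kind: Optional[str]       # business type; None when the event is not a known type
    actor: str                # who is acting
    obj: str                  # what is acted on ("object" in the packet diagram)
    risk: str                 # "low" or "high" in this sketch
    state: str                # current state of the object
    authority: Optional[str]  # permissions held; None when unknown


# Hypothetical typing table: (source, event) -> (business type, risk).
BUSINESS_TYPES = {
    ("form", "expense_claim"): ("expense.submitted", "low"),
    ("api_callback", "payment_failed"): ("billing.payment_failed", "high"),
}


def normalize(raw: dict) -> StimulusPacket:
    """Type a raw event by business lookup, not by guessing intent from free text."""
    source = raw.get("source", "unknown")
    # Assumption: anything not in the table stays untyped and is treated as high risk.
    kind, risk = BUSINESS_TYPES.get((source, raw.get("event", "")), (None, "high"))
    return StimulusPacket(
        source=source,
        kind=kind,
        actor=raw.get("actor", "unknown"),
        obj=raw.get("object", "unknown"),
        risk=risk,
        state=raw.get("state", "unknown"),
        authority=raw.get("authority"),
    )


known = normalize({"source": "form", "event": "expense_claim", "actor": "user-1",
                   "object": "claim-7", "state": "draft", "authority": "employee"})
unknown = normalize({"source": "chat", "event": "free_text", "actor": "user-2"})
print(known.kind, known.risk)      # expense.submitted low
print(unknown.kind, unknown.risk)  # None high
```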

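Route known stimuli to bounded reflex arcs

A reflex arc is a pre-designed execution path for a known class of work. Reflex arcs handle returning incomplete input, classifying requests, stopping prohibited sends, attaching evidence, escalating high-risk cases, and running deterministic workflows.

A reflex is not a shortcut. It is a decision that has already been designed.

Known / low risk -> reflex arc (deterministic)

Ambiguous / unclassified -> deliberation (LLM + human)

Insufficient authority -> fail-closed

Reflex selection matrix

Make the routing decision explicit. Known, bounded work goes to a reflex. Ambiguous work goes to deliberation. Work that lacks authority goes to fail-closed.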
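The three rows of the reflex selection matrix can be written as an explicit routing function. This is a sketch under assumptions: the `Stimulus` record, the `REFLEX_ARCS` registry, and its two entries are hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Stimulus:
    kind: Optional[str]       # business type; None when unclassified
    risk: str                 # "low" or "high" in this sketch
    authority: Optional[str]  # None when the actor's authority is missing


# Hypothetical registry of pre-designed reflex arcs, keyed by business type.
REFLEX_ARCS = {
    "expense.submitted": "return_incomplete_input",
    "mail.outbound_prohibited": "stop_prohibited_send",
}


def route(s: Stimulus) -> str:
    """Apply the matrix in order: authority first, then known low-risk, else deliberation."""
    if s.authority is None:
        return "fail-closed"
    if s.kind in REFLEX_ARCS and s.risk == "low":
        return "reflex:" + REFLEX_ARCS[s.kind]
    return "deliberation"


print(route(Stimulus("expense.submitted", "low", "employee")))   # reflex:return_incomplete_input
print(route(Stimulus("expense.submitted", "high", "employee")))  # deliberation
print(route(Stimulus(None, "low", "employee")))                  # deliberation
print(route(Stimulus("expense.submitted", "low", None)))         # fail-closed
```

The authority check runs before the reflex lookup, so a known, low-risk stimulus without authority still fails closed rather than taking the fast path.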

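Wrap reflexes in static and dynamic harnesses

The static harness fixes authority, data, tools, and prohibited conditions. The dynamic harness adjusts the executable range at runtime according to risk, confidence, state, deadlines, and audit conditions.

A reflex moves fast only inside a governed action space.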
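The runtime adjustment described above is asymmetric: the executable range widens slowly and narrows at once. A minimal sketch of that asymmetry follows, assuming a four-step autonomy ladder and a five-episode stability streak; both are illustrative choices, and the `AutonomyGovernor` name is hypothetical.

```python
from dataclasses import dataclass

# Assumed ladder from most to least restricted; not a MARIA OS definition.
LEVELS = ["proposal-only", "human approval", "supervised", "autonomous"]
STABLE_EPISODES_TO_WIDEN = 5  # assumed streak needed before widening one step


@dataclass
class AutonomyGovernor:
    level: int = 0
    stable_streak: int = 0

    def observe(self, stable: bool, risk_raised: bool = False) -> str:
        """Update the autonomy level from one episode and return its name."""
        if risk_raised:
            # Risk rising collapses autonomy immediately to the most restricted level.
            self.level = 0
            self.stable_streak = 0
        elif not stable:
            # An unstable episode costs one step and resets the streak.
            self.level = max(0, self.level - 1)
            self.stable_streak = 0
        else:
            self.stable_streak += 1
            if (self.stable_streak >= STABLE_EPISODES_TO_WIDEN
                    and self.level < len(LEVELS) - 1):
                # Widen by a single step only after a full stable streak.
                self.level += 1
                self.stable_streak = 0
        return LEVELS[self.level]


governor = AutonomyGovernor()
history = [governor.observe(stable=True) for _ in range(10)]
after_unstable = governor.observe(stable=False)
after_risk = governor.observe(stable=True, risk_raised=True)
print(history[4], history[9], after_unstable, after_risk)
```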

Execute only through an Envelope-style responsibility contract

If the Envelope is missing or invalid, execution fails closed.

control wrapper: wrap the reflex

reflex checks: static boundary · dynamic risk · multi-stage control · stop conditions

responsibility contract: responsibility travels with the reflex

reflex checks: owner · authority scope · purpose scope · failure path
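The four responsibility-contract checks (owner, authority scope, purpose scope, failure path) can be sketched as a guard that refuses to run the reflex unless every check passes. The `Envelope` shape and the `execute_reflex` helper are hypothetical; only the four checked properties come from the text above.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, Optional, Tuple


@dataclass(frozen=True)
class Envelope:
    owner: str                       # accountable person or team
    authority_scope: FrozenSet[str]  # actions the reflex may take
    purpose_scope: FrozenSet[str]    # purposes the reflex may serve
    failure_path: str                # where the episode goes when it fails


def execute_reflex(action: str, purpose: str, envelope: Optional[Envelope],
                   run: Callable[[], str]) -> Tuple[bool, str]:
    """Run the reflex only if every envelope check passes; otherwise fail closed."""
    if envelope is None:
        return (False, "fail-closed: envelope missing")
    if not envelope.owner:
        return (False, "fail-closed: no owner")
    if not envelope.failure_path:
        return (False, "fail-closed: no failure path")
    if action not in envelope.authority_scope:
        return (False, "fail-closed: action outside authority scope")
    if purpose not in envelope.purpose_scope:
        return (False, "fail-closed: purpose outside purpose scope")
    return (True, run())


env = Envelope(owner="ops-lead",
               authority_scope=frozenset({"return_form"}),
               purpose_scope=frozenset({"expense_intake"}),
               failure_path="escalate_to_owner")

ok = execute_reflex("return_form", "expense_intake", env, lambda: "form returned")
no_env = execute_reflex("return_form", "expense_intake", None, lambda: "form returned")
out_of_scope = execute_reflex("send_invoice", "expense_intake", env, lambda: "invoice sent")
print(ok, no_env, out_of_scope, sep="\n")
```

The `run` callable is never invoked on a failed check, which is the property that makes the guard fail-closed rather than best-effort.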

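Observe, tune, and turn field patterns into OS assets

Every reflex leaves an evidence trail of firing, stopping, escalation, override, and rollback. The FDE team uses that trail to tune local reflexes and promotes stable patterns into reusable MARIA OS assets.

Field implementation becomes platform learning.

MARIA OS

Fire

Stop

Return to human

Tune

Promote to asset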

FDE traces become reusable reflex libraries

Operational Governance

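The advantage is not autonomy itself. It is knowing where to stop.

MARIA OS treats stopping, recovery, evidence, and human escalation as production paths rather than exception handling. Internally, recovery paths are stress-tested aggressively; in customer environments, HITL stays deliberately thick until trust, evidence, and repeatability are all in place.

Read the evaluation article

Metrics to watch

Runtime proof

Fail-closed

Do not execute when authority, evidence, or context is insufficient.

Auto-recovery

Internally, harden recovery paths with cause logs and post-recovery verification.

HITL convergence

Reduce human review starting with repeatable workflows whose stability has been demonstrated.

Responsibility envelope

Never enable an execution path that has no owner.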

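Evidence hub

Every new section strengthens Bonginkan's trust graph.

MARIA OS content is not isolated marketing copy. Product pages, experiments, architecture notes, and technical articles are all tied back to Bonginkan as publisher, builder, and accountable source.

Follow Bonginkan @bongin_ai

Company source

Product evidence

MARIA OS Reference

Official Bonginkan case material that ties the operating system to the company record.

Implementation source

Deployment evidence

MARIA OS Appliance

An appliance reference for teams that need a governed AI Agent runtime through a concrete delivery path.

Public archive

Research evidence

Technical blog

Technical articles, experiments, and decision-architecture notes published as MARIA OS grows.

Social signal

Live evidence

@bongin_ai

Ongoing product notes and founder-perspective thinking, connected to the same product graph.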

Features & Products

See reality. Fix the structure. Operate every day.

Products are aligned through judgment, not wired together through integrations.

01–06 Universe

07–10 Service

11–16 Platform

Sales Universe

6 agents + 3 packs

An AI agent team that runs deals with judgment-aware automation. Every deal stage has a specialist.

Learn more

Audit Universe

6 agents + 4 packs

Not a faster spreadsheet. A reproducible audit engine in which every finding comes with evidence.

Learn more

FAQ Universe

4 planets + 17 agents

Generates FAQs automatically from real documents. Every answer cites its source, page, and evidence quality.

Learn more

Auto-Dev Universe

5 agents + 4 gates

Agents build, test, review, and deploy. It is not "AI writes the code": the database authorizes each change.

Learn more

CPA Universe

Knowledge graph + pass gate

AI agents that study for the CPA exam using a knowledge graph and spaced repetition, governed by evidence.

Learn more

Meeting Universe

Value Scanning

10 agents + 4 gates

Espoused values versus practiced values. Makes visible where the organization's behavior contradicts its beliefs.

Learn more

Value Scanning

Workflow Scanning

Scans real processes to identify waste, responsibility gaps, and bottlenecks, then prescribes a recomposition.

Learn more

Workflow Scanning

MVV OS Consulting

Converts mission, vision, and values into executable governance. Principles become operating constraints.

Learn more

MVV OS Consulting

Agentic Company Insight

Assesses how mature a company is in adopting agents. Identifies where agents can act, where humans must decide, and where risk gates are missing.

Learn more

Agentic Company Insight

MARIA Voice

Talk to the Decision OS. Voice commands are converted into governed actions.

Learn more

MARIA Voice

AI Office

A virtual office where AI agents work as departments (HR, accounting, legal, development), governed by MARIA OS.

Learn more

MARIA BOOKING

CEO Clone OS

A decision OS that spans voice interviews, the Decision OS, a 5KB Genome, meeting Agents, ringi approval gates, integrations, and Doctor Agent repair.

Learn more

AI Office

MARIA VITAL

Life Support OS for Agent Orgs

A life-support OS for agent organizations. It continuously monitors and controls agent vitals: behavior health, judgment quality, coordination state, and recoverability.

Learn more

CEO Clone OS

Agentic Company

The Destination, Not a Feature

From human company to self-improving company: a structured evolution path with governance built into every stage.

Learn more

MARIA OS

See → Fix → Run

Harness Adoption Map

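Define where the harness goes for every product on the landing page.

Cross-cutting harnesses share episodes, gates, scorecards, and quarantine, while individual harnesses control the failure modes specific to Sales, Audit, Voice, Meeting, and other products.

Surface

Control

Dynamic response

Raw-intake harness

Turns input, evidence, conversations, and diffs into episodes

Cross-cutting harness

Gates and scorecards shared by every product

Dynamic harness

Adjusts constraints and autonomy in response to drift

Comprehensive Harness Cycle

Aggregate every stage, fail-open.

Generates a Harness Designer-style plan, a cycle report, and a stable fingerprint, and keeps running the later stages that can still be diagnosed even after a failure.

Stages

smoke

release

research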
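Fail-open aggregation, as described for the Comprehensive Harness Cycle, means a failed stage is recorded rather than allowed to abort the stages after it. The sketch below assumes each stage is a plain callable that raises on failure, and assumes the stable fingerprint is a hash over the sorted failure set; neither detail is specified by the source, and the stage bodies are placeholders.

```python
import hashlib
from typing import Callable, Dict, List, Tuple

Stage = Tuple[str, Callable[[], None]]


def run_cycle(stages: List[Stage]) -> Dict[str, object]:
    """Run every stage even if earlier ones fail, then summarize the cycle."""
    results: Dict[str, str] = {}
    for name, check in stages:
        try:
            check()
            results[name] = "pass"
        except Exception as exc:
            # Fail-open: record the failure and keep diagnosing later stages.
            results[name] = "fail:" + type(exc).__name__
    failures = sorted(f"{n}={r}" for n, r in results.items() if r != "pass")
    # Assumed fingerprint: identical failure sets always hash to the same value.
    fingerprint = hashlib.sha256("|".join(failures).encode("utf-8")).hexdigest()[:12]
    return {"results": results, "ok": not failures, "fingerprint": fingerprint}


def smoke() -> None:
    pass  # placeholder: stage passes


def release() -> None:
    raise ValueError("release gate failed")  # placeholder: stage fails


def research() -> None:
    pass  # placeholder: stage passes


report = run_cycle([("smoke", smoke), ("release", release), ("research", research)])
print(report["results"])  # {'smoke': 'pass', 'release': 'fail:ValueError', 'research': 'pass'}
print(report["ok"])       # False
```

Fail-open here applies to diagnostics only: the cycle keeps collecting evidence, while the report's `ok` flag still marks the cycle as failed.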

Universe execution

Sales Universe

G1.U1.P1.Z1.A1

Deal Evidence Intake Harness / Deal Phase Harness

Attach episode scoring to proposal and estimate generation.

Audit Universe

G1.U1.P2.Z1.A1

Evidence Chain Harness / Procedure-Specific Audit Harness

Evaluate every generated finding through evidence completeness and risk-tier gates.

FAQ Universe

G1.U1.P3.Z1.A1

Source Crawl Harness / FAQ Voice Harness

Add source freshness and public-release gates to generated FAQ artifacts.

Auto-Dev Universe

G1.U1.P4.Z1.A1

Diff Episode Harness / Repository-Specific Dev Harness

Attach dynamic harness scoring to CI failure triage and repair proposals.

CPA Universe

G1.U1.P5.Z1.A1

Learning Evidence Harness / Exam Domain Harness

Gate pass readiness with source validity and repeated-correction signals.

Meeting Universe

G1.U1.P6.Z1.A1

Consent Episode Harness / Meeting Phase Harness

Extend gate evaluation with harness interventions and episode severity.

Scanner / Service

Decision Scanner

G1.U2.P5.Z1.A1

Decision Evidence Harness / Decision Context Harness

Score live decision scans with evidence density, branch risk, and authority-gate pressure.

Value Scanner

G1.U2.P1.Z1.A1

Value Evidence Harness / Executive Values Harness

Add harness confidence and evidence density to value scan summaries.

Workflow Scanner

G1.U2.P2.Z1.A1

Process Evidence Harness / Workflow Domain Harness

Score recompose plans with flow-drift and evidence-density controls.

MVV OS Consulting

G1.U2.P3.Z1.A1

MVV Interview Harness / CEO Clone Harness

Add contradiction and rule-enforceability scoring to CEO Clone outputs.

Agentic Company Insight

G1.U2.P4.Z1.A1

Role Mapping Harness / Department Harness

Add role autonomy confidence and rollback conditions to insight output.

Platform

MARIA Voice

G1.U3.P1.Z1.A1

Turn Episode Harness / Voice Mode Harness

Attach harness severity to action-chat function-call rounds.

MARIA BOOKING

G1.U3.P6.Z1.A1

Booking Conversation Harness / Reservation Phase Harness

Gate booking voice and calendar-sync episodes with consent, slot, and notification evidence.

AI Office

G1.U3.P2.Z1.A1

Office Event Harness / Agent Lifecycle Harness

Score task-engine events with office-health and handoff-drift signals.

CEO Clone OS

G1.U3.P3.Z1.A1

Judgment Sample Harness / Executive Persona Harness

Add contradiction density and identity-boundary scoring to elicitation outputs.

MARIA VITAL

G1.U3.P4.Z1.A1

Vital Signal Harness / Agent Vital Harness

Unify vital signals with the runtime harness scorecard.

Agentic Company

G1.U3.P5.Z1.A1

Company Phase Harness / Evolution Path Harness

Add phase advancement criteria and rollback triggers to Agentic Company stages.

17 surfaces -> raw intake -> cross gates -> individual dynamic control

Harness Installation Plan

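MARIA Self-Healing Runtime turns failures into reviewable repair PRs.

The goal is not just autonomous repair but safe autonomous repair. Failure Analyzer, Meta-Harness, Envelope, Memory Store, Human Approval Gate, and Loop Control collect episodes, classify them with a confidence score, plan the smallest repair, rerun the individual and cross-cutting Harnesses, and record what was learned.

Candidates

P0 priority

Layer

Collection loop

1 Collect

2 Analyze

3 Plan

4 Repair

5 Rerun

6 Learn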

The six mechanisms to adopt first

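Three-layer Failure Analyzer

Machine-classifies logs, types, HTTP, DB, and traces, then cross-checks the result against LLM hypotheses and past Memory before proceeding to a repair.

From CI/build logs, outputs the error type, affected surface, confidence, owner, and escalation route.

KPI: misclassification rate

Harness Coverage Meta-Harness

Detects whether each new API, screen, Agent, external integration, permission, or prompt has the Harness it needs.

For every PR, matches the changed files against coverage rules for routes, screens, prompts, permissions, and integrations.

KPI: Harness coverage gap rate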
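The per-PR coverage check can be sketched as glob matching of changed files against one rule per surface. The path patterns in `COVERAGE_RULES` and the idea of a `harnessed` set of already-covered files are assumptions for illustration; a real repository would supply its own layout.

```python
from fnmatch import fnmatch
from typing import Iterable, List, Set, Tuple

# Hypothetical coverage rules: surface -> path pattern that belongs to it.
COVERAGE_RULES = {
    "route": "app/api/*",
    "screen": "app/pages/*",
    "prompt": "prompts/*",
    "permission": "policies/*",
    "integration": "integrations/*",
}


def coverage_gaps(changed_files: Iterable[str], harnessed: Set[str]) -> List[Tuple[str, str]]:
    """Return (surface, path) for each changed file that matches a rule but has no Harness."""
    gaps = []
    for path in changed_files:
        for surface, pattern in COVERAGE_RULES.items():
            if fnmatch(path, pattern) and path not in harnessed:
                gaps.append((surface, path))
    return gaps


changed = ["app/api/orders/route.ts", "prompts/sales.md", "README.md"]
gaps = coverage_gaps(changed, harnessed={"prompts/sales.md"})
print(gaps)  # [('route', 'app/api/orders/route.ts')]
```

Files that match no rule, such as `README.md` here, are ignored; the gap rate KPI would then be the number of gaps divided by the number of rule-matching files.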

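Fixer Agent Envelope Router

Classifies each repair into a Low, Medium, High, or Memory Write envelope so that the Fixer never exceeds its authority.

Tests, types, copy, and minor UI changes are allowed up to a draft PR; prompt, DB, permission, and external-send changes are sent back to review.

KPI: number of out-of-authority changes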
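The draft-PR versus review split above can be sketched as a set comparison over the kinds of change a repair touches. The kind labels are hypothetical identifiers for the categories named in the text, and the sketch does not assign them to the Low, Medium, High, or Memory Write envelopes because the source does not state that mapping.

```python
from typing import Set

# Change kinds allowed up to a draft PR, and kinds that always return to review.
DRAFT_PR_ALLOWED = {"test", "types", "copy", "minor_ui"}
REVIEW_REQUIRED = {"prompt", "db", "permission", "external_send"}


def route_fix(change_kinds: Set[str]) -> str:
    """Return 'draft_pr' only when every touched kind is explicitly allowed."""
    if not change_kinds:
        return "review"  # assumption: an unclassified repair is not trusted
    if change_kinds & REVIEW_REQUIRED:
        return "review"
    if not change_kinds <= DRAFT_PR_ALLOWED:
        return "review"  # assumption: unknown kinds fail closed
    return "draft_pr"


print(route_fix({"test", "types"}))  # draft_pr
print(route_fix({"test", "db"}))     # review
print(route_fix({"infra"}))          # review
```

Routing on an allowlist rather than a denylist means a new, unanticipated kind of change lands in review by default.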

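Failure Memory Store

Stores as assets the failure details, cause, evidence, repair diff, rerun results, side effects, review, the human reviewer's rationale, and recurrence-prevention rules.

Stores each CI/E2E repair episode together with the reviewer rationale so that similar incidents can be searched before the next planning step.

KPI: recurrence rate of the same failure

Risk Calibration Ledger

Cross-checks runtime risk scores, monitor findings, reviewer decisions, and subsequent incidents, and calibrates the thresholds of the expert prior against operational evidence.

For every repair/approval episode, stores the score basis, calibration version, risk breakdown, reviewer rationale, and incident outcome.

KPI: calibration error

PR-level Regression Loop

Makes the PR the final unit of autonomous repair, and attaches evidence from the Human Approval Gate, Loop Control, and the individual, cross-cutting, Meta, Deploy, and Post-Deploy Harnesses.

Writes the failure summary, repair rationale, rerun Harnesses, side-effect risk, rollback, and Memory updates into the PR body.

KPI: autonomous repair success rate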

Spec Contract Harness

G1.U4.P1.Z1.A1

Static · Phase 1

Observes: undefined errors, field drift, missing owners

Control: Blocks implementation when API, UI fields, DB columns, and acceptance criteria disagree.

Classification: Deterministic schema diff first, LLM review only for ambiguous requirement language.

Responsibility boundary: May block implementation and draft spec diffs; may not approve scope changes.

Coverage: Flags new API, DB, or screen files that lack a spec-contract episode.

First implementation: Generate a schema-to-screen diff for product specs before agent work starts.

Owner: Product Architecture

Prompt Policy Harness

G1.U4.P1.Z2.A1

Static · Phase 3

Observes: missing output contract, unsafe delegation, weak evaluation rubric

Control: Quarantines prompts that lack prohibited actions, output format, evidence rules, or gate policy.

Classification: Rule-based prompt checklist with memory lookup for prior prompt failures.

Responsibility boundary: May quarantine prompts and propose edits; core authority prompts require reviewer approval.

Coverage: Detects production prompts without output format, forbidden actions, or evaluation criteria.

First implementation: Score production prompts for format, authority boundary, and evaluation coverage.

Owner: Agent Governance

Client Data Preflight Harness

G1.U4.P2.Z1.A1

Pre-run · Phase 2

Observes: tenant scope, PII class, agent permission tier

Control: Stops an agent before it reads customer data outside its contract, role, or approval state.

Classification: Deterministic tenant and role policy evaluation before any LLM reasoning.

Responsibility boundary: May deny or request approval; may not expand customer-data access grants.

Coverage: Finds data retrieval paths without tenant, PII, and permission preflight checks.

First implementation: Attach a preflight decision to every customer-data retrieval and exported artifact.

Owner: Security

External Action Preflight Harness

G1.U4.P2.Z2.A1

Pre-run · Phase 2

Observes: blast radius, recipient visibility, business-hour policy

Control: Routes public, financial, destructive, or production actions to human approval before execution.

Classification: Structured action taxonomy with confidence threshold and human fallback.

Responsibility boundary: May draft outbound actions; public, financial, destructive, and deploy actions require approval.

Coverage: Reports external side-effect commands not covered by action preflight policy.

First implementation: Gate outbound email, invoice issue, GitHub PR creation, and deploy commands with one policy matrix.

Owner: Operations

Agent Runtime Telemetry Harness

G1.U4.P3.Z1.A1

Runtime · Phase 4

Observes: latency, token spend, RAG hit rate, tool-call loop

Control: Detects drift during execution and changes route, model, retrieval scope, or escalation state.

Classification: Metric thresholds plus failure-taxonomy classifier backed by similar runtime episodes.

Responsibility boundary: May reroute, degrade, retry, or escalate; may not change authority policy while running.

Coverage: Finds agent runs missing cost, retrieval, gate, and correction telemetry.

First implementation: Normalize every agent run into a runtime episode with cost, retrieval, gate, and correction signals.

Owner: Runtime Platform

Voice Call Stability Harness

G1.U4.P3.Z2.A1

Runtime · Phase 4

Observes: turn gaps, TTS failure, recognition restart, emotion mismatch

Control: Falls back to text, pauses tool execution, or escalates when voice state becomes unstable.

Classification: Deterministic audio-state checks with LLM review for semantic or emotion mismatch.

Responsibility boundary: May pause voice execution or switch channels; may not execute irreversible customer actions.

Coverage: Flags voice flows without turn continuity, TTS completion, and fallback telemetry.

First implementation: Score each voice turn for recognition continuity, TTS completion, and unsafe action pressure.

Owner: Voice Platform

Artifact Evidence Review Harness

G1.U4.P4.Z1.A1

Post-run · Phase 3

Observes: source match, amount mismatch, missing TODO, unsupported claim

Control: Returns generated artifacts for repair when evidence, numbers, deadline, or owner is missing.

Classification: Structured source comparison first, LLM panel only for semantic support checks.

Responsibility boundary: May return artifacts for repair; may not send customer-visible artifacts automatically.

Coverage: Finds generated artifacts without source episode, owner, or review outcome.

First implementation: Review proposal, SOW, estimate, and meeting-minute artifacts against their source episode.

Owner: Quality

Model Routing Harness

G1.U4.P5.Z1.A1

Dynamic · Phase 6

Observes: confidence slope, retry density, provider failure, cost variance

Control: Switches provider, narrows retrieval, downgrades autonomy, or regenerates queries from live signals.

Classification: Scorecard slope and provider error analysis before model-choice LLM reasoning.

Responsibility boundary: May switch models within approved tiers; budget or provider-policy changes require approval.

Coverage: Detects model routes without confidence, cost, retry, and provider-failure records.

First implementation: Add dynamic routing decisions to failed RAG and low-confidence answer episodes.

Owner: Model Ops

CI Repair Harness

G1.U4.P6.Z1.A1

Autonomous · Phase 5

Observes: failing job, log signature, changed files, verification command

Control: Creates scoped repair PRs, reruns failed jobs, and quarantines flaky harness paths.

Classification: Log-signature classifier, deterministic changed-file mapping, then LLM patch planning.

Responsibility boundary: May create scoped repair PRs; may not merge, deploy, or weaken required checks.

Coverage: Finds CI checks, harness jobs, and changed surfaces missing repair coverage.

First implementation: Convert CI failures into repair scope, candidate files, validation commands, and PR body.

Owner: Auto-Dev

Company Operating Harness

G1.U4.P7.Z1.A1

Organization · Phase 8

Observes: stalled deal, follow-up gap, contract-invoice mismatch, branch drift

Control: Turns organizational anomalies into owner alerts, follow-up tasks, policy reviews, or repair workflows.

Classification: Business-rule anomaly detection with memory lookup for repeated operating patterns.

Responsibility boundary: May create tasks and escalation briefs; may not alter contracts, invoices, or staffing authority.

Coverage: Finds business processes without event source, owner, SLA, or escalation route.

First implementation: Connect CRM, contract, invoice, recruiting, and support events into one operating scorecard.

Owner: Executive Office

Integration Contract Runtime Harness

G1.U4.P3.Z3.A1

Runtime · Phase 4

Observes: schema drift, rate-limit pressure, connector auth decay, partial sync

Control: Blocks write paths when connector schema, auth, or idempotency state is unsafe and creates bounded repair work for the owning integration.

Classification: Connector telemetry and contract snapshots are compared first, then ambiguous partial-sync cases are routed to LLM-assisted impact analysis.

Responsibility boundary: May pause connector writes, degrade to read-only, or open repair tasks; may not rotate credentials or expand third-party scopes.

Coverage: Flags integrations that lack schema snapshots, retry policy, auth expiry telemetry, or partial-write reconciliation.

First implementation: Attach runtime contract checks to Salesforce, freee, Google Calendar, and storage sync episodes.

Owner: Integration Platform

Approval Latency Review Harness

G1.U4.P4.Z2.A1

Post-run · Phase 5

Observes: approval wait, reviewer override, stale escalation, decision reversal

Control: Converts slow or unstable approval paths into owner alerts, queue reshaping proposals, and gate-policy repair tickets.

Classification: SLA and queue metrics are inspected deterministically before LLM review summarizes why approvals are delayed or repeatedly reversed.

Responsibility boundary: May recommend reviewer reassignment, SLA changes, or gate copy updates; may not bypass approval or approve work on behalf of humans.

Coverage: Finds human gates without explicit SLA, reviewer owner, escalation route, reversal tracking, or stale-approval handling.

First implementation: Score finance, audit, deploy, and outbound customer approval gates for wait time and reversal patterns.

Owner: Risk Operations

Memory Write Harness

G1.U4.P5.Z2.A1

Dynamic · Phase 6

Observes: memory mutation, source evidence, retention policy, contradiction

Control: Stages learning-store writes until source evidence, retention class, contradiction status, and rollback path are attached.

Classification: Structured provenance checks and retention rules run before semantic contradiction review decides whether a memory write is safe.

Responsibility boundary: May stage or reject memory writes and request reviewer rationale; may not permanently mutate shared memory without source evidence.

Coverage: Detects memory-writing agents without provenance, retention class, reviewer route, rollback key, or contradiction scan.

First implementation: Gate CI repair, workflow repair, and customer-operations memory writes with provenance and contradiction checks.

Owner: Memory Platform

Deployment Canary Harness

G1.U4.P6.Z2.A1

Autonomous · Phase 7

Observes: canary error rate, feature flag state, rollback path, post-deploy probe

Control: Stops rollout and produces a rollback or flag-disable proposal when canary metrics exceed the approved blast-radius envelope.

Classification: Deployment metrics, smoke probes, and flag diffs are checked first, with LLM analysis limited to summarizing blast-radius evidence.

Responsibility boundary: May disable feature flags, stop rollout, or open rollback PRs; may not promote canaries to full rollout without approval.

Coverage: Finds deployable surfaces without canary probes, flag owner, rollback command, post-deploy observation, or customer-impact tier.

First implementation: Add canary probes and rollback evidence to Auto-Dev repair PRs and Vercel preview promotion.

Owner: Release Engineering

Customer Operations Harness

G1.U4.P7.Z2.A1

Organization · Phase 8

Observes: support backlog, SLA breach, renewal risk, incident comms gap

Control: Turns backlog, SLA, renewal, and incident-communication gaps into routed owner work with draft evidence packs.

Classification: Operational thresholds and account-health rules are evaluated first, then LLM review drafts customer-safe escalation summaries.

Responsibility boundary: May create internal tasks and draft customer updates; may not send incident, renewal, or contractual messages without approval.

Coverage: Flags customer operations flows without SLA owner, customer visibility tier, account-risk signal, or approved communication path.

First implementation: Join support tickets, account health, renewal dates, and incident events into one customer-ops harness scorecard.

Owner: Customer Operations

Frontend Render Contract Harness

G1.U4.P1.Z3.A1

Static · Phase 1

Observes: hydration risk, server-client boundary, route metadata, empty state

Control: Blocks UI changes when route ownership, hydration boundaries, metadata, or user-visible fallback behavior is incomplete.

Classification: Static route and component inspection checks client directives, async boundaries, metadata, and empty-state contracts before visual review.

Responsibility boundary: May block component changes and propose boundary fixes; may not convert server components to client components without owner approval.

Coverage: Flags new pages, layouts, or interactive components without render contract, loading state, empty state, or ownership evidence.

First implementation: Run render-contract checks on product pages, dashboard panels, and experimental surfaces added in each PR.

Owner: Frontend Platform

Responsive I18n Preflight Harness

G1.U4.P2.Z3.A1

Pre-run · Phase 2

Observes: missing translation key, text overflow, locale route drift, mobile snap break

Control: Stops pages from shipping when English and Japanese content, route availability, or mobile layout behavior diverge.

Classification: Message-key diffs and viewport constraints are evaluated deterministically before visual checks review overflow or layout regressions.

Responsibility boundary: May block release and propose copy or layout fixes; may not change product messaging intent without content owner review.

Coverage: Finds locale-aware pages without message parity, mobile viewport coverage, overflow checks, or translated route validation.

First implementation: Attach locale parity and mobile text-fit checks to blog, product, dashboard, and experimental pages.

Owner: Frontend Platform

UI Visual Richness Harness

G1.U4.P2.Z2.A2

Post-run · Phase 2

Observes: text-only viewport, missing product visual, flat composition, low color variety

Control: Blocks market-facing visual acceptance when a route scores below the richness threshold and queues a UI-agent repair plan.

Classification: Playwright captures first-viewport screenshots and deterministic DOM visual metrics, then emits scoped UI-agent repair tasks for low-scoring routes.

Responsibility boundary: May draft visual improvement plans and low-risk UI patches; may not ship brand direction changes or remove governance evidence without review.

Coverage: Finds public routes without enough primary visual asset density, color variety, layered surfaces, hierarchy, or screenshot evidence.

First implementation: Score public routes above the fold and write screenshot-backed repair tasks for any page that feels visually underbuilt.

Owner: Frontend Platform

Accessibility Visual Review Harness

G1.U4.P4.Z3.A1

Post-run · Phase 3

Observes: focus trap, contrast drift, aria gap, canvas blank

Control: Returns UI surfaces for repair when keyboard navigation, focus management, contrast, labels, or visual rendering evidence is missing.

Classification: Automated accessibility and screenshot checks run first, with LLM review only for ambiguous visual hierarchy or interaction clarity.

Responsibility boundary: May return UI artifacts for repair; may not waive accessibility regressions on production paths without documented approval.

Coverage: Flags interactive screens without keyboard path, contrast check, semantic labels, screenshot evidence, or canvas fallback verification.

First implementation: Add postrun accessibility and screenshot review to dense dashboards, voice UI, and canvas-heavy experimental pages.

Owner: Design Systems

API Route Contract Harness

G1.U4.P1.Z4.A1

Static · Phase 1

Observes: input schema gap, response drift, status mismatch, missing coordinate

Control: Blocks backend endpoints when request validation, response shape, error behavior, or governance coordinates are missing.

Classification: Route-handler AST and schema checks validate methods, input parsing, status codes, and response shape before semantic contract review.

Responsibility boundary: May block API route changes and draft schema repairs; may not alter public API semantics without product and backend approval.

Coverage: Finds route handlers without input validation, typed response envelope, error taxonomy, MARIA coordinate, or test coverage.

First implementation: Score new and modified app/api route handlers for validation, typed envelopes, and explicit error outcomes.

Owner: Backend Platform

Auth Permission Preflight Harness

G1.U4.P2.Z4.A1

Pre-run · Phase 2

Observes: missing session, tenant leak, role mismatch, tool permission drift

Control: Stops frontend, API, and agent actions before they cross tenant, role, data, or tool authority boundaries.

Classification: Deterministic session, tenant, role, and tool-scope policy evaluation runs before any request or agent action mutates state.

Responsibility boundary: May deny requests, downgrade to read-only, or request approval; may not grant roles, tenants, or tool permissions.

Coverage: Flags server actions, API routes, and agent tools without session checks, tenant filters, role policy, or permission envelope.

First implementation: Attach auth preflight results to write APIs, customer-data reads, agent tools, and external action routes.

Owner: Security

DB Migration Preflight Harness

G1.U4.P2.Z5.A1

Pre-run · Phase 2

Observes: destructive migration, missing index, RLS policy gap, seed drift

Control: Stops DB changes when reversibility, tenant policy, data migration, index coverage, or test evidence is incomplete.

Classification: Schema diff, migration operation, index coverage, and RLS policy checks run before reviewer-guided data-risk analysis.

Responsibility boundary: May block migrations and draft reversible plans; may not apply destructive DB changes or relax RLS without explicit approval.

Coverage: Finds schema changes without rollback, RLS impact, seed update, data backfill, index analysis, or integration-test plan.

First implementation: Evaluate db/schema changes for destructive operations, RLS coverage, rollback path, and dependent API surfaces.

Owner: Data Platform

Data Provider Runtime Harness

G1.U4.P3.Z4.A1

Runtime · Phase 4

Observes: mock-live drift, adapter timeout, shape mismatch, fallback leak

Control: Detects live adapter drift and switches views to bounded fallback states while routing repair work to the provider owner.

Classification: Runtime adapter telemetry and response-shape checks compare mock and live provider contracts before fallback behavior is adjusted.

Responsibility boundary: May degrade to mock-safe or read-only mode and open adapter repair tasks; may not silently mix tenant data across providers.

Coverage: Flags data providers without mock-live parity tests, timeout policy, fallback state, tenant filter, or response-shape contract.

First implementation: Monitor dashboard and product data providers for mock-live parity, adapter timeout, and shape mismatch episodes.

Owner: Data Platform

Queue Cron Runtime Harness

G1.U4.P3.Z5.A1

Runtime · Phase 4

Observes: missed tick, duplicate job, stale lock, backlog growth

Control: Prevents duplicate or stale scheduled execution and routes missed ticks, queue backlogs, and lock failures to bounded recovery.

Classification: Schedule, idempotency, lock, and backlog telemetry are checked first, then historical incident memory ranks likely repair paths.

Responsibility boundary: May pause jobs, skip duplicate ticks, or enqueue repair tasks; may not replay side-effecting jobs without approval.

Coverage: Finds cron and background workflows without idempotency key, stale-lock handling, backlog metrics, or replay policy.

First implementation: Add runtime checks to Civilization daily advancement, intelligence scans, and automation harness jobs.

Owner: Runtime Platform

RAG Index Runtime Harness

G1.U4.P3.Z6.A1

Runtime · Phase 4

Observes: index freshness, chunk-source mismatch, retrieval miss, citation gap

Control: Blocks answer generation or downgrades confidence when retrieval freshness, source integrity, or citation coverage fails.

Classification: Index timestamps, source hashes, retrieval hit rates, and citation coverage are checked before semantic answer support review.

Responsibility boundary: May narrow retrieval, mark sources stale, or request reindex; may not publish unsupported answers or delete source corpora.

Coverage: Flags ingestion and RAG paths without source hash, freshness SLA, retrieval metric, citation requirement, or reindex workflow.

First implementation: Attach RAG freshness checks to FAQ, CPA, knowledge graph, and document-scanner answer episodes.

Owner: Knowledge Platform

Streaming Output Runtime Harness

G1.U4.P3.Z7.A1

Runtime · Phase 4

Observes: partial unsafe output, stream abort, tool-call leak, schema fragment

Control: Stops or rewrites streamed output when partial content violates schema, authority, safety, or customer-visibility rules.

Classification: Chunk-level schema, safety, and tool-call guards run during streaming before postrun review evaluates full artifact quality.

Responsibility boundary: May stop streams, redact partial chunks, or fall back to safe summary; may not continue unsafe public output after a guard trip.

Coverage: Finds streaming endpoints without chunk guard, abort policy, redaction path, final envelope validation, or audit trace.

First implementation: Add chunk-level guards to audit chat, voice responses, workflow scans, and model-generated report streams.

Owner: Model Ops

Trace Observability Harness

G1.U4.P5.Z3.A1

Dynamic · Phase 6

Observes: missing trace, coordinate gap, metric blind spot, log PII leak

Control: Prevents blind autonomous execution by requiring traceable coordinates, redacted logs, owned metrics, and alert coverage.

Classification: Trace coverage, coordinate presence, metric completeness, and PII log policy checks run before observability repair planning.

Responsibility boundary: May add instrumentation tasks and block blind automation; may not expose sensitive logs or weaken retention policy.

Coverage: Finds routes, jobs, agents, and UI workflows without trace ID, MARIA coordinate, metric owner, redaction, or alert rule.

First implementation: Score new APIs, cron jobs, and agent workflows for trace coverage and coordinate completeness.

Owner: Observability

E2E Journey Repair Harness

G1.U4.P6.Z3.A1

Autonomous · Phase 5

Observes: journey failure, visual diff, selector drift, navigation dead end

Control: Creates scoped repair plans when user-critical flows fail through selector drift, visual regression, navigation, or data fixture mismatch.

Classification: Playwright traces, screenshots, selector changes, and route diffs are classified before repair planning proposes the smallest UI or test fix.

Responsibility boundary: May update scoped selectors, fixtures, and low-risk UI defects; may not delete user-critical assertions or weaken journey coverage.

Coverage: Finds product-critical flows without E2E journey, screenshot baseline, responsive coverage, fixture owner, or failure fingerprint.

First implementation: Attach E2E journey repair loops to booking, workflow scanner, audit office, and dashboard critical paths.

Owner: Quality Engineering

Edge Cache Runtime Harness

G1.U4.P3.Z8.A1

Runtime · Phase 4

Observes: cache poisoning, locale redirect loop, stale page, header drift

Control: Detects stale, misrouted, or incorrectly cached responses and routes safe cache disablement or middleware repair proposals.

Classification: Header, redirect, locale, and cache-control traces are checked deterministically before impact analysis reviews user-visible fallout.

Responsibility boundary: May disable caching for affected routes or open middleware repair tasks; may not change global cache policy without approval.

Coverage: Flags middleware and cached routes without cache-key policy, locale redirect tests, stale-content SLA, or header verification.

First implementation: Monitor locale middleware, product pages, blog pages, and API cache headers for redirect and stale-content incidents.

Owner: Web Platform

observe -> gate -> review -> route -> repair

Universe Builder

Watch the moment a zone is born

Build sequence

Goal > Scope > Team > Responsibility > Skills > Build > Gates > Validate > Test > Deploy

Skills (K1-K8) are fetched dynamically from the Skill Store and replenished automatically.
