DECISION OS FOR AGENT COMPANIES
Encode human judgment as an OS. AI Agents execute the business.
Not running AI. Running decisions.
Most AI tools chain prompts to accelerate automation. But what companies truly need is not automation alone. They need a structure of decision-making: where to delegate to AI, where to stop, and where humans take responsibility. MARIA OS defines leader judgment as an operating system and turns it into execution by AI Agents.
This is for
This is not for
Dynamic Harness / Main Concept
MARIA OS treats each agent as a moving state in a phase space. The harness observes drift, tightens constraints, and expands autonomy only when quality, trust, and responsibility remain stable.
Implementation Pattern / Spinal Reflex Wiring
Do not send every known stimulus to a large model. MARIA OS routes routine, bounded, accountable events through reflex arcs, while ambiguous or risky cases rise into deliberation.
Fast path for known work. Deliberative path for unknown work.
01
A reflex cannot fire from raw text alone. First, the system converts messages, forms, workflow changes, API callbacks, and document updates into typed stimulus packets with context, actor, object, risk, and current state.
The key is not intent detection. The key is operational typing.
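A minimal sketch of operational typing, assuming a raw event arrives as a dictionary. The names `StimulusPacket`, `Risk`, and `type_stimulus`, and the three-level risk scale, are hypothetical and not part of MARIA OS; only the field list (context, actor, object, risk, current state) comes from the text above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any, Mapping


class Risk(Enum):
    # Hypothetical three-level scale; the source does not define one.
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"


@dataclass(frozen=True)
class StimulusPacket:
    """An operationally typed event that a reflex arc could accept."""
    kind: str        # hypothetical operational type, e.g. "transfer.requested"
    actor: str       # who or what produced the event
    object_id: str   # the thing being acted on
    risk: Risk
    state: str       # current workflow state
    context: Mapping[str, Any]


def type_stimulus(raw: Mapping[str, Any]) -> StimulusPacket:
    """Convert a raw event into a typed packet, or refuse if it cannot be typed."""
    required = ("kind", "actor", "object", "risk", "state")
    missing = [key for key in required if key not in raw]
    if missing:
        # An untyped stimulus never reaches a reflex arc.
        raise ValueError(f"cannot type stimulus, missing fields: {missing}")
    return StimulusPacket(
        kind=raw["kind"],
        actor=raw["actor"],
        object_id=raw["object"],
        risk=Risk(raw["risk"]),  # raises ValueError on an unknown risk label
        state=raw["state"],
        context=dict(raw.get("context", {})),
    )
```

In this sketch, raw text alone never produces a packet: every required field must be present before anything downstream can fire.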
02
A reflex arc is a predesigned execution path for a known class of work: reject incomplete input, classify a request, stop a prohibited transfer, attach evidence, escalate a high-risk case, or run a deterministic workflow.
A reflex is not a shortcut. It is a decision that has already been designed.
The routing decision is explicit: reflex for known and bounded work, deliberation for ambiguous work, fail-closed for missing authority.
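The three-way routing rule can be sketched as a single function. The `Route` enum and the boolean inputs are hypothetical simplifications; how "known", "bounded", and "authority" are actually determined is not specified in the text.

```python
from enum import Enum


class Route(Enum):
    REFLEX = "reflex"
    DELIBERATION = "deliberation"
    FAIL_CLOSED = "fail_closed"


def choose_route(known: bool, bounded: bool, has_authority: bool) -> Route:
    """Pick an execution path for one typed stimulus."""
    # Authority is checked first so that nothing runs without it,
    # even when the work itself is routine.
    if not has_authority:
        return Route.FAIL_CLOSED
    if known and bounded:
        return Route.REFLEX
    # Anything ambiguous or unbounded rises into deliberation.
    return Route.DELIBERATION
```

Checking authority before anything else is a design choice in this sketch: a familiar request with no accountable authority still fails closed rather than taking the fast path.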
03
Static harnesses define fixed authority, data, tool, and prohibition boundaries. Dynamic harnesses adjust the allowed range at runtime based on risk, confidence, state, deadline, and audit conditions.
The reflex moves fast only inside a governed action space.
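One way to picture the relationship between the two harness types, as a sketch under stated assumptions: the static harness fixes the outer boundary, and the dynamic harness can only narrow it at runtime. The class names, the 0-to-1 signal scales, and the thresholds `0.7` and `0.5` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StaticHarness:
    allowed_tools: frozenset
    prohibited_tools: frozenset


@dataclass(frozen=True)
class RuntimeSignals:
    risk: float        # 0.0 (safe) to 1.0 (dangerous), hypothetical scale
    confidence: float  # 0.0 to 1.0, hypothetical scale
    audit_ok: bool     # whether audit conditions are currently satisfied


def allowed_actions(static: StaticHarness, signals: RuntimeSignals,
                    read_only_tools: frozenset) -> frozenset:
    """Return the governed action space for this moment in the run."""
    # The static boundary is the ceiling; prohibitions always win.
    base = static.allowed_tools - static.prohibited_tools
    if not signals.audit_ok:
        return frozenset()  # no audit trail, no action
    if signals.risk > 0.7 or signals.confidence < 0.5:
        return base & read_only_tools  # tighten to read-only under pressure
    return base
```

The result is always a subset of the static allowance, so runtime adjustment in this sketch can restrict autonomy but never grant more than the fixed boundary permits.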
04
Execute only through Envelope responsibility contracts
If the Envelope is missing or invalid, the action fails closed.
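A minimal sketch of fail-closed execution, assuming an Envelope carries an accountable owner and a set of permitted actions. The fields `owner`, `allowed_actions`, and `expired`, and the `FailClosed` exception, are hypothetical; the actual Envelope contract is not described in the text.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Envelope:
    owner: str                  # accountable human or role
    allowed_actions: frozenset  # actions this contract covers
    expired: bool = False


class FailClosed(Exception):
    """Raised instead of executing when responsibility cannot be established."""


def execute(action: str, envelope: Optional[Envelope],
            run: Callable[[], str]) -> str:
    """Run the action only if a valid Envelope covers it."""
    if envelope is None:
        raise FailClosed("no envelope")
    if envelope.expired or not envelope.owner:
        raise FailClosed("invalid envelope")
    if action not in envelope.allowed_actions:
        raise FailClosed(f"action {action!r} is outside the envelope")
    # Only reached when every check passed.
    return run()
```

The callable is invoked on the last line only, so in every failure case the action's side effects never start.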
05
Each reflex produces traces: fired, blocked, escalated, overridden, or rolled back. FDE teams use those traces to tune local reflexes, then promote stable patterns into reusable MARIA OS assets.
Field implementation becomes platform learning.
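The trace-to-promotion loop could look roughly like this. The five outcome labels come from the text above; the promotion rule (a minimum number of runs and a ceiling on the share of overridden or rolled-back traces) and its default thresholds are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple

OUTCOMES = {"fired", "blocked", "escalated", "overridden", "rolled_back"}


def summarize(traces: Iterable[Tuple[str, str]]) -> Dict[str, Counter]:
    """Group (reflex_id, outcome) traces into per-reflex outcome counts."""
    summary: Dict[str, Counter] = {}
    for reflex_id, outcome in traces:
        if outcome not in OUTCOMES:
            raise ValueError(f"unknown outcome: {outcome}")
        summary.setdefault(reflex_id, Counter())[outcome] += 1
    return summary


def promotion_candidates(summary: Dict[str, Counter], min_runs: int = 20,
                         max_correction_rate: float = 0.02) -> List[str]:
    """List reflexes stable enough to propose as reusable assets."""
    candidates = []
    for reflex_id, counts in summary.items():
        total = sum(counts.values())
        # Overrides and rollbacks are treated as human or system corrections.
        corrections = counts["overridden"] + counts["rolled_back"]
        if total >= min_runs and corrections / total <= max_correction_rate:
            candidates.append(reflex_id)
    return sorted(candidates)
```

A reflex with too few runs is never proposed, whatever its record, which keeps promotion tied to repetition rather than to a short lucky streak.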
Operational Governance
MARIA OS treats stopping, recovery, evidence, and human escalation as production paths, not exception handling. Internally, recovery paths are stressed aggressively. In customer environments, HITL stays heavier until trust, evidence, and repetition justify more autonomy.
Read the assessment
What we measure
Runtime proof
Stop when authority, evidence, or context is insufficient.
Recover internally with causal logs and post-recovery checks.
Reduce repeated human review only after the workflow proves stable.
No execution path is valid without an accountable owner.
MARIA OS content is not isolated marketing copy. Product pages, experiments, architecture notes, and technical articles are connected back to Bonginkan as publisher, builder, and accountable source.
Product proof
Official Bonginkan case material that ties the operating system back to the company record.
Deployment proof
Appliance reference for teams that need a governed AI Agent runtime with a concrete delivery path.
Research proof
Technical articles, experiments, and decision architecture notes published as MARIA OS grows.
Live proof
Ongoing product notes and founder-level thinking connected to the same product graph.
We don't connect products with integrations. We align them with judgment.
01
AI agent teams that execute deals with judgment-aware automation. Every deal stage has a specialist.
02
Not a faster spreadsheet. A reproducible audit engine where every finding carries evidence.
03
Auto-generates FAQ from real documents. Every answer cites source, page, and evidence quality.
04
Agents build, test, review, and deploy. Not 'AI writes code': the database authorizes changes.
05
AI agents that study for CPA exams using knowledge graphs and spaced repetition, governed by evidence.
06
Stated values vs. practiced values. See where your organization's behavior contradicts its beliefs.
07
Scan real processes, identify waste, responsibility gaps, and bottlenecks. Prescribe recomposition.
08
Turn your mission, vision, and values into executable governance. Philosophy becomes operating constraints.
09
Assess your company's agentic maturity. Find where agents can act, where humans must decide, and where risk gates are missing.
10
Speak to your Decision OS. Voice commands become governed actions.
11
A virtual office where AI agents work as departments (HR, Finance, Legal, Dev), governed by MARIA OS.
16
Voice interview, Decision OS, 5KB Genome, meeting agents, approval gates, integrations, and Doctor Agent repair.
12
Life support OS for agent organizations. Continuously monitor behavior health, judgment quality, coordination state, and recoverability.
13
From Human Company to Self-Improving: a structured evolution path with governance at every stage.
14
Continuously monitor agent vitals: behavior health, judgment quality, coordination state, and recoverability.
15
From Human Company to Self-Improving: a structured evolution path with governance at every stage.
See → Fix → Run
Harness Adoption Map
Cross harnesses share episodes, gates, scorecards, and quarantine. Individual harnesses control the failure modes unique to Sales, Audit, Voice, Meeting, and the other surfaces.
Turns inputs, evidence, turns, and diffs into episodes
Shared gates and scorecards across products
Adjusts constraints and autonomy from drift
G1.U1.P1.Z1.A1
Deal Evidence Intake Harness / Deal Phase Harness
Attach episode scoring to proposal and estimate generation.
G1.U1.P2.Z1.A1
Evidence Chain Harness / Procedure-Specific Audit Harness
Evaluate every generated finding through evidence completeness and risk-tier gates.
G1.U1.P3.Z1.A1
Source Crawl Harness / FAQ Voice Harness
Add source freshness and public-release gates to generated FAQ artifacts.
G1.U1.P4.Z1.A1
Diff Episode Harness / Repository-Specific Dev Harness
Attach dynamic harness scoring to CI failure triage and repair proposals.
G1.U1.P5.Z1.A1
Learning Evidence Harness / Exam Domain Harness
Gate pass readiness with source validity and repeated-correction signals.
G1.U1.P6.Z1.A1
Consent Episode Harness / Meeting Phase Harness
Extend gate evaluation with harness interventions and episode severity.
G1.U2.P5.Z1.A1
Decision Evidence Harness / Decision Context Harness
Score live decision scans with evidence density, branch risk, and authority-gate pressure.
G1.U2.P1.Z1.A1
Value Evidence Harness / Executive Values Harness
Add harness confidence and evidence density to value scan summaries.
G1.U2.P2.Z1.A1
Process Evidence Harness / Workflow Domain Harness
Score recompose plans with flow-drift and evidence-density controls.
G1.U2.P3.Z1.A1
MVV Interview Harness / CEO Clone Harness
Add contradiction and rule-enforceability scoring to CEO Clone outputs.
G1.U2.P4.Z1.A1
Role Mapping Harness / Department Harness
Add role autonomy confidence and rollback conditions to insight output.
G1.U3.P1.Z1.A1
Turn Episode Harness / Voice Mode Harness
Attach harness severity to action-chat function-call rounds.
G1.U3.P6.Z1.A1
Booking Conversation Harness / Reservation Phase Harness
Gate booking voice and calendar-sync episodes with consent, slot, and notification evidence.
G1.U3.P2.Z1.A1
Office Event Harness / Agent Lifecycle Harness
Score task-engine events with office-health and handoff-drift signals.
G1.U3.P3.Z1.A1
Judgment Sample Harness / Executive Persona Harness
Add contradiction density and identity-boundary scoring to elicitation outputs.
G1.U3.P4.Z1.A1
Vital Signal Harness / Agent Vital Harness
Unify vital signals with the runtime harness scorecard.
G1.U3.P5.Z1.A1
Company Phase Harness / Evolution Path Harness
Add phase advancement criteria and rollback triggers to Agentic Company stages.
17 surfaces -> raw intake -> cross gates -> individual dynamic control
Harness Installation Plan
The goal is not raw autonomous repair. It is safe autonomous repair: Failure Analyzer, Meta-Harness, Envelope, Memory Store, Human Approval Gate, and Loop Control collect the episode, classify confidence, plan the smallest repair, re-run local and cross harnesses, and preserve the learning.
Classify failures through deterministic signals, LLM root-cause hypotheses, and historical memory before any repair is attempted.
KPI: Misclassification rate
Detect new APIs, screens, agents, integrations, permissions, and prompts that lack the required harness coverage.
KPI: Coverage gap rate
Route repairs into low, medium, high, or memory-write envelopes so the fixer cannot exceed its authority.
KPI: Unauthorized mutation count
Store failure evidence, cause, patch rationale, rerun result, side effects, review notes, human reviewer rationale, and prevention rules as reusable assets.
KPI: Repeat failure rate
Compare runtime risk scores, monitor findings, reviewer decisions, and later incidents so expert-prior thresholds can be calibrated from operational evidence.
KPI: Calibration error
Make the final unit of autonomous repair a reviewable PR with human approval gates, loop controls, and scoped, cross, meta, deploy, and post-deploy harness evidence.
KPI: Autonomous repair success rate
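The envelope step in the plan above can be sketched as a routing function plus an authority check. The four envelope names come from the text; the `RepairPlan` fields and the classification rules (memory writes first, then auth or schema changes, then a file-count cutoff of 3) are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class RepairEnvelope(Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"
    MEMORY_WRITE = "memory_write"


@dataclass(frozen=True)
class RepairPlan:
    files_changed: int
    touches_auth_or_schema: bool
    writes_memory: bool


def route_repair(plan: RepairPlan) -> RepairEnvelope:
    """Assign a planned repair to exactly one envelope."""
    if plan.writes_memory:
        return RepairEnvelope.MEMORY_WRITE
    if plan.touches_auth_or_schema:
        return RepairEnvelope.HIGH
    if plan.files_changed > 3:
        return RepairEnvelope.MEDIUM
    return RepairEnvelope.LOW


def may_apply(fixer_grants: frozenset, plan: RepairPlan) -> bool:
    """True only if the fixer has been granted the envelope the plan needs."""
    return route_repair(plan) in fixer_grants
```

Because the envelope is derived from the plan and not declared by the fixer, a fixer in this sketch cannot lower its own risk tier to get a repair through.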
G1.U4.P1.Z1.A1
observes
control: Blocks implementation when API, UI fields, DB columns, and acceptance criteria disagree.
analyzer: Deterministic schema diff first, LLM review only for ambiguous requirement language.
envelope: May block implementation and draft spec diffs; may not approve scope changes.
coverage: Flags new API, DB, or screen files that lack a spec-contract episode.
first slice: Generate a schema-to-screen diff for product specs before agent work starts.
G1.U4.P1.Z2.A1
observes
control: Quarantines prompts that lack prohibited actions, output format, evidence rules, or gate policy.
analyzer: Rule-based prompt checklist with memory lookup for prior prompt failures.
envelope: May quarantine prompts and propose edits; core authority prompts require reviewer approval.
coverage: Detects production prompts without output format, forbidden actions, or evaluation criteria.
first slice: Score production prompts for format, authority boundary, and evaluation coverage.
G1.U4.P2.Z1.A1
observes
control: Stops an agent before it reads customer data outside its contract, role, or approval state.
analyzer: Deterministic tenant and role policy evaluation before any LLM reasoning.
envelope: May deny or request approval; may not expand customer-data access grants.
coverage: Finds data retrieval paths without tenant, PII, and permission preflight checks.
first slice: Attach a preflight decision to every customer-data retrieval and exported artifact.
G1.U4.P2.Z2.A1
observes
control: Routes public, financial, destructive, or production actions to human approval before execution.
analyzer: Structured action taxonomy with confidence threshold and human fallback.
envelope: May draft outbound actions; public, financial, destructive, and deploy actions require approval.
coverage: Reports external side-effect commands not covered by action preflight policy.
first slice: Gate outbound email, invoice issue, GitHub PR creation, and deploy commands with one policy matrix.
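A single policy matrix for these four actions might be sketched as follows. The approval categories (public, financial, destructive, deploy) come from the envelope line above; the action identifiers and the category assigned to each action are hypothetical, and a real deployment would set them per organization.

```python
from enum import Enum


class Decision(Enum):
    ALLOW_DRAFT = "allow_draft"
    REQUIRE_APPROVAL = "require_approval"
    DENY = "deny"


# Categories that always need a human before execution.
APPROVAL_CATEGORIES = frozenset({"public", "financial", "destructive", "deploy"})

# Hypothetical category assignments for illustration only.
ACTION_CATEGORIES = {
    "draft_email": frozenset(),
    "outbound_email": frozenset({"public"}),
    "invoice_issue": frozenset({"financial"}),
    "github_pr_create": frozenset({"public"}),
    "deploy": frozenset({"deploy"}),
}


def preflight(action: str) -> Decision:
    """Look up one action in the matrix and return the gate decision."""
    categories = ACTION_CATEGORIES.get(action)
    if categories is None:
        # An action missing from the matrix is a coverage gap; fail closed.
        return Decision.DENY
    if categories & APPROVAL_CATEGORIES:
        return Decision.REQUIRE_APPROVAL
    return Decision.ALLOW_DRAFT
```

Keeping every side-effecting action in one table means an uncovered command shows up as a denied lookup, which is also the signal the coverage check reports.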
G1.U4.P3.Z1.A1
observes
control: Detects drift during execution and changes route, model, retrieval scope, or escalation state.
analyzer: Metric thresholds plus failure-taxonomy classifier backed by similar runtime episodes.
envelope: May reroute, degrade, retry, or escalate; may not change authority policy while running.
coverage: Finds agent runs missing cost, retrieval, gate, and correction telemetry.
first slice: Normalize every agent run into a runtime episode with cost, retrieval, gate, and correction signals.
G1.U4.P3.Z2.A1
observes
control: Falls back to text, pauses tool execution, or escalates when voice state becomes unstable.
analyzer: Deterministic audio-state checks with LLM review for semantic or emotion mismatch.
envelope: May pause voice execution or switch channels; may not execute irreversible customer actions.
coverage: Flags voice flows without turn continuity, TTS completion, and fallback telemetry.
first slice: Score each voice turn for recognition continuity, TTS completion, and unsafe action pressure.
G1.U4.P4.Z1.A1
observes
control: Returns generated artifacts for repair when evidence, numbers, deadline, or owner is missing.
analyzer: Structured source comparison first, LLM panel only for semantic support checks.
envelope: May return artifacts for repair; may not send customer-visible artifacts automatically.
coverage: Finds generated artifacts without source episode, owner, or review outcome.
first slice: Review proposal, SOW, estimate, and meeting-minute artifacts against their source episode.
G1.U4.P5.Z1.A1
observes
control: Switches provider, narrows retrieval, downgrades autonomy, or regenerates queries from live signals.
analyzer: Scorecard slope and provider error analysis before model-choice LLM reasoning.
envelope: May switch models within approved tiers; budget or provider-policy changes require approval.
coverage: Detects model routes without confidence, cost, retry, and provider-failure records.
first slice: Add dynamic routing decisions to failed RAG and low-confidence answer episodes.
G1.U4.P6.Z1.A1
observes
control: Creates scoped repair PRs, reruns failed jobs, and quarantines flaky harness paths.
analyzer: Log-signature classifier, deterministic changed-file mapping, then LLM patch planning.
envelope: May create scoped repair PRs; may not merge, deploy, or weaken required checks.
coverage: Finds CI checks, harness jobs, and changed surfaces missing repair coverage.
first slice: Convert CI failures into repair scope, candidate files, validation commands, and PR body.
G1.U4.P7.Z1.A1
observes
control: Turns organizational anomalies into owner alerts, follow-up tasks, policy reviews, or repair workflows.
analyzer: Business-rule anomaly detection with memory lookup for repeated operating patterns.
envelope: May create tasks and escalation briefs; may not alter contracts, invoices, or staffing authority.
coverage: Finds business processes without event source, owner, SLA, or escalation route.
first slice: Connect CRM, contract, invoice, recruiting, and support events into one operating scorecard.
G1.U4.P3.Z3.A1
observes
control: Blocks write paths when connector schema, auth, or idempotency state is unsafe and creates bounded repair work for the owning integration.
analyzer: Connector telemetry and contract snapshots are compared first, then ambiguous partial-sync cases are routed to LLM-assisted impact analysis.
envelope: May pause connector writes, degrade to read-only, or open repair tasks; may not rotate credentials or expand third-party scopes.
coverage: Flags integrations that lack schema snapshots, retry policy, auth expiry telemetry, or partial-write reconciliation.
first slice: Attach runtime contract checks to Salesforce, freee, Google Calendar, and storage sync episodes.
G1.U4.P4.Z2.A1
observes
control: Converts slow or unstable approval paths into owner alerts, queue reshaping proposals, and gate-policy repair tickets.
analyzer: SLA and queue metrics are inspected deterministically before LLM review summarizes why approvals are delayed or repeatedly reversed.
envelope: May recommend reviewer reassignment, SLA changes, or gate copy updates; may not bypass approval or approve work on behalf of humans.
coverage: Finds human gates without explicit SLA, reviewer owner, escalation route, reversal tracking, or stale-approval handling.
first slice: Score finance, audit, deploy, and outbound customer approval gates for wait time and reversal patterns.
G1.U4.P5.Z2.A1
observes
control: Stages learning-store writes until source evidence, retention class, contradiction status, and rollback path are attached.
analyzer: Structured provenance checks and retention rules run before semantic contradiction review decides whether a memory write is safe.
envelope: May stage or reject memory writes and request reviewer rationale; may not permanently mutate shared memory without source evidence.
coverage: Detects memory-writing agents without provenance, retention class, reviewer route, rollback key, or contradiction scan.
first slice: Gate CI repair, workflow repair, and customer-operations memory writes with provenance and contradiction checks.
G1.U4.P6.Z2.A1
observes
control: Stops rollout and produces a rollback or flag-disable proposal when canary metrics exceed the approved blast-radius envelope.
analyzer: Deployment metrics, smoke probes, and flag diffs are checked first, with LLM analysis limited to summarizing blast-radius evidence.
envelope: May disable feature flags, stop rollout, or open rollback PRs; may not promote canaries to full rollout without approval.
coverage: Finds deployable surfaces without canary probes, flag owner, rollback command, post-deploy observation, or customer-impact tier.
first slice: Add canary probes and rollback evidence to Auto-Dev repair PRs and Vercel preview promotion.
G1.U4.P7.Z2.A1
observes
control: Turns backlog, SLA, renewal, and incident-communication gaps into routed owner work with draft evidence packs.
analyzer: Operational thresholds and account-health rules are evaluated first, then LLM review drafts customer-safe escalation summaries.
envelope: May create internal tasks and draft customer updates; may not send incident, renewal, or contractual messages without approval.
coverage: Flags customer operations flows without SLA owner, customer visibility tier, account-risk signal, or approved communication path.
first slice: Join support tickets, account health, renewal dates, and incident events into one customer-ops harness scorecard.
G1.U4.P1.Z3.A1
observes
control: Blocks UI changes when route ownership, hydration boundaries, metadata, or user-visible fallback behavior is incomplete.
analyzer: Static route and component inspection checks client directives, async boundaries, metadata, and empty-state contracts before visual review.
envelope: May block component changes and propose boundary fixes; may not convert server components to client components without owner approval.
coverage: Flags new pages, layouts, or interactive components without render contract, loading state, empty state, or ownership evidence.
first slice: Run render-contract checks on product pages, dashboard panels, and experimental surfaces added in each PR.
G1.U4.P2.Z3.A1
observes
control: Stops pages from shipping when English and Japanese content, route availability, or mobile layout behavior diverge.
analyzer: Message-key diffs and viewport constraints are evaluated deterministically before visual checks review overflow or layout regressions.
envelope: May block release and propose copy or layout fixes; may not change product messaging intent without content owner review.
coverage: Finds locale-aware pages without message parity, mobile viewport coverage, overflow checks, or translated route validation.
first slice: Attach locale parity and mobile text-fit checks to blog, product, dashboard, and experimental pages.
G1.U4.P2.Z2.A2
observes
control: Blocks market-facing visual acceptance when a route scores below the richness threshold and queues a UI-agent repair plan.
analyzer: Playwright captures first-viewport screenshots and deterministic DOM visual metrics, then emits scoped UI-agent repair tasks for low-scoring routes.
envelope: May draft visual improvement plans and low-risk UI patches; may not ship brand direction changes or remove governance evidence without review.
coverage: Finds public routes without enough primary visual asset density, color variety, layered surfaces, hierarchy, or screenshot evidence.
first slice: Score public routes above the fold and write screenshot-backed repair tasks for any page that scores below the richness threshold.
G1.U4.P4.Z3.A1
observes
control: Returns UI surfaces for repair when keyboard navigation, focus management, contrast, labels, or visual rendering evidence is missing.
analyzer: Automated accessibility and screenshot checks run first, with LLM review only for ambiguous visual hierarchy or interaction clarity.
envelope: May return UI artifacts for repair; may not waive accessibility regressions on production paths without documented approval.
coverage: Flags interactive screens without keyboard path, contrast check, semantic labels, screenshot evidence, or canvas fallback verification.
first slice: Add postrun accessibility and screenshot review to dense dashboards, voice UI, and canvas-heavy experimental pages.
G1.U4.P1.Z4.A1
observes
control: Blocks backend endpoints when request validation, response shape, error behavior, or governance coordinates are missing.
analyzer: Route-handler AST and schema checks validate methods, input parsing, status codes, and response shape before semantic contract review.
envelope: May block API route changes and draft schema repairs; may not alter public API semantics without product and backend approval.
coverage: Finds route handlers without input validation, typed response envelope, error taxonomy, MARIA coordinate, or test coverage.
first slice: Score new and modified app/api route handlers for validation, typed envelopes, and explicit error outcomes.
G1.U4.P2.Z4.A1
observes
control: Stops frontend, API, and agent actions before they cross tenant, role, data, or tool authority boundaries.
analyzer: Deterministic session, tenant, role, and tool-scope policy evaluation runs before any request or agent action mutates state.
envelope: May deny requests, downgrade to read-only, or request approval; may not grant roles, tenants, or tool permissions.
coverage: Flags server actions, API routes, and agent tools without session checks, tenant filters, role policy, or permission envelope.
first slice: Attach auth preflight results to write APIs, customer-data reads, agent tools, and external action routes.
G1.U4.P2.Z5.A1
observes
control: Stops DB changes when reversibility, tenant policy, data migration, index coverage, or test evidence is incomplete.
analyzer: Schema diff, migration operation, index coverage, and RLS policy checks run before reviewer-guided data-risk analysis.
envelope: May block migrations and draft reversible plans; may not apply destructive DB changes or relax RLS without explicit approval.
coverage: Finds schema changes without rollback, RLS impact, seed update, data backfill, index analysis, or integration-test plan.
first slice: Evaluate db/schema changes for destructive operations, RLS coverage, rollback path, and dependent API surfaces.
G1.U4.P3.Z4.A1
observes
control: Detects live adapter drift and switches views to bounded fallback states while routing repair work to the provider owner.
analyzer: Runtime adapter telemetry and response-shape checks compare mock and live provider contracts before fallback behavior is adjusted.
envelope: May degrade to mock-safe or read-only mode and open adapter repair tasks; may not silently mix tenant data across providers.
coverage: Flags data providers without mock-live parity tests, timeout policy, fallback state, tenant filter, or response-shape contract.
first slice: Monitor dashboard and product data providers for mock-live parity, adapter timeout, and shape mismatch episodes.
G1.U4.P3.Z5.A1
observes
control: Prevents duplicate or stale scheduled execution and routes missed ticks, queue backlogs, and lock failures to bounded recovery.
analyzer: Schedule, idempotency, lock, and backlog telemetry are checked first, then historical incident memory ranks likely repair paths.
envelope: May pause jobs, skip duplicate ticks, or enqueue repair tasks; may not replay side-effecting jobs without approval.
coverage: Finds cron and background workflows without idempotency key, stale-lock handling, backlog metrics, or replay policy.
first slice: Add runtime checks to Civilization daily advancement, intelligence scans, and automation harness jobs.
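Duplicate and stale tick handling can be sketched with an in-memory ledger keyed by job and tick. `TickLedger`, its return labels, and the `stale_after` default are hypothetical; a production version would need a shared, durable store in place of a Python set and dict.

```python
from dataclasses import dataclass, field
from typing import Dict, Set, Tuple

Key = Tuple[str, str]  # (job name, tick id such as a scheduled timestamp)


@dataclass
class TickLedger:
    stale_after: float = 300.0  # seconds before a held lock counts as stale
    _done: Set[Key] = field(default_factory=set)
    _locks: Dict[Key, float] = field(default_factory=dict)

    def try_start(self, job: str, tick_id: str, now: float) -> str:
        """Decide what to do with one scheduled tick."""
        key = (job, tick_id)
        if key in self._done:
            return "skip_duplicate"
        started = self._locks.get(key)
        if started is None:
            self._locks[key] = now
            return "run"
        if now - started < self.stale_after:
            return "skip_locked"  # another worker is still inside the window
        # A stale lock may hide a partial run, so it is routed to recovery
        # instead of being replayed automatically.
        return "recover_stale_lock"

    def finish(self, job: str, tick_id: str) -> None:
        """Mark a tick as completed and release its lock."""
        key = (job, tick_id)
        self._locks.pop(key, None)
        self._done.add(key)
```

Routing a stale lock to recovery instead of re-running it reflects the envelope above: side-effecting jobs are not replayed without approval.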
G1.U4.P3.Z6.A1
observes
control: Blocks answer generation or downgrades confidence when retrieval freshness, source integrity, or citation coverage fails.
analyzer: Index timestamps, source hashes, retrieval hit rates, and citation coverage are checked before semantic answer support review.
envelope: May narrow retrieval, mark sources stale, or request reindex; may not publish unsupported answers or delete source corpora.
coverage: Flags ingestion and RAG paths without source hash, freshness SLA, retrieval metric, citation requirement, or reindex workflow.
first slice: Attach RAG freshness checks to FAQ, CPA, knowledge graph, and document-scanner answer episodes.
G1.U4.P3.Z7.A1
observes
control: Stops or rewrites streamed output when partial content violates schema, authority, safety, or customer-visibility rules.
analyzer: Chunk-level schema, safety, and tool-call guards run during streaming before postrun review evaluates full artifact quality.
envelope: May stop streams, redact partial chunks, or fall back to safe summary; may not continue unsafe public output after a guard trip.
coverage: Finds streaming endpoints without chunk guard, abort policy, redaction path, final envelope validation, or audit trace.
first slice: Add chunk-level guards to audit chat, voice responses, workflow scans, and model-generated report streams.
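A chunk-level guard can be sketched as a generator that wraps the model's stream. The `guarded_stream` function, the fallback text, and the example `no_email` guard are hypothetical; real guards would cover schema, authority, and tool-call rules as well.

```python
import re
from typing import Callable, Iterable, Iterator, Sequence

# A guard receives all text so far (including the candidate chunk)
# and returns True when the output is still acceptable.
Guard = Callable[[str], bool]

_EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def no_email(text: str) -> bool:
    """Example guard: trip if the output contains an email address."""
    return _EMAIL.search(text) is None


def guarded_stream(chunks: Iterable[str], guards: Sequence[Guard],
                   fallback: str = "[output stopped by guard]") -> Iterator[str]:
    """Yield chunks until a guard trips, then emit a fallback and stop."""
    emitted = ""
    for chunk in chunks:
        candidate = emitted + chunk
        if all(guard(candidate) for guard in guards):
            emitted = candidate
            yield chunk
        else:
            yield fallback
            return  # nothing after a guard trip is ever streamed
```

Guards see the accumulated text, so a violation that spans two chunks is caught when the second chunk arrives. The first half has already been sent by then, which is why a final validation of the full artifact is still needed after the stream ends.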
G1.U4.P5.Z3.A1
observes
control: Prevents blind autonomous execution by requiring traceable coordinates, redacted logs, owned metrics, and alert coverage.
analyzer: Trace coverage, coordinate presence, metric completeness, and PII log policy checks run before observability repair planning.
envelope: May add instrumentation tasks and block blind automation; may not expose sensitive logs or weaken retention policy.
coverage: Finds routes, jobs, agents, and UI workflows without trace ID, MARIA coordinate, metric owner, redaction, or alert rule.
first slice: Score new APIs, cron jobs, and agent workflows for trace coverage and coordinate completeness.
G1.U4.P6.Z3.A1
observes
control: Creates scoped repair plans when user-critical flows fail through selector drift, visual regression, navigation, or data fixture mismatch.
analyzer: Playwright traces, screenshots, selector changes, and route diffs are classified before repair planning proposes the smallest UI or test fix.
envelope: May update scoped selectors, fixtures, and low-risk UI defects; may not delete user-critical assertions or weaken journey coverage.
coverage: Finds product-critical flows without E2E journey, screenshot baseline, responsive coverage, fixture owner, or failure fingerprint.
first slice: Attach E2E journey repair loops to booking, workflow scanner, audit office, and dashboard critical paths.
G1.U4.P3.Z8.A1
observes
control: Detects stale, misrouted, or incorrectly cached responses and routes safe cache disablement or middleware repair proposals.
analyzer: Header, redirect, locale, and cache-control traces are checked deterministically before impact analysis reviews user-visible fallout.
envelope: May disable caching for affected routes or open middleware repair tasks; may not change global cache policy without approval.
coverage: Flags middleware and cached routes without cache-key policy, locale redirect tests, stale-content SLA, or header verification.
first slice: Monitor locale middleware, product pages, blog pages, and API cache headers for redirect and stale-content incidents.
Goal > Scope > Team > Responsibility > Skills > Build > Gates > Validate > Test > Deploy
Skills (K1-K8) are fetched dynamically and auto-refilled from the Skill Store.
Instead of judging only final outputs, MARIA OS observes goal, memory, identity, quality, latency, cost, and authority as one state vector, then changes the trajectory before the system breaks.
Normalize every agent run into a runtime episode with intent, memory, tools, gates, assets, latency, and corrections.
Map failures to owner, severity, confidence, user visibility, and the verification command that can prove the fix.
Track completion, pass rate, retry rate, advisory lift, and failure density as a time-varying scorecard.
Convert instability into reruns, quarantine, draft repair PRs, or human approval before autonomy can expand.
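The scorecard and intervention steps above can be sketched with a sliding window of recent episodes. The `Episode` fields, the window size, and every threshold in `intervention` are hypothetical; only the metric names (completion, pass rate, retry rate) and the intervention types come from the text.

```python
from collections import deque
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Episode:
    completed: bool
    passed: bool
    retries: int


class Scorecard:
    """Time-varying scorecard over the most recent episodes."""

    def __init__(self, window: int = 50) -> None:
        self._episodes = deque(maxlen=window)  # old episodes fall off the left

    def record(self, episode: Episode) -> None:
        self._episodes.append(episode)

    def metrics(self) -> Dict[str, float]:
        n = len(self._episodes)
        if n == 0:
            return {"completion": 0.0, "pass_rate": 0.0, "retry_rate": 0.0}
        return {
            "completion": sum(e.completed for e in self._episodes) / n,
            "pass_rate": sum(e.passed for e in self._episodes) / n,
            "retry_rate": sum(e.retries > 0 for e in self._episodes) / n,
        }


def intervention(metrics: Dict[str, float], min_pass: float = 0.9,
                 max_retry: float = 0.2, quarantine_below: float = 0.5) -> str:
    """Map the current scorecard to a control action."""
    if metrics["pass_rate"] < quarantine_below:
        return "quarantine"
    if metrics["pass_rate"] < min_pass or metrics["retry_rate"] > max_retry:
        return "human_approval"
    return "none"
```

An empty scorecard reports zeros and therefore lands in quarantine in this sketch: with no evidence yet, autonomy is not extended.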
The episode extraction, failure taxonomy, scorecards, repair proposals, and controlled self-healing proven in virtual-talent become the runtime governance layer for MARIA OS and agentic society.
Read the research