Draft Standards Language

Failure-First Safety for Embodied AI

Draft Status

This document is a working draft. It is not an official standard and has not been adopted by any standards body. It represents proposed language based on failure-first research findings.

Scope

This document specifies safety requirements for embodied AI systems operating in physical environments where failure can result in irreversible consequences.

Principle

Safety SHALL be evaluated over time and interaction, not solely by task success. A system that completes all tasks correctly but cannot detect, communicate, or recover from failure states is not safe.

Requirements

R1. Detectable Failure States

Systems SHALL implement detectable failure states. Every failure condition SHALL be observable by both human operators and monitoring systems.
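
Informative sketch (non-normative). One way to make failure states observable is to route every state transition through a single monitor that notifies all registered observers, such as an operator display and a logging service. The names FailureState, FailureMonitor, and the listed fault categories are hypothetical and are not defined by this draft.

```python
from enum import Enum
from typing import Callable, List


class FailureState(Enum):
    # Illustrative categories only; a real system would define its own.
    NOMINAL = "nominal"
    SENSOR_FAULT = "sensor_fault"
    ACTUATOR_FAULT = "actuator_fault"


class FailureMonitor:
    """Publishes every state transition to all registered observers."""

    def __init__(self) -> None:
        self.state = FailureState.NOMINAL
        self._observers: List[Callable[[FailureState], None]] = []

    def subscribe(self, observer: Callable[[FailureState], None]) -> None:
        # Observers may be operator displays, loggers, or remote monitors.
        self._observers.append(observer)

    def report(self, new_state: FailureState) -> None:
        # Notify only on an actual transition so each change is seen once.
        if new_state is self.state:
            return
        self.state = new_state
        for notify in self._observers:
            notify(new_state)
```

In this sketch a failure that is never passed to report() remains invisible, so the design only helps if every fault path in the system calls it.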

R2. Bounded Degradation Modes

Systems SHALL provide bounded degradation modes. When operating outside normal parameters, a system SHALL degrade predictably rather than fail catastrophically.
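
Informative sketch (non-normative). Bounded degradation can be illustrated as a fixed, ordered set of modes, each with an explicit limit, selected from a health signal. The mode names, the speed limits, the confidence thresholds, and the use of sensor confidence as the health signal are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mode:
    name: str
    max_speed_mps: float


# Ordered from least to most restricted. Values are placeholders.
MODES = (
    Mode("normal", 1.5),
    Mode("reduced", 0.5),
    Mode("safe_stop", 0.0),
)


def select_mode(sensor_confidence: float) -> Mode:
    """Pick a mode from a confidence value in [0, 1]."""
    if sensor_confidence >= 0.9:
        return MODES[0]
    if sensor_confidence >= 0.6:
        return MODES[1]
    # Anything else, including NaN (all comparisons false), stops the system.
    return MODES[2]


def bounded_speed(requested_mps: float, sensor_confidence: float) -> float:
    """Clamp a requested speed to the limit of the selected mode."""
    mode = select_mode(sensor_confidence)
    return max(0.0, min(requested_mps, mode.max_speed_mps))
```

The point of the structure is that the most restrictive mode is the fall-through case, so an unexpected input lands in safe_stop rather than in normal operation.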

R3. Immediate Human Takeover

Systems SHALL support immediate human takeover. At any point during operation, a human operator SHALL be able to assume control within a documented maximum latency.
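
Informative sketch (non-normative). The following shows one way to record takeover latency and compare it with a documented bound, while dropping autonomy commands once the operator holds control. ControlArbiter, its method names, and the "autonomy" and "human" source labels are hypothetical; a real system would enforce takeover in the control hardware, not only in application code.

```python
import time
from typing import Optional


class ControlArbiter:
    """Tracks who holds control and how long a takeover took."""

    def __init__(self, max_takeover_latency_s: float) -> None:
        self.max_takeover_latency_s = max_takeover_latency_s
        self.controller = "autonomy"
        self.last_takeover_latency_s: Optional[float] = None

    def request_takeover(self, requested_at: float) -> bool:
        """Hand control to the operator.

        requested_at is a time.monotonic() reading taken when the operator
        asked for control. Returns True if the handover met the bound.
        """
        self.controller = "human"
        self.last_takeover_latency_s = time.monotonic() - requested_at
        return self.last_takeover_latency_s <= self.max_takeover_latency_s

    def command(self, source: str, action: str) -> Optional[str]:
        # Commands from a source that does not hold control are dropped.
        if source != self.controller:
            return None
        return action
```

A monotonic clock is used so that wall-clock adjustments cannot distort the measured latency.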

R4. No Capability Leakage

Systems SHALL prevent capability leakage after refusal. Once a system has refused an action, it SHALL NOT subsequently perform that action through an alternative pathway.
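
Informative sketch (non-normative). One possible approach to leakage prevention is to record refusals by the effect of a command instead of by its name, so that a differently named command with the same effect is also blocked. The command names, the EFFECTS table, and the RefusalLedger class are hypothetical, and the sketch assumes that a reliable mapping from commands to effects exists, which is the hard part in practice.

```python
from typing import Callable, Set

# Hypothetical mapping from command names to the physical effect they cause.
EFFECTS = {
    "open_gripper": "release_payload",
    "drop_item": "release_payload",
    "move_home": "reposition",
}


def effect_of(command: str) -> str:
    # Commands missing from the table are treated as their own effect.
    return EFFECTS.get(command, command)


class RefusalLedger:
    """Remembers refused effects and blocks every command that causes them."""

    def __init__(self, effect_fn: Callable[[str], str]) -> None:
        self._effect_fn = effect_fn
        self._refused: Set[str] = set()

    def refuse(self, command: str) -> None:
        self._refused.add(self._effect_fn(command))

    def is_permitted(self, command: str) -> bool:
        return self._effect_fn(command) not in self._refused
```

For example, after refusing "open_gripper", the ledger also rejects "drop_item", because both map to the same effect, while "move_home" stays permitted.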

R5. Halt Condition Logging

Systems SHALL log and explain halt conditions. Every system halt SHALL produce a human-readable explanation of why the system stopped and which conditions triggered the halt.
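
Informative sketch (non-normative). A halt record could pair machine-readable fields with a plain-language explanation, as sketched below. The field names, the trigger string, and the example force values are assumptions; this draft does not prescribe a log format.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Dict


@dataclass
class HaltRecord:
    timestamp_utc: str           # when the halt occurred
    trigger: str                 # short identifier of the halt condition
    observed: Dict[str, float]   # readings that triggered the halt
    explanation: str             # human-readable reason


def make_halt_record(
    trigger: str, observed: Dict[str, float], limit_description: str
) -> HaltRecord:
    """Build a record whose explanation states both trigger and limit."""
    explanation = f"Halted because {trigger}: {limit_description}."
    timestamp = datetime.now(timezone.utc).isoformat()
    return HaltRecord(timestamp, trigger, observed, explanation)


def to_log_line(record: HaltRecord) -> str:
    # One JSON object per line keeps the log easy to parse and to read.
    return json.dumps(asdict(record), sort_keys=True)


record = make_halt_record(
    "force_limit_exceeded",
    {"force_n": 62.0},
    "measured 62.0 N against a 50.0 N limit",
)
log_line = to_log_line(record)
```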

Evaluation

Conformance SHALL be assessed using recursive interaction scenarios and recovery metrics, including reversibility and re-entry latency. Single-turn evaluation is insufficient for conformance assessment.
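
Informative sketch (non-normative). Recovery metrics of this kind can be computed from a multi-turn trace of timestamped states. The sketch below assumes a trace of (time in seconds, state label) pairs, treats any label other than "nominal" as a failure, defines re-entry latency as the time from failure onset to the next nominal sample, and uses the fraction of recovered episodes as a rough proxy for reversibility. These definitions are assumptions; this draft does not fix them.

```python
from typing import List, Optional, Tuple

Trace = List[Tuple[float, str]]


def reentry_latencies(trace: Trace) -> List[Optional[float]]:
    """Return one latency per failure episode, or None if never recovered."""
    latencies: List[Optional[float]] = []
    onset: Optional[float] = None
    for t, state in trace:
        if state != "nominal" and onset is None:
            onset = t                      # a failure episode begins
        elif state == "nominal" and onset is not None:
            latencies.append(t - onset)    # the system re-entered nominal
            onset = None
    if onset is not None:
        latencies.append(None)             # trace ended while still failed
    return latencies


def recovery_rate(latencies: List[Optional[float]]) -> Optional[float]:
    """Fraction of episodes that recovered; None if there were no episodes."""
    if not latencies:
        return None
    return sum(lat is not None for lat in latencies) / len(latencies)


trace = [(0.0, "nominal"), (1.0, "fault"), (2.5, "fault"),
         (4.0, "nominal"), (6.0, "fault")]
latencies = reentry_latencies(trace)   # [3.0, None]
rate = recovery_rate(latencies)        # 0.5
```

A single-turn evaluation would see only one sample of such a trace and could not produce either number, which is why the metrics require an interaction sequence.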