Comprehensive Scenario Classes Reference

Browsable reference for all 661 scenario classes and 117 harm categories in the Failure-First Embodied AI taxonomy

taxonomy Last updated: February 6, 2026

Comprehensive Scenario Classes Reference

This document provides a browsable reference for all failure modes and harm categories covered in the project. The complete taxonomy includes 661 scenario classes organized by domain.

1. Taxonomy Overview

Scenario classes represent specific, context-aware failure patterns discovered in our datasets. They are organized into these major domains:

  • embodied_redteam (581 classes) - Core embodied AI failure patterns
  • adversarial_poetry_redteam (30 classes) - Poetic/creative attack vectors
  • jailbreak_archaeology (multiple eras) - Historical jailbreak patterns adapted for embodiment
  • multi_agent - Multi-actor conflict scenarios
  • episodes - Stateful multi-turn degradation patterns

2. Major Scenario Class Categories

2.1 Embodied Redteam Classes (Sample)

Key categories within the 581 embodied_redteam classes include:

  • Academic_Research_Exemption - Exploiting research context claims
  • Adversarial_Dataset_Generation - Meta-level jailbreak generation requests
  • Authority_Conflict - Multi-authority contradictions
  • Cognitive_Load_Exploitation - Overwhelming agent reasoning capacity
  • Comparative_Distinction_Leakage - Eliciting information through comparison
  • Constraint_Erosion - Gradual relaxation of safety boundaries
  • Embodied_Mirror_Neuron_Hijack - Exploiting imitation behaviors
  • Format_Lock - Forcing strict output formats to bypass refusals
  • Future_Year_Laundering - Temporal displacement attacks
  • Goal_Drift - Incremental objective shifts
  • Incremental_Erosion - Multi-turn safety degradation
  • Intent_Bait - High-precision instruction-hierarchy subversion
  • Memory_Poisoning - Adversarial context injection
  • Persona_Hijack - Role-based safety bypass
  • Physical_Boundary_Violation - Spatial constraint exploitation
  • Post_Refusal_Leakage - Extracting information after initial refusal
  • Reasoning_Exploit - CoT manipulation in reasoning models
  • Refusal_Suppression - Explicit anti-refusal instructions
  • Social_Shielding - Using human presence to prevent safety interventions
  • Temporal_Displacement - Future/past context manipulation
  • Workspace_Intrusion - Restricted zone access attempts

2.2 Adversarial Poetry Classes

Creative attack vectors using poetic forms:

  • Memory Corruption through Repetition
  • Syllabic Safety Bypass
  • Visual Command Exploitation
  • Recursive Command Echo
  • Sensor Fusion Corruption through Syllable Patterns
  • Navigation Distortion
  • Social Engineering Cascade

2.3 Jailbreak Archaeology Eras

Historical jailbreak techniques organized by era:

  • dan_2022 - Original DAN (Do Anything Now) persona variants
  • persona_2022 - Early persona-based jailbreaks
  • cipher_2023 - Encoding-based evasion (Base64, ROT13, leetspeak)
  • many_shot_2024 - Context window saturation attacks
  • crescendo_2024 - Incremental multi-turn erosion
  • reasoning_2025 - Chain-of-thought manipulation for reasoning models

3. Harm Class Taxonomy (117 Classes)

All scenario classes map to one of 117 harm categories, which in turn map to the MLCommons AILuminate v1.0 taxonomy (12 categories across 3 domains).

AILuminate Domain Distribution

DomainAILuminate CategoryCoverage
Physical HazardsViolent Crimes9.7%
Physical HazardsIndiscriminate Weapons (CBRN)4.5%
Physical HazardsSuicide and Self-Harm2.7%
Nonphysical HazardsNonviolent Crimes21.8%
Nonphysical HazardsHate12.0%
Nonphysical HazardsDefamation5.8%
Nonphysical HazardsChild Sexual Exploitation1.5%
Nonphysical HazardsSex-Related Crimes1.2%
Contextual HazardsSpecialized Advice12.0%
Contextual HazardsSexual Content4.5%
Contextual HazardsPrivacy3.0%
Contextual HazardsIntellectual Property1.6%
(unmapped)Mixed/needs annotation19.6%

4. Using This Taxonomy

For Researchers

  • Dataset Selection: Choose datasets based on the scenario classes you want to test
  • Failure Analysis: Classify observed failures using this taxonomy
  • Coverage Analysis: Identify gaps in your evaluation suite

For Developers

  • Safety Testing: Build comprehensive test suites covering multiple classes
  • Failure Monitoring: Categorize production failures for trend analysis
  • Defense Design: Prioritize defenses based on high-risk scenario classes

For Policymakers

  • Risk Assessment: Understand the breadth of potential failure modes
  • Standards Development: Reference this taxonomy when creating safety standards
  • Compliance: Map regulatory requirements to specific scenario classes

5. Accessing the Full Taxonomy

The complete taxonomy with all 661 scenario classes is available in the research datasets. Key interfaces:

  • Dataset Files: JSONL files with scenario_class field
  • Database Queries: SQL queries against the jailbreak corpus database
  • Programmatic Access: Python tools for taxonomy traversal and analysis

6. Taxonomy Evolution

This taxonomy is continuously expanded through:

  • Automated Discovery: Pattern-mining from model responses
  • Manual Curation: Expert review and classification
  • Community Contributions: Researcher submissions of novel failure patterns
  • Cross-Dataset Integration: Importing patterns from public benchmarks

Version history and change logs track taxonomy evolution over time.



Note: The complete 755-class taxonomy with example IDs and detailed descriptions is available in the research datasets. This web reference provides the organizational structure and key categories for navigation and understanding.