Lab & Completion | SafeOps Academy

Lab Scenarios

Chaos Triage: Trigger a controlled chaos drill (from Chapter 12). Observe how the Guardian captures and normalizes the resulting alerts into a single incident record.
Confidence Analysis: Analyze an incident for which the Guardian reports a low confidence score (< 0.7). Explain why the provided context was insufficient for a high-confidence hypothesis.
Escalation Drill: Trigger the same failure three times within an hour. Verify that the Guardian correctly transitions the incident from Fresh to Recurring to Persistent.
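
To reason about what the Escalation Drill should show, the following is a minimal Python sketch of the Fresh, Recurring, Persistent progression. It is not the Guardian's implementation: the function name, the one-hour window constant, and the occurrence counts that trigger each state (1, 2, 3) are assumptions taken from the drill description.

```python
from datetime import datetime, timedelta

# Assumption: occurrences are counted inside a rolling one-hour window.
WINDOW = timedelta(hours=1)


def escalation_state(occurrences, now):
    """Return the assumed escalation state for one failure signature.

    occurrences: list of datetimes at which the failure was observed.
    now: the datetime at which the state is evaluated.
    """
    recent = [t for t in occurrences if timedelta(0) <= now - t <= WINDOW]
    if len(recent) >= 3:   # assumed threshold for Persistent
        return "Persistent"
    if len(recent) == 2:   # assumed threshold for Recurring
        return "Recurring"
    return "Fresh"


# Simulate the drill: the same failure at 0, 20, and 40 minutes.
start = datetime(2024, 1, 1, 12, 0)
seen = []
states = []
for offset_min in (0, 20, 40):
    seen.append(start + timedelta(minutes=offset_min))
    states.append(escalation_state(seen, seen[-1]))

print(states)
```

If the incident record in your lab does not follow this sequence, compare the timestamps of your three triggers against the window the Guardian actually uses.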

Core Exercises (Required)

View Incidents: Port-forward to the k8s-ai-monitor service and use curl to view the current list of active incidents.
Audit Sanitization: Check the Guardian’s logs for “redacted” entries during a triage event. Identify which specific fields were removed.
Acknowledge & Resolve: Use the Guardian’s API or CLI to acknowledge an incident, perform a manual fix, and then resolve the incident record.
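
For the Audit Sanitization exercise, it can help to have a mental model of what key-based redaction looks like. The sketch below is illustrative only: the `sanitize` function, the key pattern, and the `[REDACTED]` placeholder are assumptions, not the Guardian's actual sanitizer or log format.

```python
import re

# Assumption: a field is sensitive if its key matches one of these words.
SENSITIVE_KEY = re.compile(
    r"(password|passwd|secret|token|api[_-]?key|authorization)",
    re.IGNORECASE,
)


def sanitize(context):
    """Return (sanitized_copy, redacted_paths) without mutating the input."""
    redacted = []

    def walk(value, path):
        if isinstance(value, dict):
            out = {}
            for key, item in value.items():
                child = f"{path}.{key}" if path else key
                if SENSITIVE_KEY.search(key):
                    out[key] = "[REDACTED]"   # placeholder is hypothetical
                    redacted.append(child)
                else:
                    out[key] = walk(item, child)
            return out
        if isinstance(value, list):
            return [walk(item, f"{path}[{i}]") for i, item in enumerate(value)]
        return value

    return walk(context, ""), redacted


# Hypothetical triage context with two secret-bearing fields.
sample = {
    "pod": "api-7c9",
    "env": {"DB_PASSWORD": "hunter2", "LOG_LEVEL": "debug"},
    "headers": [{"Authorization": "Bearer abc"}],
}
clean, fields = sanitize(sample)
print(fields)
```

The list of redacted paths is the kind of evidence the exercise asks you to find in the logs: which specific fields were withheld before the context reached the LLM.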

Challenge Exercise (Optional)

Custom Alert Routing Rule: Configure a new alert routing rule in k8s-ai-monitor for a custom metric threshold. Verify that the Guardian correctly normalizes, enriches, and routes the alert through the expected channel.
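
Before writing the rule, it may be useful to state the expected behavior as a small model. This Python sketch is not k8s-ai-monitor's configuration format: `RoutingRule`, `route`, the metric name, and the channel names are all hypothetical and only illustrate a threshold rule selecting a channel.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RoutingRule:
    metric: str       # hypothetical custom metric name
    threshold: float  # alert is routed when the value exceeds this
    channel: str      # hypothetical destination channel


def route(alert, rules, default_channel="default"):
    """Return the channel of the first matching rule, else the default."""
    for rule in rules:
        if alert["metric"] == rule.metric and alert["value"] > rule.threshold:
            return rule.channel
    return default_channel


rules = [RoutingRule(metric="queue_depth", threshold=100.0, channel="oncall-queue")]
above = route({"metric": "queue_depth", "value": 150.0}, rules)
below = route({"metric": "queue_depth", "value": 50.0}, rules)
print(above, below)
```

Testing one value above and one below the threshold, as here, is a simple way to verify that your real rule routes only the alerts you intend.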

Done When

You have completed this chapter when:

The Guardian has captured and analyzed at least one controlled chaos scenario.
You can explain why the Guardian has a read-only RBAC boundary.
You have successfully acknowledged and resolved an incident via the API or CLI.
You can demonstrate an escalation scenario (Recurring or Persistent).
You understand the importance of redacting secrets before LLM analysis.

Knowledge Check

Before finishing this chapter, complete the Quiz to verify your understanding of the guardrail principles.
