The Incident: Lateral Movement

Incident Hook

A debug pod in the develop namespace reaches internal services it should never touch. No sophisticated exploit is needed, only open east-west (pod-to-pod) traffic: with no NetworkPolicy in place, Kubernetes allows every pod to talk to every other pod. When the incident starts, responders cannot quickly prove or limit the blast radius because the cluster has no baseline network policy.

Result: A small compromise in a non-production environment becomes a major security risk because an attacker can move laterally across the cluster.

Observed Symptoms

What the team sees first:

A pod in develop can connect to services outside its intended boundary.
Responders cannot quickly answer what is reachable and what is not.

Proof of Lateral Movement:

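# Executing from a simple debug pod in 'develop'
# Port 5432 speaks the PostgreSQL wire protocol, not HTTP, so probe the TCP port itself.
# (Assumes the pod image ships a netcat build that supports -z and -v.)
kubectl exec -it debug-pod -n develop -- nc -zv payment-db.production.svc.cluster.local 5432

# Response (exact wording depends on the netcat build): the port is reported as open.
# ❌ CRITICAL: The connection succeeded.
# Dev pods should NEVER reach Production DBs.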

The issue is not just one bad connection; it is the absence of a trustworthy traffic model.
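One rough way to see how far the missing traffic model extends is to list the namespaces that contain no NetworkPolicy objects at all. The sketch below is a minimal illustration, not a complete audit: the function names `unprotected_namespaces` and `audit_cluster` are hypothetical, and it assumes `kubectl` is already pointed at the affected cluster. A namespace that does own a policy may still leave some pods unselected, so treat the output as a lower bound.

```shell
# Pure helper (hypothetical name): prints every namespace from the first
# newline-separated list that does not appear in the second list.
#   $1 = all namespaces
#   $2 = namespaces that own at least one NetworkPolicy
unprotected_namespaces() {
  printf '%s\n' "$1" | sort -u | while IFS= read -r ns; do
    # Skip blank lines produced by empty input.
    [ -n "$ns" ] || continue
    # -x matches the whole line, -F treats the name as a fixed string.
    if ! printf '%s\n' "$2" | grep -qxF -- "$ns"; then
      printf '%s\n' "$ns"
    fi
  done
}

# Cluster wrapper (hypothetical name): gathers both lists with kubectl
# and feeds them to the helper above. Defined here but not called.
audit_cluster() {
  all_ns=$(kubectl get namespaces \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\n"}{end}')
  policy_ns=$(kubectl get networkpolicy --all-namespaces \
    -o jsonpath='{range .items[*]}{.metadata.namespace}{"\n"}{end}')
  unprotected_namespaces "$all_ns" "$policy_ns"
}
```

Running `audit_cluster` prints one namespace per line. In the scenario described here, both develop and production would be expected to appear.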

Confusion Phase

Without policies, every connectivity question becomes manual investigative work. The team now has to discover:

Which paths are legitimately required?
Which paths are accidental exposures?
How to contain the pod without breaking the entire namespace?
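For the containment question, one possible approach is a label-scoped policy that isolates only the suspect pod. This is a sketch under stated assumptions rather than the procedure this incident actually followed. It assumes the cluster's network plugin enforces NetworkPolicy and that no other policy selects the pod, since policies are additive and an existing allow rule would still apply. The label `quarantine=true`, the policy name `quarantine-deny-all`, and the function name `quarantine_policy` are hypothetical.

```shell
# Prints a NetworkPolicy manifest for the given namespace ($1).
# The policy selects only pods labeled quarantine=true and declares both
# policy types with no allow rules, so selected pods get no ingress or egress.
quarantine_policy() {
  ns="$1"
  cat <<EOF
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: quarantine-deny-all
  namespace: ${ns}
spec:
  podSelector:
    matchLabels:
      quarantine: "true"
  policyTypes:
    - Ingress
    - Egress
EOF
}

# Usage against a live cluster (shown as comments, not executed here):
#   quarantine_policy develop | kubectl apply -f -
#   kubectl label pod debug-pod -n develop quarantine=true
```

Because the selector matches a single label, other pods in develop are unaffected, and removing the label lifts the isolation. Connections that were already established may or may not be cut, depending on the network plugin.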

What AI Would Propose (Brave Junior):

“Skip policies for now to avoid breaking traffic.”
“We can secure networking later, after the release.”
“Use broad allow rules (like 0.0.0.0/0) to get the app running quickly.”

Pause and Predict: Before reading the investigation, write down your top 3 hypotheses. What would you check first?
