Incident Hook
A debug pod in the develop namespace reaches internal services it should never touch. No sophisticated exploit is needed, only open east-west traffic. When the incident starts, responders cannot quickly determine or contain the blast radius because the cluster has no baseline network policy.
Result: A small compromise in a non-production environment becomes a major security risk because an attacker can move laterally across the cluster.
Observed Symptoms
What the team sees first:
- A pod in develop can connect to services outside its intended boundary.
- Responders cannot quickly answer what is reachable and what is not.
Proof of Lateral Movement:
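# Executing from a simple debug pod in 'develop'
# nc tests the raw TCP connection (assumes the pod image ships nc; a database port does not answer HTTP)
kubectl exec -it debug-pod -n develop -- nc -zv -w 3 payment-db.production.svc.cluster.local 5432
# Result: the TCP connection opens and nc exits with status 0.
# ❌ CRITICAL: The connection succeeded.
# Dev pods should NEVER reach Production DBs.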
The issue is not just one bad connection; it is the absence of a trustworthy traffic model.
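A single probe only proves one path. As a rough sketch of how responders might turn it into a quick reachability table, the helper below loops over a list of host:port targets and records which ones accept a TCP connection. The function names probe and reachability_report are hypothetical, the pod name and namespace are taken from the example above, and it assumes the pod image ships an nc that supports -z and -w.

```shell
# Sketch only: names and targets are illustrative assumptions.

# probe HOST PORT: succeeds if a TCP connection opens within 3 seconds.
# Assumes debug-pod in 'develop' has an nc binary supporting -z and -w.
probe() {
  kubectl exec debug-pod -n develop -- nc -z -w 3 "$1" "$2" >/dev/null 2>&1
}

# reachability_report TARGET...: each TARGET is written as host:port.
# Prints one line per target, marked REACHABLE or BLOCKED.
reachability_report() {
  for target in "$@"; do
    host="${target%:*}"    # everything before the last colon
    port="${target##*:}"   # everything after the last colon
    if probe "$host" "$port"; then
      printf 'REACHABLE  %s\n' "$target"
    else
      printf 'BLOCKED    %s\n' "$target"
    fi
  done
}

# Example call (the target is the service from the incident):
# reachability_report payment-db.production.svc.cluster.local:5432
```

A table like this is a snapshot, not a traffic model: it shows what is open right now, not which paths are meant to be open.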
Confusion Phase
Without policies, every connectivity question becomes manual investigative work. The team now has to discover:
- Which paths are legitimately required?
- Which paths are accidental exposures?
- How to contain the pod without breaking the entire namespace?
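Before working through those questions, it can help to confirm which namespaces have no NetworkPolicy objects at all. The sketch below is one possible way to list them; the function name namespaces_without_policies is hypothetical, and it assumes the caller has permission to list namespaces and network policies cluster-wide.

```shell
# Sketch only: lists namespaces that contain no NetworkPolicy objects.
# Assumes cluster-wide read access to namespaces and networkpolicies.
namespaces_without_policies() {
  # The first column of the --all-namespaces listing is the namespace.
  covered="$(kubectl get networkpolicy --all-namespaces --no-headers 2>/dev/null \
    | awk '{print $1}' | sort -u)"

  # 'kubectl get namespaces -o name' prints lines such as namespace/develop.
  kubectl get namespaces -o name | sed 's|^namespace/||' | while read -r ns; do
    # Print the namespace only if it is absent from the covered list.
    if ! printf '%s\n' "$covered" | grep -qx -- "$ns"; then
      printf '%s\n' "$ns"
    fi
  done
}

# Example call:
# namespaces_without_policies
```

A namespace that does have policies is not necessarily safe; this only flags the ones with no policy at all, which is where unrestricted east-west traffic is guaranteed.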
What AI Would Propose (Brave Junior):
- “Skip policies for now to avoid breaking traffic.”
- “We can secure networking later, after the release.”
- “Use broad allow rules (like 0.0.0.0/0) to get the app running quickly.”
Pause and Predict: Before reading the investigation, write down your top 3 hypotheses. What would you check first?