Chapter 07: Security Context & Pod Hardening

Incident Hook

A container compromise gives an attacker shell access inside a pod. If the pod runs with broad privileges, escalation is fast. If the security context is hardened, the attacker's movement is constrained. This chapter teaches those constraints as default behavior.

Observed Symptoms

What the team sees first:

a shell exists inside a compromised container
the pod may be able to write broadly or escalate privileges
responders need to know whether the workload is hardened or soft by default

The difference between inconvenience and incident expansion is often the security context.

Confusion Phase

When the workload is failing, broad privilege shortcuts feel tempting. That is exactly when teams blur debugging with escalation risk.

The real question is:

Does the app need a specific writable path?
Or is the team about to grant root-like power because it is faster?

Why This Chapter Exists

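Container defaults are not production-safe: unless the image or manifest says otherwise, a container typically runs as root, with a writable root filesystem and a default set of Linux capabilities. This chapter enforces baseline pod hardening: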

non-root execution
read-only root filesystem where possible
dropped Linux capabilities
runtime-default seccomp

What AI Would Propose (Brave Junior)

“Set privileged: true just for debugging.”
“Disable readOnlyRootFilesystem to make tools work quickly.”
“Run as root for this one release.”

Why this sounds reasonable:

fixes permissions fast
removes immediate deploy friction

Why This Is Dangerous

Privilege shortcuts create direct escalation paths.
Root plus a writable filesystem increases persistence options after compromise.
Temporary relaxations are often never rolled back.

Investigation

Treat the manifest as evidence of blast radius.

Safe investigation sequence:

inspect pod and container security context fields
confirm whether the workload runs non-root and with dropped capabilities
identify the exact path or permission the app actually needs
fix the narrow requirement instead of widening the whole container privilege model

Containment

Containment keeps the baseline intact:

preserve non-root execution
add only explicit writable volumes where needed
keep allowPrivilegeEscalation: false
verify the workload still runs with the hardened baseline after the fix
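
The containment steps above can be sketched as a manifest fragment. This is a minimal illustration, not the SafeOps manifest itself: the path /var/run/app, the volume name run-dir, and the 16Mi size limit are assumptions chosen for the example.

```yaml
# Fragment of a pod template (spec.template.spec in a Deployment).
# Assumption: the app fails because it needs to write to /var/run/app
# (hypothetical path). Only that path becomes writable.
containers:
- name: backend
  securityContext:
    allowPrivilegeEscalation: false   # unchanged by the fix
    readOnlyRootFilesystem: true      # unchanged by the fix
    runAsNonRoot: true                # unchanged by the fix
    capabilities:
      drop:
        - ALL
  volumeMounts:
  - name: run-dir                     # hypothetical volume name
    mountPath: /var/run/app           # the one path the app needs
volumes:
- name: run-dir
  emptyDir:
    sizeLimit: 16Mi                   # illustrative cap on scratch space
```

The only lines that differ from the hardened baseline are the volume and its mount, so the review diff stays small and the privilege model is untouched.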

Guardrails That Stop It

runAsNonRoot: true for pod and container.
allowPrivilegeEscalation: false.
capabilities.drop: [ALL].
seccompProfile.type: RuntimeDefault.
writable paths only via explicit volumes (/tmp, runtime/cache dirs).

Golden Baseline vs Insecure Diff

Secure baseline (minimum), written as a container-level securityContext:

securityContext:
  runAsNonRoot: true
  allowPrivilegeEscalation: false
  readOnlyRootFilesystem: true
  capabilities:
    drop: ["ALL"]
  seccompProfile:
    type: RuntimeDefault

Insecure anti-pattern:

runAsUser: 0 / root by default
privileged: true
writable root filesystem with broad capabilities
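
For a side-by-side diff against the secure baseline, the anti-pattern can be written out as a container-level securityContext. This is a sketch for comparison only; the specific capabilities listed are illustrative assumptions, not taken from any real workload.

```yaml
# ANTI-PATTERN, for comparison only. Do not deploy.
# Container-level securityContext that inverts every baseline setting.
securityContext:
  runAsUser: 0                       # runs as root
  privileged: true                   # near host-level access to devices and kernel features
  allowPrivilegeEscalation: true     # setuid binaries can gain more privileges
  readOnlyRootFilesystem: false      # attacker can drop tools and persist
  capabilities:
    add: ["SYS_ADMIN", "NET_ADMIN"]  # illustrative broad capabilities
```

A namespace that enforces the restricted Pod Security Standard is expected to reject a pod carrying this context at admission.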

Investigation Snapshots

Here is the hardened backend deployment used in the SafeOps system. This is what non-root execution, read-only filesystems, and dropped capabilities look like in a real workload.

Hardened backend deployment

---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: backend
  labels:
    app: backend
    app.kubernetes.io/name: backend
    app.kubernetes.io/component: api
spec:
  replicas: 1
  revisionHistoryLimit: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 0
      maxSurge: 1
  selector:
    matchLabels:
      app: backend
  template:
    metadata:
      labels:
        app: backend
        app.kubernetes.io/name: backend
        app.kubernetes.io/component: api
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "8080"
        prometheus.io/path: "/metrics"
    spec:
      imagePullSecrets:
      - name: ghcr-credentials-docker
      securityContext:
        runAsNonRoot: true
        runAsUser: 10001
        runAsGroup: 10001
        fsGroup: 10001
        seccompProfile:
          type: RuntimeDefault
      containers:
      - name: backend
        image: ghcr.io/ldbl/backend:latest
        imagePullPolicy: IfNotPresent
        securityContext:
          allowPrivilegeEscalation: false
          readOnlyRootFilesystem: true
          runAsNonRoot: true
          runAsUser: 10001
          runAsGroup: 10001
          capabilities:
            drop:
              - ALL
        ports:
        - containerPort: 8080
          name: http
          protocol: TCP
        env:
        - name: PORT
          value: "8080"
        - name: NAMESPACE
          value: "${NAMESPACE}"
        - name: ENVIRONMENT
          value: "${ENVIRONMENT}"
        - name: LOG_LEVEL
          value: "${LOG_LEVEL}"
        - name: SERVICE_NAME
          value: "backend"
        - name: SERVICE_VERSION
          value: "v1.0.0"
        - name: DEPLOYMENT_ENVIRONMENT
          value: "${ENVIRONMENT}"
        - name: OTEL_RESOURCE_ATTRIBUTES
          value: "k8s.cluster.name=${cluster_name}"
        - name: UPTRACE_DSN
          valueFrom:
            secretKeyRef:
              name: backend-secrets
              key: uptrace-dsn
        - name: OTEL_EXPORTER_OTLP_HEADERS
          valueFrom:
            secretKeyRef:
              name: backend-secrets
              key: uptrace-headers
        - name: JWT_SECRET
          valueFrom:
            secretKeyRef:
              name: backend-secrets
              key: jwt-secret
        - name: POSTGRES_USER
          valueFrom:
            secretKeyRef:
              name: app-postgres-app
              key: username
        - name: POSTGRES_PASSWORD
          valueFrom:
            secretKeyRef:
              name: app-postgres-app
              key: password
        - name: POSTGRES_HOST
          value: app-postgres-rw
        - name: POSTGRES_DB
          value: app
        livenessProbe:
          httpGet:
            path: /livez
            port: http
          initialDelaySeconds: 15
          periodSeconds: 20
          timeoutSeconds: 5
          failureThreshold: 3
        readinessProbe:
          httpGet:
            path: /readyz
            port: http
          initialDelaySeconds: 5
          periodSeconds: 10
          timeoutSeconds: 3
          failureThreshold: 3
        startupProbe:
          httpGet:
            path: /healthz
            port: http
          initialDelaySeconds: 0
          periodSeconds: 5
          timeoutSeconds: 3
          failureThreshold: 30
        resources:
          requests:
            cpu: 10m
            memory: 32Mi
            ephemeral-storage: 64Mi
          limits:
            cpu: 100m
            memory: 128Mi
            ephemeral-storage: 128Mi
        volumeMounts:
        - name: tmp
          mountPath: /tmp
        - name: cache
          mountPath: /home/app/.cache
      volumes:
      - name: tmp
        emptyDir: {}
      - name: cache
        emptyDir:
          sizeLimit: 10Mi

Here is the namespace baseline that the workload inherits from.

Namespace security baseline

---
apiVersion: v1
kind: Namespace
metadata:
  name: cert-manager
  labels:
    name: cert-manager
    managed-by: flux
---
apiVersion: v1
kind: Namespace
metadata:
  name: develop
  labels:
    name: develop
    environment: development
    managed-by: flux
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/enforce-version: latest
    pod-security.kubernetes.io/warn: restricted
    pod-security.kubernetes.io/warn-version: latest
---
apiVersion: v1
kind: Namespace
metadata:
  name: staging
  labels:
    name: staging
    environment: staging
    managed-by: flux
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/enforce-version: latest
    pod-security.kubernetes.io/warn: restricted
    pod-security.kubernetes.io/warn-version: latest
---
apiVersion: v1
kind: Namespace
metadata:
  name: production
  labels:
    name: production
    environment: production
    managed-by: flux
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/enforce-version: latest
    pod-security.kubernetes.io/warn: restricted
    pod-security.kubernetes.io/warn-version: latest
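
In the baseline above, develop, staging, and production enforce the restricted profile directly, while cert-manager carries no Pod Security labels. For a namespace that is not yet ready for restricted enforcement, one possible intermediate step is to enforce the baseline profile and surface restricted violations through the warn and audit modes. The namespace name sandbox is hypothetical.

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: sandbox                      # hypothetical namespace, not part of SafeOps
  labels:
    name: sandbox
    # Block only the clearly dangerous settings for now.
    pod-security.kubernetes.io/enforce: baseline
    pod-security.kubernetes.io/enforce-version: latest
    # Report what restricted would reject, without blocking.
    pod-security.kubernetes.io/warn: restricted
    pod-security.kubernetes.io/warn-version: latest
    pod-security.kubernetes.io/audit: restricted
    pod-security.kubernetes.io/audit-version: latest
```

Once the warnings stop appearing for the workloads in that namespace, the enforce label can be raised to restricted to match the other environments.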

System Context

This chapter hardens what network policy alone cannot stop.

It supports:

Chapter 06 network isolation by reducing what a reached workload can do
Chapter 08 resource safety by keeping runtime behavior disciplined
Chapter 16 admission policy, where these hardening rules become enforceable at the cluster boundary

What a Hardened Baseline Looks Like

Non-root user, read-only root filesystem, dropped capabilities, and the RuntimeDefault seccomp profile.
Writable paths are explicitly mounted through emptyDir volumes.
Every container follows the same baseline; exceptions require documented justification.

Safe Workflow (Step-by-Step)

Start from the hardened baseline manifest and review any requested exception as a diff against it.
Validate the pod and container security contexts before rollout.
If the app fails on permissions, add an explicit writable volume for the path it needs instead of switching to root or privileged mode.
Re-run the workload and confirm it still runs non-root with dropped capabilities.
Document why any exception exists, along with its expiration or removal plan.
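
One way to keep the last step next to the manifest is to record the exception as annotations on the workload. The sketch below is an assumption about how that could look: the annotation keys under security.example.com, the workload name legacy-worker, the image, the owner, and the date are all hypothetical, and nothing in Kubernetes interprets these annotations unless a policy is written to check them.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: legacy-worker                # hypothetical workload
  annotations:
    # Hypothetical annotation keys; pick a prefix your team owns.
    security.example.com/exception: "readOnlyRootFilesystem=false"
    security.example.com/exception-reason: "vendor binary writes lock files beside its executable"
    security.example.com/exception-owner: "platform-team"
    security.example.com/exception-expires: "2030-01-01"   # placeholder date
spec:
  replicas: 1
  selector:
    matchLabels:
      app: legacy-worker
  template:
    metadata:
      labels:
        app: legacy-worker
    spec:
      securityContext:
        runAsNonRoot: true
        runAsUser: 10001
        seccompProfile:
          type: RuntimeDefault
      containers:
      - name: legacy-worker
        image: registry.example.com/legacy-worker:1.0.0   # hypothetical image
        securityContext:
          allowPrivilegeEscalation: false
          readOnlyRootFilesystem: false   # the single documented exception
          capabilities:
            drop:
              - ALL
```

Only one baseline setting is relaxed here and every other control stays in place. The restricted Pod Security Standard does not itself require a read-only root filesystem, so this particular exception would not be caught by namespace enforcement, which makes the written record more important.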

Break/Fix Drill Focus

Intentionally trigger one permission failure with hardened settings.
Fix it without root or privileged mode.
Capture the before/after manifest diff as evidence.
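
A minimal pod for this drill might look like the sketch below. The pod name, the busybox:1.36 image, and the file path are assumptions for illustration and are not taken from lab.md. As written, the write to /tmp is expected to fail because /tmp sits on the read-only root filesystem; uncommenting the marked lines is the fix.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: readonly-drill               # hypothetical name
spec:
  restartPolicy: Never               # let the failure show as a terminated container
  securityContext:
    runAsNonRoot: true
    runAsUser: 10001
    runAsGroup: 10001
    fsGroup: 10001
    seccompProfile:
      type: RuntimeDefault
  containers:
  - name: writer
    image: busybox:1.36              # assumption: any small image with a shell works
    # Try to write a file, then read it back.
    command: ["sh", "-c", "echo ok > /tmp/drill.txt && cat /tmp/drill.txt"]
    securityContext:
      allowPrivilegeEscalation: false
      readOnlyRootFilesystem: true
      capabilities:
        drop:
          - ALL
    # Fix: uncomment the mount and the volume below, then recreate the pod.
    # volumeMounts:
    # - name: tmp
    #   mountPath: /tmp
  # volumes:
  # - name: tmp
  #   emptyDir: {}
```

The before/after diff consists only of the volume and its mount. The user, capabilities, seccomp profile, and read-only root filesystem are identical in both versions, which is the evidence the drill asks for.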

Lab Files

lab.md
quiz.md

Done When

learner can prove both workloads run non-root with hardened contexts
learner can diagnose and fix permission failures without enabling root
learner can explain why privileged shortcuts are rejected
