Guardrails That Stop It
- Multi-Replica Baseline: staging and production start with a minimum of 2 replicas for all critical services.
- Scaling Bounds: Every service must define minReplicas and maxReplicas, plus a target resource utilization (CPU/Memory).
- Disruption Budgeting: Every service must have a Pod Disruption Budget (PDB) to prevent unsafe disruption during maintenance.
- Pre-Drain Verification: Never execute a node drain or rollout without first checking the PDB and HPA state.
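The scaling-bounds guardrail can be illustrated with a minimal HPA manifest. This is a sketch under stated assumptions, not the repository's actual hpa.yaml: the resource name, the target Deployment name, the replica ceiling, and the utilization percentages are all hypothetical values chosen for illustration.

```yaml
# Hypothetical example: names and numbers are assumptions, not taken from the repo.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: backend              # assumed name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: backend            # assumed Deployment name
  minReplicas: 2             # multi-replica baseline from the guardrails
  maxReplicas: 6             # assumed upper bound
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # assumed CPU target, percent of requests
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80   # assumed memory target, percent of requests
```

Utilization targets are computed as a percentage of each container's resource requests, so the target workload needs CPU and memory requests defined for these metrics to work.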
Expected Baseline
- HPA (autoscaling/v2): Configured for both backend and frontend in all environments.
- PDB (policy/v1): Configured to allow exactly 1 disruption at a time for multi-replica services.
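A PDB that allows exactly one disruption at a time might look like the sketch below. The resource name and the selector label are hypothetical; the real pdb.yaml files may select pods differently.

```yaml
# Hypothetical example: name and label selector are assumptions.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: backend              # assumed name
spec:
  maxUnavailable: 1          # at most one pod may be voluntarily evicted at a time
  selector:
    matchLabels:
      app: backend           # assumed pod label
```

With two healthy replicas and maxUnavailable set to 1, the PDB reports one allowed disruption. If a pod is already unhealthy, that drops to 0 and evictions are blocked until the workload recovers.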
Backend HPA and PDB layout
- flux/apps/backend/develop/hpa.yaml
- flux/apps/backend/develop/image-automation.yaml
- flux/apps/backend/develop/image-policy.yaml
- flux/apps/backend/develop/kustomization.yaml
- flux/apps/backend/develop/patches/feature-flags.yaml
- flux/apps/backend/develop/pdb.yaml
Safe Workflow (Step-by-Step)
- Preflight Check: Verify the current HPA status and its defined bounds.
- Confirm PDB Allowance: Ensure ALLOWED DISRUPTIONS is greater than 0.
- Trigger Disruption: Perform the maintenance action (e.g., node drain).
- Observe Scaling: Watch the HPA react if load shifts to the remaining pods.
- Final Verification: Confirm the workload remains available and returns to its desired replica count after the action is complete.
Frontend HPA and PDB layout
- flux/apps/frontend/overlays/develop/hpa.yaml
- flux/apps/frontend/overlays/develop/image-automation.yaml
- flux/apps/frontend/overlays/develop/image-policy.yaml
- flux/apps/frontend/overlays/develop/kustomization.yaml
- flux/apps/frontend/overlays/develop/namespace.yaml
- flux/apps/frontend/overlays/develop/patches/deployment.yaml
- flux/apps/frontend/overlays/develop/patches/ingress.yaml
- flux/apps/frontend/overlays/develop/pdb.yaml
This builds on: Resource management (Chapter 08). HPA and PDB build on resource contracts.
This enables: Observability (Chapter 10). Availability signals feed the monitoring stack.