Guardrails-First Course Materials
What You Get
This course is organized as a practical production path:
- each chapter focuses on one common incident/failure mode
- each chapter defines a safe operational workflow
- each chapter includes hands-on materials (lab.md and quiz.md; many include runbooks)
Recommended Learning Path
- Start with Chapter 01 and work through the core track in order.
- Run the lab before moving to the next chapter.
- Use the quiz at the end of each chapter to validate understanding.
- Move to advanced modules only after finishing the core path.
Tracks
Core track:
- Chapters 01-13 (platform fundamentals, GitOps, security, observability, reliability, on-call)
Advanced track:
- Chapter 14: Supply Chain Security
- Chapter 15: Admission Policy Guardrails
- Chapter 16: Rollback and Data Migrations
- Module: Linkerd + Progressive Delivery (Canary / A-B)
Chapter-by-Chapter Implementation Coverage (sre -> course)
| Chapter/Module | Platform Status in sre | Evidence in Platform Repo |
|---|---|---|
| Chapter 01 Intro Guardrails | Implemented | scripts/guard-kube-context.sh, scripts/guard-terraform-plan.sh, .pre-commit-config.yaml |
| Chapter 02 IaC | Implemented | infra/terraform/hcloud_cluster/, infra/terraform/kind_cluster/ |
| Chapter 03 Secrets | Implemented | .sops.yaml, flux/secrets/**, scripts/sops-setup.sh |
| Chapter 04 GitOps | Implemented | flux/bootstrap/infrastructure/image-automation/, flux/apps/**/image-policy.yaml |
| Chapter 05 Network Policies | Implemented | flux/infrastructure/network-policies/** |
| Chapter 06 Security Context | Implemented | flux/apps/backend/base/deployment.yaml, flux/apps/frontend/base/deployment.yaml |
| Chapter 07 Resource Management | Implemented | flux/infrastructure/resource-management/** |
| Chapter 08 Availability (HPA+PDB) | Implemented | flux/apps/backend/*/{hpa,pdb}.yaml, flux/apps/frontend/overlays/*/{hpa,pdb}.yaml |
| Chapter 09 Observability | Implemented | flux/infrastructure/observability/**, backend/frontend telemetry instrumentation |
| Chapter 10 Backup & Restore | Implemented | flux/infrastructure/data/cnpg-operator/, flux/infrastructure/data/cnpg-clusters/** |
| Chapter 11 Controlled Chaos | Implemented | flux/infrastructure/chaos/develop/** |
| Chapter 12 AI Guardian | Partially implemented (course-ready + integration contract) | chapter design and runbook ready; guardian service manifests are an external integration target |
| Chapter 13 24/7 Production SRE | Implemented (operational content + alerts baseline) | chapter-13 runbooks/postmortem + observability alert rules |
| Chapter 14 Supply Chain Security | Scaffolded in platform, fully documented in course | flux/infrastructure/policy/kyverno/, policy/packs/chapter-14-supply-chain/ |
| Chapter 15 Admission Policy Guardrails | Scaffolded in platform, fully documented in course | flux/infrastructure/policy/kyverno/, policy/packs/chapter-15-admission-guardrails/ |
| Chapter 16 Rollback & Data Migrations | Course workflow ready (simulation), implementation follows app DB evolution | CNPG platform baseline + chapter lab/runbook for rollout/rollback sequence |
| Module Linkerd Progressive Delivery | Controllers/manifests present; sample pack opt-in | flux/infrastructure/progressive-delivery/{linkerd,flagger,develop}/ |
Advanced-track policy packs intentionally ship in a safe scaffold mode first (the chapters follow an audit-first workflow). They move to enforced runtime policy only as rollout evidence matures.
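
The audit-first promotion described above can be pictured as a small decision rule. The following is a minimal sketch, not the course's actual tooling: the `AuditEvidence` type, the `next_policy_mode` function, and both thresholds are hypothetical names and values chosen for illustration. The mode strings `Audit` and `Enforce` mirror the validation failure actions Kyverno uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AuditEvidence:
    """Hypothetical summary of what an audit-mode policy has observed."""
    days_in_audit: int       # how long the policy has run in audit mode
    violations_last_7d: int  # audit violations reported during the last week


def next_policy_mode(evidence: AuditEvidence,
                     min_days: int = 14,
                     max_violations: int = 0) -> str:
    """Return "Enforce" once the audit evidence is clean enough, else "Audit".

    Both thresholds are illustrative assumptions, not values from the course.
    """
    soaked_long_enough = evidence.days_in_audit >= min_days
    clean_enough = evidence.violations_last_7d <= max_violations
    return "Enforce" if soaked_long_enough and clean_enough else "Audit"


if __name__ == "__main__":
    # A policy that has only been observed for three days stays in audit mode.
    print(next_policy_mode(AuditEvidence(days_in_audit=3, violations_last_7d=0)))
    # Three clean weeks of evidence would allow promotion under these assumptions.
    print(next_policy_mode(AuditEvidence(days_in_audit=21, violations_last_7d=0)))
```

Under these assumed defaults, a policy stays in audit mode while violations are still being reported, however long it has been observed. In practice the soak period and violation budget would be set per policy pack.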
References
- Full structure and outcomes: Curriculum
- Intro mental model: Intro: AI as a Very Well-Read Junior Engineer