
Cluster Operations

Node Lifecycle and Capacity Stewardship

Scheduling pressure, disruption budgets, and drain choreography for teams that own both nodes and tenant expectations.

Duration: 3 weeks
Format: Hybrid studio days
Tuition (informational): 1,180,000 KRW

Operators learn to read scheduler decisions, justify resource profiles, and coordinate drains with workload SLOs. The syllabus spends time on mixed-arch fleets and GPU reservations without diving into vendor-specific drivers beyond what Kubernetes surfaces.

What is included

  • Taints, tolerations, and topology spread constraints in paired labs
  • PriorityClass tradeoffs with fair sharing examples
  • PodDisruptionBudget (PDB) authoring against real microservice graphs
  • Metrics signals that precede eviction storms
  • Node problem detector patterns without vendor lock-in
  • Hands-on drain sequencing with staged traffic shifts
  • Written post-lab decision memo template
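
As a taste of the PDB reasoning the labs build on, here is a minimal sketch of how many voluntary evictions a budget would permit at a given moment. It is a simplified model, not the Kubernetes disruption controller itself: the function names `scaled` and `disruptions_allowed` are hypothetical, and the sketch assumes percentages are rounded up against the expected pod count, which matches our reading of the upstream documentation.

```python
def scaled(value, expected_pods):
    """Resolve an int or an 'N%' string against the expected pod count.

    Assumption: percentages round up to the next whole pod.
    """
    if isinstance(value, str) and value.endswith("%"):
        pct = int(value[:-1])
        # Integer ceiling division avoids floating-point surprises.
        return -(-pct * expected_pods // 100)
    return int(value)


def disruptions_allowed(expected_pods, healthy_pods,
                        min_available=None, max_unavailable=None):
    """Estimate how many pods could be evicted without violating the budget."""
    if (min_available is None) == (max_unavailable is None):
        raise ValueError("set exactly one of min_available or max_unavailable")
    if min_available is not None:
        desired_healthy = scaled(min_available, expected_pods)
    else:
        desired_healthy = max(expected_pods - scaled(max_unavailable, expected_pods), 0)
    # Never report a negative allowance when the workload is already degraded.
    return max(healthy_pods - desired_healthy, 0)


if __name__ == "__main__":
    # Five replicas, all healthy, minAvailable of 4.
    print(disruptions_allowed(5, 5, min_available=4))
    # Ten replicas, one already unhealthy, maxUnavailable of 1.
    print(disruptions_allowed(10, 9, max_unavailable=1))
```

In the second call one pod is already unhealthy, so the budget is fully spent and a drain would block. Spotting that condition before a maintenance window opens is the point of the exercise.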

Outcomes

  • Schedule a maintenance window with defensible PDB coverage
  • Tune a resource profile with measurable latency guardrails
  • Communicate node health signals to service owners

Lead instructor

Jonas Iyer

Spent six years on regional bare-metal fleets before moving to cloud-agnostic training.

Participant notes

  • “The PDB lab used our own service graph template—surprisingly close to reality.”

    — Eun · 4/5 · Google

Common questions

Will we cover autoscaling?
Horizontal pod autoscaling, yes. The cluster autoscaler is covered only at the conceptual level unless your cohort requests a deep dive during week-three office hours.
Limitations?
We do not install proprietary node agents; the labs work with upstream signals only.
Prerequisite?
Comfort with kubectl and YAML manifests; prior on-call exposure helps but is not mandatory.

Refund rules live under Returns & Refunds. No payments are processed on this marketing site.
