scheduling

5 posts

Kueue: The Community Standard for Kubernetes AI Batch Scheduling

Kueue manages GPU quotas, enforces fair sharing across teams, and dispatches jobs to remote HPC clusters, which has made it the community standard for production AI batch scheduling.

by KubeDojo

kubernetes scheduling workload-api

Workload-Aware Scheduling in Kubernetes 1.36: The Decoupled PodGroup Model

Kubernetes 1.36 decouples scheduling policy from runtime instances with Workload API v1alpha2, standalone PodGroups, and a dedicated group scheduling cycle.

by KubeDojo

nvidia kai-scheduler gpu

NVIDIA KAI Scheduler: Open-Source GPU-Aware Kubernetes Scheduling

NVIDIA open-sourced KAI Scheduler (Apache 2.0), a Kubernetes-native GPU scheduling solution originally from the Run:ai platform.

by KubeDojo

dra kubernetes gpu

Dynamic Resource Allocation (DRA) GA: The New GPU Interface for Kubernetes

DRA went GA in Kubernetes v1.34 and continues to evolve. It replaces Device Plugins with richer semantics, including DeviceClass, ResourceClaim, CEL-based filtering, and topology awareness.

by KubeDojo

armada multi-cluster scheduling

Armada: Multi-Cluster GPU Scheduling as a Single Resource Pool

Armada treats multiple Kubernetes clusters as a single resource pool for GPU-intensive AI workloads, with global queue management, gang scheduling, and production-scale throughput.

by KubeDojo