nvidia

5 posts

GPU Sharing Strategies for Multi-Tenant Kubernetes: MIG, Time-Slicing, and MPS

NVIDIA's GPU sharing mechanisms — MIG, time-slicing, and MPS — are gaining traction as teams run multiple inference workloads per GPU.

by KubeDojo

nvidia gpu kubernetes

NVIDIA AI Cluster Runtime: Validated GPU Kubernetes Recipes

NVIDIA released AI Cluster Runtime, an open-source project providing validated, version-locked Kubernetes configurations for GPU infrastructure.

by KubeDojo

nvidia kai-scheduler gpu

NVIDIA KAI Scheduler: Open-Source GPU-Aware Kubernetes Scheduling

NVIDIA open-sourced KAI Scheduler (Apache 2.0), a Kubernetes-native GPU scheduling solution originally from the Run:ai platform.

by KubeDojo

nvidia dynamo inference

NVIDIA Dynamo 1.0: The Inference Operating System for AI Factories

Production deployment patterns for NVIDIA Dynamo 1.0 on EKS and GKE — disaggregated serving, KV-aware routing, and gotchas from real deployments.

by KubeDojo

karpenter gpu ai-ml

GPU and AI/ML Workload Scaling with Karpenter

Dedicated GPU NodePools, cold-start fixes for 10 GB+ AI images, disruption protection for training jobs, and gang scheduling for distributed workloads.

by KubeDojo