
HAMi: GPU Virtualization as the Reference Pattern for AI Infrastructure
HAMi (CNCF Sandbox) emerged at KubeCon EU 2026 as the reference implementation for GPU resource management across NVIDIA, AMD, Huawei, and Cambricon accelerators.
Recent Posts
Gateway API Inference Extension GA standardizes model-aware routing for self-hosted LLMs with KV-cache-aware scheduling and LoRA adapter affinity.

DRA went GA in Kubernetes v1.34 and continues evolving — replacing Device Plugins with richer semantics including DeviceClass, ResourceClaim, CEL-based filtering, and topology awareness.
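To make those DRA primitives concrete, here is a minimal sketch of a ResourceClaim selecting a device via a CEL expression, assuming the `resource.k8s.io/v1` API from v1.34; the DeviceClass name (`gpu.example.com`) and the capacity attribute in the expression are placeholders, not a real driver's names:

```yaml
# Hypothetical sketch: claim one device from a (placeholder) DeviceClass,
# filtered with a CEL expression over the device's advertised capacity.
apiVersion: resource.k8s.io/v1
kind: ResourceClaim
metadata:
  name: single-gpu
spec:
  devices:
    requests:
    - name: gpu
      exactly:
        deviceClassName: gpu.example.com   # placeholder DeviceClass
        selectors:
        - cel:
            # placeholder capacity attribute; real drivers define their own
            expression: device.capacity["gpu.example.com"].memory.compareTo(quantity("40Gi")) >= 0
```

A pod then references the claim through `spec.resourceClaims` and `resources.claims`, instead of the opaque `nvidia.com/gpu: 1` counter that Device Plugins exposed.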

Cloudflare made AI Security for Apps GA with prompt injection protection, while RFC 9457 structured error responses cut AI agent token costs by 98%.

How Karpenter's groupless, pod-driven provisioning model solves the scaling limitations that plagued Kubernetes Cluster Autoscaler for years.

How Karpenter automatically replaces drifted nodes, consolidates underutilized capacity, and respects disruption budgets to keep clusters lean and current.

A hands-on walkthrough of installing Karpenter on EKS, configuring NodePools and EC2NodeClasses, and hardening the setup for production workloads.
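As a rough illustration of the NodePool/EC2NodeClass pairing that walkthrough covers, here is a minimal sketch using the `karpenter.sh/v1` API; the names (`default`), limits, and requirement values are illustrative placeholders, not production recommendations:

```yaml
# Hypothetical sketch: a NodePool that provisions spot or on-demand capacity,
# delegating AMI/subnet/security-group details to an EC2NodeClass.
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: default        # placeholder name
spec:
  template:
    spec:
      requirements:
      - key: karpenter.sh/capacity-type
        operator: In
        values: ["spot", "on-demand"]
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default  # placeholder; defined separately with AMI/subnet config
  limits:
    cpu: "1000"        # illustrative cap on total provisioned vCPU
  disruption:
    consolidationPolicy: WhenEmptyOrUnderutilized
    consolidateAfter: 1m
```

Because provisioning is pod-driven rather than node-group-driven, a single NodePool like this can satisfy heterogeneous pending pods without pre-defined autoscaling groups.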

Dedicated GPU NodePools, cold start fixes for 10GB+ AI images, disruption protection for training jobs, and gang scheduling for distributed workloads.

Spot-to-Spot consolidation, instance diversification, right-sizing pod requests, and the real-world strategies that cut Kubernetes compute costs by 20-40%.