KubeCon Europe 2026 Amsterdam: AI Infrastructure, Agentic Systems, and Platform Engineering

KubeCon + CloudNativeCon Europe 2026 runs March 23-26 in Amsterdam with 224 sessions across 10+ tracks. The numbers tell their own story: 15.6 million cloud native developers globally, 41% of AI developers now working cloud native, and 56% of organizations reporting understaffed platform engineering teams. This is no longer a conference about containers. It is a conference about operating AI infrastructure at scale, and Kubernetes is the control plane.
The schedule is dense. If you are attending (or following along remotely), here is where to focus your time across the five themes that define this edition: agentic AI systems, sovereign inference, GPU scheduling, platform engineering, and the observability-security intersection.
Agentic AI Arrives on Kubernetes
The most significant signal at KubeCon EU 2026 is the brand-new Agentics Day: MCP + Agents co-located event on Monday, March 23, a half day dedicated entirely to the Model Context Protocol and agentic workflows. The fact that CNCF carved out a dedicated event for this topic, alongside established events like CiliumCon and Observability Day, tells you where the ecosystem is heading.
MCP is emerging as a shared, interoperable layer for connecting AI models to tools, data, and workflows. The Agentics Day organizers frame it clearly: "Cloud native teams are now being asked to connect models to real tools, data, and workflows in reliable, secure ways, without relying on brittle, one-off integrations." The target audience is platform, SRE, and infrastructure teams who will operate and secure these capabilities.
The keynote to watch
Microsoft's Jorge Palma and Natan Yellin (Robusta) present "Scaling Platform Ops with AI Agents: Troubleshooting to Remediation" on Tuesday morning. The session demonstrates HolmesGPT, a CNCF Sandbox project that connects LLMs to operational and observability data for production diagnosis. The interesting part is not the diagnosis. It is what comes after: remediation policies that let agents detect and fix issues autonomously within strict RBAC boundaries, approval workflows, and audit trails. Palma frames it honestly: "This isn't about replacing SREs; it's about multiplying their effectiveness."
Solo.io's Idit Levine and Keith Babo follow with "From Pilot to Production: Scaling and Optimizing Agentic Workloads on Kubernetes", tackling the operational challenges of running agentic systems beyond prototypes.
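The remediation-gating idea behind these sessions reduces to a simple policy shape: agents execute only allowlisted actions, queue riskier ones for human approval, and audit everything. A minimal Python sketch of that shape, with the action names and the policy interface entirely invented (this is not HolmesGPT's API):

```python
# Hypothetical approval-gated remediation policy: agents may run only
# allowlisted actions; riskier ones wait for a human; everything is logged.
from dataclasses import dataclass, field

AUTO_APPROVED = {"restart-pod", "scale-deployment"}      # low-risk actions
REQUIRES_APPROVAL = {"rollback-release", "cordon-node"}  # human in the loop

@dataclass
class RemediationPolicy:
    audit_log: list = field(default_factory=list)
    pending: list = field(default_factory=list)

    def submit(self, action: str, target: str) -> str:
        """Route an agent-proposed action: execute, queue, or deny."""
        if action in AUTO_APPROVED:
            verdict = "executed"
        elif action in REQUIRES_APPROVAL:
            self.pending.append((action, target))
            verdict = "pending-approval"
        else:
            # Outside both lists means outside the agent's RBAC scope.
            verdict = "denied"
        self.audit_log.append((action, target, verdict))
        return verdict

policy = RemediationPolicy()
print(policy.submit("restart-pod", "payments/api-7"))   # executed
print(policy.submit("cordon-node", "node-3"))           # pending-approval
print(policy.submit("delete-namespace", "prod"))        # denied
```

The point of the shape is the default: anything not explicitly classified is denied, which is what keeps an autonomous agent inside its blast radius.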
Agents in the real world
Netflix's Nick Rutigliano and Andrew Halaney present "Is the Agent in the Room with Us Right Now?", addressing isolation, RBAC, network configurations, and noisy-neighbor concerns for agentic workloads. At Netflix's scale, moving traditional workloads to agentic AI requires rethinking container isolation, user namespaces, and network boundaries. This is the session for anyone wondering what happens when you give AI agents access to production systems serving millions of users.
Over at BackstageCon, "Agentic Backstage: How to Manage an AI Software Catalog" explores making the developer portal the trusted context layer for AI copilots and agents. The thesis: golden paths, standardized metadata, and governance need to be in place before you let agents interact with your software catalog.
Christian Posta (Solo.io) tackles the governance side directly in "Enterprise Challenges with MCP Adoption", discussing what happens when MCP servers provide access to enterprise APIs on behalf of users without proper guardrails. His recent writing on MCP authorization gaps frames the problem: remote, multi-tenant MCP servers need enterprise-grade governance, and the spec is not there yet.
Sovereign AI and Inference at Scale
Europe adds a dimension you will not find at KubeCon North America: sovereignty as an architectural constraint. DORA, NIS2, and the Cyber Resilience Act are not theoretical. They are driving real infrastructure decisions about where workloads run, how data moves, and what "rebuilding from scratch" actually means.
Kubernetes-native inference orchestration
Red Hat's Vincent Caldeira and Cara Delia present the sponsored keynote "Inference and Sovereign AI: Scaling Cloud-Native AI with Control and Compliance" on Wednesday morning. The session introduces a Kubernetes-native approach to inference orchestration: hardware-aware scheduling, dynamic scaling, multi-tenant resource management, and observability, all while maintaining regulatory compliance.
SAP Labs and NVIDIA go deeper with "Towards Building an Open Source AI Reference Stack for EU Sovereign Cloud", directly addressing the intersection of open source AI and European regulatory requirements.
Inference workloads in production
Wayve's Mukund Muralikrishnan delivers one of the most practical keynotes of the week: "Rules of the road for shared GPUs: AI inference scheduling at Wayve". Wayve runs multi-tenant inference workloads on Kubernetes for their autonomous driving platform, with evaluation, validation, and synthetic data generation all competing for the same GPU capacity. The key insight: "default Kubernetes scheduling falls short" for AI workloads. They use Kueue for queueing and admission control, achieving predictable GPU allocations, improved cluster utilization, and reduced operational noise.
Red Hat's Nir Rozenbaum and Google's Kellen Swain present "Route, Serve, Adapt, Repeat: Adaptive Routing for AI Inference Workloads in Kubernetes", while a separate Red Hat session, "Redefining SLIs for LLM Inference: Managing Hybrid Cloud with vLLM & LLM-D", tackles what reliability metrics actually mean for LLM serving. Traditional SLIs (latency percentiles, error rates) do not map cleanly to inference workloads where token throughput, time-to-first-token, and queue depth matter more.
A new Kubernetes working group makes its debut: "AI'm at the Gate! Introducing the AI Gateway Working Group" proposes extending Gateway API with AI-specific capabilities. Token-based rate limiting, payload inspection for prompt guardrails, semantic routing, and egress controls for external inference services. The WG charter already has active proposals for payload processing and egress gateways. This is not speculative: it is the beginning of standardized AI networking in Kubernetes.
GPU Scheduling: Kueue Takes Center Stage
GPU scheduling is no longer a niche concern. Multiple sessions across keynotes, tutorials, and BoFs converge on the same message: default Kubernetes scheduling was built for web workloads, not for AI.
Wayve's keynote makes the production case for Kueue. CoreWeave's Amy Chen and Google's Gabriel Saba follow up with "Instrumenting Kueue Scheduling for ML Training", addressing a critical gap: observability into advanced scheduling decisions. If you are running GPU workloads and cannot see why jobs are queued, preempted, or rescheduled, you are flying blind. Kueue manages admission control and fair-sharing across workloads, but without instrumentation you cannot distinguish between "the cluster is full" and "your workload's priority is too low."
For hands-on learning, Red Hat's "DRA-matically Simple: On-Demand GPUs for MLOps" tutorial walks through Dynamic Resource Allocation for GPU access. DRA is the Kubernetes mechanism, stable since 1.34, for requesting specialized hardware like GPUs through ResourceClaim objects, replacing the older device plugin model with more flexible, pod-level resource requests.
Microsoft's demo theatre session shows a Kubernetes-native pattern for cross-cloud AI inference. The stack uses Karpenter for elastic autoscaling and a GPU flex nodes project for scheduling capacity across multiple cloud providers into a single cluster. Models, inference endpoints, and GPU resources are treated as first-class Kubernetes objects.
The BoF "Infrastructure Optimization for GPUs / Inference / Training / Networking", convened by CERN's Ricardo Rocha, covers CPU/GPU NUMA affinity, pinning, low-latency networking, and heterogeneous hardware support. Small room, practitioner-only conversation. These are often the most valuable sessions at KubeCon.
Platform Engineering Under Pressure
The talent gap frames every platform engineering discussion this year. The CNCF 2025 State of Tech Talent Report found that 56% of organizations report understaffing in platform engineering roles. The sessions reflect this: how do you build self-service platforms that scale without scaling the team behind them?
Platform Engineering Day
The fifth edition of Platform Engineering Day (Monday, co-located) runs a two-track format with an increased focus on AI within platform engineering. The co-chairs note: "We have seen an increase in CFP submissions on the topics of Platform Engineering and AI, and we are starting to see more organisations able to speak about the practical application of AI and how it can deliver value."
Adobe's "Enterprise-Scale Migrations Using Agentic Workflows with Human-in-the-loop" is the session that bridges the agentic AI and platform engineering themes. Applying agent-driven automation to large-scale migrations, the unglamorous work that consumes enormous engineering time.
Helm 4 and ecosystem maturity
"Helm 4 Is Here. So Now What?" from Red Hat's Andrew Block and SUSE's Robert Sirchia addresses the migration path from Helm 3. If your team has hundreds of releases managed by Helm 3, this session maps what breaks, what changes, and what the upgrade path looks like. Major version bumps in package managers are where production incidents hide.
The Backstage ecosystem continues to mature with sessions on runtime plugins, platform CLIs, and building sustainable plugin ecosystems. Internal developer portals are moving from "nice to have" to "critical infrastructure," especially as AI adds a new category of assets (models, agents, inference endpoints) that need cataloging, governance, and golden paths.
Red Hat's Roland Huss and Diagrid's Bilgin Ibryam present "Make GenAI Production-Ready With Kubernetes Patterns", applying established Kubernetes patterns (sidecar, init container, ambassador) to the specific challenges of running generative AI in production. If you have spent years learning Kubernetes patterns for web workloads, this session shows which patterns transfer to AI and which need rethinking.
Observability, Security, and the AI Intersection
The OTel migration story
DigitalOcean's "We Deleted Our Observability Stack and Rebuilt It With OTel: 12 Engineers to 4 at 20K+ Clusters" is the headline observability session. Rebuilding observability around OpenTelemetry across 20,000+ clusters while shrinking the team from 12 to 4 engineers. The End User TAB recommends this as "a masterclass in focusing on the right signals, rethinking architecture, and balancing visibility with cost."
Grafana Labs' Liudmila Molkova presents "GenAI Observability: Keeping GenAI Honest Without Oversharing", covering how to instrument AI applications with OpenTelemetry's Generative AI Semantic Conventions. The key challenge: performance and usage telemetry needs different handling than compliance and cost data. You want token counts and latency in your dashboards, but prompt content and model outputs need separate pipelines with access controls.
Security meets AI
Datadog's Rory McCune presents "What LLMs Do, and Don't, Know About Securing Kubernetes" on Tuesday. An evidence-based examination of where AI assistance genuinely helps with Kubernetes security versus where it hallucinates into dangerous misconfigurations. With 65% of organizations reporting a lack of cybersecurity and compliance specialists, the pressure to use AI for security is real. Understanding its limits is essential.
Dell and Ericsson present "Securing the AI/ML Lifecycle With MLSecOps: Open Source Best Practices", addressing security across the full AI pipeline.
Networking for AI
"SIG Network: The State of Networking for AI on Kubernetes" brings together contributors from Red Hat, Google, NVIDIA, and IBM to address one of the hardest operational challenges in AI infrastructure. Getting networking right for AI workloads (RDMA, GPU-to-GPU communication, multi-node training) is genuinely hard, and this is the room where those trade-offs get worked out.
Gateway API continues its evolution with "Building the Next Generation of Multi-Cluster with Gateway API" from Microsoft and Google. If you are running multi-cluster deployments, this session maps the future of how traffic flows between them.
Practical Advice for Attendees
Co-located events require an All-Access Pass. If you have been to KubeCon before, you know these are where the practitioner conversations happen. Agentics Day, Platform Engineering Day, BackstageCon, CiliumCon, Observability Day, OpenTofu Day, and WasmCon all compete for Monday. Pick one and commit.
End User BoFs are the best-kept secret. Three practitioner-focused Birds of a Feather sessions run during the main conference: AI Infrastructure and Platform, GPU/Inference Optimization, and AI Observability. Smaller rooms, real operational conversations, no vendor pitches. These are where you learn what actually works in production.
Budget transit time. The RAI Amsterdam is a large venue with sessions spread across multiple halls. Don't schedule back-to-back sessions in different buildings. The End User TAB's consistent advice across years: leave time between sessions, spend time in the hallway, and note down contacts who can help you debug things once you are back in the office.
The hallway track matters more than any single session. The TAB town hall (new format this year with short 5-minute presentations), the end user reception (Tuesday 17:30-18:30), and the reference architecture poster sessions at the Project Pavilion are where you meet the people solving the same problems you are. The published reference architectures are worth reviewing before you arrive.
Wrap-Up
KubeCon EU 2026 marks a maturity inflection for AI on Kubernetes. The conversation has moved from "can we run AI workloads on Kubernetes" to "how do we govern, schedule, observe, and secure them at scale." Three developments deserve attention beyond Amsterdam.
First, MCP standardization under vendor-neutral governance. The Agentics Day signals that the community is treating agent-to-tool connectivity as durable infrastructure, not an experiment. Second, Kueue as the GPU scheduling standard. Wayve's production use case and the CoreWeave/Google instrumentation work show Kueue moving from "promising project" to "required component" for any serious GPU workload. Third, sovereign AI as an architectural requirement. European regulations are forcing infrastructure decisions that will ripple through global organizations operating in EU markets.
If you are not attending, the full schedule is public. Start with the keynotes, then follow the threads that matter to your stack.