Published July 2, 2024

Production-ready KubeVirt architecture for VMs on Kubernetes

Kevin Reeuwijk
Principal Solution Architect

VMware is a burning platform

Virtual machines (VMs) are here to stay, at least for a decent while longer. But the virtualization platform market is shifting. 

The market leader VMware has seen confidence in its platform take a massive blow as a result of the changes that Broadcom made to its business after the acquisition. 

The responses to the 2024 State of Production Kubernetes report show how much the sentiment has changed:

  • 59% say Broadcom’s takeover has accelerated cloud-native adoption
  • 51% of senior leaders say they’re motivated by a strategic effort to reduce VMware dependence
  • 30% are investigating moving to bare metal data centers
  • 43% say they’re investigating shifting to less costly software vendors

Many organizations now are contemplating what their virtualization platform strategy should look like for the next 10 years. 

This is happening against the backdrop of containerization, and in particular Kubernetes, making inroads in the IT landscape, changing the way applications are developed and maintained. So, let’s discuss what your options are.

Swapping to another hypervisor is not the easy option

The simplest, yet least integrated, solution is a like-for-like replacement of one hypervisor platform with another. Decent candidates here are Nutanix, Azure Stack and Proxmox. Red Hat OpenStack could have been a contender, but its time has passed.

Of the three, we’re seeing increased interest in Proxmox for its open source technology (KVM) and low cost. 

Nutanix tends to be the most feature rich but also the most expensive option, while Azure Stack is more of an all-in commitment to Microsoft that not everyone is willing to make. 

However, moving to yet another hypervisor, whether it’s Nutanix, Proxmox, or Azure Stack, means swapping one set of technologies for another, only to end up where you started from a functional point of view. 

Switching technologies always comes with significant short-term pains:

  • Partial double licensing, as you’re not going to swap out one stack for another over a weekend. While both technologies are in use, you’ll be paying for both.
  • Migration costs. Converting workloads from one technology to another requires data conversion, additional hardware capacity while two stacks are in use, and a significant number of person-hours for a project like this.
  • Reskilling of people and retooling of infrastructure. New technology means new operational procedures, new tools to learn and new experience to build up.

If the goal is to move away from vSphere due to its current challenges, jumping to another major hypervisor vendor with similar business principles and potential long-term issues offers a poor return on all the effort and resources you’d put in.

A better approach: go cloud-native

Instead of directly replacing one hypervisor technology with another, why not consider a cloud-native approach? Adopting a virtualization platform like KubeVirt can future-proof your investments. 

KubeVirt is a virtual machine management solution that leverages the Kubernetes ecosystem. It provides a modern and flexible infrastructure that integrates both containerized and traditional virtual machine workloads, paving the way for a more versatile and resilient IT environment.

Since we first started blogging, presenting and running webinars about KubeVirt, it has been steadily picking up steam. 

It officially came out of beta with a 1.0.0 KubeVirt release on July 6th of 2023, and new updates are released regularly. As of this writing, a 1.3.0 release candidate version was just published, adding dozens more features and improvements.

Building the production-ready KubeVirt stack

KubeVirt is an increasingly mature way to run and manage KVM-based virtual machines in a Kubernetes environment. But KubeVirt by itself is not enough. 

To build a robust and capable platform with feature parity with the core abilities of traditional hypervisor platforms, we need additional components:

  • A storage platform that provides essential capabilities for VMs like live migration, cloning, snapshotting, site replication and backup/restore.
  • A network interface solution that enables connecting VMs to layer 2 networks.
  • A way of load-balancing VM workloads across the nodes of the Kubernetes cluster.
  • For the data center, a way to automate the deployment of bare metal Kubernetes cluster nodes.
  • For the edge, a way to automate and secure the deployment of edge Kubernetes clusters in remote, poorly controlled locations.

We at Spectro Cloud call this combined set of technologies VMO, for “Virtual Machine Orchestrator”. Spectro Cloud provides a reference architecture for VMO in different use cases, featuring solutions like Portworx, Prometheus/Grafana, and Cilium. 

For example, this is the architecture for the datacenter use case (similar to a VMware platform):

[Figure: architecture for the datacenter use case (similar to a VMware platform)]

You can find our full VMO Reference Architecture for bare metal Kubernetes here.

Adapting your architecture concepts

The stack we just described covers deploying the technology. But the way you run virtual machines on Kubernetes is also quite different from a traditional hypervisor platform. Let’s look at three aspects.

1. Mixing VMs and containers

The general idea behind KubeVirt is that you want to be able to sprinkle in VMs with containers, in a kind of ‘hybrid’ model, especially for those scenarios where converting the VM to a container is a waste of time vs the potential benefits. 

I always like the example of PostgreSQL for this. Yes, there is a container-based version of PostgreSQL available, for example here.

But due to the ephemeral nature of containers, you need an HA implementation of PostgreSQL to get the kind of uptime that you’d otherwise get from a basic PostgreSQL VM and vMotion.

So if your application depends on a PostgreSQL database (on a VM) today, do you really want to spend the effort to convert all that to an HA PostgreSQL container implementation, when you could just run the PostgreSQL VM alongside the rest of the (containerized) application? 

That is where KubeVirt really shines, and where running the VM in the namespace of the application makes perfect sense.
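As a minimal sketch, a VM like this is just another Kubernetes object living alongside the application’s pods. The namespace, names and the PVC holding the disk image below are hypothetical; it assumes KubeVirt is installed and the VM disk has already been imported into a PersistentVolumeClaim:

```yaml
apiVersion: kubevirt.io/v1
kind: VirtualMachine
metadata:
  name: postgres-vm            # hypothetical name
  namespace: my-app            # same namespace as the containerized app
spec:
  running: true
  template:
    metadata:
      labels:
        app: postgres-vm       # propagated to the VM's launcher pod
    spec:
      domain:
        cpu:
          cores: 2
        resources:
          requests:
            memory: 4Gi
        devices:
          disks:
            - name: rootdisk
              disk:
                bus: virtio
          interfaces:
            - name: default
              masquerade: {}   # attach to the pod network
      networks:
        - name: default
          pod: {}
      volumes:
        - name: rootdisk
          persistentVolumeClaim:
            claimName: postgres-vm-disk   # assumed to exist already
```

Applying this with `kubectl apply -f` gives you a running VM that the rest of the application can reach like any other workload in the namespace.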

Example of a hybrid application, where the VM provides database services to the app


2. Namespace planning

In a traditional hypervisor platform like VMware, your VMs run in an essentially flat hierarchy. Yes, there are folders, resource pools and permissions to lock down access, but given enough permissions you can move a VM anywhere inside the platform.

Conversely, in a Kubernetes cluster, each VM lives in a namespace. A namespace is a hard boundary in Kubernetes: you cannot move objects from one namespace to another. You can clone a VM into another namespace, but as you can imagine that can be an expensive (and offline) operation. 

Since VMs are essentially locked in their namespaces, you don’t want to put VMs in namespaces based on structures that tend to change over time, such as departments, teams or projects. 

While namespaces offer a quick and easy way to create security boundaries, it might be better for generic use to just have a single namespace called “virtual-machines” that houses all VMs.

You can still use Kubernetes RBAC rules to lock down access to individual VMs within the same namespace, but by default this per-VM approach is a bit unwieldy in Kubernetes RBAC. 

We at Spectro Cloud are working on dynamic RBAC capabilities for VMs that make this approach easier to live with.
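To illustrate why per-VM RBAC is unwieldy, here is a sketch of what locking one user down to a single VM looks like today, using `resourceNames` (the VM name, namespace and role name are hypothetical; the `subresources.kubevirt.io` group is what KubeVirt exposes for start/stop/restart actions at the time of writing):

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: postgres-vm-operator        # hypothetical role name
  namespace: virtual-machines
rules:
  # View and edit only this one VM
  - apiGroups: ["kubevirt.io"]
    resources: ["virtualmachines"]
    resourceNames: ["postgres-vm"]  # must be repeated for every VM
    verbs: ["get", "update", "patch"]
  # Allow power operations on that same VM
  - apiGroups: ["subresources.kubevirt.io"]
    resources: ["virtualmachines/start", "virtualmachines/stop", "virtualmachines/restart"]
    resourceNames: ["postgres-vm"]
    verbs: ["update"]
```

Every new VM needs its name appended to rules like these (and `resourceNames` cannot be used for `create` or `list`), which is exactly the kind of maintenance burden dynamic RBAC aims to remove.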

3. Networking considerations

Another thing to consider is what network connectivity model to run the VM in. KubeVirt does not have the concept of a virtual switch. Instead, you can either:

  • Run the VM on the pod network in the cluster, sharing the same network construct as the rest of the containers.
  • Connect the VM to a layer 2 network (VLAN). This is a simpler paradigm, broadly comparable to vSphere port groups, but without additional options like enabling/disabling promiscuous mode or forged transmits, or tapping a virtual port for traffic monitoring.

Each option has its own pros and cons. For the PostgreSQL VM example mentioned earlier, the pod network option is perfectly fine:

  • We don’t need the VM to sit on a layer 2 network, because only a specific application inside the cluster will use it.
  • We only need to publish a small number of TCP/UDP ports from that VM to the rest of the cluster. We can use a Kubernetes-native Service resource for this.
  • If the VM gets live migrated to another node, the Service resource ensures it stays reachable on the same address, but the connection will momentarily break as connections are not maintained during live migration for VMs running on the pod network. However, the frontend application will simply reconnect to the database after live migration.

If you have a VM providing a different function that needs to maintain connections during live migrations, then you must connect it to a layer 2 network (VLAN) instead. 

This option is also great for moving a VM from vSphere to KubeVirt, as you can keep the same IP address and not impact any existing applications that depend on it. This is the most popular approach for teams looking to move workloads away from VMware onto a more modern platform, while avoiding the need to perform significant reconfiguration due to things like IP address changes.
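Layer 2 attachment is typically done through Multus with a secondary network definition. As a sketch, assuming Multus is installed and a Linux bridge named `br-vlan100` has been preconfigured on each node (all names are hypothetical), the network is defined once:

```yaml
apiVersion: k8s.cni.cncf.io/v1
kind: NetworkAttachmentDefinition
metadata:
  name: vlan100
  namespace: virtual-machines
spec:
  config: |
    {
      "cniVersion": "0.3.1",
      "name": "vlan100",
      "type": "bridge",
      "bridge": "br-vlan100",
      "ipam": {}
    }
```

Each VM then references it with a bridge interface instead of (or alongside) the pod network, inside `spec.template.spec`:

```yaml
      domain:
        devices:
          interfaces:
            - name: vlan100
              bridge: {}       # connections survive live migration
      networks:
        - name: vlan100
          multus:
            networkName: vlan100
```

With empty IPAM, the guest keeps managing its own address (static or DHCP on the VLAN), which is what lets a migrated vSphere VM keep its original IP.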

Production-ready KubeVirt? It’s possible, and we can help

So here’s the bottom line:

  • Jumping from VMware to another hypervisor is a costly migration that functionally gains you very little.
  • Don’t expect to get off VMware in just a matter of months. You can’t escape your next renewal.
  • Any migration will require significant investment. Don't believe anyone who offers you an instant lift and shift.
  • Now is the time to adopt a cloud-native platform that can do both Kubernetes and manage virtual machines — like KubeVirt. But know that KubeVirt may not be right for all workloads, so while you can reduce your VMware bill, you may not eliminate it entirely.
  • When adopting KubeVirt, you’ll need a full stack to make it production ready, and there are important architectural considerations you have to plan for.

The guidance above is a good starting point for really getting serious about taking KubeVirt into production as your next modern hypervisor platform. Of course there are plenty of other things you may want to explore as well, and we are here to help. 

We have a load of webinars, videos, blogs and documentation covering KubeVirt, including our reference architecture. 

If you’re interested in discovering our VMO solution, based on KubeVirt, book a 1:1 demo and we can give you some tailored guidance.

Of course, you can always reach out to me or hop on our community Slack if you have any questions.

Tags:
Open Source
Cloud
Migrations
Networking