The dance of open source and enterprise
Time and again, open source projects initiate major shifts in how all industries use technology.
From the Linux operating system to hypervisors, cloud technologies, infrastructure-as-code, Git, containers, microservices, and now container orchestrators like Kubernetes, these projects have drastically changed the landscape of infrastructure management.
Across all these changes, one thing remains constant: the need for enterprise-grade solutions to support mission-critical use of these technologies.
In this blog we examine the challenges that Kubernetes introduces for organizations that adopt it, and show why a Kubernetes Management Platform is the natural solution.
Enterprise technologies always spawn management platforms!
They say that history repeats itself. And in the case of open source infrastructure, it’s certainly true.
Chapter 1: Linux
Let’s start with Linux. Decades back it became wildly popular, leading to explosive adoption amongst developers who could use an operating system for free, contribute back improvements, or even create their own versions (distributions).
Linux was great as long as it was being used in academic or R&D settings, but when IT pros started using it to support financial applications, government systems, healthcare, transportation, and other ‘serious’ use cases, a new set of requirements emerged.
Organizations needed to be able to ensure that Linux was safe and secure. Things like role-based access controls, centralized audit, security, support, and management at scale became a top priority.
A couple of guys got together and a little company called Red Hat was formed to address the main concerns for businesses using Linux in support of their critical operations. Red Hat ultimately created a number of different products over the years (including Red Hat OpenShift), but its greatest contribution was pioneering the idea of a “Linux management platform”.
In 2019 Red Hat was acquired by IBM for ~$34 billion.
Chapter 2: Git
The Linux story was not an isolated incident. The Git project also took off with developers, allowing them to collaborate on their code.
Again this was an open source tool, freely available for anyone to use. But when it started to show up in large corporations, all of the same requirements that applied to Linux applied again.
And wouldn’t you know it, another group of guys got together and formed a little company called GitHub.
GitHub built a commercial platform around the Git tool, in effect a “Git management platform”, that addressed things like role-based access controls, centralized audit, security, support, and management at scale. Spot the pattern yet?
In 2018 GitHub was acquired by Microsoft for ~$7.5 billion.
Chapter 3: Cloud management, IaC and Terraform
After AWS kickstarted cloud computing in 2006, a whole ecosystem emerged, including open source projects like OpenStack and a host of commercial public cloud players including Microsoft Azure, Google Cloud and Oracle Cloud.
As enterprises grew their multi-cloud and hybrid cloud consumption, it became clear that they needed cloud management platform technologies.
Enter a couple of guys who set up HashiCorp and released Terraform, Vault and other open source projects to address this market. From there came commercial offerings, a “cloud management platform” addressing, yep, you guessed it, things like role-based access controls, centralized audit, security, support, and management at scale.
In April 2024, IBM announced plans to acquire HashiCorp for ~$6.4 billion.
What do Linux, Git, and HashiCorp have to do with Kubernetes management?
So what does all of this have to do with Kubernetes? By now I hope you’re seeing the common thread.
While enterprises love the innovation that open source brings, they care about a lot of other things too.
- They generally prefer to buy vs. build in order to manage their headcount budgets.
- They look for SLAs, support and professional services.
- They care a lot about things like role-based access controls, centralized audit, security, resiliency and scalability.
Now that Kubernetes is here to stay, how will companies address the fundamental concerns with operating this technology?
Kubernetes can run anywhere, from laptops, to public clouds, to private data centers, bare metal servers, and edge deployments. Asserting control and compliance over Kubernetes clusters is a very challenging task.
The answer is a Kubernetes Management Platform (KMP).
What is a Kubernetes Management Platform?
A KMP is a centralized software system that provides capabilities like, you guessed it, role-based access controls, centralized audit, security, support, and management at scale.
You might have heard of Kubernetes platforms by a variety of different names:
- Enterprise Kubernetes management platform
- Kubernetes fleet management
- Edge management and orchestration (EMO), a Gartner term specific to edge computing
- Cluster lifecycle management tool
You might also want to watch out for things that are definitely not the same as a KMP:
- Managed Kubernetes — which often refers to a managed service model rather than a technology platform
- Infrastructure as Code — tools like Terraform that compose infrastructure with a much wider remit than Kubernetes
- Deployment tools like Kubespray that just provision Kubernetes clusters
- Kubernetes distributions like Rancher RKE, which on their own don’t offer multi-cluster management
- Specialized tools, e.g. SIEM, observability and cost management tools, which don’t cover generalized cluster lifecycle management
A KMP essentially gives platform engineers and ops teams a place to go to deploy and manage Kubernetes clusters. Most of them support multiple clusters, and they provide a degree of standardization and automation of operational tasks. They may be open source or commercial products.
But not all KMPs are created equal, and we believe there are several important requirements that you should look for, responding to the evolving challenges you face with running Kubernetes environments in production in the modern enterprise.
1. Full-stack management
A lot goes into a production Kubernetes cluster, from the underlying OS (operating system), Kubernetes distribution, CNI (container network interface), CSI (container storage interface) to the virtual machine (VM) and containerized apps on top. In between? Ingress, monitoring, logging, security, load balancing, and much more.
A Kubernetes Management Platform is there to provision and manage the lifecycle of each cluster; if it’s not provisioning all of these different layers, it’s only doing half the job. A KMP must be ‘aware’ of all of these full-stack elements and able to document them in its record of the cluster’s declarative desired state.
What’s more, different kinds of clusters will have different things in them: an edge cluster, for example, may run a lightweight Kubernetes distribution quite different from what you’d use in the public cloud. Each organization will have its own preference for what goes into its Kubernetes stacks.
All of which means that a Kubernetes management platform has to be more than just full-stack, it has to be open, too. In other words, it should not be ‘opinionated’, or restrict your choice of what it deploys.
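To make that concrete, here is a minimal sketch of what a full-stack cluster record might look like as declarative desired state. The structure and field names are purely illustrative, not any particular product’s schema:

```python
# Illustrative only: a full-stack cluster "profile" recorded as
# declarative desired state. Every layer is captured, from OS to addons.
cluster_profile = {
    "name": "edge-store-042",
    "target": "edge",  # could equally be eks, aks, gke, vsphere, bare-metal
    "layers": {
        "os":         {"image": "ubuntu", "version": "22.04"},
        "kubernetes": {"distro": "k3s", "version": "1.29.6"},
        "cni":        {"name": "cilium", "version": "1.15.3"},
        "csi":        {"name": "longhorn", "version": "1.6.1"},
    },
    "addons": [
        {"name": "ingress-nginx", "version": "4.10.0"},          # ingress
        {"name": "kube-prometheus-stack", "version": "58.1.0"},  # monitoring
        {"name": "falco", "version": "4.2.0"},                   # runtime security
    ],
}
```

Because the record is declarative, the platform can compare it against what is actually running and reconcile any differences.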
2. Multi-cluster, multi-environment support
Nowadays it’s absolutely table-stakes for a KMP to have multi-cluster support. The average enterprise running Kubernetes in production has more than 20 clusters, and logging in to each one individually to make changes is ridiculously inefficient.
But even that is not enough. Kubernetes can run anywhere, and indeed that is one of the main reasons why people use it. Our research found that around half of enterprises run Kubernetes clusters in four or more different environments!
From the managed Kubernetes services in public clouds (EKS, AKS, GKE, etc.), to private datacenter Kubernetes solutions, you ideally want one tool, the proverbial ‘single pane of glass’. Otherwise you’re duplicating work (potentially inserting human errors) defining policies and monitoring clusters across each environment.
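To illustrate the single-pane-of-glass idea, here is a minimal sketch using the official Kubernetes Python client: it inventories every cluster registered in your kubeconfig from one place, whichever environment each cluster runs in. A real KMP goes much further, but the principle is the same:

```python
# Minimal sketch: report the Kubernetes version of every cluster
# (kubeconfig context) from a single place.
from kubernetes import client, config

contexts, _active = config.list_kube_config_contexts()
for ctx in contexts:
    name = ctx["name"]
    # Build an API client scoped to this cluster's context
    api_client = config.new_client_from_config(context=name)
    version = client.VersionApi(api_client).get_code()
    print(f"{name}: Kubernetes {version.git_version}")
```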
The KMP becomes your system of record, with the ability to create and manage Kubernetes clusters across all these locations. Every operating target has its own nuances and tooling; consider the differences between public clouds and private data center technologies.
A KMP should introduce workflows that address the unique requirements of each operating target without forcing you to maintain separate workstreams and toolchains per target.
When Kubernetes is used across multiple operating targets, such as multiple public cloud providers, private datacenter hypervisors, bare metal servers, or edge locations, Kubernetes cluster sprawl can happen very quickly.
If each operating environment has its own workflows for deploying and managing the lifecycle of Kubernetes clusters, it becomes hard to scale operations and address things like common vulnerabilities and exposures (CVEs), or even tasks as basic as upgrading a Kubernetes distribution from one version to the next.
Now extend the problem further up the stack to logging, monitoring, and security tools, and you can see how managing the lifecycle of these tools compounds the complexity if each operating target has separate workflows or tooling in place.
And without a centralized view into all Kubernetes clusters, it becomes very difficult to audit whether best practices and enterprise standards are being followed and enforced.
Some operating environments, particularly regulated ones, bring special compliance considerations.
These requirements can take the form of the software adhering to standards and frameworks such as SOC 2, ISO 27001, and FIPS, to name a few. There may also be requirements that a software solution run in a privately hosted deployment, or air-gapped from any communication with the Internet.
These compliance requirements have to be taken into consideration when exploring KMP options, based on your industry’s regulatory needs.
3. True lifecycle management
Without the ability to provision and act on clusters, your KMP is just a monitoring tool or dashboard. And without the ability to actually automate tasks, it’s never going to solve your management challenges around team skills and resources.
A KMP should repeatedly and consistently apply your ‘desired state’ to all of your clusters, and maintain that over time to avoid the dreaded ‘configuration drift’. Through its interfaces and policies it should provide standards and guidelines for how new clusters are configured.
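As a deliberately simplified sketch of what drift detection involves, the snippet below compares one small piece of desired state, a deployment’s replica count, against the live cluster using the official Kubernetes Python client. The namespaces and deployment names are illustrative; a real KMP does this continuously, across the whole stack:

```python
# Simplified drift check: compare desired replica counts (from your
# declarative record) against the live state of the cluster.
from kubernetes import client, config

# Desired state as recorded by the management platform (illustrative)
desired = {
    ("ingress", "ingress-nginx-controller"): 3,
    ("monitoring", "prometheus-server"): 2,
}

config.load_kube_config()  # a real agent would use in-cluster config
apps = client.AppsV1Api()

for (namespace, name), want in desired.items():
    dep = apps.read_namespaced_deployment(name=name, namespace=namespace)
    have = dep.spec.replicas
    if have != want:
        print(f"DRIFT: {namespace}/{name} has {have} replicas, want {want}")
        # A reconciling platform would now patch the cluster back to desired state
```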
Different kinds of users may need access to the KMP, both in terms of role types (developers vs. platform engineers) and in terms of different business units or teams. To maintain governance, the KMP should provide granular role-based access controls to ensure a separation of duties when it comes to who can create or alter a Kubernetes cluster. For example, developers should never have access to modify things like security tools.
A KMP should provide workflows to ensure proper controls and standards are being enforced without creating barriers for the developers and consumers of the Kubernetes clusters.
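In Kubernetes terms, that separation of duties ultimately lands as RBAC objects on each cluster. Here is a minimal sketch (names and permissions are illustrative) of a namespaced Role that lets a development team manage its own workloads while granting nothing cluster-wide; a KMP would stamp out rules like this consistently across every cluster:

```python
# Sketch: a namespaced Role giving developers read/write access to
# workloads in their own namespace only (no cluster-wide or
# security-tool permissions).
from kubernetes import client, config

config.load_kube_config()
rbac = client.RbacAuthorizationV1Api()

role = client.V1Role(
    metadata=client.V1ObjectMeta(name="dev-workloads", namespace="team-a"),
    rules=[
        client.V1PolicyRule(
            api_groups=["", "apps"],
            resources=["pods", "deployments", "services", "configmaps"],
            verbs=["get", "list", "watch", "create", "update", "patch", "delete"],
        )
    ],
)
rbac.create_namespaced_role(namespace="team-a", body=role)
```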
Lifecycle management also means streamlining routine tasks that consume platform engineer time: patching, upgrades, certificate management, configuration and deployment scaling changes, monitoring and auditing, health and security scans — ideally across multiple clusters in parallel, with an automated workflow.
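As a rough sketch of that fan-out pattern, the snippet below runs one routine audit task, checking node readiness, across every cluster in the kubeconfig in parallel; the same structure applies to patching or upgrade workflows:

```python
# Sketch: run a routine health check across all clusters in parallel.
from concurrent.futures import ThreadPoolExecutor
from kubernetes import client, config

def check_nodes(context_name):
    api_client = config.new_client_from_config(context=context_name)
    nodes = client.CoreV1Api(api_client).list_node().items
    not_ready = [
        n.metadata.name for n in nodes
        if not any(c.type == "Ready" and c.status == "True"
                   for c in (n.status.conditions or []))
    ]
    return context_name, not_ready

contexts, _ = config.list_kube_config_contexts()
with ThreadPoolExecutor(max_workers=8) as pool:
    for ctx_name, bad in pool.map(check_nodes, [c["name"] for c in contexts]):
        print(f"{ctx_name}: {'OK' if not bad else 'NotReady: ' + ', '.join(bad)}")
```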
And let’s remember that we’re talking about integrating a new platform into an existing enterprise IT landscape. Any KMP should integrate well with IaC tools like Terraform or Crossplane, plus GitOps and CI/CD tools. It should be API-driven and offer experts a powerful CLI as well as an intuitive UI, without compromising functionality.
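To give a flavor of what API-driven means in practice, here is a hypothetical sketch of requesting a new cluster through a KMP’s REST API, say from a CI/CD pipeline. The endpoint, token, and payload fields are all invented for illustration and don’t correspond to any particular product:

```python
# Hypothetical sketch: drive cluster creation through a KMP's REST API,
# e.g. from a CI/CD pipeline. Endpoint and fields are invented.
import requests

KMP_API = "https://kmp.example.com/v1"             # hypothetical endpoint
HEADERS = {"Authorization": "Bearer <api-token>"}  # placeholder token

spec = {
    "name": "staging-eu-1",
    "profile": "standard-full-stack",  # the full-stack profile to apply
    "cloud": {"provider": "aws", "region": "eu-west-1"},
    "nodePools": [{"name": "workers", "size": 3, "instanceType": "m5.large"}],
}

resp = requests.post(f"{KMP_API}/clusters", json=spec, headers=HEADERS)
resp.raise_for_status()
print("Cluster requested:", resp.json().get("id"))
```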
4. Resilience, flexibility and performance at scale
The greatest challenge of all when running Kubernetes is scale. The more Kubernetes clusters and the more different operating targets a company must support, the more difficult the task becomes.
For instance, a company may start out hosting Kubernetes clusters in a single cloud provider, then find itself becoming a multi-cloud user from one day to the next due to an acquisition or a cloud provider diversification initiative.
The requirement to run Kubernetes at edge locations, or in locations other than public clouds and traditional datacenters, can emerge just as quickly, leaving a company paralyzed in its ability to deploy and manage hundreds or thousands of Kubernetes clusters.
A KMP should help with scale, not hinder it. Yet some platforms that we’ve evaluated simply choke when you get to more than a few dozen clusters, or require you to stand up multiple instances of the management solution.
In the real world of enterprise IT, network failures and other outages are a fact of life. What happens to your Kubernetes clusters if they can’t access the central management platform — do they continue to function? Relying on clusters to be tightly tethered to a central ‘brain’ is a risky architectural approach.
But I use a cloud managed K8s service — do I need a KMP?
You may be thinking: I use a cloud managed Kubernetes service like Amazon EKS or Google Kubernetes Engine, secured and supported by a hyperscaler. Why would I need a KMP?
Managed Kubernetes is a great example of why a KMP is so critical. Public clouds do build a lot of security into their services, but they operate under a “shared responsibility model”: much of the burden of security and support still falls on the Kubernetes consumer.
A freshly provisioned EKS cluster is essentially an empty Kubernetes cluster, not ready for production use. No enterprise would run production workloads on an EKS cluster provisioned with only the default configuration that the EKS service offers.
It is up to the AWS customer to harden their EKS clusters, orchestrate the installation and configuration of logging, monitoring, networking, security, middleware, and even GitOps or CI/CD tools for developers to be able to deploy their software into the EKS cluster.
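As a rough sketch of what that bootstrap work involves, the snippet below installs just three of the many addons a fresh EKS cluster needs, via Helm. The charts shown are common community choices, not a prescribed stack; now imagine maintaining this, by hand, for every cluster in your fleet:

```python
# Sketch: the kind of addon bootstrapping every fresh EKS cluster needs.
# Assumes your kubectl context points at the new cluster and the relevant
# Helm chart repositories have already been added.
import subprocess

addons = [
    ("ingress-nginx", "ingress-nginx/ingress-nginx", "ingress"),
    ("kube-prometheus-stack", "prometheus-community/kube-prometheus-stack", "monitoring"),
    ("falco", "falcosecurity/falco", "security"),
]

for release, chart, namespace in addons:
    subprocess.run(
        ["helm", "upgrade", "--install", release, chart,
         "--namespace", namespace, "--create-namespace"],
        check=True,
    )
```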
It’s also worth mentioning that the AWS EKS management plane is great for one cluster, but very time-consuming and challenging to replicate across multiple clusters. This is where a KMP can help customers accelerate scaling their AWS EKS usage as they grow their business.
AWS support will only address the cluster’s Kubernetes control plane nodes, key components of the EKS service, and cloud region health, leaving EKS consumers to build their own support teams of Kubernetes experts.
This leads to even more challenges, as Kubernetes talent is hard to come by or has to be developed in-house. And as most of us have experienced, once engineers get good at cloud technologies and Kubernetes, they become hard to retain.
A KMP should address the security and support issues that come along with Kubernetes operations.
But I use a managed service — do I need a KMP?
You may not, but your managed service provider certainly does. They too will need to efficiently manage multiple clusters, controlling secure access and segregating the clusters owned by different customer tenants.
The more efficiently and consistently they can manage your clusters, the more responsive they can be to your changing needs, and the safer your workloads will be. They may also be able to deliver service at a lower cost while maintaining standards, by automating and freeing up their experts from manual toil.
Do we agree that a KMP is a thing?
At this point you hopefully see the problem statement and some of the evidence in support of the need for a Kubernetes Management Platform. What now?
Do you build your own KMP or seek out a vendor-provided solution: the classic build vs. buy decision? Based on the history of companies like Red Hat, GitHub, and HashiCorp, I think it’s fair to say that most companies will look to buy a KMP over building their own.
In fact, this year was the tipping point where commercial management platforms became the most popular way for companies to manage their Kubernetes clusters.
Now brace yourself for a shameless plug. Why not take a look at the Spectro Cloud Palette Kubernetes Management Platform? It’s been recognized by the analysts at GigaOm as the leading platform from edge to cloud.