The most popular networking plugin for Kubernetes
Cilium is an extremely popular open-source Kubernetes network plugin. To give you an idea of just how popular it is, have a look at the diagram below, which compares the number of GitHub stars it has to other popular network plugins. Cilium is now also a CNCF Incubating project.
Cilium provides networking, security, and observability for cloud-native environments such as Kubernetes and other orchestration platforms. It can also connect external workloads (such as VMs) to your Kubernetes cluster and enforce security policies to restrict their access (currently in beta). Another new (beta) feature is Cilium Cluster Mesh, which lets you connect Kubernetes clusters together for pod-to-pod connectivity, define global services that are load balanced between clusters, and enforce security policies across them.
eBPF: the secret ingredient behind Cilium’s success
At the heart of Cilium’s powerful feature set is a technology called eBPF. What is eBPF and why are we hearing more and more about this magical tech? To find out, we have to first talk about BPF.
The Berkeley Packet Filter, or BPF for short (and no, I did not forget an ‘e’), has been around since the early 1990s and is a mechanism for observing the Linux operating system. BPF gives users access to the kernel by running small programs quickly and safely inside the operating system.
While BPF was originally used for packet filtering, it has since been enhanced to support dynamic tracing of the Linux kernel. A user can now, for example, attach a cgroup-related program that denies or allows a group of processes access to system resources like CPU, memory, and network bandwidth.
Extended BPF (eBPF) is a modernized version of BPF, with 64-bit registers and a set of new helper functions for interacting with the kernel, requesting more information, and executing a wider range of tasks. Just like Docker made Linux containers cool, Isovalent (the creators of Cilium) and other vendors are making eBPF the next big thing in Kubernetes tools and technologies.
Thanks to eBPF, Cilium has a very simple architecture.
Cilium runs a ‘cilium’ agent on every node in the cluster as a DaemonSet, plus a ‘cilium-operator’ Deployment with a single replica. That’s about it.
These resources provide networking, security and observability to the workloads running on the nodes. These workloads don’t even have to be containerized, but could just be natively running on the node.
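Once Cilium is installed (we'll do exactly that below), those two components are easy to spot in the kube-system namespace. A quick way to list them:

```bash
# The agent runs as the 'cilium' DaemonSet and the operator as the
# 'cilium-operator' Deployment, both in kube-system by default.
kubectl -n kube-system get daemonsets,deployments | grep cilium
```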
Now that you know what makes Cilium special, let’s dive right into using it!
I’ll show you how to set up Cilium on K3s (a lightweight Kubernetes distribution) in a second, but there are a few things you might need to configure first to get the most out of Cilium.
First, check your Linux kernel version
eBPF is a fast-moving target: developers around the world are constantly improving the Linux kernel and eBPF. For that reason I would recommend updating your Linux kernel to get all of the new features. The minimum kernel version you need is 4.9.17, but the Cilium documentation has a list of features that only become available from specific kernel versions onward.
For more advanced deployment options, I would definitely recommend going through the corresponding requirements. However, in most cases you’ll be using the Cilium container image ‘cilium/cilium’, deployed in a new or existing Kubernetes cluster with its own etcd, so most dependencies will be managed by the container and orchestrator.
You can check your kernel version by running uname -sr. If you see something like Linux 5.18.8-051808-generic, then you are on kernel version 5.18.8. You probably won’t be though, because most distros don’t ship the latest kernel; they choose a stable version and build on top of that. Let’s say you run Ubuntu 20.04.4 LTS: your kernel will most likely be Linux 5.13.0-51-generic, and that’s perfectly suitable for running all the latest features Cilium has to offer. How to upgrade your kernel depends on your Linux distro and on whether you’re brave enough to do so. In all fairness, it really isn’t that hard and you can almost always roll back, but I don’t want anyone blaming me for their failed kernel upgrade!
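For reference, this is what the check looks like on a VM that has been upgraded to 5.18.8:

```bash
$ uname -sr
Linux 5.18.8-051808-generic
```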
Spin up your VMs and install K3s
For my test environment I’ll be using the following setup:
- 3 VMs with Ubuntu Server 18.04.6 LTS, with the kernel upgraded to 5.18.8
- K3s running as a single server, with two agents
To follow along, spin up three VMs (or use bare metal, of course). Then, on the VM that will become the K3s server, create the /etc/rancher/k3s/ folder and a config.yaml file inside it:
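```bash
# Create the K3s config directory and open the config file in an editor
sudo mkdir -p /etc/rancher/k3s/
sudo nano /etc/rancher/k3s/config.yaml   # use whichever editor you prefer
```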
Add the following content:
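```yaml
# /etc/rancher/k3s/config.yaml -- a minimal sketch of the settings described
# below; pick your own token value
token: "my-super-secret-token"
flannel-backend: "none"
disable-network-policy: true
disable-kube-proxy: true
disable:
  - servicelb
  - traefik
```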
Don’t forget to change the token value if you’d like something more secure than what I wrote down. Now save and close the file. You can find most of these settings in the Cilium on K3s installation docs: https://docs.cilium.io/en/v1.9/gettingstarted/k3s/. However, I’ve also added the disable-kube-proxy: true setting, because I want Cilium to take care of all of the networking components; there’s no point in keeping the slower kube-proxy service around. The other settings disable the built-in service load balancer, flannel, the ingress controller (Traefik) and the network policy controller. All of these tasks will be taken over by Cilium.
Install K3s
Now let’s start K3s:
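```bash
# Install and start K3s as a server (control plane) on the first VM;
# it picks up the config.yaml we just created
curl -sfL https://get.k3s.io | sh -
```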
Make sure the last line of the installer output is ‘[INFO] systemd: Starting k3s’, that no errors follow it, and that the terminal isn’t stuck waiting for K3s to start. You can also check whether the service is running by executing the following command:
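```bash
# Should report the k3s service as active (running)
sudo systemctl status k3s
```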
Once the server has been deployed, deploy the agents. Go to the other two VMs and run the following command:
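```bash
# Join the node as a K3s agent; replace the placeholders with your server's
# IP address and the token from config.yaml
curl -sfL https://get.k3s.io | K3S_URL=https://<server-ip>:6443 K3S_TOKEN=<your-token> sh -
```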
Of course, make sure to change the token if you picked a different one in the previous step, and fill in the IP address of the K3s server. You can then go back to the machine running the K3s server and check whether the nodes are coming online.
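```bash
# All three nodes should appear; they will report NotReady until a network
# plugin is installed
sudo kubectl get nodes
```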
And check the running pods.
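```bash
sudo kubectl get pods --all-namespaces
```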
Don’t worry about the pods stuck in ‘Pending’ or perhaps ‘ContainerCreating’; that’s a feature, not a bug. We’ve removed the default Flannel network plugin, so the pods have no way of communicating with each other or the outside world. That’s what Cilium is for.
Install Cilium via its CLI
There are multiple ways of installing Cilium: with a Helm chart, with the Cilium CLI, or with quick-install scripts. I’ve found the Cilium CLI to be excellent, so I would definitely recommend using it where possible. It handles all of the resource creation, roles, service accounts, certificates, ConfigMap overrides and much more. Better yet, if you break anything and need to clean up, you won’t have to fiddle around and risk missing something: the CLI also cleans everything back up, so you can redeploy without any issues.
Before we install the Cilium CLI, let’s prep the nodes by mounting the eBPF filesystem on all three of them. After this step we will mostly be working from the first K3s server node, as that node has the kubectl client that was installed by K3s.
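```bash
# Run this on each of the three nodes to mount the eBPF filesystem
sudo mount bpffs /sys/fs/bpf -t bpf
```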
Run the following commands to install Cilium CLI:
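```bash
# This follows the official Cilium CLI install snippet at the time of writing;
# if the stable.txt lookup ever moves, grab the latest release from the
# cilium-cli GitHub releases page instead (adjust CLI_ARCH for arm64).
CILIUM_CLI_VERSION=$(curl -s https://raw.githubusercontent.com/cilium/cilium-cli/main/stable.txt)
CLI_ARCH=amd64
curl -L --fail --remote-name-all https://github.com/cilium/cilium-cli/releases/download/${CILIUM_CLI_VERSION}/cilium-linux-${CLI_ARCH}.tar.gz{,.sha256sum}
sha256sum --check cilium-linux-${CLI_ARCH}.tar.gz.sha256sum
sudo tar xzvfC cilium-linux-${CLI_ARCH}.tar.gz /usr/local/bin
rm cilium-linux-${CLI_ARCH}.tar.gz{,.sha256sum}
```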
Before we can deploy Cilium, we need to make sure that the cluster is known to the CLI. We can do that in a couple of ways, one of which is to set the KUBECONFIG variable and point it to the location where K3s generates the kubeconfig file.
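```bash
# K3s writes its kubeconfig to /etc/rancher/k3s/k3s.yaml (it's root-owned,
# so use sudo or adjust permissions if your user can't read it)
export KUBECONFIG=/etc/rancher/k3s/k3s.yaml
```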
We can test if the kubeconfig is correct by running:
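```bash
# One way to check: the CLI reports the Cilium version it would deploy and
# whether it can find any running Cilium pods in the cluster
cilium version
```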
We can see that the Cilium version the CLI will deploy is v1.11.6 and that it can’t find any Cilium pods running in the cluster. This means that our kubeconfig has been set correctly and we can proceed with the installation.
Deploy Cilium into the cluster
With the Cilium CLI installed, we can finally deploy Cilium as our network plugin, into our K3s cluster. To install Cilium, just run:
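```bash
# The CLI inspects the cluster and picks sensible defaults for it
cilium install
```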
It can take a minute for all three cilium pods to come online, so just hang in there. You should eventually get a similar output.
As the last line recommends, let’s run:
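```bash
cilium status --wait
```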
You should see something similar to the screenshot below. I’ve added the --wait flag so the status is only shown once all of the pods are up and running.
Now if we check our pods, we should see that everything is up and running and that there are no ‘Pending’ pods.
Enable Hubble for observability
In order for us to observe what’s happening in our cluster, we will enable Hubble. Hubble is built on top of Cilium and eBPF and is a fully distributed networking and security observability platform. It lets us see which services are communicating with each other, what HTTP calls are being made, whether any packets are being dropped, and much more. Check out the docs for more info: https://docs.cilium.io/en/v1.11/intro/#what-is-hubble
All we need to do to enable Hubble is run: cilium hubble enable
Now let’s also enable the UI, with:
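```bash
cilium hubble enable --ui
```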
All of the necessary resources have now been installed, but we can’t reach the Hubble UI yet without setting up a port-forward. Luckily, the Cilium CLI also manages that for you! All you need to do is run:
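```bash
# Sets up a port-forward to the Hubble UI service and opens
# http://localhost:12000 in your browser
cilium hubble ui
```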
Of course this won’t work if you aren’t running K3s on your local machine, so replace ‘localhost’ with the IP address of the VM you’re running this command from (if a VM is what you’re using). For me, that looks something like the image below.
Deploy a simple Nginx service
You’re now pretty much set up to actually start messing around with Cilium! Let’s deploy a simple nginx example with a LoadBalancer service and see what happens. Create a file for the Deployment (I’ll call it nginx-deployment.yaml), add the contents below and apply it:
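```yaml
# nginx-deployment.yaml -- a minimal example Deployment (the filename and
# labels are just illustrative; the Service below selects on app: my-nginx)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-nginx
spec:
  replicas: 2
  selector:
    matchLabels:
      app: my-nginx
  template:
    metadata:
      labels:
        app: my-nginx
    spec:
      containers:
        - name: nginx
          image: nginx
          ports:
            - containerPort: 80
```

```bash
sudo kubectl apply -f nginx-deployment.yaml
```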
Now let’s create a Service of type LoadBalancer and see what happens. Go ahead and create a second file (I’ll call it nginx-service.yaml), add the contents below and apply it:
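```yaml
# nginx-service.yaml -- LoadBalancer Service in front of the Deployment
# above (example filename)
apiVersion: v1
kind: Service
metadata:
  name: my-nginx
spec:
  type: LoadBalancer
  selector:
    app: my-nginx
  ports:
    - port: 80
      targetPort: 80
```

```bash
sudo kubectl apply -f nginx-service.yaml
```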
Let’s check if a LoadBalancer service has actually been created.
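```bash
# Check the TYPE and EXTERNAL-IP columns
sudo kubectl get service my-nginx
```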
As you can see, the EXTERNAL-IP field has been populated and we can reach our service both on the NodePort and on the external IP at port 80.
Of course we don’t have MetalLB or kube-vip deployed, so the actual external IP allocation is done by K3s and its built-in Klipper-LB, but Cilium does have a beta BGP implementation, which uses an integrated MetalLB: https://docs.cilium.io/en/v1.11/gettingstarted/bgp/#bgp
In Hubble we can see the communication happening for the my-nginx service.
Since that’s not all that spectacular, I’ll run the cilium connectivity test command, just for a bit more wow factor. This will spin up a bunch of resources and test connectivity between them, to the outside world, and with different policies applied. I’d recommend running it yourself as well, but don’t count on everything passing in one go.
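```bash
cilium connectivity test
```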
There we go! That looks a lot more fun! 😁 My advice is to check out the output and see what’s being tested. You can also find the test resources on GitHub: https://github.com/cilium/cilium/tree/master/examples/kubernetes/connectivity-check
Final thoughts
We’ve barely scratched the surface of what we can do with Cilium, but now you have a bit of context and a small lab environment (at least, I hope you do) where you can test as much as you’d like.
If you would like to create a test environment without all of the hassle, I’d recommend trying out Spectro Cloud Palette. Palette allows you to simply select Cilium as your network plugin, while you define your Cluster Profiles and deploy your clusters on any infrastructure. This way you can create a blueprint (a Cluster Profile) that has Cilium as the default network plugin for any cluster you deploy. You can try Palette absolutely free.
As you have seen, we can use Cilium to replace many crucial networking components and make them faster, easier to observe, and more secure. No more kube-proxy or iptables! That by itself gets a gold star from me. ⭐️