Building High-Availability Kubernetes Cluster with Ansible¶
Introduction¶
This blog series covers building a Kubernetes cluster with highly available control planes and an external etcd cluster, using Ansible.
I have been doing homelab scrap-and-build every once in a while, trying out something new each time. I now have a machine powerful enough to run as a hypervisor, and decided to do this homelab build with the following new challenges:
- Servers
  - More virtual machines
- Build and run process
  - More automated setup tasks using ansible
- Kubernetes cluster design
  - External etcd topology
  - Highly available control plane using haproxy and keepalived
  - Use cilium instead of calico as the network add-on for the kubernetes cluster
    - Replaces MetalLB for the load balancer and L2 advertisement features
    - Replaces NGINX Gateway Fabric as the Gateway API implementation
  - Use longhorn instead of minio directpv for volume provisioning
  - Test out percona everest to see if it can replace the database operators I'm currently using
What's covered in this series of blog posts¶
In this blog series I'll try to cover the following items:
- Preparing Ansible project to build the kubernetes cluster
- Setting up DNS servers
- Setting up etcd cluster
- Setting up kubernetes cluster
- Demo on using cilium features
- Demo on using longhorn features
The project content is available in a public git repository, though there are some modifications you will need to make before running the same ansible playbooks. I will insert the repository links wherever applicable.
Link to the repository: https://github.com/pkkudo/homelab-v3-k8s
Explaining the design¶
Let me explain the very first kubernetes cluster I built, and then what I am going to build this time.
Basic Kubernetes Cluster¶
The very first cluster was composed of three nodes: one control plane and two worker nodes. You prepare three machines, install and configure the prerequisites, initialize the cluster on the control plane node, and then have the worker nodes join the cluster.
---
title: basic kubernetes cluster
---
flowchart TD
subgraph kubernetes[Kubernetes Cluster]
subgraph worker[Worker Nodes]
worker1
worker2
end
subgraph control_plane[Control Plane Node]
cp1
end
end
Kubernetes Cluster to be built in this series¶
This time around, since I have more capacity to boot up servers as virtual machines, I am going to have an etcd cluster with three nodes and a kubernetes cluster with three control plane nodes and one worker node.
---
title: kubernetes cluster with external etcd cluster
---
flowchart LR
subgraph etcd_cluster[External etcd Cluster]
etcd1
etcd2
etcd3
end
subgraph kubernetes[Kubernetes Cluster]
subgraph control_plane[Control Plane Nodes]
cp1
cp2
cp3
end
subgraph worker[Worker Nodes]
worker1
end
end
control_plane --- etcd_cluster
Each of the three control plane nodes runs keepalived to host a highly available virtual IP address (VIP), and haproxy to listen for kube-apiserver requests on the VIP and load-balance them across the control plane nodes.
Here is a simple diagram of keepalived. The instances talk to each other and elect one node to host the VIP, 192.0.2.8 in this case.
---
title: keepalived on each control plane node
---
flowchart TD
subgraph control_plane[control plane nodes]
subgraph lab-cp1[lab-cp1 192.0.2.1]
keepalived1[keepalived vip 192.0.2.8]
end
subgraph lab-cp2[lab-cp2 192.0.2.2]
keepalived2[keepalived vip 192.0.2.8]
end
subgraph lab-cp3[lab-cp3 192.0.2.3]
keepalived3[keepalived vip 192.0.2.8]
end
end
Here is a diagram of haproxy. Each instance is configured to listen on port 8443 and pass traffic on to any available kube-apiserver.
---
title: haproxy on each control plane node
---
flowchart TD
subgraph control_plane[control plane nodes]
subgraph lab-cp1[lab-cp1 192.0.2.1]
haproxy1[haproxy:8443] --- kubeapi1[kube-apiserver:6443]
end
subgraph lab-cp2[lab-cp2 192.0.2.2]
haproxy2[haproxy:8443] --- kubeapi2[kube-apiserver:6443]
end
subgraph lab-cp3[lab-cp3 192.0.2.3]
haproxy3[haproxy:8443] --- kubeapi3[kube-apiserver:6443]
end
end
haproxy1 --- kubeapi2
haproxy1 --- kubeapi3
haproxy2 --- kubeapi1
haproxy2 --- kubeapi3
haproxy3 --- kubeapi1
haproxy3 --- kubeapi2
Server List¶
Finally, here is the list of servers. The IP addresses listed here are for documentation purposes, and they are used consistently throughout the series as well as in the git repository.
The last two nodes, lab-ns1 and lab-ns2, will be the DNS servers.
| hostname | ipaddr | role | os | cpu | memory | disk | hypervisor |
| --- | --- | --- | --- | --- | --- | --- | --- |
| lab-cp1 | 192.0.2.1 | kubernetes control plane | debian | 4 | 4GB | 64GB | hyper-v |
| lab-cp2 | 192.0.2.2 | kubernetes control plane | rocky | 4 | 4GB | 64GB | proxmox |
| lab-cp3 | 192.0.2.3 | kubernetes control plane | ubuntu | 4 | 4GB | 64GB | proxmox |
| lab-worker1 | 192.0.2.4 | kubernetes worker node | debian | 4 | 4GB | 64GB | hyper-v |
| lab-etcd1 | 192.0.2.5 | etcd node | debian | 2 | 4GB | 64GB | hyper-v |
| lab-etcd2 | 192.0.2.6 | etcd node | debian | 2 | 4GB | 64GB | proxmox |
| lab-etcd3 | 192.0.2.7 | etcd node | oracle | 2 | 4GB | 64GB | proxmox |
| lab-ns1 | 192.0.2.16 | run dns server using docker | rhel | 1 | 2GB | 32GB | proxmox |
| lab-ns2 | 192.0.2.17 | run dns server using docker | debian | 1 | 1GB | 10GB | hyper-v |
Project Preparation¶
Let me start with the project directory preparation and an introduction to the following ansible components:
- ansible config
- ansible-galaxy collection
- ansible-vault
- ansible inventory and variables
Dedicated repository for this project¶
You can clone the public repository I prepared for this series to get started. I will cover what should be modified and what should be created anew on this project as we move on with the build.
Alternatively, you can start from scratch, even without a VCS.
Installing ansible on ansible master node¶
Ultimately, just install ansible whichever way you prefer; many different ways are described in the official documentation, the first of which is to use pipx.
I use mise to manage different versions of different programming languages on my machine. I place .mise.toml at the project root to specify which version of python to use, and then install poetry to manage the packages used in this project.
The getting started page shows different ways to install mise on different OS.
Here is what I did on debian to install mise.
# installation
curl https://mise.run | sh
# activation script in .bashrc as instructed by the mise installer
echo "eval \"\$(/home/$USER/.local/bin/mise activate bash)\"" >> ~/.bashrc
# reload
source .bashrc
# verify
mise doctor
Let's say the cloned repository "homelab-v3-k8s" is at ~/homelab-v3-k8s. Continue with the following steps to prepare python.
cd homelab-v3-k8s
# run "mise up" to install what's configured in the .mise.toml, [email protected] in this case
mise up
# mise warns about untrusted .mise.toml file
mise trust
# mise warns of about python venv activation
mise settings experimental=true
# mise warns that there is no virtual env created
python -m venv ~/homelab-v3-k8s/.venv
# .venv is listed in the .gitignore
I then install poetry following its official instructions.
The cloned repository already has the pyproject.toml file poetry can use to install the requirements.
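With poetry installed, pulling in the project dependencies from the existing pyproject.toml is a short sketch like this (assuming you are at the project root with the venv created above):
# install the python dependencies (ansible, etc.) defined in pyproject.toml
poetry install
# confirm ansible is available in the project environment
poetry run ansible --version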
Generating pyproject.toml using poetry from scratch¶
The following two commands are all you need if you want to set up an ansible project with no pyproject.toml file; a sketch follows.
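A minimal sketch of what those two commands could look like (the package name and options are assumptions; pin versions as you see fit):
# create a new pyproject.toml interactively
poetry init
# add ansible as a project dependency
poetry add ansible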
Preparing Ansible Configuration File¶
This is the ansible configuration file, ansible.cfg.
[defaults]
inventory={{CWD}}/inventory/hosts.yml
vault_password_file={{CWD}}/.vault_pass
roles_path={{CWD}}/roles
collections_path={{CWD}}/collections
# callbacks_enabled = ansible.posix.profile_tasks
timeout = 60
[ssh_connection]
ssh_args = -o ControlMaster=auto -o ControlPersist=60s -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null
- inventory specifies where the ansible inventory file is
- vault_password_file specifies where the ansible vault password file is
- roles_path specifies where the ansible roles directory is
- collections_path specifies where the ansible collections directory is
- callbacks_enabled is optional; it can modify the output format when you run ansible plays. It is commented out here, but you can enable it and see how the output changes
- ssh_args sets ssh options for how the ansible master accesses ansible-managed machines over ssh
  - I'm ignoring ssh host keys because everything is in my homelab, and it is more convenient when I re-create a VM using the same IP address
Sample Ansible Configuration File¶
You can generate a sample file with every available option by running ansible-config init --disabled -t all > ansible.cfg, and see for yourself what's available.
Ansible Community Plugins and requirements.yml File¶
There are ansible community collections I want to use, and here is the requirements file I prepared at the project root, requirements.yml.
# ansible-galaxy collection install -r requirements.yml
collections:
  - name: community.docker
    # https://github.com/ansible-collections/community.docker
    version: 4.0.1
  - name: community.general
    # https://github.com/ansible-collections/community.general
    version: 10.0.1
  - name: community.crypto
    # https://github.com/ansible-collections/community.crypto
    version: 2.22.3
  - name: ansible.posix
    # https://github.com/ansible-collections/ansible.posix
    version: 1.6.2
  - name: kubernetes.core
    # https://github.com/ansible-collections/kubernetes.core
    version: 5.1.0
Run ansible-galaxy collection install -r requirements.yml to have ansible install the collections listed in the file.
Using Ansible Vault to Encrypt Data on the Project Repository¶
https://docs.ansible.com/ansible/latest/vault_guide/vault.html
Ansible Vault encrypts variables and files so you can protect sensitive content such as passwords or keys rather than leaving it visible as plaintext in playbooks or roles.
Although I keep my repositories mainly on my self-hosted GitLab, I still encrypt sensitive data to keep up the good habit.
All you need to use ansible vault is the .vault_pass file specified in the ansible configuration file, containing one line of text to be used for encryption and decryption. This file should not be in a git repository, so be sure to add it to the .gitignore file.
# generating random 31 characters string used as ansible vault password
tr -dc '[:alnum:]' < /dev/urandom | head -c 31 > .vault_pass
# add it in the gitignore
echo ".vault_pass" >> .gitignore
Inventory and Variable¶
The list of ansible-managed machines is found in the inventory file, whose location is specified in the ansible configuration file (inventory={inventory_file_path}).
Variables are used in combination with ansible plays. There will be tons of examples throughout this series, but I hope the following example conveys the idea (a minimal playbook sketch follows the list).
- server1 is listed in the inventory file and the variable role=webserver is set
- there is an ansible playbook with tasks to install packages
- one task says it's for web servers (when: role == "webserver") and the task is to install nginx using apt
- there may be server2 with role=webserver set, and server3 with role=database; this task runs on server2 but not on server3
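Here is what such a task could look like, as a minimal sketch (a hypothetical playbook using the role variable and the nginx package from the example above):
- hosts: all
  become: true
  tasks:
    - name: install nginx on web servers only
      ansible.builtin.apt:
        name: nginx
        state: present
      when: role == "webserver"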
Inventory File¶
Here is the inventory/hosts.yml ansible inventory file I am going to use in this blog series.
# prepare the directory
mkdir inventory
# hosts.yml file
cat >inventory/hosts.yml <<EOF
lab:
  children:
    lab_docker:
      hosts:
        lab-ns1: # debian on proxmox
        lab-ns2: # debian on hyper-v
    lab_kubernetes:
      children:
        lab_k8s_cp:
          hosts:
            lab-cp1: # debian on hyper-v
            lab-cp2: # rocky on hyper-v
            lab-cp3: # ubuntu on proxmox
        lab_k8s_worker:
          hosts:
            lab-worker1: # debian on hyper-v
    lab_etcd:
      hosts:
        lab-etcd1: # debian on hyper-v
        lab-etcd2: # debian on proxmox
        lab-etcd3: # oracle linux on proxmox
EOF
I wanted servers with mixed OSes to practice writing ansible scripts, so here they are. They are all running as VMs on either Windows 11 Pro Hyper-V or Proxmox VE.
Here is the breakdown of the inventory:
- lab-ns1 and lab-ns2 under the lab_docker group as hosts running services using docker
- lab-cp1, lab-cp2, and lab-cp3 under the lab_k8s_cp group as kubernetes control plane nodes
- lab-worker1 under the lab_k8s_worker group as kubernetes worker node(s)
- lab-etcd1, lab-etcd2, and lab-etcd3 under the lab_etcd group as etcd nodes
- examples
  - limit the ansible play target to lab_etcd to run tasks only on the three etcd nodes
  - limit the ansible play target to lab_kubernetes to run tasks on all nodes that are part of the kubernetes cluster
I actually have the top groups named lab and hlv3 (for homelab version 3), and sub-groups such as kubernetes, docker, and bastion. Let's say I have a common ansible playbook to run apt upgrade or dnf upgrade. I can run the playbook against all nodes in my homelab, but I can also specify a certain group for the package upgrade task. Also, when I am testing things out, I run the test playbook targeting only the lab group, see how it goes, and then run it against the hlv3 group when all is good.
Validating Hosts and Groups in the Ansible Inventory File¶
# list all hosts in inventory/hosts.yml file
ansible all --list-hosts -i inventory/hosts.yml
# -i option to specify inventory file can be omitted since it's configured in the configuration file
# list hosts in lab group
ansible lab --list-hosts
# and other groups
ansible lab_docker --list-hosts
ansible lab_kubernetes --list-hosts
ansible lab_k8s_cp --list-hosts
ansible lab_k8s_worker --list-hosts
ansible lab_etcd --list-hosts
Variable File¶
Variables can be set per host and per group.
There are different ways to set variables. In this project, I place vars.yml (and vault.yml) files in a directory named after the corresponding host or group.
In the example below, ./inventory/group_vars/lab_kubernetes/vars.yml contains the variables that apply to hosts in the "lab_kubernetes" group, and the vault.yml file sitting next to it contains variables in encrypted form. Likewise, ./inventory/host_vars/lab-etcd2/vars.yml contains vars for the specific host "lab-etcd2".
.
|-.vault_pass
|-inventory
| |-group_vars
| | |-lab_kubernetes
| | | |-vars.yml
| | | |-vault.yml
| |-host_vars
| | |-lab-etcd2
| | | |-vars.yml
| | | |-vault.yml
| |-hosts.yml
|-ansible.cfg
Example vars.yml and vault.yml¶
Next, let me go over the variable file and encrypted vault file for the lab_kubernetes group as an example. Each file and variable will be covered later when it's actually used.
Here is the group variable file inventory/group_vars/lab_kubernetes/vars.yml. The variables set here can be used in ansible play tasks. For example, kube_svc_cidr can be used in a task that prepares the kubeadm configuration file.
# group variable
hlv3_function: kubernetes
# kube-endpoint
kube_endpoint: "{{ vault_kube_endpoint }}"
kube_endpoint_vip: "{{ vault_kube_endpoint_vip }}"
kube_endpoint_port: "{{ vault_kube_endpoint_port }}"
kube_clustername: lab
kube_cluster_domain: lab.example.net
# kubeadm config
kube_svc_cidr: 10.96.0.0/16
kube_pod_cidr: 10.244.0.0/24
As you can see, the value of the variable kube_endpoint cannot be determined by just looking at this file. It points to the vault_kube_endpoint variable, which is stored in the encrypted vault inventory/group_vars/lab_kubernetes/vault.yml.
Run ansible-vault create inventory/group_vars/lab_kubernetes/vault.yml to create and edit the file. Create a file that looks like the one below.
vault_kube_endpoint: lab-kube-endpoint.lab.example.net
vault_kube_endpoint_vip: 192.0.2.8
vault_kube_endpoint_port: 8443
If you take a look at the file on disk, say using cat, you will only see encrypted, meaningless strings.
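For reference, an ansible-vault encrypted file looks something like this (ciphertext shortened here):
$ cat inventory/group_vars/lab_kubernetes/vault.yml
$ANSIBLE_VAULT;1.1;AES256
62356338363432386435623066626533...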
You can edit and view the file anytime by running ansible-vault edit inventory/group_vars/lab_kubernetes/vault.yml and ansible-vault view inventory/group_vars/lab_kubernetes/vault.yml.
First Playbook - bootstrap¶
I'd like to run the first playbook to create, on each ansible-managed node, a new user account dedicated to the ansible master, with password-less ssh and sudo. Once this is done, the subsequent ansible plays will be executed as this ansible user.
This is how this plays out:
- the servers are running and I have the credentials to access them
- run the bootstrap playbook specifying the logon username and password (and/or ssh private key) for ansible to use to access the servers
- ansible play executes tasks:
- to check if sudo is installed
- to create the new account which ansible will use going forward
- to install sudo if missing and setup password-less sudo for the created ansible account
Name Resolution¶
Since I already have DNS servers running in my homelab, I added records for lab-cp1 and the other hosts there to make them accessible by name from the ansible master machine.
For the sake of this blog series, I am going to cover the steps to prepare DNS servers along the way. Until those DNS servers are running, I am going to use the /etc/hosts file on the ansible master to handle name resolution.
# append the content of inventory/hosts-list.txt file to the /etc/hosts on the local, ansible master
sudo tee -a /etc/hosts < inventory/hosts-list.txt
Here is the inventory/hosts-list.txt file.
# temporarily added on ansible master to help access lab nodes by name
192.0.2.1 lab-cp1.lab.example.net lab-cp1
192.0.2.2 lab-cp2.lab.example.net lab-cp2
192.0.2.3 lab-cp3.lab.example.net lab-cp3
192.0.2.4 lab-worker1.lab.example.net lab-worker1
192.0.2.5 lab-etcd1.lab.example.net lab-etcd1
192.0.2.6 lab-etcd2.lab.example.net lab-etcd2
192.0.2.7 lab-etcd3.lab.example.net lab-etcd3
192.0.2.16 lab-ns1.lab.example.net lab-ns1
192.0.2.17 lab-ns2.lab.example.net lab-ns2
The domain "lab.example.net" is there because that is the domain suffix ansible master uses along with the hostnames listed in the inventory file.
Let me also explain about the variables set in the inventory/group_vars/lab/vars.yml
file.
# environment
hlv3_environment: lab
# ansible remote access
ansible_user: "{{ vault_ansible_user }}"
ansible_username: "{{ vault_ansible_user }}" # required by bootstrap playbook
ansible_ssh_private_key_file: "{{ playbook_dir }}/playbooks/files/ssh/id_ed25519_ansible"
ansible_ssh_pubkey: "{{ playbook_dir }}/playbooks/files/ssh/id_ed25519_ansible.pub"
domain_suffix: "{{ vault_domain_suffix }}"
ansible_host: "{{ inventory_hostname_short }}.{{ domain_suffix }}"
This ansible_host is the actual target ansible tries to access. It is the combination of inventory_hostname_short, which is the hostname from the inventory file, and domain_suffix, which is defined in the same file (pointing to the encrypted vault_domain_suffix) and is set to "lab.example.net".
For example, if you run the bootstrap playbook against the lab_etcd group, ansible runs the play against the lab-etcd1.lab.example.net, lab-etcd2.lab.example.net, and lab-etcd3.lab.example.net hosts, and the local ansible master needs to be able to resolve those names and reach them over TCP/IP.
Ansible User¶
There are also ansible_user and ansible_username variables set in this file. The former is the username ansible uses to access other hosts. Combined with ansible_host, the access looks like ssh ${ansible_user}@${ansible_host}. In this blog series, let's just assume the ansible username is ansiblemaster-lab, so the access to lab-etcd1 will be ssh ansiblemaster-lab@lab-etcd1.lab.example.net.
Now, the latter, ansible_username, has the same vault_ansible_user value. This ansible_username is used in a task in the bootstrap playbook to create the new ansible account on the hosts being onboarded. It's a bit confusing, but the bootstrap process runs before the ansible account exists, so the ansible_user value is overridden with an existing user account; a slightly different variable name is therefore used just for the sake of the bootstrap tasks.
SSH Key for Ansible User¶
This ansible_ssh_private_key_file is the ssh private key used for that ssh ${ansible_user}@${ansible_host} access. Its public key, ansible_ssh_pubkey, is used by a bootstrap task and added to the ~/.ssh/authorized_keys list of the newly created ansible account.
So, let's generate this new ssh key pair.
# prepare playbooks dir
# and files/ssh directory to place ssh key pair used by ansible master
mkdir -p playbooks/files/ssh
cd playbooks/files/ssh
ssh-keygen -t ed25519 -f id_ed25519_ansible
# the ssh directory is listed in the gitignore list
# echo "playbooks/files/ssh" >> {prj_root}/.gitignore
Ansible ping-pong test¶
Let's say the username already available on the remote hosts is "happyansibleuser". You can run the ansible ping-pong test to see whether the ansible master can access the target hosts.
# with "-k" option, ansible will prompt you to enter ssh password to use to logon as happyansibleuser on target hosts
ansible all -m ping -e ansible_user=happyansibleuser -k
If sshpass is not installed¶
If you get an error message like the one below, install sshpass (sudo apt install sshpass, for example).
lab-etcd1 | FAILED! => {
"msg": "to use the 'ssh' connection type with passwords or pkcs11_provider, you must install the sshpass program"
}
Use existing ssh private key¶
If you are using an ssh private key located at ~/.ssh/keyfile to access the target hosts, you can run the ping test like this.
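A sketch, reusing the key path and the username from above (adjust both to your environment):
# use the existing private key instead of password authentication
ansible all -m ping -e ansible_user=happyansibleuser -e ansible_ssh_private_key_file=~/.ssh/keyfile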
Run bootstrap playbook¶
# this will try ssh key logon and then password
# and use "happyansibleuser" username to connect
ansible-playbook playbooks/bootstrap.yml -e ansible_ssh_private_key_file=playbooks/files/ssh/id_ed25519_ansible -e ansible_user=happyansibleuser -k -K
Verification¶
Once the play has completed successfully, you can run the same playbook again without any extra options, and you can also run the ping test to confirm.
ansible-playbook playbooks/bootstrap.yml
ansible all -e ansible_ssh_private_key_file=playbooks/files/ssh/id_ed25519_ansible -m ping
Gather facts¶
I have prepared a playbook to gather facts and save the result as a JSON file per host on the local ansible master.
# run the playbook
ansible-playbook playbooks/gather_facts.yml
# check the facts json files
ls playbooks/facts
cat playbooks/facts/lab-cp1.json
Setting up docker-ready nodes¶
As mentioned earlier, these two hosts are going to run services using docker, and I am going to install docker on them with an ansible playbook.
Here are the tasks executed, in brief:
- install the ansible community.docker requirements on the target host using the package manager (apt, dnf, etc.)
  - packages to install are defined in ./roles/docker/defaults/main.yml
- uninstall existing docker packages and clean up docker directories
  - also defined in ./roles/docker/defaults/main.yml
- install docker
  - add the official docker repository to the package manager
  - install docker packages from the official repository
    - version defined in ./roles/docker/defaults/main.yml
  - lock the package version
- make sure docker is enabled on systemd
- add the ansible user to the docker group
- reboot
There is "test" tag to verify the installation by checking installed docker version and running hello-world.
Running DNS server using docker¶
The way I do this in my real homelab is to prepare a git repository including a docker compose file and DNS configuration files, clone the repository on the remote docker hosts, and spin it up.
For the sake of this blog series, I have prepared a playbook that deploys a similar DNS service.
Unbound DNS¶
The playbook uploads the DNS configuration files and runs the DNS server.
The image I'm going to use is mvance/unbound, available on Docker Hub. Here are the links to the GitHub repository and Docker Hub.
https://github.com/MatthewVance/unbound-docker
https://hub.docker.com/r/mvance/unbound
Edit the ./roles/dns/templates/env.j2 file to change the image tag. It is set to "1.21.1", the latest as of this writing. The image uses Cloudflare DNS over TLS for upstream name resolution (root forwarder destinations).
Preparing DNS configuration file¶
Customize the ./roles/dns/templates/a-records.j2 file to match the names and IP addresses of your actual homelab environment.
The file stored in the repository contains the hosts listed in the server list section using the example domain and example IP address for each host (example.net in rfc6761 and 192.0.2.0/24 TEST-NET-1 in rfc5737).
Also note that there is a record for the kubernetes apiserver virtual IP address, lab-kube-endpoint.lab.example.net. This will be the VIP for the highly available kube-apiserver hosted by the three control plane nodes, lab-cp1, lab-cp2, and lab-cp3.
local-data: "lab-cp1.lab.example.net. IN A 192.0.2.1"
local-data: "lab-cp2.lab.example.net. IN A 192.0.2.2"
local-data: "lab-cp3.lab.example.net. IN A 192.0.2.3"
local-data: "lab-worker1.lab.example.net. IN A 192.0.2.4"
local-data: "lab-etcd1.lab.example.net. IN A 192.0.2.5"
local-data: "lab-etcd2.lab.example.net. IN A 192.0.2.6"
local-data: "lab-etcd3.lab.example.net. IN A 192.0.2.7"
local-data: "lab-ns1.lab.example.net. IN A 192.0.2.16"
local-data: "lab-ns2.lab.example.net. IN A 192.0.2.17"
local-data: "lab-kube-endpoint.lab.example.net. IN A 192.0.2.8"
Running DNS playbook¶
ansible-playbook playbooks/dns.yml # running without tags just displays the tags available in this playbook and exits
# start
# - stop if already running
# - upload files, verify docker compose, and pull image if missing
# - start the service
# - test name resolution from the localhost using dig command
ansible-playbook playbooks/dns.yml --tags start
# stop
ansible-playbook playbooks/dns.yml --tags stop
# enable/disable on systemd
ansible-playbook playbooks/dns.yml --tags enable
ansible-playbook playbooks/dns.yml --tags disable
Change nameservers settings¶
Once the DNS service is ready on lab-ns1 and lab-ns2, let us change the nameserver settings on each host to use it.
There are many variations and combinations of services managing network-related settings these days. I barely managed to get things working, at least for what I have running in my lab and actual homelab environments.
# to double check what you have
ansible-playbook playbooks/nameservers.yml --tags check
# to update the settings
ansible-playbook playbooks/nameservers.yml --tags update
The IP addresses of the nameservers are retrieved with the host command run locally, so they come either from the /etc/hosts file set up previously or from your existing DNS server.
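If you want to double-check a host manually, commands along these lines help, depending on which network stack the host runs (all standard tools, nothing specific to this project):
# plain /etc/resolv.conf (networking / ifupdown hosts)
cat /etc/resolv.conf
# NetworkManager-managed hosts
nmcli device show | grep IP4.DNS
# systemd-resolved / netplan hosts
resolvectl status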
Here is the list of what is done, in brief:
- identify what is running: networking, networkd (netplan), or NetworkManager
- get the IP addresses of lab-ns1 and lab-ns2 by looking them up
- NetworkManager
  - apply the new IP4.DNS using nmcli
  - if no IP4.DNS is present, modify /etc/resolv.conf
- networking
  - modify /etc/resolv.conf
- networkd (netplan)
  - identify the netplan yaml file in use, upload a new templated one, then apply it
Setting up kubernetes-ready nodes¶
Multiple items must be installed and configured to make a host a kubernetes-ready node. I will run my kubernetes playbook to check the configuration and the installed kubernetes component versions, and also to install and configure whatever is needed.
# check and generate a report in a markdown file
ansible-playbook playbooks/kubernetes.yml --tags check
# make a kubernetes-ready host
ansible-playbook playbooks/kubernetes.yml --tags prepare
Here is the list of things done in 'prepare' tagged tasks:
- disable swap memory
- enable ipv4 forwarding
- install and setup containerd, runc, and cni
- disable selinux and firewalld
- install kubeadm, kubelet, and kubectl
- install other necessary packages
The version of each component is defined in the defaults file at ./roles/kubernetes/defaults/main.yml, along with all the variables used in the playbook. The additional packages to install are also listed there, and the package installation tasks refer to that list.
The 'check' tasks check the items above and list them in tables. See the ./playbooks/files/kubernetes/*.md files after running --tags check.
Note that the external etcd cluster nodes have slightly different requirements than the kubernetes cluster nodes, but the same preparatory changes are applied in this project.
Example markdown report on kube-ready nodes¶
# Kubernetes readiness check report for lab environment
## Host Summary
| hostname | product_uuid | swap | systemd swap unit | ipv4_forward | cgroup | selinux | firewalld |
| ----------- | ------------------------------------ | ---- | ----------------- | ----------------------- | --------- | ---------- | --------- |
| lab-cp1 | bd45cd73-1fd5-44b1-b66a-481b8100deb6 | 0 | none | net.ipv4.ip_forward = 1 | cgroup2fs | n/a | n/a |
| lab-cp2 | f4862e7e-49d3-4ae8-9c84-d4aada7ee01d | 0 | none | net.ipv4.ip_forward = 1 | cgroup2fs | Permissive | n/a |
| lab-cp3 | 86936edc-36f7-4730-bdbe-6ddbfc9a5226 | 0 | none | net.ipv4.ip_forward = 1 | cgroup2fs | n/a | n/a |
| lab-worker1 | 9b1b24d3-5d9b-4d11-92d6-6eb6dd8d4f7e | 0 | none | net.ipv4.ip_forward = 1 | cgroup2fs | Permissive | n/a |
## kubernetes Packages
| hostname | kubeadm | kubelet | kubectl |
| ----------- | ------- | ------- | ------- |
| lab-cp1 | v1.32.2 | v1.32.2 | v1.32.2 |
| lab-cp2 | v1.32.2 | v1.32.2 | v1.32.2 |
| lab-cp3 | v1.32.2 | v1.32.2 | v1.32.2 |
| lab-worker1 | v1.32.2 | v1.32.2 | v1.32.2 |
## Dependencies
| hostname | containerd | runc | cni |
| ----------- | ---------- | ----- | ------ |
| lab-cp1 | v2.0.2 | 1.2.5 | v1.6.0 |
| lab-cp2 | v2.0.2 | 1.2.5 | v1.6.0 |
| lab-cp3 | v2.0.2 | 1.2.5 | v1.6.0 |
| lab-worker1 | v2.0.2 | 1.2.5 | v1.6.0 |
Setting up etcd cluster¶
Here are the tasks to set up the etcd cluster:
- configure kubelet to manage the etcd service
- generate the etcd CA certs on one node and copy them over to the other etcd nodes
- use the common CA certs to generate the other certs required by the etcd cluster members and also by the kubernetes control plane nodes
  - a copy of the certs is downloaded to the ansible master to be used later when setting up the kubernetes cluster
- generate the static pod manifest to run etcd and form a new etcd cluster
# spin up a new etcd cluster
ansible-playbook playbooks/etcd.yml --tags cluster
# run health checks and display results
ansible-playbook playbooks/etcd.yml --tags healthcheck
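For reference, a manual health check roughly equivalent to what the healthcheck tag reports can be run from one of the etcd nodes like this (a sketch assuming etcdctl is installed and the certs sit at the kubeadm default locations):
ETCDCTL_API=3 etcdctl \
  --endpoints=https://192.0.2.5:2379,https://192.0.2.6:2379,https://192.0.2.7:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/peer.crt \
  --key=/etc/kubernetes/pki/etcd/peer.key \
  endpoint health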
Setting up kubernetes cluster¶
In the previous steps, all the nodes were made kubernetes-ready and the etcd cluster that the kubernetes control planes are going to use was built. Finally, in this step, I am going to set up a new kubernetes cluster.
Here is the list of tasks:
- prepare kubeadm configuration file
- prepare configuration files and static pod manifests to setup highly-available kube-apiservers
- keepalived to setup VIP with health check monitoring kube-apiserver availability on localhost
- haproxy to setup kube-apiserver loadbalancer with health check monitoring kube-apiserver on all control plane nodes
- upload certs generated by etcd to control plane nodes
- spin up the kubernetes cluster
- initiate the cluster on one control plane node
- copy necessary certs to the other control plane nodes
- join the other control plane nodes to the cluster
- join worker nodes to the cluster
kubeadm configuration for control plane nodes¶
The custom kubernetes cluster configuration is set using the kubeadm config file. The same was actually done when generating the etcd cluster certs in the previous step, by providing custom etcd service details and generating certs based on that configuration.
The two important customizations to be made here are using the external etcd cluster and changing the kube-apiserver endpoint.
The official documentation on this is here.
etcd configuration in kubeadmcfg¶
Below is the etcd section of the kubeadm config file, written as a jinja2 template.
With this, the control plane nodes know that the etcd service is available on the external etcd cluster, with endpoints on the three etcd nodes listening on port 2379, and that the specified certs must be used to access this external etcd service.
etcd:
  external:
    endpoints:
{% for etcd_ipaddr in lst_etcdipaddr %}
      - https://{{ etcd_ipaddr }}:2379
{% endfor %}
    caFile: /etc/kubernetes/pki/etcd/ca.crt
    certFile: /etc/kubernetes/pki/apiserver-etcd-client.crt
    keyFile: /etc/kubernetes/pki/apiserver-etcd-client.key
Note that the etcd node IP address list in lst_etcdipaddr gets populated from the actual running systems.
kube-endpoint configuration in kubeadmcfg¶
The second key customization is the kube-endpoint. Admins, operators, kubernetes cluster components, external tools, and so on use the kube-apiserver to communicate with the kubernetes cluster, and the endpoint is the destination they send their requests to.
In the case of the basic kubernetes cluster described at the beginning of this post, the kube-apiserver was present on a single control plane node, so the endpoint was simply that kube-apiserver listening on the control plane.
When you have multiple kube-apiservers (control plane nodes) running in a kubernetes cluster, that is when you should point the kube-endpoint at something other than an individual kube-apiserver listening on a control plane node.
---
title: flow from client to endpoint to apiserver
---
flowchart LR
client[operators<br/>k8s components<br/>external tools] --> haproxy[kube-endpoint + loadbalancer<br/>lab-kube-endpoint.lab.example.net:8443] --> kubeapi[any cp node running healthy kube-apiserver<br/>cp-node:6443]
In this blog series, the endpoint is "lab-kube-endpoint.lab.example.net:8443" and its IP address is 192.0.2.8. The endpoint config portion of the kubeadmcfg is shown below.
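This line is taken verbatim from the full kubeadmcfg jinja2 template shown later in this section:
controlPlaneEndpoint: "{{ kube_endpoint }}:{{ kube_endpoint_port }}"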
These variables are found in the defaults file at ./roles/kubernetes/defaults/main.yml. When running the playbook in an actual environment, update the variables there or in a group inventory variables file such as ./inventory/group_vars/lab_kubernetes/vars.yml.
Customizing the kube-endpoint is about pointing kube-apiserver access at a load balancer. The load balancer setup details will be covered shortly.
Rest of the settings in kubeadmcfg¶
Before moving on to the load balancer setup, let me briefly explain the other customizations made in the kubeadmcfg file.
Here is the entire kubeadmcfg jinja2 template, i.e. the kubernetes cluster configuration.
---
apiVersion: kubeadm.k8s.io/v1beta4
kind: ClusterConfiguration
kubernetesVersion: stable
controlPlaneEndpoint: "{{ kube_endpoint }}:{{ kube_endpoint_port }}"
clusterName: {{ kube_clustername }}
networking:
  dnsDomain: {{ kube_cluster_domain }}
  podSubnet: {{ kube_pod_cidr }}
  serviceSubnet: {{ kube_svc_cidr }}
etcd:
  external:
    endpoints:
{% for etcd_ipaddr in lst_etcdipaddr %}
      - https://{{ etcd_ipaddr }}:2379
{% endfor %}
    caFile: /etc/kubernetes/pki/etcd/ca.crt
    certFile: /etc/kubernetes/pki/apiserver-etcd-client.crt
    keyFile: /etc/kubernetes/pki/apiserver-etcd-client.key
controllerManager:
  extraArgs:
    - name: "allocate-node-cidrs"
      value: "true"
    - name: "cluster-cidr"
      value: "{{ kube_cluster_cidr }}"
proxy:
  disabled: true
I will be using Cilium as the network add-on for this kubernetes cluster. Different network add-ons have different CIDR settings and other requirements. To do what I wanted with Cilium, I disabled kube-proxy, turned on "allocate-node-cidrs", and set the cluster CIDR range for the controller manager.
The minimum kubeadmcfg customizations required for the highly available cluster are the endpoint and the etcd settings; the official documentation covers this and includes a sample config file.
highly-available kube-apiserver preparation¶
The next topic is setting up the highly available kube-apiserver using a load balancer. There are different ways to implement it, and the details are described in the official documentation.
In this project, I will be running keepalived and haproxy as static pods.
The keepalived service provides a virtual IP managed by a configurable health check
The haproxy service can be configured for simple stream-based load balancing thus allowing TLS termination to be handled by the API Server instances behind it
VIP - keepalived¶
The keepalived service running on each control plane node negotiates with its peers to decide which node takes the VIP. In addition to the simple availability check of keepalived itself, a kube-apiserver health check against localhost:6443 is taken into account. So if lab-cp2 is up and its keepalived service is running but its kube-apiserver is not, the keepalived on lab-cp2 decides not to take the VIP.
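To make the mechanism concrete, here is a minimal sketch of a keepalived configuration for this setup (the interface name, priority, router id, and health-check script path are assumptions; the actual template lives in the repository's role):
vrrp_script check_apiserver {
    # a small script that curls https://localhost:6443/healthz
    script "/etc/keepalived/check_apiserver.sh"
    interval 3
    fall 10
    rise 2
    weight -2
}

vrrp_instance VI_1 {
    state BACKUP
    interface eth0
    virtual_router_id 51
    priority 101
    advert_int 5
    virtual_ipaddress {
        192.0.2.8
    }
    track_script {
        check_apiserver
    }
}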
loadbalancer - haproxy¶
So keepalived is about hosting the VIP. haproxy is about listening on port 8443, receiving the request traffic, and load-balancing it to any one of the available kube-apiservers. haproxy also runs its own health check against the kube-apiserver on every control plane node to make sure it passes the request traffic only to a healthy kube-apiserver.
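And a minimal sketch of the corresponding haproxy configuration (backend name and health-check details are assumptions; the repository's template is authoritative):
frontend kube-apiserver
    bind *:8443
    mode tcp
    option tcplog
    default_backend kube-apiserver-backend

backend kube-apiserver-backend
    mode tcp
    balance roundrobin
    # health check against each kube-apiserver's /healthz endpoint
    option httpchk GET /healthz
    http-check expect status 200
    option ssl-hello-chk
    server lab-cp1 192.0.2.1:6443 check
    server lab-cp2 192.0.2.2:6443 check
    server lab-cp3 192.0.2.3:6443 check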
Preparing certs files¶
In a basic setup with the stacked etcd design, the cluster initialization phases set up etcd and also generate the certs that secure communication between the control plane and the etcd service. Since the etcd cluster here was built independently, the certs required by the control plane were also generated independently of the kubernetes cluster initialization phases.
These certs were already generated and downloaded to the ansible master as part of the tasks that set up the etcd cluster.
./playbooks/files/etcd-ca/ca.crt
./playbooks/files/etcd-certs/apiserver-etcd-client.crt
./playbooks/files/etcd-certs/apiserver-etcd-client.key
spin up the kubernetes cluster¶
Here is the same list from the beginning of this section; all of the above is done by this playbook.
- prepare kubeadm configuration file
- prepare configuration files and static pod manifests to setup highly-available kube-apiservers
- keepalived to setup VIP with health check monitoring kube-apiserver availability on localhost
- haproxy to setup kube-apiserver loadbalancer with health check monitoring kube-apiserver on all control plane nodes
- upload certs generated by etcd to control plane nodes
- spin up the kubernetes cluster
- initiate the cluster on one control plane node
- copy necessary certs to the other control plane nodes
- join the other control plane nodes to the cluster
- join worker nodes to the cluster
# init the kubernetes cluster and join all available control plane nodes
ansible-playbook playbooks/kubernetes.yml --tags cluster
# join all available worker nodes to the kubernetes cluster
ansible-playbook playbooks/kubernetes.yml --tags worker
The kubeadm init command run to spin up the cluster on the first control plane node prints a lot of output. It is saved to the files ./playbooks/files/kubernetes/kubeadm.log and ./playbooks/files/kubernetes/kubeadm.err.
The kubectl get nodes -o wide output is also captured and saved to ./playbooks/files/kubernetes/kubectl_get_nodes.txt.
Install network-addon - Cilium¶
As you may have seen in the kubectl get nodes output, all the nodes are shown as "NotReady". The next thing to install is a network add-on.
The quick installation route would be to install the cilium CLI and use it to install cilium on the cluster, but we cannot go down that path this time.
As briefly mentioned, there are features I wanted to try, and each feature has its own requirements. I already customized the cluster through the kubeadmcfg cluster configuration file; some of those customizations were for the external etcd cluster and the highly available kube-apiservers, and others were for the items in the following links.
https://docs.cilium.io/en/stable/installation/k8s-install-external-etcd/#requirements
https://docs.cilium.io/en/stable/network/l2-announcements/#prerequisites
https://docs.cilium.io/en/stable/network/servicemesh/gateway-api/gateway-api/#prerequisites
And so the cilium installation must be customized as well. I chose to do this using helm, since I can make the customizations in the "values" file and document and track the changes in VCS. It's GitOps, almost. I mean, GitOps for the cluster cannot be set up until the cluster is functional with a network add-on installed, but the point is that I can record the changes in a GitOps repository.
You need to work on a host you use to operate the kubernetes cluster. That host may be the ansible master or one of the control plane nodes. Let's just go with a control plane node this time.
Here is the list of tasks:
- install helm
- identify the cilium version to use
- download the values file of the cilium chart on the version you are going to install
- edit the values file
- install the helm chart
Here is the list of commands executed:
# on one of the control plane node
# install helm
curl https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash
# add cilium repository on helm
helm repo add cilium https://helm.cilium.io/
# confirm the latest version of cilium
helm search repo cilium
helm search repo cilium -l # to see all available versions
# download the values file for version 1.17.1
helm show values cilium/cilium --version 1.17.1 > values.yaml
# edit the values file
# create secret for cilium containing etcd cert files
sudo cp /etc/kubernetes/pki/etcd/ca.crt .
sudo cp /etc/kubernetes/pki/apiserver-etcd-client.crt client.crt
sudo cp /etc/kubernetes/pki/apiserver-etcd-client.key client.key
sudo chown $USER:$USER *.crt
sudo chown $USER:$USER *.key
kubectl create secret generic -n kube-system cilium-etcd-secrets \
--from-file=etcd-client-ca.crt=ca.crt \
--from-file=etcd-client.key=client.key \
--from-file=etcd-client.crt=client.crt
sudo rm *.crt *.key
# install
helm install cilium cilium/cilium --version 1.17.1 --values values.yaml -n kube-system
# it took a little less than 20 minutes until everything was up and running
# for a cluster composed of VMs running on personal-use Proxmox and Hyper-V
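A couple of quick checks once helm reports the release as deployed (plain kubectl, nothing project-specific):
# cilium, hubble, and the rest of kube-system should settle into Running
kubectl -n kube-system get pods -o wide
# the nodes should eventually report Ready
kubectl get nodes -o wide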
Here is the list of changes made to the cilium values file; the entire file is stored at ./playbooks/files/cilium/values.yaml:
- k8sServiceHost: lab-kube-endpoint.lab.example.net
- k8sServicePort: "8443"
- k8sClientRateLimit.qps: 33
- k8sClientRateLimit.burst: 50
- kubeProxyReplacement: "true"
- kubeProxyReplacementHealthzBindAddr: "0.0.0.0:10256"
- l2announcements.enabled: true
- l2announcements.leaseDuration: 3s
- l2announcements.leaseRenewDeadline: 1s
- l2announcements.leaseRetryPeriod: 200ms
- externalIPs.enabled: true
- gatewayAPI.enabled: true
- etcd.enabled: true
- etcd.ssl: true
- etcd.endpoints: ["https://192.0.2.5:2379", "https://192.0.2.6:2379", "https://192.0.2.7:2379"]
- hubble.ui.enabled: true
- hubble.relay.enabled: true
- hubble.peerService.clusterDomain: lab.example.net
Demo¶
The kubernetes cluster is now functional with the network add-on installed. Let me do a demo on name lookups inside the cluster, and then some more demos using cilium features.
Name lookups¶
Let's create a temporary namespace named "test" and create a pod there.
# again on any one of the control plane node...
# create test namespace
kubectl create namespace test
# add dnsutils pod in the test namespace
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: dnsutils
  namespace: test
spec:
  containers:
    - name: dnsutils
      image: registry.k8s.io/e2e-test-images/agnhost:2.39
      command:
        - sleep
        - "infinity"
      imagePullPolicy: IfNotPresent
  restartPolicy: Always
EOF
You can execute commands on the test pod using kubectl exec. You can see from the output below that:
- it's configured to use the nameserver at 10.96.0.10
- 10.96.0.10 is the IP address of the kube-dns service in the kube-system namespace
- you can look up {service-name}.{namespace}.svc.{cluster-domain} for any available service in the kubernetes cluster
  - for example, kube-dns.kube-system.svc.lab.example.net.
  - for example, kubernetes.default.svc.lab.example.net.
  - for example, hubble-ui.kube-system.svc.lab.example.net.
- workloads in the "test" namespace have the search suffix "test.svc.lab.example.net"
  - if there is a service "web" in this test namespace, "web.test.svc.lab.example.net" is the destination for that "web" service
  - thanks to the search suffix list, workloads in the same "test" namespace can reach it merely by the name "web"
$ kubectl exec -t pod/dnsutils -n test -- cat /etc/resolv.conf
search test.svc.lab.example.net svc.lab.example.net lab.example.net
nameserver 10.96.0.10
options ndots:5
$ kubectl get svc -A
NAMESPACE NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
default kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 19h
kube-system cilium-envoy ClusterIP None <none> 9964/TCP 19h
kube-system hubble-peer ClusterIP 10.96.128.27 <none> 443/TCP 19h
kube-system hubble-relay ClusterIP 10.96.155.61 <none> 80/TCP 17h
kube-system hubble-ui ClusterIP 10.96.227.21 <none> 80/TCP 17h
kube-system kube-dns ClusterIP 10.96.0.10 <none> 53/UDP,53/TCP,9153/TCP 19h
$ kubectl exec -t pod/dnsutils -n test -- dig kubernetes.default +search +noall +answer
kubernetes.default.svc.lab.example.net. 30 IN A 10.96.0.1
$ kubectl exec -t pod/dnsutils -n test -- dig kube-dns.kube-system +search +noall +answer
kube-dns.kube-system.svc.lab.example.net. 30 IN A 10.96.0.10
$ kubectl exec -t pod/dnsutils -n test -- dig hubble-ui.kube-system +search +noall +answer +stats
hubble-ui.kube-system.svc.lab.example.net. 30 IN A 10.96.227.21
;; Query time: 1 msec
;; SERVER: 10.96.0.10#53(10.96.0.10)
;; WHEN: Fri Feb 28 01:02:50 UTC 2025
;; MSG SIZE rcvd: 139
Cleaning up the test namespace¶
Nothing complex was created in this namespace, so to clean up you can simply delete the namespace, and the pod will be gone with it.
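For example:
kubectl delete namespace test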
Cilium L2Advertisement¶
All these 10.96.*.* IP addresses shown as service IPs are not accessible from outside the kubernetes cluster. One solution is the layer 2 advertisement feature. Here I am going to use the hubble-ui service, which I enabled in the cilium helm chart, to demonstrate it.
I first look for an appropriate label to identify the hubble-ui pods.
# looking at the defined labels on the hubble-ui deployment
$ kubectl get deploy hubble-ui -n kube-system -o jsonpath='{.spec.template.metadata.labels}'
{"app.kubernetes.io/name":"hubble-ui","app.kubernetes.io/part-of":"cilium","k8s-app":"hubble-ui"}o
# double check that the label works
$ kubectl get pods -l 'k8s-app=hubble-ui' -n kube-system
NAME READY STATUS RESTARTS AGE
hubble-ui-68bb47466-6gkwb 2/2 Running 0 100m
I will then create another service for hubble-ui, this time with the LoadBalancer type.
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Service
metadata:
  name: l2-hubble-ui
  namespace: kube-system
  labels:
    app.kubernetes.io/name: l2-hubble-ui
spec:
  type: LoadBalancer
  ports:
    - port: 80
      protocol: TCP
      targetPort: 8081
  selector:
    k8s-app: hubble-ui
EOF
The new service is created to access the same hubble-ui pods as the existing "hubble-ui" service.
# the created service with "pending" external IP address allocation
$ kubectl get svc l2-hubble-ui -n kube-system
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
l2-hubble-ui LoadBalancer 10.96.177.88 <pending> 80:32442/TCP 23s
Now I create a cilium IP pool for the created "l2-hubble-ui" service.
cat <<EOF | kubectl apply -f -
apiVersion: "cilium.io/v2alpha1"
kind: CiliumLoadBalancerIPPool
metadata:
  name: "ippool-hubble-ui"
spec:
  blocks:
    - start: "192.0.2.24"
      stop: "192.0.2.24"
  serviceSelector:
    matchExpressions:
      - { key: app.kubernetes.io/name, operator: In, values: [l2-hubble-ui] }
EOF
Now the external IP address gets assigned to the service.
$ kubectl get svc l2-hubble-ui -n kube-system
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
l2-hubble-ui LoadBalancer 10.96.177.88 192.0.2.24 80:32442/TCP 5m48s
YES! Now, is it reachable? Not yet, as no one is advertising on the LAN that this IP address is in use and available. So next, the l2 announcement policy needs to be created.
cat <<EOF | kubectl apply -f -
apiVersion: "cilium.io/v2alpha1"
kind: CiliumL2AnnouncementPolicy
metadata:
  name: l2-hubble-ui
spec:
  serviceSelector:
    matchLabels:
      app.kubernetes.io/name: l2-hubble-ui
  interfaces:
    - ^eth[0-9]+
    - ^eno[0-9]+
    - ^enp[0-9]s[0-9]+
  loadBalancerIPs: true
EOF
Now the IP address gets advertised on the LAN, and I can connect to the hubble UI from a web browser on other machines on my home LAN.
Hubble UI¶
https://github.com/cilium/hubble-ui
Observability & Troubleshooting for Kubernetes Services
Since this is a tool for observing what is going on in the cluster, we want something running. There is a cilium post-installation connectivity check available, introduced in the installation document. Let's go ahead and use it.
https://docs.cilium.io/en/latest/installation/k8s-install-helm/#validate-the-installation
It is as simple as spinning up the name-lookup test pod earlier: create a namespace, run kubectl apply, and delete the namespace to clean everything up.
# create the namespace cilium-test
kubectl create ns cilium-test
# run the connectivity check pods in the cilium-test namespace
kubectl apply -n cilium-test -f https://raw.githubusercontent.com/cilium/cilium/1.17.1/examples/kubernetes/connectivity-check/connectivity-check.yaml
# clean up
kubectl delete ns cilium-test
Here is the screen capture of the hubble UI for the cilium-test namespace.