building homelab cluster part 6

I set up cert-manager in part 5. Previously, my approach to web access was to manually prepare a TLS certificate, convert it into a secret, and reference it in the gateway manifest. That wasn't bad at all, because I used a single gateway with a wildcard certificate, which covered any additional service under different names. Still, I like the cert-manager setup better: a certificate is generated automatically for each gateway I create based on its hostname, and I look forward to seeing cert-manager handle renewals automatically when the time comes.

In this next part, I will set up GitLab Runner, the agent that carries out jobs for GitLab CI/CD. Pipelines can be much more efficient when caching is available, so I will also configure GitLab Runner to use MinIO S3 as the cache storage.

GitLab Runner

https://docs.gitlab.com/runner/

GitLab Runner is an application that works with GitLab CI/CD to run jobs in a pipeline.

node requirements

  • [x] install pigz on nodes that will run runner pods
    • sudo apt install pigz

This is not covered in the official documentation, but a missing pigz results in flaky image pull errors in both the init-permissions init container and the pipeline helper pod.

WARNING: Failed to pull image with policy "": image pull failed: failed to pull and unpack image "registry.gitlab.com/gitlab-org/gitlab-runner/gitlab-runner-helper:ubuntu-x86_64-v16.10.0": failed to extract layer sha256:09c555e0ed6e70cd551f4d350f67dbbce8c9cb33a11a4f0c530b1da291d5dfb7: failed to get stream processor for application/vnd.docker.image.rootfs.diff.tar.gzip: fork/exec /usr/bin/unpigz: no such file or directory: unknown
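
A quick way to confirm a node is ready, since the extraction step calls /usr/bin/unpigz and the pigz package provides it:

# on each node that will run runner pods
which unpigz || sudo apt install -y pigz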

GitLab Runner helm chart

https://docs.gitlab.com/runner/install/kubernetes.html

As always, I will install GitLab Runner using Helm.

# add repository
helm repo add gitlab https://charts.gitlab.io

# update when necessary
helm repo update gitlab

# list versions
helm search repo -l gitlab/gitlab-runner | head

# move to the homelab repo and store default values file
cd {repo}/infrastructure/homelab/controllers/default-values
helm show values gitlab/gitlab-runner --version 0.62.1 > gitlab-runner-values.yaml
cp gitlab-runner-values.yaml ../.

required configuration

https://docs.gitlab.com/runner/install/kubernetes.html#required-configuration

There are a few things I must configure in the values file to get it working with my self-managed GitLab (shown as a values snippet after this list).

  • gitlabUrl
    • "https://cp.blink-1x52.net/"
  • rbac: { create: true }
  • runnerToken
    • to be provided as a secret named "gitlab-runner"
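
Translated into the values file, these settings look roughly like this (the exact keys appear in the full diff further down):

gitlabUrl: https://cp.blink-1x52.net/

rbac:
  create: true

runners:
  # name of the Kubernetes secret holding runner-token / runner-registration-token
  secret: gitlab-runner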

runner token

https://docs.gitlab.com/runner/install/kubernetes.html#store-registration-tokens-or-runner-tokens-in-secrets

As the admin of the self-managed GitLab server, navigate to Admin area > CI/CD > Runners and manually create a Linux runner; this gives you the runner token used to register the runner.

Create a secret on sops repo using the token provided.

kubectl create secret generic gitlab-runner \
  --from-literal=runner-registration-token="" \
  --from-literal=runner-token="your token here" \
  --dry-run=client \
  --namespace=runner \
  -o yaml > gitlab-runner.yaml

sops -i --encrypt gitlab-runner.yaml
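
In my setup Flux applies these sops-encrypted secrets, but for a quick manual test it is also possible to decrypt and apply one directly (a sketch, assuming the sops key is available locally):

# decrypt on the fly and apply to the cluster
sops -d gitlab-runner.yaml | kubectl apply -f -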

https://docs.gitlab.com/runner/install/kubernetes.html#s3

An example configuration is provided at the link above, and the configuration section looks like this in my case.

runners:
  # runner configuration, where the multi line string is evaluated as a
  # template so you can specify helm values inside of it.
  #
  # tpl: https://helm.sh/docs/howto/charts_tips_and_tricks/#using-the-tpl-function
  # runner configuration: https://docs.gitlab.com/runner/configuration/advanced-configuration.html
  config: |
    [[runners]]
      [runners.kubernetes]
        namespace = "{{.Release.Namespace}}"
        image = "alpine"
      [runners.cache]
        Type = "s3"
        Shared = true
        [runners.cache.s3]
          ServerAddress = "s3.blink-1x52.net"
          BucketName = "runner"

How to provide the S3 credentials is explained in the distributed runners cache section of the values file.

## $ kubectl create secret generic s3access \
##   --from-literal=accesskey="YourAccessKey" \
##   --from-literal=secretkey="YourSecretKey"

cache:
  ## S3 the name of the secret.
  secretName: s3access

Now, despite following the instructions, and despite this secret being properly mounted into the generated pod, the Go cache client fails to use the credentials to access MinIO S3. I did not want to do this, but I had to write the accessKey and secretKey directly in the values file, since that way access to the MinIO S3 cache bucket works.

bucket and credentials

Let me access the MinIO tenant and create a bucket named "runner" along with user credentials (a scripted alternative with the mc client is sketched after the list below).

  • log in to the MinIO tenant GUI
  • navigate to the bucket menu and create a bucket named "runner"
    • check the bucket access policy and set it to private
  • create a user and assign the read-write policy
  • create an access key for this user; this gives you the accessKey and secretKey to hand to GitLab Runner
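
For reference, roughly the same bucket and user setup could be scripted with the MinIO mc client instead of the GUI; a sketch, assuming an alias named homelab and a user named runner (older mc versions use mc admin policy set instead of attach):

# register the tenant endpoint with admin credentials
mc alias set homelab https://s3.blink-1x52.net ADMIN_ACCESS_KEY ADMIN_SECRET_KEY

# create the private bucket used for the runner cache
mc mb homelab/runner

# create a user and attach the built-in readwrite policy
mc admin user add homelab runner RUNNER_SECRET_KEY
mc admin policy attach homelab readwrite --user runner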

Then, on the sops repository, create a secret named s3access like this. As written in the previous section, the created pod does mount this secret, but unfortunately the Go program inside fails to use it. Instead of relying on the secret, write the accessKey and secretKey directly in the values file until the problem is fixed; I tested with the 0.62.0 and 0.62.1 helm charts. The same credentials work with the MinIO mc client and when written directly into the config.toml inside the runners section of the values file.

kubectl create secret generic s3access \
  --from-literal=accessKey="access key here" \
  --from-literal=secretKey="secret key here" \
  --dry-run=client \
  --namespace=runner \
  -o yaml > s3access.yaml

sops -i --encrypt s3access.yaml

# the secret was deleted after much trial and error

resources

I will just uncomment what's provided in the values file.

## Configure resource requests and limits
## ref: https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/
##
resources:
  limits:
    memory: 256Mi
    cpu: 200m
    ephemeral-storage: 512Mi
  requests:
    memory: 128Mi
    cpu: 100m
    ephemeral-storage: 256Mi

advanced configuration

https://docs.gitlab.com/runner/configuration/advanced-configuration.html

https://docs.gitlab.com/runner/configuration/advanced-configuration.html#how-clone_url-works

My GitLab runs as a Docker container over plain HTTP, with an NGINX reverse proxy doing TLS offload in front of it. The GitLab server itself thinks it is reachable over plain HTTP and tells the runner so, but the runner must reach the GitLab server over HTTPS.
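
This is what clone_url addresses; the relevant fragment of the runner config is just:

[[runners]]
  # make jobs clone over the https front end instead of the plain-http URL
  # that the GitLab server advertises
  clone_url = "https://cp.blink-1x52.net"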

https://docs.gitlab.com/runner/configuration/advanced-configuration.html#helper-image

The default Alpine helper image can have DNS issues, and adding helper_image_flavor = "ubuntu" helps resolve them.

https://docs.gitlab.com/runner/executors/kubernetes/index.html#specify-the-node-to-execute-builds

My cluster consists of amd64 and arm64 nodes, and I need the runners to run only on amd64, so I add a node selector to the config.

runners:
  # runner configuration, where the multi line string is evaluated as a
  # template so you can specify helm values inside of it.
  #
  # tpl: https://helm.sh/docs/howto/charts_tips_and_tricks/#using-the-tpl-function
  # runner configuration: https://docs.gitlab.com/runner/configuration/advanced-configuration.html
  #
  # helper image
  # ref) https://hub.docker.com/r/gitlab/gitlab-runner-helper/tags?page=1&name=ubuntu-x86_64
  #
  # node to execute the build
  # ref) https://docs.gitlab.com/runner/executors/kubernetes/index.html#specify-the-node-to-execute-builds
  config: |
    [[runners]]
      clone_url = "https://cp.blink-1x52.net"
      [runners.kubernetes]
        namespace = "{{.Release.Namespace}}"
        image = "alpine"
        helper_image_flavor = "ubuntu"
        [runners.kubernetes.node_selector]
          "kubernetes.io/arch" = "amd64"
      [runners.cache]
        Type = "s3"
        Shared = true
        [runners.cache.s3]
          ServerAddress = "s3.blink-1x52.net"
          BucketName = "runner"
          BucketLocation = ""
          Insecure = false
          AuthenticationType = "access-key"
          AccessKey = "REDACTED"
          SecretKey = "REDACTED"

values.yaml

Here is the final version as of now. The access key and secret key are written in plain text: passing them as a secret works in the sense that it gets mounted on the runner instance, but the program currently fails to use them and falls back to running GitLab CI/CD pipeline jobs without the cache.

Passing credentials as a secret might work if you are using AWS S3.

diff default-values/gitlab-runner-values.yaml gitlab-runner-values.yaml
13,14c13,16
<   registry: registry.gitlab.com
<   image: gitlab-org/gitlab-runner
---
>   # registry: registry.gitlab.com
>   registry: registry.blink-1x52.net
>   # image: gitlab-org/gitlab-runner
>   image: cache-dockerhub/gitlab/gitlab-runner
15a18,19
>   # tag: alpine-v16.9.1
>   # https://hub.docker.com/r/gitlab/gitlab-runner/tags?page=1&name=alpine-v
52c56
< # gitlabUrl: http://gitlab.your-domain.com/
---
> gitlabUrl: https://cp.blink-1x52.net/
147c151
<   create: false
---
>   create: true
327a332,337
>   #
>   # helper image
>   # ref) https://hub.docker.com/r/gitlab/gitlab-runner-helper/tags?page=1&name=ubuntu-x86_64
>   #
>   # node to execute the build
>   # ref) https://docs.gitlab.com/runner/executors/kubernetes/index.html#specify-the-node-to-execute-builds
329a340,341
>       clone_url = "https://cp.blink-1x52.net"
>       cache_dir = "/cache"
332c344,357
<         image = "alpine"
---
>         image = "registry.blink-1x52.net/cache-dockerhub/gitlab/gitlab-runner-helper:ubuntu-x86_64-v16.10.0"
>         helper_image_flavor = "ubuntu"
>         [runners.kubernetes.node_selector]
>           "kubernetes.io/arch" = "amd64"
>       [runners.cache]
>         Type = "s3"
>         [runners.cache.s3]
>           ServerAddress = "s3.blink-1x52.net"
>           BucketName = "runner"
>           BucketLocation = ""
>           Insecure = false
>           AuthenticationType = "access-key"
>           AccessKey = "REDACTED"
>           SecretKey = "REDACTED"
378c403
<   # secret: gitlab-runner
---
>   secret: gitlab-runner
480,488c505,512
<   {}
<   # limits:
<   #   memory: 256Mi
<   #   cpu: 200m
<   #   ephemeral-storage: 512Mi
<   # requests:
<   #   memory: 128Mi
<   #   cpu: 100m
<   #   ephemeral-storage: 256Mi
---
>   limits:
>     memory: 256Mi
>     cpu: 200m
>     ephemeral-storage: 512Mi
>   requests:
>     memory: 128Mi
>     cpu: 100m
>     ephemeral-storage: 256Mi
512c536
<   {}
---
>   kubernetes.io/arch: amd64

gitlab-runner manifest file

I use a script to generate the manifest file with the usual set: namespace, HelmRepository, and HelmRelease.

./infrastructure/controllers/gitlab-runner.sh
#!/bin/bash

# add flux helmrepo to the manifest
flux create source helm gitlab \
        --url=https://charts.gitlab.io \
        --interval=1h0m0s \
        --export >gitlab-runner.yaml

# add flux helm release to the manifest including the customized values.yaml file
flux create helmrelease gitlab-runner \
        --interval=10m \
        --target-namespace=runner \
        --source=HelmRepository/gitlab \
        --chart=gitlab-runner \
        --chart-version=0.62.1 \
        --values=gitlab-runner-values.yaml \
        --export >>gitlab-runner.yaml
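
Note that the flux create commands above only export the HelmRepository and HelmRelease; the runner namespace itself is a plain manifest (a minimal sketch, assuming it sits alongside these in the same kustomization):

apiVersion: v1
kind: Namespace
metadata:
  name: runner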

runner namespace

Run the script to generate the manifest file and update the ./infrastructure/controllers/kustomization.yaml file to include it. Flux then creates the namespace, the runner token secret, the Flux helm source, helm chart, and helm release, which deploys the runner into the runner namespace.
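
The kustomization entry is just the generated manifest added to the resources list (a sketch; the actual file in my repo lists the other controllers too):

# ./infrastructure/controllers/kustomization.yaml (excerpt)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - gitlab-runner.yaml

Once Flux reconciles it, this is what the runner namespace looks like: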

$ kubectl get all -n runner
NAME                                                      READY   STATUS    RESTARTS   AGE
pod/runner-gitlab-runner-gitlab-runner-7f944ffb55-b4nlm   1/1     Running   0          23m

NAME                                                 READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/runner-gitlab-runner-gitlab-runner   1/1     1            1           23m

NAME                                                            DESIRED   CURRENT   READY   AGE
replicaset.apps/runner-gitlab-runner-gitlab-runner-7f944ffb55   1         1         1       23m

runner job execution log

Here is the build job log for the beryl project, which generates this site. Though not the latest version, a working .gitlab-ci.yml file is available on the mkdocs page.

In the first run, the executor tries to restore the cache, but nothing is there yet since this is the newly set up MinIO S3 in my lab, running on VMs hosted on Hyper-V. At the end of the job, it uploads the cache to the S3 endpoint at https://s3.blink-1x52.net.

build-mkdocs job on beryl repository
Running with gitlab-runner 16.9.1 (782c6ecb)
  on runner-gitlab-runner-gitlab-runner-7f944ffb55-b4nlm ZP57C1kd9, system ID: r_svxTIEr5WHJr
Preparing the "kubernetes" executor
00:00
Preparing environment
00:43
Getting source from Git repository
00:02
Fetching changes with git depth set to 20...
Initialized empty Git repository in /builds/pages/beryl/.git/
Created fresh repository.
Checking out 637ec5ea as detached HEAD (ref is deploy-20240305-plugin)...
Skipping Git submodules setup
Restoring cache
00:00
Checking cache for build-deploy-20240305-plugin-protected...
WARNING: file does not exist
Failed to extract cache
Executing "step_script" stage of the job script
00:20
Running after_script
00:01
Saving cache for successful job
00:01
Creating cache build-deploy-20240305-plugin-protected...
/builds/pages/beryl/.cache/pip: found 568 matching artifact files and directories
Uploading cache.zip to https://s3.blink-1x52.net/runner/runner/ZP57C1kd9/project/148/build-deploy-20240305-plugin-protected
Created cache
Uploading artifacts for successful job
00:01
Uploading artifacts...
public/: found 175 matching artifact files and directories
Uploading artifacts as "archive" to coordinator... 201 Created  id=1021 responseStatus=201 Created token=glcbt-64
Cleaning up project directory and file based variables
00:01
Job succeeded

And here is the log from the second run of the same job, executed right after the one above. This time, both restoring the cached data at the beginning and uploading it at the end succeed.

build-mkdocs job on beryl repository
Running with gitlab-runner 16.9.1 (782c6ecb)
  on runner-gitlab-runner-gitlab-runner-7f944ffb55-b4nlm ZP57C1kd9, system ID: r_svxTIEr5WHJr
Preparing the "kubernetes" executor
00:00
Preparing environment
00:04
Getting source from Git repository
00:01
Restoring cache
00:00
Checking cache for build-deploy-20240305-plugin-protected...
Downloading cache from https://s3.blink-1x52.net/runner/runner/ZP57C1kd9/project/148/build-deploy-20240305-plugin-protected
Successfully extracted cache
Executing "step_script" stage of the job script
00:10
Running after_script
00:01
Saving cache for successful job
00:01
Creating cache build-deploy-20240305-plugin-protected...
/builds/pages/beryl/.cache/pip: found 568 matching artifact files and directories
Uploading cache.zip to https://s3.blink-1x52.net/runner/runner/ZP57C1kd9/project/148/build-deploy-20240305-plugin-protected
Created cache
Uploading artifacts for successful job
00:01
Uploading artifacts...
public/: found 175 matching artifact files and directories
Uploading artifacts as "archive" to coordinator... 201 Created  id=1022 responseStatus=201 Created token=glcbt-64
Cleaning up project directory and file based variables
00:01
Job succeeded