Chris’ wirre Gedankenwelt

Unpopular Take on Documentation

Recently I witnessed a discussion between several parties about documentation: where to put it, a single source of truth for everything, not writing about the same topic twice (even when there are multiple different audiences). Surprisingly, once again there is no silver bullet 🤷

However, in my humble opinion, there is a way to make documentation better: to make it easier for people to start documenting, to get better indices, better search, and whatnot. People can even study this stuff at university. These folks are called librarians, and every company should hire some.

👩‍🎓 TIL More About Systemd And Logging

If a service implements org.freedesktop.LogControl1 and defines a BusName, its log level can be changed on the fly. For example, systemd-networkd:

systemctl service-log-level systemd-networkd.service debug
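
Running the same command without a level prints the current log level, which is handy to check before and after:

systemctl service-log-level systemd-networkd.service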

A Tale On Kubernetes Controllers

Note: This is a pretty incoherent write-up of whatever happened to be in my head on two different days. Maybe it is useful nevertheless 🤷

From time to time, I write operators (or controllers) for Kubernetes, primarily in Go, and thus using controller-runtime, either through kubebuilder or operator-sdk. The idea behind an operator is reconciling something (let’s say an external resource) to the desired state. See Operator Pattern and Concepts Controllers.

In short: an operator observes the desired state in kube-apiserver and tries to “move” the resource to match that desired state. When it comes to controller-runtime, the logic is implemented in the method Reconcile.

It is often desirable to run only a single instance of the operator. When dealing with idempotent APIs, one can run multiple instances, but in general it is recommended to have only a single “active” instance in the cluster. When creating the boilerplate with kubebuilder or operator-sdk, leader election is set up automatically, so a failed instance will be taken over after a while.

Why is that important?

Imagine the operator can talk to an API to create a loadbalancer (we can only use that API, not change any of it). The only field the API accepts is a name for the loadbalancer. The create request returns a UUID, by which the loadbalancer can be identified later. In an ideal world, the operator would create a loadbalancer, get back the UUID, and store it somewhere for further processing.

# look up a previously stored UUID and check whether the loadbalancer exists
uuid = db.get("lbUUID")
lb = api.get_lb(uuid)
if not lb:
  uuid = api.create_lb(name)
  db.store("lbUUID", uuid)
  lb = api.get_lb(uuid)

Looks reasonable, but has some shortcomings. Because there are no transactions spanning the various applications and APIs, the code could fail at any point in time. Fail not in the sense that the application itself errors out, but because of external circumstances (OOM, node failure, etc.), for example during or right after calling create_lb, when there was just no time to retrieve and store the UUID. The next reconciliation loop would then create another loadbalancer, and now we may end up with two or more instances we need to pay for. Sometimes the code could also end up in an error, for example because a port is already used by another loadbalancer instance. What would be the correct action? Adopting the loadbalancer? Erroring and retrying? Probably the latter, because who knows what created the loadbalancer in the first place, or how.

The idea behind an operator is that the program observes the world, tries to move the world in the direction of the desired state, and stores the observed state in the status field. It is not necessarily a good idea to store observed state somewhere else for identification purposes, even though there are examples that do kind of that, like storing a providerID. However, in the world of Kubernetes cloud-provider, the provider is deliberately very limited in what it can do. For instance, the reconciler code for the loadbalancer has no access to Kubernetes.

So sometimes the best a reconciler gets is a name, and it must deal with the challenges of the external API.

lbs = api.list_lbs(filter = {"name": name})
if lbs:
  lb = lbs[0]
else:
  lb = api.create_lb(name)

Here be dragons again. The API does allow multiple loadbalancers with the same name. Every other method of identifying the object might be flawed in some way, or will suffer from the same problems. For example: if the API lets us set tags on resources, but doesn’t allow setting them in the create call, we are back at square one.

Even though it is still pretty unsatisfying from a purist’s perspective, running a single instance of the operator is probably the best solution, especially since the alternative would be introducing distributed consensus on what to create when and where. There are always trade-offs. Depending on the use case, it might be sufficient to just pick lbs[0]; sometimes you might want to be more conservative and return an error if you get more than one result. You have to make those decisions; there is no way around them.

So far, we took a look at operators and some of their challenges, especially when it comes to concurrently running processes. But do we suffer from the same problems within a single process? In some cases, a single reconciliation at a time is enough, if there are infrequent updates/changes to the source and/or one run of the loop is fast. But think back to the loadbalancer example: imagine each reconciliation takes 30 seconds. If you create 10 loadbalancers at a time, it would take 300 seconds to create all of them, even though, in theory, the API would be able to create 10 in parallel without problems.

Let’s take a look at the interface of a Kubernetes reconciler:

type Reconciler = TypedReconciler[Request]

type TypedReconciler[request comparable] interface {
	// Reconcile performs a full reconciliation for the object referred to by the Request.
	//
	// If the returned error is non-nil, the Result is ignored and the request will be
	// requeued using exponential backoff. The only exception is if the error is a
	// TerminalError in which case no requeuing happens.
	//
	// If the error is nil and the returned Result has a non-zero result.RequeueAfter, the request
	// will be requeued after the specified duration.
	//
	// If the error is nil and result.RequeueAfter is zero and result.Requeue is true, the request
	// will be requeued using exponential backoff.
	Reconcile(context.Context, request) (Result, error)
}

For each event, no matter whether it is a create, update, delete, or some generic event, the Reconcile method will be called. You will only see the request object, not what happened, not what state was there before, nor the change. Remember? The idea is to grab the desired state, drive the resource towards it, and record what you observed along the way.

But does that mean we really get called for each and every event? Say a create of the Kubernetes resource leads to creating the external resource, and while that is happening, someone edits the labels on the Kubernetes resource. Would that result in the same difficulties described above?

When running a single Reconcile at any time, certainly not. The second event would be “batched” and fed to the reconciler as soon as the previous run of Reconcile finishes. Is it a good idea to crank MaxConcurrentReconciles up to, let’s say, 10? See Controller Options. Or do we need to implement something to coordinate the execution of reconciles ourselves? Certainly easier to do within the same process than distributed, but is it necessary?

The short answer is no, you don’t have to implement it. Not because the problem does not exist, but because of the workqueue used internally, provided by client-go. See Learning Concurrent Reconciling (from 2019) for more details.

If you need more than one Reconcile at a time, because of reconcile durations or the number of objects in Kubernetes, you can tune MaxConcurrentReconciles. controller-runtime, with the help of workqueue, takes care that only one Reconcile per object is running at any given time.
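
As a minimal sketch of where that knob lives when wiring up a controller with controller-runtime; LoadBalancerReconciler and examplev1.LoadBalancer are hypothetical placeholders, not taken from any real project:

import (
	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/controller"
)

// SetupWithManager registers the reconciler with up to ten concurrent workers.
// Even then, the workqueue guarantees at most one Reconcile per object.
func (r *LoadBalancerReconciler) SetupWithManager(mgr ctrl.Manager) error {
	return ctrl.NewControllerManagedBy(mgr).
		For(&examplev1.LoadBalancer{}).
		WithOptions(controller.Options{MaxConcurrentReconciles: 10}).
		Complete(r)
}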

Let’s do a quick detour back to the events #

I mentioned that the reconciliation function is called for events as they happen. There are ways to filter events, and you can decide for which events you would like to be called, depending on your use case. Sometimes it makes sense to store a condition with a field observedGeneration, recording the metadata.generation of the last successful and complete reconciliation. You would then filter out all events where metadata.generation == observedGeneration, to avoid reaching out to external APIs and hitting some rate limit in the process.
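
Whether you implement that as an event filter (a predicate) or simply as an early return at the top of Reconcile, the effect is similar. A rough sketch of the in-Reconcile variant, assuming a hypothetical examplev1.LoadBalancer type whose status carries an ObservedGeneration field and a reconciler that embeds client.Client the way the kubebuilder scaffolding does:

import (
	"context"

	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

func (r *LoadBalancerReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
	var lb examplev1.LoadBalancer
	if err := r.Get(ctx, req.NamespacedName, &lb); err != nil {
		return ctrl.Result{}, client.IgnoreNotFound(err)
	}

	// The spec has not changed since the last complete reconciliation,
	// so skip the round trip to the external API.
	if lb.Status.ObservedGeneration == lb.Generation {
		return ctrl.Result{}, nil
	}

	// ... reconcile the external resource here ...

	// Record which generation we just handled.
	lb.Status.ObservedGeneration = lb.Generation
	if err := r.Status().Update(ctx, &lb); err != nil {
		return ctrl.Result{}, err
	}
	return ctrl.Result{}, nil
}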

Sometimes, if your code for instance sets routes on the system, you might always return with RequeueAfter. Checking and setting routes is pretty fast, and there are no real rate limits in place. So it might be better to make sure a route is in place, let’s say every 30 seconds, than to accidentally not have the route at all, whether because of a bug in the code or because someone/something removed it. This kind of reconciliation would also run with MaxConcurrentReconciles = 1, so there are never two route-altering processes at a time. Sure, you could watch netlink events in a separate goroutine and create a GenericEvent. However, this would need additional bookkeeping between the route and your object, and the effort might not be worth it. But again, everything depends on your use case.
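
In code, that periodic variant simply ends every successful run of Reconcile with a delayed requeue instead of a bare return (time needs to be imported):

// Check this object again in 30 seconds, even if no event arrives in between.
return ctrl.Result{RequeueAfter: 30 * time.Second}, nil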

Reaching out to check and maybe create/update an S3 bucket over and over again, on the other hand, might not be worth it. Here you would likely use the observedGeneration pattern.

You could even watch other objects and reconcile your real object. When there are ownerReferences, kubebuilder can do this pretty much automatically by using .Owns(...). But even if you watch some object without a real (or directly visible) relation to your object, .Watches(…) in combination with the correct enqueue function can do exactly this. Your code “just” needs to be capable of drawing the line between the observed object and your actual ones.
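
A sketch of the builder side with a recent controller-runtime (the Watches signature has changed over the versions). The types and the mapping are hypothetical: Owns covers objects carrying ownerReferences, while Watches plus a map function draws the line from an otherwise unrelated object back to ours.

import (
	"context"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/types"
	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/client"
	"sigs.k8s.io/controller-runtime/pkg/handler"
	"sigs.k8s.io/controller-runtime/pkg/reconcile"
)

func (r *LoadBalancerReconciler) SetupWithManager(mgr ctrl.Manager) error {
	return ctrl.NewControllerManagedBy(mgr).
		For(&examplev1.LoadBalancer{}).
		// Reconcile whenever a Service owned by one of our objects changes.
		Owns(&corev1.Service{}).
		// Reconcile when a ConfigMap changes; the map function decides
		// which of our objects are affected (hypothetical mapping).
		Watches(&corev1.ConfigMap{}, handler.EnqueueRequestsFromMapFunc(
			func(ctx context.Context, obj client.Object) []reconcile.Request {
				return []reconcile.Request{{NamespacedName: types.NamespacedName{
					Namespace: obj.GetNamespace(),
					Name:      "example-loadbalancer",
				}}}
			})).
		Complete(r)
}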

Resources #

I strongly encourage you to have a look at the kubebuilder book for inspiration. Also check the Kubernetes API conventions. If you would like to see a real-world example with several objects involved, take a look at cluster-api.

Updates #

In a recent edition of golangweekly I came across So you wanna write Kubernetes controllers by ahmetb. Much more thought out than what I am capable of.

Thoughts on ghostty 💻

A bit later than all the cool kids, I tried ghostty (on Linux), and I really like it.

Configuration is a breeze, and being able to use the same options in the config file and as command line parameters is great. I find ghostty’s options very discoverable, and the manpage works really well for me. I didn’t do much customization, but set up a light and a dark theme: rose-pine-dawn for light and rose-pine-moon for dark. A good combination, but oftentimes not supported by various terminals. Additionally, some minor tweaks to keybinds.
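
For reference, my understanding is that this boils down to a few lines in ~/.config/ghostty/config; the light/dark theme syntax, the theme names, and the keybind action are from memory, so treat this as a sketch and double-check against ghostty +list-themes and the docs:

# one theme for light mode, one for dark mode
theme = light:rose-pine-dawn,dark:rose-pine-moon
# minor keybind tweak, e.g. open a split to the right
keybind = ctrl+shift+e=new_split:right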

I would say I have been a tmux user for decades… which can’t really be true, because its initial release was in late 2007, and I’m not sure I was on it from the beginning. Anyway, I have used both screen and tmux, and even gave zellij a try. ghostty is the first terminal that seemingly makes a terminal multiplexer redundant for me, because I don’t use too many of its features on a regular basis. The only thing… my muscle memory needs some more time to adjust.

Wait... What??? Where are all my /boot files gone?

Hello future me 👋

You might wonder where all the files, e.g. /boot/vmlinuz-linux, /boot/amd-ucode.img, /boot/initramfs-linux.img, and the boot loader configs in /boot/loader/entries have gone.

I switched everything to unified kernel images (UKIs), primarily by changing the file /etc/mkinitcpio.d/linux.preset to generate the UKI. I also told it to place the kernel in /usr/local/share/boot/vmlinuz-linux instead of /boot/vmlinuz-linux, by setting ALL_kver="/usr/local/share/boot/vmlinuz-linux". The newly generated files end up in /boot/EFI/Linux/arch-linux.efi and /boot/EFI/Linux/arch-linux-fallback.efi. systemd-boot picks these up automatically, so no further entries are needed. Since there are no boot loader entries any more, we must tell the system about our cmdline some other way. This is done in /etc/cmdline.d/*.conf. These files are respected by mkinitcpio and bundled into the mentioned /boot/EFI/Linux/arch-linux*.efi files.

$ cat /etc/cmdline.d/10-root.conf
rd.luks.name=9c8381f8-7e0f-44f6-be26-655b70d33a32=root root=UUID=a1af7e43-857b-4903-8896-e25484175e5d
$ cat /etc/mkinitcpio.d/linux.preset
ALL_kver="/usr/local/share/boot/vmlinuz-linux"

PRESETS=('default' 'fallback')

default_uki="/boot/EFI/Linux/arch-linux.efi"
default_options="--splash /usr/share/systemd/bootctl/splash-arch.bmp"

fallback_uki="/boot/EFI/Linux/arch-linux-fallback.efi"
fallback_options="-S autodetect"
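
With the preset adjusted, regenerating both images is the usual run over all presets; nothing UKI-specific about the command itself:

mkinitcpio -P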

While I was at it, I decided to get rid of /boot/amd-ucode.img as well. mkinitcpio collects the firmware from /usr/lib/firmware/amd-ucode/ anyway and packs it into the initramfs, so there is no need to keep the file around. However, /boot/amd-ucode.img is part of the package amd-ucode, so I had to tell pacman not to extract those files.

$ grep NoExtract /etc/pacman.conf
NoExtract   = boot/*-ucode.img

Those changes provide nothing but a learning opportunity 😄 And maybe… just maybe… at some point in the future, I will sign everything.

Update 2024-12-23 #

That went faster than expected. I just went ahead, created some keys, and signed my images, roughly following Unified Extensible Firmware Interface / Secure Boot. I am using systemd-ukify for signing, and systemd-boot for enrolment.

I decided to create my own keys using openssl as documented. But first of all, I backed up my old setup. However, I realized that there is also a “Restore to Factory Settings” option in the firmware setup of the Zenbook.

My first attempt was to sign the resulting efi images, as well as the boot loader, with my own keys. Boot worked. However, now Windows for sure didn’t boot any more.

To fix that, I grabbed my backed-up .esl files, combined the new ones with the old ones, and signed everything with my Platform Key (PK) or Key Exchange Key (KEK) respectively.

And voilà. A couple of hours and attempts later, Windows boots with Secure Boot enabled. The process took far too long, but only because I thought my signing setup was wrong. It turned out the filesystem of my /boot was broken and had eaten the Windows boot loader. Took me a while to figure that out and find the original files to restore my system 😬

Generate keys #

openssl req -newkey rsa:4096 -nodes -keyout PK.key -new -x509 -sha256 -days 3650 -subj "/CN=chrigl Platform Key/" -out PK.crt
openssl x509 -outform DER -in PK.crt -out PK.cer
openssl req -newkey rsa:4096 -nodes -keyout KEK.key -new -x509 -sha256 -days 3650 -subj "/CN=chrigl 2024-12-21 Key Exchange Key/" -out KEK.crt
openssl x509 -outform DER -in KEK.crt -out KEK.cer
openssl req -newkey rsa:4096 -nodes -keyout db.key -new -x509 -sha256 -days 3650 -subj "/CN=chrigl Signature Database key/" -out db.crt
openssl x509 -outform DER -in db.crt -out db.cer

Create all the efi things #

The old_*.esl files were backed up from the EFI variables and are now included so that the Windows and ASUS stuff keeps working. I know… not very secure… but this is only for toying around anyway :)

sign-efi-sig-list -g "$(< GUID.txt)" -k PK.key -c PK.crt PK PK.esl PK.auth
sign-efi-sig-list -g "$(< GUID.txt)" -c PK.crt -k PK.key PK /dev/null noPK.auth

cert-to-efi-sig-list -g "$(< GUID.txt)" KEK.crt new_KEK.esl
cat new_KEK.esl old_KEK.esl > KEK.esl
sign-efi-sig-list -g "$(< GUID.txt)" -k PK.key -c PK.crt KEK KEK.esl KEK.auth

cert-to-efi-sig-list -g "$(< GUID.txt)" db.crt new_db.esl
# Leaving PK.esl here for now, because of how ukify uses the same cert everywhere
cat new_db.esl old_db.esl PK.esl > db.esl

sign-efi-sig-list -g "$(< GUID.txt)" -k KEK.key -c KEK.crt db db.esl db.auth
sign-efi-sig-list -g "$(< GUID.txt)" -k KEK.key -c KEK.crt dbx old_dbx.esl dbx.auth

cp db.auth keys/db/
cp dbx.auth keys/dbx/
cp KEK.auth keys/KEK/
cp PK.auth keys/PK/

Verifying everything, but not using sbkeysync for enrolment:

sbkeysync --keystore /etc/secureboot/keys --pk --dry-run --verbose

Enrolment with systemd-boot #

cp keys/*/*.auth /boot/loader/keys/auto/
cat /boot/loader/loader.conf

secure-boot-enroll manual

On reboot, you should see an additional menu entry, given that Secure Boot Setup Mode is enabled.

Signing kernel images #

systemd-ukify must be installed and configured properly.

# cat /etc/kernel/uki.conf

[UKI]
SecureBootPrivateKey=/etc/secureboot/db.key
SecureBootCertificate=/etc/secureboot/db.crt

[PCRSignature:initrd]
PCRPrivateKey=/etc/secureboot/tpm2-pcr-private-key.pem
PCRPublicKey=/etc/secureboot/tpm2-pcr-public-key.pem
Phases=enter-initrd

[PCRSignature:system]
PCRPrivateKey=/etc/secureboot/tpm2-pcr-private-key.pem
PCRPublicKey=/etc/secureboot/tpm2-pcr-public-key.pem
Phases=enter-initrd:leave-initrd
       enter-initrd:leave-initrd:sysinit
       enter-initrd:leave-initrd:sysinit:ready

Manually Signing systemd-boot #

There is no automation in place as of now.

Sign the systemd-bootx64.efi and install systemd-boot.

sbsign \
  --cert /etc/secureboot/db.crt \
  --key /etc/secureboot/db.key \
  /usr/lib/systemd/boot/efi/systemd-bootx64.efi

bootctl install

With all of this in place, systemd-cryptenroll can be used to unseal the luks-encrypted disk without requiring the real password.

E.g.

systemd-cryptenroll \
  --tpm2-public-key=/etc/secureboot/tpm2-pcr-public-key.pem \
  --tpm2-with-pin=yes \
  --tpm2-pcrs=7 \
  --tpm2-device=auto /dev/nvme0n1p5

Of course, the signing keys especially should live somewhere else, in a secure place.

👩‍🎓 TIL More About a Brand Nu Shell

I have known about nushell for quite a while. Today I finally installed it and gave it a try.

I thoroughly enjoy the hard break with traditional shells. “Everything is data”, the website claims, and this opens up a couple of interesting use cases. nushell supports json (along with a whole lot of other formats) out of the box, and also includes an http client. This makes nushell a perfect fit for browsing json APIs from the command line.

However, there are a bunch of other really interesting use cases. Tools like iproute2 or lsblk can output json.

We list all devices in json format and tell nushell to interpret it as json.

lsblk -J | from json

(Screenshot: running lsblk through from json)

We want to drill into the small nested box with root included.

lsblk -J | from json | get blockdevices.children.0.4.children.0

(Screenshot: drilling down into the lsblk data)

An absolutely great tool to dive into data 😄

You can even tell nushell to output in any known format. For example, | to md prints a markdown table.
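
A quick sketch of that, with select only there to keep the columns manageable:

lsblk -J | from json | get blockdevices | select name size type | to md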

I think I will find plenty of places where nushell will help me!

New Notebook, New archlinux Installation

To be clear, I’m talking about my personal notebook; at work I’m usually using some kind of macbook.

This November, I decided to retire my trusted Lenovo Thinkpad x230 from about 2013.

Because I’m a heavy terminal user, it was not too big of a deal. For programming, these days I’m on AstroNvim with a full-blown LSP setup, which mostly was fast enough for me. The trusted gear served me very well over the years, but the difference to a macbook air M1 is just… noticeable.

This month I somehow decided: this is the time! I went online, did a fair bit of research… not too much for sure… and ended up with an ASUS Zenbook UM3406.

Completely new installation #

I made some decisions about my OS. First, I wanted to keep a small partition for the included Windows 11. To be perfectly honest, I’m impressed by it, and thought for a second: maybe just stick with it and use WSL2? But no, I just can’t stand it 😄

Because I really have a whole lot of respect for the people behind Universal Blue, I gave the Gnome spin a try. It really works well, but I’m old and a diehard archlinux user… so I just installed my beloved distro.

But why do I keep Win 11 at all? I use it for Zwift, and it is just too much of a faff to run that on Linux. Even for me, sometimes things should just work.

It took me three iterations to get both installations right, mostly because of being stupid and removing partitions by accident, and letting the Windows installer pick a too-small efi partition.

My last installation of archlinux dates back to 2011, when I set up my pre-previous notebook. That installation I synced over to the Thinkpad in 2013, so I don’t remember anything about the old installation process. Today, I really liked how simple it is to configure WLAN using iwctl. I opted for full disk encryption, and did not separate /home from /, instead creating a single partition with btrfs. Guess I have been a btrfs user since 2013.

You might have already realized how weird I am: very old school on one hand, but a sucker for new stuff on the other. This time, I opted into using systemd-homed to manage my home directory. Because I’m using btrfs, am the sole user of this computer, and have full disk encryption, I use subvolume as the storage driver. There are strings attached, and some parts are not yet integrated with each other. For example, there is still accountsservice, but it is unable to really manage systemd-homed users. I expect things to grow closer together in the future, and I will get those changes pretty early because of the rolling-release nature of archlinux.
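
For reference, creating such a user boils down to something like this (the username is just an example, not my exact invocation):

homectl create chris --storage=subvolume --real-name="Chris"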

A brand new terminal emulator #

For convenience, I installed the package group gnome, which includes gnome-console. It works reasonably well and has a more modern look and feel than my trusted gnome-terminal. The lovely Universal Blue folks brought ptyxis to my attention. I installed it via GNOME Software, which is configured to use flathub, because there is no archlinux package yet.

I tested the container features using distrobox. Feels really nice and snappy. Some features I (unfortunately) lose out on because I am using tmux. Here again: old man, old habits. For instance, Ptyxis colourizes the borders red when it detects sudo.

All in all, a great user experience. I did not put too much time into it… well, because things should just work, and Ghostty is coming. I’m kind of a sucker for that kind of new stuff 😄

Using minimal config from the old one #

This time, I decided to only copy over a minimal set of configuration from my old notebook. It proved to be a good decision. For instance, my Evolution mail client looks cleaner now, even though it is the same version.

What’s next? #

This might or might not be my final installation. I still like the approach the lovely folks at Fedora and Universal Blue are taking. But for me, this notebook is not my “I earn money with it” machine, and I love having a relatively low-level Linux.

👩‍🎓 TIL More About pkg.go.dev Tooling

I use pkg.go.dev regularly. Sometimes just for finding a package that I know of, but whose full import path I don’t remember. One of those is go-spew. I know the package, I know it exists, but I will never be able to keep the full path in my head (for good reasons). Wouldn’t it be great to search from the command line?

gofind is our new friend 😄

❯ gofind spew
spew (github.com/davecgh/go-spew/spew)
    Package spew implements a deep pretty printer for Go data structures to
    aid in debugging.

    Imported by 15,504 | v1.1.1 published on Feb 21, 2018 | ISC

spew (github.com/spewerspew/spew)
    Package spew implements a deep pretty printer for Go data structures to
    aid in debugging.

    Imported by 12 | v0.0.0-...-89b69fb published on May 13, 2023 | ISC

...

👩‍🎓 TIL More About Rke2, Containerd and Private Registries

In a debugging session, I wanted to pull a container image manually using ctr, and encountered this (to me) weird error.

root@server:/etc# ctr --debug -n k8s.io image pull registry.k8s.io/ingress-nginx/controller:v1.11.3
DEBU[0000] fetching image="registry.k8s.io/ingress-nginx/controller:v1.11.3"
DEBU[0000] resolving host=registry.k8s.io
DEBU[0000] do request host=registry.k8s.io request.header.accept="application/vnd.docker.distribution.manifest.v2+json, application/vnd.docker.distribution.manifest.list.v2+json, application/vnd.oci.image.manifest.v1+json, application/vnd.oci.image.index.v1+json, */*" request.header.user-agent=containerd/v1.7.21-k3s2 request.method=HEAD url="https://registry.k8s.io/v2/ingress-nginx/controller/manifests/v1.11.3"
INFO[0000] trying next host error="failed to do request: Head \"https://registry.k8s.io/v2/ingress-nginx/controller/manifests/v1.11.3\": dial tcp: lookup registry.k8s.io on 127.0.0.53:53: server misbehaving" host=registry.k8s.io ctr: failed to resolve reference "registry.k8s.io/ingress-nginx/controller:v1.11.3": failed to do request: Head "https://registry.k8s.io/v2/ingress-nginx/controller/manifests/v1.11.3": dial tcp: lookup registry.k8s.io on 127.0.0.53:53: server misbehaving

I was a bit puzzled, because containerd is configured to use a private registry and should not go to the internet to find images. I double-checked that the image was really there, and explicitly tested other images. Still the same error, though.

At some point, I discovered that I needed to pass the registry configuration explicitly to ctr:

ctr -n k8s.io image pull --hosts-dir /var/lib/rancher/rke2/agent/etc/containerd/certs.d/ registry.k8s.io/ingress-nginx/controller:v1.11.3