A short one on Rust

1 August 2024

Coding Learning Rust

From time to time, I’m curious in some programming language. Once, it happened to be Rust, and I read the Rust book in the first lock down. Which I think is a really good source. Well done! Over the past years, I also read Rust in Action.

It did happen, that I didn’t write a single line of Rust so far, but read some code. A couple of weeks rustlings popped up onto my radar. I installed it, and walked through it. Only equipped with my theoretical knowledge, I made it up to section 20 so far without needing many hints.

I don’t really know what to I want to say, other than. Play around, try out new stuff, keep learning 😄

Another attempt to revive this blog

27 Juli 2024

For me it is notoriously difficult to keep up with sharing stuff. On one hand, I think I don’t have much to share. But so do most of the people on the internet.

In the past years, my blog software was just outdated and I didn’t take the time to do anything about it. As a result. I had some topics to write about, but because everything on my side was broken, I didn’t write. At some point, I decided to try to switch to hugo. Mostly for procrastination purposes 😅 I wrote a couple of lines of go, to convert my stuff over from Nikola restructured text to hugo markdown. Which turns out to be a nice little programming challenge. After finding all the edge-cases with my very personal use of Nikola, things went relatively well. To get most functionality back, I had to add some special templates, partials and shortcodes. Nothing special really. But work to be done.

Because my day work is also with computers, I tend to do not too much in my spare time. I really prefer spending time with my 👨‍👩‍👦 or riding my 🚴‍♂️. As a result. At some point, I was halfway in the migration. Even more reasons for me to not blog 🤷

Today is the day. It should rain the entire weekend. I spent the last two days on the bike. Thursday a 70km road ride 🚴‍♂️, yesterday 50km on gravel 🚵‍♂️. The kids are busy meeting friends. No real excuse to not spend an hour or two to shape off the last obvious edges.

Let’s see what happens next. Maybe I manage to keep up with writing, maybe this is the last post for the next six years 🙈 🙊 🙉

About Load Balancing at GitHub

24 August 2018

Devops

I just want to point to an interresting resource on load balancing. GitHub recently released GLB: GitHub's open source load balancer (GitHub Load Balancer Director and supporting tooling Repo), which is the actual implementation of what was descibed back in September 2016 Introducing the GitHub Load Balancer. It guides the reader through the challanges GitHub faces in the space of load balancing, and how they came to their solution, and what didn't work for them.

The load balancer at GitHub needs to be increadibly reliable. We all use it when cloning or pushing to a repo, which both could take a while. Cutting the connection for maintenance purposes would affect many people, especially those, working with bad internet access.

Even if you are not GitHub, this might be interresting. For instance, if you run OpenStack and depend on the Image-API (Glance) for VM snapshots, a restart of the load balancer results into a corrupt snapshot, which might be unnoticed until there is an urge to use the snapshot. For sure. One should verify if the snapshot works, but not everyone does. And even if one does, it is quite annoying to wait let's say an our for the upload, just to discover the time has been wasted.

Use HyperFIDO HYPERSECU on Linux

5 Januar 2017

Configuration

To use HyperFIDO HYPERSECU on Linux you must have access to the created device. So you need to add a udev rule to set up the permissions. There are two examples on the support page, but for the systemd-uacces method, one information is missing.

You can use the [HyperFIDO-k5-1.rules]{.title-ref} approach:

ACTION!="add|change", GOTO="u2f_end"

KERNEL=="hidraw*", SUBSYSTEM=="hidraw", ATTRS{idVendor}=="096e", ATTRS{idProduct}=="0880", TAG+="uaccess"

LABEL="u2f_end"

But you need to rename it to something smaller than 70- in [/etc/udev/rules.d/]{.title-ref}, because the uaccess-Tag is processed in [70-uaccess]{.title-ref}.

E.g. [/etc/udev/rules.d/69-hyperfido.rules]{.title-ref}.

Starting from scratch

30 Oktober 2016

Devops

This post is part of a series how we as SysEleven manage a large number of nodes. We deploy OpenStack on those nodes, but this could be basically everything.

For sure, this is not our first attempt to deploy a managable OpenStack platform. We not only deployed this platform, we also deployed a platform based on Virtuozzo, which is still in heavy use for our managed customers. We have a whole bunch of learnings from deploying and managing the mentioned platforms, which leds us to the decision to start from scratch.

In the beginning of this year, there is basically nothing, but a rough plan. Not even a single line of code, neither software, nor some kind of automation. With this project, we were able to break with everything we run in the company. We were (and still are) allowed to question each software and hardware decission. We started on a greenfield.

There may be much things you will put into question. Most of what we did is inpired by what others did. Nothing is new, or kind of rocket science. Sometimes we needed to take the shortest path, because we had a deadline to mid of the year to deploy a stable high available OpenStack platform.

So, where to start over? #

First of all, we needed to drill down the issues we encountered with the previous deployments. We went from vague ideas to concret problems with soft- and hardware solutions we used. But the larges problem was a lack of formal definition of what we want to build, and how we want to do that.

So the very first step was to change this. We wrote a whole bunch of blueprints, capturing a high level view of the cluster. Some of them were very precise, like the decision for Ansible as our primary configuration management system, although the former cluster was build with Puppet. Or the decission for network equipment, how we plug them together and how we operate it. Some other very specific blueprints described that we use a single mono-repo, how we manage the review process, everything we script has to be done in Python of Go, styleguides for those language and Ansible, how we handle dependencies in Ansible, that we are going to solve our current problem, and not every thinkable future problem, and so on and so forth. There were some vague blueprints about: We need a way to get an overview of the current cluster state. Not just what we expect. We need a system to initially configure and track our nodes.

All blueprints are meant to change. Not in a whole. But each time, someone diggs into the next topic, the blueprint is extended with the current knowledge and needs.

So we had a rough overview of what we need. We split the team into two groups. As we knew that we had to deploy a working OpenStack, one group started to deploy OpenStack, with all the components needed via Ansible. The primary goal still is to provide a high available OpenStack. The group I belong to works on the individual software we need to provide a basic configuration for our nodes, keep track of them and keep them up to date.

I joined the team after 8 month off, to take care of my boy. At that point, the basic blueprints were written and the team was consolidated. And to be honest, I am not sad to miss this process, and espacially not the path leading to this.

Where do we get from there? #

We dedicated three nodes to be our bootstrap nodes, so called bootinfra. They are mostly installed and configured by hand.

Within half of the year, we were able to plug a new node into a rack. This shows up in our MachineDB where we configure the location of the node. After this, a very basic configuration happens automagically. The node configures its designated fixed ip addresses, hostname and fqdn, some bgp configuration, setup of repos, setup of ssh configuration. Just very basic stuff to reach the node. The next step is still very manual. We assign the node to a group in our Ansible Inventory File, and run Ansible by hand.

From the Ansible point of view, there is no change since then, even there was much process in detail. But we were able to deploy two other cluster in the same manner. One of them is our lab, which shares the production bootinfra. That tends to be easier, but it threw up a whole bunch of problems. For the other cluster, we needed to automate much of our (so far handcrafted) bootinfra. Which now pays out, since we are about to replace the initial bootinfra nodes by automated ones.

Next time, I will write about our bootinfra. Not too much about the configuration, but which services we run, and for what reasons.

Managing a large number of nodes

28 Oktober 2016

Devops

Imagine following setup. As surly mentions somewhen in this blog, we are running a cloud based on OpenStack. The team, I am working in, is responsible to provision new nodes to the cluster, and manage the lifecycle of the existing ones. We also deploy OpenStack and related services to our fleet. One hugh task is running OpenStack high available. This seems not too difficult, but also means, we have to make each component OpenStack depends on, HA as well. So we use Galera as Database, Quobyte as a distributed storage, clustered RabbitMQ, clustered Cassandra, Zookeeper, and things I may have forgotten.

I will write about some aspects of our setup, how we deploy our nodes, and how we keep them up to date.

Starting from scratch

Dropping weekly format

28 Oktober 2016

Weekly

Turns out. Like expected I am not able to write one post per week, and I didn't manage to pick up after two weeks of vacation. When reading the weekly posts again, they seem kind of weird. Which is not bad, but nut worse the time. So I call the my first attempt to write a series to be failed.

Anyway. Natually, not each week is equally interresting. I will try to pick up interresting topic here, and try to go a bit into detail.

Weekly 2016/37

16 September 2016

Weekly

No such meeting Monday. This week, our all-hands-meeting was moved to Tuesday. Because most people in my team were on vacation, we did a short daily instead of a 1 hour weekly. More time to do real work :) I returned to the cluster wide unattended update. I worked at least half of the day on this topic. Another look into the networking setup of our lab and some more documenation did the rest of it.

Back in office on Wednesday after a day of sickness. From Tuesday on, the production cloud was kind of slow. In terms of: Creating a bare network took 6 seconds in average, instead of 2 seconds in the lab. Digging into the problem showed a problem in filling up one rabbitmq queue. Usually this should be consumed by a telemetry solution, which is not deployed correctly so far. I removed all the messages in the queue and everything went back to normal again.

On Thursday, I had not too much time, to digg into unattended updates again. Just an hour in the morning. This week is strange, with many little disturbances, and I literally forgot all the small stuff. After lunch we did a complete wipe and reinstall of our lab. I left after 2 hours of installation for 10 hardware nodes. Does not seem to be pretty fast, but in fact this is ok. At least we had to do the process on our first 3 nodes (those form our consul cluster, now) twice, because of an expired pgp key. My colleagues Continues with running Ansible on the nodes, without huge problems.

Although, the installation on the day before was pretty ok overall, I cleaned up some stuff on Friday. With some minor changes, the installation is smoother than Thursday. I did not run an entire installation. But shut off the consul cluster to simulate the "nothing is there"-situation. This worked pretty well. I did a second reinstallation, but with consul turned on. This is the usual scenario if we grow our existing cluster by new machines. I finalized the installation by applying Ansible. Because the node, I selected, was a gateway of our SDN, there was a little problem with a new UUID for this node, but easily to resolve by hand. This sceanrio is not usual, so I just documented this issue and did not automate it away. Furthermore, I extended the existing documentation by things, I saw yesterday, but not yet in the documentation. Somewhen in between, we had a call with our storage vendor about the next steps to happen. The afternoon I prepared a update of consul 0.6.4 to 0.7. Its going to be bad:

"Consul's default Raft timing is now set to work more reliably on lower-performance servers, which allows small clusters to use lower cost compute at the expense of reduced performance for failed leader detection and leader elections. You will need to configure Consul to get the same performance as before. See the new Server Performance guide for more details. [GH-2303]"

In other words: It did not work well on raspberry pi, we fixed this, but destroyed it for everyone else. And for sure, 0.6er consul refuse to start because of an unkown key, if I already would place the new paremeter to all config files :(

Oh. Almost forgotten. Last week I wrote about my plans of cleaning up my vim modules. And with this, a switch to neovim. Turned out. Cleaning up stuff went well. The neovim experiment not. It felt ok in most points, but not better then good old vim. And one thing let me scream several times, and go back vim. In some/most cases a hit on ESC took a while to complete. That long, that I typed some more characters before leaving insert mode. An I'm escaping a lot!

Weekly 2016/36

9 September 2016

Weekly

Starting with meeting Monday. The first half of the day was gone for this. After lunch, I had to add another authentication mechanism to our MachineDB. In fact this is a Flask-app which uses ldap as an authentication backend. We have to put this part of the software into a place without any access to our ldap. Since we use postgres as the DB backend, I created another authentication backend, using postgres. In the afternoon I helped a fellow with some scripts to setup floating ip networking to our lab cluster.

On Tuesday, I continued my work on the authentication backend and got it merge ready. More debugging was done at the floating ip creation stuff. Because the script is more complex than I hoped, I created the networks by hand and was able to configure our integration test to run against the lab.

The next day was a short day for me. I figured out some problems detected by the tempest run. Some strange errors happen due to the ordering of openstack endpoints. The config parameter v3_endpoint_type has been ignored, and adminURL was used by accident. By accident, since it really depended on the sorting of endpoints. I fixed this by updating the id of the endpoint in the keystone database. Beside this, I did some daywork.

Thursday, was kind of boring. First of all, it was a short day again. So primarily I did some daywork, and prepared a backup-presentation for the Berlin Ops Summit on Friday. Turned out, we didn't need it.

Friday was Conference day. I liked the Talk of Jeff Sussna. Most other talks were ok, but not new to me. The most entertaining talk was UI engineering by Dennis Reimann and Jan Persiel. The closing party was great :)

Btw: I am on my way to re-create my vimrc in terms of throwing away modules and settings, I don't use any more. I take the chance to switch to neovim in this step. Lets see how far I come with this.

Linux Kernel Features…

8 September 2016

… mit dessen Hilfe Container-Runtimes gebaut werden. Keine Garantie für die Vollständigkeit dieser Liste.

chroot #

z.B. Alpine Linux in chroot

Um den gestarteten Prozessen ein anderes / vorzugeben. Noch nicht mal ein Linux-Feature und wurde bereits 1979 in Unix implementiert.

# Ich habe da mal was vorbereitet.
export mirror=http://dl-3.alpinelinux.org/alpine/
export chroot_dir=/root/alpine
export version=2.6.7-r0
wget ${mirror}/latest-stable/main/x86_64/apk-tools-static-${version}.apk
tar -xzf apk-tools-static-*.apk
./sbin/apk.static -X ${mirror}/latest-stable/main -U --allow-untrusted --root ${chroot_dir} --initdb add alpine-base
mknod -m 666 ${chroot_dir}/dev/full c 1 7
mknod -m 666 ${chroot_dir}/dev/ptmx c 5 2
mknod -m 644 ${chroot_dir}/dev/random c 1 8
mknod -m 644 ${chroot_dir}/dev/urandom c 1 9
mknod -m 666 ${chroot_dir}/dev/zero c 1 5
mknod -m 666 ${chroot_dir}/dev/tty c 5 0
cp /etc/resolv.conf ${chroot_dir}/etc/
mkdir -p ${chroot_dir}/root
mkdir -p ${chroot_dir}/etc/apk
echo "${mirror}/${branch}/main" > ${chroot_dir}/etc/apk/repositories

mount -t proc none ${chroot_dir}/proc
mount -o bind /sys ${chroot_dir}/sys

chroot ${chroot_dir} /bin/sh -l

Namespaces #

Definiert was Prozesse sehen und machen können.

Mount #

unshare -m /bin/bash
mount -t tmpfs tmpfs /tmp

touch /tmp/blah
ls -l /tmp ### nur blah


Anderes Terminal
ls -l /tmp ### orig-stuff aber kein blah

PID #

unshare --pid /bin/bash
ps -ef ## segmentation fault, da fehlt wohl noch was

unshare --pid --mount-proc -f /bin/bash
ps -ef
UID        PID  PPID  C STIME TTY          TIME CMD
root         1     0  0 08:26 pts/1    00:00:00 /bin/bash
root         4     1  0 08:26 pts/1    00:00:00 ps -ef

User #

Wichtig, als != root starten

id -u ## 500
unshare --user --map-root-user /bin/bash
id -u ## 0

Trotzdem hat man keine root-Berechtigung!

mount -o tmpfs tmpfs /tmp  ## permission denied

Network #

man ip-netns

ip netns list
ip netns add testing
ip netns exec testing ip a
1: lo: <LOOPBACK> mtu 65536 qdisc noop state DOWN group default qlen 1

ip link add veth0 type veth peer name veth1
ip link set veth1 netns testing
ip a ## kein veth1

ip addr add 192.168.0.1/32 dev veth0
ip route add 192.168.0.0/24 dev veth0
ip netns exec testing ip link set veth1 up
ip netns exec testing ip addr add 192.168.0.2/32 dev veth1
ip netns exec testing ip route add 192.168.0.0/24 dev veth1

ping 192.168.0.2 ## works

ip netns exec testing ip a
1: lo: <LOOPBACK> mtu 65536 qdisc noop state DOWN group default qlen 1
7: veth1@if8: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
   inet 192.168.0.2/32 scope global veth1

Capabilities #

Gezielt Fähigkeiten von Prozessen beschränken. Auch wenn der Prozess als effective user 0 (root) läuft.

man capabilities

mount -t tmpfs tmpfs /mnt
umount /mnt
capsh --drop=CAP_SYS_ADMIN --
id -u  ## 0
mount -t tmpfs tmpfs /mnt  ## permission denied

cgroups #

Resourcen von Prozessen einschränken. Beispiel Memory.

cd /sys/fs/cgroup/memory
mkdir testing
cd testing
ls -l  # zeigt u.a. tasks

# 2. Console auf und dort die PID ermitteln
ls -l  # funktioniert
echo $$

# Zurück in Console 1 und bash der Console 2 in die cgroup testing aufnehmen
echo PID > tasks
cat tasks
PID

echo 1 > memory.limit_in_bytes

# Console 2
ls -l
Killed

Wie man sieht hängen alle Child Prozesse auch in dieser cgroup!

Weitere Resourenlimits:

CPU
blkio
devices
pids (Process numbers)

Siehe die Kernel Dokumentation zu cgroups.

seccomp #

secure computing mode. Gezieltes filtern von system calls. Leider habe ich hier kein einfaches Beispiel.

AppArmor / SELinux / SMACK #

Mandatory Access Control im Linux Kernel. Es gibt verschiedene Implementierungen. Ubuntu verwendet z.B. AppArmor.

Auf Ubuntu

vi /etc/apparmor.d/usr.sbin.rsyslogd
...
/usr/sbin/rsyslogd {
  # rsyslog configuration
  /etc/rsyslog.conf r,
  /etc/rsyslog.d/ r,
  /etc/rsyslog.d/** r,
  /{,var/}run/rsyslogd.pid rwk,
  /var/spool/rsyslog/ r,
  /var/spool/rsyslog/** rwk
 ...
}

AppArmor wird unter Ubuntu auch dazu verwendet den Zugriff von VMs-Prozessen einzuschränken. Damit wird der Schaden potentielle Schaden bei einem Ausbruch aus der VM verringert.

Zum Schluss #

Alle der genannten Features existieren schon einige Jahre im Linux-Kernel. Vorläufer davon existieren in Form von OpenVZ / Parallels Virtuozzo bereits seit ca. 2000. Allerdings außerhalb des Mainline Kernels gepflegt. Die Features können nicht nur dafür benutzt werden um Container Runtimes wie Docker oder CoreOS rkt zu erstellen, man kann auch "regulär im System" laufende Anwendungen damit einpferchen. Auch systemd bietet Flags, die einige der angesprochenen Techniken verweden um Prozesse besser voneinander zu trennen. Siehe dazu Security Features in systemd.