Weekly

This is my first attempt to write about my week in tech: more about the things I did, and less about the things I read about. First of all, this is for me. Sometimes I get a bit frustrated about being caught in a treadmill, so I really don't know where this is going.

Let's step into this week. It was going to be a short week. I'm still working part time, 30h a week, and this week that meant Monday to Thursday. The main task, and the target as well, was bootstrapping the hardware nodes for our second OpenStack cloud, which is going to be a lab. That means this is the second time we bootstrap with our automation, and the first time we do it from zero to working without any steps in between.

Of course, the hardware is not exactly the same as in the production cloud, but it differs only in some nuances. The most important difference for the entire cluster is that we share the boot infrastructure (like PXE boot) of the production cloud. The lab cloud must get its own network, but it also has to reach our boot infrastructure and must be reachable from the cluster workstations of the production cloud. The connections were, of course, set up by our netops.

A quick look at the networking. The ToR switches of the production cloud are connected to a QFabric, and so is the ToR of the lab. Between those two, an MPLS L3VPN is configured. Even though the racks stand side by side, we decided to use this configuration. Our plan is to spread several clusters over the data center, so this is quite a nice test case.

The nodes had been started at the end of the week before, but did not show up in the machine database. I didn't even expect this to work the first time. Unfortunately, due to the design, I was not able to reach the BMC of any node. The nodes should be configured to boot from the network into a live system. This does a quick inventory of the machine and submits it to the database. We, as the admins, have to configure the rack position. The live system picks up this configuration periodically and does some system configuration, like setting a fixed BMC IP address and password. We are also able to set the system disk in our machine database. After these two configuration bits, rack position and system disk, the node is ready to boot into the Ubuntu installer and configure itself with a preseed. Again, this was a decision made at some point. I would prefer images here. But anyway.
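
The decision described above can be sketched in a few lines of Python. This is only an illustration: the field names `rack_position` and `system_disk`, the target names `live` and `installer`, and the sample values are made up for the example and are not taken from the real machine database.

```python
def next_boot_target(node):
    """Return the boot target for a node record (a plain dict here).

    A node stays in the live system until both configuration bits,
    rack position and system disk, have been set by an admin.
    """
    if node.get("rack_position") and node.get("system_disk"):
        return "installer"
    return "live"


# Freshly inventoried node: nothing configured yet (hypothetical record).
fresh = {"serial": "ABC123"}
# Same node after an admin has set both bits (hypothetical values).
configured = {"serial": "ABC123", "rack_position": "r1-u12", "system_disk": "/dev/sda"}

print(next_boot_target(fresh))
print(next_boot_target(configured))
```

Keeping the rule in one small function makes it obvious why a node with only one of the two bits set keeps booting into the live system.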

However, none of the 10 machines showed up, and the BMCs did not work. We sent one of our colleagues to the data center. While he was on his way, I connected to the ToR of the lab, because it was reachable, and did some basic tests. Because we use anycast for DNS, NTP and some more stuff, the first guess was that anycast was not working. Turns out anycast worked. But the throughput was at least one problem: I only got 1.4 MByte/s from ToR to ToR, which is really bad, since they are connected with 40 GBit/s. My DC fellow checked the network connections and the patching. Everything looked good so far. Since we started late, due to meeting Monday, we called it a day.
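
To put that number into perspective, here is the arithmetic as a small Python sketch. It assumes decimal units (1 GBit = 1000 MBit, 1 byte = 8 bit) and ignores any protocol overhead.

```python
LINK_GBIT_PER_S = 40        # nominal link speed between the ToRs
MEASURED_MBYTE_PER_S = 1.4  # observed ToR-to-ToR throughput

# Convert the nominal link speed to MByte/s (decimal units).
link_mbyte_per_s = LINK_GBIT_PER_S * 1000 / 8

# Share of the nominal capacity that was actually reached.
utilisation = MEASURED_MBYTE_PER_S / link_mbyte_per_s

print(f"link capacity: {link_mbyte_per_s:.0f} MByte/s")
print(f"utilisation:   {utilisation:.3%}")
```

The link could carry about 5000 MByte/s, so the measured rate is roughly 0.03% of the nominal capacity.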

On Tuesday, I tried to get our netops to solve the problem. Most of them were busy as well, but with a little chatting and further debugging, I detected a drop of the MTU from 9000 to 1500 somewhere in the QFabric. With the help of the netops, we checked the links. All were configured properly. Great!? Since both ToRs are configured with an MTU of 9000, and given the bad throughput, I suspected that path MTU discovery was not working. This is even worse if 10 nodes are supposed to netboot and install at almost the same time. Since we still lacked BMC access, my colleague visited the DC again. He shut off all the nodes and brought the first one back up. This barely worked, but it was able to download the live image.
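
One way to confirm an MTU drop like this is to send pings with the don't-fragment bit set and search for the largest packet that still gets through. The following is a minimal sketch, assuming IPv4 and the Linux iputils `ping` (where `-M do` sets the DF bit). It is not the tooling used in this story, and the host name in the usage note is a placeholder.

```python
import subprocess

IPV4_HEADER = 20  # bytes, IPv4 header without options
ICMP_HEADER = 8   # bytes, ICMP echo header


def icmp_payload_for_mtu(mtu):
    """Largest ICMP echo payload that fits into one IPv4 packet of size mtu."""
    return mtu - IPV4_HEADER - ICMP_HEADER


def ping_command(host, mtu):
    """Build a single don't-fragment ping that exactly fills a packet of size mtu."""
    # -M do: set DF, -c 1: one probe, -W 2: wait 2 seconds, -s: payload size
    return ["ping", "-M", "do", "-c", "1", "-W", "2",
            "-s", str(icmp_payload_for_mtu(mtu)), host]


def probe_with_ping(host, mtu):
    """Return True if a packet of size mtu reaches host unfragmented."""
    result = subprocess.run(ping_command(host, mtu),
                            stdout=subprocess.DEVNULL,
                            stderr=subprocess.DEVNULL)
    return result.returncode == 0


def find_path_mtu(probe, low=576, high=9000):
    """Binary search for the largest mtu in [low, high] where probe(mtu) is True.

    Returns None if even the smallest size does not get through.
    """
    if not probe(low):
        return None
    while low < high:
        mid = (low + high + 1) // 2
        if probe(mid):
            low = mid
        else:
            high = mid - 1
    return low
```

A call like `find_path_mtu(lambda mtu: probe_with_ping("lab-tor.example", mtu))` would then report the largest size that passes. Passing the probe in as a function keeps the search testable without touching the network.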

I was stuck at this point. I had to wait for the netops who initially configured the L3VPN to help out. So in the meantime, I helped to submit a new version of the internal storage manager (which configures disks to be system, LVM or cluster FS) and get it out to the production cluster. This removed a blocker from an aborted rollout of the week before. At the end of my day, we had 4 of 10 nodes running. Before I left, I managed to grab the netops, and he promised to have a look at the issue. During my spare time, my fellow in the DC discovered that six nodes didn't even boot from the correct network device. Of course he was not able to fix this, because the cards need a firmware update to get it working.

Wednesday. As my netops had promised, the MTU issue was fixed. Wohooo. It turned out that there was an overall setting that overrides the concrete port configuration m( Because of the MTU struggles, I hadn't debugged the ToR of the lab enough. To recap, we still lacked BMC access. In the morning I discovered a missing link from the ToR to the BMC switch. Luckily for me, my fellow had to visit the DC again anyway to update the firmware. So he armed himself with some optics.

There was nothing for me to do on this front. I waited for the updated nodes to submit their inventory to our database, and for a working BMC. So I continued the work on a solution to run a cluster-wide unattended update that stops if something strange happens or a human hits the red button. In the evening, my fellow replaced the fiber and got another 3 nodes running. The remaining nodes ended up in an endless loop in the installer. But I thought, this is pretty good. We could define those nodes to be compute nodes, and deploy OpenStack with only one compute node.
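
The idea of such an update loop can be sketched as follows. This is not the actual tooling, just a minimal illustration: `update_node`, `is_healthy` and the stop file path are hypothetical stand-ins for whatever the real cluster provides.

```python
from pathlib import Path


def stop_file_present(path="/tmp/STOP_UPDATE"):
    """The 'red button': an operator creates this (hypothetical) file to halt."""
    return Path(path).exists()


def rolling_update(nodes, update_node, is_healthy, stop_requested=stop_file_present):
    """Update nodes one at a time and stop at the first sign of trouble.

    Returns a tuple (updated_nodes, reason), where reason says why the loop ended.
    """
    updated = []
    for node in nodes:
        # Check the red button before touching the next node.
        if stop_requested():
            return updated, "stopped by operator"
        update_node(node)
        # Something strange happened: leave the rest of the cluster untouched.
        if not is_healthy(node):
            return updated, f"{node} unhealthy after update"
        updated.append(node)
    return updated, "finished"
```

Updating strictly one node at a time is the most conservative choice. A real rollout might work in batches, but the stop conditions stay the same.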

My Thursday started with a totally different topic. We have a 15-minute time slot on Thursdays to present some interesting stuff. I showed my Kubernetes stack running on top of our OpenStack. 15 minutes is far too little; I could talk for hours about this topic. But anyway, it is a great way to show other teams topics to think about, even though I would have liked to see more faces at my presentation. I did a live demo of booting the stack, so there are no slides, but the stack is available on GitHub.

After this short trip, I went back to the cluster bootstrap and noticed that exactly those broken nodes have to be the controllers, because of the SSD provisioning. So we had to get the nodes out of the door. However, we had BMC access by now, so at least no one had to visit the DC. Of course, no errors were shown on the serial console, nothing but the progress bar starting over and over. Thankfully my DC fellow was able to connect to iLO and get some error messages. Somewhere during boot, the routes to the lab disappeared. There was an experimental rollout on the lab ToR that broke the config of bird. A trivial fix, and we were up again. Some of my tests of the installed nodes showed missing bits in the consul configuration, as well as a wrong configuration of the FQDN. Trivial fixes for another fellow.

Back to the installation process. It turned out that the very last step failed. Because Ubuntu changes the boot device to the install disk, there is a late command that resets the boot order to netboot. We always use netboot, to be able to change the grub config even if we are not able to connect to the nodes at some point. The netboot grub fetches the next desired boot target for the node. A tiny fix in a regexp that detects the network device did the trick. The nodes installed correctly and came up as expected.

This was at 14:30. I called it a week and left for the weekend. With major help from "the DC fellow", I was able to get the groundwork done for a running lab (installed with the same automation as production). The next steps are up to my colleagues anyway.

All in all a very productive week, considering "only" 30h :)

Today I was at the Amtsgericht Pankow/Weißensee

/galleries/amtsgericht-pankow-2016/IMG_20160211_122422.thumbnail.jpg

The reason was quite simple. My girlfriend received a notice about an inheritance from a deceased great-aunt. This great-aunt had no children of her own, so her nieces and nephews were entitled to the inheritance. They (7 of them) disclaimed the inheritance, and so the next tier was contacted. Why was I there at all? If, as recommended, all of my girlfriend's cousins disclaim the inheritance as well, our son will be contacted next. Since we both have custody, we took care of it for our son right away. A power of attorney is not sufficient; I had to appear in person.

Since I find the story around the inheritance so interesting, I will write about it later. This post, however, I would like to dedicate to the building in Parkstraße where the Amtsgericht Pankow/Weißensee (the district court) resides.

"Dedicate" may be a bit of an exaggeration. I simply find the building interesting, and the impression from the outside did not deceive. Apart from the offices, which were furnished exactly as I would have expected at a district court, we found high, beautifully decorated ceilings with great lamps, and interesting staircases.

/galleries/amtsgericht-pankow-2016/IMG_20160211_123726.thumbnail.jpg

What I did not know: apparently, as a customer of the district court you count as "Publikum" (the public).

I also learned something about inheritance law. Disclaiming an inheritance costs a minimum fee of 30 euros; the fee actually due depends on the value of the estate. Up to an estate of 5000 euros, however, it stays at the 30 euros mentioned.

So inheriting is a bit of a lottery. If you accept and it consists of debts, you are in a mess. If you do not accept, but it turns out that the deceased had more than 5000 euros to leave, it may get expensive as well. Unfortunately I forgot to ask what "the fee depends on the assets" actually means. A percentage of the assets? Positive or negative assets? Who knows. I am too lazy to research it.

GNOME 3 color profile and screen brightness

Lesson learned: Next time I face a problem with some desktop software, the first thing I will do is run dbus-monitor.

I decided to switch my desktop environment from Cinnamon to GNOME 3. As a by-product, I created a new color profile with my Spyder3. The old one was from 2014, and it was definitely time to create a new one. For this, GNOME ships gnome-color-manager. So I set my display to my "photo-editing" brightness and started measuring.

So far, everything worked fine. I only had to touch gnome-tweak-tool once, to set my favorite fonts.

I did everything in one session, without logging out or rebooting. During this time, I also created my shortcuts to switch the screen brightness between "working" and "photo-editing" mode. On the first try, they worked. The only issue: the brightness slider did not move to the correct position.

At this point I was done! I just wanted to compare the console fonts with those set in Cinnamon. So I logged out, switched the session to Cinnamon, took a screenshot, and switched back to GNOME.

On login it happened: GNOME set the screen brightness to 100%. I set the brightness manually using the brightness slider. Still the same problem. Searching the web did not really help. I was told to place a startup desktop file in ~/.config/autostart/, but this did not work very reliably. One thing I did try was killing gnome-settings-daemon, since in my experience it could be the cause. Turns out gnome-settings-daemon was responsible. Next, I tried to find some gsettings in dconf, without any success.

So I got the source code of gnome-settings-daemon and found the responsible lines of code within a few minutes. Walking up the possible call tree, I did not find an explicit call, but I discovered the D-Bus API of the power plugin. So I started dbus-monitor and pressed the brightness buttons to get an idea of how the API is used, mainly to be able to put a filter on dbus-monitor.

$ dbus-monitor --session "path=/org/gnome/SettingsDaemon/Power"

This only shows calls to the power plugin. I used path rather than interface, since this also shows property changes.

Killing gnome-settings-daemon showed a property change of brightness to 100%.

signal time=1453584221.013391 sender=:1.579 -> destination=(null destination) serial=214 path=/org/gnome/SettingsDaemon/Power; interface=org.freedesktop.DBus.Properties; member=PropertiesChanged
   string "org.gnome.SettingsDaemon.Power.Screen"
   array [
      dict entry(
         string "Brightness"
         variant             int32 100
      )
   ]
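
If you want to watch for this signal over a longer time, the brightness value can be pulled out of the dbus-monitor text with a small regular expression. This is a minimal sketch that assumes the output format shown above; the sample text is the signal from this post.

```python
import re

# Matches the "Brightness" dict entry as printed by dbus-monitor.
BRIGHTNESS_RE = re.compile(r'string "Brightness"\s+variant\s+int32 (\d+)')


def brightness_changes(monitor_output):
    """Return all brightness values found in dbus-monitor text output."""
    return [int(value) for value in BRIGHTNESS_RE.findall(monitor_output)]


SAMPLE = '''signal time=1453584221.013391 sender=:1.579 -> destination=(null destination) serial=214 path=/org/gnome/SettingsDaemon/Power; interface=org.freedesktop.DBus.Properties; member=PropertiesChanged
   string "org.gnome.SettingsDaemon.Power.Screen"
   array [
      dict entry(
         string "Brightness"
         variant             int32 100
      )
   ]
'''

print(brightness_changes(SAMPLE))  # prints [100]
```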

From this point on, I knew the call, and grepped the source code of gnome-settings-daemon again. And that was the missing piece. I found a D-Bus call in plugins/color/gsd-color-state.c … the color manager part of gnome-settings-daemon. There I discovered the following lines:

// if output is a laptop screen and the profile has a
// calibration brightness then set this new brightness
brightness_profile = cd_profile_get_metadata_item (profile,
                                                   CD_PROFILE_METADATA_SCREEN_BRIGHTNESS);
if (gnome_rr_output_is_builtin_display (output) &&
    brightness_profile != NULL) {
        // the percentage is stored in the profile metadata as
        // a string, not ideal, but it's all we have...
        brightness_percentage = atoi (brightness_profile);
        gcm_session_set_output_percentage (brightness_percentage);
}

The comment pointed me directly to the solution of my problem. I took a look into the ICC profile, and voilà, the metadata said "Screen Brightness: 100".

I only had to remove this metadata from my profile (located in ~/.local/share/icc).

$ cd-fix-profile PROFILE.icc md-remove SCREEN_brightness

\o/

However, I am pretty sure the screen was at the correct brightness level (39%) when I started to create the profile.

And again, I faced a small but annoying problem and was able to find the cause. And this just because GNOME is open source!

Above, I wrote about the issue of the brightness slider not moving. As a by-product of the walk through gnome-settings-daemon, I was able to change my brightness switch command to use the D-Bus calls.

Setting the brightness to 9% (which is my working mode, most of the time):

$ dbus-send --session --type=method_call \
  --dest="org.gnome.SettingsDaemon" \
  /org/gnome/SettingsDaemon/Power \
  org.freedesktop.DBus.Properties.Set \
  string:"org.gnome.SettingsDaemon.Power.Screen" \
  string:"Brightness" \
  variant:int32:9

Setting the brightness to 39% (for photo-editing):

$ dbus-send --session --type=method_call \
  --dest="org.gnome.SettingsDaemon" \
  /org/gnome/SettingsDaemon/Power \
  org.freedesktop.DBus.Properties.Set \
  string:"org.gnome.SettingsDaemon.Power.Screen" \
  string:"Brightness" \
  variant:int32:39

Wohooo, and yet it moves :D
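
The two commands differ only in the value, so they can be wrapped in one small helper. This is a minimal Python sketch, assuming `dbus-send` is on the PATH and the same bus name, object path and property as in the commands above. The mode names are just the labels used in this post.

```python
import subprocess

# Brightness percentages for the two modes described in this post.
MODES = {"working": 9, "photo-editing": 39}


def brightness_command(percent):
    """Build the dbus-send call that sets the screen brightness property."""
    if not 0 <= percent <= 100:
        raise ValueError("brightness must be between 0 and 100")
    return [
        "dbus-send", "--session", "--type=method_call",
        "--dest=org.gnome.SettingsDaemon",
        "/org/gnome/SettingsDaemon/Power",
        "org.freedesktop.DBus.Properties.Set",
        "string:org.gnome.SettingsDaemon.Power.Screen",
        "string:Brightness",
        f"variant:int32:{percent}",
    ]


def set_mode(mode, run=subprocess.run):
    """Switch to a named mode; 'run' can be replaced for dry runs or tests."""
    return run(brightness_command(MODES[mode]), check=True)
```

Calling `set_mode("photo-editing")` from a keyboard shortcut would then send the same call as the second command above.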
