diary at Telent Netowrks

In the Nix of time

Mon, 15 Jan 2018 18:35:04 +0000

[ I'm not sure I can keep up these puns in the blog post titles much longer. That may be welcome news for my readers, of course. ]

I was expecting this blog post to be along the lines of "there is no progress to report since last week but I am writing anyway just to maintain the weekly schedule", but happily, last night I saw the board boot with an ethernet driver and was even able to ping it.

<6>libphy: ag71xx_mdio: probed
<6>ag71xx-mdio.1: Found an AR7240/AR9330 built-in switch
<6>eth0: Atheros AG71xx at 0xba000000, irq 5, mode:GMII
<6>ag71xx ag71xx.0: connected to PHY at ag71xx-mdio.1:04 [uid=004dd041, driver=]
<6>eth1: Atheros AG71xx at 0xb9000000, irq 4, mode:MII

Once I realised I should be using eth1 not eth0, at least.

The things I have learnt this week were almost entirely not about Nix: instead I was looking at the kernel, and the OpenWRT (actually LEDE-which-will-soon-be-OpenWRT-again) build process. Which was to some extent what I was originally trying to avoid by basing this whole thing on Nix, but there we are.

What's the problem?

The Linux kernel 4.14.1 has no support for the wired Ethernet device built into the AR933x SoC.

(I was actually quite surprised to find this out)

Do you have any plausible but unworkable suggestions to fix it?

Porting the driver from OpenWRT should be pretty simple. Just copy some files across and patch the Makefile, right?

I infer from your use of the words "simple" and "just" that it turns out to be a bit more complicated?

Damn, you know me too well.

I am you

True dat.

So?

OpenWRT is based on the upstream kernel (score one over Android, at least) but diverges quite significantly, to the extent that the kernel stuff in the LEDE source repo contains about 250 extra source files you have to copy into your kernel source tree, and 2500 patch files that need to be applied on top. And a lot of the patches depend on previous patches in the series, and basically the upshot is that the chance of cherry-picking only the changes you want is kind of ... remote. Certainly not without first downloading and applying the whole series, by which time you have the whole series anyway.

There's another, slightly more long-term, problem with this suggestion, too: a tonne of those files are basically copy-paste jobs of each other, which makes me hope (admittedly against my own immediate self-interest) that upstream would refuse to adopt the resulting patch.

You're going to expound on this at tedious length, aren't you?

I'll try to keep it brief. Grown-up computers like PCs and SPARCs usually have standards by which an operating system may discover what hardware is attached/plugged in - PCI bus enumeration or Open Firmware or something like that. This is good because it means the kernel doesn't have to hardcode all this stuff. Embedded systems, on the other hand ...

... don't?

Often don't, no. Please stop finishing my sentences. So, historically, for every board or product that runs the Linux MIPS kernel, there is a chunk of code that registers all the devices and memory regions and all that stuff which the drivers will need, and this all gets a bit repetitive when there are a zillion of the buggers and they're all approximately the same but have slightly different base addresses for their USB ports, or they have two ethernets instead of 4, or the LEDs and the WPS buttons are hooked up to different GPIO pins.

Madness!

Understandable in context, because what router manufacturer really cares that much that the same Linux kernel image will run across not only their entire product range but also the product ranges of seventeen of their competitors? But still, for our purposes a PITA.

So what can be done?

The Device Tree, or "why write code when you can write data?". First mooted back in 2009 and gradually (tending sometimes to grudgingly) accepted over the following nine years, the device tree for some particular board is essentially a serialisation of the data structures that Open Firmware would provide the OS when running on that board, in the hypothetical event that the board had Open Firmware. The upstream ar71xx (a.k.a. ath79) code has rudimentary support for device tree, but no ethernet devices therein, and the old mach_* files have not yet been removed.

(Here's an example for the TL-MR3020, a device almost-but-not-quite-identical to the Yun, which is too long to paste but definitely short enough that you should have a look at it)

So that's the Right Answer: add the ag71xx ethernet driver to the tree. Forward port it from 4.9 to 4.14, abstract somehow over the eleventy-billion-branch switch statements it's littered with so it works on multiple SoCs, decide what to do about the driver for the SoC's network switch that it relies on, and ponder whether to delete some mach_*.c files that clearly shouldn't be needed before deciding not to make that many needless enemies among the commercial users of this code.

Contrast, however, with the Pragmatic Answer: for the moment at least, until the circular tuit drought ends, why don't we switch to the OpenWRT kernel? Which, as you can see from the printk output that started this entry, Already Just Works.

You said "just" again

Yeah. Sorry.

Finished?

Pretty much. Also this week I made the kernel image build process a teeny bit less hacky, and added some frivolous stuff like cat, ifconfig and mount to the root filesystem, but that was basically trivial. And I posted to nix-devel about it and several people were quite kind.

Next stop, some userland - including the thorny question of what shall we use for an init system - and maybe some forward porting to make it work on nixpkgs master.

Baud games

Sun, 07 Jan 2018 11:46:21 +0000

Epiphany (n): (1) January 6 observed as a church festival in commemoration of the coming of the Magi as the first manifestation of Christ to the Gentiles or in the Eastern Church in commemoration of the baptism of Christ; (2) a moment of sudden and great revelation or realization.

Milestone

This week in NixWRT was typified by lots of trying stuff that didn't work followed by an unexpected achievement: I have a shell running on the actual hardware!

When we left off last week, if you will recall, we had a kernel that booted most of the way to mounting the root filesystem and executing init but not quite, and for some odd reason it booted a little bit further if I lied to it about the console device. Since then:

This is simultaneously a victory and a complete PITA, because there's no way to change the baud rate in this feature-impoverished branch of u-boot, so every time I reboot I have to change speed back and forth to talk to the bootloader. It would be nice if we could get it to work at 250000 (perhaps the u-boot console code has some pointers), or find a way to make u-boot speak more slowly, and I will probably look at that at some point.

Other things to do

[ Postemporaneous edit: the next thrilling installment in this series is now up at https://ww.telent.net/2018/1/15/in_the_nix_of_time ]

gehen Sie bitte mit, hier ist Nix zu sehen

Tue, 02 Jan 2018 12:08:23 +0000

[ Meta: I don't actually speak German. I hope the pun works, but I have no particular reason to suppose it should do. ]

Happy New Year, if you observe the Gregorian Calendar. This week in NixWRT was typified by lots of beating head on brick wall followed by an unexpected achievement: I have a working rootfs in qemu!

Look, isn't it cool?

[nix-shell:~/src/nixwrt]$ qemu-system-mipsel -M malta -m 64 -nographic \
    -kernel linux-*/vmlinux \
    -append 'root=/dev/sr0 console=ttyS0 init=/bin/sh' \
    -blockdev driver=file,node-name=squashed,read-only=on,filename=tftproot/rootfs.image \
    -blockdev driver=raw,node-name=rootfs,file=squashed,read-only=on \
    -device ide-cd,drive=rootfs -nographic
Linux version 4.14.1 (dan@loaclhost) (gcc version 6.4.0 (GCC)) #2 SMP Tue Jan 2 14:58:10 UTC 2018
[...]
BusyBox v1.27.2 () built-in shell (ash)

# LD_TRACE_LOADED_OBJECTS=1 /nix/store/*-rsync*/bin/rsync --version
        linux-vdso.so.1 (0x77cc8000)
        libpopt.so.0 => /nix/store/79ffdcjvk5bpbm1vgrxii935vhjbdg5p-popt-1.16-mipsel-unknown-linux-gnu/lib/libpopt.so.0 (0x77c70000)
        libc.so.6 => /nix/store/7njknf9mhcj7jd3l0axlq8ql0x7396pk-glibc-2.26-75-mipsel-unknown-linux-gnu-mipsel-unknown-linux-gnu/lib/libc.so.6 (0x77ad4000)
        /nix/store/7njknf9mhcj7jd3l0axlq8ql0x7396pk-glibc-2.26-75-mipsel-unknown-linux-gnu-mipsel-unknown-linux-gnu/lib/ld.so.1 (0x77c98000)

Points of note here:

(-> head wall)

I spent a lot of time, with no actual result yet, on getting the Yun to tftp its kernel and rootfs and run them in-place without having to write anything to flash. Motivation here is: it's not my Yun, it belongs to my employer who will probably want it back next time we do a hackathon or something. So I don't want to brick the device accidentally, nor use all the flash erase cycles, and anyway it's probably slower than running from RAM.

This is my theory which almost works but for some reason not quite: we should be able to tftp the root fs into RAM then use the MTD "phram" driver to emulate an MTD device at that address, and the memmap option to hide that region of memory from the Linux system (so it doesn't overwrite it)

ar7240> setenv kernaddr 0x81000000
ar7240> setenv rootaddr 1178000
ar7240> setenv rootaddr_useg 0x$rootaddr
ar7240> setenv rootaddr_ks0 0x8$rootaddr
ar7240> setenv bootargs keep_bootcon console=ttyATH0,250000 panic=10 oops=panic init=/bin/sh phram.phram=rootfs,$rootaddr_ks0,9Mi root=/dev/mtdblock0 memmap=10M\$$rootaddr_useg
ar7240> setenv bootn "tftp $kernaddr /tftp/kernel.image ; tftp $rootaddr_ks0 /tftp/rootfs.image; bootm $kernaddr"
ar7240> run bootn

Here's where it gets weird. With those options, it runs most of the way through boot then hangs after printing NET: Registered protocol family 17 (that's AF_PACKET, the packet socket family, if you were wondering). If I misspell the console device name, though, it gets slightly further. wat?

NET: Registered protocol family 17
Warning: unable to open an initial console.
VFS: Mounted root (squashfs filesystem) readonly on device 31:0.
Freeing unused kernel memory: 208K
This architecture does not have kernel memory protection.
Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000000

So it's identified that there is a squashfs filesystem there, which is a positive sign, but it's not going to run init without a console.

Also falling into the "known unknowns" quadrant: you will note that we randomly set and unset the high bit on some of our addresses there: this is because the same physical RAM is mapped into more than one place in the MIPS address space and I sort of think I have a handle on how it works but not really.
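For what it's worth, the bit-twiddling can be written down explicitly. This is a minimal sketch, assuming the standard MIPS32 layout in which kseg0 (cached, starting at 0x80000000) and kseg1 (uncached, starting at 0xa0000000) are both unmapped windows onto the low 512MiB of physical memory. The function names are hypothetical and not part of u-boot or the kernel; the input format (bare hex digits, no 0x prefix) is chosen to match the rootaddr variable above.

```shell
#!/bin/sh
# Hypothetical helpers illustrating the MIPS32 kseg0/kseg1 aliases.
# Assumption: physical addresses are below 512MiB (0x20000000).

# Physical address (bare hex digits, as in $rootaddr) -> kseg0 alias.
phys_to_kseg0() {
    printf '0x%08x\n' $(( 0x$1 | 0x80000000 ))
}

# Physical address (bare hex digits) -> kseg1 (uncached) alias.
phys_to_kseg1() {
    printf '0x%08x\n' $(( 0x$1 | 0xa0000000 ))
}

# kseg0 or kseg1 address (with 0x prefix) -> physical address.
kseg_to_phys() {
    printf '0x%08x\n' $(( $1 & 0x1fffffff ))
}

phys_to_kseg0 1178000      # the address tftp and phram are given
kseg_to_phys 0x81178000    # the address memmap= is given
```

Under those assumptions, prefixing the hex digits with 0x8 (as rootaddr_ks0 does) is the same as OR-ing in 0x80000000, and the un-prefixed value handed to memmap is the physical address of the same bytes.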

[ Postemporaneous edit: the next thrilling installment in this series is now up at https://ww.telent.net/2018/1/7/baud_games ]

All MIPSy were the borogroves

Wed, 27 Dec 2017 00:14:32 +0000

My New Year's Resolution is to blog something every Tuesday (shut up at the back there, I haven't been to bed yet so it's still nominally Tuesday in my personal timezone) whenever I haven't posted in the preceding week.

Recently I had the idea of repurposing my previous wireless router (a TL-WR842ND) as the brain for a backup server in my study, by plugging a USB disk into it and installing rsync. In order to fulfill my yak shaving quota, I decided to do this using Nixpkgs/NixOS instead of just doing the sensible thing and installing the relevant OpenWRT package.

Story so far:

In the pursuit of getting a serial console on it, I have probably burnt out the UART by bad soldering and/or inadvertently connecting TX to the 5V rail.

But I also have an Arduino Yun lying around which has an Atheros AR9331 MIPS 32 bit SoC - more or less the same hardware as most consumer broadband routers - with an Arduino microcontroller stuck to it that can be persuaded into a role as a USB/serial converter - so I have a console and no soldering required. This felt like a waste of an Atmega, but clearly in a good cause so I pressed on.

Right now I'm at the stage where I can build a bootable kernel and an unusably large filesystem for it. Here are some things I have learned:

Currently I am working from the branch named in that pull request: ar works and strip too, but I still have huge image sizes, now because it has decided that glibc depends on gcc and the kernel headers - this seems to be a problem with cross-compilation generally and not with MIPS specifically, because ARM has the same issue.

[dan@loaclhost:~/src/nixwrt]$ nix-store -q --references /nix/store/pf047ij2z1bfzlkkyf0v7m4p273713d6-glibc-2.26-75-armv6l-unknown-linux-gnueabihf-armv6l-unknown-linux-gnueabihf/
/nix/store/3x1wd17r8fg3zhasljxdm6vabyn7qr5y-gcc-6.4.0-armv6l-unknown-linux-gnuea
/nix/store/xh15qw8k4za1va29ks7z3kjbjlcfb15v-linux-headers-4.4.10-armv6l-unknown-
/nix/store/pf047ij2z1bfzlkkyf0v7m4p273713d6-glibc-2.26-75-armv6l-unknown-linux-g

It is entirely possible, of course, that I will never get a glibc-based system into 8MiB even when it's not dragging in the kitchen sink, the plumber that installed it, and the staff and plant of the factory that made the plumber's van (er, figuratively) and I should switch to uclibc or musl, but in the meantime this is all educational.
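One rough way to see how far a closure is from the 8MiB target is to add up the on-disk size of every store path it pulls in. The sketch below assumes a POSIX shell with du available; closure_size_kib is a hypothetical helper name, and the store path in the usage comment is a placeholder to fill in. The total is uncompressed, so it is only an upper bound on what a squashfs image would need.

```shell
#!/bin/sh
# Hypothetical helper: read store paths on stdin (one per line) and
# print their combined disk usage in KiB.
closure_size_kib() {
    total=0
    while IFS= read -r path; do
        # Skip blank lines so empty input yields 0.
        [ -n "$path" ] || continue
        size=$(du -sk "$path" | cut -f1)
        total=$(( total + size ))
    done
    echo "$total"
}

# Usage sketch: nix-store -qR lists the full closure of a store path.
#   nix-store -qR /nix/store/<hash>-<name> | closure_size_kib
```

Running it before and after a change to the derivations gives a quick check on whether gcc and the kernel headers have really dropped out of the closure.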

I have some clearly still very work-in-progress code at https://github.com/obsidiansystems/nixpkgs/compare/02726a2...telent:nixwrt-cross-elegant for anyone who wants to see it.

In other news, I've also been addressing my apparent need to solder stuff by having Fun With Arduinos and Neopixels

[ Postemporaneous edit: the next thrilling installment in this series is now up at https://ww.telent.net/2018/1/2/gehen_sie_bitte_mit_hier_ist_nix_zu_sehen ]

NixOS (again) - declarative VMs with QEMU

Fri, 20 Oct 2017 08:43:49 +0000

I built a new PC to sit in the study at home. This isn't going to be a blog post about that, though: it all worked the first time and so there is nothing to rant about. The new box is smaller, quieter and faster than the old one, as it should be given that the old one is about 9 years old now.

Having got it running I wanted to put some VMs on it (hey, just like last time), but this time around I want to do it slightly less ad-hoc (hey, not just like last time) so I have been playing with creating them declaratively.

Goals (and non-goals)

Prior art

The Virtualization in NixOS page on the new Wiki is thoroughly worth reading and I am very much indebted to it for ideas and even some bits of code. The author has different requirements to me and therefore has different answers in places, but I borrowed a lot. In particular you should read that if you are wondering "doesn't {nixos, nixops} do this out of the box already?"

My approach

Note to the reader: there are many snippets of code in the rest of this post. They are all extracted from the actual system at the time I write this, and provided to help explain the approach, but they are probably not the best place to start if you just want something you can run. If you want something you can run, look instead at telent/nixos-configs on GitHub which at time of writing is basically the same thing, but more likely to be updated, refined, bugfixed etc than I am ever likely to revisit this blog post.

Describe the guests

Unlike the Wiki author, I am managing the host machine as a plain NixOS machine and not using NixOps here. So I have created a module /etc/nixos/virtual.nix and added it to my imports in /etc/nixos/configuration.nix:

  imports =
    [ # Include the results of the hardware scan.
      ./hardware-configuration.nix
      # [ ... ]
      ./virtual.nix
    ];

In that module, I define the VMs I want using an attribute set bound to a local variable. I know, I should do this properly with the module config system. Some day I will.

let guests = {
      alice = {
        memory = "1g";
        diskSize = "40g";
        vncDisplay="localhost:1";
        netDevice="tap0";
      };
      bob = {
        memory = "1g";
        diskSize = "20g";
        vncDisplay="localhost:2";
        netDevice="tap1";
      };
    };

Start the guest VM processes

We map over the guests variable to make a systemd service for each VM that checks it has a disk image and brings it up (or takes it down, as appropriate).

    systemd.services = lib.mapAttrs' (name: guest: lib.nameValuePair "qemu-guest-${name}" {
      wantedBy = [ "multi-user.target" ];
      script =
          ''
          disks=/var/lib/guests/disks/
          mkdir -p $disks
          hda=$disks/${name}.img
          if ! test -f $hda; then
            ${firstRunScript} $hda ${guest.diskSize}
          fi
          sock=/run/qemu-${name}.mon.sock
          ${pkgs.qemu_kvm}/bin/qemu-kvm -m ${guest.memory} \
            -display vnc=${guest.vncDisplay} \
            -monitor unix:$sock,server,nowait \
            -netdev tap,id=net0,ifname=${guest.netDevice},script=no,downscript=no \
            -device virtio-net-pci,netdev=net0 \
            -usbdevice tablet \
            -drive file=$hda,if=virtio,boot=on
          '';
      preStop =
        ''
          echo 'system_powerdown' | ${pkgs.socat}/bin/socat - UNIX-CONNECT:/run/qemu-${name}.mon.sock
          sleep 10
        '';
    }) guests;

Create the guest disk images

These systemd services expect the guest machine to have a working disk image, so we need some way to create those.

The recipe for this on the Wiki creates a partition image, resizes it appropriately, then uses pkgs.vmTools.runInLinuxVM to install NixOS on it. The way it does this is somewhat low-level and to my mind uncomfortably close to Dark Arts: it manually creates /nix/store and calculates package closures and makes directories and runs Grub and ...

I took a different approach which I feel is both cleaner and more hacky: I created a custom CD image which has a service on it that looks for a disk called vda and runs nixos-generate-config and nixos-install on it. When a new VM is needed, it boots from this virtual CD instead of from its own disk. Note that the auto-install service has no safeguards or checks - this is definitely not a CD image that you would burn onto an actual disk and leave around the office.

(I claim it's more clean because it uses the "standard" installation method, but it is definitely more hacky because it uses sed on the generated configuration.nix to enable ssh and configure grub, and we all know what happens when sed is invited to the party.)

The dangerous unattended install service is defined in nixos-auto-install-service.nix which I'm not going to copy and paste here but you can view on GitHub. In virtual.nix we write a derivation to create a NixOS config including it and build an ISO image

    iso = system: (import <nixpkgs/nixos/lib/eval-config.nix> {
      inherit system;
      modules = [
        <nixpkgs/nixos/modules/installer/cd-dvd/installation-cd-minimal.nix>
        ./nixos-auto-install-service.nix
      ];
      }).config.system.build.isoImage;

and then we need something to create the disk image and run a QEMU which boots the ISO

    firstRunScript = pkgs.writeScript "firstrun.sh" ''
      #!${pkgs.bash}/bin/bash
      hda=$1
      size=$2
      iso=$(echo /etc/nixos-cdrom.iso/nixos-*-linux.iso)
      PATH=/run/current-system/sw/bin:$PATH
      ${pkgs.qemu_kvm}/bin/qemu-img create -f qcow2 $hda.tmp $size
      mkdir -p /tmp/keys
      cp ${pubkey} /tmp/keys/ssh.pub
      ${pkgs.qemu_kvm}/bin/qemu-kvm -display vnc=127.0.0.1:99 -m 512 \
        -drive file=$hda.tmp,if=virtio \
        -drive file=fat:floppy:/tmp/keys,if=virtio,readonly \
        -drive file=$iso,media=cdrom,readonly \
        -boot order=d -serial stdio > $hda.console.log
      if grep INSTALL_SUCCESSFUL $hda.console.log ; then
        mv $hda.tmp $hda
      fi
    '';

(This is called from the systemd service defined previously, if you hadn't noticed and were wondering)

SSH keys

Eagle-eyed readers might notice the shenanigans with /tmp/keys and file=fat:floppy in that script. I didn't really want to bake my ssh public key into the ISO just because that's a vast amount of churn every time the key changes, so this is how we get an SSH key into the image. We're using a feature of QEMU that I did not previously know about - it can create a virtual FAT filesystem from the contents of a directory on the host machine.
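Pulled out of the install script, the host side of that trick is small. The following is a sketch only: stage_keys is a hypothetical helper name, and the -drive string it prints is the one used in the script above, with the directory made a parameter. As far as I know the floppy variant presents a floppy-sized image, so the directory should hold only small files.

```shell
#!/bin/sh
# Hypothetical helper: copy a public key into a staging directory and
# print the QEMU -drive argument that exposes that directory to the
# guest as a read-only virtual FAT disk.
stage_keys() {
    keydir=$1
    pubkey=$2
    mkdir -p "$keydir"
    cp "$pubkey" "$keydir/ssh.pub"
    printf 'file=fat:floppy:%s,if=virtio,readonly\n' "$keydir"
}

# Usage sketch (the key path is an assumption, other options elided):
#   qemu-kvm ... -drive "$(stage_keys /tmp/keys "$HOME/.ssh/id_rsa.pub")"
```

Keeping the key outside the ISO this way means a key change only touches the staging directory, not the image build.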

Networking

The guests are bridged onto the host LAN, because there is too much NAT in the world already and I do not wish to be the cause of more.

    networking.interfaces = lib.foldl (m: g: m // {${g} = {virtual=true; virtualType="tap";};}) {} (map (g: g.netDevice) (builtins.attrValues guests));
    networking.bridges.vbridge0.interfaces = [hostNic] ++ (map (g: g.netDevice) (builtins.attrValues guests));

A note of caution here: messing with bridges while connected via ssh is a bad idea, if your connection is through one of the interfaces you want to add to the bridge. As soon as you add eth0 (or wlp3s0 or enp0s31f6zz9pluralzalpha or whatever systemd thinks your network card should be called today) to the bridge it will lose its IP address and things will probably not Be Right until dhclient next refreshes. Learn from my mistakes: do this at the console or have some kind of backup connection.

In practice

So far, It Seems To Work. There are some points you may want to note: