diary at Telent Networks

NixOS (again) - declarative VMs with QEMU

Fri, 20 Oct 2017 08:43:49 +0000

I built a new PC to sit in the study at home. This isn't going to be a blog post about that, though: it all worked the first time and so there is nothing to rant about. The new box is smaller, quieter and faster than the old one, as it should be given that the old one is about 9 years old now.

Having got it running I wanted to put some VMs on it (hey, just like last time), but this time around I want to do it slightly less ad-hoc (hey, not just like last time) so I have been playing with creating them declaratively.

Goals (and non-goals)

Prior art

The Virtualization in NixOS page on the new Wiki is thoroughly worth reading and I am very much indebted to it for ideas and even some bits of code. The author has different requirements to mine and therefore arrives at different answers in places, but I borrowed a lot. In particular, you should read it if you are wondering "doesn't {NixOS, NixOps} do this out of the box already?"

My approach

Note to the reader: there are many snippets of code in the rest of this post. They are all extracted from the actual system at the time of writing and are provided to help explain the approach, but they are probably not the best place to start if you just want something you can run. For that, look instead at telent/nixos-configs on GitHub, which at the time of writing is basically the same thing, but far more likely to be updated, refined and bugfixed than this blog post is ever to be revisited.

Describe the guests

Unlike the Wiki author, I am managing the host machine as a plain NixOS machine rather than with NixOps. So I have created a module /etc/nixos/virtual.nix and added it to my imports in /etc/nixos/configuration.nix:

  imports =
    [ # Include the results of the hardware scan.
      ./hardware-configuration.nix
      # [ ... ]
      ./virtual.nix
    ];

In that module, I define the VMs I want using an attribute set bound to a local variable. I know, I should do this properly with the module config system. Some day I will.

let guests = {
      alice = {
        memory = "1g";
        diskSize = "40g";
        vncDisplay="localhost:1";
        netDevice="tap0";
      };
      bob = {
        memory = "1g";
        diskSize = "20g";
        vncDisplay="localhost:2";
        netDevice="tap1";
      };
    };
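
For the record, doing it "properly" would presumably look something like the untested sketch below (the option name virtual.guests is my own invention, not anything from the repo); the rest of the module would then read config.virtual.guests instead of the local variable.

    # hypothetical: the same data expressed as a typed module option
    options.virtual.guests = lib.mkOption {
      default = {};
      description = "declarative QEMU guest definitions";
      type = lib.types.attrsOf (lib.types.submodule {
        options = {
          memory     = lib.mkOption { type = lib.types.str; default = "1g"; };
          diskSize   = lib.mkOption { type = lib.types.str; };
          vncDisplay = lib.mkOption { type = lib.types.str; };
          netDevice  = lib.mkOption { type = lib.types.str; };
        };
      });
    };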

Start the guest VM processes

We map over the guests variable to make a systemd service for each VM; each service checks that its guest has a disk image and brings the VM up (or takes it down, as appropriate).

    systemd.services = lib.mapAttrs' (name: guest: lib.nameValuePair "qemu-guest-${name}" {
      wantedBy = [ "multi-user.target" ];
      script =
          ''
          disks=/var/lib/guests/disks/
          mkdir -p $disks
          hda=$disks/${name}.img
          if ! test -f $hda; then
            ${firstRunScript} $hda ${guest.diskSize}
          fi
          sock=/run/qemu-${name}.mon.sock
          ${pkgs.qemu_kvm}/bin/qemu-kvm -m ${guest.memory} \
            -display vnc=${guest.vncDisplay} \
            -monitor unix:$sock,server,nowait \
            -netdev tap,id=net0,ifname=${guest.netDevice},script=no,downscript=no \
            -device virtio-net-pci,netdev=net0 \
            -usbdevice tablet \
            -drive file=$hda,if=virtio,boot=on
          '';
      preStop =
        ''
          echo 'system_powerdown' | ${pkgs.socat}/bin/socat - UNIX-CONNECT:/run/qemu-${name}.mon.sock
          sleep 10
        '';
    }) guests;
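
Once that is in place each guest is just another systemd unit: something like systemctl status qemu-guest-alice tells you whether alice is running, and stopping the unit sends the guest an ACPI power-down request through its monitor socket (the preStop above) and gives it a few seconds to shut down cleanly before QEMU goes away.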

Create the guest disk images

These systemd services expect each guest machine to have a working disk image, so we need some way to create those images.

The recipe for this on the Wiki creates a partition image, resizes it appropriately, then uses pkgs.vmTools.runInLinuxVM to install NixOS on it. The way it does this is somewhat low-level and to my mind uncomfortably close to Dark Arts: it manually creates /nix/store and calculates package closures and makes directories and runs Grub and ...

I took a different approach which I feel is both cleaner and more hacky: I created a custom CD image which has a service on it that looks for a disk called vda and runs nixos-generate-config and nixos-install on it. When a new VM is needed, it boots from this virtual CD instead of from its own disk. Note that the auto-install service has no safeguards or checks - this is definitely not a CD image that you would burn onto an actual disk and leave around the office.

(I claim it's cleaner because it uses the "standard" installation method, but it is definitely more hacky because it uses sed on the generated configuration.nix to enable ssh and configure grub, and we all know what happens when sed is invited to the party.)
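
To give a flavour of what that service does, here is a heavily simplified and entirely hypothetical sketch; the real one (linked in the next paragraph) is what actually runs and differs in the details:

    # hypothetical sketch only; see nixos-auto-install-service.nix for the real thing
    { ... }:
    {
      systemd.services.nixos-auto-install = {
        description = "unattended NixOS install onto /dev/vda";
        wantedBy = [ "multi-user.target" ];
        path = [ "/run/current-system/sw" ];   # the install CD's own tools (parted, nixos-install, ...)
        serviceConfig.Type = "oneshot";
        script = ''
          set -e
          # partition and format the guest's virtio disk
          parted -s /dev/vda -- mklabel msdos mkpart primary ext4 1MiB 100%
          mkfs.ext4 -L nixos /dev/vda1
          mount /dev/vda1 /mnt
          # generate a config, doctor it with sed (grub device, sshd, ...), then install
          nixos-generate-config --root /mnt
          sed -i 's|# boot.loader.grub.device = .*|boot.loader.grub.device = "/dev/vda";|' \
            /mnt/etc/nixos/configuration.nix
          nixos-install --no-root-passwd
          # (the ssh key is copied in from a second virtio disk; see "SSH keys" below)
          # report success on the serial port so the host-side script can grep for it
          echo INSTALL_SUCCESSFUL > /dev/ttyS0
          poweroff
        '';
      };
    }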

The actual (and dangerous) unattended install service is defined in nixos-auto-install-service.nix, which I'm not going to copy and paste here but which you can view on GitHub. In virtual.nix we write a derivation to create a NixOS config including it and build an ISO image:

    iso = system: (import <nixpkgs/nixos/lib/eval-config.nix> {
      inherit system;
      modules = [
        <nixpkgs/nixos/modules/installer/cd-dvd/installation-cd-minimal.nix>
        ./nixos-auto-install-service.nix
      ];
    }).config.system.build.isoImage;
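
The first-run script below expects to find the resulting image under /etc/nixos-cdrom.iso. I'm not reproducing the exact plumbing here, but it presumably amounts to something like this (a hypothetical guess on my part: the isoImage derivation puts the .iso file in an iso/ subdirectory of its output):

    # hypothetical wiring; check the repo for how the image really lands on the host
    environment.etc."nixos-cdrom.iso".source = "${iso "x86_64-linux"}/iso";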

and then we need something to create the disk image and run a QEMU which boots from that ISO:

    firstRunScript = pkgs.writeScript "firstrun.sh" ''
      #!${pkgs.bash}/bin/bash
      hda=$1
      size=$2
      iso=$(echo /etc/nixos-cdrom.iso/nixos-*-linux.iso)
      PATH=/run/current-system/sw/bin:$PATH
      ${pkgs.qemu_kvm}/bin/qemu-img create -f qcow2 $hda.tmp $size
      mkdir -p /tmp/keys
      cp ${pubkey} /tmp/keys/ssh.pub
      ${pkgs.qemu_kvm}/bin/qemu-kvm -display vnc=127.0.0.1:99 -m 512 \
        -drive file=$hda.tmp,if=virtio \
        -drive file=fat:floppy:/tmp/keys,if=virtio,readonly \
        -drive file=$iso,media=cdrom,readonly \
        -boot order=d -serial stdio > $hda.console.log
      if grep INSTALL_SUCCESSFUL $hda.console.log ; then
        mv $hda.tmp $hda
      fi
    '';

(This is called from the systemd service defined previously, if you hadn't noticed and were wondering.)

SSH keys

Eagle-eyed readers might notice the shenanigans with /tmp/keys and file=fat:floppy in that script. I didn't really want to bake my ssh public key into the ISO, because that would mean a vast amount of churn every time the key changes, so this is how we get an SSH key into the image instead. We're using a feature of QEMU that I did not previously know about - it can create a virtual FAT filesystem from the contents of a directory on the host machine.
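
On the guest side, the auto-install service can then mount that drive and drop the key into the freshly-installed system. Something like this fragment would do it, assuming the FAT drive shows up as the second virtio disk, /dev/vdb (a sketch, not the code from the repo):

    # hypothetical fragment of the auto-install service's install script
    ''
      mkdir -p /media/keys
      mount -t vfat -o ro /dev/vdb /media/keys
      install -d -m 700 /mnt/root/.ssh
      install -m 600 /media/keys/ssh.pub /mnt/root/.ssh/authorized_keys
      umount /media/keys
    ''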

Networking

The guests are bridged onto the host LAN, because there is too much NAT in the world already and I do not wish to be the cause of more.

    networking.interfaces =
      lib.foldl (m: g: m // { ${g} = { virtual = true; virtualType = "tap"; }; })
        {}
        (map (g: g.netDevice) (builtins.attrValues guests));
    networking.bridges.vbridge0.interfaces =
      [ hostNic ] ++ (map (g: g.netDevice) (builtins.attrValues guests));

A note of caution here: messing with bridges while connected via ssh is a bad idea, if your connection is through one of the interfaces you want to add to the bridge. As soon as you add eth0 (or wlp3s0 or enp0s31f6zz9pluralzalpha or whatever systemd thinks your network card should be called today) to the bridge it will lose its IP address and things will probably not Be Right until dhclient next refreshes. Learn from my mistakes: do this at the console or have some kind of backup connection.
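
One way to make that less painful, assuming the host gets its address over DHCP, is to have the address live on the bridge rather than on the physical NIC, so the machine comes back as soon as the bridge has DHCPed (a sketch, not taken from the repo):

    networking.useDHCP = false;                      # don't run DHCP on the enslaved physical NIC
    networking.interfaces.vbridge0.useDHCP = true;   # the bridge carries the host's address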

In practice

So far, It Seems To Work. There are some points you may want to note: