diary at Telent Netowrks

Shell out tour#

Wed, 08 Aug 2018 12:51:52 +0000

Nothing to show this week. I have more or less proved to my own satisfaction that I can reboot into a new image using kexec and a small C program and some shell scripts. This came at considerable personal mental cost, but that's what happens when trying to do text processing in a Bourne shell (not bash) script without falling back on awk or sed (not installed). Associative arrays would have been nice. Actually, just arrays in general would have been a help.

The C program is called writemem and is approximately the moral equivalent of cat | dd seek=N of=/dev/mem bs=1 except that it writes in blocks bigger than 1 byte. Just the kind of thing your security auditor wants to find left lying around on random systems, yeah. I can see a need for some proper thought on security posture in the near future: although no-web-interface and ssh-only-with-a-pubkey-embedded-at-build-time probably makes it less of a target than any consumer D-Linksysgear box in its default configuration, there's probably still a lot more to do on that front. The attack we want to protect against is (1) being able to write to random locations in physical memory; (2) being able to reboot into random kernels using kexec; (3) being able to flash anything we like; (4) all of the above. Probably (4).
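For illustration only, here is a minimal sketch in C of the core of a writemem-like tool. The function name copy_to_offset, the 4096-byte block size and the lseek-then-write approach are all assumptions of this sketch, not the actual writemem source, which may be structured quite differently.

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not the real writemem source): copy everything that
 * can be read from in_fd into out_fd, starting `offset` bytes into the
 * output, in 4096-byte blocks rather than one byte at a time.
 * Returns the number of bytes copied, or -1 on error. */
long copy_to_offset(int in_fd, int out_fd, off_t offset)
{
    char block[4096];
    long total = 0;

    /* The equivalent of dd's seek=N with bs=1: position the output first. */
    if (lseek(out_fd, offset, SEEK_SET) == (off_t)-1)
        return -1;

    for (;;) {
        ssize_t got = read(in_fd, block, sizeof block);
        if (got == 0)
            return total;               /* end of input */
        if (got < 0) {
            if (errno == EINTR)
                continue;               /* interrupted, try again */
            return -1;
        }

        /* write() may accept fewer bytes than asked for, so loop. */
        ssize_t done = 0;
        while (done < got) {
            ssize_t put = write(out_fd, block + done, (size_t)(got - done));
            if (put < 0) {
                if (errno == EINTR)
                    continue;
                return -1;
            }
            if (put == 0)
                return -1;              /* no progress: give up */
            done += put;
        }
        total += got;
    }
}
```

A caller would pass 0 (stdin) as in_fd and a descriptor opened for writing on the target as out_fd. Pointing that at /dev/mem is exactly the part that wants the security thinking described above.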

There will be one user-visible change when this stuff lands: whereas previously we produced separate files for kernel and rootfs when doing a "development" build, now we make a single agglomerated firmware image and rely on the kernel mtdsplit code to find the root filesystem. This is because step 1 of the headless upgrade procedure is to reboot into the current kernel with an additional memmap parameter, so in the case that the current kernel is running from RAM we need the original uImage to still be accessible and not to have been overwritten since boot. It also makes the build a bit more consistent between dev and production, which is a nice side effect.

First things first, though: need to get it into a state where I can actually commit something. Last night I dreamt I was in a bacon-eating competition where the goal was to consume as much as possible during a MongoDB cluster election before a new primary was chosen, but I woke up before the contest finished. I mention this just to give you an idea of where my brain is right now, but it is probably not a very good idea.

Hydra gen molecule#

Mon, 30 Jul 2018 23:48:15 +0000

Two things this week:

1. I got hydra running on my build machine, so I can start doing regression tests now that things work well enough that I ought to worry about breaking them. (Note cavalier insertion of "ought to" in the preceding sentence)

Provided you are running NixOS, this is easier than you'd think from looking at the Hydra manual, because there is a NixOS module for it. First, add this or something like it to your configuration.nix and rebuild:

  services.hydra = {
    enable = true;
    hydraURL = "http://${hostname}:3000/";
    notificationSender = "dan@telent.net";
    buildMachinesFiles = [];
  };

Second, reboot, or log out and log in, or ensure in some other way that your shell has sourced the HYDRA_* variables which the previous step added to /etc/profile. Otherwise step 3 will fail to do what it should do and the error messages will be entirely unhelpful.

[dan@loaclhost:~]$ grep HYDRA /etc/profile 
export HYDRA_CONFIG="/var/lib/hydra/hydra.conf"
export HYDRA_DATA="/var/lib/hydra"
export HYDRA_DBI="dbi:Pg:dbname=hydra;user=hydra;"

Third, refer to the instructions in the Hydra manual starting where it says to run hydra-init then hydra-create-user. The previous steps were already done by the module. Also, there are systemd services to run the server, the evaluator and the queue runner, so ignore anything that says you should start them by hand.

Note the empty array for buildMachinesFiles - this was important on my machine and probably is important on yours too. If you don't have it, when eventually you get all your projects and jobsets apparently working properly and evaluating without error, you will find that the jobs sit in the queue and never get run, because something something bad defaults no queue runner machines something something.

Things I read wherein I found the solutions to my problems:

My Hydra instance is private and destined to remain so, at least for now.

2. Some refactoring of the kernel derivation into three parts: unpacking the tree and applying LEDE/OpenWRT patches; building vmlinux; and applying the DTB and making a uImage. I think this is an improvement: it will certainly make a few things (like running qemu, or changing the command line) more convenient, but I'm not sure I have it exactly right yet.

My current TODO list: as you will see, everything that has happened recently has been procrastination on the top goal.

Argumentum ad arborem fabrica#

Wed, 25 Jul 2018 22:54:16 +0000

What I'd like you to take away from this post title is that I speak about as much Latin as I do German.

What I'd like you to take away from the post body is (i) that I have a solution for the problem it describes, and (ii) that it required a tonne more of reading code and adding debugging statements and experimenting than I think it reasonably ought to have done, so look upon my words ye mighty and despair. The picture is a Google search result for "flattened tree", if you were wondering.

So, as I said previously we have (now, "had") a problem with kexec and specifying the command line arguments to the kernel: on the one hand we want to ignore any arguments that the bootloader provides, because generally they're probably wrong, but on the other we want to pay attention to the command line when booted by kexec, because the appropriate parameters for booting from flash are not also appropriate when booting from RAM. I'm going to skip over the voyage of discovery here because it's almost as tedious to relate as it was to, er, discover. So here are the highlights:

Kexec on MIPS (for ELF) provides two ways to supply the kernel command line.

The first option is that you add a segment which starts with the magic string "kexec " to the list of segments that you call kexec_load with, and then the pre-reboot kernel kexec code (machine_kexec_init_argv) iterates through the segments, finds the one with the right magic prefix, and parses it into kexec_argv[]. Then after the reboot, code in relocate_kernel.S loads the argument vector into register a1 before it calls into the new kernel. head.S in the new kernel then copies the pointer into fw_arg1, and then some board-specific code is responsible for what happens next. For the ralink case, this is prom_init_cmdline in ralink/prom.c which copies the argument vector back into a single string arcs_cmdline. After that, the next point of interest is in kernel/setup.c which tests a complex combination of kernel config options to decide which of arcs_cmdline, boot_command_line and builtin_cmdline (gotta love that consistent use of abbreviations) are used and in what combination to form the command line that the kernel will actually see.
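To make the first half of that concrete, here is a rough sketch in C of splitting a "kexec "-prefixed buffer into an argument vector. This is not the kernel's machine_kexec_init_argv: the name split_kexec_args, its signature and the split-in-place approach are assumptions made for illustration.

```c
#include <string.h>

/* Hypothetical sketch, not kernel code: if `buf` starts with the magic
 * prefix "kexec ", split it in place on spaces and store pointers to the
 * words in args[]. The word "kexec" itself becomes args[0].
 * Returns the number of words found, or -1 if the prefix is missing. */
int split_kexec_args(char *buf, char *args[], int max_args)
{
    static const char magic[] = "kexec ";
    char *p = buf;
    int count = 0;

    if (strncmp(buf, magic, sizeof magic - 1) != 0)
        return -1;

    while (count < max_args) {
        while (*p == ' ')               /* skip separators */
            p++;
        if (*p == '\0')
            break;
        args[count++] = p;              /* start of a word */
        while (*p != '\0' && *p != ' ')
            p++;
        if (*p == ' ')
            *p++ = '\0';                /* terminate the word */
    }
    return count;
}
```

Given "kexec console=ttyS0,115200 root=/dev/ram0" this yields three words. The board code on the far side of the reboot then has to do the opposite and glue them back into one string.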

There is a comment in the kexec source to say that this only works on an Octeon. Now that I trace the entire execution path I no longer understand why it only works on Octeon, but I will note that it didn't work for me. And, incidentally, wouldn't solve the problem if it did as we can't identify whether the command line came from kexec or u-boot. Anyway, taking inspiration from the said comment that this is "legacy", I decided to go with the second way.

The second way is to pass a DTB (a compiled device tree) from kexec, and embed a command line in there. There's a branch of the tree called chosen and within that is a leaf called bootargs, and that's where you find the command line. As a string, not an array of strings, please note.

In stock Linux there are two defined ways to tell your kernel where its DTB is (in addition to anything your bootloader might do, if your bootloader is an accomplice rather than an adversary). The first option is to include it in the kernel as a special ELF section, or the second is to append it (using cat or similar) to a raw kernel image. It should be noted that the first approach only works if your kernel image is ELF (ours isn't) and the second - aside from being somewhat brittle if you ever boot a kernel where you forgot to concatenate the DTB - only works if your "raw image" is a zImage (ours isn't). So it's probably not at all surprising that OpenWRT have added a third way: in kernel/head.S they've added 16kB of zeroes preceded by the magic string "OWRTDTB:", then provided a utility called patch-dtb to run at kernel build time, that looks through the kernel image for this string and patches a provided DTB into place. This location is labelled __image_dtb, and for ralink boards there's some code in plat_mem_setup to call __dt_setup_arch on it.
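The patching step is simple enough to sketch in C. This is not the real patch-dtb source: the function name, the byte-by-byte search and the choice to leave the magic string in place and write the DTB into the 16kB immediately after it are assumptions based on the description above.

```c
#include <stddef.h>
#include <string.h>

#define DTB_MAGIC      "OWRTDTB:"
#define DTB_MAGIC_LEN  8
#define DTB_SLOT_SIZE  (16 * 1024)      /* the reserved 16kB of zeroes */

/* Hypothetical sketch of a patch-dtb-like step: find the magic string in a
 * kernel image held in memory and copy the DTB into the reserved slot that
 * follows it. Returns the offset the DTB was written at, or -1 if the DTB
 * is too big or no complete slot was found. */
long patch_dtb(unsigned char *image, size_t image_len,
               const unsigned char *dtb, size_t dtb_len)
{
    if (dtb_len > DTB_SLOT_SIZE)
        return -1;

    for (size_t i = 0; i + DTB_MAGIC_LEN + DTB_SLOT_SIZE <= image_len; i++) {
        if (memcmp(image + i, DTB_MAGIC, DTB_MAGIC_LEN) != 0)
            continue;

        unsigned char *slot = image + i + DTB_MAGIC_LEN;
        memset(slot, 0, DTB_SLOT_SIZE); /* clear any previous contents */
        memcpy(slot, dtb, dtb_len);
        return (long)(i + DTB_MAGIC_LEN);
    }
    return -1;
}
```

The size check is the important bit: a DTB bigger than the reserved slot has nowhere to go, so the only sane response is to refuse.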

(You will observe, if you're following all this, that this code is unconditional, so the third option is not so much an option as an override)

__dt_setup_arch calls early_init_dt_scan which calls early_init_dt_scan_nodes which calls early_init_dt_scan_chosen to populate boot_command_line.

After that, we're back to kernel/setup.c and the same complex combination of kernel config options we already saw, to decide which of arcs_cmdline, boot_command_line and builtin_cmdline are used - except that this time the answer we want is boot_command_line not arcs_cmdline.

So: how do we get kexec to inject a different DTB (which will in practice be a very similar DTB except for the /chosen/bootargs node) into this sequence? Three things.

First, the userland side. This turns out to be pretty simple if you have the ELF code to crib from - we read or create the DTB in RAM, and then add it as a segment to the segment list that kexec_load is called with.

Second, the pre-reboot kernel side. There is existing support in the MIPS "generic" kernel for finding this segment by stepping through the list looking for an fdt header. We're not running the generic kernel - we're running the ralink kernel - so we moved that code into mips/kernel/machine_kexec.c and made it conditional on CONFIG_USE_OF. The effect of this code, if it finds a DTB, is to put its address into kexec_args[1] and set kexec_args[0] to -2.

Third, after reboot the kexec_args[1] ends up in register a1 and then gets copied to fw_passed_dtb. All that remains after that is to change the code that hardcodes using image_dtb into code that defaults to image_dtb if we didn't get a DTB some other way - voici - and we're basically good to go. One yak successfully popped off the shaving stack.
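The second and third steps can be sketched in plain C outside the kernel. Everything named here (struct segment, find_fdt_segment, choose_dtb) is a hypothetical stand-in rather than the actual kernel code, and the check looks only at the four-byte magic number 0xd00dfeed that every flattened device tree starts with, which is an assumption of this sketch and much less than a full header check.

```c
#include <stddef.h>
#include <stdint.h>

#define FDT_MAGIC 0xd00dfeedu   /* first four bytes of a DTB, big-endian */

/* Hypothetical stand-in for the kernel's segment descriptor. */
struct segment {
    const unsigned char *buf;
    size_t len;
};

/* Assemble a big-endian 32-bit value regardless of host byte order. */
uint32_t read_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Second step: walk the segment list and return the index of the first
 * segment that starts with the FDT magic, or -1 if there is none. */
int find_fdt_segment(const struct segment *segs, int nsegs)
{
    for (int i = 0; i < nsegs; i++) {
        if (segs[i].len >= 4 && read_be32(segs[i].buf) == FDT_MAGIC)
            return i;
    }
    return -1;
}

/* Third step: prefer a DTB that was handed to us, and fall back to the
 * one patched into the image only when nothing was passed. */
const void *choose_dtb(const void *passed_dtb, const void *image_dtb)
{
    return passed_dtb != NULL ? passed_dtb : image_dtb;
}
```

In the real thing, the address of the matching segment is what goes into kexec_args[1].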

Power play#

Fri, 20 Jul 2018 12:54:10 +0000

Last week in NixWRT we made kexec with uImages work despite unexpected complexity; this week we realised it's not even going to be as simple as we thought it would be after last week's work. Because u-boot is often delivered in lobotomised form on end-user devices we are using the kernel CONFIG_CMDLINE_OVERRIDE options to specify the command line and ignore whatever weird defaults the bootloader wishes to provide, but this then bites us when we want to boot the same kernel with kexec and augment the overridden command line. Argh.

But I haven't really jumped into doing anything about that yet, because I got sufficiently annoyed with having to walk up and down stairs to reset the device every time I crashed it (testing kexec) that I decided to take some time out to add remote power switching to it. Which is what you see to your right (if you are reading this in HTML with CSS enabled: if you are an RSS subscriber or using Reader mode or ... I dunno, it's probably somewhere around here). This is an Arduino Yun running a sketch that turns pin 8 on or off when it gets a 1 (or y) or a 0 (or n) on its serial port, attached via a voltage divider to the base of a 2N2222 transistor, whose collector is attached to the coil of a small relay, whose switch is interposed in the path of the 5V wire of a microUSB cable. Result: I can turn my GL-MT300N off and on by running something like (echo n && sleep 1 && echo y) > /dev/ttyACM0 from the computer that the Arduino is plugged into.

The Arduino Yun, including as it does an entire embedded MIPS system, is definitely overkill to drive a single GPIO output, but it was lying on my desk and not currently doing much: it's not as though I bought it for this use.
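The decision logic in a sketch like that is tiny. Here it is as a plain C function rather than as Arduino code; the name relay_command and the return convention are made up for illustration, and this is not the sketch actually running on the Yun.

```c
/* Hypothetical sketch of the command handling: map one byte received on
 * the serial port to a pin state.
 * Returns 1 for "switch on", 0 for "switch off", and -1 for any byte
 * that should be ignored. */
int relay_command(char c)
{
    switch (c) {
    case '1':
    case 'y':
        return 1;
    case '0':
    case 'n':
        return 0;
    default:
        return -1;      /* e.g. the newline that echo appends */
    }
}
```

On the Arduino, the loop would read a byte from the serial port, pass it through something like this, and call digitalWrite on pin 8 whenever the answer is not -1.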

Images of U#

Tue, 17 Jul 2018 16:27:37 +0000

Last week I outlined a plan that uses kexec to reboot into a new image without access to U-Boot or having to flash it. Because I don't think an upgrade path that required popping the top off every device and attaching a serial console cable is much of a path.

Getting kexec to work turned out to involve a little more work than I think I was hoping for, mostly because I want to be able to boot the same uImage from kexec as will be flashed and booted from u-boot, and the kexec userland utility on MIPS doesn't support uImage format.

Conclusion: someone somewhere needs to get a grip, though I'm open to the possibility that it's me.

On MIPS, a uImage file contains a raw binary file - this is not the same thing as an Image - which has optionally been compressed by gzip or by lzma (though, note, not by xz in its lzma compatibility mode, which generates streaming files that u-boot can't load). Which is basically another way of saying "none of the above, and there isn't a lot of code in kexec you can reuse either".

Getting the payload out of the uImage is fairly painless: we just had to add an offset. But then we have to decompress it. The existing lzma decompression code in kexec works only on files, so we can't reuse it here because the data is in memory already. So we have to write a whole new bunch of code to decompress LZMA in RAM.
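As a sketch of the "add an offset" part: a legacy uImage starts with a 64-byte header of big-endian fields, beginning with the magic number 0x27051956, and the payload follows directly after it. The C below is not the patch we carry in kexec-tools; the names uimage_info and uimage_parse are hypothetical, and it deliberately skips the header and data CRC checks that a careful loader would do.

```c
#include <stddef.h>
#include <stdint.h>

#define UIMAGE_MAGIC        0x27051956u
#define UIMAGE_HEADER_SIZE  64

/* Values of the one-byte compression field in the header. */
enum {
    UIMAGE_COMP_NONE = 0,
    UIMAGE_COMP_GZIP = 1,
    UIMAGE_COMP_BZIP2 = 2,
    UIMAGE_COMP_LZMA = 3
};

/* Hypothetical result type for this sketch. */
struct uimage_info {
    const unsigned char *payload;   /* points just past the header */
    uint32_t payload_size;          /* size of the (compressed) payload */
    uint32_t load_addr;
    uint32_t entry_point;
    uint8_t comp;                   /* one of the UIMAGE_COMP_* values */
};

/* Assemble a big-endian 32-bit value regardless of host byte order. */
uint32_t read_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Parse the header of a uImage already in memory. Returns 0 on success,
 * -1 if the buffer is too short, the magic is wrong, or the size field
 * claims more payload than the buffer holds. CRCs are NOT verified. */
int uimage_parse(const unsigned char *buf, size_t len, struct uimage_info *out)
{
    if (len < UIMAGE_HEADER_SIZE || read_be32(buf) != UIMAGE_MAGIC)
        return -1;

    uint32_t size = read_be32(buf + 12);
    if (size > len - UIMAGE_HEADER_SIZE)
        return -1;

    out->payload = buf + UIMAGE_HEADER_SIZE;
    out->payload_size = size;
    out->load_addr = read_be32(buf + 16);
    out->entry_point = read_be32(buf + 20);
    out->comp = buf[31];
    return 0;
}
```

Whatever comes back in comp then decides which in-memory decompressor, if any, the payload needs to go through before it can be loaded.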

Some day I will submit this upstream, but I think it could stand some considerable cleanup first. For the moment I've added it as a patch to our kexectools derivation.

Next week: actually implement the kexec reboot/upgrade dance. Then maybe move my archive disk onto the new box, so we can start work on the broadband router.