diary at Telent Netowrks

Musl memory

Wed, 28 Feb 2018 00:36:07 +0000

A short post this week, but this is because I need to sleep, not because there is nothing to write about.

First up, NixWRT has moved. It is no longer part of a "lightly forked nixpkgs" repo; it now has its own repo containing only NixWRT stuff at https://github.com/telent/nixwrt . Instead of embedding the Nix package collection, it now requires that you provide it with one, e.g. by using the -I flag to nix-build:

nix-build -I nixpkgs=../nixpkgs-for-nixwrt/  -A tftproot backuphost.nix

Presently there is still a mildly forked Nix package collection involved, but it is now available separately, and I have started the process of feeding the changes back into upstream so I hope to be able to eliminate that dependency in time.

Second, it builds with musl - which is great news, as the image for `backuphost` is too big to fit in 8MB flash when using glibc. The changes required to switch to musl are - apart from a small bug in the nixpkgs libiconv derivation - ludicrously trivial.

Third, I was not entirely correct last week when I said that upgrading to nixpkgs master caused nixwrt to break "almost not at all", because after I actually split the repos up I found a couple more patches needed than just the two mentioned. But nothing too serious.

Here's what it looks like:

[dan@loaclhost:~/src/nixwrt]$ ls -l yun/
total 9608
-r--r--r-- 1 root root 1565199 Jan  1  1970 kernel.image
-r--r--r-- 1 root root 2568192 Jan  1  1970 rootfs.image
-r-xr-xr-x 1 root root 5698784 Jan  1  1970 vmlinux

(vmlinux is not actually required on the target, it's a leftover)

Next up will be more patch upstreaming, and making it generate an image I can actually flash onto a TL-WR842. It is claimed that the emergency debricking TFTP client only works when fed with actual TP-Link images and not with OpenWRT, which is going to be a bit of a drag if true.

Grandmaster, cut faster

Tue, 06 Mar 2018 22:28:59 +0000

No Nix content at all this week, as all I've done is flash (please refer back to blog post title) my TL-WR842ND back to the factory firmware in preparation for figuring out how to get NixWRT onto it.

There's some discussion of how to do this on the OpenWRT wiki - attach router to wired network, configure a tftp server to answer on 192.168.1.66 and respond to requests for a file called wr842ndv1_tp_recovery.bin which was previously downloaded from the TP-Link site, then turn the router on while holding RESET and wait for stuff to happen.

As always, however, there is a wrinkle. The firmware I downloaded was a ZIP which contained a file called wr842ndv1_en_3_12_25_up_boot(130322).bin, and according to most sources (most sources parrot the OpenWRT wiki)

in case the file name of this firmware file does contain the word “boot” in it, you need to cut off parts of the image file before flashing it:

specifically, remove the first 131584 bytes. Why that number? It doesn't say.

This is what binwalk is for

[dan@carobn:~]$ nix-shell -p python27Packages.binwalk --run "binwalk /tmp/wr842.bin"

DECIMAL       HEXADECIMAL     DESCRIPTION
--------------------------------------------------------------------------------
0             0x0             TP-Link firmware header, firmware version: 1.-3012.3, image version: "", product ID: 0x0, product version: 138543105, kernel load address: 0x0, kernel entry point: 0x80002000, kernel offset: 8258048, kernel length: 512, rootfs offset: 872767, rootfs length: 1048576, bootloader offset: 7077888, bootloader length: 0
110592        0x1B000         U-Boot version string, "U-Boot 1.1.4 (Mar 22 2013 - 09:09:03)"
110768        0x1B0B0         CRC32 polynomial table, big endian
131584        0x20200         TP-Link firmware header, firmware version: 0.0.3, image version: "", product ID: 0x0, product version: 138543105, kernel load address: 0x0, kernel entry point: 0x80002000, kernel offset: 8126464, kernel length: 512, rootfs offset: 872767, rootfs length: 1048576, bootloader offset: 7077888, bootloader length: 0
132096        0x20400         gzip compressed data, has original file name: "vmlinux.bin", from Unix, last modified: 2013-03-22 01:11:22
1180160       0x120200        Squashfs filesystem, big endian, lzma signature, version 3.1, size: 4675579 bytes, 562 inodes, blocksize: 65536 bytes, created: 2013-03-22 01:24:41

So there you are: the emergency tftp restore expects an image with a TP-Link firmware header followed by a kernel followed by a filesystem - which roughly corresponds with the description of mtd5 in the openwrt flash layout - but the image on the TP-Link site prefaces that with about 128k of something that might be U-Boot, which roughly corresponds with the layout of the entire flash chip.
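A minimal sketch of the trimming step in Python, assuming the recovery image really does begin at the second TP-Link firmware header that binwalk reports (offset 131584, which is 0x20200). The function names and file paths are made up for illustration; nothing here comes from TP-Link or the OpenWRT wiki beyond that one number.

```python
# Offset of the second TP-Link firmware header in the binwalk output above.
BOOT_PREFIX_LEN = 131584  # == 0x20200


def strip_boot_prefix(image: bytes, prefix_len: int = BOOT_PREFIX_LEN) -> bytes:
    """Return the image with the leading bootloader region cut off."""
    if len(image) <= prefix_len:
        raise ValueError("image is not longer than the prefix to remove")
    return image[prefix_len:]


def strip_boot_prefix_file(src_path: str, dst_path: str) -> int:
    """Write a trimmed copy of src_path to dst_path; return the bytes written."""
    with open(src_path, "rb") as src:
        trimmed = strip_boot_prefix(src.read())
    with open(dst_path, "wb") as dst:
        return dst.write(trimmed)


# Hypothetical usage (paths are placeholders):
# strip_boot_prefix_file("/tmp/wr842.bin", "/tmp/wr842ndv1_tp_recovery.bin")
```

Running binwalk on the trimmed file should then show the TP-Link header at offset 0, which is a cheap check before handing the file to the TFTP server.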

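Going forward this is relevant insofar as it means we really have two problems, not just one: (1) getting an image into the format, and at the flash offset, that the device expects; and (2) building a kernel and filesystem that actually work on the hardware.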

Currently thinking: we can tackle problem 2 first. Let's put OpenWRT on the machine (then at least I have ssh available) and then build a kernel/fs I can start with kexec and iterate on that until I know it works on the hardware. Once we have the right code then we can start figuring out how to put it at the right offset.

Booted out

Wed, 14 Mar 2018 14:12:58 +0000

This is another week of not much done, but for the record

kexec

You can't use kexec to boot into a new kernel unless the kernel you're booting from has support for it. So that cunning plan is out.

Das U-Boot

Das U-Boot is billed as "the Universal Boot Loader", but sometimes I wonder if in practice the U stands for "unique per board" or "unco-ordinated" or even "uninstallable" - simply because the actual version of u-boot that comes installed on your cheap consumer router or IoT device board is a forked and undocumented mess based on an upstream release that's probably about ten years old, and if you want to replace it with mainline U-Boot you have to either (1) be lucky enough to have your new build work perfectly first time, or (2) have access to JTAG or a serial programmer in case it doesn't.

u-boot_mod

... u-boot_mod looks really rather cool if you have a device it supports - in addition to the basic u-boot it has a web server and a network console

Unfortunately, as it doesn't support my device (it supports some varieties of TL-WR841 and a later revision of WR842 than mine), I'm disinclined to try building it, given that if it doesn't work - and it's sensitive to things like gcc version - there is again no way to resurrect the device without special hardware.

Excuses, excuses. What's the answer?

New hardware

I ordered this yesterday, so when Amazon eventually deign to deliver it, development will/may resume.

Lede by example

Tue, 20 Mar 2018 21:51:52 +0000

My GL-MT300A arrived just as I was about to go on holiday. This is how far I've got -

Serial console

These things are, if not actually made for DIY purposes, at least very tolerant to such uses. Take it out of its case and you find three standard 0.1" header pins on the PCB labelled "TX", "RX", "GND" - connect each of them to something that speaks TTL serial (I used a Raspberry Pi) and set the baud rate to 115200. Worked first time.

"The U is for Uninitialized"-Boot

I commented previously about the differences one may encounter between two devices both of which run the allegedly "Universal" U-Boot boot loader. This time I couldn't work out why my tftp downloads were loading into memory at offset 0 instead of, say, 0x811f8000. Until I realised that (i) it no longer sufficed to say

setenv rootaddr 11f8000
setenv rootaddr_useg 0x$rootaddr
setenv rootaddr_ks0 0x8$rootaddr

and I must now surround environment variable references with curly braces:

setenv rootaddr 11f8000
setenv rootaddr_useg 0x${rootaddr}
setenv rootaddr_ks0 0x8${rootaddr}

and (ii) on this device, double quotes around the value of a setenv are no longer special, so

  setenv bootn "foo;bar"
will set the value of bootn to "foo (including the leading double quote) and then attempt to run bar" (including the trailing double quote) as a command. Which typically doesn't work all that well.

Hello darkness my old friend

Having made the relevant changes I was able to get the following output:

## Booting image at 81000000 ...
   Image Name:   Linux-4.9.76
   Image Type:   MIPS Linux Kernel Image (lzma compressed)
   Data Size:    1705466 Bytes = 1.6 MB
   Load Address: 80001000
   Entry Point:  803fa9c0
   Verifying Checksum ... OK
   Uncompressing Kernel Image ... OK
No initrd
## Transferring control to Linux (at address 803fa9c0) ...
## Giving linux memsize in MB, 128

Starting kernel ...

followed by indefinite but emphatic silence, and various bouts of fiddling with CONFIG_EARLY_PRINTK and stuff have not yet persuaded it to loosen up. Currently I am running LEDE in a Docker container to see what it does, and diffing its .config with mine. This has shown up a couple of things that I've now added to my configuration, but I am only going to get one shot at running it before I go home, because at 70 miles distant from the hardware I can't reach across and power cycle it.

The U is for urgghle

Wed, 28 Mar 2018 00:36:42 +0000

A more productive week than the previous one, on the whole.

(Once I got warmed up, at least. I returned from my holiday to find that the entire local network had stopped working because, after mucking around with U-Boot on the device while I was away, I'd inadvertently let the default openwrt installation on the MT300A start, and it was running a DHCP server. Shouldn't have put it on the LAN, I suppose. One round of reboots later ... but apart from that it has been a productive week.)

Let's start by spoiling the ending, so you don't have to read the rest of this post: said MT300A now boots to user space and runs init (and monit). I hope that all that remains is to get the Ethernet working and to build a flashable image.

On our way to that destination, we ... basically, this is another thrilling instalment of "don't trust the bootloader"

This board builds with device tree, but its u-boot has no way to provide a device tree blob at start up, so what we have to do instead is bodge the device tree into the kernel itself.

The actual mechanics of glomming a device tree binary blob onto the kernel image are fairly straightforward if you have an openwrt build to crib from: generate the ELF vmlinux as usual, convert it to a raw binary, compile the DTS file (which involves preprocessing it with cpp because - I don't know why - it contains two different kinds of include directive and the dtc tool only understands one of them) and then use some magic patch-dtb tool to stick them together.
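The four steps above could be strung together roughly as follows. This is only a sketch: it builds the command lines without running them, the file names and the `dtb_commands` helper are hypothetical, and the exact flags and the argument order for patch-dtb are assumptions based on how an OpenWRT build typically invokes these tools, so check them against a real build log before relying on them.

```python
def dtb_commands(vmlinux: str, dts: str, include_dir: str, cross: str = "") -> list:
    """Return the argv lists for attaching a device tree blob to a kernel.

    `cross` is an optional toolchain prefix such as "mipsel-unknown-linux-musl-"
    (a hypothetical example; use whatever your cross toolchain is called).
    """
    raw = vmlinux + ".bin"                 # raw binary made from the ELF kernel
    preprocessed = dts + ".tmp"            # DTS after cpp has resolved #include
    dtb = dts.rsplit(".", 1)[0] + ".dtb"   # compiled device tree blob
    return [
        # 1. ELF vmlinux -> raw binary
        [cross + "objcopy", "-O", "binary", vmlinux, raw],
        # 2. run the C preprocessor over the DTS so #include lines are handled
        [cross + "cpp", "-nostdinc", "-undef", "-x", "assembler-with-cpp",
         "-I", include_dir, "-o", preprocessed, dts],
        # 3. compile the preprocessed source; dtc handles /include/ itself
        ["dtc", "-I", "dts", "-O", "dtb", "-o", dtb, preprocessed],
        # 4. patch the blob into the raw kernel image (assumed argument order)
        ["patch-dtb", raw, dtb],
    ]


if __name__ == "__main__":
    for argv in dtb_commands("vmlinux", "GL-MT300A.dts", "include"):
        print(" ".join(argv))
```

Keeping the commands as data makes it easy to print them for comparison with what the LEDE build does before feeding each one to something like `subprocess.run(argv, check=True)`.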

I previously described device tree as "let's pretend that the hardware description was provided to us by open firmware", and insofar as this means the hardware is described by a data file instead of by the effect of running imperative C code, it's definitely good. But when that data file is provided by the kernel source [*] and attached to the kernel image, instead of being passed in by the bootloader as an input to the kernel, things get weird. For example, the default config for the MT300A is that the kernel command line comes from DT - effectively this means that the kernel provides its own command line. And you may ask yourself: where does that highway go to? It would make sense if the command line had been provided by the user to the bootloader, which then merged it into the DT, but in this scenario ... not so much.

[*] splitting hairs here: the particular DTS we're using comes from LEDE and not from the mainline kernel. But that doesn't invalidate my point, which is that it doesn't come from the bootloader.

At this point, and skating over a minor digression where I had accidentally built a kernel that thought it was little endian but using a big endian compiler (tl;dr - didn't work), I had a kernel that booted most of the way to mounting root but then failed.

The reasons it failed were approximately legion in number, or at least that's how it felt. In the order I discovered them:

I. The kernel was configured to get its command line - see above - from DT not the bootloader, so was ignoring all my phram options because the hardcoded command line overrode them.

II. For reasons that no doubt made perfect sense to the people who were there at the time, the u-boot build on this board doesn't support bootargs anyway. So the kernel was ignoring all my phram options because it wasn't even seeing them. Remember, kids - the U in "u-boot" stands for #undef. (I am assuming the sources there correspond with the binary that shipped on my device; to be quite honest I haven't checked this properly for myself.)

III. Another bit of hardcoded magic ... there is a kernel config option in LEDE, again enabled by default, that looks through the MTD partition table and, if it finds a partition named rootfs, sets it up as the root filesystem. This overrides any root= option given on the command line. So the openwrt rootfs on the actual flash was being used instead of the custom root fs in phram - and failing to work, not least because it's a jffs filesystem and I'd set rootfstype=squashfs.

IV. Whereas the Yun wanted the address of the filesystem image as an offset from 0x80000000, the MT300A wants it offset from 0. I haven't got the MIPS memory map entirely straight in my head but I think I would be using the right words if I said it wants a kuseg memory address not a kseg0 address. Or perhaps it wants a physical address not a kseg0 address. (The same physical RAM is mapped in both places but the cache behaviour is different.) Whichever is the right explanation I don't know, but when I fixed that, bingo, it boots!

I don't know if the Yun would work with kuseg addresses too, but next time I plug it back in I'll try it.
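For the address confusion in point IV, the relationship between a kseg0 address and the physical address behind it on 32-bit MIPS is just a matter of the top bits: kseg0 is the cached, unmapped window at 0x80000000 onto the low 512MB of physical memory. A small sketch, using the 11f8000 offset from the earlier U-Boot example; the helper names are made up, and whether the MT300A kernel wants a physical or a kuseg address is exactly the open question in the text, which this does not settle.

```python
KSEG0_BASE = 0x80000000  # start of the cached, unmapped kernel segment
KSEG1_BASE = 0xA0000000  # start of the uncached segment, i.e. the end of kseg0
PHYS_MASK = 0x1FFFFFFF   # kseg0 covers the low 512MB of physical memory


def kseg0_to_phys(addr: int) -> int:
    """Strip the kseg0 window bits to get the underlying physical address."""
    if not KSEG0_BASE <= addr < KSEG1_BASE:
        raise ValueError(f"{addr:#x} is not a kseg0 address")
    return addr & PHYS_MASK


def phys_to_kseg0(addr: int) -> int:
    """Map a low physical address into the kseg0 window."""
    if not 0 <= addr <= PHYS_MASK:
        raise ValueError(f"{addr:#x} is not reachable through kseg0")
    return addr | KSEG0_BASE


if __name__ == "__main__":
    rootaddr = 0x11F8000
    print(hex(phys_to_kseg0(rootaddr)))    # the form the Yun wanted
    print(hex(kseg0_to_phys(0x811F8000)))  # the form the MT300A wants
```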

Also this week, I significantly cleaned up how we provide config options to the nixwrt kernel derivation: we now read the appropriate defconfig file(s) into an attrset, then pass that attrset into overrideConfig. overrideConfig is a function that accepts an attrset of default config options and is expected to return a customised set. It's a lot prettier than the rather awful mess of echo and grep -v it replaces, but it's not 100% perfect because it doesn't know the type of each config option value: if you have a string value you need to quote that string yourself (tip: use builtins.toJSON). It will do for now, though. Probably I should extract it into a function so that I could use the same code to configure Busybox.
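The shape of that approach, sketched in Python rather than Nix so that it can be run on its own: a dict stands in for the attrset, `json.dumps` plays the role of builtins.toJSON for quoting string values, and the function names and the example command line are hypothetical. The real nixwrt code is Nix and may well differ in detail.

```python
import json


def parse_defconfig(text: str) -> dict:
    """Read CONFIG_FOO=value lines into a dict, keeping values as raw text.

    Comment lines (including "# CONFIG_FOO is not set") are skipped.
    """
    options = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, _, value = line.partition("=")
        options[name] = value
    return options


def override_config(defaults: dict) -> dict:
    """Take the default options and return a customised copy."""
    custom = dict(defaults)
    custom["CONFIG_MTD_PHRAM"] = "y"
    # String values must be quoted by the caller, since nothing here knows
    # the type of each option. The value itself is only an example.
    custom["CONFIG_CMDLINE"] = json.dumps("console=ttyS0,115200")
    return custom


def render_config(options: dict) -> str:
    """Turn the dict back into .config-style text."""
    return "".join(f"{name}={value}\n" for name, value in sorted(options.items()))


if __name__ == "__main__":
    defaults = parse_defconfig('CONFIG_MIPS=y\n# CONFIG_SMP is not set\n')
    print(render_config(override_config(defaults)), end="")
```

Because the override is just a function from one mapping to another, the same three helpers would work unchanged for any other tool that uses the same KEY=value config format.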

That's about it for this week.