diary at Telent Netowrks

Flip the switch#

Mon, 02 Apr 2018 22:07:34 +0000

[ meta: I wanted to call this one "sudo make me a LAN switch" but now the MRAs have ruined that phrase for everyone ]

I got networking working on the GL-MT300A. This entailed:

Patching the device tree definition.

This is slightly cargo-culted (I Read It On The Internet), but by removing the pinctrl-0 entry from the &ethernet stanza I managed to change the bootup messages from saying

[    1.873672] rt2880-pinmux pinctrl: could not request pin 40 (io40) from group ephy  on device rt2880-pinmux
[    1.883620] mtk_soc_eth 10100000.ethernet: Error applying setting, reverse things back
[    1.891724] mtk_soc_eth: probe of 10100000.ethernet failed with error -22

to saying

<6>[    2.586201] mtk_soc_eth 10100000.ethernet eth0 (uninitialized): port 1 link up (100Mbps/Full duplex)
<6>[    2.595753] mtk_soc_eth 10100000.ethernet: loaded mt7620 driver
<6>[    2.602600] mtk_soc_eth 10100000.ethernet eth0: mediatek frame engine at 0xb0100000, irq 5

which felt a lot like progress but did not result in actual connectivity.

Building swconfig

After trying the obvious culprits (firewall rules? weird routing?) for why my board was not seeing the network - indeed, not even able to ping its own IP address - I studied the dmesg output a bit more closely, and noticing the line

<6>[    2.581845] gsw: setting port4 to ephy mode

I took a wild-ass guess that, given I knew the device contains some kind of network switch, maybe the switch doesn't come up in a useful state. So we needed a tool of some kind to reconfigure it, and apparently the appropriate tool is swconfig.

(If you followed that link and struggled to understand what it was talking about, be assured that it means nothing to me either. Ure not alone)

Building swconfig is easier when you start with a fork for Debian instead of the original OpenWRT package: I simply made my kernel derivation install the header files (first time I have written a multi-output derivation, but turned out that in this case it was a one-line change), wrote a derivation with libnl as a dependency, and created an 80MB filesystem image.

PRO TIP: don't override phases in a Nixpkgs derivation unless you understand what is done by all the phases you didn't include. In this case, not running the fixup phase meant that nothing ran the "shrink rpath" magic which removes unneeded compile-time dependencies from the runtime dependency list. Image size go boom.

Running swconfig

[Image: dog with network cable; caption not preserved]

I spent some time trying to figure out what I was doing with vlans and port configuration and worrying that when it said link: ?unknown-type? against each port that would mean I had to tell it somehow what kind of link type to use. Then eventually I hit on

swconfig dev switch0 set enable_vlan 0
swconfig dev switch0 set apply

and as if by magic (= "insufficiently advanced understanding of technology") it all started working. When I eventually want it to function as a switch, obviously I will need to revisit this. But that is Milestone 1 and this is Milestone 0 - sufficient unto the day etc etc.

The GL(.Inet) has landed#

Tue, 10 Apr 2018 22:57:27 +0000

See subject. Most of the work this last week has been moving things around in the hope of making it possible to support more than one device, and then merging the gl-mt300a branch into master (nitpick: for reasons I can't remember and which are unlikely to be convincing, the primary development branch is actually called nixwrt not master. Probably something to do with git filter-branch).

This breaks the Yun code that was previously there, because "possible" and "actually implemented" are two different things. But I am not using it, I am pretty certain nobody else is either, and at least now I can see how to fix it.

I have added a targetBoard argument which will soon allow choice of mt300a, yun or malta (that last one for qemu), so the build command is presently:

nix-build -I nixpkgs=../nixpkgs-for-nixwrt/ backuphost.nix \
  -A tftproot --argstr targetBoard mt300a -o mt300a 

I have added a `swconfig` invocation to the monit config so that networking comes up. I've done it in a totally hacky way until I decide how to represent the switch in configuration, but that might involve learning how the damn thing works.

That's about all for now, save to say that yesterday I went to the NixOS London meetup which in the event did not involve an install party (everyone present had installed it already) but did involve some interesting conversations, learning something about overlays and giving a quick demo of NixWRT so far. Albeit that I was demonstrating by ssh back to my hardware at home: no actual hardware at the venue that could be waved around or kicked. I had intended to do something a bit more visual but thought better of it when I realised how much of my home kit I'd have to unplug. Next time, definitely.

Flash! (ah-ah)#

Mon, 16 Apr 2018 23:18:57 +0000

This week I successfully flashed NixWRT to my GL-MT300A, such that it runs whenever I turn the device on. And then I found it slowly fills up all RAM and the process table over about half a day then stops working.

But we'll get to that in a minute. Let's talk about the good bits first.

Graven image

Given we have a kernel and a root filesystem, how exactly should we mash them up into a flashable image such that (i) uboot will find and run the kernel, (ii) the kernel will find its filesystem? Let's look at the dmesg output from booting OpenWRT -

[    0.620000] m25p80 spi32766.0: w25q128 (16384 Kbytes)
[    0.630000] 5 ofpart partitions found on MTD device spi32766.0
[    0.630000] Creating 5 MTD partitions on "spi32766.0":
[    0.640000] 0x000000000000-0x000000030000 : "u-boot"
[    0.650000] 0x000000030000-0x000000040000 : "u-boot-env"
[    0.650000] 0x000000040000-0x000000050000 : "factory"
[    0.660000] 0x000000050000-0x000000fd0000 : "firmware"
[    0.780000] 2 uimage-fw partitions found on MTD device firmware
[    0.790000] 0x000000050000-0x000000174720 : "kernel"
[    0.790000] 0x000000174720-0x000000fd0000 : "rootfs"
[    0.800000] mtd: device 5 (rootfs) set to be root filesystem
[    0.810000] 1 squashfs-split partitions found on MTD device rootfs
[    0.810000] 0x000000890000-0x000000fd0000 : "rootfs_data"
[    0.820000] 0x000000ff0000-0x000001000000 : "art"

It has five "ofpart" partitions. I'm guessing the "of" stands for "open firmware" and indeed if we look at the DTS file (remember if you will from blog entries passim that the device tree is a representation of "what you'd get from open firmware if the hardware had open firmware") we can see five partitions defined there.

It has two "uimage-fw" partitions which further subdivide the firmware partition. Unlike the ofpart partitions these are not actually defined anywhere: they are the result of kernel code (in drivers/mtd/mtdsplit/mtdsplit{,_uimage}.c) which looks for partitions which start with a uimage, parses the image length, and then looks for a filesystem signature of some kind on the next erase block boundary. (This is very convenient magic not least because it means we don't have to update some partition table each time our kernel size changes, but it still makes me uneasy; I have a very low threshold for magic)

Hypothesis (subsequently proven): if our firmware file consists of a kernel wrapped in a uimage, plus padding to the next erase block boundary, plus the filesystem image, Linux will report an MTD uimage-fw partition that starts where the filesystem starts. We can copy this combined image into flash at offset 0x000000050000 to overwrite the existing kernel/root fs while leaving the rest of the flash undisturbed.
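The arithmetic behind that hypothesis can be sketched in a few lines of shell. This is only an illustration: round_up and rootfs_offset are names I have made up, and the erase block size is whatever you pass in.

```shell
# round_up VALUE BLOCK: round VALUE up to the next multiple of BLOCK
round_up() {
    echo $(( ($1 + $2 - 1) / $2 * $2 ))
}

# rootfs_offset FIRMWARE_OFFSET KERNEL_SIZE ERASE_BLOCK
# Print the flash offset at which the filesystem should start: the start of
# the firmware partition plus the uimage length padded to an erase block.
rootfs_offset() {
    printf '0x%x\n' $(( $1 + $(round_up "$2" "$3") ))
}
```

For example, rootfs_offset 0x50000 "$(wc -c < kernel.uimage)" 0x20000 would give the expected start of the filesystem, where kernel.uimage is a placeholder file name and 0x20000 is the guessed 128k erase block.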

(To get the erase block size, we could (a) guess it's probably 128k, or (b) if we were more prudent, check in /proc/mtd.)
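For option (b), something like the following would pull the number out. It is a sketch that assumes the usual /proc/mtd layout of "dev: size erasesize name" columns with hex sizes and a quoted name, and that the partition name contains no spaces; mtd_erase_size is a made-up helper name.

```shell
# mtd_erase_size NAME [FILE]: print the erase size, in bytes, of the MTD
# partition called NAME. FILE defaults to /proc/mtd.
mtd_erase_size() {
    # the name column is quoted in /proc/mtd, so match it with the quotes on
    hex=$(awk -v name="\"$1\"" '$4 == name { print $3 }' "${2:-/proc/mtd}")
    [ -n "$hex" ] && printf '%d\n' "0x$hex"
}
```

On the running device this would be invoked as mtd_erase_size firmware.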

Short version: that half of the puzzle is solved by creative use of dd.
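Roughly, the dd trick looks like this. It is a minimal sketch rather than the exact derivation in NixWRT: make_firmware and its arguments are hypothetical, and it relies on conv=sync padding the final short block with NUL bytes up to the block size.

```shell
# make_firmware KERNEL ROOTFS OUTPUT [ERASE_BLOCK_BYTES]
# Write KERNEL (already wrapped in a uimage) padded out to a multiple of the
# erase block size, then append ROOTFS so that it starts on a block boundary.
make_firmware() {
    kernel=$1 rootfs=$2 output=$3 erase_block=${4:-131072}
    # conv=sync pads the last partial input block with NULs up to bs
    dd if="$kernel" of="$output" bs="$erase_block" conv=sync 2>/dev/null &&
        cat "$rootfs" >> "$output"
}
```

Usage would be along the lines of make_firmware kernel.uimage rootfs.image firmware.bin 65536, with the file names standing in for whatever the build produces.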

Flash override

So the general principle for flashing from U-Boot, once you have a suitable image, is (1) download it into RAM somewhere, (2) erase an appropriate section of flash, (2.1) hope the power doesn't fail before you finish step 3, (3) copy the image from RAM into flash, (4) reboot and see whether you've bricked the device. Although unless you've done something badly wrong (like overwrite uboot itself or the ART partition), it doesn't matter too much if the image you've uploaded doesn't actually work, because you can just go back to the u-boot prompt and try again. There's an explanation of how to do this on the Yun which I had previously successfully followed; converting that to another board is just a matter of working out whereabouts in the address space the flash chip is mapped.

And the simplest way of doing this is to look at what uboot does by default when the board is powered on. If we run printenv we see i.a.

bootcmd=bootm 0xbc050000

With 99.8% certainty, we think that this is a jump to offset 0x50000 (remember, this is the offset of the "firmware" partition) of a flash chip that starts at 0xbc000000, and this is probably all we need. So: on the build machine

$ nix-build -I nixpkgs=../nixpkgs-for-nixwrt/ backuphost.nix \
 -A firmwareImage --argstr targetBoard mt300a -o mt300a.bin
$ cp mt300a.bin /tftp

and then on the device, run these u-boot commands

setenv serverip 192.168.0.2 
setenv ipaddr 192.168.0.251 
tftp 0x80060000 /tftp/mt300a.bin
erase 0xbc050000 0xbcfd0000
cp.b 0x80060000 0xbc050000 ${filesize};
reset

and then offer up a silent prayer because 99.8% is still less than 100%. Punch the air shortly thereafter :-)
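The addresses in those u-boot commands are just the flash base plus the partition offsets from the dmesg output, and it is cheap to sanity-check them on the build machine before erasing anything. A sketch, with flash_addr and image_fits being names invented for this post, and assuming a shell with 64-bit arithmetic:

```shell
# flash_addr BASE OFFSET: print the address at which a flash offset is mapped
flash_addr() {
    printf '0x%x\n' $(( $1 + $2 ))
}

# image_fits SIZE START END: succeed if an image of SIZE bytes fits in the
# partition that occupies flash offsets START (inclusive) to END (exclusive)
image_fits() {
    [ "$1" -le $(( $3 - $2 )) ]
}
```

So flash_addr 0xbc000000 0x50000 prints 0xbc050000, and image_fits "$(wc -c < mt300a.bin)" 0x50000 0xfd0000 says whether the image is small enough for the "firmware" partition.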

Flash I love you but we only have 14 hours

As alluded to above, there is something weird going on in userland that makes it presently less than useful: the ntpd and syslogd processes (both actually BusyBox applets) don't write pid files when they start up, causing monit to decide they have failed to start and spawn another. One of each of them added every 30 seconds soon leads to a poorly computer.

No idea why, yet. I'd run strace but it doesn't want to build (maybe a MIPS thing, maybe a musl thing). Hopefully next week...

Completely random aside

And I mean completely random. Is it just me or does anyone else find that the taillight cluster on the new Prius reminds them of Ming the Merciless?

Maybe just me.

Null hypothesis#

Tue, 24 Apr 2018 22:39:25 +0000

When I blogged last week I left off at the point that syslogd and ntpd were not writing their pid files into /run.

Having basically no hypothesis[*] as to why this might be, I looked into building gdbserver and I looked into building strace and then I decided to try good old printf debugging first. To do this I wanted a slightly faster development cycle than having to build and tftp an entire image for each change.

Shared folders

It is probably fair to say I had never heard of the Plan 9 Filesystem Protocol (9P for short) before I started messing around with Nix, but it turns out not only is it a Thing That Replaces NFS, but that it is a Thing which has client support in the Linux kernel and server support in qemu. So now I start qemu with some command rather like

qemu-system-mips  -M malta -m 128 -nographic -kernel malta/kernel.image \
 -virtfs local,path=`pwd`,mount_tag=host0,security_model=passthrough,id=host0 \
 -append 'root=/dev/sr0 console=ttyS0 init=/bin/init' \
 -blockdev driver=file,node-name=squashed,read-only=on,filename=malta/rootfs.image \
 -blockdev driver=raw,node-name=rootfs,file=squashed,read-only=on \
 -device ide-cd,drive=rootfs -nographic

where for the purpose of this blog post the interesting bit is the line starting with -virtfs. This exports the current working directory such that inside the qemu VM I can mount it with

mount -t 9p -o trans=virtio,version=9p2000.L host0 /run/mnt

There is one weirdness that I've found, which is that trying to run a binary from inside /run/mnt fails with Not a socket errors. I don't know why, but if I copy the same binary into /tmp it works fine. Don't ask me, because I don't know.

Anyway, this is all rather lovely because now I can build binaries - such as, for completely random example, a busybox executable with calls to fprintf(stderr, "here %s:%d\n", __FILE__, __LINE__) every four lines - and try them instantly in a running emulator without restarting anything. Doing this led me to the discovery that it's trying to open /dev/null as part of daemonising itself, and that that file is somehow mode 0660 instead of 0666. I took a guess (yes, ladies and gentlemen, the "null hypothesis" [*]) that this might make a difference, and it turns out I was right. So, the fix is to add an mdev.conf entry. It would be interesting to know why it makes a difference, considering that both of the daemons involved run as root, but I haven't really dug into it.
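For the record, a sketch of what adding such an entry could look like. I'm assuming the stock BusyBox mdev.conf rule layout (device name regex, then uid:gid, then octal mode); add_null_rule and its default path are made up for illustration and are not how NixWRT generates the file.

```shell
# add_null_rule [CONF]: make sure CONF (default /etc/mdev.conf) has a rule
# giving /dev/null owner 0:0 and mode 666. Does nothing if a rule for
# "null" is already present.
add_null_rule() {
    conf=${1:-/etc/mdev.conf}
    grep -q '^null[[:space:]]' "$conf" 2>/dev/null ||
        printf 'null 0:0 666\n' >> "$conf"
}
```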

Change my switch up

In other news, after some experimentation with iproute2 and some more reconfiguring the kernel, I believe I might just understand how the builtin switch works and can do networking the Right Way (i.e. treat the WAN and LAN ports separately). But more about that in another thrilling installment.

[*] if you're reading this footnote for the first time, it's not just a lame joke it's foreshadowing an even worse one. See you again in about five paragraphs.

Let's switch again#

Sun, 29 Apr 2018 13:47:07 +0000

Achieving anything new this week has been rather hampered by (1) my decision to try out XMonad, and then (2) the kids all picked up some kind of vomiting bug. I do not intend you to infer any connection.

(XMonad: dunno yet, haven't really tried using it for long enough. The mouse pointer is impossibly small and I'm going to have to fix that sooner or later, but I only need it for gui apps and all I've really tried using thus far is emacs and rxvt)

But let's see if I can explain why I've been hung up on switches lately. If you've ever wanted to know why your OpenWRT router has network interfaces with names like eth0.1 (no, it's not a misguided decision to do semantic versioning on them) maybe this is for you.

First, you need to know about VLANs. A VLAN is a "virtual LAN": a way to multiplex traffic for multiple independent LANs onto the same cable. In the picture on your right, VLAN 5 connects ports 1 and 3 on switch A with ports 2 and 4 on switch B, and VLAN 6 connects port 2 on A with 3 on B.

The switches do this by "tagging" the packets (frames) according to per-port rules. If alice sends to fred, her frames will be tagged with VLAN ID 5 when they enter switch A, sent (with the tag intact) to switch B, and then untagged again as they are sent out of port 3 to fred. Long story short -

This is super useful, I have no doubt, if you're running an enterprise network and need to keep devices separated without having to run multiple sets of cables to every desk. But how is it relevant to *WRT? Because the hardware you're running it on, despite any impressions you might have had from its inclusion of two or five or eight RJ45 sockets, quite likely only has one ethernet device that Linux can see. This device is connected (internally, inside the SoC) to a builtin switch, which is also connected to all the sockets you can see. So if you want to address them separately - for example, you want to connect your upstream connection to one of them without giving it full access to your LAN and vice versa - you do it with VLAN configuration.

Make the switch

Set the upstream port to tag VLAN 1 and the LAN ports to tag VLAN 2, and the "CPU port" (the one connected to the eth0 device that Linux sees) to allow VLANs 1 and 2 but not to tag/untag either - i.e. it receives tagged packets and participates in the VLAN as if it were another switch.

We would do this with swconfig but we haven't yet because this turns out to be the default configuration anyway, at least on the MT300A. (I have no idea whether it's hardware or u-boot or the devicetree config that makes it be this way - at some point I dare say I will find out though)
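If it were not the default, the swconfig incantation would presumably look something like the output of the sketch below. The function only prints the commands rather than running them, print_vlan_setup is a made-up name, and the port numbers you pass in are guesses until checked against the actual board. The one thing to know is that a "t" suffix on a port marks it as a tagged member of the VLAN.

```shell
# print_vlan_setup SWITCH WAN_PORT LAN_PORTS CPU_PORT
# Print (not run) swconfig commands that put the WAN port in VLAN 1 and the
# LAN ports in VLAN 2, with the CPU port a tagged member of both.
print_vlan_setup() {
    sw=$1 wan=$2 lan=$3 cpu=$4
    echo "swconfig dev $sw set enable_vlan 1"
    echo "swconfig dev $sw vlan 1 set ports \"$wan ${cpu}t\""
    echo "swconfig dev $sw vlan 2 set ports \"$lan ${cpu}t\""
    echo "swconfig dev $sw set apply"
}
```

For instance print_vlan_setup switch0 0 "1 2" 6 prints four commands that could be reviewed and then piped to sh on the device.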

Configure Linux

So we have a switch on the SoC which is sending VLAN tagged frames to the ethernet interface - we need to tell Linux to expect them. (You might say it needs configuring to serve VLAN).

You do this with the ip command found in iproute2

ip link add link eth0 name eth0.1 type vlan id 1 
ip link add link eth0 name eth0.2 type vlan id 2 
ip addr add 192.168.0.251/24 dev eth0.2 
ip link set dev eth0 up

and now for most purposes you can treat eth0.1 and eth0.2 as though you have two network interfaces.

I'd like to close by saying "Simples!" but when I googled that term I learned of the existence of contemporary mereology as a field of philosophical study, and I am not sure anything can ever be simple again.