Musl memory#
Wed, 28 Feb 2018 00:36:07 +0000
A short post this week, but this is because I need to sleep, not
because there is nothing to write about.
First up, NixWRT has moved. It is no longer part of a "lightly forked
nixpkgs" repo, it has its own repo containing only NixWRT stuff at
https://github.com/telent/nixwrt . Instead of embedding the Nix package collection it now requires that you provide it with one, e.g. by using the -I
flag to nix-build:
nix-build -I nixpkgs=../nixpkgs-for-nixwrt/ -A tftproot backuphost.nix
Presently there is still a mildly forked Nix package collection
involved, but it is now available separately, and I have started the
process of feeding the changes back into upstream so I hope to be able
to eliminate that dependency in time.
Second, it builds with musl - which is great news as the image for
`backuphost` is too big to fit in 8MB flash when using glibc. The
changes required to switch to musl are - apart from a small bug in the
nixpkgs libiconv derivation - ludicrously trivial.
Third, I was not entirely correct last week when I said that upgrading
to nixpkgs master caused nixwrt to break "almost not at all", because
after I actually split the repos up I found a couple more patches
needed than just the two mentioned. But nothing too serious.
Here's what it looks like:
[dan@loaclhost:~/src/nixwrt]$ ls -l yun/
total 9608
-r--r--r-- 1 root root 1565199 Jan 1 1970 kernel.image
-r--r--r-- 1 root root 2568192 Jan 1 1970 rootfs.image
-r-xr-xr-x 1 root root 5698784 Jan 1 1970 vmlinux
(vmlinux is not actually required on the target, it's a leftover)
Next up will be more patch upstreaming, and making it generate an
image I can actually flash onto a TL-WR842. It is claimed that the
emergency debricking TFTP client only works when fed with actual
TP-Link images and not with OpenWRT, which is going to be a bit of a
drag if true.
Grandmaster, cut faster#
Tue, 06 Mar 2018 22:28:59 +0000
No Nix content at all this week, as all I've done is flash (please
refer back to blog post title) my TL-WR842ND back to the factory
firmware in preparation for figuring out how to get NixWRT onto it.
There's some discussion of how to do this on the OpenWRT wiki - attach
the router to a wired network, configure a tftp server to answer on
192.168.1.66 and respond to requests for a file called
wr842ndv1_tp_recovery.bin which was previously downloaded from the
TP-Link site, then turn the router on while holding RESET and wait for
stuff to happen.
As always, however, there is a wrinkle. The firmware I downloaded was a ZIP which contained a file called wr842ndv1_en_3_12_25_up_boot(130322).bin, and according to most sources (most sources parrot the OpenWRT wiki)
in case the file name of this firmware file does contain the word “boot” in it, you need to cut off parts of the image file before flashing it:
specifically, remove the first 131584 bytes. Why that number? It doesn't say.
This is what binwalk is for:
[dan@carobn:~]$ nix-shell -p python27Packages.binwalk --run "binwalk /tmp/wr842.bin"
DECIMAL HEXADECIMAL DESCRIPTION
--------------------------------------------------------------------------------
0 0x0 TP-Link firmware header, firmware version: 1.-3012.3, image version: "", product ID: 0x0, product version: 138543105, kernel load address: 0x0, kernel entry point: 0x80002000, kernel offset: 8258048, kernel length: 512, rootfs offset: 872767, rootfs length: 1048576, bootloader offset: 7077888, bootloader length: 0
110592 0x1B000 U-Boot version string, "U-Boot 1.1.4 (Mar 22 2013 - 09:09:03)"
110768 0x1B0B0 CRC32 polynomial table, big endian
131584 0x20200 TP-Link firmware header, firmware version: 0.0.3, image version: "", product ID: 0x0, product version: 138543105, kernel load address: 0x0, kernel entry point: 0x80002000, kernel offset: 8126464, kernel length: 512, rootfs offset: 872767, rootfs length: 1048576, bootloader offset: 7077888, bootloader length: 0
132096 0x20400 gzip compressed data, has original file name: "vmlinux.bin", from Unix, last modified: 2013-03-22 01:11:22
1180160 0x120200 Squashfs filesystem, big endian, lzma signature, version 3.1, size: 4675579 bytes, 562 inodes, blocksize: 65536 bytes, created: 2013-03-22 01:24:41
So there you are: the emergency tftp restore expects an image with a TP-Link firmware header followed by a kernel followed by a filesystem - which roughly corresponds with the description of mtd5 in the openwrt flash layout - but the image on the TP-Link site prefaces that with about 128k of something that might be U-boot, which roughly corresponds with the layout of the entire flash chip.
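The binwalk output also explains the wiki's magic number: 131584 is 0x20200, the offset of the second TP-Link firmware header, and the gzipped kernel starts 512 bytes after that. The trimming step can be sketched as below. This is a minimal sketch, not a tested recovery procedure: the helper name strip_boot_prefix and the file names in the usage comment are made up for illustration.

```shell
#!/bin/sh
# Offset of the second TP-Link firmware header as reported by binwalk
# (131584 = 0x20200 = 257 * 512).
BOOT_PREFIX_BYTES=131584

# strip_boot_prefix IN OUT: copy IN to OUT, dropping the first
# BOOT_PREFIX_BYTES bytes. (Hypothetical helper name.)
strip_boot_prefix() {
    # tail -c +N starts output at byte N, counting from 1, so add one.
    tail -c "+$((BOOT_PREFIX_BYTES + 1))" "$1" > "$2"
}

# Example use (file names are assumptions):
#   strip_boot_prefix /tmp/wr842.bin /tmp/wr842ndv1_tp_recovery.bin
```

Running binwalk on the result should then show the TP-Link firmware header at offset 0, if the offset was the right one.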
Going forward this is relevant insofar as it means we really have two
problems not just one:
- creating a firmware layout for NixWRT which is acceptable to some
flashing tool or other - the tftp emergency flash, or the "Firmware
upgrade" web ui in the OEM firmware, or some facility offered by
OpenWRT if I flash that first.
- creating a kernel and fs which will boot successfully on the
hardware and work well enough to bring up networking. Because, as
previously mentioned, I have not yet been able to make the serial
console work.
Currently thinking: we can tackle problem 2 first. Let's put OpenWRT
on the machine (then at least I have ssh available) and then build a
kernel/fs I can start with kexec and iterate on that until I know it
works on the hardware. Once we have the right code then we can
start figuring out how to put it at the right offset.
Booted out#
Wed, 14 Mar 2018 14:12:58 +0000
This is another week of not much done, but for the record
kexec
You can't use kexec to boot into a new kernel unless the kernel you're
booting from has support for it. So that cunning plan is out.
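A check like the following, run on the router over ssh, would have saved some time. It is a sketch under one assumption: that a kernel built with kexec support exposes /sys/kernel/kexec_loaded, and one built without it does not. The function name and the optional sysfs-root argument are made up here, the latter only so the check can be tried against a fake directory tree.

```shell
#!/bin/sh
# kexec_supported [SYSFS_ROOT]: succeed if the running kernel exposes
# the kexec_loaded attribute. SYSFS_ROOT defaults to /sys.
# (Hypothetical helper; the sysfs path is the only thing it relies on.)
kexec_supported() {
    [ -e "${1:-/sys}/kernel/kexec_loaded" ]
}

if kexec_supported; then
    echo "kexec available"
else
    echo "no kexec support in the running kernel"
fi
```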
Das U-Boot is billed as "the Universal Boot Loader", but sometimes I
wonder if in practice the U stands for "unique per board" or
"unco-ordinated" or even "uninstallable" - simply because the actual
version of u-boot that comes installed on your cheap consumer router or IoT device board is a forked and undocumented
mess based on an upstream release that's probably about ten years old,
and if you want to replace it with mainline U-Boot you have to either
(1) be lucky enough to have your new build work perfectly first time,
or (2) have access to JTAG or a serial programmer in case it doesn't.
u-boot_mod
... u-boot_mod looks really rather cool if you have a device it supports - in addition to the basic u-boot it has a web server and a network console.
Unfortunately, as it doesn't support my device (it supports some
varieties of TL-WR841 and a later revision of WR842 than mine) I'm
disinclined to try building it, given that it's sensitive to things
like gcc version and that if it doesn't work there is again no way
to resurrect the device without special hardware.
Excuses, excuses. What's the answer?
New hardware
I ordered this yesterday, so when Amazon eventually deign to deliver it,
development will/may resume.
Lede by example#
Tue, 20 Mar 2018 21:51:52 +0000
My GL-MT300A arrived just as I was about to go on holiday. This is how far I've got -
Serial console
These things are, if not actually made for DIY purposes, at least
very tolerant of such uses. Take it out of its case and you find
three standard 0.1" header pins on the PCB labelled "TX", "RX", "GND"
- connect each of them to something that speaks TTL serial (I used a
Raspberry Pi) and set the baud rate to 115200. Worked first time.
"The U is for Uninitialized"-Boot
I commented previously about the differences one may encounter
between two devices both of which run the allegedly "Universal" U-Boot
boot loader. This time I couldn't work out why my tftp downloads were
loading into memory at offset 0 instead of, say, 0x811f8000. Until I
realised that (i) it no longer sufficed to say
setenv rootaddr 11f8000
setenv rootaddr_useg 0x$rootaddr
setenv rootaddr_ks0 0x8$rootaddr
and I must now surround environment variable references with curly braces:
setenv rootaddr 11f8000
setenv rootaddr_useg 0x${rootaddr}
setenv rootaddr_ks0 0x8${rootaddr}
and (ii) on this device, double quotes around the value of a setenv
are no longer special, so
setenv bootn "foo;bar"
will set the value of bootn to "foo and then attempt to run the command bar". Which typically doesn't work all that well.
Hello darkness my old friend
Having made the relevant changes I was able to get the following output:
## Booting image at 81000000 ...
Image Name: Linux-4.9.76
Image Type: MIPS Linux Kernel Image (lzma compressed)
Data Size: 1705466 Bytes = 1.6 MB
Load Address: 80001000
Entry Point: 803fa9c0
Verifying Checksum ... OK
Uncompressing Kernel Image ... OK
No initrd
## Transferring control to Linux (at address 803fa9c0) ...
## Giving linux memsize in MB, 128
Starting kernel ...
followed by indefinite but emphatic silence, and various bouts of
fiddling with CONFIG_EARLY_PRINTK and stuff have not yet persuaded
it to loosen up. Currently I am running LEDE in a Docker container to
see what it does, and diffing its .config with mine. This has
shown up a couple of things that I've now added to my configuration, but I am
only going to get one shot at running it before I go home, because at
70 miles distant from the hardware I can't reach across and power
cycle it.
The U is for urgghle#
Wed, 28 Mar 2018 00:36:42 +0000
A more productive week than the previous one, on the whole.
(Once I got warmed up, at least. I returned from my holiday to find
that the entire local network had stopped working because after
mucking around with U-boot on the device while I was away, I'd
inadvertently let the default openwrt installation on the MT300A
start, and it was running a DHCP server. Shouldn't have put it on the LAN,
I suppose. One round of reboots later ... but apart from that it has
been a productive week)
Let's start by spoiling the ending, so you don't have to read the rest of this
post: said MT300A now boots to user space and runs init (and monit). I hope
that all that remains is to get the Ethernet working and to build a flashable
image.
On our way to that destination, we ... basically, this is another
thrilling instalment of "don't trust the bootloader".
This board builds with device tree, but its u-boot has no way to
provide a device tree blob at start up, so what we have to do instead
is bodge the device tree into the kernel itself.
The actual mechanics of glomming a device tree binary blob onto the kernel image
are fairly straightforward if you have an openwrt build to crib from:
generate the ELF vmlinux as usual, convert it to a raw binary, compile the DTS
file (which involves preprocessing it with cpp because - I don't know why -
it contains two different kinds of include directive and the dtc tool only
understands one of them) and then use some magic patch-dtb tool to stick
them together.
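Those four steps can be written down roughly as follows. Treat this as a sketch rather than the actual NixWRT or openwrt build recipe: the file names, the CROSS toolchain prefix and the append_dtb helper are placeholders, the cpp flags are the usual ones for preprocessing DTS files but a real build will also need -I options for the include directories, and patch-dtb is the helper that comes with the openwrt/LEDE build tree, not a standard tool. It only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Placeholder toolchain prefix; override CROSS for a real toolchain.
CROSS=${CROSS:-mipsel-unknown-linux-musl-}
DRY_RUN=${DRY_RUN:-1}

# Print each command; only execute it when DRY_RUN is not 1.
run() {
    echo "+ $*"
    [ "$DRY_RUN" = 1 ] || "$@"
}

# append_dtb VMLINUX DTS OUT (hypothetical helper name)
append_dtb() {
    vmlinux=$1 dts=$2 out=$3
    # 1. ELF vmlinux -> raw binary
    run "${CROSS}objcopy" -O binary "$vmlinux" "$out"
    # 2. expand the C-style #include lines, which dtc does not understand
    run "${CROSS}cpp" -nostdinc -x assembler-with-cpp -undef -D__DTS__ \
        -o "$out.dts.tmp" "$dts"
    # 3. compile the preprocessed source into a device tree blob
    run dtc -O dtb -o "$out.dtb" "$out.dts.tmp"
    # 4. stick the blob into the kernel image
    run patch-dtb "$out" "$out.dtb"
}

append_dtb vmlinux board.dts vmlinux.bin
```

The dry-run default is deliberate: the point is to show the order of the steps, and the exact flags are best cribbed from the openwrt build itself.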
I previously described device tree as "let's pretend that the hardware
description was provided to us by open firmware", and insofar as this
means the hardware is described by a data file instead of by the
effect of running imperative C code then it's definitely good. But
when that data file is provided by the kernel source [*] and attached
to the kernel image, instead of being passed in by the bootloader as
an input to the kernel, things get weird. For example, the default
config for the MT300A is that the kernel command line is coming from
DT - effectively this means that the kernel provides its own command
line. And you may ask yourself: where does that highway go to? It
would make sense if the command line had been provided by the user to
the bootloader which then merged it into the DT, but in this scenario
... not so much.
[*] splitting hairs here: the particular DTS we're using comes from
LEDE and not from the mainline kernel. But that doesn't invalidate
my point, which is that it doesn't come from the bootloader.
At this point, and skating over a minor digression where I had accidentally
built a kernel that thought it was little endian but using a big endian
compiler (tl;dr - didn't work), I had a kernel that booted most of the way to
mounting root but then failed.
The reasons it failed were approximately legion in number, or at least that's
how it felt. In the order I discovered them:
I. The kernel was configured to get its command line - see above - from DT not
the bootloader, so was ignoring all my phram options because the hardcoded
command line overrode them.
II. For reasons that no doubt made perfect sense to the people who
were there at the time, the u-boot build on this board doesn't
support bootargs anyway. So the kernel was ignoring all my phram
options because it wasn't even seeing them. Remember, kids - the U in
"u-boot" stands for #undef. (I am assuming the sources there correspond
with the binary that shipped on my device; to be quite honest I haven't
checked this properly for myself.)
III. Another bit of hardcoded magic ... there is a kernel config option in
LEDE, again enabled by default, that looks through the MTD partition table and
if it finds a partition named rootfs, sets it up as the root filesystem.
This overrides any root= option given on the command line. So the openwrt
rootfs on the actual flash was being used instead of the custom root fs in
phram - and failing to work not least because it's a jffs filesystem and I'd set rootfstype=squashfs.
IV. Whereas the Yun wanted the address of the filesystem image as an
offset from 0x80000000, the MT300A wants it offset from 0. I haven't
got the MIPS memory map entirely straight in my head but I think I
would be using the right words if I said it wants a kuseg memory
address not a kseg0 address. Or perhaps it wants a physical address
not a kseg0 address. (The same physical RAM is mapped in both places
but the cache behaviour is different). Whichever is the right
explanation I don't know, but when I fixed that, bingo, it boots!
I don't know if the Yun would work with kuseg addresses too, but next time I plug it
back in I'll try it.
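For the record, the arithmetic relating the two kinds of address looks like this. On MIPS32 the kseg0 window (0x80000000-0x9fffffff) and the uncached kseg1 window (0xa0000000-0xbfffffff) both map onto physical memory by dropping the top three address bits. The sketch below assumes a shell with arithmetic wider than 32 bits, and the function name is made up; it says nothing about which of the two explanations above is the right one.

```shell
#!/bin/sh
# kseg0_to_phys ADDR: clear the top three address bits, turning a
# kseg0 or kseg1 address into the physical address it refers to.
# (Hypothetical helper name.)
kseg0_to_phys() {
    printf '0x%08x\n' "$(( $1 & 0x1fffffff ))"
}

kseg0_to_phys 0x811f8000   # prints 0x011f8000
```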
Also this week, significantly cleaned up how we provide config
options to the nixwrt kernel derivation - now we read the appropriate
defconfig file(s) into an attrset then pass that attrset into
overrideConfig. overrideConfig is a function that accepts an attrset
of default config options and is expected to return a customised set.
It's a lot prettier than the rather awful mess of echo and grep -v it
replaces, but it's not 100% perfect because it doesn't know the type
of each config option value - if you have a string value you need to
quote that string yourself (tip: use builtins.toJSON) but it will do
for now.
Probably I should extract it into a function so that I could use the
same code to configure Busybox.
That's about it for this week.
- With luck and a following wind I'll be at the NixOS 18.03 install
party at Codenode - I hope to bring along some hardware - so if you're
interested and in London, sign up and come along.