diary at Telent Netowrks

Radio Free Europe#

Thu, 26 Dec 2019 23:46:56 +0000

The GL-AR750 now has working (though not particularly fast) wifi on both 2.4GHz and 5GHz bands. A fair amount of fiddling was required to get us to this point, so in the best tradition of the Christmas Radio Times (a pre-digital British institution, don't know if still a thing) here is a double-length post about it all.

The GL-AR750 has two distinct sets of wifi hardware. The 2.4GHz stuff is part of the QCA9531 SoC, i.e. it's on the same silicon as the CPU, the Ethernet, the USB etc. The device is connected to the host via AHB (which I first guessed stood for Atheros Host Bus, but which is the Advanced High-performance Bus from ARM's AMBA family), and it is supported in Linux using the ath9k driver. The 5GHz support, on the other hand, is provided by a QCA9887 PCIe (PCI Express) WLAN chip: I haven't looked closely at the router innards to see if this is actually physically a separate board that could be unplugged, but as far as Linux is concerned it behaves as one. This is supported by the ath10k driver. Clear so far?

Five giga hertz, four calling birds, three French hens ...

My approach to porting NixWRT was basically

and the answer, at least initially, was that I got no kind of anything from the ath9k driver and some error messages from ath10k, so I thought I'd start there.

A firmware hand on the tiller

There are two things that ath10k devices need from their host environment that are not provided directly by the driver: the firmware and the calibration data. The firmware is the code that the wifi chip runs, which we have to upload into it when it boots. The calibration data, by my somewhat hazy impression, is stuff like tuning parameters for e.g. knowing which amplitudes correspond to what power outputs (which is obviously going to depend on the amplifiers, antenna design, etc, and therefore will differ depending on how the device is wired up).

On a proper PC the driver obtains the firmware by doing some kind of call-out-to-udev dance that makes userspace find it on the filesystem and feed it back into the kernel, and then feeds it into the device using the BMI (Bootloader Messaging Interface). For NixWRT I want a monolithic kernel, but happily there is a config option for people like me: CONFIG_EXTRA_FIRMWARE takes a space-separated list of files that it expects to find in a location given in CONFIG_EXTRA_FIRMWARE_DIR, and bakes their contents into the kernel in some way such that the generated kernel can make calls with names like request_firmware to find them. So we can do the firmware using that.
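As a rough sketch of what that config fragment looks like, here is a small Python helper that renders the two options. The firmware directory and the two file names in the example are assumptions for illustration (they are not taken from the actual NixWRT build), and the helper itself is hypothetical.

```python
from pathlib import PurePosixPath


def extra_firmware_config(firmware_dir, files):
    """Return kernel .config lines that bake the given firmware blobs into the kernel.

    `files` are paths relative to `firmware_dir`; CONFIG_EXTRA_FIRMWARE wants
    them as a single space-separated string.
    """
    for name in files:
        if PurePosixPath(name).is_absolute():
            raise ValueError(f"{name!r} must be relative to {firmware_dir!r}")
        if " " in name:
            raise ValueError(f"{name!r} contains a space, which is the list separator")
    joined = " ".join(files)
    return [
        f'CONFIG_EXTRA_FIRMWARE="{joined}"',
        f'CONFIG_EXTRA_FIRMWARE_DIR="{firmware_dir}"',
    ]


if __name__ == "__main__":
    # Assumed names: adjust to whatever firmware your chip actually requests.
    example = extra_firmware_config(
        "/path/to/firmware",
        ["ath10k/QCA9887/hw1.0/firmware-5.bin", "ath10k/QCA9887/hw1.0/board.bin"],
    )
    print("\n".join(example))
```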

Calibrate good times (come on!)

The calibration data is a little bit more involved, though .... On a PC or other "proper" computer, there's some kind of storage on the wireless card (this might be a so-called "OTP", which surprisingly enough stands for "One-Time Programmable" - or maybe an EEPROM - or according to some parts of the internet maybe even both? I'm sketchy here) which the manufacturer has set up with the cal data. When the driver initializes the card, it reads from the OTP or maybe the EEPROM or maybe it tries both (if they're not the same thing) and pushes that data into the device proper.

For a device which is intended for embedded systems, like the QCA9887, the manufacturers might not incorporate an OTP. It's destined for use in something that already has non-volatile memory on the host, why not just use some of that?

On some devices, it can be a little more involved, though ... the calibration data comes in two parts. There's the so-called pre-cal data, plus the board data file (BDF), and the two are combined somehow inside the device. Courtesy of a mailing list post:

  1. load a firmware(-5).bin from /lib/firmware/ath10k/QCA4019/hw1.0/
  2. load the pre-cal (aka first part of calibration) data from /lib/firmware/ath10k/pre-cal-*
  3. do some firmware magic to identify the reference design
  4. load board data "files" (BDF) for this reference design from /lib/firmware/ath10k/QCA4019/hw1.0/board-2.bin
  5. send the BDF data to the firmware to let it compute the final calibration data
  6. start the actual wifi stuff

but wait! On a board which is not the Atheros reference design, it can be a little more involved ... only the reference boards get assigned board ids, and everyone else just borrows a board ID from something that they don't share electrical/RF/whatever characteristics with. Yay.

The IPQ4018/4019 SoC doesn't contain the actual RF parts. There are a couple of reference designs (SoC+RF parts) from QCA which got official numbers. These numbers identify the BDFs inside the board-2.bin. And the board-2.bin is not the firmware - it is a container for multiple BDFs.

Having said all that, I believe that for the QCA9887 we can skip some of this, because the ART partition in the flash (aka MTD) contains the final combined calibration data, so all we need to do is retrieve that and splat it into the device. For this relief much thanks - no worries about which board id we're improperly appropriating, just a lovely blob of binary mystery meat we need not examine closely. I hope.
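Retrieving the blob amounts to slicing a fixed region out of a dump of the ART partition. A minimal sketch follows; the offset and length are assumptions for illustration only (this post does not state them), so check them against your own board before trusting the result.

```python
# Assumed layout, NOT confirmed by this post: where the ath10k calibration
# data sits inside the ART partition, and how long it is.
CAL_OFFSET = 0x5000
CAL_LENGTH = 0x844  # 2116 bytes, i.e. roughly 2k


def extract_caldata(art_dump: bytes, offset: int = CAL_OFFSET,
                    length: int = CAL_LENGTH) -> bytes:
    """Slice the calibration blob out of a raw dump of the ART partition."""
    blob = art_dump[offset:offset + length]
    if len(blob) != length:
        raise ValueError("ART dump is too short to contain the calibration data")
    if blob == b"\xff" * length:
        # Erased flash reads back as all 0xFF; that is not calibration data.
        raise ValueError("region is erased flash, not calibration data")
    return blob
```

Given a file copied off the device (say `art.bin`, a hypothetical name), `extract_caldata(open("art.bin", "rb").read())` returns the bytes to hand to the driver.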

Of course, that does still require us to be able to read the MTD. The ath10k driver doesn't already know how (as far as I can tell): it can get it from OTP or by asking via request_firmware or from the device tree.

Colonic irritation

So after obtaining a copy from my ART partition by booting OpenWRT and copying it across, my first thought was to add it to CONFIG_EXTRA_FIRMWARE, except haha that doesn't actually work, because the driver requests a filename containing colon characters (it's something like cal-pci-0000:00:00.0.bin) and the thing that bakes firmwares into the kernel is written as a baroque piece of Makefile rule, and make is prejudiced against filenames that contain colons. The exact error message was target pattern contains no '%' and I am quite proud of myself for not having spent even longer than I did working out the actual problem.

Stuck between the orang-utan and one of the boats

There's a very funny children's story about sticking things in a tree that probably shouldn't be there, and this next bit reminds me of it. If the device tree actually were a creation of the bootloader and it was passing configuration data into the freshly booted kernel as a parameter, I would willingly accept that 2k of binary blob encoding the length of the wireless antenna and the setting of the RF amplifier's volume knob is an appropriate part of that configuration data. As we are instead creating the device tree elsewhere on a build server and glomming it onto the end of the kernel we deploy to our target device, I am less convinced. But I am nothing if not pragmatic, and it beats coming up with an actual kernel patch to change the expected name of the calibration data file.

So, the cal data is now part of the device tree. To make this simpler we rearranged some of the code that builds the device tree from source, such that it's now its own derivation instead of part of the uimage derivation.
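For the curious, turning a binary blob into device tree source is mostly a formatting exercise. The sketch below renders bytes using the DTS bytestring syntax; the helper is hypothetical, and the property name in the example is my understanding of what the ath10k binding expects rather than something quoted from the NixWRT source.

```python
def dts_byte_property(name: str, data: bytes, per_line: int = 16) -> str:
    """Render `data` as a device tree bytestring property: name = [ aa bb ... ];"""
    rows = [
        " ".join(f"{b:02x}" for b in data[i:i + per_line])
        for i in range(0, len(data), per_line)
    ]
    body = "\n    ".join(rows)
    return f"{name} = [\n    {body}\n];"


if __name__ == "__main__":
    # Assumed property name; the four bytes are made up for illustration.
    print(dts_byte_property("qcom,ath10k-calibration-data",
                            bytes([0x02, 0x02, 0xFF, 0x10]), per_line=2))
```

The resulting text can be pasted into (or included from) the wifi node of the board's .dts before it is compiled.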

This actually works!

The animals went in 2.4

So, all that's left to do is add the ath9k.

First, it turns out we have the same problem here as we did with the mt7620 - the driver doesn't have the OF metadata to say it's compatible with the device.

Next, it turns out the mainline (4.19) kernel ath9k driver doesn't even support AHB anyway, only PCI. There's a patch in OpenWRT for that though, which also teaches it how to get its calibration data straight from MTD. This is a different way of doing it than we did in ath10k which offends my sense of perfect symmetry, but my occasional streak of pragmatism is kicking my sense of perfect symmetry under the table and my sense of perfect symmetry is keeping schtum.

Next next, when I added this, it stopped the ath10k from working. Argh.

Rules and regulations

This next bit I am slightly sketchy about, but this is the internet so here goes anyway. Different countries have different laws about what you can broadcast on the radio, and even in parts of the spectrum like the 2.4GHz ISM band which are supposedly available globally, there are different power limits in various places. In Linux, there is CRDA (the Central Regulatory Domain Agent), which can be queried to find out what you can do at any given frequency, but again there are kernel config flags to let us bake this into the kernel.

The problem is made more complicated by Atheros, who have decided that they also should lock the hardware to a particular set of local rules (anyone remember DVD region locks?) by having the EEPROM say which reg domain is supported then restricting you to the intersection of those rules and the rules of the regulatory domain that you've said are applicable in your location. Again I am a trifle sketchy here (because my other device has the same dmesg output but not the same problem) but this seems to cause problems because the EEPROM settings are for regdomain 0 - which is either "international" or "US", depending on who you believe - and the combined effect of that and requesting UK region is to disallow any operation on 5GHz channels.
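To make the "intersection" idea concrete, here is a toy model of it. The frequency ranges and power limits are invented for illustration and are not real regulatory data, and the real cfg80211 logic handles more fields than this; the point is only that a band missing from either rule set vanishes from the result.

```python
from typing import NamedTuple


class Rule(NamedTuple):
    start_mhz: int
    end_mhz: int
    max_dbm: int


def intersect(a, b):
    """Return the rules permitted by both rule sets.

    Overlapping frequency ranges are kept, with the lower of the two power
    limits; ranges present in only one set are dropped.
    """
    out = []
    for ra in a:
        for rb in b:
            start = max(ra.start_mhz, rb.start_mhz)
            end = min(ra.end_mhz, rb.end_mhz)
            if start < end:
                out.append(Rule(start, end, min(ra.max_dbm, rb.max_dbm)))
    return sorted(out)


# Invented numbers: an EEPROM domain with no 5GHz rules at all ...
eeprom_rules = [Rule(2400, 2483, 20)]
# ... and a requested domain that would allow some 5GHz.
requested_rules = [Rule(2400, 2483, 20), Rule(5150, 5350, 23)]

allowed = intersect(eeprom_rules, requested_rules)
```

With these inputs `allowed` contains only the 2.4GHz rule, which is the shape of the problem described above.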

(In passing: what I find odd about this is that it seems that a setting in the ath9k eeprom can change the behaviour of the entirely separate ath10k)

The Onus is Upon Us

Long story short(ened, but still really rather long): we have to add CONFIG_CFG80211_CERTIFICATION_ONUS to make it work. As far as I can work out this means "turn off all the safeties that ensure your transmitter is legal", so I'm not altogether happy about this. I need to do a bit more digging to ascertain whether there are different applicable restrictions for APs than there are for stations, because it would be much cleaner if we could enforce some appropriate restrictions instead of just disabling inappropriate ones. In OpenWRT there's a patch to disable enforcing the EEPROM regulatory restrictions which might be a less nuclear option if it works.


It works, but it needs tuning. Next steps:

The other thing that might be worth looking at, which I have only recently learned about, is the Linux Backports Project which "enables old kernels to run the latest drivers".

Creta than the sum of its parts#

Sun, 24 Nov 2019 17:01:18 +0000

The Creta (GL-AR750) travel router now boots NixWRT. Although, I hasten to add, I have not checked whether (and don't really expect that) any of its network interfaces work yet.

As alluded to previously, the snapshot release of OpenWRT for this device uses the newer ath79 target, not the older ar71xx target - in their words "it's modernization under the hood, with the main goal to bring the code into a form that is acceptable for Linux upstream, so that all (most) of the whole ar71xx supported devices can be handled by an upstream, unpatched Linux kernel". Which is, obviously, a good thing, and one which I wanted to have in NixWRT instead of sticking with the rather elderly kernel that comes with the vendor firmware.

So, a summary list of changes required

I also lost a day or more because of an off-by-0 error. Which is to say, off-by-factor-of-16, where I'd added a zero to the end of the address at which I was loading the firmware into RAM and not added the same zero to the address where I told it to look for the firmware. Because this workflow depends on CONFIG_MTD_SPLIT_FIRMWARE, which is now deprecated, I made the unwarranted assumption that the problem was more complicated than it turned out to be.
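The arithmetic behind that mistake, plus the sort of sanity check that would have caught it, can be sketched in a few lines. The address used in the usage note is made up; only the relationship between the two values matters.

```python
def append_hex_zero(addr: int) -> int:
    """Appending a '0' to a hex literal shifts it one hex digit left, i.e. x16."""
    return int(f"{addr:x}0", 16)


def check_boot_addresses(load_addr: int, boot_addr: int) -> None:
    """Raise if the image is loaded at one address and booted from another."""
    if load_addr != boot_addr:
        raise ValueError(
            f"loaded at {load_addr:#x} but booting from {boot_addr:#x}"
        )
```

For a hypothetical address `0xa00800`, `append_hex_zero(0xa00800)` gives `0xa008000`, and `check_boot_addresses(0xa008000, 0xa00800)` raises rather than letting the boot silently jump into empty RAM.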

None of this is actually on Github right now, because of the strong likelihood (racing certainty) that it breaks the ramips target, but by tomorrow with a bit of luck it will probably at least be checked in on a branch.

Got the power#

Mon, 18 Nov 2019 23:26:36 +0000

This is little more than a placeholder, really, to note that I am 98% of the way to a working NixWRT test system again.

I rebuilt the computer-controlled USB power cable from last summer, because some time between then and now half the cables fell out and it wasn't immediately obvious where to put them back. This time I dun it on stripboard with soldered connections, but I connected it to a GPIO pin on the Raspberry Pi instead of using an Arduino. Because I need the Pi anyway and why needlessly multiply entities? That went pretty well except for the bit where I killed the transistor by failing to clip the legs until sometime after bending them to touch each other and applying power. Lesson learned.

The thing you see it plugged into is a second GL-AR750, because the first one is now serving my family's Internet needs (using OpenWRT, I am not in a dogfood situation here) and can't really be unplugged just so I can play with it (because that would be something closer to a doghouse situation). One piece of good news (maybe not actually new news, but I only noticed recently) here is that the Linux kernel for said device has now been ported from the ar71xx subarchitecture to ath79, which means it now uses device tree files instead of hardcoding where all the hardware bits and bobs inside it are.

Light touch regulation#

Wed, 02 Oct 2019 23:34:42 +0000

Or less obliquely, how to configure the Thinkpad X1 Carbon (gen 4, if it matters) touchpad in NixOS to not respond to the lightest brush of a finger as if it were a click. An end to accidentally favouriting random tweets while browsing Twitter on Firefox. Or at least, I hope, the elimination of a significant source of such.

First off, the instructions at https://people.freedesktop.org/~whot/libinput-rtd/touchpad-pressure.html#touchpad-pressure-hwdb are very nearly all true, and you should read them because I am not going to recapitulate them. I say "very nearly" because I found that where it says list-quirks I had to say quirks list instead. Following that process led me to create a file /etc/libinput/local-overrides.quirks that reads as follows:

[Section touchpad pressure]
MatchName=*SynPS/2 Synaptics TouchPad

Second, it's more complicated than that. In NixOS 19.03, the libinput binary doesn't look at that file, it looks at /nix/store/9lwm03xqd8pkbxc3hgq9iiginddiyha3-libinput-1.12.6/etc/libinput/local-overrides.quirks, which is owned by the libinput derivation and can't be changed. (This is probably a bug. I will report it in the morning). Because I am trying to stick to channels and not maintain more forks of nixpkgs than I need to, I decided to patch it with an overlay. Therefore

[dan@noetbook:~]$ cat /etc/nixos/overlays/libinput.nix
self : super: {
  libinput = super.libinput.overrideAttrs (o: {
    mesonFlags = o.mesonFlags ++ [


[dan@noetbook:~]$ grep -B1 -A9 overlays /etc/nixos/configuration.nix 
  nixpkgs.overlays = [ (import /etc/nixos/overlays/libinput.nix) ];
  environment.etc."libinput/local-overrides.quirks" = {
    text = ''
[Section touchpad pressure]
MatchName=*SynPS/2 Synaptics TouchPad

and nixos-rebuild switch, and after some swearing caused mostly by forgetting to exit all the nix-shells I had accidentally got myself into, and about 30 minutes of rebuilding, it worked.

Note that adding overlays in configuration.nix does not make them available to nix-env or nix run, so you probably also want to add libinput to environment.systemPackages if you want to test your quirks are getting picked up. Then you can do this:

[dan@noetbook:~]$ strace -e openat libinput quirks list /dev/input/event7 2>&1 |grep etc 
openat(AT_FDCWD, "/etc/libinput/local-overrides.quirks", O_RDONLY) = 3

[dan@noetbook:~]$ libinput quirks list /dev/input/event7
ModelSynapticsSerialTouchpad=1
AttrPressureRange=45:42
AttrThumbPressureThreshold=100
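If you want to check the result from a script rather than by eye, the KEY=VALUE output is easy to pick apart. This is a hypothetical helper, not part of libinput, and it assumes the output format shown in the transcript above.

```python
def parse_quirks(output: str) -> dict:
    """Parse the KEY=VALUE entries printed by `libinput quirks list`."""
    quirks = {}
    for token in output.split():
        key, sep, value = token.partition("=")
        if sep:  # ignore anything that is not a KEY=VALUE pair
            quirks[key] = value
    return quirks


def pressure_range(quirks: dict) -> tuple:
    """Split AttrPressureRange ("first:second") into a pair of integers."""
    first, second = quirks["AttrPressureRange"].split(":")
    return int(first), int(second)


sample = (
    "ModelSynapticsSerialTouchpad=1\n"
    "AttrPressureRange=45:42\n"
    "AttrThumbPressureThreshold=100\n"
)
quirks = parse_quirks(sample)
```

An empty dictionary here would suggest the overrides file is still not being read.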

A great improvement / A+++ would recommend. Wish I'd done it three years ago.

Nix wrought#

Tue, 03 Sep 2019 22:00:29 +0000

Time from unboxing new GL.iNet GL-MT300N-V2 to an ssh-able NixWRT installation: 30 minutes. Though admittedly this does not include the actual firmware build time itself as I did that bit yesterday when I ordered the box.

I lost the CPU for my backup server somewhere when we moved - I still have the disk, just not the bit that makes it go. Probably it's in a box I haven't unpacked yet, but anyway. Having generated much new "content" over the past few days - I've now scanned something over 500 pieces of paper into my paperless archive - it becomes somewhat more pressing to get the automated backup service running again.

  1. make the image
  2. find some scissors, open the box
  3. plug the device LAN port into my laptop and configure it to use a different RFC1918 address than the one it came with (which conflicts with the LAN here)
  4. upload the firmware.bin using the gl.inet web router admin page
  5. wait
  6. why is it not showing up on the LAN?
  7. wait
  8. ah yes, because it's still plugged into my laptop. Try plugging it into a LAN switch instead
  9. odds bodikins, I can ssh into it!

So, 30 minutes, would have been quicker if I weren't an idiot at step 6. To say I am mildly stoked this went so smoothly would be an understatement.

There are a couple of niggles: I need to rebuild the image because I forgot to update the name of the syslog host and I think I have probably also forgotten to put a real password on the rsync service. But both those are (or should be) simple fixes.