diary at Telent Netowrks

I will run in the path of your commands

Wed, 04 Jul 2018 00:33:38 +0000

I found last week's weird bug not long after posting. Debugging really got underway when I tried setting LD_LIBRARY_PATH to include /nix/store/...-zlib-1.2.11.../lib and observed that my binaries were then able to start. From this I inferred that the libraries themselves were most probably fine, and that the problem must be in the binaries referring to them or in the dynamic linker (or "ELF program interpreter" as we're apparently supposed to call it).

Running strings on broken or on working binaries didn't turn up much of note, but when I ran readelf -d monit I got

Dynamic section at offset 0x230 contains 31 entries:                                
  Tag        Type                         Name/Value                                
 0x00000001 (NEEDED)                     Shared library: [libz.so.1]                
 0x00000001 (NEEDED)                     Shared library: [libc.so]                  
 0x0000001d (RUNPATH)                    Library runpath: [/nix/store/6qw1h5hwikg4wv9dhfhyk08pzskph6y1-zlib-1.2.11-mips-unknown-linux-musl/lib:/nix/store/mkvy309rmdjzrj81j8hmc13j2fq6dpl1-musl-1.1.18-mips-unknown-linux-musl/lib]                         
 0x0000000c (INIT)                       0x4078b4                                   
...

for a working monit and something more like

Dynamic section at offset 0x230 contains 31 entries:                                
  Tag        Type                         Name/Value                                
 0x00000001 (NEEDED)                     Shared library: [libz.so.1]                
 0x00000001 (NEEDED)                     Shared library: [libc.so]                  
 0x1d000000 (<unknown>: 1d000000)        0x278e                                     
 0x0000000c (INIT)                       0x4078b4
...

and wait what why's there that 1d in the MSB instead of in the LSB where we'd recognise it? Either gcc (or ld or something) is misgenerating the ELF tags, or something afterwards is trashing them. Long story short, it turns out that the "something" is patchelf, and that I am not the first person to find the bug.
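
For anyone who wants to see the arithmetic: 0x1d is the standard ELF value for DT_RUNPATH, and 0x1d000000 is what you get if the same four bytes are read in the opposite byte order. Here is a minimal Python sketch of that. The helper names swap32 and describe_tag are mine, made up for illustration; this is not code from patchelf or readelf, and it only assumes the tag got written with the wrong endianness somewhere along the way.

```python
import struct

DT_RUNPATH = 0x1d  # standard ELF dynamic tag for the library runpath


def swap32(value):
    """Reinterpret a 32-bit value with its four bytes in reverse order."""
    return struct.unpack(">I", struct.pack("<I", value))[0]


def describe_tag(tag):
    """Name a dynamic tag, flagging ones that only make sense byte-swapped.

    Only the three tags that appear in the readelf output are listed here.
    """
    known = {0x01: "NEEDED", 0x0c: "INIT", 0x1d: "RUNPATH"}
    if tag in known:
        return known[tag]
    if swap32(tag) in known:
        return known[swap32(tag)] + " (byte-swapped)"
    return "unknown"


print(hex(swap32(DT_RUNPATH)))   # the mystery tag from the broken binary
print(describe_tag(0x1d000000))
```

Feeding it the tag from the broken monit reports a byte-swapped RUNPATH, which fits with the dynamic linker never seeing a runpath at all and so only finding libz when LD_LIBRARY_PATH points at it.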

Given the patch in the PR (thanks UraniumKnight), it was comparatively simple to add it locally to my overlay, and now everything is working. I still can't run with exact nixpkgs master, but there are only two changes and I have submitted PRs for both: #42795 and #42794.

Next steps (ongoing): bring the mt300a config up to date so I can get cracking on replacing the OS on my primary internet router. Probably I should buy another one for this purpose so that I actually have a test device; I don't think the family will appreciate it if I kill the live one.