Blogging on logging#
Fri Jun 14 07:34:20 2019
Before I put NixWRT on my primary internet connection, I want to deploy this wireless range extender, which means unplugging the serial connection. So, I really need to make it send syslog over the network.
"Just [* ] install syslogd", you say, "how hard can it be?". There's a syslogd applet in Busybox, all I need is something elsewhere on the LAN to receive the messages (hint: not Journald ). But, before I can send log messages over the network, I need the network to be available:
- the DHCP client on br0device has been allocated an address
- which means the bridge between wireless and wired is up
- which means hostapdis running
- and swconfighas run
- which means monit(process supervisor) has started
- which means the kernel booted successfully and started userland
Which in sum is about 96% of everything it needs to do when it boots up, and whether that all goes to plan or not, any log messages it generates will be lost like tears in rain. Forgotten like a politician's manifesto commitments. Cast to the ground like a toddler's breakfast. I wanted something a bit more comprehensive, and didn't want to write the messages to flash because I'm not using any writable flash filesystem.
Avery Pennaruns blog article The log/event processing pipeline you can't have is not only a fun read about log processing at scale, but contains a really neat solution to this problem: instead of having klogd suck kernel messages and push them into syslogd which spits them out to files on disk, why not run the pump in reverse and push all the userland log messages into the kernel printk buffer? We have lots of RAM (I mean, by comparison with how much flash we can spare) and can squirrel all the boot time messages away until the network is up and ready to send them to the loghost.
Copying from /dev/log to /dev/kmsg is either
simple
or as complicated as you care to make it
(the Google Fiber app has clearly encountered and addressed a
whole bunch of real-world issues I haven't had to deal with).
Sending messages from the kernel printk buffer is slightly more complicated but only in the incidentals not in the essentials. We need to
/dev/kmsg and parse the strings we get. The format is
documented
and the only challenge here is that for reasons of keeping the system
size down I felt obliged to write it in C, which is peculiarly
suitable for text processing tasks.  Emphasis here on "peculiar" not
"suitable".  Slightly annoyingly, the log entries don't include a
timestamp field which can be related to time of day: instead they use
a monotonic timestamp which is probably the number of seconds since
boot but doesn't account for time while the system is suspended.
There's only one other thing to say here, which is that the lines you
write into /dev/kmsg are whatever you want to write.  My
klogcollect just writes more or less anything that turn up in
/dev/log, without attempting to parse it, which means whatever format
the C library syslog() function writes.  So when forwarding the
messages I have to check the message origin (kernel or userland)
before deciding whether to format it for RFC5424 or whether to wing it as-is.  
Summarising
After the monolog(ue), the epilog(ue)...
klogforward is on Github, and you can have this working in your NixWRT by adding
(syslog { loghost = "loghost.example.com" ; })
to your modules.
When I next come back to this I am going to play with the PRINTK_PERSIST patch that Avery talks about, and also I should have a proper look at logos - it probably solves problems I haven't seen yet. But more pressing right now is working out why I can't reboot my wireless extender without freezing every other device on the switch it's attached to for a minute. Wireshark says it's "MAC PAUSE" frames, and my working hypothesis is a switching loop.
