What's missing from this motorbike? The answer is shocking.
I removed the shock from my motorbike today so I can take it to
ABE tomorrow to be
rebuilt. Some notes for posterity and so that I remember how to
reinstall it.
I mostly followed the Haynes manual: the words are good but the
pictures are awful. They say to remove the fuel tank, but I didn't
really want to, on account of how it's full of fuel. I found it worked
to lift the tank a bit and stick some wooden blocks underneath to hold
it up. While doing this the vent/breather pipe popped off, as it
always does.
In the order that I tackled them:
- the reservoir needs to be removed: use a JIS screwdriver to loosen the strap around it and then just slide it out.
- to get to the nut/bolt at the lower end I had to loosen the rear hugger - two screws removed. I used a socket on the nut and a spanner on the bolt to stop it from spinning while I turned the nut.
- I can't see any way to get a socket onto the nut/bolt at the top end, but it eventually succumbed to two spanners. Reassembling this is going to be "fun" if it's finicky about torque settings.
- then it's "just" a matter of untangling everything to remove the shock and its reservoir. I had to unplug the connector for the stator cable, as there was no way to get the reservoir through the tangle otherwise.
Hopefully, having now written this down, I'll not forget to reattach all the bits.
I'll tell you a joke about UDP, but you might not get it.
We have a new name. "Thing I can plug into my motorbike ECU to log the
data (rpm, speed, throttle position, temperatures etc etc) it
produces" is Leonard-of-Quirm-level naming. I'd provisionally been
calling it "eculogical" which I didn't like, and now it's called
"eculocate" which I ... can tolerate.
And I've got it to the point where it (kind of) works - but, now I've
decided I need to semi-fundamentally break it again. I'll get to
that.
On the server side we have a UDP socket that listens for subscription messages
containing [(interval, table-number, start, end), ...] (actually
binary encoded) and then sends back the requested table data once
every interval milliseconds for the next minute. Then it stops,
because this is UDP and we can't reliably tell when the peer has gone
away, so the peer should send another subscription message in the
meantime if it wants to carry on receiving.
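For concreteness, here is one way a subscription message like that could be laid out. This is an invented encoding for illustration only - the field widths, byte order and one-byte count are my guesses, not the actual eculocate wire format:

```python
import struct

# Hypothetical encoding of [(interval, table-number, start, end), ...]:
# a one-byte record count, then one big-endian <HBBB> record per entry.
# Every detail here is invented; the real protocol may differ.
RECORD = struct.Struct(">HBBB")  # interval_ms, table, start, end

def encode_subscriptions(subs):
    """Pack a list of (interval_ms, table, start, end) tuples."""
    out = bytes([len(subs)])
    for interval_ms, table, start, end in subs:
        out += RECORD.pack(interval_ms, table, start, end)
    return out

def decode_subscriptions(data):
    """Inverse of encode_subscriptions: recover the tuples."""
    count, offset = data[0], 1
    subs = []
    for _ in range(count):
        subs.append(RECORD.unpack_from(data, offset))
        offset += RECORD.size
    return subs
```

A client would resend one of these before its minute of subscription runs out, which is the keepalive behaviour described above.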
For now we're just offering the raw tables, because I'm going to need
much more example data to figure out the structure. Eventually we'll
do some processing on device so that clients can query "RPM" or "TPS"
without having to know their table/offset - as that varies between
bike models.
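Once that on-device processing exists, the per-model lookup needn't be more than a table. Every name and number below is invented for illustration - the real values would come from the captured example data:

```python
# Hypothetical map from channel name to (table, offset, length) per
# bike model. All values here are made up for illustration.
CHANNEL_MAP = {
    "CB500X": {"RPM": (0x11, 0, 2), "TPS": (0x11, 4, 1)},
    "CBR650R": {"RPM": (0x20, 2, 2), "TPS": (0x20, 6, 1)},
}

def locate(model, channel):
    """Resolve a channel name to its table/offset/length for one model."""
    return CHANNEL_MAP[model][channel]
```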
Notes:
- the embassy-net IP stack (actually smoltcp) requires you to statically declare how many sockets you're going to use, which is fine once you know you have to.
And we have an Android client. Well, it's Android insofar as it runs
on my phone, but I don't think it'd qualify for Android Market or the
Play Store or whatever it's called now. I sidestepped the
whole Android app development slog by installing Termux and
Termux:GUI on my phone and writing the client side as a Python script. I don't even like Python and I still found this preferable to
the Android Studio build/run process: I simply sshed into my phone and
used tramp to edit the script. I believe that Termux:GUI doesn't
support the full range of Android widgets but it has buttons and
labels and text boxes and LinearLayouts which is enough for me. Adding
dns-sd (zeroconf) support was the work of about 20 minutes, which was nice.
Having achieved that milestone I made a list of what's left before I
can plug it into my motorbike and take it for a ride (cable, power
supply, some form of protective casing) and realised that once I
detach it from its USB umbilical I will no longer be able to release
new versions simply by invoking cargo run. So, it needs a mechanism
for OTA updates, and this should probably come with some kind of
auth[nz] so that not just any Tom, Dick or Harry on the same wifi
network could flash random crap onto it. Then I considered that if
we're not trusting the wifi, the actual UDP service (which is
currently read-only but maybe some day might include a means of
writing to (and therefore probably bricking) the ecu) is also
sensitive.
Here's the plan:
- we'll make an EdDSA key pair at build time and embed the public key in the binary
- build tooling will sign the release artefact with the key
- a TCP socket will listen for OTA update requests and verify the signature before writing to the flash
- the TCP socket will also listen for session key registrations (signed in the same way) and remember them for x hours (or until the ignition is turned off and we lose power)
- the UDP listener will reject subscription requests unless they come with a valid session key
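The crypto isn't written yet, but the "remember them for x hours" part of the plan is just a cache with expiry. A minimal sketch - the names and the four-hour default are mine, and a real version would verify the EdDSA signature on each registration before accepting it:

```python
import time

class SessionKeys:
    """Remember registered session keys until they expire.

    A real implementation would check the EdDSA signature on the
    registration message before calling register(); that's omitted here.
    """
    def __init__(self, ttl_seconds=4 * 3600, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock  # injectable for testing
        self.keys = {}      # key -> expiry time

    def register(self, key):
        self.keys[key] = self.clock() + self.ttl

    def is_valid(self, key):
        expiry = self.keys.get(key)
        if expiry is None:
            return False
        if self.clock() >= expiry:
            del self.keys[key]  # expired: forget it
            return False
        return True
```

Losing the whole table on power-off comes for free, since it only lives in RAM.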
Additionally, we need to change the dns-sd stuff to advertise a TCP
service, the client to register a session key when it starts, the
subscription message format to include the session key, and the UDP
listener to check it. Which is what I meant when I said
"semi-fundamentally break it".
If this were commercial/proprietary software then we'd have separate
keys for the firmware signing and for the client. That seems less of
an issue when it's most likely the same person building the software as
is using the client, but it might be worth doing anyway.
Current status: bodged together a TCP listener, haven't touched the crypto yet, and so far it only pretends to do the OTA update.
Time to jot down some of what I've been doing in the past month or so. The
tl;dr is "making a thing I can plug into my motorbike ECU to log the
data (rpm, speed, throttle position, temperatures etc etc) it
produces". For reasons mostly of ramifying the learning opportunities,
I decided the best way would be to get a cheap ESP-32 device (it's
RISC-V - isn't that cool?) and hook it up to a level converter, and
then write a program for it in Rust (Rust learning opportunity ahoy)
to twiddle the serial line appropriately and send the data over the
network to the mobile phone which sits on my handlebars.
It turns out that I spent way less time getting the serial interface
to the Honda K-line ECU signal
to reveal its secrets than on the "why don't you just ..." part where
I want to stream the data over wifi to another device. So this post is
actually not at all about hardware hacking.
The constraints I have imposed on myself here are:
- I do not want to hardcode my wifi ssid and password into the device (actually, I have at least two wifi networks I may want to use it with)
- the (yet to be written) data collection app should be able to find the device without hardcoding its IP address or requiring me to type it in - as the device is getting its address from DHCP, we don't even know what address it will get
These are both in principle solved problems.
There's a convention for provisioning wifi on these devices
which involves using a mobile phone app to connect to it using BLE
then sending the ssid/password of the chosen access point. In fact
there's even a prebuilt Android app which we can use and an esp32
arduino library which we can't (because we have elected to make our
lives difficult and use Rust instead). But I am led to believe that
"rewrite everything in Rust" is idiomatic for Rust programmers anyway.
I haven't done this yet.
And for the "what's my IP address" problem there is a standard way, by
combining Multicast DNS
and DNS-based Service Discovery, for
computers to publish their services on the LAN. When I say
"computers": if this household is typical, mostly they're set-top
boxes, printers, light bulbs, smart speakers and thermostats rather
than general-purpose computing devices. I've mostly done this bit.
Terms
Multicast DNS is DNS, but peer-to-peer: it reuses mostly the same packet formats
but instead of requiring a centralised server which knows all the
names, every device listens on a multicast address
for DNS queries for its own name.
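To make that concrete, here's roughly what an mDNS query looks like on the wire. This is my own minimal stdlib sketch, not code from the project, and the hostname is invented; a real querier also handles responses, retries and the rest:

```python
import socket
import struct

MDNS_GROUP, MDNS_PORT = "224.0.0.251", 5353
TYPE_PTR, CLASS_IN = 12, 1

def encode_name(name):
    """DNS name encoding: each label length-prefixed, then a zero byte."""
    return b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"

def build_query(name, qtype=TYPE_PTR):
    # mDNS header: query id is always 0, flags 0, one question, no
    # answer/authority/additional records
    header = struct.pack(">HHHHHH", 0, 0, 1, 0, 0, 0)
    question = encode_name(name) + struct.pack(">HH", qtype, CLASS_IN)
    return header + question

def send_query(name):
    """Multicast the query; responses arrive on the same socket."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.sendto(build_query(name), (MDNS_GROUP, MDNS_PORT))
    return sock
```

That length-prefixed label encoding, incidentally, is exactly the sort of thing that goes quietly wrong in record data, as I discovered later.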
DNS-SD is a convention for which records you can query/need to send in
order to advertise what kind of services you have and where they are.
Because sending an A record alone is not sufficient for anyone with a
Mac and a fancy-schmancy service browser to know what kind of service
is on offer at that address. Is it a printer? A dishwasher? An
IoT air fryer?
The RFCs for each (which are, by the way, much easier reads than a lot
of RFCs and contain no EBNF at all) go to great lengths to point out
that each is independent of the other. But they stack well.
DNS-SD, 3048 metre view
DNS-SD is based on a paradigm of "services" and "service instances". A
"service" is the general "kind" of thing on offer and is named
something like _http._tcp.local - it will always end in _tcp.local
if it is TCP or _udp.local if it is anything other than TCP. For our
ECU project we chose the service name _keihin._udp.local after the
manufacturer of the ECUs that the device knows how to talk to. A
service instance might be something like
WiserHeat05AB12._http._tcp.local. Service names aren't usually
hierarchical but there are a few with a second level like _printer._sub._http._tcp.
The minimum/usual set of records you need to publish for DNS-SD is
this (pseudocode)
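Something like the following, in zone-file style - the instance name, host name, port and address are invented for illustration:

```
; enumeration -> service -> instance -> host
_services._dns-sd._udp.local.  PTR  _keihin._udp.local.
_keihin._udp.local.            PTR  eculocate._keihin._udp.local.
eculocate._keihin._udp.local.  SRV  0 0 2000 eculocate.local.
eculocate._keihin._udp.local.  TXT  ""
eculocate.local.               A    192.168.1.23
```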
Your service instance needs a SRV and a TXT; then there's a PTR connecting the service to the service instance, for people who are browsing the service - think of e.g. an "Add a printer" dialog box; then there's a PTR from _services._dns-sd._udp.local to the service name, for people who are running avahi-browse -a or its moral equivalent in GUI-land. And not forgetting there's an A record matching the hostname in the SRV record data.
Note: _services._dns-sd._udp.local is the right name for discovery even if your service is TCP - there is no _tcp variant of it.
Note: I am assuming .local is the suffix, which is likely true for MDNS but probably not if you are using DNS-SD with regular DNS.
MDNS
The single biggest problem when implementing MDNS is the lack of
tooling to test it against. In my experience:
- dig: historically, you used to be able to dig @224.0.0.251 -p 5353 name.local but it largely worked by accident and now it doesn't. Note that even when it did work it wasn't sending the same packets as a real MDNS query.
- avahi-browse: e.g. avahi-browse -v -a shows all the services on the LAN. Note that it is a frontend to the persistent avahi-daemon and there is caching happening in there somewhere, so if it didn't work then and you made a change and restarted your service ... is your service still broken or did it not reissue the query? shrug-emoji.gif
- mquery from Jeremie Miller's mdnsd: note that this has been forked into a million pieces. The one I'm using is https://github.com/troglobit/mdnsd. This works for querying the _services._dns-sd._udp.local discovery name but if you run e.g. mquery _scanner._tcp.local it appears to send the query and then sit there silently ignoring the responses. But: it doesn't cache as avahi-browse does, so that's good.
- wireshark, of course. Wireshark is pretty good, but note that it will display your replies as "Unsolicited" because the query id for MDNS is (per the standard) 0, so there is no way for it to correlate them with requests.
- mdns-debugger was handy for pointing out my TTLs (and a lot of other TTLs) were wrong. It didn't point out that my PTR record data was incorrectly encoded and was therefore naming a nonexistent A record, which was a source of much hair tearing.
- there are a couple of Android apps I also used, mostly to see what they'd do when nothing was working (see "hair tearing" above) and I was out of ideas: "mDNS Discovery" (com.mdns_discovery.app) and "Service Browser" (com.druk.servicebrowser). The first one is prettier; the second one very helpfully rendered the errant PTR with a backslash, as eculogical\.local, and so led me to the said encoding error.
Where are we now?
I believe that it now does everything an mdns responder SHOULD(sic) do
except
- compress labels in response data
- ignore queries for records where the record we'd send is already in the answers section of the query message
- NSEC
and I can't decide, in the context of this being a program that
probably nobody else in the world will ever use and even I will only
use on one single piece of hardware (I only have one motorbike)
whether implementing those things is a good and laudable decision
because spec compliance is important, or just a way of further putting
off the inevitable next step which involves writing the Android app
to collect the data.
It also could do with being extracted into its own module/crate/thing
to be more modular. I'd say "to aid reuse" but I don't think anyone
really wants to (or should want to) reuse my novice-level Rust code.
Learning in public.
Centralised logging with Liminix and VictoriaLogs
It's a year since I wrote Log off, in which I
described some ongoing-at-the-time work to make Liminix devices log over the
network to a centralised log repo. It's also, and this is entirely a coincidence,
a year since I made any kind of progress on it: since that time all my log messages have continued to be written to a ramdisk that will be lost forever like tears in the rain.
This situation was not ideal. I had some time and energy recently to
see if I could finish it up and, well, I haven't done that exactly but
whereas last time I only believed it was substantially finished,
this time I believe it is substantially finished.
It goes a little something like this:
Tap the log pipeline
Each service in Liminix is connected to its own log process, which is
(for 98% of the services) connected to the "fallback logger" which
writes the logs to disk (ramdisk) and takes care of log rotation
etc. This is standard s6 stuff, we're not innovating here.
Into the middle of this pipeline we insert a program called
logtap
which copies its input to its output and also to a fifo - but only
writes to the fifo if the previous writes worked (i.e. it doesn't back
up or stop working if the tap is not connected). The standard output from
logtap goes on to the default logger, so local logging is unaffected -
which is important if the network is down or hasn't come up yet.
This is a change from last year's version, which used a unix domain
socket instead of a fifo. Two reasons: first, we need to know which
messages were sent successfully and which weren't. It was difficult to
tell reliably and without latency whether there was anything at the
other end of the socket, whereas we learn almost instantly when a fifo
write fails. Second, it makes it easier to implement a shipper because
it can just open the fifo and read from it, instead of having to call
socket functions.
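The fifo trick can be sketched in a few lines. This is my illustration of the "drop it if nobody is reading" behaviour, not logtap itself - a real implementation would hold the fifo open and notice write failures, rather than reopening it per line:

```python
import errno
import os
import sys

def tap_write(fifo_path, data):
    """Best-effort copy to the fifo: drop the data if no reader exists."""
    try:
        fd = os.open(fifo_path, os.O_WRONLY | os.O_NONBLOCK)
    except OSError as e:
        if e.errno == errno.ENXIO:  # fifo exists but nobody is reading
            return False
        raise
    try:
        os.write(fd, data)
        return True
    finally:
        os.close(fd)

def main(fifo_path):
    """Copy stdin to stdout unconditionally, and to the fifo if possible."""
    for line in sys.stdin.buffer:
        sys.stdout.buffer.write(line)  # local logging always happens
        sys.stdout.buffer.flush()
        tap_write(fifo_path, line)     # shipping is best-effort
```

The key property is in tap_write: the O_NONBLOCK open fails immediately with ENXIO when there's no reader, so the pipeline never backs up waiting for the shipper.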
Hang a reader on the tap
The log shipper opens the other end of the fifo and ... ships the
logs. I've chosen
VictoriaLogs
(wrapped in an HTTPS reverse proxy) as my centralised log service, so
my log shipper has to connect with HTTPS to the service endpoint and
send "jsonline" log messages. In
fact, my log shipper just speaks pidgin HTTP on file descriptors 6 and 7 and leverages
s6-tlsclient
to do the actual TCP/TLS heavy lifting.
This is all new since last year when we were just splatting raw logs
over a socket connection instead of doing this fancy JSON stuff. It
did mean writing a parser for TAI64N external timestamps and some
functions to convert them to UTC: as a matter of principle (read:
stubbornness) I do appreciate that my log message timestamps won't go
forwards and backwards arbitrarily when leap seconds are decreed, but
I guess almost nobody else (at least, neither VictoriaLogs nor Zinc)
thinks it's important.
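For the curious, TAI64N's external form is "@" followed by 24 hex digits: 8 bytes of seconds offset by 2^62, then 4 bytes of nanoseconds. A minimal parser is short; note that the hardcoded 37-second TAI-UTC offset below is only right from 2017 until the next leap second is decreed, so a proper converter needs a leap-second table (which is the whole point):

```python
from datetime import datetime, timedelta, timezone

TAI64_BASE = 1 << 62  # TAI64 labels are offset by 2**62
TAI_UTC_OFFSET = 37   # valid since 2017-01-01; really needs a table

def parse_tai64n(label):
    """Parse '@' + 24 hex digits into (tai_seconds, nanoseconds)."""
    if not (label.startswith("@") and len(label) == 25):
        raise ValueError("not an external TAI64N timestamp")
    seconds = int(label[1:17], 16) - TAI64_BASE
    nanos = int(label[17:25], 16)
    return seconds, nanos

def tai64n_to_utc(label):
    """Approximate UTC conversion using a fixed TAI-UTC offset."""
    seconds, nanos = parse_tai64n(label)
    return datetime.fromtimestamp(
        seconds - TAI_UTC_OFFSET, tz=timezone.utc
    ) + timedelta(microseconds=nanos // 1000)
```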
Before the log shipper can start, it needs to get its TLS client
certificate, by making a CSR and sending it to
Certifix. The
certifix-client
is almost the same as last year's version except that it uses
lua-http instead of
fetch-freebsd as the http
interface. This is because last year's version didn't work
when asked to traverse the baroque maze of iptables forwarding and
QEMU Slirp networking that lies between my Liminix test network and my
VictoriaLogs instance. After a long time staring at pcap dumps I gave
up trying to work out why and just rewrote that bit.
It's important to have an (at least vaguely) accurate clock before
attempting HTTPS, because the server certificate has a "not valid
before" field, so OpenSSL won't like it if you say it's still 1970.
Originally I planned to put a Let's Encrypt cert in front of
VictoriaLogs, but that would need 500k of CA certificate bundle on
each device, which is quite a lot on devices with little flash. So it
makes more sense to use the Certifix CA here too.
Persuading the OpenSSL command line tools to make a CSR with a
challengePassword was probably as much work as writing something
with luaossl would have been - it was certainly messier - but the
point is I didn't know that when I started.
# in nixos configuration.nix
systemd.services."loghost-certificate" =
let
dir = "/var/lib/certifix";
pw = builtins.readFile "${dir}/private/challengePassword";
in {
script = ''
set -eu
cd ${dir}
PATH=${pkgs.openssl}/bin:${pkgs.curl}/bin:$PATH
openssl req -config <(printf '[req]\nprompt=no\nattributes=attrs\ndistinguished_name=DN\n[DN]\nC=GB\nST=London\nO=Example Org\nCN=loghost\n[attrs]\nchallengePassword=${pw}') -newkey rsa:2048 -addext "extendedKeyUsage = serverAuth" -addext "subjectAltName = DNS:loghost.lan,DNS:loghost,DNS:loghost.example.org" -nodes -keyout private/loghost.key -out certs/loghost.csr
curl --cacert certs/ca.crt -H 'content-type: application/x-pem-file' --data-binary @certs/loghost.csr https://localhost:19613/sign -o certs/loghost.crt
'';
serviceConfig = {
Type = "oneshot";
User = "root";
ReadWritePaths = ["/var/lib/certifix"];
StateDirectory = "certifix";
};
startAt = "monthly";
};
The proxy itself is just Nginx with ssl_verify_client set, but
certifix-client holds the https connection open so remember to disable
proxy buffering or you aren't getting your logs in any kind of timely
fashion.
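For reference, the relevant Nginx knobs look something like this - the paths and names are placeholders, and I'm assuming VictoriaLogs on its default port of 9428:

```nginx
server {
    listen 443 ssl;
    server_name loghost.example.org;

    ssl_certificate        /var/lib/certifix/certs/loghost.crt;
    ssl_certificate_key    /var/lib/certifix/private/loghost.key;
    ssl_client_certificate /var/lib/certifix/certs/ca.crt;
    ssl_verify_client      on;

    location / {
        proxy_pass http://127.0.0.1:9428;
        # the log connection is held open and streamed, so buffering
        # would delay log delivery indefinitely
        proxy_buffering off;
        proxy_request_buffering off;
    }
}
```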
Just as I did last year, I'm going to finish by claiming that this is
basically finished and it just needs installing on some real
devices. Hopefully I'm right this time, though.
This is something of a hobby horse of mine, so forgive the rant: when
I see something has been "sanitized" I treat it as a code smell
(per Martin Fowler, "... a surface indication that usually corresponds to a deeper problem in the system"), and
often find it reveals sloppy thinking which may not even prevent the
exploits it is supposed to guard against.
Each data item in your system is a value, which has a canonical
representation inside your system but may be represented in multiple
different external formats at the boundaries of your system.
When we say "sanitize" we imply that the input data was "insanitary"
(or even "insane", same etymological root I think) but it really
probably wasn't - it just didn't conform to the rules of some
particular representation you had in mind that you would later need to
output. So why is that particular representation special? Should
"sanitizing" strip out backticks (specal in shell)? The semicolon
(special in SQL)? The angle brackets (HTML)? The string +++ (Hayes
modem commands)? .. (pathnames)? ` The dollar sign (bound to be used somewhere)?
Non-ASCII unicode characters (can't put those in a domain name)?
Don't "sanitize". Encode and decode between the canonical internal
representation and the external representation you need to interface
with. Mr O'Leary will be happy, Sigur Rós will appreciate you've
spelled their name right, and Smith & Sons, Artisan Greengrocers
won't have their ampersand dropped.
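The standard library of whatever you're using almost certainly has the right encoder for each boundary already. In Python, for instance (other ecosystems have equivalents):

```python
import html
import shlex
import sqlite3

# HTML boundary: escape on the way out, and it round-trips losslessly
name = "Smith & Sons, Artisan Greengrocers"
assert html.escape(name) == "Smith &amp; Sons, Artisan Greengrocers"
assert html.unescape(html.escape(name)) == name

# Shell boundary: quote the value rather than banning apostrophes
assert shlex.split(shlex.quote("O'Leary")) == ["O'Leary"]

# SQL boundary: bound parameters, not string surgery
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE customer (name TEXT)")
con.execute("INSERT INTO customer VALUES (?)", ("Sigur Rós",))
assert con.execute("SELECT name FROM customer").fetchone()[0] == "Sigur Rós"
```

In every case the value is stored in its canonical form and only encoded at the boundary, so nothing needs "sanitizing" on the way in.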