diary @ telent

Flash is fast, flash is cool

Sun Jan 25 09:05:44 2026

Topics: eculocate esp32 rust

This week has been about implementing OTA flashing.

The ESP32 supports an A/B partitioning scheme for the application partition, so you can safely install a new firmware without destroying the firmware that's currently running. Having booted from ota_1, you write the new image to partition ota_0 and then flip the active boot partition; on the next update you do the opposite. Rust support for this is in the esp_hal_ota crate, but if you want to understand what it's doing you should first read the official Espressif C API documentation.

It's pretty easy to use: as per the example, you make an Ota object, call ota_begin on it, and then call ota_write_chunk every time you get some new data (from the network or from BLE or from the USB stick or wherever you are getting the update from) until the whole thing's transferred. Then you call ota_flush at the end.

It does/did seem to have a bug (as far as I can tell): if you turn off the builtin CRC checking (because you are using a reliable network, or you have some other form of integrity check), it will try to do it anyway. It's a one-line fix which I will open a PR for as soon as I'm a bit more convinced I haven't misunderstood the whole thing.

Anyway, I plugged it in, hooked it up and ... it worked. It was however a lot slower than flashing using espflash over USB: minutes rather than seconds. This is because it (more precisely, the API afforded by embedded-storage) takes a quite naive approach to managing erase blocks. If you get 1187 bytes from the network and ask embedded-storage to write it, it will read the whole erase block (4096 bytes) into RAM, erase it - you can only erase in units of one block - and then write the modified block back. Since you've just written less than an entire block, the next packet will cause it to erase the same block again to write a different part of it. You're probably erasing each block two or three times.

Rather than change esp-hal-ota I decided to take the coward's way out and buffer data in my application until I had a block's worth. Suddenly it got massively faster.
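The buffering idea can be sketched roughly as follows. This is a minimal illustration and not the code from the project: BlockBuffer, push and finish are hypothetical names, and the write closure is a stand-in for whatever actually writes to flash (for example a call to ota_write_chunk).

```rust
/// Erase-block size quoted above.
const BLOCK_SIZE: usize = 4096;

/// Hypothetical helper: collects arbitrarily sized chunks and only hands
/// whole erase blocks to the flash writer.
struct BlockBuffer {
    buf: [u8; BLOCK_SIZE],
    len: usize,
}

impl BlockBuffer {
    fn new() -> Self {
        BlockBuffer { buf: [0; BLOCK_SIZE], len: 0 }
    }

    /// Append `data`, calling `write` once for every block that fills up.
    /// `write` is an assumed callback standing in for the real flash write.
    fn push<E>(
        &mut self,
        mut data: &[u8],
        mut write: impl FnMut(&[u8]) -> Result<(), E>,
    ) -> Result<(), E> {
        while !data.is_empty() {
            // Copy as much as fits into the space left in the current block.
            let n = (BLOCK_SIZE - self.len).min(data.len());
            self.buf[self.len..self.len + n].copy_from_slice(&data[..n]);
            self.len += n;
            data = &data[n..];
            if self.len == BLOCK_SIZE {
                write(&self.buf[..])?;
                self.len = 0;
            }
        }
        Ok(())
    }

    /// At the end of the transfer, write out the final partial block, if any.
    fn finish<E>(
        &mut self,
        mut write: impl FnMut(&[u8]) -> Result<(), E>,
    ) -> Result<(), E> {
        if self.len > 0 {
            write(&self.buf[..self.len])?;
            self.len = 0;
        }
        Ok(())
    }
}
```

With 1187-byte packets, the first three calls to push write nothing, and the fourth writes one full 4096-byte block and leaves 652 bytes buffered, so each erase block is touched once instead of once per packet.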

The other feature this week was authentication: the device should only accept legitimate (read: cryptographically signed) firmware. The constraint here is that we can't read the whole firmware into RAM before verifying the signature - there's 384K of RAM and 4MB of flash, it's not going to fit. Our self-imposed additional constraint is that we don't want to write the firmware to flash as we go along and then validate the signature at the end, because if it's wrong then we've already overwritten a working firmware with malicious data. I say this is self-imposed because it's not actually obvious that it's a real problem unless there's some way for the bad actor to switch to the new firmware, but it doesn't seem pretty even so.

So tl;dr we need to verify the firmware before we've read it all, and before we've written any of it. We do this using the approach explained by Gennaro and Rohatgi in How to Sign Digital Streams - the first block is digitally signed and each block contains a sha256 of the following block.
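Here is a rough sketch of that chained-hash idea, under several assumptions that are not from the project: each block is laid out as payload followed by the 32-byte digest of the next block, the final block carries an all-zero placeholder, and the names ChainVerifier, build_chain and split_block are hypothetical. The hash function is passed in as a parameter (on the device it would be SHA256), and the signature check on the first block is assumed to have happened already and is not shown.

```rust
const HASH_LEN: usize = 32;
type Digest = [u8; HASH_LEN];

/// Split a block into (payload, digest of the following block).
/// Assumed layout: payload bytes, then HASH_LEN digest bytes.
fn split_block(block: &[u8]) -> Option<(&[u8], Digest)> {
    let cut = block.len().checked_sub(HASH_LEN)?;
    let (payload, tail) = block.split_at(cut);
    let mut next = [0u8; HASH_LEN];
    next.copy_from_slice(tail);
    Some((payload, next))
}

/// Sender side: append to each payload the digest of the block after it.
/// Each digest depends on the later block, so the chain is built back to front.
fn build_chain(payloads: &[&[u8]], hash: impl Fn(&[u8]) -> Digest) -> Vec<Vec<u8>> {
    let mut next: Digest = [0; HASH_LEN]; // placeholder carried by the last block
    let mut blocks = Vec::with_capacity(payloads.len());
    for payload in payloads.iter().rev() {
        let mut block = payload.to_vec();
        block.extend_from_slice(&next);
        next = hash(&block);
        blocks.push(block);
    }
    blocks.reverse();
    blocks
}

/// Receiver side: remembers the digest the next block must match.
struct ChainVerifier<H> {
    hash: H,
    expected: Digest,
}

impl<H: Fn(&[u8]) -> Digest> ChainVerifier<H> {
    /// `signed_first_block` must already have passed its signature check
    /// (not shown here); we only trust the digest it carries.
    fn new(hash: H, signed_first_block: &[u8]) -> Option<Self> {
        let (_, expected) = split_block(signed_first_block)?;
        Some(ChainVerifier { hash, expected })
    }

    /// Returns the payload if the block matches the expected digest,
    /// or None (leaving the state unchanged) if it does not.
    fn accept<'a>(&mut self, block: &'a [u8]) -> Option<&'a [u8]> {
        if (self.hash)(block) != self.expected {
            return None;
        }
        let (payload, next) = split_block(block)?;
        self.expected = next;
        Some(payload)
    }
}
```

The point of the design is that a block's payload is only handed on (to be written to flash, say) after accept has returned it, so only one block at a time needs to be held in RAM.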

We use ed25519-dalek for signature checking, because it looked easy to get running with no_std and I've heard of at least two of the authors. For SHA256 - it turns out that the ESP32-C3 has a hardware SHA accelerator, so we use esp_hal::sha here and don't need to do it in software.

I bashed my head against this for a while because I didn't read the F manual closely enough. hasher.update doesn't eat all the bytes that you feed it, and it returns the unconsumed data - so you have to run it in a loop until it comes back empty. If you just call it once, as I did, you get the hash of the first 32 bytes only.
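The shape of that loop, as a sketch that runs on a desktop rather than on the chip: PartialHasher and MockHasher are hypothetical stand-ins that only mimic the behaviour described above (consume some bytes, return the rest). The real esp_hal::sha types and return values differ in detail, and the 32-byte limit in the mock is chosen purely for illustration.

```rust
/// Hypothetical stand-in for a hasher whose `update` may consume only
/// part of its input and returns whatever it did not consume.
trait PartialHasher {
    fn update<'a>(&mut self, data: &'a [u8]) -> &'a [u8];
}

/// Keep calling `update` until nothing is left over.
/// Assumes every call on non-empty input consumes at least one byte.
fn update_all<H: PartialHasher>(hasher: &mut H, mut data: &[u8]) {
    while !data.is_empty() {
        data = hasher.update(data);
    }
}

/// Mock that accepts at most 32 bytes per call and records what it saw,
/// so the difference between one call and the loop is visible.
struct MockHasher {
    seen: Vec<u8>,
}

impl PartialHasher for MockHasher {
    fn update<'a>(&mut self, data: &'a [u8]) -> &'a [u8] {
        let n = data.len().min(32);
        self.seen.extend_from_slice(&data[..n]);
        &data[n..]
    }
}
```

Calling update once on 100 bytes leaves the mock having seen only the first 32, whereas update_all feeds it all 100.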

(In the end, this - it seems to me - doesn't protect a lot better against wiping good images than the verify-at-the-end approach. If a MITM is able to substitute block 22 of 150 blocks with their own data, we will write blocks 1-21 and then abort, whereas verifying at the end means we'll write blocks 1-150 but not switch to the new image. Either way we've trashed whatever was there before.)

That's basically all there is to relate this week, although I also invested a little more time reading bits of the Rust Book that our Rust study group at $work hasn't reached yet, in order to make the error handling less crappy. Next steps are to write the session registration thing so that UDP is authenticated, and to add wifi provisioning so we're not hardcoding my wifi network details.