I rewrote my blog engine (again): welcome, Cripslock#
Sat Jul 26 20:34:36 2025
'In my experience Miss Cripslock tends to write down exactly what
one says,' Vetinari observed. 'It's a terrible thing when
journalists do that. It spoils the fun. One feels instinctively that
it's cheating, somehow'
In preparation for rearranging and pouring in a bunch of fediverse
threads from an archive of the recently defunct Pleroma server at
brvt.telent.net, I have rewritten (again) the software that shows you
this blog.
It was that or figure out how to rebuild and reverse-engineer the
ten-year-old Clojure program it replaces. New features are the
topic/keyword/tag thing you see in the sidebar on the right, and a new
syntax for making links between one page and another: my intention is
that it's going to get more "timeless" posts (mostly, for the
moment, about motorcycles), and so I need better ways to expose that
stuff. Something a bit like a Bliki.
(A planned new feature is site search, because Google isn't what it
once was)
Visually I carried forward all the CSS from the old site so you
probably won't notice much difference there. The service itself is
written in Fennel because... well, mostly because why not? I had to
write a Textile parser again, using an ugly combination of Lua
patterns and LPeg, but Markdown is provided by
lcmark.
Points of note:
my mental model of Textile is divergent in several ways from the
Textile spec's model of Textile, and as the rôle of Textile in
this software is to display posts that I wrote, my interpretation
is deemed locally correct. This is another reason the Textile
processing is home-grown and I didn't just, say, use Pandoc to turn
it into Markdown.
cqueues is fun. Cripslock has an HTTP server and also an inotify
thing so it can refresh when posts are added/changed, and it's quite
slick to be able to put them in the same event loop. (The side
project that I temporarily put down to write Cripslock - which is
itself a side project from Liminix, which is my hobby when I am not
at $WORK - mashes up cqueues and glib in an inelegant way, so it was
neat to see how it should work.)
the initial topic tagging was done with grep and sed and is
probably quite low-quality
in the ~ 25 years since I started writing this blog (ever since the
first post insisting that it was in fact
not a blog - hindsight is
wonderful, no?) I have never specified licence terms except to add a
copyright notice. So, technically, I suppose, that would mean no
copying, and arguably no copying would also mean no reading, because
it's in the nature of transmitting it across the internet that
copies end up being made at both ends and quite likely also in the
middle. I intend to rectify this, and also, if I can find some
well-written canned licences that meet my needs, to make it explicit
that I am not permitting its use for training LLMs.
The European Union must keep funding free software#
Tue Jul 16 21:56:18 2024
This is an open letter initially published in French by the Petites Singularités association. To co-sign it, please publish it on your website in your preferred language, then add yourself to this table.
Thanks to Jeremiah Lee for
the easy-to-copy-paste HTML version.
Since 2020, the Next Generation Internet (NGI) program, part of the European Commission’s Horizon program, has funded free software in Europe using a cascade funding mechanism (see for example NLnet’s calls). This year, according to the Horizon Europe working draft detailing funding programs for 2025, we noticed the Next Generation Internet is not mentioned as part of Cluster 4.
NGI programs have demonstrated their strength and importance in supporting the European software infrastructure, as a generic funding instrument to fund digital commons and ensure their long-term sustainability. We find this transformation incomprehensible, especially as NGI has proven efficient and economical in supporting free software as a whole, from the smallest to the most established initiatives. This ecosystem diversity backs the strength of European technological innovation. Maintaining the NGI initiative to provide structural support to software projects at the heart of worldwide innovation is key to enforcing the sovereignty of a European infrastructure. Contrary to common perception, technical innovations often originate from European rather than North American programming communities and are mostly initiated by small-scale organizations.
The previous Cluster 4 allocated 27 million euros to:
“Human centric Internet aligned with values and principles commonly shared in Europe”
“A flourishing Internet, based on common building blocks created within NGI, that enables better control of our digital life”
“A structured eco-system of talented contributors driving the creation of new Internet commons and the evolution of existing internet commons”
In the name of these challenges, more than 500 projects received NGI0 funding in the first 5 years, backed by 18 organizations managing these European funding consortia.
NGI contributes to a vast ecosystem, as most of its budget is allocated to fund third parties by the means of open calls, to structure commons that cover the whole Internet scope—from hardware to application, operating systems, digital identities, or data traffic supervision. This third-party funding is not renewed in the current program, leaving many projects short on resources for research and innovation in Europe.
Moreover, NGI allows exchanges and collaborations across all the EU zone, as well as “widening countries”1, currently both a success and a work in progress, like the Erasmus program before it. NGI0 also contributes to opening and maintaining longer relationships than strict project funding does. It encourages the implementation of funded projects through pilots and supports collaboration within initiatives, as well as the identification and reuse of common elements across projects, interoperability in identification systems and beyond, and the establishment of development models that integrate other sources of financing at different scales in Europe.
While the USA, China, or Russia deploy huge public and private resources to develop software and infrastructure that massively capture private consumer data, the EU can’t afford this renunciation. Free and open source software, as supported by NGI since 2020, is by design the opposite of potential vectors for foreign interference. It lets us keep our data local and favors a community-wide economy and know-how, while allowing international collaboration. This is all the more essential in the current geopolitical context: the challenge of technological sovereignty is central, and free software makes it possible to respond to it while acting for peace and sovereignty in the digital world as a whole.
In this perspective, we urgently ask you to call for the preservation of the NGI program in the 2025 funding program.
1 As defined by Horizon Europe, widening Member States are Bulgaria, Croatia, Cyprus, Czechia, Estonia, Greece, Hungary, Latvia, Lithuania, Malta, Poland, Portugal, Romania, Slovakia, and Slovenia. Widening associated countries (under condition of an association agreement) include Albania, Armenia, Bosnia, Faroe Islands, Georgia, Kosovo, Moldova, Montenegro, Morocco, North Macedonia, Serbia, Tunisia, Türkiye, and Ukraine. Widening overseas regions are Guadeloupe, French Guiana, Martinique, Réunion, Mayotte, Saint-Martin, the Azores, Madeira, and the Canary Islands.
I responded to a LinkedIn post
about refactoring code the other day, referencing the J. B. Rainsberger
"Simple Code Dynamo", and I don't think I said all I wanted to say:
specifically about why I like it.
A common refrain I hear when I talk with other developers about
refactoring is that it has low or no value because it's subjective.
You like the inlined code, I like the extracted method. He likes the
loop with a repeated assignment, they like the
map/reduce. Tomato/tomato. Who's to say which is better objectively?
If you consult a catalog of refactoring techniques you'll
see that approximately half of them are inverses of the other half, so
technically you could be practicing "refactoring" all day and end up
with the code in exactly the same place as where you started. And
that's not delivering shareholder value/helping bring about the
Starfleet future (delete as applicable).
What does the Simple Code Dynamo bring to this? We start with a
definition of "Simple" (originally from Kent Beck):
passes its tests
minimises duplication
reveals its intent
has fewer classes/modules/packages
[Kent] mentioned that passing tests matters more than low duplication, that low duplication matters more than revealing intent, and that revealing intent matters more than having fewer classes, modules, and packages.
This is the first reason I like it: it gives us a definition of
"good". It's not an appeal to authority ("I made it follow the
'Strategy' pattern, so it must be better now") or a long list of "code
smells" that you can attempt to identify and fix, it's just four short
principles. And actually for our purposes it can be made shorter. How?
Let's regard rule 1 ("Passes tests") as an invariant, pretty much: if
the code is failing tests, you are in "fix the code" mode, not in
"refactor" mode. So we put that to one side.
What about rules 2 and 3? There has been much debate over their
relative order, and the contention in Rainsberger's article is that it
turns out not to matter, because they reinforce each other:
when we work on removing duplication, often we create new functions
with the previously duplicated code and then call them from both
sites. These functions need names and often our initial names are
poor. So ...
when we work on improving names, we start seeing the same names
appearing in distant places, or unrelated names in close
proximity. This is telling us about (lack of) cohesion in our code,
and fixing this leads to better abstractions. But ...
now we're working at a higher level of abstraction, we start
noticing duplication all over again - so, "GOTO 10" as we
used to say.
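To make that concrete, here's a toy illustration in shell (names and
paths entirely hypothetical) of one trip around the loop:

# before: the same incantation at two call sites
tar czf /backup/etc-$(date +%F).tar.gz /etc
tar czf /backup/home-$(date +%F).tar.gz /home

# remove the duplication; the first name we reach for is usually poor
do_stuff() {
    tar czf "/backup/$(basename "$1")-$(date +%F).tar.gz" "$1"
}

# improve the name, and the abstraction (a dated archive of a
# directory) reveals itself - at which point we start noticing
# duplication and poor names one level up, and around we go
archive_with_date() {
    tar czf "/backup/$(basename "$1")-$(date +%F).tar.gz" "$1"
}
archive_with_date /etc
archive_with_date /home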
You should read the article if you haven't, because the point of this
blog post is not to paraphrase it but to say why I like it, and this
is the second - and in some ways, the more powerful - reason: it gives
us a place to start. Instead of being reduced to looking despairingly
at the code and saying where do I even begin? (often
followed 15 minutes later with "I want to kill it with fire"), we have
an in. Fix a poor name. Extract a method to remove some
duplication. You aren't choosing to attack the wrong problem first,
because they're both facets of the same problem.
For me this is, or was, liberatory, and I share it with you in case it
works for you too.
(For completeness: yes, we have basically ignored rule 4 here - think
of it as a tie breaker)
I bought a new SSD for my primary desktop system, because the
spinning rust storage I originally built it with is not keeping up
with all the new demands I'm making of it lately: when I sit down in
front of it
in the morning and wave the mouse around, I have to sit listening to
rattly disk sounds for tens of seconds while it pages my desktop session back in.
For reasons I can no longer remember, the primary system partition
/dev/sda3 was originally set up as an LVM PV/VG/LV with a
bcache layered on top of that.
I took the small SSD out to put the big SSD in, so this seemed like
a good time to straighten that all out.
Removing bcache
Happily, a bcache backing device is still readable even after the
cache device has been removed.
Detaching the cache device took either a couple of attempts or some
elapsed time, but eventually it resulted in
# cat /sys/block/bcache0/bcache/state
no cache
so I was able to boot the computer from the old HDD without the old
SSD present.
Where is my mind?
At this time I did a fresh barebones NixOS 23.05 install onto the new
SSD from an ISO image on a USB stick. Then I tried mounting the old
disk to copy user files across, but it wouldn't. Even, for some
reason, after I did modprobe bcache. Maybe weird implicit module
dependencies?
The internet says that you can mount a bcache backing device even
without bcache kernel support, using a loop device with an offset:
If bcache is not available in the kernel, a filesystem on the backing device is still available at an 8KiB offset.
... but, that didn't work either? binwalk will save us:
The offset is not 8K bytes, it's 8K * 512 bytes. Don't ask me why, I only work here.
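A quick sanity check of the arithmetic: 8192 sectors of 512 bytes
each is exactly the 4MiB offset passed to mount below.

$ echo $((8192 * 512))
4194304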
So we can get to the data using
$ sudo mount /dev/backing/nixos /mnt -o loop,offset=4194304
and copy across the important stuff like /home/dan/src and my
.emacs. But I'd rather like a more permanent solution as I want to
carry on using the HDD for archival (it's perfectly fast enough for my
music, TV shows, Linux ISOs etc) and nixos-generate-config
gets confused by loop devices with offsets.
If it were an ordinary partition I'd simply edit the partition table
to add 8192 sectors to the start address of sda3, but I don't see a
straightforward way to do the analogous thing with a logical volume.
Resolution
Courtesy of Andy Smith's helpful blog post
(you should read it and not rely on my summary) and a large degree of
luck, I was able to remove the LV completely and turn sda3 back into
a plain ext4 partition. We follow the steps in his blog post to find
out how many sectors at the start of sda3 are reserved for metadata
(8192) and how big each extent is (8192 sectors again, or 4MiB). Then when I
looked at the mappings:
sudo pvdisplay --maps /dev/sda3
--- Physical volume ---
PV Name /dev/sda3
VG Name backing
PV Size 2.72 TiB / not usable 7.44 MiB
Allocatable yes (but full)
PE Size 4.00 MiB
Total PE 713347
Free PE 0
Allocated PE 713347
PV UUID 7ec302-b413-8611-ea89-ed1c-1b0d-9c392d
--- Physical Segments ---
Physical extent 0 to 713344:
Logical volume /dev/backing/nixos
Logical extents 2 to 713346
Physical extent 713345 to 713346:
Logical volume /dev/backing/nixos
Logical extents 0 to 1
It's very nearly a continuous run, except that the first two 4MiB
chunks are at the end. But ... we know there's a 4MiB offset from the
start of the LV to the ext4 filesystem (because of bcache). Do the
numbers match up? Yes!
Physical extents 713345 to 713346 are the first two 4MiB chunks of
/dev/backing/nixos. 0-4MiB is bcache junk, 4-8MiB is the beginning of
the ext4 filesystem, all we need to do is copy that chunk into the gap
at the start of sda3 which was reserved for PV metadata:
# check we've done the calculation correctly
# (extent 713346 + 4MiB for PV metadata)
$ sudo dd if=/dev/sda3 bs=4M skip=713347 count=1 | file -
/dev/stdin: Linux rev 1.0 ext4 filesystem data, UUID=37659245-3dd8-4c60-8aec-cdbddcb4e3c8, volume name "nixos" (extents) (64bit) (large files) (huge files)
# save the data
$ sudo dd if=/dev/sda3 bs=4M skip=713347 count=1 of=ext4-header
# backup the start of the disk, in case we got it wrong
$ sudo dd if=/dev/sda3 bs=4M count=4 of=sda3-head
# deep breath, in through nose
# exhale
# at your own risk, don't try this at home, etc etc
$ sudo dd bs=4M count=1 conv=nocreat,notrunc,fsync if=ext4-header of=/dev/sda3
$
It remains only to fsck /dev/sda3, just in case, and then it can be
mounted somewhere useful.
With hindsight, the maths is too neat to be a coincidence, so I think
I must have used some kind of
"make-your-file-system-into-a-bcache-device tool" to set it all up in the first
place. I have absolutely no recollection of doing any such thing, but
Firefox does say I've visited that repo before ...
In the course of Liminix hacking it has
become apparent that I need to understand the new Linux packet
filtering ("firewall") system known as nftables.
The introductory documentation for nftables is a textbook example of
pattern 1 in Julia Evans' Patterns in confusing explanations
document. I have, nevertheless, read enough of it that I now think I
understand what is going on, and am ready to attempt the challenge of
describing
nftables without comparing it to ip{tables,chains,fw}.
We start with a picture:
This picture shows the flow of a network packet through the Linux
kernel. Incoming packets are received from the driver on the far left
and flow up to the application layer at the top, or rightwards to be
transmitted through the driver on the right. Locally generated
packets start at the top and flow right.
The round-cornered rectangles depict hooks, which are the places
where we can use nftables to intercept the flow and handle packets
specially. For example:
if we want to drop packets before they reach userspace (without affecting
forwarding) we could do that in the "input" hook.
if we want to do NAT - i.e. translate the network addresses embedded
in packets from an internal 192.168.x.x (RFC 1918) network to a real
internet address - we'd do that in the "postrouting" hook (and so
that we get replies, we'd also do the opposite translation in the
"prerouting" hook)
if we're being DDoSed, maybe we want to drop packets in the
"ingress" hook before they get any further.
The picture is actually part of the docs and I think it should be on the first page.
Chains and rules
A chain (more specifically, a "base chain") is registered with one
of the hooks in the diagram, meaning that all the packets seen at that point
will be sent to the chain. There may be multiple chains registered to
the same hook: they get run in priority order (numerically lowest to
highest), and packets accepted by an earlier chain are passed to the
next one.
Each chain contains rules. A rule has a match - some criteria to
decide which packets it applies to - and an action which says what
should be done when the match succeeds.
A chain has a policy (accept or drop) which says what
happens if a packet gets to the end of the chain without matching any
rules.
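As a concrete sketch using the nft CLI (the table and chain names
"filter" and "input" here are my own arbitrary choices):

# a table to hold the chain, then a base chain registered on the
# "input" hook at priority 0, with a default policy of drop
$ sudo nft add table inet filter
$ sudo nft add chain inet filter input '{ type filter hook input priority 0 ; policy drop ; }'
# a rule: "tcp dport 22" is the match, "accept" is the action
$ sudo nft add rule inet filter input tcp dport 22 accept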
You can also create chains which aren't registered with hooks, but are
called by other chains that are. These are termed "regular chains" (as
distinct from "base chains"). A rule with a jump action will execute
all the rules in the chain that's jumped to, then resume processing
the calling chain. A rule with a goto action will execute the new
chain's rules in place of the rest of the current chain, and then
the packet will be accepted or dropped as per the policy of the base
chain.
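Continuing the sketch above, a regular chain (created without a hook)
that the base chain jumps to:

$ sudo nft add chain inet filter services
$ sudo nft add rule inet filter services tcp dport '{ 80, 443 }' accept
# jump to it from the base chain; processing resumes here afterwards
$ sudo nft add rule inet filter input jump services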
[ Open question: the doc claims that a regular chain may also have a
policy, but doesn't describe how/whether the policy applies when
processing reaches the end of the called chain. I think this omission
may be because the first claim is incorrect: a very sketchy reading of
the source code suggests that you can't specify a policy when creating
a chain unless you also specify the hook. Also, it hurts my brain to
think about it. ]
Chain types
A chain has a type, which is one of filter, nat or route.
filter does as the name suggests: filters packets.
nat is used for NAT - again, as the name suggests. It differs from
filter in that only the first packet of a given flow hits this
chain; subsequent packets bypass it.
route allows changes to the content or metadata of the packet
(e.g. setting the TTL, or packet mark/conntrack mark/priority) which
can then be tested by policy-based routing (see ip-rule(8)) to send
the packet somewhere non-usual. After the route chain runs, the
kernel re-evaluates the packet routing decision - this doesn't
happen for other chain types. route only works in the output
hook.
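For example, a minimal sketch of masquerading NAT (the interface name
"wan0" is made up):

# only the first packet of each flow traverses a nat chain
$ sudo nft add table ip nat
$ sudo nft add chain ip nat postrouting '{ type nat hook postrouting priority 100 ; }'
$ sudo nft add rule ip nat postrouting oifname "wan0" masquerade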
Tables
Chains are contained in tables, which also contain sets, maps,
flowtables, and stateful objects. The things in a table must all
be of the same family, which is one of
ip - IPv4 traffic
ip6 - IPv6 traffic
inet - IPv4 and IPv6 traffic. Rules in an inet chain may match
ipv4 or ipv6 or higher-level protocols: an ipv6 packet won't be
tested against an ipv4 rule (or vice versa) but a rule for a layer 4
protocol (e.g. UDP) will be tried against both. (Some people [who?]
claim this family is less useful than you might first think it would
be and in practice you just end up writing separate but similar
chains for ip and ip6)
arp - note per the diagram that there is a disjoint set of hooks for ARP traffic, which allow only chains in arp tables
bridge - similarly, another set of hooks for bridge traffic
netdev - for chains attached to the ingress and egress hooks,
which are tied to a single network interface and see all traffic
on that interface. This hook/chain type gives great power but
correspondingly great faff levels, because the packets are still
pretty raw. For example, the ingress chain runs before fragmented
datagrams have been reassembled, so you can't reliably match e.g. UDP
destination port, as only the first fragment carries the UDP header.
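A sketch of the netdev case (the device name "eth0" and the hostile
address are made up; 203.0.113.99 is from a documentation range):

# netdev chains must name the device they attach to
$ sudo nft add table netdev rawfilter
$ sudo nft add chain netdev rawfilter ingress '{ type filter hook ingress device eth0 priority 0 ; policy accept ; }'
# drop a hostile source as early as possible
$ sudo nft add rule netdev rawfilter ingress ip saddr 203.0.113.99 drop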
There's a handy summary in the docs describing which
chains work with which families and which tables.
What next?
I hope that makes sense. I hope it's correct :-). Beyond the
hypothetical sketches above I haven't explained the syntax or the CLI
tools, because there are perfectly good docs for that already, which
you now have the background to understand.