On a train, on the way back from Alan & Telsa's, after my
spur-of-the-moment decision to go to the SWLUG meeting last night. A
more informal gathering than OxLUG, held in a café then moved
to the pub next door when it became obvious that there were more
people present than would fit. Fun. Back to railway station for what
is apparently fairly normal confusion over the timetable: "where's
this train going?" "the board says `terminates'" "ok, where's
Terminates, then?" -- got on a train that showed some sign of wanting
to head in the right direction, and after it pulled out, the train
manager walked through with a notepad polling the passengers for where
they wanted to go. OK, possibly just for the entirely prosaic reason
that some of the stops en route are basically request stops, but I
think that explanation is less entertaining.
Today, tour of Swansea ("here's the castle, here's the sheep shop,
here's the market, here's the marina, this is the beach") and
introduction to Welshcakes. These can be approximately described as
half-height dense scones, dusted with sugar.
A few words about the progress of threading for SBCL, after my
vague hints on Monday:
Native threads, initially for x86 Linux, using clone() directly
instead of pthreads
LinuxThreads is a non-starter: it does evil things with signals
(like catch them in one thread and resignal them to others) that
don't mesh nicely with the evil things that SBCL wants to do with
signals
The exciting new threading stuff in Red Hat 8 is, well, only in
Red Hat 8
... and besides, all the interesting stuff it does with
e.g. thread-local storage is built into gcc and binutils and so on: if
you're not using them, it isn't helping you much
So far ...
CL has dynamically scoped variables. Not, thank goodness, by
default, but available. The semantics we've adopted are that (1) a
symbol's global value can be set and is visible in all threads, and
(2) it can also be dynamically bound: the dynamic binding is visible
in only the thread that it was bound in. Other threads running
concurrently still see and may also change the global value, but the
thread that's bound the variable won't see those changes.
This requires thread-local storage of some kind for the symbol
values. On x86, we point to this with a segment register
(%gs) as we're already using all the real registers. Each
thread has a vector of values, and each symbol gains a
tls-index slot which holds its index into that vector for symbol-value lookups.
Spent much of the weekend fiddling with modify_ldt,
and teaching the sbcl assembler how to assemble instructions with
segment override prefixes (in fact, I still need to go back to it and
teach it how to disassemble them again)
On Monday, I looked through the diffs from the previous round of
hacking and forward-ported them - or at least, the bits that still
made sense - into the current version. This was for formerly-global
variables which are actually supposed to be per-thread (stack pointers
and the like)
On Tuesday, I had other unrelated work to do, for which I actually
get paid (note to interested readers: if you would like to see a
threaded SBCL sooner rather than later, I am available to implement
this and other SBCL/CMUCL enhancements on a contract basis. Email me for details). Spent a
couple of hours on various kinds of public transport thinking about
symbol binding and unbinding
Wednesday (yesterday) and today I wrote bind, unbind, set and
symbol-value vops, and equivalent functionality in C (following a
variation of the Extreme Programming methodologists' Once And Only
Once principle known as "Once And Only Once More") for the bits of C
code that need to do these things in situations where not enough of
Lisp is running to use the normal vops. Then I debugged it. This is
mostly a question of rebuilding, watching it segmentation fault,
attaching gdb, disassembling, cursing, set disassembly-flavor
intel, disassembling again, and scratching head. With
intermissions for previously noted LUG meeting in Cardiff, and tour of
Swansea.
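For the curious, the binding semantics described above can be sketched in C roughly as follows. This is an illustration only and not the SBCL source: every name in it (symbol, thread, bind, unbind, UNBOUND, TLS_SIZE, BINDING_STACK_SIZE) and the fixed-size arrays are assumptions made for the sketch, and handing out a new index is not itself thread-safe here.

```c
#include <assert.h>
#include <stddef.h>

#define TLS_SIZE 4096           /* slots per thread; arbitrary for this sketch */
#define BINDING_STACK_SIZE 64   /* hypothetical limit on nested bindings */
#define UNBOUND ((void *)-1)    /* hypothetical "no thread-local value" marker */

typedef struct symbol {
    void  *global_value;        /* visible to every thread */
    size_t tls_index;           /* 0 means no index has been assigned yet */
} symbol;

typedef struct binding { symbol *sym; void *saved; } binding;

typedef struct thread {
    void   *tls[TLS_SIZE];                  /* per-thread value vector */
    binding bindings[BINDING_STACK_SIZE];   /* what unbind must restore */
    size_t  depth;
} thread;

static size_t next_tls_index = 1;   /* slot 0 is reserved to mean "unassigned" */

void thread_init(thread *th) {
    for (size_t i = 0; i < TLS_SIZE; i++) th->tls[i] = UNBOUND;
    th->depth = 0;
}

/* Assign a slot on first use and store it in the symbol for next time. */
static size_t ensure_tls_index(symbol *s) {
    if (s->tls_index == 0) {
        assert(next_tls_index < TLS_SIZE);
        s->tls_index = next_tls_index++;
    }
    return s->tls_index;
}

/* The thread-local value wins if this thread has one, else the global. */
void *symbol_value(thread *th, symbol *s) {
    void *v = s->tls_index ? th->tls[s->tls_index] : UNBOUND;
    return v == UNBOUND ? s->global_value : v;
}

/* Setting changes the current binding if there is one, else the global. */
void set_symbol_value(thread *th, symbol *s, void *v) {
    if (s->tls_index && th->tls[s->tls_index] != UNBOUND)
        th->tls[s->tls_index] = v;
    else
        s->global_value = v;
}

/* Save the old thread-local slot contents, then install the new value. */
void bind(thread *th, symbol *s, void *v) {
    size_t i = ensure_tls_index(s);
    assert(th->depth < BINDING_STACK_SIZE);
    th->bindings[th->depth].sym = s;
    th->bindings[th->depth].saved = th->tls[i];
    th->depth++;
    th->tls[i] = v;
}

/* Undo the most recent bind in this thread. */
void unbind(thread *th) {
    assert(th->depth > 0);
    binding *b = &th->bindings[--th->depth];
    th->tls[b->sym->tls_index] = b->saved;
}
```

With two thread structures a and b, binding a symbol in a leaves b reading (and free to set) the global value, while a keeps seeing its own binding until it calls unbind.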
<dan_b> straw poll: how many special variables do we think that sbcl binds during cold init?
<dan_b> I think 4096 sounds like a lot
<Krystof> I'd have been surprised to find that we had 4096 distinct special variables
<Krystof> * (length (apropos-list "*"))
<Krystof> 1055
<dan_b> hmm
[...]
<dan_b> aha. got it, i think
<dan_b> (storew tls-index symbol symbol-tls-index-slot other-pointer-lowtag)
<dan_b> having assigned a new tls index for the symbol, it would help if we actually stored it in the symbol strcuture for next time
Right now work is paused, because recompiling takes an hour and
eats battery life, and I'm on a train. So, time out to write diary
and plan the next bit.
(The next bit will actually be integration with the allocator and
garbage collector, given that right now it would be completely unsafe
for two threads to cons at once)
Ext3 errors continue (though no panics today yet, at least) even
with the new kernel, so that's not the problem. Would blame hardware
except that there's none of the usual scary messages from the ide
driver. Maybe a filesystem integrity problem caused (I'm guessing
here) by apm forced shutdown when battery low, at a time that the disk
was being written to, and not fully fixed by subsequent fscks.
Tempted to try mkfs and reinstall (it's the root disk, not /home or
anything important), and see if that helps. I probably have a lot of
old config files, orphaned library packages and other similar stuff
anyway, and it would be nice to get rid of them.
Weekend off, pretty much. Weekend spent preparing for,
co-ordinating and recovering from my friend Simon's stag night.
write diary and plan the next bit, he said on Thursday.
Said plans have not yet reached the enemy, because still wrestling
with bugs in the current bit. Most of them fairly silly bugs once
found: what's wrong with this code?
Why doesn't that work either? This is (also) the kind of bug
that's blindingly obvious once you realise it, of course: (or tls
tls) is not a form that will assemble into the instruction that
does set the condition register. It's a Lisp expression that
evaluates tls then evaluates tls if tls was NIL.
But then, as you see, doesn't actually use the result in any case.
Eventually I noticed the missing bit:
(inst or tls-index tls-index)
and things are substantially rosier. Now we make it all the way
through cold initialization to the toplevel, but break with what looks
like random memory corruption shortly afterwards.
There is one and only one aspect in which doing low-level stuff on an
x86 is appealing: hardware watchpoints.
After spending a lot of time today tracing through the disassembly
for the note-undefined-reference function to find out how it
was getting (0 . 0) into undefined-warnings -
supposedly a list - I put a watchpoint on that memory location and
reran. The value subsequently changed in
get-output-stream-string, which is a perfectly normal library
function that has nothing to do with the compiler and certainly no
references to undefined-warnings. After seeing that, it was
the work of mere minutes (although, I concede, entirely too many of
them) to realise that things would probably work a Whole Lot Better if
GC were allowed to see the thread-local value vectors so it could
update pointers when the objects move. Sigh.
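To make the failure mode concrete, here is a toy sketch in C of a moving collector fixing up roots. It is nothing like the real GENCGC (which is generational and treats the stack conservatively); the object layout and the names relocate and scavenge_roots are invented for the illustration, and the caller is assumed to supply a to-space big enough for everything reachable.

```c
#include <assert.h>
#include <stddef.h>

typedef struct object {
    struct object *forward;   /* new address once moved, else NULL */
    int payload;
} object;

/* Copy obj into to_space unless that already happened; return its new address. */
static object *relocate(object *obj, object *to_space, size_t *used) {
    if (obj == NULL) return NULL;
    if (obj->forward == NULL) {
        to_space[*used] = *obj;
        to_space[*used].forward = NULL;
        obj->forward = &to_space[*used];
        (*used)++;
    }
    return obj->forward;
}

/* Rewrite every slot of one root vector to point at the moved objects.
   Any vector the collector is never told about keeps its stale pointers. */
void scavenge_roots(object **roots, size_t n, object *to_space, size_t *used) {
    for (size_t i = 0; i < n; i++)
        roots[i] = relocate(roots[i], to_space, used);
}
```

If the collector is handed the global roots but not a thread's value vector, that vector still points into the old space afterwards, which is the sort of thing that leaves junk in a variable that nobody appears to have written to.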
Now we're back to a system that actually gets all the way through
cold-init and PCL compilation to produce a usable Lisp. Admittedly,
still not one that lets the user actually create threads (not much
point adding thread creation primitives until consing is thread-safe,
after all), but it's a start.
So, it looks like I write these entries once per CVS commit - or at
least, once per version of sbcl/threads that I believe represents some
kind of advance on the previous state.
GENCGC (the GENerational Conservative GC that we use on the SBCL
x86 port) already has support for `allocation regions'. These are
small areas (typically a couple of pages each) within which consing
can be done very cheaply: by bumping a free pointer and returning its
old value. If we hit the end of the area, we have to stop and
allocate another, which doesn't have to be contiguous. So, all we
really need for parallelisable allocation is to have one of these areas
open per thread. When any thread runs out of open region it can stop
and get a lock from somewhere before updating gory gc details, but
doing that once every two pages (arbitrary number which in any case we
can tune) has got to be better than every cons (8 bytes).
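A minimal sketch of that idea in C, under stated assumptions: the names (alloc_region, region_alloc, open_new_region, heap_lock), the fixed static heap and the spinlock are all made up for illustration and are not gencgc's actual data structures. Each thread owns one alloc_region; only the slow path touches shared state.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define REGION_BYTES 8192u   /* "a couple of pages", assuming 4 KiB pages */
#define HEAP_REGIONS 16u     /* hypothetical heap size for the sketch */

typedef struct alloc_region {
    char *free_pointer;      /* next byte to hand out */
    char *end;               /* one past the last usable byte */
} alloc_region;

static _Alignas(16) char heap[HEAP_REGIONS * REGION_BYTES];
static size_t next_region = 0;               /* shared, guarded by heap_lock */
static unsigned long slow_path_calls = 0;    /* shared, guarded by heap_lock */
static atomic_flag heap_lock = ATOMIC_FLAG_INIT;

/* Slow path: take the lock and open a fresh region for this thread.
   Whatever was left at the tail of the old region is simply abandoned. */
static int open_new_region(alloc_region *r) {
    int ok = 0;
    while (atomic_flag_test_and_set(&heap_lock)) { /* spin until free */ }
    slow_path_calls++;
    if (next_region < HEAP_REGIONS) {
        r->free_pointer = heap + next_region * REGION_BYTES;
        r->end = r->free_pointer + REGION_BYTES;
        next_region++;
        ok = 1;
    }
    atomic_flag_clear(&heap_lock);
    return ok;
}

/* Fast path: bump this thread's free pointer and return its old value. */
void *region_alloc(alloc_region *r, size_t nbytes) {
    nbytes = (nbytes + 7u) & ~(size_t)7u;    /* round up to 8-byte units */
    if (nbytes > REGION_BYTES) return NULL;  /* large objects are out of scope */
    if (r->free_pointer == NULL ||
        (size_t)(r->end - r->free_pointer) < nbytes) {
        if (!open_new_region(r)) return NULL;   /* heap exhausted */
    }
    void *result = r->free_pointer;
    r->free_pointer += nbytes;
    return result;
}
```

With these numbers a thread conses 1024 eight-byte objects between trips to the lock, and the region size is the knob to tune.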
So this is what I spent all week doing. Although the
code was there, my guess is that it was several years old and
had certainly never been tested with multiple regions open at once. On
my first attempt it gave me two overlapping regions, so I added in
some stuff to stop it allocating from apparently-empty-but-still-open
regions, so in retaliation it blew its mind and spent the next several
days randomly blowing up with the kind of memory corruption bugs that
I love tracking down more than anything. So, I know substantially
more about the operation of gencgc than I used to, and I've managed to
get a spot of unrelated tidying up in there too. Which is mostly just
as applicable to the base (unthreaded) SBCL and I might even backport,
depending on how much easier/harder I think it might make the eventual
merge.
Thanks to the Phoenix Picturehouse for managing to show LoTR tonight despite
having had a power problem that knocked out half of their supply (and
judging by the neighbourhood, approximately a third of the houses on
the street. Looks like one of the phases had gone). Looking forward
to the next round of Very Secret
Diaries.
Thanks also to the Botley Road branch of Carphone Warehouse, for
deciding that my mobile phone was in fact still under warranty and
sending it back (again) for warranty repair (again). This time I
showed them the engineers' reports from its last two holidays: one
occasion they'd "reflowed filters and p.a." and the other time they'd
"reflowed p.a.s and filters" - I think I managed to make the point
that I would prefer they try something different this time (replacing
the transmitter coil and upgrading the firmware is apparently the
correct fix), but of course, I made that point to the staff in the
shop, so it remains to be seen what the engineers will actually do.
No thanks to the Cornmarket branch of the same chain, who
had decided that it was out of warranty, that the Sale of Goods Act
was not relevant (personally I disagree, having old-fashioned notions
that "fit for purpose sold" should usually imply "lasts for more than
three months" when the item in question is a mobile telephone) and
that really they could not offer any help at all. Next time I buy a
phone, it won't be from you guys. I regret that the last one was, really.
And, really, no thanks to Ericsson, for (a) producing a phone with
this bug in it anyway, (b) failing to spot it and fix it properly on
the last two repairs. Five minutes with Google will tell you all you
need to know about the T39m No Network bug - if it's been fixed in models made since end
2001, they really could have rectified it on either of the occasions
it's been back to them in 2002. Ho hum.
You wanted to know about SBCL threading? Since the last diary
entry, fixed stupid bug which was stopping the whole thread-local
symbol access from working (creating a new thread was also setting %gs
in the parent thread as well as the child, oops), then owing to not
wanting to think too hard on Sunday evening about writing new VOPs for
locking primitives, decided to take short break by reintroducing the control stack
exhaustion checking that I'd disabled when doing the initial
make-multiple-stacks work.
And also took rather longer break (Friday, Saturday, and some
portion of Sunday and Monday to return suits, PA equipment and
generally unwind afterwards) to do Best Man stuff for my friends'
wedding. No, I did not lose the rings. No, nobody had any reasons
that the two persons were not allowed to be joined in matrimony. Yes,
they are now successfully married, and guests at least appeared
to have enjoyed it. And laughed at (with?) my speech. But I don't
write here about people with no web presence of their own, so that's
all you hear about that.
Thinking about locks again, now. That's in the context of
multithreaded systems, not nuptials.
Thu Dec 19 19:32:21 2002
As an aside I would like to insert a warning to those who identify the
difficulty of the programming task with the struggle against the
inadequacies of our current tools, because they might conclude that,
once our tools will be much more adequate, programming will no longer
be a problem. Programming will remain very difficult, because once we
have freed ourselves from the circumstantial cumbersomeness, we will
find ourselves free to tackle the problems that are now well beyond
our programming capacity.
Edsger W Dijkstra, EWD340
When he says it, it sounds credible. When I say it, it sounds like
I'm whining.
Writing Error Messages, Rule 1: when the problem is with an
external file, print the file name. Rule 2: if there's
an OS error of some kind, print the errno information.
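In C, following both rules costs about three lines. The helper below is a sketch: open_or_explain is a made-up name and the message format is just one reasonable choice, but fopen, errno, strerror and snprintf are the standard library calls.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Try to open path for reading. On failure, fill msg with a message that
   names the file (rule 1) and carries the errno information (rule 2). */
FILE *open_or_explain(const char *path, char *msg, size_t msg_size) {
    assert(msg != NULL && msg_size > 0);
    FILE *f = fopen(path, "r");
    if (f == NULL) {
        int saved = errno;   /* capture before another call overwrites it */
        snprintf(msg, msg_size, "cannot open '%s': %s (errno %d)",
                 path, strerror(saved), saved);
    }
    return f;
}
```

A caller that gets NULL back can print msg to stderr and the user knows both which file to look at and why the OS refused it.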
.. and suddenly, it works. There's some comment to be made here
also about Debian packaging, but the package was labelled experimental
anyway, so I'm probably not going to be too harsh there.
First impressions:
it works
it works with emacs' browser stuff, so I can use it for hyperspec
lookup
It has a weird unlabelled extra dialog box beside the url bar,
with an icon that looks like a lollipop. Clicking on this gives a
dropdown menu which allows the selection of 'Find in this Page',
dmoz.org or Google, so I would guess it's intended as some kind of
search box. There's no way of activating it that I can see, though -
certainly, typing into it and pressing Return doesn't seem to do much.
Maybe it's displaying into a hidden sidebar? From the release notes
it looks like this program likes sidebars. This user,
however, doesn't.
It supports that new-fangled font stuff that looks pretty but is
actually harder to read than the apps I was using five years ago
Here's a screenshot. You may need to
shift-click and save it if apache mod_proxy is interfering with my
content-types again. Need to get that looked at, yes.