diary at Telent Netowrks

Weekend off, pretty much#

Mon, 02 Dec 2002 16:30:43 +0000

Weekend off, pretty much. Weekend spent preparing for, co-ordinating and recovering from my friend Simon's stag night.

"Write diary and plan the next bit", he said on Thursday. Said plans have not yet reached the enemy, because I'm still wrestling with bugs in the current bit. Most of them are fairly silly bugs once found: what's wrong with this code?

(define-vop (bind)
  (:args (val :scs (any-reg descriptor-reg))
         (symbol :scs (descriptor-reg)))
  (:temporary (:sc unsigned-reg) tls-index temp bsp)
  (:generator 5
    (let ((tls-index-valid (gen-label)))
      (load-tl-symbol-value bsp *binding-stack-pointer*)
      (loadw tls-index symbol symbol-tls-index-slot other-pointer-lowtag)
      (inst add bsp (* binding-size n-word-bytes))
      (store-tl-symbol-value bsp *binding-stack-pointer* temp)
      (inst jmp :ne tls-index-valid)
      ;; allocate a new tls-index
      [...]

That's right. There's no test before we do the conditional jump. Let's try

(define-vop (bind)
  (:args (val :scs (any-reg descriptor-reg))
	 (symbol :scs (descriptor-reg)))
  (:temporary (:sc unsigned-reg) tls-index temp bsp)
  (:generator 5
    (let ((tls-index-valid (gen-label)))
      (load-tl-symbol-value bsp *binding-stack-pointer*)
      (loadw tls-index symbol symbol-tls-index-slot other-pointer-lowtag)
      (inst add bsp (* binding-size n-word-bytes))
      (store-tl-symbol-value bsp *binding-stack-pointer* temp)
      (or tls-index tls-index)
      (inst jmp :ne tls-index-valid)
      ;; allocate a new tls-index

Hmm. Why doesn't that work either? This is (also) the kind of bug that's blindingly obvious once you realise it, of course: (or tls-index tls-index) is not a form that will assemble into an instruction that sets the condition flags. It's an ordinary Lisp expression, run while the code generator is running, that evaluates tls-index and then evaluates it again if the first value was NIL. And, as you see, nothing uses the result in any case, so no instruction is emitted at all. Eventually I noticed the missing bit:

      (inst or tls-index tls-index)

and things are substantially rosier. Now we make it all the way through cold initialization to the toplevel, but break with what looks like random memory corruption shortly afterwards.

There is one and only one aspect in which doing low-level stuff on an#

Tue, 03 Dec 2002 23:01:41 +0000

There is one and only one aspect in which doing low-level stuff on an x86 is appealing: hardware watchpoints.

After spending a lot of time today tracing through the disassembly for the note-undefined-reference function to find out how it was getting (0 . 0) into undefined-warnings - supposedly a list - I put a watchpoint on that memory location and reran. The value subsequently changed in get-output-stream-string, which is a perfectly normal library function that has nothing to do with the compiler and certainly no references to undefined-warnings. After seeing that, it was the work of mere minutes (although, I concede, entirely too many of them) to realise that things would probably work a Whole Lot Better if GC were allowed to see the thread-local value vectors so it could update pointers when the objects move. Sigh.
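The failure mode can be reproduced in miniature. What follows is a toy copying collector in C, not GENCGC; the `heap`, `cell` and `gc_*` names are made up for the sketch, and the "thread-local value vector" is just an array of pointers. The only point it tries to make is that a moving collector has to be shown every vector of roots, because a slot it never visits keeps pointing at memory that is about to be reused:

```c
#include <assert.h>
#include <stddef.h>

#define HEAP_CELLS 8

struct cell {
    int value;
    struct cell *forward;   /* new address once the object has moved */
};

struct heap {
    struct cell from[HEAP_CELLS];   /* old space */
    struct cell to[HEAP_CELLS];     /* new space */
    size_t to_used;
};

/* Move one object to new space (at most once) and return its new address. */
struct cell *gc_move(struct heap *h, struct cell *obj)
{
    if (obj == NULL)
        return NULL;
    if (obj->forward == NULL) {
        assert(h->to_used < HEAP_CELLS);
        struct cell *copy = &h->to[h->to_used++];
        copy->value = obj->value;
        copy->forward = NULL;
        obj->forward = copy;
    }
    return obj->forward;
}

/* Rewrite every slot of a root vector to point at the moved objects. */
void gc_scavenge_roots(struct heap *h, struct cell **roots, size_t n)
{
    for (size_t i = 0; i < n; i++)
        roots[i] = gc_move(h, roots[i]);
}

/* After a collection, old space is free to be overwritten. */
void gc_finish(struct heap *h)
{
    for (size_t i = 0; i < HEAP_CELLS; i++) {
        h->from[i].value = -1;      /* stands in for "someone else's data" */
        h->from[i].forward = NULL;
    }
}

/* Returns the value seen through a thread-local slot after a collection.
 * scan_tls says whether the collector is shown the thread-local vector. */
int value_after_gc(int scan_tls)
{
    struct heap h = {0};
    h.from[0].value = 42;

    struct cell *global_roots[1] = { &h.from[0] };
    struct cell *tls_vector[1]   = { &h.from[0] };

    gc_scavenge_roots(&h, global_roots, 1);
    if (scan_tls)
        gc_scavenge_roots(&h, tls_vector, 1);
    gc_finish(&h);

    return tls_vector[0]->value;
}
```

Here `value_after_gc(1)` still sees 42, while `value_after_gc(0)` reads whatever was written over the old copy, which is the same flavour of surprise as finding (0 . 0) where a list should be.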

Now we're back to a system that actually gets all the way through cold-init and PCL compilation to produce a usable Lisp. Admittedly, still not one that lets the user actually create threads (not much point adding thread creation primitives until consing is thread-safe, after all), but it's a start.

So, it looks like I write these entries once per CVS commit - or at#

Wed, 11 Dec 2002 01:22:06 +0000

So, it looks like I write these entries once per CVS commit - or at least, once per version of sbcl/threads that I believe represents some kind of advance on the previous state.

GENCGC (the GENerational Conservative GC that we use on the SBCL x86 port) already has support for `allocation regions'. These are small areas (typically a couple of pages each) within which consing can be done very cheaply: by bumping a free pointer and returning its old value. If we hit the end of the area, we have to stop and allocate another, which doesn't have to be contiguous. So, all we really need for parallelisable allocation is to have one of these areas open per thread. When any thread runs out of space in its open region it can stop and get a lock from somewhere before updating gory gc details, but doing that once every two pages (arbitrary number which in any case we can tune) has got to be better than doing it on every cons (8 bytes).
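As a rough illustration of the scheme (not GENCGC's actual code), here is a minimal C sketch of one open region per thread. The names `alloc_region`, `region_alloc`, `open_new_region` and `regions_opened`, the 8192-byte region size and the fixed-size static heap are all assumptions made for the example. The fast path touches only the caller's own region and takes no lock; the lock is taken only when a region runs out:

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define REGION_BYTES 8192                 /* "a couple of pages"; arbitrary */
#define HEAP_BYTES   (64 * REGION_BYTES)

static _Alignas(8) unsigned char heap[HEAP_BYTES];
static size_t heap_next;                  /* guarded by heap_lock */
static pthread_mutex_t heap_lock = PTHREAD_MUTEX_INITIALIZER;
unsigned long regions_opened;             /* guarded by heap_lock */

/* One of these per thread; start it zeroed. */
struct alloc_region {
    unsigned char *free_pointer;
    unsigned char *end;
};

/* Slow path: take the lock and carve a fresh region off the heap.
 * Returns 1 on success, 0 if the heap is exhausted. */
static int open_new_region(struct alloc_region *r)
{
    int ok = 0;
    pthread_mutex_lock(&heap_lock);
    if (heap_next + REGION_BYTES <= HEAP_BYTES) {
        r->free_pointer = heap + heap_next;
        r->end = r->free_pointer + REGION_BYTES;
        heap_next += REGION_BYTES;
        regions_opened++;
        ok = 1;
    }
    pthread_mutex_unlock(&heap_lock);
    return ok;
}

/* Fast path: bump the free pointer and return its old value.
 * nbytes must be a multiple of 8 and no larger than a region.
 * Any unused tail of the old region is simply abandoned. */
void *region_alloc(struct alloc_region *r, size_t nbytes)
{
    assert(nbytes % 8 == 0 && nbytes <= REGION_BYTES);
    if (r->free_pointer == NULL ||
        (size_t)(r->end - r->free_pointer) < nbytes) {
        if (!open_new_region(r))
            return NULL;
    }
    void *old = r->free_pointer;
    r->free_pointer += nbytes;
    return old;
}
```

With 8-byte conses and 8192-byte regions, a thread takes the lock once per 1024 allocations in this sketch. A real collector also has to record which pages each region covers so the GC can find the objects, which is where the gory details come in.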

So this is what I spent all week doing. Although the code was there, my guess is that it was several years old and had certainly never been tested with multiple regions open at once. On my first attempt it gave me two overlapping regions, so I added in some stuff to stop it allocating from apparently-empty-but-still-open regions, so in retaliation it blew its mind and spent the next several days randomly blowing up with the kind of memory corruption bugs that I love tracking down more than anything. So, I know substantially more about the operation of gencgc than I used to, and I've managed to get a spot of unrelated tidying up in there too. Which is mostly just as applicable to the base (unthreaded) SBCL and I might even backport, depending on how much easier/harder I think it might make the eventual merge.

YAY#

Thu, 12 Dec 2002 03:52:13 +0000

YAY

  1. <SB-ALIEN-INTERNALS:ALIEN-VALUE :SAP #X4095F000>
  2. /pausing 15070 before entering funcall0(0x9269615) /entering funcall0(0x9269615) 9

NIL

The answer, in case you were wondering, is "extremely cool"

Thanks to the Phoenix Picturehouse for managing to show LoTR tonight despite#

Thu, 19 Dec 2002 01:17:02 +0000

Thanks to the Phoenix Picturehouse for managing to show LoTR tonight despite having had a power problem that knocked out half of their supply (and, judging by the neighbourhood, approximately a third of the houses on the street; it looks like one of the phases had gone). Looking forward to the next round of Very Secret Diaries.

Thanks also to the Botley Road branch of Carphone Warehouse, for deciding that my mobile phone was in fact still under warranty and sending it back (again) for warranty repair (again). This time I showed them the engineers' reports from its last two holidays: one occasion they'd "reflowed filters and p.a." and the other time they'd "reflowed p.a.s and filters" - I think I managed to make the point that I would prefer they try something different this time (replacing the transmitter coil and upgrading the firmware is apparently the correct fix), but of course, I made that point to the staff in the shop, so it remains to be seen what the engineers will actually do.

No thanks to the Cornmarket branch of the same chain, who had decided that it was out of warranty, that the Sale of Goods Act was not relevant (personally I disagree, having old-fashioned notions that "fit for purpose sold" should usually imply "lasts for more than three months" when the item in question is a mobile telephone) and that really they could not offer any help at all. Next time I buy a phone, it won't be from you guys. I regret that the last one was, really.

And, really, no thanks to Ericsson, for (a) producing a phone with this bug in it anyway, (b) failing to spot it and fix it properly on the last two repairs. Five minutes with Google will tell you all you need to know about the T39m No Network bug - if it's been fixed in models made since the end of 2001, they really could have rectified it on either of the occasions it's been back to them in 2002. Ho hum.

You wanted to know about SBCL threading? Since the last diary entry, fixed stupid bug which was stopping the whole thread-local symbol access from working (creating a new thread was setting %gs in the parent thread as well as the child, oops), then owing to not wanting to think too hard on Sunday evening about writing new VOPs for locking primitives, decided to take short break by reintroducing the control stack exhaustion checking that I'd disabled when doing the initial make-multiple-stacks work.

And also took rather longer break (Friday, Saturday, and some portion of Sunday and Monday to return suits, PA equipment and generally unwind afterwards) to do Best Man stuff for my friends' wedding. No, I did not lose the rings. No, nobody had any reasons that the two persons were not allowed to be joined in matrimony. Yes, they are now successfully married, and guests at least appeared to have enjoyed it. And laughed at (with?) my speech. But I don't write here about people with no web presence of their own, so that's all you hear about that.

Thinking about locks again, now. That's in the context of multithreaded systems, not nuptials.

As an aside I would like to insert a warning to those who identify the#

Thu, 19 Dec 2002 19:32:21 +0000

As an aside I would like to insert a warning to those who identify the difficulty of the programming task with the struggle against the inadequacies of our current tools, because they might conclude that, once our tools will be much more adequate, programming will no longer be a problem. Programming will remain very difficult, because once we have freed ourselves from the circumstantial cumbersomeness, we will find ourselves free to tackle the problems that are now well beyond our programming capacity.
Edsger W Dijkstra, EWD340

When he says it, it sounds credible. When I say it, it sounds like I'm whining.

Happy Christmas#

Sat, 28 Dec 2002 21:50:10 +0000

Happy Christmas. Mine was, anyway.

Today I went to the AGM of the SBCL Development Team (UK Chapter). Or, put another way, I had lunch with Christophe, and we divided the universe between us.

So we talked about what we were planning and/or hoping to do this year. All being well, you can look forward to

Other possibilities - these are less likely to happen without external contributions ("we're taking patches") or funding. Or a miracle might happen, even.

Happy New Year ...

:; phoenix#

Mon, 30 Dec 2002 19:54:37 +0000

:; phoenix
:;

Writing Error Messages, Rule 0: failing silently is not acceptable

:; phoenix --display :0
Fontconfig error: Cannot load default config file
:;

Writing Error Messages, Rule 1: when the problem is with an external file, print the file name. Rule 2: if there's an OS error of some kind, print the errno information.
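For what it's worth, following Rules 1 and 2 costs about three lines. This is a minimal C sketch, not anything from fontconfig or Phoenix: `load_config` is a hypothetical function, and it relies on `fopen` setting `errno` on failure, which POSIX specifies but ISO C does not promise.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Try to open a config file.  Returns 0 on success, -1 on failure.
 * On failure, errbuf holds a message naming the file (Rule 1) and
 * giving the errno information (Rule 2). */
int load_config(const char *path, char *errbuf, size_t errlen)
{
    assert(path != NULL && errbuf != NULL && errlen > 0);

    FILE *f = fopen(path, "r");
    if (f == NULL) {
        int saved = errno;   /* save it before any other call can change it */
        snprintf(errbuf, errlen,
                 "cannot load config file %s: %s (errno %d)",
                 path, strerror(saved), saved);
        return -1;
    }

    /* A real loader would parse the file here. */
    fclose(f);
    errbuf[0] = '\0';
    return 0;
}
```

A caller would print `errbuf` to stderr when the function returns -1, so the user sees which file was being looked for and why it could not be opened, instead of just "Cannot load default config file".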

:; sudo apt-get install fontconfig
[...]
:; phoenix

... and suddenly, it works. There's some comment to be made here also about Debian packaging, but the package was labelled experimental anyway, so I'm probably not going to be too harsh there.

First impressions:

Here's a screenshot. You may need to shift-click and save it if apache mod_proxy is interfering with my content-types again. Need to get that looked at, yes.