diary at Telent Netowrks

Cool URL of the Day: In Pursuit of Simplicity: the manuscripts of Edsger W

Mon, 01 Jul 2002 14:05:56 +0000

Cool URL of the Day: In Pursuit of Simplicity: the manuscripts of Edsger W. Dijkstra

They're scanned PDFs, of a somewhat ropey quality: readable at 1.4x magnification, though.

Today is random asides day

Mon, 01 Jul 2002 23:47:22 +0000

Today is random asides day

On Living with Packages

  1. don't ever use common-lisp-user for anything bigger than a noddy two-line test function. Anything important enough to get saved in a source file is important enough to set up a package for.

  2. define your package(s) using defpackage forms. I usually put these in their own source files, too.

  3. every file starts with an (in-package :package-name) form - except the file with the defpackage form, anyway

  4. make sure you load the files containing the defpackage forms before any of the other files

  5. use-package, import or any of the other "imperative" package stuff is, like the use of eval, generally an indication that you're doing something wrong. Not always, no. Usually.

  6. a defsystem utility (e.g. mk-defsystem or - my preference - asdf) will automate the loading-stuff-in-the-right-order aspect

On being ill

Well, slightly under the weather at least. This weekend I had a wedding (not my own, no), a barbecue, a party and a nose full of microwaved cotton wool. Last Friday I had an all-afternoon wish to lie down and sleep unobtrusively in the middle of the carpet, which I now realise was probably early cold symptoms, plus a bunch of scary disk error messages from my colo box (Rackspace fitted a new disk, upgrading the OS version while they were at it, then I just had to reinstall all the locally-changed stuff), plus an evening in the pub. This is all in the nature of an excuse for not having done much hacking lately.

On naming

What really wasn't helping too much is that I'd originally set out at the start of the week to add some form of access control to the CLiki application (the CLiki site will stay as it is, but there are lots of other applications I'm imagining for which it would be better to know who the users are and whether they're Allowed), realised that the handler system for Araneida is suckier than it really wants to be for these kinds of application, and ended up redesigning large parts of it. And now I've created a new class called handler, I find that I want to introduce lots of new GFs whose names clash with existing Araneida functions. So perhaps this is time for a name change. Nobody appears to have produced http serving software called Boris yet, which is a pretty compelling reason in itself.

Anyway, we've got to a state where it basically runs again, so maybe we can get this wrapped up in time for Thursday

We have working CLiki again

Wed, 03 Jul 2002 03:13:51 +0000

We have working CLiki again! Well, at least, all the bits I've checked so far are working again: it's not exactly been tested exhaustively yet.

I got sidetracked halfway through the afternoon staring at my four or five new GFs and thinking "hmm, why don't I use define-method-combination" until I realised that "because it provides no advantage that I can reasonably easily envisage and I do want to actually get this finished one day" was in fact the Correct Answer.

(For the avoidance of doubt, GF in this context stands for Generic Function. If I had four or five new girlfriends I'd ... well, for a start I'd remember whether I had four or five)

Whitespace sensitivity is cool

Wed, 03 Jul 2002 14:13:47 +0000

Whitespace sensitivity is cool

For some reason I've started reading and contributing to Ward's Wiki a lot more lately. When I was first doing Wiki I was mostly using Netscape 4, which gives you some idea of how long ago it was, because I can't actually remember when was the last time I used Netscape 4. Mozilla rocks.

But, anyway, well. Except for this bit. "This bit" is that wiki has Text Formatting Rules which demand a variety of keyboard gymnastics, not least among them that various formatting (UL lists, BLOCKQUOTE, PRE etc) demands the insertion of a literal TAB (ASCII 009) character. Pressing TAB in a Mozilla text field, though, makes it go to the next text field instead of inserting anything, ^V doesn't seem to do the conventional readline "insert next character as literal" thing, and the conventional Emacs "insert next key as literal character" thing - don't press that. Without prompting to do anything with the four paragraphs you just typed, either ...

So, I cut and paste from my scratch buffer, which is silly but just about livable. But, this is wrong. This is Free Software. Even if I will probably never ever myself actually write code for a project in which the majority of a 45 minute "how to hack on $project" talk is given over to a description of getting reference counting right, I can at least file a bug report. Or at least, read the bug report that someone must already have opened on this subject and see if there's some way to do it already that I'm missing. Well, they did, and there isn't. Bug 29086 has been open for more than two years, and has attracted so much argument that it's over 100k long as viewed by lynx -dump.

Lessons we can learn:

(Yes, I know there's a workaround in Wiki, but it's heuristic and doesn't seem to do too great a job)

Of course, the Wiki markup rules started life as simple things,

Wed, 03 Jul 2002 15:08:59 +0000

Of course, the Wiki markup rules started life as simple things, and it wasn't until later that people realised they could do things like typing indented-but-not-preformatted text by typing TAB : SPACE TAB at the start of the line (and possibly also typing the entire paragraph on one line; I forget whether it makes a difference in that specific instance. it does for italics, though)

This reminds me of some other software I've been using recently that starts out with a simple set of rules which combine in dizzyingly unpredictable ways to do things you would never have dreamed sensible. I'm talking about spamassassin here, of course.

I've never been entirely clear, it must be admitted, on how best to use all the weird gnus options to keep copies of outgoing mail, and eventually came to the conclusion that the simplest answer would be to Bcc it all to myself. This has the neat advantage that I can then use gnus splitting to filter incoming mail to some regular correspondent into the same folder as mail from said correspondent, and then I have both sides of the conversation in the same place. It has the (fairly innocuous) side-effect, though, that my outgoing mail loops through spamassassin before coming back to me, so I get to see how spammy my mail looks. Well, look at this

X-Spam-Status: No, hits=-98.4 required=5.0
        tests=SIGNATURE_DELIM,FOR_FREE,USER_IN_WHITELIST
        version=2.31

body     Standard signature delimiter present     SIGNATURE_DELIM    0.488  

A positive score for a signature delimiter? "- - SPACE \n", so rarely correctly implemented that it became favoured all over Usenet as the choice "get a real news reader" stick with which to beat Outlook users (and another great example of why whitespace sensitivity is a dumb idea, fwiw) is now suddenly an indication that the mail was sent from some bulk mailing program that injects fake unparseable received headers, has the wrong system date, and probably still thinks X-UIDL is a header that should be provided by the sending host? I find this not entirely plausible.
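To make the whitespace point concrete, here is a minimal C sketch of what a strict check for that delimiter looks like. The function name is_signature_delim is made up for this illustration; it is not taken from spamassassin or from any news reader.

```c
#include <stdbool.h>
#include <string.h>

/* Returns true only for the conventional signature delimiter line:
 * dash, dash, exactly one space, then the end of the line.
 * is_signature_delim is a hypothetical name used for this sketch. */
bool is_signature_delim(const char *line)
{
    /* The first three characters must be "-- ", trailing space included. */
    if (strncmp(line, "-- ", 3) != 0)
        return false;

    /* After the space only a line ending (or end of string) is allowed. */
    return line[3] == '\0' ||
           strcmp(line + 3, "\n") == 0 ||
           strcmp(line + 3, "\r\n") == 0;
}
```

A mailer that strips trailing whitespace turns "-- " into "--", which this check rejects, and that is exactly the failure the "get a real news reader" crowd used to complain about.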

From conversations on IRC, I understand that the actual scores for each rule are computed by feeding a large pile of known-spam and another large pile of known-nonspam into a genetic algorithm and letting it work them out. This sounds like really great news for the stability of weights on each rule ...

I wish I could find my neural networks book. It was possibly the most boring book ever written about what should be an interesting subject, but I desperately want to be able to liken spamassassin to a neural network with only one neuron after the kind of particularly vicious overtraining that should have the owner in court on animal cruelty grounds, and I really could do with a reliable source to tell me whether it's a remotely fair comparison.
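For what it's worth, the arithmetic behind a header like the one above is just a sum of per-rule weights compared against a threshold, which is at least the shape of a single-neuron model. A rough C sketch follows. Only the 0.488 for SIGNATURE_DELIM comes from the rule listing above; the weights for FOR_FREE and USER_IN_WHITELIST are assumptions picked so that the total matches hits=-98.4, and the struct and function names are invented for the example.

```c
#include <stddef.h>

/* One rule that fired on a message, with the weight it contributes. */
struct rule_hit {
    const char *name;
    double score;
};

/* Sum the weights of every rule that fired. */
double total_score(const struct rule_hit *hits, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += hits[i].score;
    return sum;
}

/* A message counts as spam when the sum reaches the required threshold. */
int looks_like_spam(const struct rule_hit *hits, size_t n, double required)
{
    return total_score(hits, n) >= required;
}

/* The three tests named in the X-Spam-Status header above. */
const struct rule_hit example_hits[] = {
    { "SIGNATURE_DELIM",   0.488 },  /* from the rule listing */
    { "FOR_FREE",          1.112 },  /* assumed value, not from the source */
    { "USER_IN_WHITELIST", -100.0 }, /* assumed value, not from the source */
};
```

With those numbers total_score(example_hits, 3) comes to roughly -98.4, comfortably below the required 5.0. Whether weights found by a genetic algorithm make this a fair comparison with an overtrained neuron is a question the sketch does not settle.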

<dan-b> roll on, version control systems that allow directory renaming

Thu, 04 Jul 2002 02:55:36 +0000

<dan-b> roll on, version control systems that allow directory renaming
<dan-b> this whole thing with having to pick a name before starting work really sucks

entomotomy, Zool. the science of the dissection of insects to ascertain their structure, insect anatomy.

I'm sitting on a train, going to London

Fri, 05 Jul 2002 10:45:33 +0000

I'm sitting on a train, going to London. I don't actually want to go to London. I want to go to Didcot - not per se, I add hastily, but as the first target in a plan that subsequently involves boarding another train that will take me to Bristol. I joined this train under the impression that it also wanted to go to Didcot. Either I got the train destination wrong, or some other body (Railtrack, Network Rail, Oxford station managers, Thames Trains, the driver - it's hard to tell who's responsible these days) did - but the Railtrack web and wap sites both agree with me, and the information screens at the station likewise. Bum.

Train number two, then

Fri, 05 Jul 2002 12:27:22 +0000

Train number two, then. The two ticket inspectors who've looked at my ticket so far have not commented on the bit that says "Route: NOT READING". Perhaps that only restricts people who get off at Reading: I've now been through said station twice, in opposite directions, but have remained on the train. Or perhaps they could sense I was looking to start an argument and wisely didn't comment.

Well, at least I'm here, and in a talk

Fri, 05 Jul 2002 15:56:37 +0000

Well, at least I'm here, and in a talk ...

Wireless networking aside: the university has wireless access, but ...

First you get a username and password from the registration desk. Then you wrestle with it.

  1. apt-get install pppoe pptp-linux
  2. vi /etc/ppp/peers/dsl-provider: edit "user", remove lcp-echo stuff (the network is slow enough that you won't want to drop your connection every time it lags a bit), remove defaultroute option
  3. pon dsl-provider

    Bum. authentication isn't happening

  4. vi /etc/ppp/peers/dsl-provider
  5. vi /etc/ppp/peers/chap-secrets
  6. pon dsl-provider

    aha, that's better

  7. cat > /etc/ppp/peers/pptp

        user Gwxd962wRa noipdefault defaultroute

  8. route add -host 172.16.12.246 gw 172.16.10.250
  9. pptp 172.16.12.246 call pptp

A half hour wasted by spamassassin

Sat, 06 Jul 2002 15:44:13 +0000

A half hour wasted by spamassassin. I reattached to the wireless lan when I got here this morning, ran fetchmail, and watched my computer eat all its swap and drive the load average to something above 40 trying to start 70 copies of spamassassin. It's really time I sorted some kind of daemon-based approach; in the meantime I just disabled the thing. Perhaps I should replace it with some elisp/gnus stuff that I can actually trust.

A PHP talk: "Using PHP for large web sites". Can be summarised as "PHP is too evil to use for large web sites" (slightly longer summary: "PHP is far too evil to use for large web sites"), but was more entertaining in the long form. It was gratifying to have my uninformed three-hours-reading-the-docco opinions on PHP confirmed by someone who has actually used it in anger.

A Bugzilla talk: how to write a scalable featured database-backed bug tracking and management system.

Not that I think any of these are going to be requirements for Entomotomy in the near future, but we can hope.

They also have serious problems with low-quality bug reports, and a lot of their development (the UNCONFIRMED status, voting on bugs, etc) has been driven by the need to manage this. Hope this won't be our problem either in the near term.

Out-of-context quote of the day: "Bugzilla is a very forked project"

Bug 12411: Mozilla does not have a kitchen sink (look at the attachments)

An RT talk: RT is an insanely customizable ticket tracking system. But then, it's written in Perl, which (when done right) is almost sufficient to get you both of "customizable" and "insane" For Free.

Zope 3. You thought Zope was a web application server, but actually it's an "Operating System for the Network". I found the combination of buzzwords, warmth in the lecture theatre, and XML-like code in the slides to be unfortunately soporific, but as my network doesn't actually need an operating system I thought I'd not be missing anything earth-shattering if I slipped out early.

glibc 2.3: the existing C library locale model is inadequate

Sun, 07 Jul 2002 10:09:46 +0000

glibc 2.3: the existing C library locale model is inadequate for threaded applications. I can believe this but I'm not sure I see why the POSIX locale model is worth caring about anyway. The current thread-local storage API is ugly, so 2.3 introduces a new __thread keyword to C, which has the same approximate syntax as register or static. And people are using internal symbols (__setfpucw, anyone?) and getting broken by it on glibc upgrades, so from now on all the internal symbols are marked with the version GLIBC_PRIVATE.
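A minimal sketch of how the thread-local keyword reads in practice, assuming GCC's spelling __thread and POSIX threads. The names counter, worker and run_workers are invented for the example; nothing here is taken from the glibc sources or from the talk.

```c
#include <pthread.h>
#include <stddef.h>

/* Each thread gets its own copy of this variable. __thread sits next to
 * the storage class, much as static or register would. */
static __thread int counter = 0;

/* Bumps this thread's private counter and reports the final value. */
static void *worker(void *arg)
{
    int *result = arg;
    for (int i = 0; i < 1000; i++)
        counter++;              /* only this thread's copy changes */
    *result = counter;
    return NULL;
}

/* Runs two workers and returns the calling thread's own counter,
 * which the workers never touch. Returns -1 if a thread can't start. */
int run_workers(int *first, int *second)
{
    pthread_t a, b;

    if (pthread_create(&a, NULL, worker, first) != 0)
        return -1;
    if (pthread_create(&b, NULL, worker, second) != 0) {
        pthread_join(a, NULL);
        return -1;
    }
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return counter;
}
```

Each worker should end up with its own count of 1000 while the caller's copy stays at 0, with no pthread_key_create or pthread_getspecific in sight. Depending on the toolchain this may need building with -pthread.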

Two years after I first started hacking on CMUCLish stuff, I still

Wed, 10 Jul 2002 09:42:10 +0000

Two years after I first started hacking on CMUCLish stuff, I still don't understand the CMUCL compilation procedure.

(Actually I don't understand a lot of the SBCL compilation procedure either, but that's ok, because I don't need to - it Just Works (tm))

Eric Marsden is about to tell us all how it works. Once he manages to get the projector working with his Powerbook, anyway.

I've been lax updating this thing given the amount of stuff I

Fri, 12 Jul 2002 15:09:22 +0000

I've been lax updating this thing given the amount of stuff I potentially could have been saying in it lately. So, where are we?

We are at the Libre Software Meeting 2002, in the Very high-level languages for writing applications topic. The rationale here, according to the programme - look, if you would only go and read the programme, it would save me from having to paraphrase it like this - is that the kernel is done, so now we need to write applications, and we need these very high-level languages if we're going to finish the job in timescales that the human brain can relate to. Just ask any Mozilla developer.

In fact (as you can reasonably infer from the schedule) it's all a thinly-veiled plot to get many free Common Lisp developers together in the same place. Works For Me. So we've spent the last few days discussing things like garbage collection, coping with special (a.k.a. dynamically-bound) variables during multithreading, packaging, bug tracking, etc etc. And, as is often the case where two or more lisp implementors are gathered in the same room, the very high-level languages we've actually ended up using are C and x86 assembler. Yeah. Um. Christophe is sitting next to me working on floating point trap handling (see also the previous attempts to tackle this), Eric Marsden was - until an hour ago when he went to catch his train - poring over trace files to figure out why his CMUCL "small" images no longer worked, and I am watching SBCL build with a new design of non-invasive stack overflow handling. Currently it can non-invasively detect stack overflow, but it can't actually recover from same, which makes it a bit useless. Cargo cult programming is not a sane or sensible way to write x86 assembler glue.

And we've had some talks too, of which more some other time.

More on control stack exhaustion detection, then, seeing as it pretty

Sun, 14 Jul 2002 14:07:23 +0000

More on control stack exhaustion detection, then, seeing as it pretty much works.

The goal is that when a (probably newbie) user evaluates an infinitely recursive function, we handle it safely instead of just running out of space for our stack frames and crashing. In current SBCL releases we do this by inserting a call to %detect-stack-exhaustion at the start of each lambda in safe code, but that tends to increase the code size (internally, lambdas are used all over the place), and it's not very fast, and it doesn't work at all (it's turned off) when optimization qualities are set to prefer speed to safety.

So, why don't we do it by protecting a page near the end of the control stack and inserting another clause into the SIGSEGV handler that notices this, presents the user with an error message and lets them abort the computation? Sounds simple.

  1. In the first attempt, I forgot that the control stack on x86 grows downwards not upwards: protecting the top of the stack made for a really short-lived program. Clearly I have been hacking Alpha and PPC for too long, er, almost long enough.

  2. The second version fixed that, but the other weirdness in the x86 port is that the Lisp control stack uses the C stack pointer (there being a lack of registers on an x86 to keep pointers in, they decided to make %esp, %ebp do both). So, signal handler stack frames are also created on this stack. When we hit the guard page we get an infinite SIGSEGV recursion, because the kernel's trying to put sigcontext and siginfo structs onto the page to call our handler, and the page is still protected

  3. So, sigaltstack() seems like a good thing. All we need do is allocate a few pages somewhere else and make SIGSEGV handlers use it. Then, as our handler calls back into Lisp to give the user his aborts (this is scarily non-POSIX behaviour, but it works on all interesting platforms anyway), we'd better detect this unusual situation in the assembly glue that translates between C and Lisp calling conventions, and fix up the stack pointer to be pointing back into the usual Lisp control stack. After several attempts to write this without any knowledge whatsoever of x86 assembler (it turns out that jbe is "jump if below or equal", not "jump if bigger or equal") we actually have a Lisp that will detect when the stack overflows and drop the user into the debugger. Note for later: it would be wise if the error message cautioned him not to ask for a backtrace without limiting the number of frames printed. Ahem.

  4. Recovery is still a problem: it seems that stack unwinding wants to read the guard page. We fix this by changing the protection to protect against write only. Then we need to find somewhere sensible to re-enable the guard page: the 'ABORT' restart looks like a good place to do this. Then we need to disable the previous manually-performed stack checking, because it's getting in the way. Time for another full rebuild just to make sure it's really gone.

  5. Then, I think, we're pretty much there except for cleaning the patch up and defining some meaningful names in parms.lisp to replace the ickier magic numbers we've used. Purely to complicate the issue, the magic SIGSTKSZ (to which choice of name all I can say is ENOVWLS) is defined in <signal.h>, which file cannot be included in x86-assem.S, so we have to write a small C program to grab its value and print it out to a file that can be #included in assembler.

Oh Intel, we love you.

So why am I telling you all this? I don't know, but I felt I had to tell someone.

In other news, CLIM rocks

Sun, 14 Jul 2002 16:42:25 +0000

In other news, CLIM rocks.

Control stack exhaustion is still in a state of "pretty much"

Mon, 22 Jul 2002 13:31:04 +0000

Control stack exhaustion is still in a state of "pretty much" (i.e. "not") working. This past weekend was more or less the deadline for commits for the next SBCL release (0.7.6) so I thought I had better try it on something other than an x86 Linux box. First step, the Alpha. That worked OK, so, FreeBSDeers, prepare for battle.

(Why FreeBSD? There's one in the sourceforge build farm. The sourceforge build farm is sucky in entirely numerable ways, but at least it's there).

First hurdle: it uses SIGBUS instead of SIGSEGV for writes on unmapped pages. That's easy enough. Second problem, some screwup with overlapping mmaps that really ought not to (defining a signal stack in the middle of the dynamic space is bad, ok?). Third problem: the second time you hit the guard page, you get a SIGBUS followed immediately (before the sigbus handler runs, even) by a SIGILL. I don't know why.

Funniest URL of the day: this bit from the FreeBSD Developer's Handbook.

While a typical Windows application is attempting to do everything imaginable (and is, therefore, riddled with bugs), a typical Unix program does only one thing, and it does it well.

Sure. So I imagined cat -s and ls -C, or are they not typical Unix programs?

In the Olden Days when I was writing FTX13, I used periodically to

Tue, 23 Jul 2002 12:00:14 +0000

In the Olden Days when I was writing FTX13, I used periodically to report on new versions of LISA, a production rule system for Common Lisp. As with so much of the other stuff that went into FTX13 I'd never actually downloaded it to try for myself, so these reports tended just to recycle whatever the project release notes said. The accrued boredom and low-level guilt from doing this over a six month period was one of the factors in giving up FTX13.

So, there's a new version of LISA out today (1.3), and I still haven't looked at it, and I still feel vaguely uneasy about this. Anyway, you can probably determine whether it's useful for you: see e.g. mab-clos.lisp

Ha

Tue, 23 Jul 2002 17:29:52 +0000

Ha!

The FreeBSD problem was that it really didn't like us fixing up the stack pointer (to swap back to the normal control stack) by hand in assembly: when it's about to call the signal handler on subsequent occasions, it seems to remember that it's still on the alternate stack - note if you will that we aren't actually ever returning from the signal handler, we're just calling into lisp and letting lisp unwind past it. I assume it's storing the running-on-external-stack-p property somewhere instead of just comparing the stack pointer with the bounds of the alternate stack.

Either we fix this somehow, probably involving the perusal of FreeBSD kernel sources, or we think of a new approach. The latter sounds tempting ...

Here's how to do it, then.

  1. In the signal handler, fake a control stack frame in much the same way as fake_foreign_function_call - the function that prepares us for calling into Lisp after a signal is received - would have done. This is actually a no-op on x86 anyway.

  2. Frob the PC in the signal context, plus sundry other registers to make it look plausible, so that the signal handler "returns" to our error-raising function. This is the lovely piece of code

        function=
           &(((struct simple_fun *)
              native_pointer(SymbolFunction(CONTROL_STACK_EXHAUSTED_ERROR)))
             ->code);
        *os_context_pc_addr(context)= function;
    

  3. Amazingly, this works. And it lets us delete all that awful x86 assembler, which pleases me no end as I feared that I'd be stuck answering questions when it turned out to be buggy.

Patch going to sbcl-devel mailing list and/or being committed (I'll decide which after I've read it again and formed an opinion on the essential/accidental grottiness ratio)

ICanCAD is a GPLed

Tue, 23 Jul 2002 21:33:26 +0000

ICanCAD is a GPLed CAD editor for analog(ue) and mixed-mode signals. It's built using Allegro CL and their CLIM, which unfortunately makes it difficult for anyone else to hack on it unless they have the full Allegro system - CLIM doesn't come with the eval version. But you can download the ICanCAD Linux/x86 binary and play with it for yourself, which makes it kind of rare as CLIM apps go. Requires Motif 2, which is in Debian - albeit in non-free. Here's a completely unattractive screenshot of ten minutes wild flailing around.

What was it I said the other day? Oh, yeah, ``CLIM Rocks''. Well, um. It could rock even harder if they didn't all want to use that ugly Courier font all the time. Does Not Make For Pretty Pictures. Contrast with, say, Closure (that's from McCLIM, probably quite a recent snapshot of) and you'll see what I mean. Come on guys, give us stuff we can make screenshots from.

SBCL 0.7.6 is out, doesn't include my stack checking fixes. SBCL 0.7.6.2 is in CVS, and does. And I even built it on PPC to check they work there too.

I've just spent the last hour reading the SBCL generational garbage

Wed, 24 Jul 2002 13:26:46 +0000

I've just spent the last hour reading the SBCL generational garbage collector in Borders' cafe, while thinking "I'm getting remarkably good battery life here". Too late I realise that I'm looking at the wrong percentage figure in the emacs mode line - that 91% is a measure of how much I've read, not how many of the electricity beans I've retained. That would be 42%. Doh

Aforementioned cafe hacking was in Borders on Oxford St (in London),

Wed, 24 Jul 2002 16:37:46 +0000

Aforementioned cafe hacking was in Borders on Oxford St (in London), after a shorter-than-expected meeting with the accountant. After finishing my battery and my orange juice (and confirming that both of my wireless cards work, insofar as the ability to find the access point for some random local wireless network denotes workingness) I went off in search of electricity.

Twenty minutes later I was accosted by an employee who told me that I couldn't plug that in there. She was perfectly polite about it, and maybe it is official policy there, but it seems kind of an odd policy when so many other places (airports, train stations, even the train I took to London) either tolerate or encourage it. Oh well.

So I came home instead.

While we are on this subject, this kernel change on the sourceforge

Mon, 29 Jul 2002 17:54:28 +0000

 While we are on this subject, this kernel change on the sourceforge
 machine meant that I had to transfer my ppc testing laboratory to Dan's
 iMac. And I found another nasty surprise waiting for me there. On the SF
 RS/6000 PPC, the floating point modifications worked as expected, giving
 the right kind of exceptions in the right circumstances. On Dan's iMac:
 
 * (/ 1.0 0.0)
 
 1.0 ; should signal DIVIDE-BY-ZERO

(Christophe's email to sbcl-devel). So, somehow, I got nominated to look at it.

I haven't actually fixed it yet. The immediate problem is that it's not sufficient to twiddle the FPSCR to enable floating point traps: you also have to set two bits in MSR (which, incidentally, you can't directly, as it's a privileged register). That's not actually a major problem, because there's a neato glibc function called feenableexcept() that does this (strace suggests that it works by installing a signal handler for SIGUSR1 that frobs the on-stack MSR, then doing kill(getpid(), SIGUSR1)). What is a major problem is getting it to stay set, because it gets reset in signal handlers, and restored when the handler returns - which punts us back into last week's problem, that half of the time, we don't return from said handlers in any conventional kind of way. Um. More details here