diary at Telent Netowrks

Did I say "restored when the handler returns"#

Mon, 05 Aug 2002 11:46:57 +0000

Did I say "restored when the handler returns"? What doesn't get restored when the handler returns - no matter how the return is done, or longjmped past, or ignored completely, or whatever - are the floating point trap bits themselves. This behaviour is common to at least x86, ppc32, and ppc64, and the kernel people think it's the right thing to do. So, we may as well get used to it.

Not that I have any pressing wish to do anything about it immediately, because the MSR-frobbing aspect still needs a conventional(sic) return from the signal handler to set up right, which we don't do at present, so probably we need to do some more control stack frobbing stuff.

So, instead I turned my attention to threading. After some discussion at the LSM we decided that co-op userland threads were just going to be not nearly as exciting as "proper" threads. General consensus regarding pthreads was also fairly rapidly arrived at ("let's not go there"), so lately I've been thinking about using clone(). This presents a number of issues

Remember all those antics last month with alternate signal stacks and SIGSTKSZ and the rest#

Wed, 07 Aug 2002 01:43:38 +0000

Remember all those antics last month with alternate signal stacks and SIGSTKSZ and the rest of it?

       starting address and size  of  the  stack.   The  constant
       SIGSTKSZ  is defined to be large enough to cover the usual
       size requirements for an alternate signal stack,  and  the
       constant  MINSIGSTKSZ defines the minimum size required to
       execute a signal handler.

SIGSTKSZ on x86 linux with whatever version of glibc I have installed here (2.2? whatever debian unstable had in it last month) is not actually big enough to call printf. Granted this is in general not a completely great idea anyway as printf is probably not reentrant, but it is somewhat disconcerting to switch on the "debug cold init by spewing messages to stderr" switch and get a whole different (and rather faster) failure mode.

We do entire garbage collections inside signal handlers (admittedly not on x86, so this specific problem doesn't arise there). Why do I feel queasy about this?
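For reference, setting up an alternate signal stack looks roughly like the sketch below. It is a minimal illustration, not the SBCL runtime's code: the function names, the caller-chosen size (rather than trusting SIGSTKSZ) and the address-of-a-local trick for checking which stack the handler landed on are all assumptions made for the example.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <signal.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical bookkeeping so the handler can tell which stack it is on. */
static char *alt_base;
static size_t alt_size;
static volatile sig_atomic_t ran_on_alt_stack = 0;

static void probe_handler(int signo)
{
    char probe;                        /* a local lives on the current stack */
    uintptr_t p = (uintptr_t)&probe;
    uintptr_t lo = (uintptr_t)alt_base;
    (void)signo;
    ran_on_alt_stack = (p >= lo && p < lo + alt_size);
}

/* Allocate `size` bytes and register them as the alternate signal stack. */
int install_alt_stack(size_t size)
{
    stack_t ss;
    assert(size >= (size_t)MINSIGSTKSZ);   /* the documented lower bound */
    alt_base = malloc(size);
    if (alt_base == NULL)
        return -1;
    alt_size = size;
    ss.ss_sp = alt_base;
    ss.ss_size = size;
    ss.ss_flags = 0;
    return sigaltstack(&ss, NULL);
}

/* Install the handler for `signo`; SA_ONSTACK asks for the alternate stack. */
int install_probe_handler(int signo)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = probe_handler;
    sa.sa_flags = SA_ONSTACK;
    sigemptyset(&sa.sa_mask);
    return sigaction(signo, &sa, NULL);
}
```

Calling install_alt_stack(64 * 1024), then install_probe_handler(SIGUSR1), then raise(SIGUSR1) should leave ran_on_alt_stack set. How much headroom a handler really needs depends on what it calls, which is exactly the problem with printf above.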

The symptom: we're getting a SIGSEGV due to writing into a write-enabled page#

Fri, 09 Aug 2002 00:25:51 +0000

The symptom: we're getting a SIGSEGV which (from looking at the arguments to the handler) appears to be due to writing into a write-enabled page. Yes, I did say enabled.

The cause: I'd written (void *)foo-1 instead of (void **)foo-1
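To see why the cast matters, here is a small sketch of the two kinds of pointer arithmetic. Arithmetic on void * is a GCC extension that steps one byte at a time; the sketch spells that with char * so it is also valid standard C. The function names are made up for the illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* What GCC does with (void *)foo - 1: void is treated as having size 1,
   so the result is one byte lower. Written with char * to stay portable. */
uintptr_t step_back_as_void_ptr(void *foo)
{
    return (uintptr_t)((char *)foo - 1);
}

/* (void **)foo - 1: step back one whole pointer-sized word. */
uintptr_t step_back_as_word_ptr(void *foo)
{
    return (uintptr_t)((void **)foo - 1);
}

/* Distance in bytes between the two results for the same starting address. */
size_t misalignment(void *foo)
{
    uintptr_t byte_wise = step_back_as_void_ptr(foo);
    uintptr_t word_wise = step_back_as_word_ptr(foo);
    assert(byte_wise > word_wise);
    return (size_t)(byte_wise - word_wise);
}
```

Given the address just past the top of a word-aligned stack, the word-wise version lands on the last word, while the byte-wise version lands in the middle of it, so everything read from there onwards is misaligned.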

The intervening steps (in reverse order)

  1. The SIGSEGV was actually from executing an iret instruction, and nothing (much) to do with write-enabled pages.

  2. The iret was in a part of memory mostly filled with zeroes. In x86 assembler, zeroes disassemble to add %al,(%eax) which while pointless is basically harmless, so we really didn't know for how long it had been dashing through snowfields by the time it got there

  3. So, perhaps we should have a look at the stack. Here is your five minute guide to interpreting sbcl x86 stack traces:

    esp            0x403ff84c       0x403ff84c
    ebp            0x403ff870       0x403ff870

    0x403ff840:  0x00000008  0x00000008  0x0cafd99c  0x00000004
                                                     ^ sp
    0x403ff850:  0x403ff850  0x0500000b  0x0d659a19  0x0caffeff
                                                     ^ return addr
    0x403ff860:  0x0cacc14b  0x00000000  0x0a32fbc0  0x403ff890
                                                     ^ prev frame
    0x403ff870:  0x0cace37f  0x0cacc0a3  0x0caa778b  0x00000008
                 ^ ebp
    0x403ff880:  0x0a330f14  0x00000004  0x0a330f14  0x403ff8d0
    0x403ff890:  0x403ff8b4  0x0500000b  0x09463627  0x403ff874
    0x403ff8a0:  0x00000014  0x0b3effd1  0x0cacc0a3  0x0500000b

    Start at ebp. The preceding word (address ebp-4) gives the ebp for the previous frame. Four words prior to that is the lisp return address (raw untagged address: x86 insns aren't all the same length after all)

  4. 0x0caffeff was full of zeroes. So were the code pointers for the preceding several frames

  5. Why do we have apparently correct control frames which contain such obviously bogus return addresses? Well, what if there were valid code there originally, which got moved by, say, GC? Like, the GC that just occurred a minute ago

  6. Oh look, we're scavenging the control stack from 0x403ffffe rather than 0x403ffffc as we should. See `The cause', above. Whatever values we look at from that angle, it's a pretty good bet they won't be recognisable lisp pointers. Duh.
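The frame layout described in step 3 can be written down as a few lines of C that read one frame out of a copy of the dump. This is only a sketch of the rule as stated there (previous ebp at ebp-4, return address four words before that); the struct and function names are invented, and dump_words simply holds the first four rows of the dump above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* A copy of part of the stack: words[i] is the 32-bit word that lived at
   address base + 4*i. */
struct stack_image {
    uint32_t base;
    const uint32_t *words;
    size_t nwords;
};

struct frame_info {
    uint32_t prev_ebp;     /* word at ebp-4 */
    uint32_t return_addr;  /* four words before that, i.e. ebp-20 */
};

/* The rows 0x403ff840 .. 0x403ff87c from the dump. */
const uint32_t dump_words[16] = {
    0x00000008, 0x00000008, 0x0cafd99c, 0x00000004,
    0x403ff850, 0x0500000b, 0x0d659a19, 0x0caffeff,
    0x0cacc14b, 0x00000000, 0x0a32fbc0, 0x403ff890,
    0x0cace37f, 0x0cacc0a3, 0x0caa778b, 0x00000008,
};

/* Fetch the word at addr; the address must lie inside the image. */
static uint32_t word_at(const struct stack_image *img, uint32_t addr)
{
    assert(addr >= img->base && (addr - img->base) % 4 == 0);
    assert((addr - img->base) / 4 < img->nwords);
    return img->words[(addr - img->base) / 4];
}

struct frame_info read_frame(const struct stack_image *img, uint32_t ebp)
{
    struct frame_info f;
    f.prev_ebp = word_at(img, ebp - 4);
    f.return_addr = word_at(img, ebp - 4 - 4 * 4);
    return f;
}
```

Reading the frame at ebp = 0x403ff870 from an image based at 0x403ff840 gives the previous frame 0x403ff890 and the return address 0x0caffeff, matching the annotations in the dump.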

Now I'm back at the state of having lisp which actually can build PCL and dump a core, I guess I should look at my new bind vop and see if it's doing anything yet

Language shapes the way we think#

Sat, 10 Aug 2002 14:53:00 +0000

Language shapes the way we think. This is easy to believe after several days reading and writing x86 assembler.

So we're talking about the new bind vop and related bits. The issue here is dynamic variable bindings: please skip the next two paras if you feel like it.

In Common Lisp, as in most languages, variables are usually lexically bound. That is, they're declared in the current function, or in the function textually enclosing it, or in the function enclosing that, and so on out to global. The alternative, dynamic binding, is that when a variable is not found in the current environment, we look at the environment of the caller, and if we can't find it there we work our way up the call stack. Lexical bindings make it a whole lot easier to see statically what's going on (your function behaves the same no matter who called it) which is generally considered better for everyone concerned (humans and compilers).

Which is not to say that dynamic binding is pointless. If you want to do something like pretty-print a tree using a recursive function, you might have a whole bunch of variables describing where to draw the next subtree (left edge, right edge, scaling factor, etc etc) which change as you descend a branch and which you need to restore as you unwind back up the tree. Dynamic binding gives you this behaviour For Free (as does using lots and lots of function arguments, but that gets kind of unwieldy when you realise you need to add another argument at every call site). So, CL (like Perl, in fact) provides both kinds of variable in the language. For historical reasons, we call the dynamic variables specials, and usually we mark their names with asterisks (*like-so*) to alert the programmer to what's going on. All clear?

OK, break over. Settle down, class.

If we want to implement dynamic binding in a fashion that makes variable lookup reasonably fast, we do it with a slot in each symbol that stores the current value, and a stack of (variable -> previous value) pairs. To rebind the variable (this is what the bind vop is for) we push the current value on the stack and store the new value in the slot. To unbind, we pop the stack and set the value slot back to whatever it was. That's unbind.
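In C terms the scheme comes out something like the sketch below. It is an illustration of the idea only, not the vop itself: the type names, the fixed-size stack and the use of int as a stand-in for a Lisp value are all assumptions.

```c
#include <assert.h>
#include <stddef.h>

typedef int lispobj;              /* stand-in for a tagged Lisp value */

struct symbol {
    lispobj value;                /* the slot holding the current value */
};

struct binding {
    struct symbol *symbol;        /* which variable was rebound */
    lispobj old_value;            /* what it held before */
};

#define BINDING_STACK_SIZE 64
static struct binding binding_stack[BINDING_STACK_SIZE];
static size_t binding_sp = 0;     /* index of the next free entry */

/* Rebind: save the current value, then install the new one. */
void bind_symbol(struct symbol *sym, lispobj new_value)
{
    assert(binding_sp < BINDING_STACK_SIZE);
    binding_stack[binding_sp].symbol = sym;
    binding_stack[binding_sp].old_value = sym->value;
    binding_sp++;
    sym->value = new_value;
}

/* Unbind: pop the most recent binding and restore the saved value. */
void unbind_symbol(void)
{
    struct binding *b;
    assert(binding_sp > 0);
    b = &binding_stack[--binding_sp];
    b->symbol->value = b->old_value;
}
```

Lookup never touches the stack: reading sym->value is all there is to it, which is the point of the design.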

For a single-thread implementation this is great. You can code setting a symbol's value as

(storew value symbol symbol-value-slot other-pointer-lowtag)

which translates to something dead simple like "store value in the location given by symbol+k" (FSVO k equal to symbol-value-slot*4-other-pointer-lowtag: the symbol pointer is tagged, so the lowtag comes off again when we address a slot).
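The address arithmetic for a tagged pointer can be sketched as follows. The constants are assumptions chosen for the example (a 32-bit build, a lowtag of 7, the value in slot 1 after a header word); the real numbers come from the implementation's own headers. The sketch assumes a tagged pointer is the object's address with the lowtag in its low bits, so reaching a slot means adding the slot offset and taking the lowtag off.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed constants for a 32-bit build; illustrative values only. */
enum {
    WORD_BYTES = 4,
    OTHER_POINTER_LOWTAG = 7,     /* assumption */
    SYMBOL_VALUE_SLOT = 1         /* assumption: slot 0 is the header */
};

/* Tag an aligned object address by putting the lowtag in the low bits. */
uint32_t tag_pointer(uint32_t object_addr)
{
    assert(object_addr % 8 == 0); /* low three bits must be free */
    return object_addr + OTHER_POINTER_LOWTAG;
}

/* Displacement to add to a tagged pointer to reach the given slot. */
int32_t slot_displacement(int slot)
{
    return slot * WORD_BYTES - OTHER_POINTER_LOWTAG;
}

/* Untagged address of the given slot of a tagged object pointer. */
uint32_t slot_address(uint32_t tagged, int slot)
{
    return tagged + (uint32_t)slot_displacement(slot);
}
```

With these assumed values the displacement for the value slot is -3, so a symbol whose storage starts at 0x0a330f10 is referred to as 0x0a330f17 and has its value at 0x0a330f14.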

Obviously this doesn't work when you have many threads all wanting to bind the same symbols. If you have userland threads you can make the thread switch unwind and rewind the binding stack. If the kernel is doing the context switch you can't really make it do this for you, though. If the machine is SMP, there may not even be a context switch to happen: you could actually have two cpus executing lisp code simultaneously. So, you need some kind of per-thread storage area and a slot in the symbol to store an offset into this area
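A sketch of the per-thread storage idea, under several assumptions that are not in the text above: index 0 means "no slot assigned yet", a sentinel marks slots a thread has not bound, lookups fall back to a global value, and index allocation is not made thread-safe. All names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef int tl_value;                 /* stand-in for a Lisp value */
#define TL_UNBOUND (-1)               /* hypothetical "no per-thread value" marker */
#define TL_SLOTS 32

struct tl_symbol {
    tl_value global_value;            /* seen by threads that haven't rebound it */
    size_t tls_index;                 /* offset into each thread's storage; 0 = none */
};

struct tl_thread {
    tl_value storage[TL_SLOTS];       /* the per-thread storage area */
};

static size_t next_tls_index = 1;     /* index 0 is reserved to mean "not assigned" */

void tl_thread_init(struct tl_thread *th)
{
    size_t i;
    for (i = 0; i < TL_SLOTS; i++)
        th->storage[i] = TL_UNBOUND;
}

/* Give the symbol a per-thread slot the first time any thread binds it.
   A real implementation would need to do this atomically. */
static size_t tl_ensure_index(struct tl_symbol *sym)
{
    if (sym->tls_index == 0) {
        assert(next_tls_index < TL_SLOTS);
        sym->tls_index = next_tls_index++;
    }
    return sym->tls_index;
}

/* Store a value for this thread only. */
void tl_set(struct tl_thread *th, struct tl_symbol *sym, tl_value v)
{
    th->storage[tl_ensure_index(sym)] = v;
}

/* Read this thread's value, falling back to the global one. */
tl_value tl_get(const struct tl_thread *th, const struct tl_symbol *sym)
{
    if (sym->tls_index != 0 && th->storage[sym->tls_index] != TL_UNBOUND)
        return th->storage[sym->tls_index];
    return sym->global_value;
}
```

Two threads that share a symbol then see independent values: a tl_set in one leaves the other still reading the global value. The cost is an extra indirection on every lookup compared with the single-thread store.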

So, we have three options

Now I have to stop writing x86 assembler for a while, and start writing English prose - or some approximation thereto, anyway. The ILC people are probably expecting a paper some time between now and Thursday

Quiet lately#

Wed, 14 Aug 2002 02:00:58 +0000

Quiet lately. Writing English prose is just not as exciting as code - especially when it's either a conference paper or a C.V. (US: resumé) update. Yes, time to find more paid stuff. (with-blatant-commercial-opportunism "If you're looking for contract programming/consultancy in a CL- or Linux-related field, send email. CV on request")

Last night it was pointed out to me that the lyric I'd been hearing for the last 4 years as "He's got an ice lolly" (Propellerheads, Velvet Pants, Decksandrumsandrockandroll) probably in fact doesn't say that at all. Hmm. Still sounds like it to me, though.

Paul Graham, A Plan for Spam#

Fri, 16 Aug 2002 12:03:14 +0000

Paul Graham, A Plan for Spam.

I used spamassassin for a while, but removed it temporarily when it started eating my computer after being reintroduced to 200 emails at once when I'd been away from the net. And I haven't replaced it since, because I quite quickly realised that with an approximate 20:1 spam to real mail ratio after filtering out mailing list stuff, it's actually simpler to delete spam from the inbox by hand these days than it is to check the spam folder for false positives (which may be a couple of orders of magnitude rarer, and so much easier to miss). So, I don't have any filtering any more.

Probabilities better than scores? As raph pointed out, you can take logs of the probabilities and get scores, but I don't think that's the issue. The interesting point is how you arrive at the per-word numbers in the first place, and the advantage of the Bayesian system is that it's transparent. Assuming current styles of email communication, I doubt that you will see Paul Graham's webmail system decide that a valid signature delimiter is an indicator of potential spam.
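The equivalence between probabilities and scores can be worked through in a few lines. This assumes the combining rule is the usual naive Bayes one, which as far as memory serves is the one the essay uses; the symbols p_i, s_i and P are introduced here for the illustration.

```latex
% Assumed combining rule for per-word spam probabilities p_1, ..., p_n
P = \frac{\prod_i p_i}{\prod_i p_i + \prod_i (1 - p_i)}

% Divide P by 1 - P: the common denominator cancels
\frac{P}{1 - P} = \prod_i \frac{p_i}{1 - p_i}

% Take logs: the product becomes a sum of per-word scores s_i
\log\frac{P}{1 - P} = \sum_i s_i, \qquad s_i = \log\frac{p_i}{1 - p_i}

% Worked case with two words, p_1 = 0.9 and p_2 = 0.8
P = \frac{0.9 \times 0.8}{0.9 \times 0.8 + 0.1 \times 0.2} = \frac{0.72}{0.74} \approx 0.973
```

So thresholding the combined probability is the same as thresholding a sum of additive scores, which supports the point that the difference lies in where the per-word numbers come from.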

But on a more general note, I think that Paul's "Defining spam" appendix is a pretty good indication that we have terminology problems. What he's built is not in itself a spam filter, it's an uninteresting-mail filter - actually a far more useful tool - and if he were to refer to it as such, a lot of the borderline cases would go away. Domain renewals are interesting to me: offers from Verisign for a Free E-Commerce Web Site are not. It doesn't matter if they think I've opted in or if I have an existing relationship with them: the point is that I don't want it, and I don't need to define it as spam before deciding to filter it.

I define spam as persistent or large scale sending of email in which there is no reasonable expectation that the recipients will be interested.

This is not a good definition for computers to use; they tend to choke on words like `reasonable' - but that doesn't matter. Computers on the receiving end are just filtering for interestingness anyway and don't need to care if it's spam. Computers in the network are primarily concerned with abuse of the net, so they don't need to care if it's spam either. If it's relaying through my servers, or faking its origin, that's a good enough reason to stop it no matter what the message content.

Use of automation is a characteristic of much spam, but it's not essential or even exclusive. Suppose someone at Amazon has determined from reading my web pages that I like the Propellerheads, and sent me email to say that they have the new album at half price. That's welcome news to me, and it makes no difference whether they sent the same email to a million other people (we assume that they'd determined that those other million were equally interested). On the other hand, you can hand-letter your offer of cheap toner cartridges on vellum with a quill ten thousand times and send me six of the copies (each to slightly different mailboxes which are all too clearly routed to the same eventual destination) by courier delivery, and it still represents the large-scale sending of mail where you clearly made no effort to determine whether the recipients were interested. Spam.

For the record, I don't want to know about toner cartridges.

Yesterday was Bletchley Park#

Mon, 19 Aug 2002 16:49:08 +0000

Yesterday was Bletchley Park. 35 miles is slightly over twice as far as anywhere I've cycled in the past two or three years, so I was quite pleased to get there in around two hours twenty minutes, especially as it turned out to be 40 miles including getting lost. 18 mph average is, I think, fairly respectable.

The Bletchley Park Computer Museum was kind of neat, but they would benefit (well, I would have benefitted) from the addition of a LispM or several.

Eventually it was time to leave, and at some point around here the realization that the return journey was likely to be significantly slower hit me. Three hours twenty, for an average speed slightly less than 13mph, and it didn't help to run out of water halfway back either. On getting to the outskirts of civilization (Headington) I found an open off-licence which would sell me a bottle of water and a Twix bar: thus rehydrated the final two miles were easy. Of course, most of them were downhill too, which didn't hurt.

Yesterday evening I had planned to go to the pub, but found I was basically too tired and went to bed in fairly short order after a really odd experience where someone sent me an encoded message I couldn't break, which sounded exactly like my phone ringing. Current working hypothesis is that perhaps it was actually just the phone ringing, and my brain, had it been present, would have been telling me I was already half-asleep and should turn the lights off and so forth.

Feeling remarkably well today, anyway. Mildly sunburnt, but surprisingly at least not stiff and aching all over. Maybe that happens tomorrow.

Today is deal-with-NTL day. NTL cable modems work fine until they go wrong. When they go wrong, trying to get a human being on the telephone can take most of a day. To fully express my feelings on the matter of the NTL customer service voicemail system would require the invention of several new words, but in the meantime, imagine circular voicemail systems, "your call is valuable to us", "all our operators are busy, please ring back" (after ten minutes navigating voicemail options, lovely), and enough slightly-out-of-sync customer databases that every time I ring them I learn about the existence of another one. Today I found I was in the Cable Modem technical support database with the correct postcode, but didn't show up when they did a postcode search. Or, for that matter, a search on subscriber name. My modem was in there and showed up perfectly normally on a MAC address search, but there was no link between it and me. I thought the purpose of takeovers and mergers was supposed to be to increase profit by integrating systems, not just amassing large numbers of disparate ones?

Anyway, the internet is b0rken, has been since some time on Sunday morning, and probably will be until someone at the NTL local office (which may or may not exist, because every attempt I've made to call it has been met with "number not recognized" or forwarded into the national system) gets a message to phone me back to arrange an engineer to visit. Not holding out much hope here, it must be said. If you send me email, expect to receive answers a little more slowly than usual: I've had to dust off the old analogue modem.

Today is also, it happens, my birthday, not that I have any particular plans to celebrate being another year closer to death. But anyway, spoils of war so far: teatowels, set of torture^Wbarbecue implements, sundry cards, and volume 3 of the IA32 Intel Architecture Software Developer's Manual. I think that was sent without any particular reference to the time of year, but thank you anyway Mr Intel.

The CL pathname system is mostly pretty neat#

Tue, 27 Aug 2002 00:00:23 +0000

  1. #P"/etc/init.d/apache" is an instance of class #<SB-PCL::STRUCTURE-CLASS PATHNAME>.
     The following slots have :INSTANCE allocation:
      HOST         #<SB-IMPL::UNIX-HOST {5010EB9}>
      DEVICE       NIL
      DIRECTORY    (:ABSOLUTE "etc" "init.d")
      NAME         "apache"
      TYPE         NIL
      VERSION      :NEWEST

  2. (pathname-directory my-path)
     (:ABSOLUTE "etc" "init.d")

Not all of the slots are useful on all possible systems: most Unix-based Lisps don't understand about any host other than the local one, for example. device is a bit useless on Unix too. But that's ok, it's there for when you need to manipulate pathnames on VMS boxen. Plus Unix doesn't really have file types as such; the foo.bar convention really is just a convention, so it's pretty much non-obvious what (pathname-type #p"foo.bar.baz") is without referring to your implementation. But overall it's a nifty facility that easily beats doing your own tokenizing for "/" characters.

Problem is, flushed with their success in providing mostly-useful pathnames, the ANSI people got a bit carried away and went on to invent these things called logical pathnames. At first sight these look really useful. Logical pathnames get their own hosts, and when you try to open them they go through a pattern-matching exercise to get mapped to customizable places in the real filesystem. For example

Note how the different file types (extensions) have caused it to go to two different places. Cool, huh?

Actually, No. Not very cool at all, when you start trying to actually use them. Let me just explain the rules which govern when you can use logical pathnames without getting very surprised thirty minutes later:

That's about it: pretend that you've got a filesystem image loopback mounted at that point that only Lisp can look inside, and your expectations will be approximately correct.

Example: the only reason that it looks like I've accessed lowercase files using this is that (a) lowercase names in LPNs are silently folded to uppercase, (b) the translation process to physical pathnames on Unix does case inversion.

Cool, huh?