diary @ telent

By the time you read this I'll be back home again after the#

Sun Aug 3 22:00:33 2003

Topics: lisp sbcl

By the time you read this I'll be back home again after the UKUUG's Linux 2003 summer conference, at which I was speaking on "Native threads for SBCL". The paper is at http://www.linux.org.uk/~dan/linux2003/; I'm told it's also on the conference CD and may also end up on the UKUUG web site eventually.

And the slides. Well ...

On Wednesday I roughed out how many slides I should do (one for every two minutes) and the titles. By mid-afternoon on Friday I'd more or less decided on the body content too. This was all in a text file, so it should have been easy to turn into a presentation, were it not for some fairly strange requirements I have for a usable presentation graphics tool.

So, after the Scottish Buffet (a Scottish Buffet, it appears, is what we in England call a "sit-down meal", although no less tasty for all that) on Friday evening I headed back to my room, paged through the slide text for a few minutes, and when I next looked up it was early on Saturday morning and I'd written the bare bones of a presentation graphics program using CL - my first CLX program, too. (Yes, that's right, I maintain (sic) a CLX port for SBCL despite never yet having written anything that uses it.)

Matters arising, in no particular order

More about the conference itself later.

Having had the same email address since sometime in 1996, and being a#

Wed Aug 6 20:09:37 2003

Topics:

Having had the same email address since sometime in 1996, and being a fairly frequent poster on usenet, I get a lot of spam. Enough, in fact, that I can delete it by hand faster and more accurately than spamassassin can, even just looking at the subject/sender name. (Yes, that's right, I got annoyed with my spam filters and turned them off again.)

Anyway, I thought this one was funny:

Envelope-to: webmaster@telent.net
Subject: telent.net
To: webmaster@telent.net
From: Leslie Oneal<leslieoneal@link-builder.com>

I am contacting you about cross linking. I am interested in telent.net because
+it looks like it's relevant to a site that I am the link manager for. The site
+is about search engine optimization research and technology for top search
+engine placement.

I keep the web address confidential and will send it to you only if you give me 
+permission to do so. Just let me know if it's OK, and I'll send you the web   
+address for your review. If you approve of the site, then we'll exchange links.
[---]

Now, I'm no more of an expert in "search engine optimization" than the next man, tending to believe that you can get some pretty good rankings on search engines simply by providing useful or interesting content that people want to link to, but even I know that omitting to tell people the address of your site is not the best way to increase traffic to it.

So I guess the question is: exactly how gullible does one have to be to fall for this rubbish?

http://www.function-pointer.org/#

Thu Aug 7 15:27:46 2003

Topics:

http://www.function-pointer.org/. Can you say "Greenspun's Tenth Rule of Programming"?

What I should be doing: hacking on CLiki to allow two cliki instances#

Fri Aug 8 20:11:25 2003

Topics: cliki

What I should be doing: hacking on CLiki to allow two cliki instances to share the same pages (and the same methods for search-term-relevance and the rest), so that we can have multiple `skins' for the same data.

What I'm actually doing: finishing up and committing the timezone/DST stuff I was talking about the other day. Then, probably, having a shower before going to the pub.

One of my ambitions w.r.t Lisp packaging is to make the ancient#

Sun Aug 10 20:48:20 2003

Topics: lisp sbcl cliki asdf

One of my ambitions w.r.t Lisp packaging is to make the ancient CMU Common Lisp on Linux document obsolete.

OK, so I could achieve that (and in fact, have) just by leaving it three years without an update, so perhaps I should restate that goal more clearly: I want to make this document (and others like it) unnecessary.

Anyway, another step further down that road today, with the release of asdf-install into an unsuspecting SBCL CVS tree. Now we can download and install CCLAN modules from the command line (or from inside a running Lisp).

This week, Jakob Nielsen's #

Mon Aug 11 17:30:00 2003

Topics:

This week, Jakob Nielsen's Alertbox says

Excessive word count and worthless details are making it harder for people to extract useful information. The more you say, the more people tune out your message.
then goes on to prove the point with another 500 words.

Once again I find myself in that place where something that sounds#

Fri Aug 15 16:52:26 2003

Topics: lisp sbcl

Once again I find myself in that place where something that sounds like a half-hour hack turns out to take all day and still doesn't work. This time it's implementing interrupt-thread (traditionally called process-interrupt), and, being lazy, I'm going to point you at the sbcl-devel article I wrote earlier this afternoon. Note two things since that mail

Oh, and for completeness' sake, the test program

The syscall problem indeed wasn't the problem - or even a#

Sun Aug 17 03:43:17 2003

Topics: lisp sbcl

The syscall problem indeed wasn't the problem - or even a problem.

The real problem was that we weren't saving/restoring a pile of stuff (most immediately obvious gap: floating point modes) that we should have been. This was fixed by calling call_into_lisp(function) instead of function directly. And a certain amount of messing around with our fake stack to make it both correct and intelligible to SBCL's backtrace.

The non-problem was that during an interrupted select() call, eax was changing from 514 (as seen by ptrace when the signal went off) to 4 (as seen in the signal handler). The former is ERESTARTNOHAND, the latter is EINTR. It's kind of interesting that ptrace() gets to see (what as far as I can tell is) the kernel-internal value of eax, but I can't right now think of any way this might be exploited, so oh well.

return_to_lisp_function (soon to be renamed arrange_return_to_lisp_function) is now in CVS.

Having successfully thought myself back into the role of SBCL#

Mon Aug 18 12:51:40 2003

Topics: sbcl cliki lua

Having successfully thought myself back into the role of SBCL maintainer on Saturday, I decided it would be nice to fix the longstanding (well, few-months-old) bug that finalisers don't. This is actually a symptom of a more general problem: that I removed all the GC hooks completely because the way we were running them wasn't thread-safe.

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  Command           
17849 dan        9   0  150m  72m  896 S  0.0 58.5  10:17.89 gdb               

(gdb) info bre
Num Type           Disp Enb Address    What
9   hw watchpoint  keep y              *(int *) 1079316548
(gdb) cont
Continuing.
Error evaluating expression for watchpoint 9
Error accessing memory address 0x40551044: No such process.
Watchpoint 9 deleted.
Couldn't write debug register: No such process.
(gdb) ^D
A debugging session is active.
Do you still want to close the debugger?(y or n) y
Detaching from program: /home/dan/src/sourceforge/sbcl/src/runtime/sbcl, process 17967
ptrace: No such process.
(gdb) 

It kind of works. It works up until the point that the pseudo-atomic-interrupted flag mysteriously sets itself without saving the handler/signal/siginfo etc that is supposed to be saved when a PA section is interrupted, so we jump to zero and go boom.

But now it really really is time to return to my assault on CLiki and related stuff, and then investigate programmatic interfaces to Streetmap (which I think may involve XML somewhere).

I was going to write something here to describe the recent GC#

Fri Aug 22 03:15:04 2003

Topics: lisp sbcl

I was going to write something here to describe the recent GC rearrangement work. But then I spent a long time on a mail to sbcl-devel explaining it all anyway. So then I was going to put a pointer here to the list archive. But then it didn't show up in the list archive, probably due to mail servers everywhere being hammered by windows viruses and their associated bounce messages. So I may as well cut and paste, just to have something in the diary. If you read sbcl-devel, you'll (eventually) get this twice. Sorry about that.

The stop_the_world_branch is ready for wider testing (anoncvs users,
please be sure you have the latest; version.lisp-expr is
"0.8.2.38.stop_the_world.7").  It's too big a change for 0.8.3
release, but will probably be merged early in the 0.8.4 cycle.

If you have any application that uses threads, please download it and
give it a go, and report progress or lack thereof.  If you have any
kind of single-threaded SBCL, I may well have broken that in the
process : if you feel like sorting out the wreckage and
sending/committing patches, that will be a help.

What it does:

1) move from ptrace() to signals for stopping threads at GC time.
This probably makes it easier to port to other systems - except that
we're using a couple of Posix RT signals, which FleaBSD apparently
doesn't have, but I'm sure that can be worked around somehow - and
also means that you can attach strace to sbcl threads without them
dying horribly due to ptrace reparenting larks.  Thanks to Gary Byers
for pointing out (some time ago) this way of doing things.  I think he
said he got it from the Boehm collector.

2) clean up (or at any rate, rewrite to be differently messy) a pile
of stuff in the signal handling.  Sometimes this was just adding
commentary, but also I think the new functions maybe_defer_handler and
run_deferred_handler make it a lot easier to write signal handlers
which obey the pseudo-atomic rules - and to know that they're
uniformly obeyed in the existing handlers that profess to observe
them, of course.  This also fixes a signal-safety bug in
interrupt-thread (which, sadly, 0.8.3 will ship with, but
interrupt-thread didn't exist at all in 0.8.2, so if there are any
threads users who are insufficiently bleeding edge to be able to
compile their own CVS versions, they probably won't miss what they
never had)

3) I wrote last week about return-elsewhere (the new arrangement for
calling Lisp code as a result of signals being handled: see previous
message at http://article.gmane.org/gmane.lisp.steel-bank.devel/1737 )
but I don't think I announced that it was working.  It is (though 
it's only used for interrupt-thread and control stack exhaustion).
Most of this was done on HEAD before the branch, but if you're going
to hack around with C-level signals stuff I suggest you look at the
branched code anyway because the new *_defer* functions make it easier
to get right.  If they're right, at least.  Otherwise they make it
easier to blame incorrect operation on me; granted that's a poor
second best, but it preserves your honour at least.

4) move from a dedicated thread for GC which runs no lisp code, to a
system where any thread can collect garbage.  The immediate advantage
is that we now have a thread-safe way of running hooks before/after
GC, and the most often asked-for hook is something to run GC
finalisers.

I haven't actually _implemented_ anything to call said hooks yet.  It
should be a couple of lines of code, but I thought I'd ask first:

a) there are a couple of hook lists for before/after gc in the system
already.  Functions on these lists take no arguments and return
nothing interesting

b) CMUCL also has notifiers (or something: I'm going on memory and
it's been a while):  these get arguments for the amount of time/memory
spent/saved/wasted/until next gc, etc.  

I'm inclined to unify hooks and notifiers in SBCL, retaining the
zero-argument hook signature, and shove any interesting-looking kind
of gc statistics into special variables where callees could either get
them or ignore them at their leisure.  Comments?


:; cvs status version.lisp-expr
===================================================================
File: version.lisp-expr Status: Up-to-date

   Working revision:    1.1167.2.6
   Repository revision: 1.1167.2.6      /cvsroot/sbcl/sbcl/version.lisp-expr,v
   Sticky Tag:          stop_the_world_branch (branch: 1.1167.2)
   Sticky Date:         (none)
   Sticky Options:      (none)

:; tail -1 version.lisp-expr
"0.8.2.38.stop_the_world.7"


Enjoy
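
The unification floated at the end of that mail (zero-argument hooks, with the statistics parked in special variables) could be sketched roughly as below. This is a minimal illustration in portable Common Lisp, not what SBCL does: every name in it (`*sketch-after-gc-hooks*`, `*sketch-gc-bytes-freed*`, `run-sketch-gc-hooks`) is made up, and a real version would be driven by the collector rather than called by hand.

```lisp
;;; Hypothetical sketch: zero-argument GC hooks plus statistics in specials.
;;; All names here are invented for illustration, not SBCL's actual API.

(defvar *sketch-after-gc-hooks* '()
  "List of zero-argument functions to call after a collection.")

(defvar *sketch-gc-bytes-freed* 0
  "Bytes reclaimed by the most recent collection, for hooks that care.")

(defun run-sketch-gc-hooks (bytes-freed)
  "Bind the statistics, then call every hook in order with no arguments.
A hook that signals an error is reported and skipped, so one bad hook
cannot stop the others from running."
  (let ((*sketch-gc-bytes-freed* bytes-freed))
    (dolist (hook *sketch-after-gc-hooks*)
      (handler-case (funcall hook)
        (error (condition)
          (warn "GC hook ~S failed: ~A" hook condition))))))

;; A hook that wants the numbers reads the special; one that does not
;; simply ignores it.
(defvar *sketch-log* '())

(push (lambda () (push *sketch-gc-bytes-freed* *sketch-log*))
      *sketch-after-gc-hooks*)
(push (lambda () (push :finalizers-run *sketch-log*))
      *sketch-after-gc-hooks*)

(run-sketch-gc-hooks 4096)
```

The point of the design is that the hook signature never has to change when a new statistic is added: it just becomes another special variable bound around the calls.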

New release (0.4) of net-telent-date#

Sun Aug 24 02:27:40 2003

Topics:

New release (0.4) of net-telent-date. I always feel slightly guilty about the name of this package: I named it when I was on a bit of an "all packages should have guaranteed-unique names, and using the dns is the best way to ensure this" kick, but most of the actual code in it is in fact lifted from CMUCL, so calling it something with 'telent' in the name is in a certain light tantamount to passing off. Oh well.

Anyway, new function universal-time-to-rfc2822-date, which attempts to do local time. It follows the existing practice of decode-universal-time for its argument list, which essentially means there are only two ways to use it that don't require a teatowel wrapped around the neck to avoid dribbling brain on the floor:

  1. without the optional argument, you get local time
  2. with 0 as the optional argument, you get UTC (or GMT)
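
To make those two cases concrete, here is a minimal sketch in plain Common Lisp that mimics the calling convention. `sketch-rfc2822-date` is a hypothetical stand-in, not the function shipped in net-telent-date, and the output layout is only an approximation of an RFC 2822 date.

```lisp
;;; Hypothetical stand-in for illustration; not net-telent-date's code.

(defparameter *sketch-day-names*
  #("Mon" "Tue" "Wed" "Thu" "Fri" "Sat" "Sun")) ; CL numbers Monday as 0

(defparameter *sketch-month-names*
  #("Jan" "Feb" "Mar" "Apr" "May" "Jun"
    "Jul" "Aug" "Sep" "Oct" "Nov" "Dec"))

(defun sketch-rfc2822-date (universal-time
                            &optional (time-zone nil time-zone-p))
  "Format UNIVERSAL-TIME in the style of an RFC 2822 date.
Without TIME-ZONE the local zone (and DST) is used; TIME-ZONE follows
DECODE-UNIVERSAL-TIME, i.e. hours *west* of Greenwich, so 0 means UTC."
  (multiple-value-bind (second minute hour date month year day
                        daylight-p zone)
      (if time-zone-p
          (decode-universal-time universal-time time-zone)
          (decode-universal-time universal-time))
    ;; ZONE is hours west and excludes DST; mail headers want minutes east.
    (let ((minutes-east (round (* -60 (- zone (if daylight-p 1 0))))))
      (multiple-value-bind (zone-hours zone-minutes)
          (floor (abs minutes-east) 60)
        (format nil "~A, ~2,'0D ~A ~D ~2,'0D:~2,'0D:~2,'0D ~A~2,'0D~2,'0D"
                (aref *sketch-day-names* day)
                date
                (aref *sketch-month-names* (1- month))
                year hour minute second
                (if (minusp minutes-east) "-" "+")
                zone-hours zone-minutes)))))
```

With 0 as the optional argument, `(sketch-rfc2822-date (encode-universal-time 40 27 2 24 8 2003 0) 0)` returns "Sun, 24 Aug 2003 02:27:40 +0000". Any other value has to be thought of as hours west of Greenwich, which is where the teatowel comes in.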

In a spirit of 'release early, release often', I have decided to start#

Sun Aug 24 02:56:24 2003

Topics: cliki

In a spirit of 'release early, release often', I have decided to start making new araneida/cliki/etc releases more or less as often as I fix bugs, except when I definitely know it to be in a broken state. This is as distinct from my previous approach of only releasing when I was at least moderately confident that it would work, which meant that I never released anything.

Of course, the Internet (or perhaps NTL) has chosen the occasion of this decision to break, so

It might also be the impetus I need to hone my release procedure a bit further.

cl-ppcre is a portable CL#

Sun Aug 24 07:16:35 2003

Topics: sbcl cliki asdf

cl-ppcre is a portable CL package providing perl-compatible regexps. There are two things I'd like to say about it on the basis of ten minutes' use:

  1. It's asdf-packaged. Although it's not in cclan, which means I can't download it with asdf-install because I haven't got a trust relationship with Edi Weitz's PGP key, I can download it from the page and then painlessly install it from local disk.

    * (asdf-install:install "/home/dan/cl-ppcre.tgz")
    Install where?
    1) System-wide install:
       System in #P"/usr/local/lib/sbcl/site-systems/"
       Files in #P"/usr/local/lib/sbcl/site/"
    2) Personal installation:
       System in #P"/home/dan/.sbcl/systems/"
       Files in #P"/home/dan/.sbcl/site/"
     -> 2
    Installing /home/dan/clppcre.tgz in #P"/home/dan/.sbcl/site/",#P"/home/dan/.sbcl/systems/"
    cl-ppcre-0.5.7/
    [ ... and so it goes ]

  2. Although billed as perl-compatible, that doesn't mean you have to use the Perl syntax: you can also write regexes as trees. For example

    (defparameter *postcode-p*
      ;; a = one letter, n = one digit, s = optional whitespace
      (let ((a '(:char-class (:range #\a #\z)))
            (n :digit-class)
            (s '(:greedy-repetition 0 nil :whitespace-char-class)))
        (cl-ppcre:create-scanner
         `(:alternation
           (:sequence ,a ,n       ,s  ,n ,a ,a)
           (:sequence ,a ,n ,n    ,s  ,n ,a ,a)
           (:sequence ,a ,a ,n    ,s  ,n ,a ,a)
           (:sequence ,a ,a ,n ,n ,s  ,n ,a ,a)
           (:sequence ,a ,n ,a    ,s  ,n ,a ,a)
           (:sequence ,a ,a ,n ,a ,s  ,n ,a ,a)))))
    
    * (cl-ppcre:scan *postcode-p* "w1y 1ha")
    0
    7
    #()
    #()
    * 
    

    That's a regex for UK postcodes (actually, not quite. I should add the special case for GIR 0AA, which apparently is Alliance & Leicester Girobank Plc in Bootle). Isn't it just much easier to see what it does as a tree?

Sometimes I wonder whether the C library maintainers want to actually#

Sun Aug 24 22:09:08 2003

Topics:

Sometimes I wonder whether the C library maintainers want to actually discourage programmers from e.g. writing signal handlers. In glibc 2.3.1 on the 32 bit PowerPC, it was possible to access machine registers in a signal context (e.g. from a SA_SIGINFO signal handler) by doing something like

   ((context->uc_mcontext.regs)->gpr[offset])

In 2.3.2 this fails with the message

  error: structure has no member named `regs'

It looks from a brief inspection of the glibc cvs as though changes to the ucontext structure were made for compatibility purposes with ppc64. Which is probably reasonable; can't stand in the way of progress, Mr Dent. So this code should now be

   ((context->uc_mcontext.gregs)[offset]);

The problem, of course, is that this code doesn't build on the older version, which a lot of people (me, certainly, and most other people probably) are still using. So, this calls for conditional compilation based on some header feature: icky, maybe, but necessary.

Based on what header feature, exactly?

/* Major and minor version number of the GNU C library package.  Use
   these macros to test for features in specific releases.  */
#define __GLIBC__ 2
#define __GLIBC_MINOR__ 3
<features.h>

Whoever wrote this file obviously subscribed to the (quite sensible, IMO) point of view that new features were not going to be introduced in patchlevel releases. Whoever added the 64 bit support, which breaks our source code compatibility, seems to have quite different ideas :-(

I suspect we'd disagree on the details#

Sun Aug 24 23:48:00 2003

Topics:

I suspect we'd disagree on the details. In fact, we'd disagree on quite a lot of the general approach, never mind the specific details, if only because I'm quite attached to CL and have already invested significant time in it (the reader may harbour his own opinion on the direction of any possible causal relation there). But in general terms I really do think Tom Lord is onto something with his thoughts on free software architecture


For a long time, the right strategy for GNU was to build a basic unix
replacement differentiated primarily by licensing.   As software goes, 
the core of unix is a simple architecture, reflecting its history as a
design first realized by a very small team of people.

Well, that part's done and the strategy won.

Nowadays, the proprietary competition is about databases, and
productivity apps, and browsers, and middleware layers.  The software
we're competing against is not like unix: it isn't simple; it wasn't
built by a small number of people; it's a moving target.  It
isn't a tractable project to clone this proprietary software under
different licensing.

If the goal is still "(a) build a free alternative to proprietary
software", then a new strategy is called for:  competition on
_software_architecture_, not just licensing.

Make your end-users into peers#

Mon Aug 25 19:43:26 2003

Topics: unix lisp

Make your end-users into peers

I hope Miles won't mind me excerpting from his private email in this public place, but I spent long enough editing this reply that I felt I should get full value by exposing the rest of you foolish readers to it as well. He wrote

> Just read your latest journal entry regarding Tom Lord's post about the
> direction of free software.  It's an interesting idea.  Have you had any
> thoughts about what this might look like specifically?  

To which I have to admit: no, not really. I think he's identified the problem; I don't know the solution, but I don't think it's any of the attempts we've seen so far.

I've read a little by and about Christopher Alexander and the patterns movement, and what he said about habitability and piecemeal growth struck a chord with me. The users/inhabitants of a system aren't expected to be able (or willing) to rebuild it completely, but do want to customise it appropriately to their needs. And I don't think that's what we've got in the applications world; we've got customisation knobs for everything the app designer thought it would be nice to adjust, and for everything the users have previously asked to change, but if anyone thinks of something new that they'd like changed, we're back where we started.

The Unix shell has aspects of the right thing. Your environmental customisation (~/scripts/frob.sh) is using the exact same tools as the OS itself (/etc/init.d/frob). It falls down in the large: either the quoting rules come in and bite you, or the lack of support for any kind of object that's not a stream of bytes trips you on the way out and dribbles caustic saliva from its toothless mouth until you dissolve into a small puddle.

Emacs has aspects of the right thing; most of it's written in Lisp, and so there's an ecology of stuff built up around it. That's good. X support (not per se, but the way it's been exposed to Lisp code) has tended to take it in the wrong direction, and what little I've seen of the xemacs team suggests that they're wholly misguided. (Disclaimer: I barely follow xemacs development; I just have the impression of lots of C programmers and not a lot of consideration for the high-level picture). Emacs may once have been intended as the basis for a user interface to the GNU system, but if this was a primary goal I think rms has largely failed to promote it. And yes, it would help to have threading, an option for lexical scope, objects, and UI support good for more than just putting words on the screen.

What's the message that's coming through here? I'm trying to say that I want the high level design for free systems to start looking less like application design and more like language design. In my ideal world you get twenty five points for writing a library, five for a framework, and only three for an application.

Some of the apps people say (or used to say) "we're not designing for people like us any more, this is for the end user", but that implies a power imbalance which I really don't want to be part of. I want a system that the end-user can participate in, not just one that they can take or leave. A free software community that's actually a community, if you like.

So I think for the moment the summary is "habitable software", or "make your users into your peers". It's a hard sell, though: customisable software in the eyes of business doesn't mean letting the accountant edit her own startup file, it means getting a bunch of expensive consultants in to misunderstand the business and give you a new system that you still don't understand anyway; not automatically a Good Thing. Ideas welcome.

The standard Perl wisdom is that There's More Than One Way To Do It#

Thu Aug 28 02:04:21 2003

Topics: lisp

The standard Perl wisdom is that There's More Than One Way To Do It.

In Lisp there are often also many ways to do the same thing. At a low level you'll often see people on comp.lang.lisp or wherever comparing approaches: debating loop vs do, or the finer points of format strings, or closures vs objects, or recursive vs iterative solutions (for the record, tail-recursion is not the Holy Grail; if the problem is naturally recursively decomposed, use recursion. Otherwise just write a loop, already). And these debates just tend to help everyone involved to acquire a more complete picture of the language.
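
For a concrete flavour of those low-level debates, here is one trivial job (summing a list) written three ways. It is only a sketch, and the function names are made up for this illustration.

```lisp
;;; Three equivalent ways to sum a list of numbers (hypothetical names).

(defun sum-with-loop (numbers)
  "The LOOP macro: reads almost like English."
  (loop for n in numbers sum n))

(defun sum-with-do (numbers)
  "DO: both variables are stepped in parallel, so (CAR TAIL) below still
sees the old value of TAIL."
  (do ((tail numbers (cdr tail))
       (total 0 (+ total (car tail))))
      ((null tail) total)))

(defun sum-recursively (numbers)
  "Plain recursion: natural for recursive data, but not tail-recursive."
  (if (null numbers)
      0
      (+ (first numbers) (sum-recursively (rest numbers)))))
```

All three return 6 for `'(1 2 3)` and 0 for the empty list; which one to prefer is exactly the kind of thing the newsgroup argues about.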

At a higher level, of course, there are many different strategies for solving the problem, but where in Perl you might go with that other well-known saying "the Right Answer is whatever lets you get your job done" (aside: I'm sure I remember hearing this all the time; now I can't find a single instance of it on Google. Maybe I'm assigning more pragmatism to the Perl Way than I ought; sorry, guys), in CL you typically (at least, I typically) spend a bit more thinking time upfront to decide on an elegant solution.

I appear to be suffering choice paralysis.

Weekend spent on Araneida hacking, which somehow ended up being#

Mon Sep 1 02:58:02 2003

Topics: sbcl cliki lisp

Weekend spent on Araneida hacking, which somehow ended up being SBCL hacking. Two motivations:

  1. For the new version of the Local Food Directory (a CLiki application, if it wasn't immediately obvious from looking at it) we're going to be doing interesting things like tying into Streetmap for postcode searches and suchlike. It's also prettier, which is nice if you like that kind of thing.

    The practical impact of tie-ins with an external server, though, is that our latency for replying to requests now depends on circumstances beyond our control. Most (all) of the Araneida services so far are written so that we have some fairly good idea of how long it'll take to answer a request: we only talk http to a localhost proxy instead of slow remote clients; if we send mail, we send it to an smtp listener on the local machine; and our database queries, on the sites that use databases, are mostly fairly well tuned. But, remote hosts. Can't do much with that. Had better try this threading stuff for real, then.

  2. The users of Araneida, of which there seem to be an expanding number no matter what obstacles I throw up in their way (making them all install SBCL was going to be a pretty good way to make them go away, I thought, but still apparently Not Enough) are fond of pointing out that the export-server stuff is a bit weird at best. I concur. It made sense once upon a time.

In brief, we add a new class http-listener (concrete subclasses threaded-http-listener and serve-event-http-listener) which represents a single endpoint (a host/port combination) and dispatches all stuff that comes in on that endpoint to a handler

(defclass http-listener ()
  ((handler :initform *root-handler* :initarg :handler
            :accessor http-listener-handler)
   (address :initform #(0 0 0 0) :initarg :address
            :accessor http-listener-address)
   (port :initform 80 :initarg :port :accessor http-listener-port)
   ;; ...
   ))

Many listeners may dispatch to the same handler, and the handler may be a dispatching handler if it likes. There still needs to be some provision somewhere for lying about the hostname (as in, your external server address is foo.com:80/ but your araneida is actually on :8000) and a place to hang random extra bits that would be useful for generating apache httpd.conf segments (like where on the disk to find ssl certificates, etc), but it looks good so far.
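
To illustrate the "many listeners, one handler" shape without dragging in Araneida itself, here is a toy model. The classes and functions below (`toy-handler`, `toy-listener`, `toy-dispatch`) are hypothetical and only echo the slot layout of http-listener; nothing here opens a socket.

```lisp
;;; Toy model only: hypothetical names, no networking, not Araneida's code.

(defclass toy-handler ()
  ((name :initarg :name :reader toy-handler-name)
   (hits :initform 0 :accessor toy-handler-hits)))

(defgeneric toy-handle (handler request)
  (:documentation "Produce a response for REQUEST."))

(defmethod toy-handle ((handler toy-handler) request)
  (incf (toy-handler-hits handler))
  (format nil "~A handled ~A" (toy-handler-name handler) request))

(defclass toy-listener ()
  ((handler :initarg :handler :accessor toy-listener-handler)
   (address :initform #(0 0 0 0) :initarg :address
            :accessor toy-listener-address)
   (port :initform 80 :initarg :port :accessor toy-listener-port)))

(defun toy-dispatch (listener request)
  "Pass everything arriving on LISTENER's endpoint to its handler."
  (toy-handle (toy-listener-handler listener) request))

;; Two endpoints, one shared handler.
(defparameter *toy-root* (make-instance 'toy-handler :name "root"))
(defparameter *toy-public*
  (make-instance 'toy-listener :handler *toy-root*))
(defparameter *toy-internal*
  (make-instance 'toy-listener :handler *toy-root*
                               :address #(127 0 0 1) :port 8000))

(toy-dispatch *toy-public* "/Welcome")
(toy-dispatch *toy-internal* "/Welcome")
```

Both calls land on the same handler object, so its hit count ends up at 2 regardless of which endpoint the request arrived on. A dispatching handler would simply be another `toy-handle` method that forwards to further handlers.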

While doing this, add some dynamic guess-how-many-processes-we-need stuff to adjust the number of serving threads based on load, slap it all back together and hammer it a bit with apachebench.

Oops. Not literally "Oops" in the sense that a Linux kernel hacker would know it, but EFAULT from accept() anyway, which is kind of analogous for a user program. sb-bsd-sockets:socket-accept calls accept with the contents of a suitably sized Lisp vector as the second argument (the sockaddr). To make sure this doesn't get relocated between our taking the address and calling out, we wrap it in without-gcing to disable gc for the duration. This is actually a really bad idea because all the idle threads block in select(), so we can't GC unless we're really really busy; somewhat perverse. And, just to be awkward, doesn't always work either. For some reason that I haven't yet found, it is possible to have a GC happen during without-gcing.

I shall spare you the story of my VOP-writing and GC staring-at experience on Sunday afternoon, but the upshot is that

It should be noted that although I started writing this entry at whatever ungodly time it says, I finished and uploaded it at around midday on Monday

That was very nearly the right answer#

Mon Sep 1 21:22:14 2003

Topics:

That was very nearly the right answer. In fact

:; /usr/sbin/ab -c5 -n10000 http://xxxxxxxxxxxxxxxxxxx:8009/Welcome
[...]
Server Software:        Araneida/0.74                                      
Server Hostname:        xxxxxxxxxxxxxxxxxxx
Server Port:            8009

Document Path:          /Welcome
Document Length:        2796 bytes

Concurrency Level:      5
Time taken for tests:   190.112 seconds
Complete requests:      10000
Failed requests:        0
Broken pipe errors:     0
Total transferred:      29582958 bytes
HTML transferred:       27962796 bytes
Requests per second:    52.60 [#/sec] (mean)
Time per request:       95.06 [ms] (mean)
Time per request:       19.01 [ms] (mean, across all concurrent requests)
Transfer rate:          155.61 [Kbytes/sec] received

Connnection Times (ms)
              min  mean[+/-sd] median   max
Connect:      -10     0   12.0      0   848
Processing:     7    42  104.7     17  1998
Waiting:        0    41  104.1     16  1998
Total:          7    42  105.9     17  1998

Percentage of the requests served within a certain time (ms)
  50%     17
  66%     25
  75%     34
  80%     42
  90%     66
  95%    105
  98%    285
  99%    665
 100%   1998 (last request)

10000 requests later with no problems, and I think we've got this nailed.