Month of December 2003

SLIME and Araneida: screenshot here and announcement here


TODO: Some way to automatically set the optimization qualities in SBCL that lasts for longer than the current file

TODO:

  • Some way to automatically set the optimization qualities in SBCL that lasts for longer than the current file. When the current file is .sbclrc, that's not a very long time
  • Write up what the qualities do anyway somewhere (the manual would be a good place) because I can never remember.
  • That OpenGL thing from the other day
  • Another bug (I think it's a PFD thing, but) that I can't remember
  • SLIME-over-networks (preferably sshified)

Another one for the neglected dialog box collection


Found while looking for other stuff: a rather literal xref port for SBCL


OK, in principle we have working SLIME over SSH, using the client side of attachtty/detachtty

OK, in principle we have working SLIME over SSH, using the client side of attachtty/detachtty. We tell slime to attachtty hostname:/path/to/socket instead of connecting over the network, then we hack up swank a little to accept connections on a local socket instead of a TCP socket.

Why a local socket? To cater for the situation that the remote host may have more than one user. Access to unix sockets can be controlled by filesystem permissions, which saves us from having to come up with some authentication protocol in slime.

Still a fair amount of cleanup to be done (slime-disconnect tends to kill the lisp, which could be considered a Bad Thing), but I'll see if I can get committage some time soon.

Not impressed by: emacs apparently wiring together stdout and stderr of subprocesses even when you've said you want a pipe not a tty. Does this look silly to anyone else?

  (setq slime-net-process
        (let ((process-connection-type nil)) ; pipe
          (start-process-shell-command "SLIME Lisp" nil "attachtty" path "2>/dev/null")))

I am wondering if there is anyone anywhere who's using the define-page macro in Araneida

I am wondering if there is anyone anywhere who's using the define-page macro in Araneida. Revisiting its implementation now, it occurs to me that (a) it still uses the old handler model, (b) I don't understand how it works.

I think it might be good to set up a mailing list for Araneida/CLiki users, just so I have some chance of reaching more than three of them at a time

I have some kind of phone interview thing tomorrow morning, so I really should be asleep right now

I have some kind of phone interview thing tomorrow morning, so I really should be asleep right now. But

  • Small SLIME hackage so that the SBCL backend also works with Helmut's new code for choosing from a list when multiple definitions of a function are found. Right now this works for generic functions: M-. one of them and you get a list of all the methods

  • Better support for conditional GET in Araneida. Now when you call request-send-headers it returns the HTTP response code it sent, and when you call it with a sensible :last-modified argument and :conditional t, it may turn your 200 response into a 304. In which case, you get to go home early without having to do that tedious database query.

  • Using which, I managed to straighten out the conditional GET in Stargreen a little. Stargreen is an Araneida application but a very old one in parts - possibly dating right back to the time before Araneida was so named - and it uses bits of the system that I've forgotten exist and I strongly suspect nobody else has ever dared try.
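
The conditional GET decision itself is small enough to sketch. This is not Araneida's code: conditional-status and respond are hypothetical names, and times are assumed to be plain universal times, with NIL standing for "no If-Modified-Since header".

```lisp
;; Hypothetical helpers, not part of Araneida.  Times are CL universal
;; times (seconds since 1900); NIL means the header was absent.
(defun conditional-status (last-modified if-modified-since)
  "Return 304 when the client's copy is still current, otherwise 200."
  (if (and last-modified
           if-modified-since
           (<= last-modified if-modified-since))
      304
      200))

(defun respond (last-modified if-modified-since expensive-body-fn)
  "Return two values: the status code and the body (NIL for a 304).
EXPENSIVE-BODY-FN stands in for the tedious database query and is only
called when a full response is needed."
  (let ((status (conditional-status last-modified if-modified-since)))
    (values status
            (when (= status 200)
              (funcall expensive-body-fn)))))
```

The point of returning the status is that the caller can skip the expensive part whenever it sees a 304.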

Interview went not so well: feedback was that I was "wary of perl", which is somewhat embarrassing given that it's one of my three favourite programming languages

Interview went not so well: feedback was that I was "wary of perl", which is somewhat embarrassing given that it's one of my three favourite programming languages. Need to work on that a little, perhaps. Hmm.

Added another LU&D article pointer to the metacircles site: this one was about CLiki. It's giving me ideas for an article on setting up SBCL to do web development with Araneida: "0 to (port) 80 in 3000 words".

Though, conceded, probably port 8000 if you're sensible. I would tend to avoid running SBCL as root.

Hopefully, you can see from these lists that Perl provides a rich set of interfaces to low-level operating system details

Hopefully, you can see from these lists that Perl provides a rich set of interfaces to low-level operating system details. Why is this ``what Perl got right''?

It means that while Perl provides a decent high-level language for text wrangling and object-oriented programming, we can still get ``down in the dirt'' to precisely control, create, modify, manage, and maintain our systems and data.

Randal Schwartz in Linux Magazine

There's a lesson here for CL, though I'm not sure what it is. I certainly don't buy the idea that POSIX APIs are the best possible interface - not that I think this is where Randal is coming from anyway: he's clearly arguing from pragmatism. Filename manipulation, for example, is usually done in Perl by a mess of regular expressions that probably don't even always work (does . match newline?) which I don't really want to wish on anyone. On the other hand, the convenience is undeniable.

(Aside: POSIX APIs are not even necessarily the things that Perl actually provides access to: how long did it take Perl to get reliable signal handlers?)

To some extent, SB-POSIX and similar things (will) solve this problem for SBCL. We still have a certain impedance mismatch imposed by the language itself, though: for example, NIL is the only false value in CL; 0 and 0e0 and "" are true. That said, Perl isn't entirely restricted to the C standard datatypes either: ioctls often require more fiddling with pack() and unpack() than the mere mortal should entertain.
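
To make the truth-value mismatch concrete, here is a minimal sketch in plain Common Lisp. Both truthy-p and c-true-p are names made up for illustration; neither is part of SB-POSIX.

```lisp
;; Illustrative helpers only; neither is part of SB-POSIX.
(defun truthy-p (x)
  "Return T if X counts as true in Common Lisp, else NIL."
  (if x t nil))

(defun c-true-p (n)
  "Interpret the integer N the way C or Perl would: zero is false."
  (not (zerop n)))

;; Everything except NIL (which is also the empty list) is true:
(mapcar #'truthy-p (list 0 0e0 "" nil '()))
;; => (T T T NIL NIL)
```

So any wrapper around a C-style interface has to decide explicitly what a zero return means, as c-true-p does, rather than passing the number straight to IF.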

So there's obviously a reason I'm converting Stargreen to the new Araneida handler model: it's because I want to upgrade the SBCL it runs on to something recent and, frankly, I'm not sure the old one works any more

So there's obviously a reason I'm converting Stargreen to the new Araneida handler model: it's because I want to upgrade the SBCL it runs on to something recent and, frankly, I'm not sure the old one works any more. But also, I'd like to add RSS feeds to it for forthcoming events.

Which, with a little assistance from Miles Egan's xmls library, turned out to be the simple bit. Now I'm reading bits of Araneida again and thinking "I wonder why it does that". The deal is: there are various circumstances in which we want to abort a response early - for example, the authentication handler decides that the user isn't, or the response handler sees that it was that conditional GET thing I was talking about two days ago and returns 'not modified', therefore saving on all that expensive database stuff. We could set a flag in the request header (the 'request' in Araneida has attributes of both HTTP request and response: it's more of a 'transaction' object, I suppose) to say "we've finished, don't send them any more data", but then all subsequent stuff has to check it. Which is silly. If we're done outputting, we're (except in the case that the user's request kicks off some asynchronous computation that he gets the answer of some other way) probably done calculating too.

It appears that we have a signallable response-sent condition for this situation. In fact I even seem to have documented it. I guess I should probably make request-send-headers use it too, then.
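
For anyone who hasn't met the idiom, the shape of it is roughly as follows. This is a sketch under assumptions, not Araneida's source: the condition definition is a stand-in, and check-freshness and run-handler are hypothetical names.

```lisp
;; Stand-in definition: Araneida has a RESPONSE-SENT condition, but this
;; definition and the two functions below are illustrative assumptions.
(define-condition response-sent (condition) ())

(defun check-freshness (fresh-p)
  "Pretend to send a 304, then abandon the rest of the handler, when FRESH-P."
  (when fresh-p
    (signal 'response-sent)))

(defun run-handler (fresh-p)
  "Return :ABORTED-EARLY if the response was already sent, else the page body."
  (handler-case
      (progn
        (check-freshness fresh-p)
        ;; Only reached when nothing signalled RESPONSE-SENT.
        "expensive page body")
    (response-sent () :aborted-early)))
```

Nothing between the signal and the handler has to check a flag: the non-local exit skips it all.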

Is anyone having problems with infinitely recursive errors when they try to retrieve pages for which there is no handler? I was, but it might just be me. The problem appears to be in handle-request (handler) where it attempts to request-send-error on a stream that's probably not there any more, but I'm loath to commit a fix until I've worked out why it's also apparently pointlessly resignalling response-sent: I want to be able to explain or remove it. (This is where I came in, yes). Mail lispweb if you're seeing this too.

Next year's New Years Resolutions, exclusive preview:


  • Find some other word than 'So' to start diary entries with. I'm predicting that enforcing this may involve small elisp hacks.
  • Eat something more closely resembling a balanced diet.

Editing on both CLiki and the ALU Wiki is temporarily disabled

Editing on both CLiki and the ALU Wiki is temporarily disabled. There's not a lot to say about the combination of intelligence and social skills that must be involved in defacing a world-writable web site, so I'll let it rest.

Not a lot of free stuff hacking lately. I've been working on Stargreen and thinking about testing. Which developed into thinking about completely redesigning the underneath of Araneida to separate request and response so that it could sensibly support an HTTP client as well as a server. Which stopped developing when I realised that most of it is presently, right now, YAGNI.

Then yesterday was divided more or less equally between travelling to Manchester and travelling back from Manchester; the intervening period spent doing a written Perl test (and swearing at MS Word, which is not my idea of the world's best Perl editor, really). Manchester quite nice but not a lot of free time to explore - spent most of it in Waterstones reading about truss rods.

My intention today was to hack a couple of fixes into CLiki such that there's a reasonable time limit on the "you can amend a prior edit without it appearing again in Recent Changes" feature, because I believe that's what our friend from Brazil was doing

My intention today was to hack a couple of fixes into CLiki such that there's a reasonable time limit on the "you can amend a prior edit without it appearing again in Recent Changes" feature, because I believe that's what our friend from Brazil was doing. Also to get on with some Stargreen testing.

So what I did instead was teach SLIME to compile ASDF systems. This is still ongoing, but you can see the results so far

I've had this Sluggy Freelance strip stuck to the wall behind my desk since I moved here

I've had this Sluggy Freelance strip stuck to the wall behind my desk since I moved here. Just thought I'd share.

(Note to landlord: no, not really stuck to wall. Stuck to bottom of picture which hangs on previously existing picture hook. No Blu-tak involved)

Upgraded the CLiki version on www.cliki.net to 0.4.1; now we can have page names with dots in them

Upgraded the CLiki version on www.cliki.net to 0.4.1; now we can have page names with dots in them. Also re-enabled posting for everyone other than A N Other; not that that will dissuade anyone who really wants to mess it up, but 90% of A N Other's content seems to be from people who don't care a lot whether they mess it up or not.

There are two reasons I'm showing you this screenshot

There are two reasons I'm showing you this screenshot. The other one (besides that the McCLIM Listener looks really cool and any excuse to take pictures of it is a good one) is that it's (i) running in SBCL, (ii) running usably fast, (iii) not eating all the CPU in an idle loop.

Ingredients: an SBCL that understands SB-FUTEX, running on a Linux kernel that supports futexes. A McCLIM patched with this patch (and see this message as well for the context). Some random frobbing.

I sort of wrote a condition variable implementation for CMUCL as well, but it's utterly untested - and frankly, unlikely to work, given that despite lying in bed this morning for three hours carefully doing nothing I still didn't actually get any sleep.

I've been working on and off (mostly off) lately on a replacement for the terribly outdated CMUCL HOWTO, and I think there's probably enough in it now that people might find it useful even though it's clearly unfinished

I've been working on and off (mostly off) lately on a replacement for the terribly outdated CMUCL HOWTO, and I think there's probably enough in it now that people might find it useful even though it's clearly unfinished.

...according to is an unofficial Getting Started guide to SBCL on Linux. Comments welcome.

In other news, I seem to have disposed of the useless parent thread in multithreaded SBCL builds, so now it will only show up once in ps instead of twice.

New Araneida, CLiki, and detachtty packages, though nothing particularly earth-shattering in any of them

New Araneida, CLiki, and detachtty packages, though nothing particularly earth-shattering in any of them.

To: lispweb@red-bean.com, clump@caddr.com
Subject: ANN: new Araneida, CLiki (again)
From: Daniel Barlow
Date: Mon, 15 Dec 2003 09:58:38 +0000
-text follows this line-

Mostly bug fixes and other small changes prompted by the recent defacing of the CLiki front pages by some wazzock from Brazil

  • CLiki now depends on Miles Egan's xmls library as well as everything else

  • Araneida has better support for conditional GET (may save you some bandwidth; may even save you some cpu or database time, especially if you're using it behind a caching reverse proxy)

New in Araneida 0.83

  • Clear up RESPONSE-SENT so that it works as described.

  • Convenient support for conditional GET requests: the new keyword argument :CONDITIONAL T to REQUEST-SEND-HEADERS will cause it to compare the last-modified date in the response with the if-modified-since date in the request, sending a 304 response and then signalling RESPONSE-SENT when appropriate.

  • REDIRECT-HANDLER may now take a relative urlstring for :LOCATION, which is merged against the handler's base url

  • Cleared up a lot (but not all) of the compilation warnings and style-warnings

New in CLiki 0.4.2

  • Now depends on xmls package for rss generation (stop complaining at the back there: it's small, it's asdf-installable if you ignore the lack of GPG key, and it decruftifies the code noticeably)

  • Tweaked the feature that allows users to collapse multiple edits to the same page such that they only show on Recent Changes once. Now it only works if the two edits happened within ten minutes of each other.

  • Update README to include web address for lispweb list

  • 'Create new page' link to site root, and other links in the default HTML stuff fixed (pointed out by Erik Enge, patch from Ivan Toshkov, thanks both)

  • example.lisp package name changed to something less clashable

The other day I had a questionnaire by email (16 lines of text in only 64k of MIME-encoded Word document) from a recruitment agency which among other things asks:


  • Experience with Unix?
  • How many years?
  • How strong are you?

I'm sorely tempted to answer "A few quite unsavoury episodes but on the whole mostly positive", "seven, with time off for good behaviour", and "can lift own weight".

Really, how hard can it be to ask questions that would actually tell you something about the candidate? You want to know about Perl? Ask what their favourite CPAN module is, and why (or whether they've written one, and what it does). You want to know about Linux? Ask them which distribution they'd recommend, or what they most like about the 2.6 kernel. $RDBMS? Ask how (in the case of MySQL, the more adversarial may substitute "whether") it compares to $OTHER_RDBMS.

  • It'd take a few minutes to fill out, but no longer than I'm now going to have to spend deliberating on whether my Linux abilities (I have debugged my userland signal handling code by inserting printks in the kernel/I did not write the kernel code in question - or any significant part of the rest of the kernel) should be classed as "intermediate", "advanced", "expert", "guru", or "minor deity". Some people will worship anything, after all.

  • It'd take a few minutes to mark, too, but then (a) you'd get fewer responses to start with, and (b) who's really going to rate themselves as anything less than "advanced" on the job's core required skills? At least the answers would mean something.

Is it fair to expect agencies to know this stuff? For 20% of the first year's salary, there's some kind of case to be made along those lines, yes. I certainly think it's not unreasonable to expect them to know when they don't know, and spend some time in constructive discussion with the client to rectify this.

(They also want to know if I can "raw code" HTML. In the privacy of my own home, yes, but I think I'd get arrested if I tried it in public and as far as I know "but I'm a web designer" is not an admissible defence against a charge of indecent exposure.

Look Ma, no "stand up in court" double entendres. What? Damn.)

For the record, some of the agencies I've met or talked with in the last couple of months have been doing pretty useful jobs: they're often filtering several hundred inappropriate applications for a single job, and then they're phoning the potential employer regularly to chivvy them into actually making a decision. Because, to be honest, some potential employers could probably go for months or years with this potential completely untapped, if it weren't for someone to call them up periodically and press them to make their mind up. I do think there's a place for them (and to forestall the obvious rejoinder, I don't think it has to be the second circle of hell) but there do seem to be more than a few places where the recruiter-client relationship is not all it could be, and everyone suffers as a result.

Another job description I'm looking at now says that "Candidates with experience of Pearl will be at an advantage". Who's Pearl? Would I like her?

There's no CVS for telent at present, because it lives on the wrong side (or, from my personal point of view, the right side) of my cable modem, and said cable modem connection is not working

There's no CVS for telent at present, because it lives on the wrong side (or, from my personal point of view, the right side) of my cable modem, and said cable modem connection is not working. I'm stuck with an analogue modem that can't hold a connection for more than five minutes - and in a country which still bills local calls by the minute anyway, so in some sense that's more of a feature than a bug.

Discovery made this morning: the conditional GET support in Araneida 0.83 is backwards. I doubt that anyone is using it much yet, but if they are they're not using it successfully. Updated release later today.

I recently had occasion to write the following code. I like extensible http servers -

(defclass session-request (araneida:request)
  ((session :initarg :session :reader request-session)
   (cache-p :initarg :cache-p :initform t :accessor request-cache-p)))

(defmethod handle-request-authentication ((h (eql ssl-handler)) method request)
  (let ((id (request-cookie request "SESSION")))
    (unless (zerop (length id))
      (change-class request 'session-request)
      (setf (slot-value request 'session) id))))

;;; disable caching if {initiate,ensure}-session have been called
(defmethod araneida:request-send-headers ((request session-request) &rest rest)
  (if (request-cache-p request)
      (call-next-method)
      (apply #'call-next-method request
             :cache-control "no-cache" :pragma "no-cache"
             :conditional nil :expires (get-universal-time)
             rest)))

What's happening here is that we have cookie-based authentication, and we have a caching reverse proxy in front of Araneida. This is a shared cache, so we don't want to cache any response that needs to check the user's credentials - if one user gets information relating to another, I'm sure you'll agree that would be bad. On the other hand, we don't want to make the static files (graphics, CSS, etc) uncacheable. So, we have

  • a handle-request-authentication method that covers all requests that might conceivably have a session cookie. It reblesses (ahem. excuse the Perl terminology, please) the incoming REQUEST into a SESSION-REQUEST which has the extra slots we need.

  • (not shown) code in the functions initiate-session and ensure-session (one of which is called by every handler that actually needs the session information) to set request-cache-p NIL.

  • a specialisation of Araneida's request-send-headers method that, when the request is marked uncacheable in this way, overrides whatever caching would otherwise be done for it.

No changes to the handlers, no changes to Araneida (except for fixing the if-modified-since bug) and significantly improved cacheability. OK, it could have been neater if we were designing this from scratch - for example, by arranging URLs so that all the uncacheable stuff is under a common root - but we have four years of accreted URL exports that we want to stay backward-compatible with. In the circumstances, I think this is pretty neat.

From: iProfileCentral Subject: Weekly Job Seeking Reminder

From: iProfileCentral 
Subject: Weekly Job Seeking Reminder!

Daniel

Your job seeking status is set as 'Seriously Looking', but you haven't interacted with your iProfile for over a month!

We've taken this to mean that you're no longer looking for work, so we've changed your job seeking status to 'Not Currently Looking!'

Thank you for your proactive status management! Your iProfile service lists "skills" and years of experience with each! It is less than a year since I filled the form out! Therefore I have not needed to interact with it!

The PostgreSQL documentation claims that the range of an 'interval' type is +/- 178000000 years

The PostgreSQL documentation claims that the range of an 'interval' type is +/- 178000000 years. So why do I get this?

stargreen=> select interval '3280756607 second' ;
      interval       
---------
 24855 days 03:14:07
(1 row)

Does that answer look familiar? It should

CL-USER> (+ (* 3 3600) (* 14 60) 7 (* 86400 24855))
2147483647

So, there must be some other way to make Postgres convert from CL universal time to its own time format

stargreen=> select timestamp with time zone '1901-01-01 0:0:0' + interval '3280756607 second' ;
        ?column?        
------------
 1969-01-19 04:14:07+01
(1 row)
because the obvious solution isn't one.

create or replace function to_universal_time(timestamp with time zone) returns bigint as 'select cast(extract(epoch from $1) as bigint)+2208988800 ;' language sql;

create or replace function to_universal_time(timestamp with time zone) returns bigint as 
'select cast(extract(epoch from $1) as bigint)+2208988800 ;' language sql;

create or replace function from_universal_time(bigint) returns timestamp with time zone as 'select timestamp with time zone ''1970-01-01 GMT'' + cast( ($1-2208988800)||'''' as interval);' language sql;
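
The same conversion can be sketched on the Lisp side. The magic number 2208988800 in the SQL above is the universal time of the Unix epoch, which the standard encode-universal-time can compute; the variable and function names here are made up for illustration.

```lisp
;; Universal time (seconds since 1900-01-01 00:00 GMT) of the Unix epoch.
;; The names in this block are illustrative, not from any library.
(defparameter *unix-epoch*
  (encode-universal-time 0 0 0 1 1 1970 0))   ; => 2208988800

(defun universal-to-unix (universal-time)
  "Convert a CL universal time to seconds since 1970-01-01 GMT."
  (- universal-time *unix-epoch*))

(defun unix-to-universal (unix-time)
  "Convert seconds since 1970-01-01 GMT to a CL universal time."
  (+ unix-time *unix-epoch*))

;; The value from the failing query does not fit in a signed 32-bit
;; integer, but the corresponding Unix time does:
(universal-to-unix 3280756607)   ; => 1071767807
```

Doing the subtraction before the number reaches the interval parser keeps it below the 2147483647 ceiling seen earlier.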

And while we're on the subject, aren't timezones cool?

:; date --date '1970-01-01'
Thu Jan  1 00:00:00 BST 1970

Yes, for the time between 27th October 1968 and 31st October 1971, the UK's political time was uniformly one hour in advance of what the sun said it should be.

<dan_b> YAY POLITICIANS!

Nuff said.

Today I am looking for a bug in allocation that I suspect I introduced into SBCL about a year ago

Today I am looking for a bug in allocation that I suspect I introduced into SBCL about a year ago.

SBCL inherits from CMUCL a fairly nontraditional memory management scheme: at startup it mmap()s several large chunks of memory at fixed addresses, then doles out parcels of the contents when Lisp does things that require memory. The three large chunks involved are

static space: this contains some objects that must be at known addresses for reasonably efficient code generation. For example, comparisons against nil and t occur quite a lot, so we put these in known places at the start of static space. There are also a number of objects that the C runtime needs to be able to find, like sb!kernel::internal-error: it's simplest to chuck these in static space too, and when we generate the initial core file we can also write out a header file for C describing the addresses (see src/runtime/genesis/constants.h). The initial static objects vary by backend, and the list can be found at the end of compiler/target/parms.lisp

read-only space: this contains a number of functions and bits of functions assembled from src/assembly/target. These are mostly to help with non-local control transfers (stuff like CATCH and THROW) and for generic arithmetic.

dynamic space: everything else. Ordinary allocation requests are served from dynamic space.

PURIFY

Once upon a time (and in fact, right up to the present day on non-x86 ports) the garbage collector was a pretty simple Cheney collector with two semispaces (dynamic space 0 and dynamic space 1, in fact). Allocation happened in one of them: GC copied all live objects from it into the other. Given that the standard core (containing the library routines and compiler and so on) in SBCL is around 20Mb and unlikely ever to become unreferenced, this was 20Mb of copying on each collection.

The obvious fix, then, was to accept that this data would remain live indefinitely, and move it all somewhere that it would be safe from GC. This is what purify does: it's basically the same as GC, but instead of copying into the other dynamic space, it copies each object into whichever of read-only or static space seems appropriate - read-only if it knows that the object will never be modified (e.g. the strings for symbol names), static otherwise. (There are a couple of other differences: purify removes the trace table offsets and fixups list from code objects, because if they're never going to be moved again, they'll never need fixing up again.) Once this is done, the dynamic space is basically empty and a fresh core image can be dumped for future runs.

Generational GC

The arrival of the "gencgc" generational GC mostly didn't affect any of this, but it does change the expectations of users. Once upon a time, Lisp users were trained to believe that every CONS is sacred ("if a CONS is wasted/GC makes you late"), but given a slightly less dumb GC - as has become more usual in the last, say, twenty years (or 12 months if you're a Java programmer) - there shouldn't be quite the same need for all the resourcing and object reuse that hackers of yore were accustomed to.

Generational GC is based on the premise that most objects die very quickly after they're created, and any object that survives past infancy is probably going to live for a long time. So we should be able to save much time at a cost of not much space by collecting the new objects often, and the old objects less often.

First we have to be able to identify the new objects - there might be references to them from anywhere in allocated memory. We use a write barrier to limit the search space: with appropriate use of mprotect() and a SIGSEGV handler, we arrange matters such that writes to old objects are trapped and noted. When it comes time to GC, we can ignore the pages that have not been written to since the last GC, because they can't contain any references to objects created since the last GC.

Copying the surviving objects is a function of the number (and size) of objects that need it - and has nothing (to a first approximation, anyway) to do with the number of objects that don't need it. So, generating short-lived garbage should be essentially free.
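
A toy model of the write barrier may help here. It is nothing like the real gencgc code: the real barrier is mprotect() plus a SIGSEGV handler, whereas this sketch assumes every store into an old page goes through an explicit function, and all the names are invented for illustration.

```lisp
;; Toy model only: real gencgc traps writes with mprotect() and SIGSEGV.
;; Here every store into an old page goes through BARRIER-WRITE instead.
(defstruct page
  (written-p nil)   ; has this page been stored into since the last GC?
  (slots '()))      ; objects referenced from this page

(defun barrier-write (page value)
  "Store VALUE into PAGE and note that PAGE has been written."
  (push value (page-slots page))
  (setf (page-written-p page) t)
  value)

(defun pages-to-scan (old-pages)
  "Only written pages can hold references to objects created since the last GC."
  (remove-if-not #'page-written-p old-pages))

(defun reset-barrier (old-pages)
  "Called at the end of a collection: mark every old page as clean again."
  (dolist (p old-pages)
    (setf (page-written-p p) nil)))
```

With a thousand old pages and one store, pages-to-scan returns one page, which is the whole saving.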

Makes sense?

Diversion ends

So why this digression into principles of GC? What's the catch? The catch is that in SBCL this is not the case. The write barrier only applies to dynamic space, and we have to do a full scan over the static space (about 5Mb on this machine) each time.

The obvious fix would be to add a write barrier to static space. The more elegant fix, though, would be to not put 5Mb of stuff in there in the first place. Remember that purify exists so that the Cheney GC doesn't copy the entire compiler/library backwards and forwards on each GC, but gencgc is not going to do this anyway. A full GC happens only very rarely - in fact we can prod the gencgc slightly to ensure that it doesn't happen at all unless specifically requested. There are six generations: instead of purifying we could tenure the library code into generation 5 and then just use 0 thru 4 thereafter.

In outline this is as simple as it sounds: just change some numbers around. Testing so far, though, reveals that we tenure about 40Mb of stuff (instead of the 20Mb that PURIFY leaves us with) and that we do it with some fairly incredible wastage.

Today I am looking for a bug in allocation that I suspect I introduced into SBCL about a year ago, and it's only by fixing a bug that was introduced into SBCL about five years ago that I realised its full impact

Today I am looking for a bug in allocation that I suspect I introduced into SBCL about a year ago, and it's only by fixing a bug that was introduced into SBCL about five years ago that I realised its full impact.

In addition to the dynamic space itself, gencgc has a page table containing per-page data such as which generation the objects on the page belong to, whether the page has been written since last GC, etc.

Each allocation in gencgc is done from an "allocation region": a large object will get a region to itself, whereas a small object will share a region with others. Regions are typically created with minimum 8k size, and serve two purposes: (a) allocation from a region is cheap - each thread gets a region of its own, and an allocation request that can be satisfied from within the region doesn't need us to lock the allocator and mess with globally visible resources; (b) when a region is "closed" so that no further allocation can be done in it (when it's full, or when we need to GC) its start address is written into the page table entries for each page it encompasses.

To explain why (b) is important we take a short diversion into the "conservative" aspect of gencgc. Gencgc is mostly an exact copying GC, but for the C call stack, which mixes Lisp pointers and random untagged data. If we find something that looks like a Lisp pointer on the stack we should keep the object alive, in case it is, but shouldn't relocate it, in case it isn't: rewriting stack data that happens to look like a pointer into the Lisp heap would be bad. So we run over the C stack looking for things that might plausibly be Lisp pointers, and tag the pages they refer to as 'dont_move'. If one of these objects is in the middle of a page and the object preceding it has overlapped from the previous page, we don't want to tag half that object as immovable but not the other half, so in fact we have to look as far back as the region start address for our page, and tag everything from there to the next object that starts on a page boundary.

Here is the five year old bug: the page table is not saved with the core, and when a core is initially loaded, every page_table entry is initialized with the region start address being the start of dynamic space. The whole of dynamic space is therefore in the same region, so if a pointer to any of it is found on the C stack, that will cause some pretty significant portion of it to be locked down and not collected. Usually you don't see this because it's normal to purify before saving a core, so the dynamic space tends to be empty at this time. In our new impure scheme, though, we have 40Mb of dynamic space and none of it is going to go away.
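
The effect is easy to see in a toy model. This is a simplification and not the gencgc code: the page table is reduced to a vector giving, for each page, the index of the page where its region starts, and "everything up to the next object that starts on a page boundary" is approximated as "every following page with the same region start". pinned-pages is a made-up name.

```lisp
;; Toy model: REGION-STARTS maps each page index to the index of the page
;; on which that page's region begins.  Not the real page table layout.
(defun pinned-pages (region-starts page)
  "Return the page indices that must not move when something that looks
like a pointer into PAGE is found on the C stack."
  (let ((start (aref region-starts page)))
    (loop for i from start below (length region-starts)
          while (= (aref region-starts i) start)
          collect i)))

;; A healthy table: three regions, starting on pages 0, 2 and 5.
(pinned-pages #(0 0 2 2 2 5) 3)   ; => (2 3 4)

;; A freshly loaded core: every entry says "region starts at page 0".
(pinned-pages #(0 0 0 0 0 0) 3)   ; => (0 1 2 3 4 5)
```

One stray stack word in the second case pins the lot.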

So, that's a fairly simple one to fix, I thought: look for objects that start on page boundaries, and treat them as beginning new regions. And so it was, but now I find my bug: the first GC after loading the core thrashes the machine to death for several minutes.

The cause of which, I still don't know. In brief, though: when we're looking for a place to start a new alloc_region, it's sometimes possible to tack it onto the same page as a previous one. (Usually I wouldn't expect a region to end in the middle of a page but I'm guessing it might have been closed before being filled by GC or something). Although this works fine in the mutator, for some reason when we're allocating for the collector, we sometimes ("sometimes" meaning that we usually have to compile about half of PCL for the bug to manifest) manage to induce heap corruption by putting new alloc_regions onto part-full pages. Since March or so this year we've been running with this disabled, meaning that regions always start at the top of the page, but I am here to tell you now, brothers and sisters, that collecting 40Mb of live objects at once and putting each new alloc_region on its own page is a very bad idea: the space wastage is actually high enough that you might look at it displayed in the GC statistics and think "nah, that's clearly bogus data, nothing to worry about". I should know: I did.

The offending code (that is, the heap corrupting code, not the waste-of-space workaround) is, as far as I can tell, identical in effect to that in CMUCL. I'm not foreseeing imminent enlightenment here.
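To get a feel for how bad the one-region-per-page workaround can be, here is a rough sketch of the wastage arithmetic. The 4096-byte page size and the function name are assumptions for illustration, not taken from the gencgc source.

```c
#include <stddef.h>

#define PAGE_BYTES 4096u  /* assumed page size */

/* Bytes left unused if every region must start at the top of a fresh
 * page: each region wastes whatever remains of its last page. */
static size_t wasted_bytes_page_aligned(const size_t *region_bytes, size_t n)
{
    size_t waste = 0;
    for (size_t i = 0; i < n; i++) {
        size_t tail = region_bytes[i] % PAGE_BYTES;  /* bytes used on last page */
        if (tail != 0)
            waste += PAGE_BYTES - tail;
    }
    return waste;
}
```

Under these assumptions a region of 100 bytes wastes 3996 bytes of its page, so a collection that closes many small regions can plausibly waste far more space than it uses, which is the kind of figure that looks like bogus data.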

Not a particularly good test, this, because I don't have an unmolested SBCL binary for the same exact version as the one I'm hacking

Not a particularly good test, this, because I don't have an unmolested SBCL binary for the same exact version as the one I'm hacking. But here are the numbers, anyway.

Before: an ordinary SBCL 0.8.6.16

  * (time (dotimes (i 400) (gc)))

  Evaluation took:
    20.432 seconds of real time
    19.946968 seconds of user run time
    0.079988 seconds of system run time
    0 page faults and
    4935680 bytes consed.

After: an unpurified 0.8.6.37, with some rather vicious cleanups in the allocation routines (which seem to have fixed the bug described earlier, but more testing will be required)

  * (time (dotimes (i 400) (gc)))

  Evaluation took:
    13.578 seconds of real time
    13.35597 seconds of user run time
    0.077988 seconds of system run time
    0 page faults and
    4096000 bytes consed.

For comparison: CMUCL "x86-linux 3.1.2 18d+", whatever that is

  * (time (dotimes (i 400) (gc)))
  Compiling LAMBDA NIL:
  Compiling Top-Level Form:

  Evaluation took:
    12.49 seconds of real time
    12.315128 seconds of user run time
    0.057991 seconds of system run time
    [Run times include 12.37 seconds GC run time]
    0 page faults and
    0 bytes consed.

Um... um... About 35ms per GC, and given that that just returns it to CMUCL speeds, I rather suspect that it's all due to fixing the fragmentation bug and nothing to do with static space at all.
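The per-GC figure is just the wall-clock time divided by the 400 collections. A trivial sketch of that arithmetic (the function name is made up for illustration):

```c
/* Average milliseconds per collection, given the total real time
 * reported by TIME and the number of (gc) calls in the loop. */
static double ms_per_gc(double total_seconds, int gc_count)
{
    return total_seconds * 1000.0 / gc_count;
}
```

Fed the real times above, this gives roughly 51ms per GC for 0.8.6.16, 34ms for the unpurified 0.8.6.37 and 31ms for CMUCL, so "about 35ms" presumably refers to the modified SBCL.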

It should be noted that the stuff to remove fixups and trace tables has been disabled temporarily, until I find out why it's breaking things. Due to this and probably other reasons too, the dynamic space usage is around 47Mb, and the core is a quite rotund 92Mb. Yum.

The gencgc cleanups alluded to were the removal of most of the special-casing for large objects. There is now only one criterion for whether an object is handled as a large object, which is that its size exceeds large_object_size. If so, it will be allocated in its own region, which will not share any pages with any other region. It no longer matters whether you call gc_alloc_large or gc_quick_alloc_large; the same thing happens in either case.
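The single criterion can be sketched like this. Only the name large_object_size comes from the text; the page size, the threshold value and the helper names are assumptions for illustration and may not match the real source.

```c
#include <stdbool.h>
#include <stddef.h>

#define PAGE_BYTES 4096u  /* assumed page size */

/* Assumed threshold; the actual value in gencgc may differ. */
static const size_t large_object_size = 4 * PAGE_BYTES;

/* The one test: an object is "large" iff its size exceeds the threshold. */
static bool is_large_object(size_t nbytes)
{
    return nbytes > large_object_size;
}

/* A large object gets a region of its own made of whole pages, so no
 * other region can share any of them. */
static size_t pages_for_large_object(size_t nbytes)
{
    return (nbytes + PAGE_BYTES - 1) / PAGE_BYTES;
}
```

The point of the design is that the decision depends on the object's size alone, not on which allocation entry point happened to be called.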

It's far too close to Christmas to commit any of this right now (wait for after 0.8.7) but before I do I'd like to find out whether the gencgc policy changes (and the fragmentation bug fix they make possible) have any effect on their own, or if this speedup is all due to the static space now only being 1456 bytes long.

X-Spam-Report: Spam Filtering performed by sourceforge.net

X-Spam-Report: Spam Filtering performed by sourceforge.net.
        See http://spamassassin.org/tag/ for more details.
        Report problems to
        https://sf.net/tracker/?func=add&group_id=1&atid=200001
        1.1 MAILTO_TO_SPAM_ADDR URI: Includes a link to a likely spammer email

The message in question is from a sourceforge-hosted mailing list. There are no URIs in the message itself, so which of

This SF.net email is sponsored by: IBM Linux Tutorials.
Become an expert in LINUX or just sharpen your skills.  Sign up for IBM's
Free Linux Tutorials.  Learn everything from the bash shell to sys admin.
Click now! http://ads.osdn.com/?ad_id=1278&alloc_id=3371&op=click

or

https://lists.sourceforge.net/lists/listinfo/sbcl-help

does the "Spam Filtering performed by sourceforge.net" think is the likely spammer?

This is quite obviously the wrong place to announce it (unless maybe for the googlebot), but in the early hours of the morning today I added RSS feeds to the Stargreen site. ObLisp: using Miles Egan's xmls library. ical would be cooler, really, but I think that's going to wait until people are actually using ical clients.

(Hmm. Does the iPod have calendaring?)

Happy Christmas

Happy Christmas

Currently playing with: versioning in CLiki and RSS in this diary

That's the last time I turn computers off when I go away

That's the last time I turn computers off when I go away. The PSU in the Alpha appears to be dead, and the PSU fan in the x86 desktop was sticking as it went round. It seems to be better now it's warmed up a little, though. For the moment I've pulled the disk out of the Alpha to regain access to some CVS repositories,

OK, see how you get on with http://ww.telent.net/diary/diary.rss

OK, see how you get on with http://ww.telent.net/diary/diary.rss

There are people (well, there's one particular person using an RSS aggregator called rawdog) grabbing the rss feed for my rambles every twenty minutes

There are people (well, there's one particular person using an RSS aggregator called rawdog) grabbing the rss feed for my rambles every twenty minutes. Right through the night, too. Which seems a trifle excessive to me, but hey, it's getting a 304 every time, so no big deal.

Means I probably should tweak the Analog configuration, though, if I want its lovely statistics to be at all meaningful.

I see that LWN has got its predictions for what happens to Linux in 2004 up

I see that LWN has got its predictions for what happens to Linux in 2004 up. Inspired by that, I'm going to make some predictions of my own about Lisp in 2004

  • 2004 will not be the "Year of Lisp". When was the Year of Linux? The Year of the Internet? The Year of the LAN (remember that one?) Exactly. 2004 will be the year that some number of people suddenly realise they can use Lisp to solve their problems, and it will be a bigger number than had the same realisation last year, and 2005 will have more still. But that doesn't make either (any) of them the Year of Lisp.

  • SBCL will get callback support, and threads on some non-x86 platform (most likely PPC). It's kind of cheating for me to make SBCL predictions, though, so expect more when I've talked to Christophe and we've decided what we want to do.

  • Sometime after it gets its CVS repository back on the net, CMUCL will release 19a, which will have approximately what the current CMUCL CVS versions have in them. The CMUCL project suffers hosting problems on an approximately two-yearly basis, so will not seriously drop off the net again until 2005.

  • Gary Byers will tire of the low takeup of OpenMCL 0.14 pre-releases and will label it an official release so that users actually get it and start beating on the native threads support seriously. No major problems will result.

  • SLIME will replace ILISP as the "default" emacs interface for free Lisps. It will gain thread support.

  • MCCLIM will progress from its current status as the project that everyone admires and nobody actually uses, to a new position as a project that everyone installs to play with the Listener, but doesn't then use for anything serious. This will not change until a real application (probably Maxima) is released that depends on it. Lots of talking and approximately no hacking will be done on a CLIM-based editor/IDE, and a CLIM Gtk backend.

  • A "major" Linux distribution (i.e. Red Hat or SUSE) will follow the lead set by Debian, Gentoo and Slackware, and put a decent general-purpose CL system (CMUCL or SBCL) into their distribution

  • They'll screw it up by having someone do it who doesn't actually use Lisp, so it'll be (a) an old version, (b) a brand-new release which is missing the rushed post-release patches that actually make it work, (c) compiled with really dumb compilation options (SBCL without threads, or CMUCL with incorrect source paths, or similar), or (d) missing bits (like the FreeBSD SBCL port has no contrib). Maintainers of whichever CL it is will bemoan the fact that the distro fouled it up so badly. The distro hacker will claim that he asked for advice on IRC/usenet but everyone was too busy arguing about politics and/or Norway to give him an answer. People will look back through Dejanews or the IRC logs and find his question, which was written at the end of a 16 hour day wrestling with some out-of-date tarball from Sunsite and started with "Why is this stuff all so totally undocumented, broken, and completely fucking incomprehensible. Lisp sucks" and escalated from there. Everyone will take sides. A bug will be reported in the vendor's bug reporting system, which will be closed with the tag 'WONTFIX' or local equivalent.

    (Note: it doesn't have to go that way. Red Hat and SUSE hackers are warmly invited to introduce themselves as such on the relevant development lists and ask for packaging advice. We're friendly enough if we don't think we're being trolled, and it's in all our interests to get decent quality packages into the major distributions)

  • comp.lang.lisp will become still more offtopic and still less useful. At least one defender of proprietary Lisp development and one prominent free Lisp hacker will blow up loudly and publicly and swear that he is leaving {cll, usenet, Lisp, software} forever. Nobody will believe him, though many will claim to.

    (Hmm. How public have I been about my departure from cll? I expect it's temporary until I get broadband sorted out again, but after an hour playing with Dejanews yesterday I have to admit I'm not missing it much. But it's still 6 hours before 2004 as I write this, so you should probably look for another free Lisp hacker anyway)

  • Peter Seibel's book will be released, making many people happy and me jealous.

  • Sourceforge will suffer some major and long-term screwage, affecting many free Lisp projects. By the time it returns to normal service

    • SBCL and ECL will have moved elsewhere

    • CCLAN as a brand name will have been largely forgotten, but asdf and asdf-install will be moved to common-lisp.net.

    • ILISP probably won't have noticed.

  • I haven't heard anything about ILC 2004 yet; I don't know what the plans are. I do know that Ray de Lacaze did an absolute ton of work last year and I was sorry to miss it. Whether ILC happens or not this year, I predict more Lisp presence at other conferences, including LSM/RMLL, UKUUG, LinuxTAG etc.

  • Probable vapourware in 2004: CL-Emacs (unless Gilbert resurfaces or Luke Gorrie gets into it), productized cirCLe, an SMTP server with Bayesian spam filtering, another LispOS and another attempt at a c.l.l FAQ

I don't know if this is something that Livejournal users can tell already for themselves, but just in case it's useful: I see http://www.livejournal.com/users/dan_b_feed/ in my referrer log

I don't know if this is something that Livejournal users can tell already for themselves, but just in case it's useful: I see http://www.livejournal.com/users/dan_b_feed/ in my referrer log. If this helps you with "friends lists" or however that stuff works, please feel free.

Remember what I said before Christmas about GC frobbing

Remember what I said before Christmas about GC frobbing? With the new simpler region allocation policy described there, but no changes to static space usage:

  * (time (dotimes (i 400) (gc)))

  Evaluation took:
    30.413 seconds of real time
    30.354383 seconds of user run time
    0.058991 seconds of system run time
    0 page faults and
    4939776 bytes consed.
  NIL

and in fact we can easily double that again by disabling the check for read-only objects and dumping everything in static space, so it looks like the write barrier stuff does make a difference. So why is CMUCL doing in 13 seconds what takes us 30 (the easy answer is "it has a better object/region allocation policy", obviously, but better why?) and how much faster would it be if it were also to remove purify-into-static-space?

In happier news, while thinking about tidying up today I found a Brian Aldiss novel I apparently haven't read yet.

I lied

I lied. Well, I was mistaken: in the course of tidying up purify so that I might understand it, I managed to break it. !(a && !b) is not, De Morgan will happily tell you, the same thing as (!a && b). Doh. I was copying a whole bunch of stuff (constant boxed vectors, or possibly unboxed inconstant vectors) to static space that should have been happy in read-only space. So static space was up to 9.5Mb from its normal 5, hence the extra time spent checking it. If it takes an approximate extra 10s to GC when there's an extra 5Mb of static space to check, this also goes some way to explain why CMUCL is faster: in 18e it only has 2.5Mb of static space, so that's 5s off the GC time, leading us to predict that it should take about 15s. It's actually a bit under (13s) but that's only a couple of seconds still to claw back, then.

All times are to execute (time (dotimes (i 400) (gc)))
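The De Morgan slip is easy to check exhaustively. A small sketch (the function names are invented here; the two expressions are the ones quoted above):

```c
#include <stdbool.h>

/* The condition as originally written. */
static bool original(bool a, bool b)  { return !(a && !b); }

/* The incorrect "tidied" version. */
static bool broken(bool a, bool b)    { return !a && b; }

/* What De Morgan actually gives: !(a && !b) == (!a || b). */
static bool de_morgan(bool a, bool b) { return !a || b; }

/* Count the input combinations on which two predicates disagree. */
static int count_mismatches(bool (*f)(bool, bool), bool (*g)(bool, bool))
{
    int n = 0;
    for (int a = 0; a <= 1; a++)
        for (int b = 0; b <= 1; b++)
            if (f(a, b) != g(a, b))
                n++;
    return n;
}
```

`count_mismatches(original, broken)` is 2 (the cases where a and b are equal), while `count_mismatches(original, de_morgan)` is 0.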

For a change I decided to google "dan_b" instead of some other variation on my name, and stumbled across (among other things) an irc log from 1998

For a change I decided to google "dan_b" instead of some other variation on my name, and stumbled across (among other things) an irc log from 1998

<lilo> it seems to me we do this for the fun of it, and for things we can use
<esr> lilo, I don't play politics and I don't spend any time trying to tear down or disband anybody else.  I wish everybody would realises that that shit is a waste of time.
<lilo> some of us definitely do :)

I remember watching this at the time. I don't remember if it was before or after VA, but I'm guessing it was before.

More RSS fun

More RSS fun. Apparently weblog rss feeds should have items in reverse chronological order, just like the weblogs they syndicate. So, I switched mine around. Also added the first sentence of each item as the item title: this will fall down in cases where my rather poor heuristics for detecting sentences fail, or in entries where my inverted pyramid has overbalanced. Apologies to anyone using an RSS client which will therefore decide that these are all brand new entries and make them read all my old entries again.
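For the curious, a first-sentence heuristic of roughly this kind might look like the sketch below. This is a guess at the general idea, not the code the diary actually uses (which is Lisp and isn't shown here); the function name and the rule are hypothetical.

```c
#include <ctype.h>
#include <stddef.h>

/* Length of the first "sentence" in text: up to and including the first
 * '.', '?' or '!' that is followed by whitespace or the end of the
 * string. If there is no such terminator, the whole text counts. */
static size_t first_sentence_length(const char *text)
{
    size_t i;
    for (i = 0; text[i] != '\0'; i++) {
        char c = text[i];
        if (c == '.' || c == '?' || c == '!') {
            char next = text[i + 1];
            if (next == '\0' || isspace((unsigned char)next))
                return i + 1;
        }
    }
    return i;  /* no terminator found: use the whole text */
}
```

Requiring whitespace after the punctuation keeps dots inside URLs and version numbers from ending the title, but an abbreviation such as "i.e." followed by a space still cuts it short, which is exactly the sort of case where a heuristic like this falls down.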

If anyone would like to recommend a GUI RSS aggregator for Linux that's not (a) Straw, or (b) Emacs-based, please feel free. There's nothing particularly wrong with Straw, but I worry whether the debugging messages it spits out from time to time are caused by problems with it or problems with my rss.


telent netowrks

Geeky stuff about what I do. Many include Lisp, Android, Javascript, Linux and matters arising. For my other personality (less tech and more skating/cycling), see coruskate