diary at Telent Netowrks

Thread bug 3 identified

Mon, 19 Jul 2004 01:39:59 +0000

Thread bug 3 identified. Background: gc_stop_the_world is called from within a WITHOUT-INTERRUPTS form. WITHOUT-INTERRUPTS doesn't actually mask signals at the kernel level, but does set a flag that the C-level signal handlers check when deciding whether to defer a signal. SIG_THREAD_EXIT (the RT signal that the parent thread is sent when a child thread dies, a la SIGCHLD for traditional processes) is on the deferrable list (because it calls Lisp code), so if a thread dies between the start of WITHOUT-INTERRUPTS and the world getting stopped, there's nothing available to wait() for it. Thus it becomes a zombie and will never react to requests to suspend itself for GC.
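To make the deferral mechanism concrete, here is a minimal sketch in C of a handler that checks a flag rather than relying on a kernel-level signal mask. Every name in it (interrupts_disabled, pending_signal, run_lisp_handler and so on) is an assumption made up for illustration, not the runtime's actual code, and it only remembers a single pending signal.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <signal.h>
#include <stddef.h>

/* Hypothetical names throughout; a sketch, not the real runtime. */
static volatile sig_atomic_t interrupts_disabled = 0; /* set by WITHOUT-INTERRUPTS */
static volatile sig_atomic_t pending_signal = 0;      /* 0 means nothing deferred */
static volatile sig_atomic_t handled_count = 0;       /* times the "Lisp" handler ran */

/* Stand-in for the part of the handler that would call into Lisp. */
static void run_lisp_handler(int signo) {
    (void)signo;
    handled_count++;
}

/* The signal is still delivered by the kernel; we merely decline to act on it yet. */
static void deferrable_handler(int signo) {
    if (interrupts_disabled) {
        pending_signal = signo; /* remember it and return immediately */
        return;
    }
    run_lisp_handler(signo);
}

static void begin_without_interrupts(void) {
    interrupts_disabled = 1;
}

/* Leaving the form runs whatever was deferred while the flag was set. */
static void end_without_interrupts(void) {
    interrupts_disabled = 0;
    if (pending_signal) {
        int signo = pending_signal;
        pending_signal = 0;
        run_lisp_handler(signo);
    }
}

/* Returns 0 on success, -1 on failure, like sigaction(). */
static int install_deferrable(int signo) {
    struct sigaction sa;
    assert(signo > 0);
    sa.sa_handler = deferrable_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    return sigaction(signo, &sa, NULL);
}
```

With SIGUSR1 installed this way, a raise(SIGUSR1) between begin_without_interrupts() and end_without_interrupts() leaves handled_count untouched until the form ends. That window is exactly where a dying thread's exit notification would sit unprocessed.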

Alternate strategy: make thread_exit_handler just wait() and then set thread->state=STATE_DEAD, so the GC can see the thread's no more, then do the Lisp-level stuff and destroy_thread at some later time. Now we can remove SIG_THREAD_EXIT from the deferrable list. Pretty arbitrarily, we've decided that "after a GC" would be a reasonable later time; so now if you disable GC you have also disabled dead thread collection.
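Roughly, the split between "mark it dead now" and "clean it up after a GC" might look like the sketch below. It is a simulation under stated assumptions: the struct layout, the gc_inhibited flag, reap_os_thread (a stand-in for the wait() call) and the list handling are all hypothetical, and the actual stop-the-world and collection steps are left out.

```c
#include <assert.h>
#include <stddef.h>

/* All names are illustrative assumptions, not the runtime's real definitions. */
enum thread_state { STATE_RUNNING, STATE_DEAD };

struct thread {
    int id;
    enum thread_state state;
    struct thread *next;
};

static struct thread *all_threads = NULL;
static int gc_inhibited = 0;    /* nonzero while GC is disabled */
static int destroyed_count = 0; /* how many threads have been cleaned up */

static void register_thread(struct thread *th) {
    assert(th != NULL);
    th->state = STATE_RUNNING;
    th->next = all_threads;
    all_threads = th;
}

/* Stand-in for wait()ing on the dead child so it does not linger as a zombie. */
static void reap_os_thread(struct thread *th) { (void)th; }

/* Stand-in for the deferred Lisp-level work and freeing of thread resources. */
static void destroy_thread(struct thread *th) { (void)th; destroyed_count++; }

/* Handler side: reap and flag only. No Lisp code runs, so no deferral is needed. */
static void thread_exit_handler(struct thread *th) {
    reap_os_thread(th);
    th->state = STATE_DEAD;
}

/* The GC only has to suspend threads that are still alive. */
static int count_threads_to_stop(void) {
    int n = 0;
    for (struct thread *th = all_threads; th != NULL; th = th->next)
        if (th->state != STATE_DEAD)
            n++;
    return n;
}

/* Unlink every dead thread from the list and finish cleaning it up. */
static void collect_dead_threads(void) {
    struct thread **link = &all_threads;
    while (*link != NULL) {
        struct thread *th = *link;
        if (th->state == STATE_DEAD) {
            *link = th->next;
            destroy_thread(th);
        } else {
            link = &th->next;
        }
    }
}

/* Returns 1 if a GC ran (and dead threads were collected), 0 if GC is disabled. */
static int maybe_gc(void) {
    if (gc_inhibited)
        return 0;
    /* ... stop the world, collect garbage, restart the world ... */
    collect_dead_threads();
    return 1;
}
```

Because collect_dead_threads() is only reached from maybe_gc(), setting gc_inhibited leaves dead threads on the list indefinitely, which is the side effect noted above.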