[Openmcl-devel] forking to dump?

Gary Byers gb at clozure.com
Wed Jul 9 14:54:20 PDT 2008



On Wed, 9 Jul 2008, Hans Hübner wrote:

> Hi,
>
> one of my applications is a web server (it needs to run continuously)
> with a very large in-memory data set.  I can recover the data into a
> freshly started Lisp from files, but in order to save time, I would
> like to regularly dump images.  Now, dumping an image takes 1-2
> minutes and the Lisp exits when it is done, so I thought it might be

Out of curiosity, how large is the image?

> an option to duplicate the running process by forking, then dump the
> image in the child process.  I did some experiments, but my naive
> attempts do not work:
>
> (when (zerop (#_fork))
>  (ccl:save-application "teh-dump"))
>
> It appears as if SAVE-APPLICATION does not work at all in the child
> process - it just seems to exit having done nothing.
>
> Is there some way this can be made to work?  In principle, fork
> should create a process with the same memory layout as the parent
> process, but probably there are issues with threads that one needs to
> handle?
>

On Linux, #_fork is documented to only copy the calling thread from
the parent to the child.  (I think that the same is true on Darwin
and FreeBSD as well, but their man pages don't seem to mention these
newfangled thread things at all ...)

What this generally means is that in the child process, data structures
that describe threads continue to exist, but there's really only one
real OS-level thread left in the child process (as if all of the other
threads died without cleaning up after themselves.)  These defunct 
threads might appear to hold locks or other resources, and ... well,
generally, #_fork and threads don't get along too well: there are
lots of ways to lose in the child process.

In CCL, the thread that's created by the OS when the application starts
up is the value of CCL::*INITIAL-PROCESS*.  Ordinarily, this thread
does some lisp initialization, creates a thread to run the REPL, and
then spends most of its time sleeping, waking up a few times a second
to handle "periodic housekeeping tasks" (forcing output to *TERMINAL-IO*,
handling ^C interrupts, a few other things.)  QUIT and SAVE-APPLICATION
are both implemented as something like:

(progn
  (process-interrupt
   ccl::*initial-process*
   (lambda ()
     (shutdown-other-threads) ; and do other cleanup stuff
     (do-internal-quit-or-save-application)))
  ;; Kill the current thread in an orderly fashion if possible
  (process-kill *current-process*))

In the world of a child (OS) process after #_fork, the value of
CCL::*INITIAL-PROCESS* will be one of these "defunct threads" (a
perfectly reasonable-looking thread object from the points of view of
both the lisp and the threads library, but something with no real
OS-kernel-level thread "underneath" it.)  The PROCESS-INTERRUPT
has no effect, the calling thread exits, and no heap image is
written.
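
A quick way to see which case you're in is to compare the calling
thread with the initial one.  This is only a sketch: the function name
IN-INITIAL-THREAD-P is made up for illustration and is not part of CCL.

```lisp
;; Hypothetical helper, not part of CCL.
(defun in-initial-thread-p ()
  "True if the calling lisp thread is the one in CCL::*INITIAL-PROCESS*."
  (eq ccl:*current-process* ccl::*initial-process*))
```

Called from the REPL this should normally be false, since the REPL
runs in its own thread; a #_fork done from there leaves the child with
only that REPL thread.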

You can try to work around this particular lossage by doing:

(process-interrupt ccl::*initial-process*
   (lambda ()
     (when (eql (#_fork) 0) (save-application ...))))

which causes the one thread which the child OS process inherits
to be the initial thread; it then interrupts itself (basically
a nop) and runs the internals of SAVE-APPLICATION to shut things
down and write the image.

I tried that and it happened to work on FreeBSD, but the environment
in which the child finds itself - where all of the other threads that
claim to be there aren't really there, even if they appear to hold
locks - offers lots of ways to lose.  (It offers even more ways to lose
on Darwin, where Mach-level exception handling isn't set up in
the child process.)
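
For anyone who wants to experiment, the workaround could be wrapped up
roughly as follows.  Treat this as an untested sketch rather than a
supported interface: FORK-AND-DUMP and its IMAGE-PATH argument are
names invented here, and the error handling is minimal.

```lisp
;; Hypothetical helper, not part of CCL.
(defun fork-and-dump (image-path)
  "Ask the initial thread to #_fork, then save an image in the child."
  (ccl:process-interrupt
   ccl::*initial-process*
   (lambda ()
     ;; This body runs in the initial thread, so that is the one
     ;; thread the child inherits from #_fork.
     (let ((pid (#_fork)))
       (cond ((zerop pid)
              ;; Child: write the heap image; the child lisp then exits.
              (ccl:save-application image-path))
             ((minusp pid)
              ;; #_fork failed; no child was created.
              (warn "fork failed, no image written"))
             (t
              ;; Parent: carry on serving.  This sketch does not
              ;; wait for the child.
              nil))))))
```

With the file name from the original example that would be called as
(fork-and-dump "teh-dump").  All of the caveats above still apply: the
child can lose on any lock that a defunct thread appears to hold.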



> Thanks,
> Hans