[Openmcl-devel] help on diagnosing an OpenMCL crash

Gary Byers gb at clozure.com
Fri Jun 18 13:44:03 PDT 2004



On Fri, 18 Jun 2004, Gary King wrote:

> We're running OpenMCL 0.14.2-p1 under OS X 10.3.4 on several different
> machines. We have a fairly large Lisp tick-based simulation / data
> collection application and have run into a problem recently when trying
> to run the simulation for long periods of time. Note that this runs
> successfully under MCL. Under OpenMCL, one of the following behaviors
> occurs:
>
> (). OpenMCL just quits and returns to the command prompt at the terminal
> (). OpenMCL quits with a bus error
>
When it quits with a bus error, what -exactly- does the message say ?

Is there a crash log in ~/Library/Logs/CrashReporter/ ?  (My dim
recollection is that you used to have to do something to enable this,
or to enable logging to application-specific log files, or something.
I can't find any such option in Panther's Console.app and seem to
have application-specific crash logs ...)
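
If it helps with looking for that log, here is a minimal sketch (in Python, only for illustration) that lists candidate files in the CrashReporter directory, newest first. The directory is the one mentioned above; the helper name `find_crash_logs` and the "openmcl" name fragment are assumptions, since the crash log may well be filed under whatever the kernel binary is actually called on your machine.

```python
from pathlib import Path


def find_crash_logs(log_dir, name_fragment):
    """Return files in log_dir whose name contains name_fragment
    (case-insensitive), sorted newest first by modification time."""
    root = Path(log_dir).expanduser()
    if not root.is_dir():
        # No such directory: crash logging may be disabled or never triggered.
        return []
    fragment = name_fragment.lower()
    matches = [p for p in root.iterdir()
               if p.is_file() and fragment in p.name.lower()]
    return sorted(matches, key=lambda p: p.stat().st_mtime, reverse=True)


if __name__ == "__main__":
    # "openmcl" is a guess at the log's name; adjust it to the binary's name.
    for path in find_crash_logs("~/Library/Logs/CrashReporter", "openmcl"):
        print(path)
```

An empty result doesn't prove that nothing crashed; it may only mean the log was written under a different name or that logging wasn't enabled.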

> I'm not sure what's going on here and I'm not sure how to find out! Are
> there things I can do to help ensure that OpenMCL ends up in the
> debugger instead of returning to the command point?

In most cases, a thread-level exception handler should catch the
exception that would eventually lead to a SIGBUS signal.  The
exceptions (as in "exceptions to the rule ...") that I can think
of involve:

1) foreign threads that have never called back into lisp
2) the thread that handles exceptions for all other threads
3) a thread running in the environment immediately after #_fork()
   and before #_exec()

(2) handles exceptions by pushing an exception frame on the faulting
    thread's stack and then making it call some code in the lisp
    kernel.  In theory, this can get a SIGBUS if pushing the frame
    on a suspended thread's stack accesses unmapped memory; it's not
    clear how this could happen unless the suspended thread had
    severely overflowed its stack and this had gone undetected.

That possibility is probably more likely than the others, but it
doesn't sound very likely ... I'd be interested in seeing the
crash dump if you can find it, but it's not clear that it'd
tell us -why- the thread's stack pointer is in or near an unmapped
page, if that's indeed what the problem is.
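
To make the scenario in (2) concrete, here is a toy model of the arithmetic (Python, purely illustrative). It assumes a stack that grows downward and is mapped over the half-open range [stack_low, stack_high); the function name, the addresses and the frame size are all hypothetical and are not taken from the lisp kernel.

```python
def frame_push_faults(stack_low, stack_high, sp, frame_size):
    """Return True if writing frame_size bytes just below sp would touch
    memory outside the mapped range [stack_low, stack_high).

    Toy model only: assumes a downward-growing stack and a single
    contiguous mapped region with nothing mapped on either side."""
    new_sp = sp - frame_size          # the frame occupies [new_sp, sp)
    return new_sp < stack_low or sp > stack_high


# Hypothetical numbers: a stack mapped at [0x1000, 0x9000).
print(frame_push_faults(0x1000, 0x9000, 0x8000, 0x200))  # plenty of room
print(frame_push_faults(0x1000, 0x9000, 0x1100, 0x200))  # frame crosses the low end
print(frame_push_faults(0x1000, 0x9000, 0x0800, 0x200))  # sp already past the low end
```

In this model the push can only fault when the suspended thread's stack pointer is already at or beyond the edge of its mapped stack, which is why an undetected overflow is the only obvious way to get there.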


>
> thanks,
> --
> Gary Warren King, Lab Manager
> EKSL East, University of Massachusetts * 413 577 0176
>
> When you lose small businesses, you lose big ideas.
>    -- Ted Turner, Washington Post May 30, 2003
>
> _______________________________________________
> Openmcl-devel mailing list
> Openmcl-devel at clozure.com
> http://clozure.com/mailman/listinfo/openmcl-devel
>
>


