[Openmcl-devel] Re: [Bug-openmcl] NSViewHierarchyLock Assertion failure

Gary Byers gb at clozure.com
Tue May 11 21:37:02 PDT 2004



On Tue, 11 May 2004, Raffael Cavallaro wrote:

>
> On May 11, 2004, at 9:29 PM, Gary Byers wrote:
>
> > How does it crash ?  Is there any output generated anywhere, does the
> > beachball cursor appear and not go away, or does something else
> > happen ?
>
> Sorry about the lack of specificity.
>
> If run from the IDE, when I close an animating tiny-loop window, the
> IDE simply goes away - no "The Application OpenMCL has unexpectedly
> Quit. Would you like to file a report" dialog. The Console log shows
> the following:
>
> Unhandled exception 11 at 0x00010a14, context->regs at #xf02c7848
> Read operation to unmapped address 0xf892f2fc
>   In foreign code at address 0x00010a14
> ? for help
> [592] OpenMCL kernel debugger:

You'll generally never get to the "unexpectedly quit" dialog (though
it'd sometimes be handy to be able to propagate an exception from the
kernel exception handler to the next exception handler in the chain).
Currently, if the application was launched from the Finder, the kernel
debugger will get EOF when it tries to read from standard input (which
is /dev/null), and just exits when it gets the EOF; if it instead
propagated the exception, the user'd get a bit more feedback.

As it turns out, address #x10a14 is in the lisp kernel's exception
handler itself: it tries to read the instruction that the PC's pointing
at, so that it can more quickly handle the exceptions related to consing.
If the PC's pointing into the ozone - and I'd guess that #xf892f2fc
qualifies as being in the ozone - we fault again.  Oops.

I guess that the next question is why we're trying to execute code at
#xf892f2fc.  I don't know.  I can't reproduce any of this, and hadn't
heard that there was a problem until earlier today.


>
>
> If run from a terminal, when I close an animating tiny-loop window, I
> get the spinning beach ball, and the terminal output is similar:
>
> ? Unhandled exception 11 at 0x00010a14, context->regs at #xf03fa788
> Read operation to unmapped address 0x0c54b710
>   In foreign code at address 0x00010a14
> ? for help
> [594] OpenMCL kernel debugger:
>

The kernel debugger's backtrace option might or might not be able to
walk past the double exception.  Can it ?  What does it say we were
executing when we wandered off into the ozone ?

> I could send all the code, but the only relevant parts are those I've
> changed in tiny.lisp:
>
> (defvar max 9)
> (defvar min 3)
> (defvar step 2)
> (defvar numsides min)
>
> Note that numsides is a constant in the original tiny.lisp. Here it
> varies to animate different polygons.
>
>
> and, inside tiny-setup, right before it returns w, the ns-window, I've
> added the following which changes numsides, and redraws the polygon
> window:
>
> (labels ((draw-a-polygon (n)
>            (declare (optimize (speed 3) (safety 0) (space 0)
>                               (compilation-speed 0)))
>            (setf numsides n)
>            (slet ((view-bounds (send my-view 'bounds)))
>              (send my-view :draw-rect view-bounds)
>              (send my-view :display-rect view-bounds)))
>          (loop-up ()
>            (declare (optimize (speed 3) (safety 0) (space 0)
>                               (compilation-speed 0)))
>            (loop for i from min to max by step do
>              (draw-a-polygon i)))
>          (loop-down ()
>            (declare (optimize (speed 3) (safety 0) (space 0)
>                               (compilation-speed 0)))
>            (loop for i from (- max step) downto (+ min step) by step do
>              (draw-a-polygon i))))
>   (loop do (loop-up) (loop-down)))
>
>
> These are the only changes I've made. I get the same, consistently
> repeatable crash if I just:
>
> (require 'tiny-loop)
> (cll::tiny-setup)
>
> and then close the Polygon Window (i.e., the drawing output window)
> while it is drawing/looping/animating the first time after launching
> OpenMCL, either from a terminal, or the IDE. Later invocations and
> window closings will not reliably cause a crash - it must be
> immediately after starting OpenMCL. As I noted, entering a break first
> will prevent OpenMCL from crashing when I close the drawing window.

Hmm.

The last thing that cocoa.lisp does is to call:

(start-cocoa-application)

which spawns off a thread to take over periodic "housekeeping" tasks
from the initial thread, then forces the initial thread to run the
Cocoa event loop.  When a newly-initialized NSApplication is first
sent a "run" message, it does some further initialization, and it often
takes a few seconds before the event loop gets started.

The call to (REQUIRE-COCOA) returns as soon as the call to
START-COCOA-APPLICATION returns (which can be a few seconds before the
application's fully initialized.)  If you do:

(progn
  (require "COCOA")
  (start-using-cocoa-as-if-it\'s-fully-initialized))

you can run into the fact that it's quite likely -not- fully initialized.

I've run into this in the past, and came up with an ad hoc
not-quite-right mechanism to deal with the problem.  A method that's
invoked in the application delegate when the application's finished
its initialization signals a semaphore, and clients who want to wait
for Cocoa to finish initializing (as opposed to those that just feel
lucky ...) before trying to use it are supposed to wait on that
semaphore (see "cocoa-application.lisp").
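
A minimal sketch of that kind of handshake, in case it helps to see
the shape of it.  The names *COCOA-READY*, NOTE-COCOA-INITIALIZED and
WAIT-FOR-COCOA are made up for illustration and are not the ones in
"cocoa-application.lisp"; the sketch assumes only the OpenMCL
semaphore and process functions, and a background thread stands in
for the real application delegate:

```lisp
;;; Sketch only: every name defined here is hypothetical.
(defvar *cocoa-ready* (ccl:make-semaphore))

(defun note-cocoa-initialized ()
  ;; Stands in for the application-delegate method that runs once
  ;; the application has finished its initialization.
  (ccl:signal-semaphore *cocoa-ready*))

(defun wait-for-cocoa (&optional (timeout-seconds 30))
  ;; Blocks until the semaphore is signaled or the timeout expires.
  ;; Returns true only if the semaphore was signaled.
  (ccl:timed-wait-on-semaphore *cocoa-ready* timeout-seconds))

;; Usage: simulate a slow startup on another thread, then wait for it
;; before touching any windows.
(ccl:process-run-function "fake-cocoa-startup"
  (lambda ()
    (sleep 1)
    (note-cocoa-initialized)))

(if (wait-for-cocoa)
    (format t "Cocoa reports ready; safe to open windows.~%")
    (format t "Timed out waiting for Cocoa.~%"))
```

One limitation of the sketch as written: each successful wait consumes
the single signal, so a second client waiting on the same semaphore
would block until its timeout.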

The idea's probably right, but the execution's flawed: it'd be a whole
lot saner if the effect of:

(require "COCOA")

was to only return control to the calling thread after everything had
been loaded and the event loop was up and running.  The fact that you
only encounter problems on the first call makes me think that the call
is happening too early (and entering a break loop as a workaround is
consistent with that); the fact that this also seems to happen from
a saved Cocoa application seems to shoot holes in an otherwise fine
theory ...

>
> BTW, I didn't mean to suggest some sort of cargo cult notion that
> merely removing TINY-LOOP from *modules*  was having some bizarre
> effect. I was simply noting that doing so would force tiny-loop.lisp to
> be recompiled by the (require 'tiny-loop) form.
>

The theories that I'm tossing around sort of revolve around the notion
that this is timing-sensitive (either in terms of Cocoa initialization
or in terms of thread contention, or both).  Those types of bugs are
generally very hard to reproduce (I haven't had any luck so far), and
practically anything that affects the precise sequence of events and
their timing may affect whether or not a problem manifests itself.

> Raf
>
>
> Raffael Cavallaro, Ph.D.
> raffaelcavallaro at mac.com
>


