[Openmcl-devel] fragile GUI Programming: Unhandled exception 10

Gary Byers gb at clozure.com
Fri Oct 30 13:32:28 UTC 2009






On Fri, 30 Oct 2009, Alexander Repenning wrote:

> Debugging in CCL has improved a lot but unfortunately there is still a very
> frequent case when building GUIs which essentially reduces the process of
> CCL programming to roughly the same level as C++ or Java, i.e., programming
> problem -> stack trace -> spinning beach ball 
> 
> When raising just about any kind of condition in the "initial" thread the
> AltConsole will come up with some handy exploration features including a
> simple backtrace. So far so good. But then when trying to exit the break
> loop one gets 
> 
> Unhandled exception 10 at 0x9137eaa7, context->regs at #xbffff2bc
> 
> No options left at this point. Spinning beach ball, force quit

One option is to type a "B" at the kernel debugger prompt, unless
you're saying that the kernel debugger prompt doesn't appear in
the AltConsole window after the "Unhandled exception ..." message.
The resulting backtrace might provide information that would identify
the problem.  (It might not provide enough information, but some 
information is generally better than no information.)

If instead of providing that information you're simply noting that
"crashing is bad", for once I'm in full agreement with you.

> 
> It would be OK if this would be a rare case but in a Cocoa app MOST code is
> running in the initial thread (e.g., menu functions, view, window call backs)
> which means that one is always dangerously close to a complete application
> crash even for simple problems as, say, calling a missing function. This
> really should not be the case with Lisp.

As I'm sure you're aware, not all of the code that's involved here is Lisp
code (the address 0x9137eaa7 is in some foreign code somewhere; the kernel
debugger usually tries to identify foreign addresses symbolically).  ObjC code tends
to be more runtime-extensible and "lisp friendly" than vanilla C is and it
has a condition system that supports UNWIND-PROTECT-like cleanup mechanisms.

Lisp certainly offers similar mechanisms, but it's possible to write code
that doesn't use them.  For instance:

(defun bad-code (pathname)
   (let* ((f (open pathname ...)))
     ...
     (if ... (error ...))
     ...
     (close f)))

is generally less robust (more likely to leave a file open) than
similar code that used WITH-OPEN-FILE or otherwise used UNWIND-PROTECT
to ensure that the file was closed regardless of whether or not an
error occurred.  UNWIND-PROTECT costs -something- (it's basically
just saving/restoring registers and other info), but in most contexts
the benefits of being able to clean up in exceptional situations greatly
outweigh what are likely very minor costs.
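
For comparison, here is a minimal sketch of the more robust shapes.  The
function names and the :FAIL argument are hypothetical (they just stand in
for the elided parts of BAD-CODE); both versions close the file whether or
not the error is signaled.

```lisp
;; Hypothetical stand-ins for the elided body of BAD-CODE.

(defun read-first-line (pathname &key fail)
  ;; WITH-OPEN-FILE closes the stream on normal return and on
  ;; any non-local exit (error, THROW, restart).
  (with-open-file (f pathname)
    (when fail
      (error "Simulated failure while ~a is open" pathname))
    (read-line f nil nil)))

(defun read-first-line-by-hand (pathname &key fail)
  ;; The same guarantee spelled out with UNWIND-PROTECT.
  (let ((f (open pathname)))
    (unwind-protect
         (progn
           (when fail
             (error "Simulated failure while ~a is open" pathname))
           (read-line f nil nil))
      ;; Cleanup form: runs however the protected form exits.
      (close f))))
```
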

The situation in the ObjC world is a bit different; historically, the
cost of establishing and disestablishing ObjC exception handlers was
fairly high (that got a lot better in the 64-bit ObjC runtime
introduced with Leopard), and tradeoffs are perceived differently:
it's likely to be seen as much more important to optimize the
execution path involved in setting up and tearing down the graphics
and drawing state around a call to a #/drawRect: method than it is
to ensure that everything's cleaned up properly if that #/drawRect:
method does a non-local control transfer (lisp THROW or ObjC exception)
before exiting.

If errors occur in event handlers, then doing :POP (or, perhaps
slightly better, choosing the restart that offers to abandon
processing of the current event and prepare to process the next one)
often "works", in the sense that there's usually very little state
that has to be cleaned up when unwinding the event thread's stack (and
what state there may be - autorelease pool management, for one thing -
is often protected by ObjC cleanup code).  It's not guaranteed (by
Apple) that all such state is in fact protected, and there isn't
much that can be done if it isn't, but in practice recovering from
an error in an event handler often works.

I think that there are cases where the lisp runtime should also
be a little more careful to use UNWIND-PROTECT than it is; the
editor should probably try to ensure that #/endEditing will reliably
follow #/beginEditing, for instance.  But I do think that it's often
possible to recover from trivial errors in methods called in response
to events; if the error's in a #/mouseDown: method, you may have to
refrain from clicking in the view until the error's corrected but
I find that the system is usually in a usable state after basically
discarding an event in response to an error in a handler.

The situation with #/drawRect: and variants is a bit different; when
Cocoa notes that a view needs redisplay, it does a lot of graphics
setup (at both the Cocoa and Quartz layers, at least) to ensure that
drawing occurs in the right context and to ensure that the calling
thread is allowed to draw in the view before calling #/drawRect: .
Those state changes and that locking are undone when #/drawRect:
returns, but I don't believe that that cleanup code is "protected"
(guaranteed to run in the event of a non-local control transfer)
by the ObjC equivalent of UNWIND-PROTECT.  (If you get an error
in a lisp-implemented #/drawRect: method and choose a restart which
returns to the event loop, we should run all ObjC and lisp cleanup
handlers in the right order on the way back to the event loop; I
don't think that there -is- any post-#/drawRect: cleanup, though
a lisp programmer might expect that sort of cleanup to exist.)

Since I don't know how to ensure that that cleanup will occur in
the event of a non-local exit, one approach to being able to have
errors in #/drawRect: methods and live to tell about it is to establish
a restart in the #/drawRect: method and transfer control to it in case
of error:

(objc:defmethod (#/drawRect: :void) ((self some-view) (rect #>NSRect))
   (with-simple-restart (abandon-drawing "Stop trying to draw in ~s" self)
     (progn
       ...
       (if (zerop (random 10))
         (error "Your number's up.")
         (do-some-drawing ...)))))

The default #/drawRect: method does nothing; if a method like the above
gets an error, it won't do any drawing but (if the ABANDON-DRAWING restart
is invoked in the break loop) it'll return, the Cocoa runtime will clean
up the graphics state and release locks, and it won't really care that
nothing was drawn; it's generally possible to run the IDE environment
and to fix the buggy method.

If we instead do a non-local exit (THROW/INVOKE-RESTART) out of the
break loop, what happens next likely depends on lots of factors (possibly
including 32/64-bit issues, exact library versions, etc.)  On x86-64
in OSX 10.6.1, it doesn't seem to be possible to effectively draw in
that view again (things behave as if the view was permanently locked;
I don't know what the real problem is.)  On other platforms and in
other contexts, one might see other behavior.  I doubt if the details
are very interesting, but I'd guess that establishing the restart
(or otherwise ensuring that the method returns) would avoid a lot
of unpleasant behavior.
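
The difference between the two exits can be sketched in portable Common
Lisp, with no Cocoa involved.  Everything here is a hypothetical stand-in:
FAKE-CALLER plays the part of framework code that tears down state after
the drawing method returns but does not protect that teardown, and the
HANDLER-BIND handlers simulate the choice made in the break loop.

```lisp
(defvar *cleanup-ran* nil)

(defun fake-draw-rect ()
  ;; Returns normally if the ABANDON-DRAWING restart is invoked.
  (with-simple-restart (abandon-drawing "Stop trying to draw")
    (error "Simulated drawing bug")))

(defun fake-caller ()
  ;; Setup, call, teardown; deliberately no UNWIND-PROTECT.
  (setf *cleanup-ran* nil)
  (fake-draw-rect)
  (setf *cleanup-ran* t))

(defun demo-restart ()
  ;; The method returns, so the caller's teardown runs.
  (handler-bind ((error (lambda (c)
                          (declare (ignore c))
                          (invoke-restart 'abandon-drawing))))
    (fake-caller))
  *cleanup-ran*)

(defun demo-nonlocal-exit ()
  ;; Control is thrown past the caller, so its teardown is skipped.
  (catch 'toplevel
    (handler-bind ((error (lambda (c)
                            (declare (ignore c))
                            (throw 'toplevel nil))))
      (fake-caller)))
  *cleanup-ran*)
```

(DEMO-RESTART) returns T and (DEMO-NONLOCAL-EXIT) returns NIL.  This only
illustrates the control flow; it says nothing about what the real Cocoa
and Quartz code does internally.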

There may be similar issues with some event handling methods in
some contexts; if there were, it might be necessary to handle
those cases specially (and avoid doing non-local exits past code
that should be "unwind-protected" at the ObjC level but isn't.)
In practice, I think that in (most ? all ?) of the cases where I've
had errors in event handlers it was indeed possible to simply
abandon the event.

It's certainly reasonable to expect lisp-like behavior from lisp
code; it's less reasonable to expect that kind of behavior from
ObjC code, and recovering from errors on the event thread requires
either the cooperation of both kinds of code or at least some
basic awareness of the issues on the part of the developer.  I
don't think that it's possible to get ObjC code that wasn't
written with error recovery in mind to cooperate with the
error recovery process; I can't imagine what that'd mean.
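
As one possible way for a developer to act on that awareness, here is a
sketch of a wrapper for callback bodies.  The macro name and the
ABANDON-CALLBACK restart are hypothetical.  It uses HANDLER-BIND rather
than HANDLER-CASE, so the error is reported before the stack is unwound
and the break loop (with its backtrace) is still entered; the restart
then lets the method return normally.  The CCL:PRINT-CALL-HISTORY call is
read only in CCL, and its use here is an assumption about what is wanted.

```lisp
(defmacro with-callback-guard ((description) &body body)
  "Run BODY with an ABANDON-CALLBACK restart and an error logger."
  (let ((desc (gensym "DESC")))
    `(let ((,desc ,description))
       (with-simple-restart (abandon-callback "Abandon ~a" ,desc)
         (handler-bind
             ((error (lambda (c)
                       ;; Runs before any unwinding takes place.
                       (format *error-output* "~&Error in ~a: ~a~%" ,desc c)
                       ;; CCL only; assumed to be the desired backtrace.
                       #+ccl (ccl:print-call-history :detailed-p nil)
                       ;; Returning declines, so outer handlers or the
                       ;; debugger see the condition next.
                       nil)))
           ,@body)))))

;; Usage sketch with a hypothetical callback body.
(defun guarded-callback ()
  (with-callback-guard ("guarded-callback")
    (error "Simulated bug in a callback")))
```

Choosing ABANDON-CALLBACK in the break loop makes GUARDED-CALLBACK return
NIL, and whatever called it continues as if the callback had done nothing.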





> 
> To reproduce (just tested with CCL 1.4):
> 
> ;; resizable window with red fill
> (in-package :ccl)
> 
> 
> (defclass crasher-view (ns:ns-view)
>   ()
>   (:metaclass ns:+ns-object))
> 
> ;;this works
> (objc:defmethod (#/drawRect: :void) ((self crasher-view) (rect :<NSR>ect))
>   (ccl::with-autorelease-pool 
>       (#/set (#/redColor ns:ns-color))
>     (#_NSRectFill (#/bounds self))))
> 
>


> 
> (ns:with-ns-rect (frame 100 100 200 200)
>   (let* ((w (make-instance 'ns:ns-window
>               :with-content-rect frame
>               :style-mask (logior #$NSTitledWindowMask
> #$NSClosableWindowMask #$NSResizableWindowMask)
>               :backing #$NSBackingStoreBuffered
>               :defer #$YES))
>          (view (make-instance 'crasher-view :with-frame (#/frame
> (#/contentView w)))))
>     (#/setContentView: w view)
>     (#/orderFront: w nil)
>     w))
> 
> ;; redefine with buggy version, resize window to cause problem
> (objc:defmethod (#/drawRect: :void) ((self crasher-view) (rect :<NSR>ect))
>   (ccl::with-autorelease-pool 
>       (#/set (#/redColor ns:ns-color))
>     (#_NSRectFill (#/bounds self))
>     ;; call function that does not exist
>     (missing-function 22)))
> 
> 
> in AltConsole you get:
> 
> > Error: Undefined function CCL::MISSING-FUNCTION called with arguments (22)
> .
> > While executing: (:INTERNAL CCL::|-[CrasherView drawRect:]|), in process
> Initial(0).
> > Type :GO to continue, :POP to abort, :R for a list of available restarts.
> > If continued: Retry applying CCL::MISSING-FUNCTION to (22).
> > Type :? for other options.
> 1 > :pop
> Unhandled exception 10 at 0x9137eaa7, context->regs at #xbffff2bc
> Exception occurred while executing foreign code
>  at __removeHandler2 + 199
> ? for help
> [46133] Clozure CL kernel debugger: 
> 
> 
> What are the options at this point? Is there a way to pop back from the
> problem, i.e., abort the event processing without a crash? One could wrap
> all the call back code up with some error handlers but then one does not get
> a backtrace to find the real problem which can be tricky in the case of some
> complex call back function calling all kinds of other functions. 
> 
> What is this exception "Unhandled exception 10 at 0x9137eaa7, context->regs
> at #xbffff2bc" ?
> 
> Any suggestions are appreciated,  Alex
> 
> 
> 
> 
> 
> Prof. Alexander Repenning
> 
> 
> University of Colorado
> 
> Computer Science Department
> 
> Boulder, CO 80309-430
> 
> 
> vCard: http://www.cs.colorado.edu/~ralex/AlexanderRepenning.vcf
> 
> 
> Openmcl-devel Devel <openmcl-devel at clozure.com>
>

