[Openmcl-devel] A bug and a quick fix

Gary Byers gb at clozure.com
Sat Dec 10 15:01:20 PST 2011


Googling for "close console window crash" returns something over 69 million
results.  I didn't read all of them, but some of the first few seem to describe
the same kind of symptoms that you reported.

If Windows's default behavior were to terminate the process via ExitProcess or
_exit, there wouldn't be any running thread in the process faulting while
trying to access its (or some other thread's) thread-local storage.  Whatever
Windows's default behavior is, it seems clear that we want to avoid it.

Giving running threads a chance to clean up after themselves (ultimately, via
CCL:QUIT) sounds better than abruptly exiting when the control message is
received, but there's a tradeoff there: Windows gives the process (IIRC)
5 seconds to exit, then reverts to its default behavior.  QUIT will
-almost- always complete in a very short period of time, but how long it
takes depends on how many threads are running, whether GC occurs and how
long it takes, and other factors.  I don't think that I've ever seen QUIT
take longer than 5 seconds, but I believe that it's possible (if unlikely.)

None of the choices is especially attractive:

a) trying to QUIT cleanly on receipt of a control message and hoping that
    it finishes in time.  (It often will, but this isn't guaranteed.)
b) terminating the process abruptly (without giving threads a chance to
    close files, etc.)
c) letting Windows do whatever it does and crashing
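
As a rough illustration of choice (a), here is a minimal C sketch of a
console control handler that only records that a quit was requested.  It is
not CCL's actual code: quit_requested, console_ctrl_handler and
install_console_ctrl_handler are hypothetical names, and the non-Windows
branch merely supplies stand-in definitions so that the sketch compiles
elsewhere.

```c
#include <signal.h>   /* sig_atomic_t */

#ifdef _WIN32
#include <windows.h>
#else
/* Stand-ins so the sketch also compiles off Windows; the event
   values match those in the Windows console headers. */
typedef unsigned long DWORD;
typedef int BOOL;
#define WINAPI
#define TRUE 1
#define FALSE 0
#define CTRL_C_EVENT        0
#define CTRL_BREAK_EVENT    1
#define CTRL_CLOSE_EVENT    2
#define CTRL_LOGOFF_EVENT   5
#define CTRL_SHUTDOWN_EVENT 6
#endif

/* Hypothetical flag that some other thread polls.  A real program
   might prefer an interlocked or atomic variable here. */
static volatile sig_atomic_t quit_requested = 0;

/* Ask for an orderly quit (choice a) on close, logoff and shutdown
   instead of falling through to the default handling (choice c). */
static BOOL WINAPI console_ctrl_handler(DWORD ctrl_type)
{
    switch (ctrl_type) {
    case CTRL_CLOSE_EVENT:
    case CTRL_LOGOFF_EVENT:
    case CTRL_SHUTDOWN_EVENT:
        quit_requested = 1;
        return TRUE;     /* report the event as handled */
    default:
        return FALSE;    /* leave Ctrl-C etc. to other handlers */
    }
}

/* Returns nonzero on success; off Windows this is a no-op that
   reports success. */
static int install_console_ctrl_handler(void)
{
#ifdef _WIN32
    return SetConsoleCtrlHandler(console_ctrl_handler, TRUE) != 0;
#else
    return 1;
#endif
}
```

Some other thread would have to notice quit_requested and do the actual
cleanup (in CCL's case, something that ends in CCL:QUIT).  The Windows
documentation suggests that the handler runs on its own thread and that, for
the close event, the process is terminated once the handler returns or the
timeout expires, so a real handler would probably have to wait for that
cleanup instead of returning immediately.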

I was looking at this the other day (running a 32-bit CCL on a 64-bit
Windows), and the debugger that I was using didn't seem able to set
breakpoints in the 32-bit process and I wasn't able to see what was
happening in any detail.  I wasn't able to tell when the console
window closed relative to when other things happened, but it was
certainly gone by the time that the crash dialog appeared.  If the
console window closes before the process terminates, that sounds like
a whole other set of things to go wrong while threads are exiting.

Some of this might have something to do with CCL, but at this point that's
not clear.  To my ear, "checking to ensure that thread-local storage exists
before accessing it" sounds like "checking to ensure that the stack is still
mapped before pushing something on it"; both situations are so anomalous
that it's not too productive to think about what to do if they arise; it
may be more useful to think about how to avoid them.

In any case, if you're distributing a program intended to run in a
Windows console window, that offers no way of quitting besides
clicking on the close button, and clicking on the close button
crashes, you might want to consider changing that.  You might be
confident that arranging for QUIT to be called soon (as your proposed
change does) will avoid the issue (and yes, taking longer than 5
seconds to QUIT is certainly very, very unusual even if it's
theoretically possible.)  If you're not entirely confident of that,
you might consider offering some other way to quit and disabling the
console's "close" button; ccl/cocotron/WaltConsole/WaltConsole.c shows
how to do the latter.
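
For readers without the CCL sources at hand, one common way to disable a
console's close button is to remove the Close item from the console window's
system menu.  The sketch below is an assumption about how that can be done
with standard Win32 calls; it is not copied from WaltConsole.c, and
disable_console_close_button is a hypothetical name.

```c
#ifdef _WIN32
#include <windows.h>
#endif

/* Returns 1 if the Close item was removed from the console window's
   system menu, 0 otherwise.  Off Windows this always returns 0. */
static int disable_console_close_button(void)
{
#ifdef _WIN32
    HWND console = GetConsoleWindow();   /* NULL if there is no console */
    HMENU menu;

    if (console == NULL)
        return 0;
    menu = GetSystemMenu(console, FALSE);
    if (menu == NULL)
        return 0;
    /* Removing SC_CLOSE also greys out the title bar's close button. */
    return DeleteMenu(menu, SC_CLOSE, MF_BYCOMMAND) != 0;
#else
    return 0;
#endif
}
```

A program that does this still needs to offer some other way to quit, since
the user can no longer close the window directly.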

On Thu, 8 Dec 2011, CRLF0710 wrote:

> 2011/12/8 Gary Byers <gb at clozure.com>:
>> but I saw the crash when
>> running under the standard Windows command interpreter (and probably
>> would have if I'd just double-clicked on wx86cl.exe and run in a
>> console.)  I have no idea what exactly causes the crash if this
>
> Yeah, and this problem is especially critical if you have deployed your program
> with ccl::save-application and sent it to others. When they try to
> close it.   BANG~ :)
>
>> Your fix below will make CCL act as if it got a SIGQUIT signal and
>> it'll try to quit in a somewhat orderly fashion (closing files and
>> running other cleanups first.)  That seems reasonable, but it's probably
>> not totally unreasonable to just call _exit() or the Windows equivalent
>> in that case.
>
> Well, I think that is the default behavior. Actually, Windows will just
> try to quit the application if you do nothing there (return FALSE).
> If you call TerminateProcess there, CCL will of course quit
> immediately, but that may cause data loss if there are unflushed
> streams or things like that.
> If you call _exit() or ExitProcess there, I don't think that is very
> different from the default behavior. You see, when Windows tries to
> close the application itself, the exception with AUX_TCR = NULL will
> be raised (why is the exception raised anyway...).  Maybe it won't
> raise an exception, maybe it will. I don't know. Maybe you can give it
> a test?
>
> But I think the signal approach is more natural, because that's what
> we are already doing for the Ctrl-C event. (We're adding support for
> Ctrl-Break, Ctrl-Close, Logoff and Shutdown. It doesn't have to be this
> signal; SIGQUIT, SIGTERM, SIGKILL, or anything like that is fine,
> as long as it is handled properly by CCL.)
>
> -- 
> Wir müssen wissen; wir werden wissen!
> CrLF.0710
>
>


