[Openmcl-devel] Modal dialog problems with CCL 1.9 32/64 on Mountain Lion

Gary Byers gb at clozure.com
Thu Aug 30 19:45:25 PDT 2012



On Thu, 30 Aug 2012, Alexander Repenning wrote:


> I don't think hoping Mountain Lion will just go away is a very
> productive approach.

Well, the alternative is pretty depressing.

But ... whenever I get depressed, I like to shop in the App Store, and now
that's easier than ever!

(Think carefully: if I wasn't taking this seriously and trying to figure
out what was going on, I'd ignore the issue and wouldn't write lengthy
email messages. This being the real world, I do have other responsibilities
and customers who think that their issues are important, and even if I thought
that Lion and Mountain Lion weren't (mostly) steaming piles of crap I'd still
have the same responsibilities and prior commitments.

If you've thought about this carefully and still don't get it, you
probably never will.  Continue to believe that I don't have your deep
sense of responsibility or whatever it is that you believe, but I don't
think I'm out of line in getting tired of hearing about it.)


> Matthew's efforts are not wasted. I can crash
> CCL 1.9-dev-r15450-trunk 32 and 64 bit with and without his mods (he added
> gui:execute-in-gui). However, with gui:execute-in-gui I have to select files
> to make the crash happen.

To clarify: you're repeatedly calling a version of CHOOSE-FILE-DIALOG that
is essentially:

(gui:execute-in-gui (lambda ()
                      (let* ((open-panel (#/openPanel ns:ns-open-panel)))
                        (#/retain open-panel)
                        (#/runModal open-panel)
                        (#/release open-panel))))

(like your CHOOSE-FILE-DIALOG2 below)

and whether or not you get the memory/other problems depends on whether
or not a file gets selected (even though nothing in the code above seems to
care)?

If so ... GUI:EXECUTE-IN-GUI is the right idea here, but I'm ignorant enough
of how dispatch queues work and of how they interact with the Cocoa event loop
to be a little skeptical of that function's implementation. (To say it again,
that skepticism is based on nothing more than my ignorance.)

Another way of doing the important part of what GUI:EXECUTE-IN-GUI does is
to do it at the Cocoa level; since we're not interested in a return value for
this test, we can define an ObjC method that does what CHOOSE-FILE-DIALOG2
does.  (We can define this method on any class; for the sake of argument,
we'll define a new class here.)

(defclass example (ns:ns-object)
   ()
  (:metaclass ns:+ns-object))

;; The method takes one argument, so its selector needs a trailing colon.
(objc:defmethod (#/chooseFileDialog2: :void) ((self example) arg)
  (declare (ignorable arg))
  (let* ((panel (#/openPanel ns:ns-open-panel)))
    (#/retain panel)
    (#/runModal panel)
    (#/release panel)))

(defvar *instance-of-example* (make-instance 'example))

(dotimes (i 100)
  (#/performSelectorOnMainThread:withObject:waitUntilDone:
   *instance-of-example*
   (objc:@selector #/chooseFileDialog2:)
   +null-ptr+
   t))

It'd be interesting to know whether that behaves any differently for you
than the one which uses GUI:EXECUTE-IN-GUI.

My model of how your application works is that the user clicks on a
button and that causes CHOOSE-FILE-DIALOG to be called (and eventually
causes problems) and that all happens on the main/event thread.  If
that's true, then no mechanism for forcing the file panel to open on
the main thread should be necessary and any difference between
mechanisms isn't relevant. If the mechanisms behave differently and
GUI:EXECUTE-IN-GUI is responsible for some of the problem, then we
should figure out what the issue is and fix it so that things like
this can be done more reliably, but any (hypothetical) problems with
GUI:EXECUTE-IN-GUI wouldn't be involved in your application unless it
works very differently than I assume it does.

(And again, looking at the mechanism here is a stab in the near-dark;
any concerns that I have about the current mechanism are based entirely
on ignorance.)
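
The assumption that the button handler already runs on the main/event
thread can be checked by asking Cocoa directly.  The following is only a
sketch: REPORT-THREAD is a made-up helper name, and it assumes that the
bridge hands back the result of NSThread's +isMainThread as a Lisp boolean.
Output written from the event thread may end up in the system console
rather than in a listener window.

```lisp
;; Hypothetical helper: call it from the button's action method, just
;; before CHOOSE-FILE-DIALOG, to see which thread the call is made on.
(defun report-thread (label)
  (format t "~&~a: ~:[NOT on~;on~] the main thread (~a)~%"
          label
          (#/isMainThread ns:ns-thread)      ; assumed to return T or NIL
          (ccl:process-name ccl:*current-process*))
  (force-output))
```

If this reports "on the main thread" from the button handler, then neither
forcing mechanism is involved in the application itself.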

> 
> At any rate, I think CCL and Mountain Lion are not playing together well.
> What do you want me to do to get things fixed: mail big check, submit more
> test cases, try with more tools, send Apple motivational speaker...?


In approximate order of complexity:

1) please try the alternate mechanism above and let me know if anything
can be said about it relative to the current mechanism.

2) You say that a loop that creates N threads works fine and that one
that calls CHOOSE-FILE-DIALOG2 and creates a thread on each iteration
fails.  Does a loop that merely calls CHOOSE-FILE-DIALOG2 have problems,
or is it the combination of using the open-panel and creating a thread?
If the "alternate mechanism" of #/performSelectorOnMainThread:... seems
viable in (1), does it affect this?

3) More ambitious: try this test in ObjC.  If it's possible to demonstrate
that ObjC and Mountain Lion don't play together well, it might be possible
to interest Apple in that fact.  (This may be more likely if it's possible
to convince them that that matters; I'm not confident of that, but it's
worth a try.)

Obviously, standard use of NSOpenPanels works in applications written in
ObjC; just as obviously, they seem to work fine in the CCL IDE (and don't
involve any version of CHOOSE-FILE-DIALOG.)  The goal here is to see whether
doing things in as similar a way as possible to what your CCL-based tests
do reproduces the problem without having to say the L word; if Lisp is out
of the picture, it might be easier to get someone to identify either a
problem with the way things are done or a bug in Mountain Lion.

I'm serious in suggesting this (and understand that it'd be a non-trivial
project.)  If Matt wasn't busy goofing off by developing an application
for a customer (and doing so on Mountain Lion) I'd ask if he had time to do
it; if anyone has time and interest, I can further elaborate on some of the
things that CCL does that may be interesting.
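
For (2), the two loops being compared could look roughly like the
following.  This is an untested sketch: PANEL-ONLY-TEST and
PANEL-AND-THREAD-TEST are names made up for illustration, and
CHOOSE-FILE-DIALOG2 is the function from the quoted message below.

```lisp
;; Variant A: only opens the panel; no thread is ever created.
(defun panel-only-test (&optional (n 100))
  (dotimes (i n)
    (gui:execute-in-gui #'choose-file-dialog2)))

;; Variant B: opens the panel and also creates a do-nothing thread
;; on each iteration, as in the reported test case.
(defun panel-and-thread-test (&optional (n 100))
  (dotimes (i n)
    (gui:execute-in-gui #'choose-file-dialog2)
    (process-run-function "nothing" (lambda ()))))
```

If variant A behaves and variant B doesn't, thread creation is part of the
picture; if both fail, the open panel alone is enough to trigger the problem.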

Sending an Apple motivational speaker isn't recommended; I can't be
responsible for their personal safety.  If they were beaten to a
bloody pulp I'd certainly have an alibi, but ... let's just say that
my idea of what "changing the world, one person at a time" means in
this context has evolved over the last few years.

> 
> Alex
> 
> 
> 
> 
> 
> 
> _________
> 
> ;; CCL 1.9-dev-r15450-trunk (32/64) crash on Mountain Lion 10.8.1
> 
> 
> (defun choose-file-dialog2 ()
>   ;; 100% kosher: retain, no use of deprecated calls
>   (let ((panel (#/retain (#/openPanel ns:ns-open-panel))))
>     (#/runModal panel)
>     (#/release panel)))
> 
> 
> (defun THE-AMAZING-MEMORY-SURGE ()
>   (dotimes (i 100)
>     ;; if you use this then you will get a spinning progress bar and the
>     ;; folder contents will not show
>     (gui:execute-in-gui #'(lambda ()
>                             (choose-file-dialog2)))
>     (ccl::process-run-function "pretent to load project" #'(lambda ()))))
> 
> 
> ;; this will pop up a file chooser a number of times. Select a file and
> ;; watch the Activity Monitor.
> ;; Set view > update frequency in Activity Monitor to fast (0.5s) for best
> ;; results
> ;; Watch out for Clozure CL % CPU and Real Mem
> ;; for some time Real Mem will go up gradually (memory leak) then at some
> ;; unpredictable time it will SURGE to GIGABYTES of memory
> ;; and ultimately crash CCL
> ;; with with-autorelease-pool it may crash quite quickly with an Unhandled
> ;; exception 10, comment out if needed
> 
> ; (the-amazing-memory-surge)
> 
> 
> ___________________
> 
> 
> On Aug 30, 2012, at 8:45 AM, Gary Byers wrote:
> 
> 
>
>       On Wed, 29 Aug 2012, Alexander Repenning wrote:
>
>             On Aug 29, 2012, at 6:00 PM, R. Matthew Emerson wrote:
>
>                  On Aug 29, 2012, at 7:04 PM, Alexander Repenning
>                  <Alexander.Repenning at colorado.edu> wrote:
>
>                  I think I have here a pretty kosher (uses retain, does
>                  not use deprecated functions) version of the dialog +
>                  memory surge problem. This version of choose-file-dialog
>                  is completely stripped of any nonessential activity. It
>                  does not even return the path, i.e., there is no
>                  practical value to this function.
>
>             Please have a go and see if you can or cannot experience that
>             memory surge phenomenon. Please follow the instructions
>             closely. Otherwise you may miss the issues, which at times
>             can be kind of subtle.
>
>                  ;; CCL 1.8.1 64 (Mac App Store) crash on Mountain Lion
>                  ;; 10.8.1
>
>             The version of CCL in the Mac App Store still contains a bug
>             in it that Mountain Lion triggers.  I am pretty sure that
>             your test case is running into that bug.
> 
> 
>
>       At the very least, the way that you presented your test case
>       triggers a known bug in that version of CCL and says nothing
>       about whether or not some other bug remains.
>
>             I think we are going around in circles.
>
>       I would have used a harsher term for the above, but OK:
>       continually using a 64-bit version of CCL that's known to have a
>       bug which has similar symptoms is "going around in circles."  You
>       may actually be going through some careful and controlled testing
>       procedure and just spacing out and clouding the issue like this
>       when you report your findings, but this wastes time and makes it
>       harder than it should be to take you seriously.  (This is not the
>       first time that you've done this.)
>
>             While that bug does sound similar in spirit I am quite sure
>             it's not the one because:
>
>             1) the error also manifests itself in the 32 bit version of
>             CCL
>
>       Matt wouldn't have wasted his time if the message he replied to
>       had said so clearly.
>
>       I'm honestly not trying to be dismissive or sarcastic here.  This
>       stuff is complicated, and it's important to be as precise as
>       possible when discussing it (much more precise than one might be
>       in casual conversation.)  That may take extra effort, but the
>       alternative seems unacceptable to me.
> 
>
>             2) We tried the most recent version of CCL
>             1.9-dev-r15450-trunk (DarwinX8632)! and the 64 bit version.
>             Both the test case and the full app crashed with the same
>             memory surge.
>
>             When I try your modified version (which we actually did try
>             before as well) things APPEAR to be better as long as you
>             just dismiss the dialog with ESC. However, the reason for
>             that appears to be that the file chooser dialog then goes
>             into this super slow, spinning indeterminate progress
>             indicator mode where it does not list the contents of the
>             folder. That is interesting. However, if you actually try to
>             select a file, instead of pressing ESC, then there is a good
>             chance it will crash even faster than before. I never made
>             it beyond the first attempt. Can you confirm?
> 
> 
>
>       I've seen a spinning progress indicator (something that would
>       have been a beachball cursor a few OS revisions ago, and generally
>       indicates that progress isn't being made) appear in the lower left
>       corner of the open panel.  I wasn't paying close attention to when
>       this did and did not appear, but my impression is that there was
>       some correlation between that cursor spinning around and some
>       kinds of misbehavior (e.g., "empty" or largely empty panel views
>       that shouldn't be empty.)
>
>       I don't think that I've seen this since trying to use a
>       CHOOSE-FILE-DIALOG implementation that (at a minimum) retained the
>       panel before it was used and released it afterwards.  I'm 100%
>       sure that I haven't seen excessive memory or CPU utilization, but
>       I've only seen that once (and only while running your
>       application.)
> 
>
>             3) An even simpler test just starting 1000 processes, one
>             after the other, does not exhibit the problem (32 bit).
>
>       Note that a thread/process in 32-bit CCL needs about 2.5MB of
>       foreign memory just for its stacks; it also uses other finite
>       resources (semaphores, message ports, etc.)  You can't have 1000
>       runnable threads in 32-bit CCL because the ~2.5GB of foreign
>       memory isn't available; the only way that a loop that calls
>       PROCESS-RUN-FUNCTION 1000 times can run to completion is if some
>       older threads exit before newer ones are created, and the only
>       way that happens is if those threads run to completion (and the
>       more threads are created and competing for CPU time and other
>       resources, the less deterministic that is.)
>
>       In practice, I'd expect something like:
>
>       (dotimes (i 1000) (process-run-function "nothing" (lambda ())))
>
>       to use a lot of CPU but probably not exhaust virtual memory
>       (simply because all of the CPU contention keeps the thread running
>       that loop from running very often.)  This isn't guaranteed, and
>       it's such a ridiculous thing to do that I'd find it difficult to
>       get too worked up about things if it didn't.
>
>       If we put a delay in that loop:
>
>       (dotimes (i 1000)
>         (choose-file-dialog)
>         (process-run-function "nothing" (lambda ())))
>
>       we're probably effectively serializing thread creation, but we're
>       also affecting the environment in which CHOOSE-FILE-DIALOG runs
>       (competing for some of the same OS resources that the thread is
>       competing for.)  I'd expect that to work as well, but of course
>       it's also a ridiculous thing to do and it's hard to care too much
>       about whether it does or not.
>
>       Also ridiculous (in different ways) is:
>
>       (dotimes (i 1000)
>         (choose-file-dialog))
>
>       where CHOOSE-FILE-DIALOG retains the open-panel appropriately.  I
>       haven't seen that fail, but I haven't counted to 1000 either.
>
>       Apple distributes a command-line tool called "heap", which will
>       scan the (malloc'ed) heap zones of a specified process and (among
>       other things) identify the number and sizes of the ObjC instances
>       (the program calls them "classes" ...) it finds there.  The ObjC
>       heap shouldn't change significantly on each iteration and (when
>       not answering email or trying to do work that we actually get paid
>       for) I've been trying to verify that it doesn't.
>
>             In summary I think we still don't know what is going on but
>             I don't think it is connected to that specific malloc issue
>             you mention.
>
>       No one else thinks so either (at least not directly), but if you
>       say "running a test in a version of CCL affected by that issue
>       says something useful" a lot of time gets wasted by Matt or me
>       saying "no it doesn't" and then by you saying "and by 'version of
>       CCL affected by that issue', I mean 'other versions'".  I am
>       emphatically in favor of streamlining this process.
>
>             Need to run but I do remember reading that in sandbox mode
>             (we are NOT running in sandbox, or are we?) file choosers
>             are NOT running in the main thread but in some other special
>             thread.
>
>       They actually run in another OS-level process, which means that
>       the internals of NSOpenPanel are probably a lot different than
>       they used to be (and may change again if and when the whole
>       "sandboxing" issue goes away.)
>
>             hmmm....
>
>             puzzled,  Alex
> 
> 
>
>       I'm a little less puzzled than I had been.  The implementation of
>       CHOOSE-FILE-DIALOG (actually of GUI::COCOA-CHOOSE-FILE-DIALOG) was
>       clearly wrong and could result in methods being invoked on freed
>       objects (which in turn could cause malloc heap corruption that
>       snowballs into further malloc heap corruption ...)  Modifying the
>       state of a freed object could involve modifying free memory
>       (usually relatively harmless) or modifying the state of some other
>       object that's subsequently allocated at the freed address (usually
>       relatively harmful.)  I don't know for sure that any of this leads
>       to the memory/CPU problems that you've seen, but it's certainly
>       plausible that it could.  I'm fairly certain that it could lead to
>       the cosmetic problems (apparently empty directories and other
>       display problems).
>
>       The problem described in ticket 1005 caused almost exactly the
>       same symptoms.  Recall that that involved allocating a per-thread
>       data structure whose address could be used to identify a Mach
>       message port; traditionally (I don't know if this has changed), a
>       Mach message port identifier has been a 32-bit value.  So, the
>       code that allocated that data structure was:
>
>       (let* ((free-later ())
>              (p nil))
>         (loop
>           (setq p (#_malloc few-hundred-bytes))
>           ;; the address is, not coincidentally, also the Mach port name
>           (if (and (is-32-bit-pointer p)
>                    (can-be-used-as-mach-port p))
>             (return)
>             (push p free-later)))
>         ;; Free anything that failed the test above
>         (dolist (bad free-later) (#_free bad)))
>
>       That was changed: especially on Mountain Lion, #_malloc would
>       often return pointers for which IS-32-BIT-POINTER wasn't true, so
>       we use alternatives to #_malloc and #_free on 64-bit platforms
>       (and IS-32-BIT-POINTER is therefore always true.)  The second test
>       - that the pointer's address can be used as a Mach port name - is
>       intended to catch conflicts with the (often essentially random)
>       set of existing Mach port names.  The second test is implemented
>       by trying to use P as a Mach port name and seeing if that fails;
>       it could fail if P was coincidentally already a Mach port name,
>       but might (I'd have to RTFM) also fail for other reasons (like
>       "Mach is too damned busy now; try later." Yes, it's 2012.)  If the
>       loop above was run while Mach was too damned busy, we'd keep
>       _mallocing until we were out of memory, and the one time that I
>       was able to get your application to fail malloc's heap was full.
>       (I don't know that this is what's happening, but it may be worth
>       exploring.)
>
>       To state the obvious: Matt and I (and anyone else at Clozure who
>       could look at this) have other work that customers are paying us
>       to do and that other work has to take priority over this; if you
>       think otherwise, you are quite simply wrong and that isn't open to
>       discussion (with me, my partners, or anyone else *).  If you're
>       just getting someone's spare time to look at this, that seems like
>       an even stronger reason to not waste that time with things like
>       "running this test in a version of CCL known to exhibit these
>       symptoms for other reasons exhibits these symptoms."
>
>       ------
>       [*] I used to deal with a company whose slogan was "We cheat the
>       other guy, and pass the savings on to you!".  Sadly, they aren't
>       in business anymore ...
>
>             Gary mentioned this bug in a previous message:
>
>             - there's a known bug in 64-bit CCL on OSX that can cause
>               lisp thread creation to go into a horrible
>               CPU-burning/memory-thrashing state.  I think that that
>               bug's been present for a long time (since PPC64 days), but
>               it's apparently much easier to trigger on 10.8 (and/or
>               recent versions of CCL) than it has been.
>
>               The problem ultimately has to do with whether or not
>               #_malloc (actually #_calloc) returns a 64-bit pointer
>               whose high 32 bits are 0 and there can be many factors
>               that affect that (many of them subtle), and the fix is to
>               stop assuming that it does and allocate such pointers
>               ourselves.
>
>               That's been fixed (in the trunk for a few weeks and in the
>               1.8 tree for a few days) in svn; the symptoms happen to be
>               very similar to what people have reported seeing with
>               CHOOSE-FILE-DIALOG, but the CHOOSE-FILE-DIALOG problems
>               seem to occur for at least some people in 32-bit CCL
>               (which was never affected by this thread-creation problem)
>               and in freshly-updated 64-bit versions.
>
>             The fix for this bug is not yet in the Mac App Store version
>             of CCL.  I'll try to update the Mac App Store version soon,
>             but in the meantime, please try using up-to-date CCL
>             obtained via Subversion (either trunk or 1.8).
>
>             I modified your test case to make the call to the open panel
>             take place in the main thread.  It seemed to work as
>             expected for me in an up-to-date trunk CCL.
>
>             ;; modified to use gui:execute-in-gui
>
>             (defun THE-AMAZING-MEMORY-SURGE ()
>               (dotimes (i 100)
>                 (gui:execute-in-gui #'(lambda ()
>                                         (choose-file-dialog2)))
>                 (ccl::process-run-function "pretent to load project"
>                                            #'(lambda ()))))
>
>             (defun choose-file-dialog2 ()
>               ;; 100% kosher: retain, no use of deprecated calls
>               (let ((panel (#/retain (#/openPanel ns:ns-open-panel))))
>                 (#/runModal panel)
>                 (#/release panel)))
>
>             (defun THE-AMAZING-MEMORY-SURGE ()
>               (dotimes (i 100)
>                 (ccl::with-autorelease-pool
>                     (choose-file-dialog2)
>                   (ccl::process-run-function "pretent to load project"
>                                              #'(lambda () )))))
>
>             ;; this will pop up a file chooser a number of times. Each
>             ;; time just press ESC and watch the Activity Monitor.
>             ;; Set view > update frequency in Activity Monitor to very
>             ;; often (0.5s) for best results
>             ;; Watch out for Clozure CL % CPU and Real Mem
>             ;; for some time Real Mem will go up gradually (memory leak)
>             ;; then at some unpredictable time it will SURGE to GIGABYTES
>             ;; of memory and ultimately crash CCL
>             ;; with with-autorelease-pool CCL may crash quite quickly with
>             ;; an Unhandled exception 10, comment out if needed
>
>             ; (the-amazing-memory-surge)
>
>             Prof. Alexander Repenning
>             University of Colorado
>             Computer Science Department
>             Boulder, CO 80309-430
>
>             vCard: http://www.cs.colorado.edu/~ralex/AlexanderRepenning.vcf
> 


