[Openmcl-devel] Error Running Currency Converter

Gary Byers gb at clozure.com
Wed Jul 30 16:42:41 PDT 2008


I was able to look at this (without having the lisp segfault on me every
time I looked at it funny) and think that it's now fixed in the trunk
svn.  (The problem was caused/exposed by some code that's been in the
trunk for a while, but which isn't in 1.2)

If it's not fixed (after an svn update and (rebuild-ccl :full t)) or if
it recurs, please re-open
<http://trac.clozure.com/openmcl/ticket/320>

Although the details of the problem generally aren't interesting, here's
a bit of background that may or may not be documented elsewhere.

A pointer to foreign memory generally doesn't remain valid across
sessions: a foreign object (simple function or variable, ObjC class,
etc.) usually doesn't have the same address from one session to the
next.  (Some environments deliberately randomize the addresses where
shared libraries load to make security exploits slightly harder to
write; different library versions may be involved, etc.)  It'd be
a coincidence if a foreign object's address didn't change and
a subtle error to assume that it wouldn't.

To help catch such errors (and make them less subtle), SAVE-APPLICATION
changes the type of every (non-NULL) MACPTR in the heap to "DEAD-MACPTR";
most primitive operations on MACPTRs typecheck their arguments and
signal a type error if they receive a DEAD-MACPTR as an argument.
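
As a rough illustration of that distinction, here is a minimal sketch of
a predicate that tells a usable pointer from a dead or NULL one.  The
name LIVE-POINTER-P is hypothetical (it isn't part of CCL), and the
sketch assumes that a DEAD-MACPTR doesn't satisfy (TYPEP x 'CCL:MACPTR),
which is what the type error described above suggests.

```lisp
;; LIVE-POINTER-P is a hypothetical helper, not a CCL function.
(defun live-pointer-p (x)
  "True if X is a MACPTR that is neither dead nor NULL."
  ;; Assumption: a DEAD-MACPTR is not of type CCL:MACPTR, so the
  ;; TYPEP test rejects pointers killed by SAVE-APPLICATION.
  (and (typep x 'ccl:macptr)
       (not (ccl:%null-ptr-p x))))
```

Code that caches foreign pointers in global variables could call
something like this before dereferencing them, rather than waiting for
the primitive to signal.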

So:

? (defvar *p* (#_malloc 1))
*p*
? (setf (%get-unsigned-byte *p* 0) 17)
17
? (%get-unsigned-byte *p* 0)
17
? (save-application ...)
;; run the new image
? (%get-unsigned-byte *p* 0)

should signal an error complaining that the value of *P* is a DEAD-MACPTR
(and not a MACPTR.)  It'd be blind luck if the address encapsulated by
*P* was still valid (and could be referenced without causing a memory
fault), and even blinder luck if it still pointed to 17.

In that example (which isn't that implausible), the error's a user-level
error.  In the implementation itself, there are lots of opportunities
for similar errors.
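
For user-level code like the *P* example, one way to avoid the problem
is to never trust a cached pointer and to re-create the foreign object
on demand in each session.  The following is only a sketch under that
assumption: *BYTE-BUFFER* and ENSURE-BYTE-BUFFER are hypothetical names,
and it again assumes that a DEAD-MACPTR fails (TYPEP x 'CCL:MACPTR).

```lisp
;; Hypothetical names throughout; this is not how CCL itself does it.
(defvar *byte-buffer* nil
  "A 1-byte foreign buffer, or NIL / a dead pointer if stale.")

(defun ensure-byte-buffer ()
  "Return a live 1-byte buffer, reallocating it if the cached one is stale."
  ;; Reallocate when the cached value is unset, dead, or NULL.
  (unless (and (typep *byte-buffer* 'ccl:macptr)
               (not (ccl:%null-ptr-p *byte-buffer*)))
    (let ((p (#_malloc 1)))
      (when (ccl:%null-ptr-p p)
        (error "malloc failed"))
      ;; Re-establish the contents; nothing survives from the old session.
      (setf (ccl:%get-unsigned-byte p 0) 17)
      (setq *byte-buffer* p)))
  *byte-buffer*)
```

With this, (%get-unsigned-byte (ensure-byte-buffer) 0) is meant to work
both before SAVE-APPLICATION and in the saved image, because the buffer
is reallocated and refilled the first time it's needed in the new
session.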


On Wed, 30 Jul 2008, Chris Van Dusen wrote:

> Thanks for all of the replies regarding this.
> I understand that the current instability makes determining what's a user
> error versus an implementation error difficult, and don't want to be a pest
> about it.  In this particular case, I went by the responses that I had seen
> in my search for this error (i.e., it was an error in the user's code), and
> asked from that perspective.  In hindsight, I guess it would have helped if
> I had sent the code. :)
>
> Thanks again,
> Chris.
>
> On Wed, Jul 30, 2008 at 3:15 PM, Gary Byers <gb at clozure.com> wrote:
>
>> And in case anyone's confused by that reply:
>>
>> r10236 fixed a bug in the currency-converter example in 1.2 (a method
>> that was implicitly defined to return :id returned NIL.)  The same
>> bug's in the trunk, but the error that Chris is seeing seems to be
>> occurring during ObjC (re)initialization, long before we get to that
>> point.
>>
>> r10246 fixed a bug in the kernel code that caused the frame pointer
>> register to be restored incorrectly when PROCESS-INTERRUPT returned.
>> That often led to a segfault and general wackiness.
>>
>> On Wed, 30 Jul 2008, Gary Byers wrote:
>>
>>> I'd be surprised if it does, but it may fix at least large parts of
>>> the general trunk instability that I was admonishing about.  Just
>>> doing:
>>>
>>> (require "COCOA")
>>>
>>> was (for me) a reliable way to trigger the bug.  (The initial thread
>>> segfaulted when it returned from loading the Cocoa library via
>>> PROCESS-INTERRUPT, and trying to track down another problem when
>>> things were that flaky seemed pointless.)
>>>
>>> I'd also seen similar segfaults during QUIT, and I don't think that
>>> that's quite the same scenario.  (PROCESS-INTERRUPT is used in QUIT,
>>> but (IIRC) none of the functions invoked via PROCESS-INTERRUPT while
>>> quitting ever return, so I'm not sure if that's another symptom
>>> of the same problem.  In QUIT, the thread that got the segfault
>>> was something other than the initial thread, and the initial thread
>>> got tired of waiting for it to die and terminated it with extreme
>>> prejudice.)
>>>
>>> On Wed, 30 Jul 2008, R. Matthew Emerson wrote:
>>>
>>>>
>>>> On Jul 30, 2008, at 3:41 PM, mikel evins wrote:
>>>>
>>>>>
>>>>> On Jul 30, 2008, at 12:26 PM, Chris Van Dusen wrote:
>>>>>
>>>>>> Mikel,
>>>>>>
>>>>>> Running CurrencyConverter, I get this:
>>>>>>
>>>>>> Error during early application initialization:
>>>>>>
>>>>>> value #<A Dead Mac Pointer> is not of the expected type MACPTR.
>>>>>>
>>>>
>>>>
>>>> Does http://trac.clozure.com/openmcl/changeset/10236 fix this problem?
>>>>
>>>>
>>>> _______________________________________________
>>>> Openmcl-devel mailing list
>>>> Openmcl-devel at clozure.com
>>>> http://clozure.com/mailman/listinfo/openmcl-devel
>>>>
>>>>
>


