[Openmcl-devel] Trace/BPT trap in 1.0
Gary Byers
gb at clozure.com
Sat Dec 3 17:39:40 PST 2005
I think that I've found the problem, and the executive summary is "Doh!".
There are some constants defined in ccl/lisp-kernel/constants.h:
/* (fixnumshift = 2 on PPC32) */
#define TCR_FLAG_BIT_FOREIGN fixnumshift
#define TCR_FLAG_BIT_AWAITING_PRESET (fixnumshift+1)
#define TCR_FLAG_BIT_ALT_SUSPEND (fixnumshift+2)
#define TCR_FLAG_BIT_PROPAGATE_EXCEPTION (fixnumshift+2)
There are two problems:
1) these are supposed to be bit indices, not masks, but a few places
in the kernel use them as if they were masks.
2) the last two constants (TCR_FLAG_BIT_ALT_SUSPEND and
TCR_FLAG_BIT_PROPAGATE_EXCEPTION) should have distinct values.
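To make the two problems concrete, here is a rough sketch of how the definitions could look once the last constant has its own value and there is one obvious way to turn an index into a mask. The value (fixnumshift+3) and the TCR_FLAG_MASK macro are assumptions made for illustration; neither is taken from constants.h, and whatever gets checked in may differ.

```c
#define fixnumshift 2   /* PPC32 value, as noted above */

/* Bit indices, as in constants.h, except that the last one gets its
   own value.  (fixnumshift+3) is an assumption for this sketch. */
#define TCR_FLAG_BIT_FOREIGN fixnumshift
#define TCR_FLAG_BIT_AWAITING_PRESET (fixnumshift+1)
#define TCR_FLAG_BIT_ALT_SUSPEND (fixnumshift+2)
#define TCR_FLAG_BIT_PROPAGATE_EXCEPTION (fixnumshift+3)

/* Hypothetical helper (not in constants.h): bit index -> mask. */
#define TCR_FLAG_MASK(bit) (1u << (bit))

/* Compile-time check that the two indices no longer collide:
   the array size is -1 (a compile error) if they are equal. */
typedef char tcr_flag_bits_are_distinct[
  (TCR_FLAG_BIT_ALT_SUSPEND != TCR_FLAG_BIT_PROPAGATE_EXCEPTION) ? 1 : -1];
```

With definitions along these lines, code that touches the flags field would always go through the mask form, e.g. TCR_FLAG_MASK(TCR_FLAG_BIT_FOREIGN), and never use a bare index.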
Mostly because of the first problem, the lisp kernel function that
handles per-thread exceptions gets confused: the foreign thread
has (1<<TCR_FLAG_BIT_FOREIGN) = (1<<2) = 4 set in the "flags" field
of its thread context record (tcr), and the exception handling
function does:
  if (tcr->flags & TCR_FLAG_BIT_PROPAGATE_EXCEPTION) {
    tcr->flags &= ~TCR_FLAG_BIT_PROPAGATE_EXCEPTION;
    return 17; /* return a non-zero value */
  }
but it should be doing something like:
  if (tcr->flags & (1<<TCR_FLAG_BIT_PROPAGATE_EXCEPTION)) {
    tcr->flags &= ~(1<<TCR_FLAG_BIT_PROPAGATE_EXCEPTION);
    return 17; /* return a non-zero value */
  }
It just so happens that the value of TCR_FLAG_BIT_PROPAGATE_EXCEPTION
is equal to the value of (1<<TCR_FLAG_BIT_FOREIGN). Wackiness ensues;
the exception handler sees a foreign thread, misinterprets its flag as
a request to propagate the exception to the next handler (GDB if it's
running; otherwise probably nothing ...) and adds insult to injury
by clearing the TCR_FLAG_BIT_FOREIGN bit ...
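The collision can be reproduced outside the kernel with a toy stand-in for the tcr. In the sketch below, toy_tcr, buggy_check and fixed_check are made-up names for illustration (the real handler is catch_exception_raise() and does a good deal more); the constants use the 1.0 values quoted above.

```c
#define fixnumshift 2   /* PPC32 */
#define TCR_FLAG_BIT_FOREIGN fixnumshift
#define TCR_FLAG_BIT_PROPAGATE_EXCEPTION (fixnumshift+2)   /* value as in 1.0 */

/* Toy stand-in for the real TCR; only the flags field matters here. */
typedef struct { unsigned flags; } toy_tcr;

/* The test as written in 1.0: the bit index (4) is used as a mask. */
static int buggy_check(toy_tcr *tcr)
{
  if (tcr->flags & TCR_FLAG_BIT_PROPAGATE_EXCEPTION) {
    tcr->flags &= ~TCR_FLAG_BIT_PROPAGATE_EXCEPTION;
    return 17;
  }
  return 0;
}

/* The intended test: shift the index to get the mask (1<<4 = 16). */
static int fixed_check(toy_tcr *tcr)
{
  if (tcr->flags & (1u << TCR_FLAG_BIT_PROPAGATE_EXCEPTION)) {
    tcr->flags &= ~(1u << TCR_FLAG_BIT_PROPAGATE_EXCEPTION);
    return 17;
  }
  return 0;
}
```

Starting from flags = (1<<TCR_FLAG_BIT_FOREIGN) = 4, buggy_check returns 17 and leaves flags at 0 (the foreign bit is gone), while fixed_check returns 0 and leaves flags at 4.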
TCR_FLAG_BIT_PROPAGATE_EXCEPTION was added in 1.0; the value is wrong
(conflicts with TCR_FLAG_BIT_ALT_SUSPEND), it's tested for incorrectly
in the lisp exception handler (catch_exception_raise()), and it's
set incorrectly in response to the kernel debugger's (P)ropagate Exception
command.
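The setting side presumably suffers from the same index-vs-mask confusion. The debugger code isn't quoted here, so the first function below is only a guess at the shape of the mistake; toy_tcr and both function names are hypothetical, and the constants again use the 1.0 values.

```c
#define fixnumshift 2   /* PPC32 */
#define TCR_FLAG_BIT_FOREIGN fixnumshift
#define TCR_FLAG_BIT_PROPAGATE_EXCEPTION (fixnumshift+2)   /* value as in 1.0 */

/* Toy stand-in for the real TCR; only the flags field matters here. */
typedef struct { unsigned flags; } toy_tcr;

/* Assumed shape of the mistake: OR in the index (4) itself.
   That sets bit 2, which is the mask for TCR_FLAG_BIT_FOREIGN. */
static void set_propagate_by_index(toy_tcr *tcr)
{
  tcr->flags |= TCR_FLAG_BIT_PROPAGATE_EXCEPTION;
}

/* Intended form: OR in the mask (1<<4 = 16) derived from the index. */
static void set_propagate_by_mask(toy_tcr *tcr)
{
  tcr->flags |= (1u << TCR_FLAG_BIT_PROPAGATE_EXCEPTION);
}
```

If the setter really does look like the first function, then with the 1.0 values it would mark the thread as foreign rather than request propagation; the second form sets bit 4 and leaves the foreign bit alone.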
I don't think that much (if any) lisp code looks at these bits, so it
should be possible to just fix the 3 (or more) places in the kernel
sources that're confused about this and check those fixes into CVS.
Rebuilding the kernel should fix the problem.
On Wed, 30 Nov 2005, todd ingalls wrote:
> Hi I was wondering if anyone could help me diagnose this problem.
>
> In pre 1.0 version of openmcl the code below ran fine. This is a simplified
> version of a high resolution timer.
>
> In 1.0 version of openmcl, I get a Trace/BPT trap error and dppccl dies.
>
> I have traced the problem down to the call to #_PrimeTimeTask, but can't seem
> to figure out what to do from there. Does anyone have any suggestions in
> regards to what I should do next to determine why this is happening?
>
> thanks
>
> PS. I am running OSX 10.3.9.
>
>
> (ccl::open-shared-library "Carbon.framework/Carbon")
> (ccl::use-interface-dir :carbon)
>
> (progn
> (defparameter *tmtask* nil)
> (defparameter *counter* 0)
> (if *tmtask*
> (progn
> (#_RemoveTimeTask *tmtask*)
> (#_DisposeTimerUPP (ccl::pref *tmtask* :<TMT>ask.tm<A>ddr))))
> (setf *tmtask* (ccl::make-record :<TMT>ask))
> (setf (ccl::pref *tmtask* :<TMT>ask.tm<W>ake<U>p) 0)
> (setf (ccl::pref *tmtask* :<TMT>ask.tm<R>eserved) 0)
>
> (ccl::defcallback time-task-callback (:<TMT>ask<P>tr tmTaskPtr)
> (if (> (incf *counter*) 100)
> (progn
> (#_RemoveTimeTask tmTaskPtr)
> (#_DisposeTimerUPP (ccl::pref tmTaskPtr :<TMT>ask.tm<A>ddr)))
> (#_PrimeTime tmTaskPtr 1)))
>
> (setf (ccl::pref *tmtask* :<TMT>ask.tm<A>ddr)
> (#_NewTimerUPP time-task-callback))
> (#_InstallXTimeTask *tmtask*)
> (#_PrimeTime *tmtask* 1))