[Openmcl-devel] build failure on x86_64 linux

Gary Byers gb at clozure.com
Mon Feb 4 14:44:01 PST 2008


This is a bit handwavy, but we've seen problems with Linux kernels of
roughly that vintage (2.6.17).  (This was under Fedora, and a Linux
kernel x.y.z from Distro A may or may not be similar to the same
version from Distro B.)

The problems that we saw seemed to only arise under some amount
of stress (several threads doing a lot of disk and network I/O),
but the symptoms (random, nonsensical signals) seemed to be
caused by the OS kernel losing track of thread-specific data.
(The lisp would get an exception, the lisp kernel would try
to use that thread-specific data to handle the problem, the
data would be wrong and the exception handler would scribble
over random memory, causing further exceptions.)

That problem went away when the affected user upgraded their
Linux kernel.  (In their case, they were able to upgrade to
2.6.22, which was current at that time.)

I don't know exactly which kernel versions were affected by this
(or in which distros), and I don't know for sure whether this is
the cause of the problems you're seeing.  Ordinarily, an exception
that the lisp doesn't know how to handle results in a trip to
the kernel debugger.  The "losing track of thread-local data"
problem that affected some Linux kernels of that approximate
vintage is one way in which the lisp's exception handling mechanisms
could get very, very confused and lead to this kind of abrupt
process termination instead.

If you can upgrade to a later Linux kernel (ideally 2.6.22+),
please try that and let me know if the problem persists.
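
For a quick check of where a given machine stands, something like the
following rough sketch may help.  It compares the running kernel's release
string against 2.6.22.  The threshold is only the version that happened to
work for the affected user, not a known boundary, and the helper names
(parse_kernel_release, is_older_than_suggested) are made up for illustration.
As noted above, the same x.y.z from two distros may not be the same code,
so treat the result as a hint at best.

```python
import platform
import re

# Assumption: 2.6.22 is simply the version that worked for the affected user;
# the exact range of affected kernels is not known.
SUGGESTED_MINIMUM = (2, 6, 22)


def parse_kernel_release(release):
    """Return the leading numeric x.y.z of a release string like '2.6.17-14mdv'."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    # A missing third component (e.g. '6.1') is treated as 0.
    return tuple(int(part) if part is not None else 0 for part in match.groups())


def is_older_than_suggested(release, minimum=SUGGESTED_MINIMUM):
    """True if the release's x.y.z sorts before the suggested minimum."""
    return parse_kernel_release(release) < minimum


if __name__ == "__main__" and platform.system() == "Linux":
    release = platform.release()  # same string that `uname -r` prints
    if is_older_than_suggested(release):
        print(f"{release}: older than 2.6.22, an upgrade may be worth trying")
    else:
        print(f"{release}: at or past 2.6.22")
```

With the kernel reported below, is_older_than_suggested("2.6.17-14mdv")
returns True.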


On Mon, 4 Feb 2008, Robert Goldman wrote:

> Robert Goldman wrote:
>> R. Matthew Emerson wrote:
>>> 
>>> On Feb 4, 2008, at 4:23 PM, Robert P. Goldman wrote:
>>> 
>>>> Gary Byers wrote:
>>>>> I have no idea what that means.  (Well, it means that the lisp
>>>>> received a SIGTRAP signal, which it should handle; I mean that
>>>>> I have no idea why it didn't.)
>>>>> 
>>>>> What Linux distro ?  What (OS) kernel version ?
>>>> 
>>>> Mandriva x86_64, kernel 2.6.17-14mdv
>>>> 
>>>> By the way, I'm running quite successfully with the snapshot version.
>>> 
>>> Did you happen to follow these instructions?
>>> http://trac.clozure.com/openmcl/wiki/UpdateFromSubversion
>>> 
>>> If so, you might try changing the
>>> 
>>> $ svn co http://svn.clozure.com/publicsvn/openmcl/trunk/ccl
>>> to
>>> $ svn co http://svn.clozure.com/publicsvn/openmcl/branches/1.1/ccl
>>> 
>>> I'm not sure that will help.  Those two URLs are supposed to contain the 
>>> same stuff.  Maybe it's worth a try.  (Or if you're running the snapshot, 
>>> which contains cvs meta-information, you can do a cvs update.)
>>> 
>>> We're planning on making a new release Real Soon Now;  that should help 
>>> straighten some of this out.
>> 
>> I did follow the UpdateFromSubversion instructions, thanks.  I believe that 
>> it might be possible to do
>> 
>> svn switch http://svn.clozure.com/publicsvn/openmcl/branches/1.1/ccl .
>> 
>> to get the stuff in the other repository....
>> 
>> I will report whether that fixes things on the linux platform.
>
> I just pulled from the branch and got exactly the same build error.
>
> Best,
> r
>
>
>


