[Openmcl-devel] OpenMCL for Linux x86-64 available for testing

Gary Byers gb at clozure.com
Thu May 4 11:23:54 PDT 2006



On Thu, 4 May 2006, James Bielman wrote:

> Gary Byers <gb at clozure.com> writes:
>
>> This seems to be fixed in CVS now.
>
> Cool, the CFFI test suite finishes now, with 13/194 failures---most of
> them seem to be related to passing floats/doubles.  I'll try to narrow
> these down further.
>
> Also, I've noticed that I hit the kernel debugger sometimes when
> quitting OpenMCL, but it's never been reproducible.  But, it does do
> it every time after running the CFFI testsuite.  I can try to narrow
> this down to a specific source file or test if that will help.

If you get a chance, please try:

gdb /path/to/lx86cl64
(gdb) x/i 0x410561

where 0x410561 is the address where the crash is reported.

In the kernel that I have, that address is in a function called
"make_dynamic_heap_executable"; on the PPC - where there are
separate instruction and data caches - it's necessary to jump
through some cache-related hoops whenever code is written to
memory (including cases where the GC moves a function and therefore
"writes code to memory", or at least to the data cache).  The x86
and amd64 caches behave differently, and that function probably
isn't necessary (and it'd be silly to die while doing something
unnecessary.)
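
To illustrate the kind of cache-related hoop described above, here is a
minimal C sketch. It is not the code of make_dynamic_heap_executable;
install_code is a hypothetical helper name, and the sketch assumes a GCC
or Clang compiler, which provide the __builtin___clear_cache builtin. On
PPC that builtin emits the data-cache flush and instruction-cache
invalidate sequence; on x86 and amd64 it typically compiles to nothing.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not from the OpenMCL kernel): copy len bytes of
 * machine code from src to dst, then ask the compiler to make that
 * range visible to instruction fetch.
 */
static void install_code(void *dst, const void *src, size_t len)
{
    assert(dst != NULL && src != NULL);

    /* The copy goes through the data cache, like any other store. */
    memcpy(dst, src, len);

    /* Synchronize the instruction cache with what was just written. */
    __builtin___clear_cache((char *)dst, (char *)dst + len);
}
```

The sketch only copies and synchronizes; it does not make the
destination executable or jump into it.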

There was another bug a few days ago, where a C expression that
was supposed to be "~15" wound up as "~!5" (that's an exclamation
point); the short version is that this caused crashes when
PROCESS-INTERRUPT interrupted a thread running C code if that
thread's stack was aligned in certain ways, and QUIT uses
PROCESS-INTERRUPT to tell threads to kill themselves in as
orderly a fashion as they can.  I think that was fixed
in the 060503 archive, but that was so long ago that I don't
remember too well.
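
The difference between the two expressions can be sketched in C. This
is an illustration only, not the kernel source: the function names and
the sample address are made up, and it assumes the mask was applied to
a stack address. Since !5 is 0, ~!5 is all ones, so the typo'd mask
leaves the address unchanged, and the result is 16-byte aligned only
when the input already was.

```c
#include <assert.h>
#include <stdint.h>

/* Intended form: round an address down to a 16-byte boundary. */
static uintptr_t align_down_16(uintptr_t sp)
{
    return sp & ~(uintptr_t)15;
}

/* Typo'd form: !5 evaluates to 0, so ~!5 is all ones and the
   AND returns sp unchanged. */
static uintptr_t align_down_typo(uintptr_t sp)
{
    return sp & (uintptr_t)~!5;
}

/* Self-check on one sample address that is 8 bytes past a
   16-byte boundary (value chosen for illustration). */
static void check_masks(void)
{
    uintptr_t sp = 0x40230F08;

    assert(align_down_16(sp) == 0x40230F00);
    assert(align_down_typo(sp) == sp);
}
```

An address that is already a multiple of 16 passes through both
versions unchanged, which would be consistent with the crash only
showing up for certain stack alignments.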

>
> Other than the testsuite failures, I've been using OpenMCL in Slime to
> work on some zlib bindings this morning and it's been working great.
>
> ...
> 13 unexpected failures: FUNCALL.DOUBLE26, FUNCALL.FLOAT26, DEFCFUN.BFF.1,
>   DEFCFUN.BFF.2, DEFCFUN.DOUBLE26, DEFCFUN.FLOAT26, CALLBACKS.FLOAT,
>   CALLBACKS.DOUBLE, CALLBACKS.FUNCALL.2, CALLBACKS.DOUBLE26,
>   CALLBACKS.DOUBLE26.FUNCALL, CALLBACKS.FLOAT26, CALLBACKS.FLOAT26.FUNCALL.
>
> How many times shall we repeat the tests? [0]:
>
> Unhandled exception 11 at 0x410561, context->regs at #x40230878
> ? for help
> [19826] OpenMCL kernel debugger: b
>
> Framepointer [#x36] in unknown area.

This usually means that it's crashed while running C code,
but I'm not sure that I believe that.  %rbp contains something
that lisp probably wouldn't have put there ...

> [19826] OpenMCL kernel debugger: r
> %rax = 0x0000000000554CD0      %r8  = 0x000000000000200B
> %rcx = 0x0000000000000008      %r9  = 0x000000000040D7C4
> %rdx = 0x00002AAAABC67000      %r10 = 0x000000000041054C
> %rbx = 0x00002AAAABC66F9D      %r11 = 0x0000000000000000
> %rsp = 0x0000000040230F08      %r12 = 0x0000000000000000
> %rbp = 0x0000000000000036      %r13 = 0x00003000043489FF
> %rsi = 0x000000000000200B      %r14 = 0x0000000000000000
> %rdi = 0x0000000000000042      %r15 = 0x0000000000000000
> %rip = 0x0000000000410561   %rflags = 0x0000000000010246

... but most of the registers contain sensible-looking lisp
values.

I wonder if something chopped off the high 32 bits of %rip (the
program counter)?
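
For reference, here is what losing the high 32 bits of a 64-bit address
looks like, as a small C sketch. The helper names are hypothetical, and
the sample input is the %r13 value from the register dump, used only as
an example of an address with nonzero high bits. It does not show that
truncation actually happened here.

```c
#include <assert.h>
#include <stdint.h>

/* Keep only the low 32 bits of a 64-bit address. */
static uint64_t drop_high_32(uint64_t addr)
{
    return addr & 0xFFFFFFFFu;
}

/* True when the high 32 bits are already zero. */
static int fits_in_low_32(uint64_t addr)
{
    return (addr >> 32) == 0;
}

/* Self-check using values copied from the register dump. */
static void check_truncation(void)
{
    assert(drop_high_32(0x00003000043489FFULL) == 0x043489FFULL);
    assert(fits_in_low_32(0x0000000000410561ULL));
    assert(!fits_in_low_32(0x00003000043489FFULL));
}
```

The reported %rip, 0x410561, has zero high bits, so by itself it cannot
distinguish a genuine low kernel address from a truncated one.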

> [19826] OpenMCL kernel debugger: l
> %rsi (arg_z) = ()
> %rdi (arg_y) = #<imm #x0000000000000042>
> %r8  (arg_x) = ()
> ------
> %r13 (fn) = #<Anonymous Function #x00003000043489ff>
> %r10 (ra0) = (tra ?) : #x000000000041054c
> ------
> %r15 (save0) = 0
> %r14 (save1) = 0
> %r12 (save2) = 0
> %r11 (save3) = 0
> ------
> %rbx (temp0) = #<13-element vector subtag = 25 @#x00002aaaabc66f9d ()>
> %r9  (temp1) = (tra ?) : #x000000000040d7c4
> %rcx (temp2) = 1
> ------
> %cx (nargs) = 1 (maybe)
> [19826] OpenMCL kernel debugger:
>
> James
>
>


