[Openmcl-devel] Random crashing

Gary Byers gb at clozure.com
Fri Jul 18 09:45:59 PDT 2008



On Fri, 18 Jul 2008, Osei Poku wrote:

> Ok... It happened again after recompiling the kernel.  I managed to attach a 
> gdb session to the process and it is still running so I can possibly provide 
> more feedback if you need.  My current gdb session log is inserted below.
>

It basically shows that one thread is reading (from standard input)
and that all other threads are waiting for a semaphore that'll
allow them to wake from a suspended state.

In other words, you're in the kernel debugger.

> (gdb) info threads
> 9 Thread 0x40263950 (LWP 3271)  0x00002adafe2ca2cb in sem_timedwait () from /lib64/libpthread.so.0
> 8 Thread 0x404c7950 (LWP 3272)  0x00002adafe2ca2cb in sem_timedwait () from /lib64/libpthread.so.0
> 7 Thread 0x4072b950 (LWP 3305)  0x00002adafe2ca2cb in sem_timedwait () from /lib64/libpthread.so.0
> 6 Thread 0x4098f950 (LWP 3306)  0x00002adafe2ca2cb in sem_timedwait () from /lib64/libpthread.so.0
> 5 Thread 0x40bf3950 (LWP 3307)  0x00002adafe2ca2cb in sem_timedwait () from /lib64/libpthread.so.0
> 4 Thread 0x40e57950 (LWP 6093)  0x00002adafe591bfb in read () from /lib64/libc.so.6
> 3 Thread 0x4131f950 (LWP 6094)  0x00002adafe2ca2cb in sem_timedwait () from /lib64/libpthread.so.0
> 2 Thread 0x410bb950 (LWP 6095)  0x00002adafe2ca2cb in sem_timedwait () from /lib64/libpthread.so.0
> 1 Thread 0x2adafe820880 (LWP 3268)  0x00002adafe2ca2cb in sem_timedwait () from /lib64/libpthread.so.0
>

Thread 4 above is the one which got the exception, suspended other threads,
and is now trying to read a character in the kernel debugger.

To see how you got there, set a breakpoint at the (%rip) address where the
exception occurred.

Before doing much of anything, tell GDB to ignore signals that the lisp
handles:

(gdb) source lisp-kernel/linuxx8664/.gdbinit

Then set the breakpoint, and "continue" (so that the kernel debugger
can run):

(gdb) br *0x00002ADAFE2CA325

(gdb) continue

In the kernel debugger, type X.  Back in GDB, you'll have hit the
breakpoint (in some thread).

(gdb) info thread

If it's thread 4 (the one that entered the kernel debugger and was
in read() in the 'info threads' output above), that's the interesting
case.  If it's some other thread ... well, that's -probably- not
interesting (unless the other thread gets an exception at the same place.)

Where are you (where is address *0x00002ADAFE2CA325) and how did
you get there ('bt' in GDB) ?
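
One way to answer both questions once the breakpoint has been hit, as a
rough sketch using standard GDB commands (only 'bt' is named above; the
other two are suggestions, not something the lisp kernel requires):

(gdb) info symbol 0x00002ADAFE2CA325

(gdb) x/i $pc

(gdb) bt

'info symbol' reports which function and shared library the address
falls in, 'x/i $pc' shows the instruction at the current program counter
of the thread that hit the breakpoint, and 'bt' prints the chain of
calls that led there.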




