[Openmcl-devel] Random crashing
Gary Byers
gb at clozure.com
Mon Jul 21 14:53:53 PDT 2008
If you still have the debugging session running, could you do:
(gdb) p/x *(TCR *)0x417e77d0
That address is the value of the "tcr" argument to "resume_tcr()" in
frame #7 in the backtrace below. If you no longer have the debugging
session and have to reproduce the problem, use whatever value the "tcr"
argument has in the new backtrace: we want to see what that TCR looked
like at the point where resume_tcr() called sem_post() and crashed.
The gdb command above means "print, in hex, the contents of whatever
this address points to, interpreting the address as being of type
'pointer to TCR'" (where a TCR is a "Thread Context Record" that
contains several interesting fields).
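As a rough illustration of what that cast-and-dereference amounts to,
here is a small C sketch. The struct "demo_record" and the function
"format_record_hex" are made up for the example; the real TCR is
defined in the lisp kernel sources and has many more fields.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical two-field record; NOT the real TCR layout. */
typedef struct demo_record {
    unsigned long flags;
    void *resume;
} demo_record;

/* Treat a raw address as "pointer to demo_record", dereference it,
   and format the fields in hex, roughly what gdb's p/x does for
   the expression *(TYPE *)ADDRESS. */
static int format_record_hex(uintptr_t addr, char *buf, size_t len)
{
    const demo_record *r = (const demo_record *)addr;
    return snprintf(buf, len, "{flags = 0x%lx, resume = 0x%llx}",
                    r->flags,
                    (unsigned long long)(uintptr_t)r->resume);
}
```

Given a record whose flags are 0x2a and whose resume pointer is NULL,
this produces "{flags = 0x2a, resume = 0x0}". A zero in the "resume"
field of the real TCR is the kind of thing we'd be looking for.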
'resume_tcr()' basically does 'sem_post(tcr->resume)', and a crash
would make sense if tcr->resume was NULL. If it was, then one of
the threads that's doing sem_timedwait() on its 'resume' semaphore
would presumably be waiting on a NULL semaphore, and that doesn't
make sense.
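To make the suspected failure mode concrete, here is a sketch in C of
a defensive variant of that call. The one-field TCR struct and the
name "checked_resume_tcr" are hypothetical, written only for this
illustration; all that is assumed about the real resume_tcr() in
thread_manager.c is what's said above, that it basically does
sem_post(tcr->resume).

```c
#include <semaphore.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical, heavily reduced stand-in for the real TCR. */
typedef struct tcr {
    sem_t *resume;   /* semaphore the suspended thread waits on */
} TCR;

/* Like sem_post(tcr->resume), but reports a missing semaphore
   instead of handing a NULL pointer to sem_post().
   Returns 0 on success, -1 on failure. */
static int checked_resume_tcr(TCR *tcr)
{
    if (tcr == NULL || tcr->resume == NULL) {
        fprintf(stderr, "resume: tcr %p has no resume semaphore\n",
                (void *)tcr);
        return -1;
    }
    return sem_post(tcr->resume);
}
```

A check like this wouldn't fix anything; it would only turn the crash
in sem_post() into a message that says which TCR was in a bad state.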
On Mon, 21 Jul 2008, Osei Poku wrote:
> Got it to crash again....
>
>>
>> Where are you (where is address *0x00002ADAFE2CA325) and how did
>> you get there ('bt' in GDB) ?
>>
>>
> This time %rip = 0x00002ABAFDCCD325. After I set the break point, continued
> and typed X into the kernel debugger, I arrive here in gdb. I will try not
> to screw up the debugging session like last time so that I can provide
> additional information.
>
> (gdb) bt
> #0 0x00002abafdccd325 in sem_post () from /lib64/libpthread.so.0
> #1 0x000000000041b3e2 in resume_tcr (tcr=0x40e577d0) at
> ../thread_manager.c:1376
> #2 0x000000000041c0ba in resume_other_threads (for_gc=<value optimized out>)
> at ../thread_manager.c:1544
> #3 0x000000000041d62e in lisp_Debugger (xp=0x4131dd60, info=0x4131e110,
> why=11, in_foreign_code=1, message=0x4131db10 "Unhandled exception 11 at
> 0x2abafdccd325, context->regs at #x4131dd88") at ../lisp-debug.c:919
> #4 0x000000000041a2c6 in signal_handler (signum=11, info=0x4131e110,
> context=0x4131dd60, tcr=0x4131f7d0, old_valence=1) at
> ../x86-exceptions.c:1070
> #5 <signal handler called>
> #6 0x00002abafdccd325 in sem_post () from /lib64/libpthread.so.0
> #7 0x000000000041b3e2 in resume_tcr (tcr=0x417e77d0) at
> ../thread_manager.c:1376
> #8 0x000000000041c146 in lisp_resume_tcr (tcr=0x417e77d0) at
> ../thread_manager.c:1418
> #9 0x000000000041a0c8 in handle_exception (signum=<value optimized out>,
> info=0x4131eaa0, context=0x4131e6f0, tcr=0x4131f7d0, old_valence=0) at
> ../x86-exceptions.c:910
> #10 0x000000000041a218 in signal_handler (signum=4, info=0x4131eaa0,
> context=0x4131e6f0, tcr=0x4131f7d0, old_valence=0) at
> ../x86-exceptions.c:1064
> #11 <signal handler called>
> #12 0x00003000400110ab in ?? ()
> #13 0x00003000404265fc in ?? ()
> #14 0x000000000040e0ac in _SPnthrowvalues () at ../x86-spentry64.s:1404
> #15 0x00002aaaad3e0110 in ?? ()
> #16 0x0000000000000008 in ?? ()
> #17 0x0000000000000000 in ?? ()
> (gdb) info threads
> 10 Thread 0x40263950 (LWP 6218) 0x00002abafdccd2cb in sem_timedwait () from
> /lib64/libpthread.so.0
> 9 Thread 0x404c7950 (LWP 6219) 0x00002abafdccd2cb in sem_timedwait () from
> /lib64/libpthread.so.0
> 8 Thread 0x4072b950 (LWP 6223) 0x00002abafdccd2cb in sem_timedwait () from
> /lib64/libpthread.so.0
> 7 Thread 0x4098f950 (LWP 6224) 0x00002abafdccd2cb in sem_timedwait () from
> /lib64/libpthread.so.0
> 6 Thread 0x40bf3950 (LWP 6225) 0x00002abafdccd2cb in sem_timedwait () from
> /lib64/libpthread.so.0
> 5 Thread 0x410bb950 (LWP 8021) 0x00002abafdccd2cb in sem_timedwait () from
> /lib64/libpthread.so.0
> * 4 Thread 0x4131f950 (LWP 8307) 0x00002abafdccd325 in sem_post () from
> /lib64/libpthread.so.0
> 3 Thread 0x40e57950 (LWP 8308) 0x00002abafdccd2cb in sem_timedwait () from
> /lib64/libpthread.so.0
> 2 Thread 0x41583950 (LWP 8309) 0x00002abafdccd2cb in sem_timedwait () from
> /lib64/libpthread.so.0
> 1 Thread 0x2abafe223880 (LWP 6215) 0x00002abafdccd2cb in sem_timedwait ()
> from /lib64/libpthread.so.0
> (gdb)
>