[Openmcl-devel] Random crashing

Gary Byers gb at clozure.com
Wed Jul 2 16:01:21 PDT 2008


About the only thing that I can tell you is that you called
SWANK:CREATE-SERVER and crashed in foreign (C) code at the 
address 0x00002B56B5BE22A0.  I don't know what foreign code
is at that address, but the lisp kernel is generally down
around 0x410000 on x86-64 Linux, so the address is most
likely in some shared library (if it's anywhere at all.)

On Linux, you can get a coarse idea of what memory regions are mapped
(and, when applicable, of what files they're mapped to) by looking
at /proc/<pid>/maps, where <pid> is the process id of the lisp
process.  It might be good to know if the address is mapped and
what it's mapped to, but if the problem is something like "a bad
parameter is being passed to some foreign function", we'd really
need to know what foreign function.
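
As a rough illustration of the /proc/<pid>/maps lookup described above, here is a minimal Python sketch that reports which mapped region (if any) contains a given address. It assumes the usual Linux line layout of "start-end perms offset dev inode [path]". The names parse_maps, find_mapping and read_maps are made up for this example, and the SAMPLE text is invented data, not output from the machine in question.

```python
from typing import List, NamedTuple, Optional


class Mapping(NamedTuple):
    start: int   # first address in the region
    end: int     # one past the last address in the region
    perms: str   # e.g. "r-xp"
    path: str    # backing file, or "" for an anonymous mapping


def parse_maps(text: str) -> List[Mapping]:
    """Parse the text of a /proc/<pid>/maps file into Mapping records."""
    mappings = []
    for line in text.splitlines():
        # Fields: address range, perms, offset, dev, inode, optional path.
        fields = line.split(None, 5)
        if len(fields) < 5:
            continue  # skip blank or malformed lines
        start_s, end_s = fields[0].split("-")
        path = fields[5].strip() if len(fields) > 5 else ""
        mappings.append(Mapping(int(start_s, 16), int(end_s, 16), fields[1], path))
    return mappings


def find_mapping(mappings: List[Mapping], address: int) -> Optional[Mapping]:
    """Return the region containing address, or None if it is unmapped."""
    for m in mappings:
        if m.start <= address < m.end:
            return m
    return None


def read_maps(pid: int) -> str:
    """Read the maps file of a live process (Linux only)."""
    with open(f"/proc/{pid}/maps") as f:
        return f.read()


# Invented sample data, for illustration only.
SAMPLE = """\
00400000-00452000 r-xp 00000000 08:02 1234 /usr/local/bin/example-kernel
2b56b5b00000-2b56b5c50000 r-xp 00000000 08:02 5678 /lib64/libexample.so
2b56b5c50000-2b56b5e4f000 ---p 00150000 08:02 5678 /lib64/libexample.so
"""
```

Against a live process one would call find_mapping(parse_maps(read_maps(pid)), 0x00002B56B5BE22A0) while the lisp is still sitting in the kernel debugger. A result of None would mean the address isn't mapped at all.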

There's been a problem in 1.2, whereby foreign pointers (MACPTRs)
don't get invalidated when an image is saved.  (It's generally
the case that a foreign address is "per session"; invalidating
the pointer is supposed to make it harder to use a stale foreign
address.)  I only got around to fixing that in 1.2 a few days ago;
it was part of the problem that kept someone from loading shared
libraries on FreeBSD. I never figured out exactly -why- that
was part of the problem, but it certainly seemed to be.

If you can do an "svn update" and a (rebuild-ccl t) and the
problem goes away, great ... if not, I can try to explain how
to debug this with GDB, but it may take a while to track it 
down this way.  (I -would- like to track this down.)




On Wed, 2 Jul 2008, Osei Poku wrote:

> Hello,
>
> About 5 times a day on a particular machine, ccl drops into the kernel
> debugger in an unrecoverable way, i.e., pressing X does not return
> control back to lisp.  The following is a copy of the output on the
> terminal.  I have not included the other half of the session, which is
> on the other side of swank, because it might not be necessary to debug
> this problem.  If it is needed to completely understand the problem, I
> can provide that directly.  As is shown in the output, the lisp
> backtrace is not available.  So there might be something other than
> the lisp code going on here.  As I said, this problem has only occurred
> on this particular machine.
>
> The output of uname -a is
>
> Linux fatterbox 2.6.22.5-31-default #1 SMP 2007/09/21 22:29:00 UTC
> x86_64 unknown unknown GNU/Linux
>
>
>
> Any help/insight into what this is about is appreciated.
>
> Osei
>
>
>
>
> bash-2.05a$ lisp
>
> ; loading system definition from ccl:tools;asdf-install;asdf-install.asd.newest
> into #<Package "ASDF0">
>
> ; registering #<SYSTEM ASDF-INSTALL #x300040E5BD6D> as ASDF-INSTALL
>
> ;;; ASDF-Install version 0.6.10
>
> ; loading system definition from home:slime;swank.asd.newest into
> #<Package "ASDF0">
>
> ; registering #<SYSTEM :SWANK #x300040F39FAD> as SWANK
>
> ;Loading #P"/home/wtam/.slime/fasl/2008-04-24/openmcl-version_1.2-r9226-rc1__(linuxx8664)-linux-x86-64/swank-backend.lx64fsl"...
>
> ;Loading #P"/home/wtam/.slime/fasl/2008-04-24/openmcl-version_1.2-r9226-rc1__(linuxx8664)-linux-x86-64/metering.lx64fsl"...
>
> ;Loading #P"/home/wtam/.slime/fasl/2008-04-24/openmcl-version_1.2-r9226-rc1__(linuxx8664)-linux-x86-64/swank-openmcl.lx64fsl"...
>
> ;Loading #P"/home/wtam/.slime/fasl/2008-04-24/openmcl-version_1.2-r9226-rc1__(linuxx8664)-linux-x86-64/swank-gray.lx64fsl"...
>
> ;Loading #P"/home/wtam/.slime/fasl/2008-04-24/openmcl-version_1.2-r9226-rc1__(linuxx8664)-linux-x86-64/swank.lx64fsl"...
>
> ; Warning: These Swank interfaces are unimplemented:
>
> ;           (ACTIVATE-STEPPING ADD-FD-HANDLER ADD-SIGIO-HANDLER CALLS-WHO
> FIND-SOURCE-LOCATION MACROEXPAND-ALL REMOVE-FD-HANDLERS
> REMOVE-SIGIO-HANDLERS RESTART-FRAME RETURN-FROM-FRAME SLDB-BREAK-AT-START
> SLDB-BREAK-ON-RETURN SLDB-STEP-INTO SLDB-STEP-NEXT SLDB-STEP-OUT)
>
> ; While executing: SWANK-BACKEND::WARN-UNIMPLEMENTED-INTERFACES, in
> process listener(1).
>
> Welcome to Clozure Common Lisp Version 1.2-r9226-RC1  (LinuxX8664)!
>
> ? (swank:create-server :port 4007 :dont-close t)
>
> ;; Swank started at port: 4007.
>
> 4007
>
> ? exception in foreign context
>
> Exception occurred while executing foreign code
>
> ? for help
>
> [20166] OpenMCL kernel debugger: ?
>
> (G)  Set specified GPR to new value
>
> (R)  Show raw GPR/SPR register values
>
> (L)  Show Lisp values of tagged registers
>
> (F)  Show FPU registers
>
> (S)  Find and describe symbol matching specified name
>
> (B)  Show backtrace
>
> (T)  Show info about current thread
>
> (X)  Exit from this debugger, asserting that any exception was handled
>
> (K)  Kill OpenMCL process
>
> (?)  Show this help
>
> [20166] OpenMCL kernel debugger: R
>
> %rax = 0x0000000000000000      %r8  = 0x000000004072B7E8
>
> %rcx = 0xFFFFFFFFFFFFFFFF      %r9  = 0x000000004072B7E8
>
> %rdx = 0x0000000000000000      %r10 = 0x0000000000000000
>
> %rbx = 0x0000000040E577E8      %r11 = 0x0000000000000246
>
> %rsp = 0x000000004072A278      %r12 = 0x000000004072B7E8
>
> %rbp = 0x000000004072A730      %r13 = 0x000000004072A758
>
> %rsi = 0x0000000000000028      %r14 = 0x0000000000000004
>
> %rdi = 0x0000000000000000      %r15 = 0x000000004072AAE0
>
> %rip = 0x00002B56B5BE22A0   %rflags = 0x0000000000010246
>
> [20166] OpenMCL kernel debugger: F
>
> f00: 0x00000000 (0.000000e+00), 0x0000000000000000 (0.000000e+00)
>
> f01: 0x00000000 (0.000000e+00), 0x0000000000000000 (0.000000e+00)
>
> f02: 0x00000000 (0.000000e+00), 0x0000000000000000 (0.000000e+00)
>
> f03: 0x00000000 (0.000000e+00), 0x0000000000000000 (0.000000e+00)
>
> f04: 0x00000000 (0.000000e+00), 0x0000000000000000 (0.000000e+00)
>
> f05: 0x00000000 (0.000000e+00), 0x0000000000000000 (0.000000e+00)
>
> f06: 0x00000000 (0.000000e+00), 0x0000000000000000 (0.000000e+00)
>
> f07: 0x00000000 (0.000000e+00), 0x0000000000000000 (0.000000e+00)
>
> f08: 0x00000000 (0.000000e+00), 0x0000000000000000 (0.000000e+00)
>
> f09: 0x00000000 (0.000000e+00), 0x0000000000000000 (0.000000e+00)
>
> f10: 0x00000000 (0.000000e+00), 0x0000000000000000 (0.000000e+00)
>
> f11: 0x00000000 (0.000000e+00), 0x0000000000000000 (0.000000e+00)
>
> f12: 0x00000000 (0.000000e+00), 0x0000000000000000 (0.000000e+00)
>
> f13: 0x00000000 (0.000000e+00), 0x0000000000000000 (0.000000e+00)
>
> f14: 0x00000000 (0.000000e+00), 0x0000000000000000 (0.000000e+00)
>
> f15: 0x00000000 (0.000000e+00), 0x0000000000000000 (0.000000e+00)
>
> mxcsr = 0x00001f80
>
> [20166] OpenMCL kernel debugger: B
>
> Framepointer [#x4072A730] in unknown area.
>
> [20166] OpenMCL kernel debugger: T
>
> Current Thread Context Record (tcr) = 0x4072b7e8
>
> Control (C) stack area:  low = 0x404d8000, high = 0x4072c000
>
> Value (lisp) stack area: low = 0x2aaaab2f1000, high = 0x2aaaab502000
>
> Exception stack pointer = 0x4072a278
>
> [20166] OpenMCL kernel debugger: L
>
> %rsi (arg_z) = 5
>
> %rdi (arg_y) = 0
>
> %r8  (arg_x) = 135157501
>
> ------
>
> %r13 (fn) = 135156971
>
> ------
>
> %r15 (save0) = 135157084
>
> Segmentation fault


