[Openmcl-devel] Need advice to debug segfault when running concurrent selects in clsql/postgresql
Paul Meurer
paul.meurer at uni.no
Tue Oct 29 08:14:22 PDT 2013
Hi,
I need some advice on how to further debug the following.
I consistently observe crashes when I run concurrent database selects using clsql against a PostgreSQL backend. I am running the newest ccl-1.9 (64-bit) on CentOS, and the PostgreSQL library advertises itself as thread-safe. Here is the code I am running:
(dotimes (i 16)
  (ccl:process-run-function
   (format nil "test~d" i)
   (lambda (i)
     (with-database (*default-database* *connection-spec* :if-exists :new)
       (select [text] :from [text-table] :limit 10000)
       (print i)))
   i))
This form can be run several times without problems, but eventually I get a segfault. Debugging in gdb suggests that the crash is GC-related (see below). The crash always happens at the same place in bits.c.
I am aware that this is a complex scenario, where either the db lib, or uffi/clsql, or clozure could be the culprit, and it does not seem to be trivial to boil this down to a minimal case. So I would be grateful if somebody could give me some advice as to what would be the most promising way of nailing down this bug.
----------
? Unhandled exception 11 at 0x412360, context->regs at #x7f3ea52ed538
Exception occurred while executing foreign code
received signal 11; faulting address: 0x307e3f94d000
invalid permissions for mapped object
…
and in gdb:
(gdb) br *0x0000000000412360
Breakpoint 2 at 0x412360: file ../bits.c, line 45.
(gdb) continue
Continuing.
[Switching to Thread 0x7f3ea52ef700 (LWP 3974)]
Breakpoint 2, set_n_bits (bits=<value optimized out>,
first=<value optimized out>, n=<value optimized out>) at ../bits.c:45
45 *wstart++ = ALL_ONES;
1: x/i $pc
=> 0x412360 <set_n_bits+112>: movq $0xffffffffffffffff,(%rax)
(gdb) bt
#0 set_n_bits (bits=<value optimized out>, first=<value optimized out>,
n=<value optimized out>) at ../bits.c:45
#1 0x000000000041111c in rmark (n=52914162892765) at ../x86-gc.c:770
#2 0x00000000004116fd in mark_root (n=<value optimized out>) at ../x86-gc.c:516
#3 0x0000000000411b05 in mark_ephemeral_root (n=<value optimized out>)
at ../x86-gc.c:650
#4 0x000000000040bfa2 in mark_memoized_area (a=0x1e926e0,
num_memo_dnodes=10288289) at ../gc-common.c:1473
#5 0x000000000040d9f0 in gc (tcr=<value optimized out>,
param=<value optimized out>) at ../gc-common.c:1688
#6 0x0000000000412c9b in gc_from_tcr (tcr=<value optimized out>,
param=<value optimized out>) at ../x86-exceptions.c:2924
#7 0x0000000000413358 in gc_like_from_xp (xp=<value optimized out>,
fun=0x412c70 <gc_from_tcr>, param=0) at ../x86-exceptions.c:2881
#8 0x000000000041341e in gc_from_xp (xp=<value optimized out>,
param=<value optimized out>) at ../x86-exceptions.c:2936
#9 0x0000000000414ad1 in allocate_object (xp=0x7f3ea52ee440, bytes_needed=32,
disp_from_allocptr=19, tcr=0x7f3ea52ef570,
crossed_threshold=<value optimized out>) at ../x86-exceptions.c:204
#10 0x0000000000414b9d in handle_alloc_trap (xp=0x7f3ea52ee440,
tcr=0x7f3ea52ef570, notify=0x7f3ea52ee1cc) at ../x86-exceptions.c:644
#11 0x0000000000415552 in handle_exception (signum=<value optimized out>,
info=<value optimized out>, context=0x7f3ea52ee440,
tcr=<value optimized out>, old_valence=<value optimized out>)
at ../x86-exceptions.c:1193
#12 0x00000000004157fa in signal_handler (signum=11, info=0x7f3ea52ee7f0,
context=0x7f3ea52ee440) at ../x86-exceptions.c:1466
#13 <signal handler called>
#14 0x0000302000bdca65 in ?? ()
#15 0x0000000000000052 in ?? ()
--
Best wishes,
Paul