[Openmcl-devel] Need advice to debug segfault when running concurrent selects in clsql/postgresql

Andreas Thiele andreas at atp-media.de
Thu May 8 07:32:12 PDT 2014


Hi Gary,

 

For your information: I can reproduce the error on my Intel i3-3240.

 

I found that the parameters in this snippet work well for testing:

 

(defun test ()
  (let* ((n 100)
         (m 32)
         (sem (make-semaphore)))
    (dotimes (i n)
      (dotimes (j m)
        (process-run-function (format nil "TEST THREAD ~a" j)
                              (lambda ()
                                (let (list)
                                  (dotimes (k 300000)
                                    (push (getstring) list)))
                                (signal-semaphore sem))))
      (dotimes (j m) (wait-on-semaphore sem)))))

 

After approx. 30 to 60s I get the error described in this thread:

 

Unhandled exception 11 at 0x41ccb0, context->regs at #x7f1955fdb6c8

Exception occurred while executing foreign code

at set_n_bits + 112

received signal 11; faulting address: 0x307e40179000

invalid permissions for mapped object

? for help

[522] Clozure CL kernel debugger:

 

This was run with 

 

Version 1.9-r15757  (LinuxX8664)

 

under Debian.

 

Unfortunately, I have similar problems in my production system, which uses
the 1.9 32-bit version: random crashes after 1-2 weeks, with a segfault
during GC.

 

I cannot reproduce the error discussed here in the 32-bit version.

 

System in use:

 

Distributor ID: Debian

Description:    Debian GNU/Linux 7.3 (wheezy)

Release:        7.3

Codename:       wheezy

 

processor       : 0

vendor_id       : GenuineIntel

cpu family      : 6

model           : 58

model name      : Intel(R) Core(TM) i3-3240 CPU @ 3.40GHz

stepping        : 9

microcode       : 0x19

cpu MHz         : 1600.000

cache size      : 3072 KB

physical id     : 0

siblings        : 4

core id         : 0

cpu cores       : 2

apicid          : 0

initial apicid  : 0

fpu             : yes

fpu_exception   : yes

cpuid level     : 13

wp              : yes

flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca
cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx
rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology
nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3
cx16 xtpr pdcm pcid sse4_1 sse4_2 popcnt tsc_deadline_timer xsave avx f16c
lahf_lm arat epb xsaveopt pln pts dtherm tpr_shadow vnmi flexpriority ept
vpid fsgsbase smep erms

bogomips        : 6784.56

clflush size    : 64

cache_alignment : 64

address sizes   : 36 bits physical, 48 bits virtual

power management:

 

... and likewise for processors 1, 2 and 3.

 

Best Regards

Andreas

 

 

 

From: openmcl-devel-bounces at clozure.com
[mailto:openmcl-devel-bounces at clozure.com] On Behalf Of Gary Byers
Sent: Wednesday, December 11, 2013 19:16
To: Paul Meurer
Cc: openmcl-devel at clozure.com Development
Subject: Re: [Openmcl-devel] Need advice to debug segfault when running
concurrent selects in clsql/postgresql

 

The bad news is that I couldn't reproduce the bug at all on a Core i7.
The good news is that the symptoms happen very quickly on a used Xeon X5355
that I have now. That seems to match what you were seeing, so we're likely
at least seeing the same things.

(Whatever they are ... )



On 12/10/2013 12:07 PM, Paul Meurer wrote:

 

On 09.12.2013 at 21:23, Gary Byers <gb at clozure.com> wrote:





Rumors that I've become distracted and forgotten about this are at least
slightly exaggerated.

Could you (Paul) please try to reproduce the problem in the current trunk
(r15975 or later) ?  I haven't been able to do so, but I often found
the problem harder to reproduce than you did.

 

I did test the newest trunk (after running (rebuild-ccl :full t), giving me
Welcome to Clozure Common Lisp Version 1.10-dev-r15975M-trunk
(LinuxX8664)!).

 

Unfortunately, the problem remains the same: I am getting the same two types
of symptoms you are describing.

 

 
