Hi James,<br><br>Interesting question.<br><br>What about the case where the two tasks pushed onto the task stack both complete their pushes to the receive stack before the first pop of the receive stack? Since this is all a matter of timing, it looks like a viable case at certain timing boundaries. Specifically, when the sleep time is close to the time it takes the initial thread to slog through those pushes, the worker threads are tickled to life and disrupt the machine, and of course CCL is meanwhile fiddling with its thread internals, and perhaps doing a GC during all this. That time would be quite variable, it seems to me, depending on how busy CCL is and on the other threads running on the system.<br>
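<br>To make the ordering concrete: the receiver is a stack (LIFO), so if both results are already on it when the first pop happens, that pop returns the most recent push, which is 'later. Below is a minimal single-threaded sketch that uses a plain list in place of the raw-stack from the test. The names *receiver*, push-result and pop-result are hypothetical and exist only for illustration; the sketch shows the ordering, not the race itself.<br>

```lisp
;; Single-threaded illustration only: a plain list stands in for the
;; receive stack. *RECEIVER*, PUSH-RESULT and POP-RESULT are hypothetical
;; names, not part of the original test.
(defparameter *receiver* '())

(defun push-result (value)
  "Push VALUE on top of the stack."
  (push value *receiver*))

(defun pop-result ()
  "Pop and return the most recently pushed value."
  (pop *receiver*))

;; Assume both workers finish before the main thread pops anything:
(push-result 'sooner)   ; the fast task completes first
(push-result 'later)    ; the slow task completes second and lands on top

;; The first pop sees the top of the stack, which is LATER.
(defparameter *first-pop* (pop-result))
(defparameter *second-pop* (pop-result))
```

With a FIFO queue as the receiver, the same sequence of pushes would hand back 'sooner first, so this particular failure depends on the receiver being a stack.<br>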
<br>For that matter, when the sleep time is small enough the "later" task will sometimes beat the "sooner" one.<br><br>Also, the printing of the "." in the test function may cause even more thread mayhem. I found that eliminating that decreased the likelihood of this effect.<br>
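<br>One way to compare sleep times without printing on every pass is to have each trial return true or false and count trials until the first out-of-order result. A minimal sketch, under the assumption that the trial function wraps a variant of the test that returns (eq 'sooner result) instead of asserting; iterations-until-mismatch and its arguments are hypothetical names.<br>

```lisp
;; Hypothetical helper: TRIAL is any function of no arguments that
;; returns true when the results arrived in the expected order.
(defun iterations-until-mismatch (trial &key (limit 100000))
  "Call TRIAL up to LIMIT times. Return the 1-based number of the first
call that returned NIL, or NIL if every call returned true."
  (loop :for i :from 1 :to limit
        :unless (funcall trial)
          :return i))
```

Averaging the returned count over several runs at each sleep time gives one figure per sleep value.<br>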
<br>I modified your test a bit and have run several iterations with various sleep times, and the results pretty much bear this out, I think. At a sleep of 0.03 I get an average of about 1500 iterations before encountering a case of popping 'later first, after 'sooner and 'later are both on the stack. The likelihood of this happening increases as the sleep time decreases:<br><br>0.02, 171 iterations<br>
0.01, 978<br>0.001, 296<br>0.0001, 78<br><br>And at low sleeps like 0.00001 I am able to pop the 'later before 'sooner is even pushed.<br><br>Erik.<br><br><div class="gmail_quote">On Wed, May 9, 2012 at 11:55 AM, James M. Lawrence <span dir="ltr"><<a href="mailto:llmjjmll@gmail.com" target="_blank">llmjjmll@gmail.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">I thought my example was straightforward enough, though as I mentioned<br>
I wish it were smaller. Following your suggestion, I have replaced the<br>
queue with a stack. I have also taken out the condition-wait function<br>
copied from bordeaux-threads. My pop function now resembles your<br>
consume function.<br>
<br>
The same assertion failure occurs.<br>
<br>
I am unable to reproduce it with high debug settings, or with tracing,<br>
or with logging.<br>
<br>
The test consists of a pair of worker threads pulling from a task<br>
queue. We push two tasks: one task returns immediately, the other task<br>
sleeps for 0.2 seconds (it can be 0.5 seconds or whatever, it just<br>
takes longer to fail). Since we have two workers, we should always<br>
obtain the result of the sleeping task second. A signal is getting<br>
missed, or something.<br>
<br>
Clozure does not pass the stress tests for my library, while other CL<br>
implementations do. I've put much effort into narrowing down this<br>
Clozure-only bug to this test case.<br>
<br>
I have found and fixed race conditions in Ruby which persisted for<br>
years. We both know that multi-threaded code can seem OK until poked<br>
in the right (wrong?) place.<br>
<br>
My first inclination was to point the finger at bordeaux-threads,<br>
which is why I asked about its condition-wait function. It may not<br>
have a race condition since Clozure uses atomic counts (which remember<br>
the signal) instead of condition variables (which don't). However it<br>
is not obvious what happens for arbitrary numbers of threads waiting<br>
and signaling at arbitrary times. I had hoped that someone would<br>
reject the validity of bordeaux's condition-wait.<br>
<br>
This is now moot since condition-wait is out of the picture.<br>
Incidentally if bordeaux-threads has a bogus implementation on Clozure<br>
then this is news to me. If not then my original pop-queue should<br>
work, though somewhat roundaboutly as Clozure sees it.<br>
<br>
I also wondered if threads were somehow accumulating, causing Clozure<br>
to become overwhelmed, but ccl:all-processes reports the same number<br>
of threads on each iteration.<br>
<br>
;;; raw-stack<br>
<br>
(defstruct raw-stack<br>
(data nil))<br>
<br>
(defun push-raw-stack (value q)<br>
(setf (raw-stack-data q) (cons value (raw-stack-data q))))<br>
<br>
(defun pop-raw-stack (q)<br>
(if (raw-stack-data q)<br>
(multiple-value-prog1 (values (car (raw-stack-data q)) t)<br>
(setf (raw-stack-data q) (cdr (raw-stack-data q))))<br>
(values nil nil)))<br>
<br>
;;; stack<br>
<br>
(defstruct stack<br>
(impl (make-raw-stack))<br>
(lock (ccl:make-lock))<br>
(sema (ccl:make-semaphore)))<br>
<br>
(defun push-stack (object stack)<br>
(ccl:with-lock-grabbed ((stack-lock stack))<br>
(push-raw-stack object (stack-impl stack))<br>
(ccl:signal-semaphore (stack-sema stack))))<br>
<br>
(defun pop-stack (stack)<br>
(ccl:wait-on-semaphore (stack-sema stack))<br>
(ccl:with-lock-grabbed ((stack-lock stack))<br>
(multiple-value-bind (value presentp)<br>
(pop-raw-stack (stack-impl stack))<br>
(assert presentp)<br>
value)))<br>
<br>
;;; run<br>
<br>
(defun test ()<br>
(let ((tasks (make-stack)))<br>
(loop<br>
:repeat 2<br>
:do (ccl:process-run-function<br>
"test"<br>
(lambda ()<br>
(loop (funcall (or (pop-stack tasks)<br>
(return)))))))<br>
(let ((receiver (make-stack)))<br>
(push-stack (lambda ()<br>
(push-stack (progn (sleep 0.2) 'later)<br>
receiver))<br>
tasks)<br>
(push-stack (lambda ()<br>
(push-stack 'sooner receiver))<br>
tasks)<br>
(let ((result (pop-stack receiver)))<br>
(assert (eq 'sooner result)))<br>
(let ((result (pop-stack receiver)))<br>
(assert (eq 'later result))))<br>
(push-stack nil tasks)<br>
(push-stack nil tasks))<br>
(format t "."))<br>
<br>
(defun run ()<br>
(loop (test)))<br>
</blockquote></div><br>