[Openmcl-devel] Semaphore troubles

Thu May 10 04:04:05 UTC 2012

```Try this code.

Had to borrow code for queue.  (My is comment is *damn* stack and *damn* dequeueing threads)

try (test 50000)

;;; -*- Mode: Lisp; Syntax: Common-Lisp; -*- File: utilities/queue.lisp

;;;; The Queue datatype

;;; We can remove elements form the front of a queue.  We can add elements in
;;; three ways: to the front, to the back, or ordered by some numeric score.
;;; This is done with the following enqueing functions, which make use of the
;;; following implementations of the elements:
;;;   ENQUEUE-AT-FRONT - elements are a list
;;;   ENQUEUE-AT-END   - elements are a list, with a pointer to end
;;;   ENQUEUE-BY-PRIORITY - elements are a heap, implemented as an array
;;; The best element in the queue is always in position 0.

;;; The heap implementation is taken from "Introduction to Algorithms" by
;;; Cormen, Lieserson & Rivest [CL&R], Chapter 7.  We could certainly speed
;;; up the constant factors of this implementation.  It is meant to be clear
;;; and simple and O(log n), but not super efficient.  Consider a Fibonacci
;;; heap [Page 420 CL&R] if you really have large queues to deal with.

(defstruct q
(key #'identity)
(last nil)
(elements nil))

;;;; Basic Operations on Queues

(defun make-empty-queue () (make-q))

(defun empty-queue? (q)
"Are there no elements in the queue?"
(= (length (q-elements q)) 0))

(defun queue-front (q)
"Return the element at the front of the queue."
(elt (q-elements q) 0))

(defun remove-front (q)
"Remove the element from the front of the queue and return it."
(pop (q-elements q)))

;;;; The Three Enqueing Functions

(defun enqueue-at-front (q items)
"Add a list of items to the front of the queue."
(setf (q-elements q) (nconc items (q-elements q))))

(defun enqueue-at-end (q items)
"Add a list of items to the end of the queue."
;; To make this more efficient, keep a pointer to the last cons in the queue
(cond ((null items) nil)
((or (null (q-last q)) (null (q-elements q)))
(setf (q-last q) (last items)
(q-elements q) (nconc (q-elements q) items)))
(t (setf (cdr (q-last q)) items
(q-last q) (last items)))))

(defstruct queue
(impl (make-empty-queue))
(lock (ccl:make-lock))
(sema (ccl:make-semaphore)))

(defun queue (object queue)
(ccl:with-lock-grabbed ((queue-lock queue))
(enqueue-at-end (queue-impl queue) (list object)))
(ccl:signal-semaphore (queue-sema queue)))

(defun dequeue (queue)
(ccl:wait-on-semaphore (queue-sema queue))
(ccl:with-lock-grabbed ((queue-lock queue))
(if (empty-queue? (queue-impl queue))
(values nil nil)
(values (remove-front (queue-impl queue)) t))))

(defun test (n)
(lock (ccl:make-lock)))
(loop repeat 2 do
(ccl:process-run-function
"test"
(lambda ()
(loop
(ccl:with-lock-grabbed (lock)
(return)))
(loop repeat n do
(assert (eq 'sooner result)))
(assert (eq 'later result)))
(format t "."))

On 2012-05-09, at 7:46 PM, James M. Lawrence wrote:

> I appreciate the time you've taken to respond.
>
> When I wrote "0.5 seconds or whatever", I meant that it fails for
> apparently any amount of time. As I mentioned in response to another
> reply, a 10 second sleep produces the failure as well. Is that
>
> It is also unclear why
>
> * it does not fail when the loop is removed (along with the push nils)
>
> * it does not fail when (format t ".") is removed
>
> Perhaps these are just curiosities due to entropy in the underlying
> system calls.
>
> The upshot of what you're saying is that Clozure cannot reliably
> distribute work across threads, while other CL implementations can. I
> would not call it a bug, but it's at least unfortunate. In fact
> Clozure scales better than SBCL for parallel mapping and other
> functions (stats available upon request), barring these peculiar
> hiccups.
>
>
> On Wed, May 9, 2012 at 8:44 PM, Gary Byers <gb at clozure.com> wrote:
>>
>>
>> On Wed, 9 May 2012, James M. Lawrence wrote:
>>
>>> I thought my example was straightforward enough, though as I mentioned
>>> I wish it were smaller. Following your suggestion, I have replaced the
>>> queue with a stack. I have also taken out the condition-wait function
>>> copied from bordeaux-threads. My pop function now resembles your
>>> consume function.
>>>
>>> The same assertion failure occurs.
>>>
>>> I am unable to reproduce it with high debug settings, or with tracing,
>>> or with logging.
>>
>>
>>>
>>> The test consists of a pair of worker threads pulling from a task
>>> sleeps for 0.2 seconds (it can be 0.5 seconds or whatever, it just
>>> takes longer to fail). Since we have two workers, we should always
>>> obtain the result of the sleeping task second. A signal is getting
>>> missed, or something.
>>
>>
>> You're assuming that whatever thread pulls the lambda that returns
>> 'SOONER will off of TASKS will push 'SOONER onto RECEIVER before
>> another thread pulls another lambda that sleeps for .2 seconds before
>> returning 'LATER pushes 'LATER on RECEIVER.  That assumption is likely
>> to hold a high percentage of the time, but I can't think of anything
>> that guarantees it. (The OS scheduler may have decided that it should
>> let Emacs re-fontify some buffers for a while, or let the kernel
>> process all of those network packets that've been gumming up the
>> works, and when it gets back to CCL it finds that it's time for the
>> sleeping thread to wake up and it gets scheduled and pushes LATER
>> on RECEIVER before the other thread even wakes up.  This kind of scenario
>> isn't as likely as one where 'SOONER is pushed first, but
>> it's not wildly improbable, either.  It's "likely" that 'SOONER will
>> be pushed first - maybe even "highly likely".  It's more likely (more
>> highly likely ?) if the sleeping thread sleeps longer, but non-realtime
>> OSes (like most flavors of Linux, like OSX, like ...) don't make the
>> scheduling guarantees that you seem to be assuming.
>>
>> While you're thinking "this thread should run before the other one because
>> it's ready to run and the other one is sleeping", the scheduler's thinking
>> "that CPU has been really active lately; better shut it down for a little
>> while so that it doesn't get too hot or consume too much power", or
>> something
>> equally obscure and unintuitive.  If you change compiler options, or
>> do printing or logging (or otherwise change how threads use the CPU cycles
>> they're given), your code looks different to the scheduler and behaves
>> differently (in subtle and not-always-predictable ways.)
>>
>> Of all the thread-related bugs that've ever existed in CCL, the most
>> common cause has probably been "code wasn't prepared to deal with
>> concurrency"; a close second is probably "code is making unwarranted
>> assumptions about scheduler behavior."  After many years of getting beaten
>> by those things, I think and hope that I'm more inclined to question some
>> assumptions that I used to make automatically and implicitly, and my first
>> reaction is to question the assumption that you're making.  It's more likely
>> that the thread that doesn't sleep will push 'SOONER before the thread that
>> sleeps pushes 'LATER, but nothing guarantees this, lots of factors affect
>> what happens, and all that I can see is that things that're statistically
>> unlikely happen occasionally.
>>
>> Scheduling behavior is likely beyond the grasp of mere mortals; we can have
>> a reasonable, largely accurate model of how things will behave, but we have
>> to bear in mind that that's all we have.
>>
>> Semaphores in CCL are very thin wrappers around whatever the OS provides
>> (POSIX
>> semaphores, Mach semaphores, something-or-other on Windows.)  If you say "a
>> [semaphore] must be getting dropped", you're either saying that there's a
>> problem
>> in that very thin wrapper or that we're all doomed (because what the OS
>> provides
>> doesn't work), and you're also saying that your code demonstrates this
>> problem
>> and no one else's notices.  Some or all of those things could be true, but
>> you're
>> claiming that they must be because you think that you know which thread will
>> run before which other thread.  You don't know that; all you really know is
>> that's
>> probably true.
>>
>>
>>  (defun test ()
>>>
>>>   (loop
>>>      :repeat 2
>>>      :do (ccl:process-run-function
>>>           "test"
>>>           (lambda ()
>>>             (loop (funcall (or (pop-stack tasks)
>>>                                (return)))))))
>>>     (push-stack (lambda ()
>>>                   (push-stack (progn (sleep 0.2) 'later)
>>>     (push-stack (lambda ()
>>>       (assert (eq 'sooner result)))
>>>       (assert (eq 'later result))))
>>>  (format t "."))
>>>
>>> (defun run ()
>>>  (loop (test)))
>>> _______________________________________________
>>> Openmcl-devel mailing list
>>> Openmcl-devel at clozure.com
>>> http://clozure.com/mailman/listinfo/openmcl-devel
>>>
>>>
>>
> _______________________________________________
> Openmcl-devel mailing list
> Openmcl-devel at clozure.com
> http://clozure.com/mailman/listinfo/openmcl-devel

```