[Openmcl-devel] [info-mcl] Thread-safe assignments / Is setf atomic

Wed Nov 7 10:57:58 PST 2007

At 10:01 AM -0500 11/7/07, Andrew Shalit wrote:
>Note that Toomas's analysis is only applicable to MCL.  Other Common
>Lisp implementations do not share these limitations on where they can be
>interrupted. 

Yes, this is called getting to a (GC) "safe point" and many other GC based languages these days don't used the pooling check approach at every function entry and backward branch approach that MCL uses.

"Thread-safe" is not a very specific term especially as you start to deal with more than one "Thread-safe" variable where memory ordering effects also play a role.

How will you be able to prove that you you have thread safety? Disassembling code doesn't really impress me. Suppose it just happens to work with the current compiler? Then some compiler optimization latter could cause it to fail. This probably isn't the case with MCL, but could be the case with CCL.

>From my perspective you can really know that a mechanism that you are using is "thread safe" unless the people that designed your product assert that it is.

If you want to see how hard this can be even in a simple language such as "C" with just two variables, see blog entry here, called "Barriers":
http://ridiculousfish.com/blog/

This is well worth reading to get a feel for all the issues involved, although I don't claim the everything said in that blog is accurate or even that his conclusions are correct. Still, kudos to the author for actually using code to work through the issues and trying to figure it all out.

One of the issues all programmers will face as the number of simultaneous executing threads (I call them SETS) continues to increase is how to create low overhead thread to thread communications mechanisms that are safe to use. One size certainly doesn't fit all.

My final thought is be very, very careful when thinking about using built in data structures of any language for thread safe communication, when the person that implemented them doesn't explain what if anything is guaranteed. Make sure you consider the more general cases of more than one thread safe variable and the case where the correct value in one variable depends on the other. If you don't have a firm grasp of modern computers architecture, memory system issues, and especially memory ordering and barriers I don't see how you can even begin to reason about this. It's important to know what you don't know.

Best,

leb

>(I'm actually not certain that MCL still works this way
>either, but I trust Toomas that it does.)
>
>Toomas Altosaar wrote:
>>> Hi,
>>>
>>> I realize that my recent question about atomic setf was pretty
>>> ambiguous. So here is a question narrowed down to something simpler,
>>> which is hopefully more straightforward to answer.
>>>
>>> Assume there is a global variable *var*, for example defined like this:
>>> (cl:defvar *var* 42)
>>>
>>> Assume there is exactly one thread forever executing the following loop:
>>> (cl:loop
>>>    (cl:setf *var* 42)
>>>    (cl:setf *var* 84))
>>>
>>> Furthermore, assume that other threads either don't touch *var* at
>>> all, or only read from it.
>>>
>>> Is it safe to assume that the reading threads will only ever read the
>>> values 42 or 84 from *var*, or should one take into account that every
>>> now and then, *var* may contain garbage (i.e., a bit pattern that
>>> consists partly of the bits of 42, and partly of the bits of 84)?
>>
>> (cl:defvar *var* 42)
>>
>> (defun fn ()
>>   (cl:loop
>>     (cl:setf *var* 42)
>>     (cl:setf *var* 84)))
>>
>> (disassemble 'fn)
>>
>>   (TWNEI NARGS 0)
>>   (MFLR LOC-PC)
>>   (BLA .SPSAVECONTEXTVSP)
>> L12
>>   (LWZ NARGS 331 RNIL)
>>   (TWGTI NARGS 0)
>>   (LWZ ARG_Y '*VAR* FN)  <=== XXX starting from here ...
>>   (LI ARG_Z '42)
>>   (STW ARG_Z 2 ARG_Y)         ; *var* is now 42
>>   (LWZ ARG_Y '*VAR* FN)
>>   (LI ARG_Z '84)              ; *var* is now 84
>>   (STW ARG_Z 2 ARG_Y)
>>   (B L12)                <=== YYY ... to here ... nothing can interrupt
>> this MCL thread
> >   (MR ARG_Z RNIL)
>>   (BA .SPPOPJ)
>>
>> So, from the disassemble one can see that two instructions past label
>> L12 until the branch back to L12 (b L12) is one linear section of code
>> with no branches or function calls. Within this stretch of code no
>> (Lisp) process / (OS) thread switch can occur within MCL. Why? Since we
>> are executing this code and no other thread in MCL is active! Since we
>> haven't made it back to the top of the loop where cmain/scheduler might
>> get called (the functions that cause threads to wait and run), we are
>> guaranteed that from XXX to YYY acts as an atomic operation. Only at the
>> top of the loop where the trap test is performed to see whether it is
>> time to run other code.
>>
>> Since 42 and 84 are fixnums in MCL (IIRC, MCL uses 2 tag bits for
>> fixnums, otherwise 3 for other data), their writes are handled with a
>> single store word (STW) instruction that writes 32 bits out to memory in
>> one shot.
>>
>> So in this example _all_ other MCL threads, if reading from *var* will
>> _always_ read the value 84, and never be able to see a value 42.
>>
>>> To generalize, are there datatypes for which it is not safe to assume
>>> that variable assignments are thread-safe in that sense?
>>
>> An example: updating an array.
>> Initially a very long simple array has been initialised so that all of
>> its elements have the fixnum representation for 0, i.e., all of its
>> elements are 0.
>>
>> Then we call a function that starts to set the value of each element to
>> another value, e.g., 1. Furthermore, the function does this updating by
>> LOOPing over the elements, e.g.,
>>
>> (defun update-array (arr N new-value)
>>   (loop for i from 0 below N
>>         do (setf (aref arr i) new-value)))
>>
>> Since there will be a backwards branch (as in your example above), other
>> threads reading from arr will get to run periodically and will see
>> changes to arr as time progresses.
>>
>>
>> If in doubt, build the new datatype, and then assign a reference to it
>> at the end, e.g.,
>>
>> (let* ((x (create-some-very-large-bignum)))
>>   (setq *var* x))
>>
>>> What about plain references?
>>
>> What do you mean by a plain reference?
>>
>>
>> ------------------------------------------------------------------------
>>
>> _______________________________________________
>> info-mcl mailing list
>> info-mcl at clozure.com
>> http://clozure.com/mailman/listinfo/info-mcl
>_______________________________________________
>info-mcl mailing list
>info-mcl at clozure.com
>http://clozure.com/mailman/listinfo/info-mcl