[Openmcl-devel] hash table performance mystery

Tue Dec 28 13:50:38 PST 2010

Hi,

Thanks for the explanation; this make a lot of sense now, and it
reassures me that my very stupid workaround---adding a useless element
to the table when I create it---is reasonable and should work pretty
well for us, for now.  This particular table tends to stay pretty
small, so we should have no problems if it's resized more frequently
due to a pile-up of deleted elements.

Thanks!

Jared

On Tue, Dec 28, 2010 at 3:16 PM, Gary Byers <gb at clozure.com> wrote:
> I'll try to look at it closer when I have time, but my first reaction
> is to think that the loop can be safely removed.  If it's doing
> anything useful, it should probably only do it if deleting the
> entry would mean:
>
> a) there are no more "live" entries
> b) the vector is essentially full of deleted entries.
>
> In general, in a closed hashing scheme (where there are a fixed
> number of entries in the table), finding an entry with a particular
> value involves starting at some particular entry (determined by a
> hash function) and probing it and some sequence of other entries
> ("the collision chain") until a match or a free entry is found.
> When an entry is removed, it can't usually be marked as "free", since that
> might prevent something further down the chain from
> being found; REMHASH has to mark the entry as being "deleted".
> (SETF GETHASH) can reuse a deleted entry if it's already walked
> the chain without finding a match.
>
> If you add lots of entries to a hash table and then delete them
> all,  the table is indistinguishable from an empty table (and shouldn't
> be seen as being "full of deleted entries.")  It's possible that
> rehashing would notice this and rehash the vector in place (if it
> can), but the loop is trying to avoid rehashing (which can involve
> changing a read lock on the hash table to a write lock), and the
> general idea of the loop may be worthwhile for that reason.
>
> If you add a small number (possibly 1) of entries and delete them,
> then the table was already nearly empty ("nearly full of free entries"),
> and going through the table marking everything as free (when most
> things were already free) is a waste of time.
>
>
>
> On Tue, 28 Dec 2010, Jared C. Davis wrote:
>
>> Hi,
>>
>> Thanks for the replies -- I've managed to distill things down to a
>> much smaller example, and I think I can see what's going wrong.  It
>> appears that that remhash executes a linear-cost loop when it deletes
>> the last-remaining element from a hash table.  Here's an example that
>> really makes it clear:
>>
>> (in-package "CL-USER")
>>
>> (defparameter *ht*
>>  (make-hash-table :test #'eq :size 10000000 :shared nil :lock-free nil))
>>
>> ;; 10 seconds -- hits the zero-element case over and over.
>> (time (loop for i from 1 to 100 do
>>           (setf (gethash 'foo *ht*) 'bar)
>>           (remhash 'foo *ht*)))
>>
>> ;; Instant -- never hits the zero-element case
>> (progn
>>  (setf (gethash 'hello *ht*) 'world)
>>  (time (loop for i from 1 to 10000 do
>>             (setf (gethash 'foo *ht*) 'bar)
>>             (remhash 'foo *ht*))))
>>
>> Here is the part of remhash that seems to cause the problems:
>>
>> (when (and foundp
>>                    (zerop (the fixnum (nhash.vector.count vector))))
>>           (do* ((i $nhash.vector_overhead (1+ i))
>>                 (n (uvsize vector)))
>>                ((= i n))
>>                (declare (fixnum i n))
>>                (setf (%svref vector i) free-hash-marker))
>>           (setf (nhash.grow-threshold hash)
>>                 (+ (nhash.vector.deleted-count vector)
>>                    (nhash.vector.weak-deletions-count vector)
>>                    (nhash.grow-threshold hash))
>>                 (nhash.vector.deleted-count vector) 0
>>                 (nhash.vector.weak-deletions-count vector) 0))
>>
>> It looks like it's overwriting every element in the underlying hash
>> vector with free-hash-marker.  I'm not sure why that's necessary, or
>> if it can be avoided.
>>
>> Thanks!
>> Jared
>>
>>
>> On Tue, Dec 28, 2010 at 1:06 PM, Gary Byers <gb at clozure.com> wrote:
>>>
>>> There used to (and may still be) a wiki page that describes some of
>>> these tradeoffs in more detail, but the 1.3 release notes (at
>>> <http://trac.clozure.com/ccl/wiki/ReleaseNotes/1.3>) say that
>>> lock-free hash tables "avoid locking during GETHASH, but at the cost
>>> of making rehashing more expensive."
>>>
>>> "rehashing" occurs when either:
>>>
>>> - a hash table contains keys that're hashed by address and the GC changes
>>> ?the address of one or more keys.
>>> - the underlying vector of key/value pairs is full and its contents have
>>> ?to be copied (hashed) into a new vector.
>>>
>>> Just from glancing at your example, I'd guess that it can cause rehashing
>>> to occur for both of those reasons.
>>>
>>> Rehashing is less likely to occur (and the lock-free algorithm is likely
>>> to work better) for hash tables whose contents are relatively stable over
>>> time (where insertions/deletions are rare and any address-based keys are
>>> long-lived objects that're unlikely to be moved by the GC.)
>>>
>>> Rehashing in a way that allows concurrent access to the hash table from
>>> other threads is complicated and expensive. ?If there's a lot of
>>> concurrent
>>> access going on, then allowing at least many kinds of concurrent access
>>> might lead to overall improvement in throughput (compared to forcing
>>> other
>>> threads to wait for a lock.) ?That's harder to measure, but it leads to
>>> another rule of thumb, namely that the lock-free algorithm may offer
>>> better
>>> overall performance when there's lots of concurrent access going on (even
>>> when some of the operations involved - like rehashing - are very
>>> expensive.)
>>>
>>>
>>>
>>>
>>> On Tue, 28 Dec 2010, Jared C. Davis wrote:
>>>
>>>> ; Hi,
>>>> ;
>>>> ; I am trying to understand a strange performance problem that appears
>>>> to
>>>> be
>>>> ; related to non lock-free hash tables.
>>>> ;
>>>> ; Below is a very distilled example.
>>>> ;
>>>> ; In this example, test1 and test2 are quite similar except that test2
>>>> does
>>>> ; some additional (but irrelevant) work.
>>>> ;
>>>> ; Claim: "obviously", test2 should be somewhat slower than test1.
>>>> ;
>>>> ; Indeed, this claim holds up when the *fal-ht* is created with
>>>> :lock-free
>>>> T.
>>>> ;
>>>> ; But mysteriously, when *fal-ht* is created with :lock-free NIL, test2
>>>> ; outperforms test1. ?In fact, when *fal-ht* is large, the performance
>>>> ; difference can be huge.
>>>> ;
>>>> ; Here are some sample figures from my machine:
>>>> ;
>>>> ; ? ? ? ? ? ? ? ? ? ?FAL-HT Size ? ? ? ? Test1 Loop ? ? ? ?Test2 Loop
>>>> ; --------------------------------------------------------------------
>>>> ; ?Not Lock-Free ? ? ? ? ?100 ? ? ? ? ? ? ?0.4 sec ? ? ? ? ?0.4 sec
>>>> ; ? ? ? ? ? ? ? ? ? ? ? 1,000 ? ? ? ? ? ? ?2.4 sec ? ? ? ? ?0.4 sec
>>>> ; ? ? ? ? ? ? ? ? ? ? ?10,000 ? ? ? ? ? ? 23.2 sec ? ? ? ? ?0.4 sec
>>>> ; ? ? ? ? ? ? ? ? ? ? 100,000 ? ? ? ? ? ?231.3 sec ? ? ? ? ?0.9 sec
>>>> ; --------------------------------------------------------------------
>>>> ; ? ? ? ? ? ? ? ? ? ? ? ? 100 ? ? ? ? ? ? ?0.3 sec ? ? ? ? ?0.5 sec
>>>> ; ?Lock-Free ? ? ? ? ? ?1,000 ? ? ? ? ? ? ?0.3 sec ? ? ? ? ?0.5 sec
>>>> ; ? ? ? ? ? ? ? ? ? ? ?10,000 ? ? ? ? ? ? ?0.3 sec ? ? ? ? ?0.5 sec
>>>> ; ? ? ? ? ? ? ? ? ? ? 100,000 ? ? ? ? ? ? ?0.3 sec ? ? ? ? ?0.5 sec
>>>> ; --------------------------------------------------------------------
>>>> ;
>>>> ; I am completely baffled. ?How can test2 outperform test1? ?And why
>>>> does
>>>> the
>>>> ; size of fal-ht have any significance on this computation at all? ?Any
>>>> ideas
>>>> ; are greatly appreciated.
>>>> ;
>>>> ; The figures above are from revision 14519, checked out this morning,
>>>> on
>>>> ; 64-bit linux, with a kernel built via "make" in
>>>> lisp-kernel/linuxx8664,
>>>> and a
>>>> ; (rebuild-ccl :full t) issued subsequently. ?They are similar to
>>>> results
>>>> from
>>>> ; an earlier revision (13955) so I have no reason to believe this is
>>>> anything
>>>> ; recent.
>>>> ;
>>>> ; Thanks!
>>>> ;
>>>> ; Jared
>>>>
>>>> (defparameter *fal-ht*
>>>> ?;; Associates alists with "equivalent" hash tables
>>>> ?(make-hash-table :test #'eq
>>>> ? ? ? ? ? ? ? ? ?:size 10000
>>>> ? ? ? ? ? ? ? ? ?:shared nil
>>>> ? ? ? ? ? ? ? ? ?:lock-free t
>>>> ? ? ? ? ? ? ? ? ?))
>>>>
>>>> (defun ht-acons (key value alist)
>>>> ?"Like acons, but also updates the hash table for alist in *fal-ht*"
>>>> ?(let* ((entry ?(cons key value))
>>>> ? ? ? ?(ans ? ?(cons entry alist))
>>>> ? ? ? ?(fal-ht *fal-ht*))
>>>> ? (if (atom alist)
>>>> ? ? ? (let ((tab (make-hash-table :size 500)))
>>>> ? ? ? ? (setf (gethash key tab) entry)
>>>> ? ? ? ? (setf (gethash ans fal-ht) tab))
>>>> ? ? (let ((tab (gethash alist fal-ht)))
>>>> ? ? ? (remhash alist fal-ht)
>>>> ? ? ? (setf (gethash key tab) entry)
>>>> ? ? ? (setf (gethash ans fal-ht) tab)))
>>>> ? ans))
>>>>
>>>> (defun ht-assoc (key alist)
>>>> ?"Like assoc, but uses the hash table for alist instead."
>>>> ?(if (atom alist)
>>>> ? ? nil
>>>> ? (gethash key (gethash alist *fal-ht*))))
>>>>
>>>> (defun ht-free (alist)
>>>> ?"Removes the hash table associated with alist."
>>>> ?(unless (atom alist)
>>>> ? (remhash alist *fal-ht*))
>>>> ?alist)
>>>>
>>>> (defun test1 (lst tab acc)
>>>> ?"Removes duplicates from the list L, accumulating non-duplicated
>>>> elements
>>>> ?onto acc. ?TAB acts as a seen table. ?It is an alist, that has a hash
>>>> ?table associated with it in *fal-ht*, that binds seen elements to T."
>>>> ?(if (atom lst)
>>>> ? ? (progn (ht-free tab)
>>>> ? ? ? ? ? ?acc)
>>>> ? (let ((look (ht-assoc (car lst) tab)))
>>>> ? ? (if look
>>>> ? ? ? ? (test1 (cdr lst) tab acc)
>>>> ? ? ? (let* ((tab (ht-acons (car lst) t tab))
>>>> ? ? ? ? ? ? ?(acc (cons (car lst) acc)))
>>>> ? ? ? ? (test1 (cdr lst) tab acc))))))
>>>>
>>>> (defun test2 (lst tab acc tab2)
>>>> ?"Similar to test1, but we also do some irrelevant work: we update tab2,
>>>> ?a second alist/hash table, just as we update tab."
>>>> ?(if (atom lst)
>>>> ? ? (progn (ht-free tab2)
>>>> ? ? ? ? ? ?(ht-free tab)
>>>> ? ? ? ? ? ?acc)
>>>> ? (let ((look (ht-assoc (car lst) tab)))
>>>> ? ? (if look
>>>> ? ? ? ? (test2 (cdr lst) tab acc tab2)
>>>> ? ? ? (let* ((tab ?(ht-acons (car lst) t tab))
>>>> ? ? ? ? ? ? ?(tab2 (ht-acons (car lst) t tab2))
>>>> ? ? ? ? ? ? ?(acc ?(cons (car lst) acc)))
>>>> ? ? ? ? (test2 (cdr lst) tab acc tab2))))))
>>>>
>>>> ;; Stupid data to use for our test.
>>>> (defparameter *dat*
>>>> ?(loop for i from 1 to 400 collect i))
>>>>
>>>> ;; Timing loop for test1.
>>>> (time (loop for i from 1 to 1000 do
>>>> ? ? ? ? ? (test1 *dat* nil nil)))
>>>>
>>>> ;; Timing loop for test2.
>>>> (time (loop for i from 1 to 1000 do
>>>> ? ? ? ? ? (test2 *dat* nil nil nil)))
>>>>
>>>> (quit)
>>>>
>>>>
>>>> --
>>>> Jared C. Davis <jared at cs.utexas.edu>
>>>> 11410 Windermere Meadows
>>>> Austin, TX 78759
>>>> http://www.cs.utexas.edu/users/jared/
>>>> _______________________________________________
>>>> Openmcl-devel mailing list
>>>> Openmcl-devel at clozure.com
>>>> http://clozure.com/mailman/listinfo/openmcl-devel
>>>>
>>>>
>>>
>>
>>
>>
>> --
>> Jared C. Davis <jared at cs.utexas.edu>
>> 11410 Windermere Meadows
>> Austin, TX 78759
>> http://www.cs.utexas.edu/users/jared/
>> _______________________________________________
>> Openmcl-devel mailing list
>> Openmcl-devel at clozure.com
>> http://clozure.com/mailman/listinfo/openmcl-devel
>>
>>
>

-- 
Jared C. Davis <jared at cs.utexas.edu>
11410 Windermere Meadows
Austin, TX 78759
http://www.cs.utexas.edu/users/jared/