[Openmcl-devel] hash table performance mystery

Gary Byers gb at clozure.com
Tue Dec 28 13:16:40 PST 2010


I'll try to look at it closer when I have time, but my first reaction
is to think that the loop can be safely removed.  If it's doing
anything useful, it should probably only do it if deleting the
entry would mean:

a) there are no more "live" entries
b) the vector is essentially full of deleted entries.

In general, in a closed hashing scheme (where there are a fixed
number of entries in the table), finding an entry with a particular
value involves starting at some particular entry (determined by a
hash function) and probing it and some sequence of other entries
("the collision chain") until a match or a free entry is found.
When an entry is removed, it can't usually be marked as "free", 
since that might prevent something further down the chain from
being found; REMHASH has to mark the entry as being "deleted".
(SETF GETHASH) can reuse a deleted entry if it's already walked
the chain without finding a match.

If you add lots of entries to a hash table and then delete them
all,  the table is indistinguishable from an empty table (and shouldn't
be seen as being "full of deleted entries.")  It's possible that
rehashing would notice this and rehash the vector in place (if it
can), but the loop is trying to avoid rehashing (which can involve
changing a read lock on the hash table to a write lock), and the
general idea of the loop may be worthwhile for that reason.

If you add a small number (possibly 1) of entries and delete them,
then the table was already nearly empty ("nearly full of free entries"),
and going through the table marking everything as free (when most
things were already free) is a waste of time.



On Tue, 28 Dec 2010, Jared C. Davis wrote:

> Hi,
>
> Thanks for the replies -- I've managed to distill things down to a
> much smaller example, and I think I can see what's going wrong.  It
> appears that that remhash executes a linear-cost loop when it deletes
> the last-remaining element from a hash table.  Here's an example that
> really makes it clear:
>
> (in-package "CL-USER")
>
> (defparameter *ht*
>  (make-hash-table :test #'eq :size 10000000 :shared nil :lock-free nil))
>
> ;; 10 seconds -- hits the zero-element case over and over.
> (time (loop for i from 1 to 100 do
>            (setf (gethash 'foo *ht*) 'bar)
>            (remhash 'foo *ht*)))
>
> ;; Instant -- never hits the zero-element case
> (progn
>  (setf (gethash 'hello *ht*) 'world)
>  (time (loop for i from 1 to 10000 do
>              (setf (gethash 'foo *ht*) 'bar)
>              (remhash 'foo *ht*))))
>
> Here is the part of remhash that seems to cause the problems:
>
> (when (and foundp
>                     (zerop (the fixnum (nhash.vector.count vector))))
>            (do* ((i $nhash.vector_overhead (1+ i))
>                  (n (uvsize vector)))
>                 ((= i n))
>                 (declare (fixnum i n))
>                 (setf (%svref vector i) free-hash-marker))
>            (setf (nhash.grow-threshold hash)
>                  (+ (nhash.vector.deleted-count vector)
>                     (nhash.vector.weak-deletions-count vector)
>                     (nhash.grow-threshold hash))
>                  (nhash.vector.deleted-count vector) 0
>                  (nhash.vector.weak-deletions-count vector) 0))
>
> It looks like it's overwriting every element in the underlying hash
> vector with free-hash-marker.  I'm not sure why that's necessary, or
> if it can be avoided.
>
> Thanks!
> Jared
>
>
> On Tue, Dec 28, 2010 at 1:06 PM, Gary Byers <gb at clozure.com> wrote:
>> There used to (and may still be) a wiki page that describes some of
>> these tradeoffs in more detail, but the 1.3 release notes (at
>> <http://trac.clozure.com/ccl/wiki/ReleaseNotes/1.3>) say that
>> lock-free hash tables "avoid locking during GETHASH, but at the cost
>> of making rehashing more expensive."
>>
>> "rehashing" occurs when either:
>>
>> - a hash table contains keys that're hashed by address and the GC changes
>> ?the address of one or more keys.
>> - the underlying vector of key/value pairs is full and its contents have
>> ?to be copied (hashed) into a new vector.
>>
>> Just from glancing at your example, I'd guess that it can cause rehashing
>> to occur for both of those reasons.
>>
>> Rehashing is less likely to occur (and the lock-free algorithm is likely
>> to work better) for hash tables whose contents are relatively stable over
>> time (where insertions/deletions are rare and any address-based keys are
>> long-lived objects that're unlikely to be moved by the GC.)
>>
>> Rehashing in a way that allows concurrent access to the hash table from
>> other threads is complicated and expensive. ?If there's a lot of concurrent
>> access going on, then allowing at least many kinds of concurrent access
>> might lead to overall improvement in throughput (compared to forcing other
>> threads to wait for a lock.) ?That's harder to measure, but it leads to
>> another rule of thumb, namely that the lock-free algorithm may offer better
>> overall performance when there's lots of concurrent access going on (even
>> when some of the operations involved - like rehashing - are very expensive.)
>>
>>
>>
>>
>> On Tue, 28 Dec 2010, Jared C. Davis wrote:
>>
>>> ; Hi,
>>> ;
>>> ; I am trying to understand a strange performance problem that appears to
>>> be
>>> ; related to non lock-free hash tables.
>>> ;
>>> ; Below is a very distilled example.
>>> ;
>>> ; In this example, test1 and test2 are quite similar except that test2
>>> does
>>> ; some additional (but irrelevant) work.
>>> ;
>>> ; Claim: "obviously", test2 should be somewhat slower than test1.
>>> ;
>>> ; Indeed, this claim holds up when the *fal-ht* is created with :lock-free
>>> T.
>>> ;
>>> ; But mysteriously, when *fal-ht* is created with :lock-free NIL, test2
>>> ; outperforms test1. ?In fact, when *fal-ht* is large, the performance
>>> ; difference can be huge.
>>> ;
>>> ; Here are some sample figures from my machine:
>>> ;
>>> ; ? ? ? ? ? ? ? ? ? ?FAL-HT Size ? ? ? ? Test1 Loop ? ? ? ?Test2 Loop
>>> ; --------------------------------------------------------------------
>>> ; ?Not Lock-Free ? ? ? ? ?100 ? ? ? ? ? ? ?0.4 sec ? ? ? ? ?0.4 sec
>>> ; ? ? ? ? ? ? ? ? ? ? ? 1,000 ? ? ? ? ? ? ?2.4 sec ? ? ? ? ?0.4 sec
>>> ; ? ? ? ? ? ? ? ? ? ? ?10,000 ? ? ? ? ? ? 23.2 sec ? ? ? ? ?0.4 sec
>>> ; ? ? ? ? ? ? ? ? ? ? 100,000 ? ? ? ? ? ?231.3 sec ? ? ? ? ?0.9 sec
>>> ; --------------------------------------------------------------------
>>> ; ? ? ? ? ? ? ? ? ? ? ? ? 100 ? ? ? ? ? ? ?0.3 sec ? ? ? ? ?0.5 sec
>>> ; ?Lock-Free ? ? ? ? ? ?1,000 ? ? ? ? ? ? ?0.3 sec ? ? ? ? ?0.5 sec
>>> ; ? ? ? ? ? ? ? ? ? ? ?10,000 ? ? ? ? ? ? ?0.3 sec ? ? ? ? ?0.5 sec
>>> ; ? ? ? ? ? ? ? ? ? ? 100,000 ? ? ? ? ? ? ?0.3 sec ? ? ? ? ?0.5 sec
>>> ; --------------------------------------------------------------------
>>> ;
>>> ; I am completely baffled. ?How can test2 outperform test1? ?And why does
>>> the
>>> ; size of fal-ht have any significance on this computation at all? ?Any
>>> ideas
>>> ; are greatly appreciated.
>>> ;
>>> ; The figures above are from revision 14519, checked out this morning, on
>>> ; 64-bit linux, with a kernel built via "make" in lisp-kernel/linuxx8664,
>>> and a
>>> ; (rebuild-ccl :full t) issued subsequently. ?They are similar to results
>>> from
>>> ; an earlier revision (13955) so I have no reason to believe this is
>>> anything
>>> ; recent.
>>> ;
>>> ; Thanks!
>>> ;
>>> ; Jared
>>>
>>> (defparameter *fal-ht*
>>> ?;; Associates alists with "equivalent" hash tables
>>> ?(make-hash-table :test #'eq
>>> ? ? ? ? ? ? ? ? ?:size 10000
>>> ? ? ? ? ? ? ? ? ?:shared nil
>>> ? ? ? ? ? ? ? ? ?:lock-free t
>>> ? ? ? ? ? ? ? ? ?))
>>>
>>> (defun ht-acons (key value alist)
>>> ?"Like acons, but also updates the hash table for alist in *fal-ht*"
>>> ?(let* ((entry ?(cons key value))
>>> ? ? ? ?(ans ? ?(cons entry alist))
>>> ? ? ? ?(fal-ht *fal-ht*))
>>> ? (if (atom alist)
>>> ? ? ? (let ((tab (make-hash-table :size 500)))
>>> ? ? ? ? (setf (gethash key tab) entry)
>>> ? ? ? ? (setf (gethash ans fal-ht) tab))
>>> ? ? (let ((tab (gethash alist fal-ht)))
>>> ? ? ? (remhash alist fal-ht)
>>> ? ? ? (setf (gethash key tab) entry)
>>> ? ? ? (setf (gethash ans fal-ht) tab)))
>>> ? ans))
>>>
>>> (defun ht-assoc (key alist)
>>> ?"Like assoc, but uses the hash table for alist instead."
>>> ?(if (atom alist)
>>> ? ? nil
>>> ? (gethash key (gethash alist *fal-ht*))))
>>>
>>> (defun ht-free (alist)
>>> ?"Removes the hash table associated with alist."
>>> ?(unless (atom alist)
>>> ? (remhash alist *fal-ht*))
>>> ?alist)
>>>
>>> (defun test1 (lst tab acc)
>>> ?"Removes duplicates from the list L, accumulating non-duplicated elements
>>> ?onto acc. ?TAB acts as a seen table. ?It is an alist, that has a hash
>>> ?table associated with it in *fal-ht*, that binds seen elements to T."
>>> ?(if (atom lst)
>>> ? ? (progn (ht-free tab)
>>> ? ? ? ? ? ?acc)
>>> ? (let ((look (ht-assoc (car lst) tab)))
>>> ? ? (if look
>>> ? ? ? ? (test1 (cdr lst) tab acc)
>>> ? ? ? (let* ((tab (ht-acons (car lst) t tab))
>>> ? ? ? ? ? ? ?(acc (cons (car lst) acc)))
>>> ? ? ? ? (test1 (cdr lst) tab acc))))))
>>>
>>> (defun test2 (lst tab acc tab2)
>>> ?"Similar to test1, but we also do some irrelevant work: we update tab2,
>>> ?a second alist/hash table, just as we update tab."
>>> ?(if (atom lst)
>>> ? ? (progn (ht-free tab2)
>>> ? ? ? ? ? ?(ht-free tab)
>>> ? ? ? ? ? ?acc)
>>> ? (let ((look (ht-assoc (car lst) tab)))
>>> ? ? (if look
>>> ? ? ? ? (test2 (cdr lst) tab acc tab2)
>>> ? ? ? (let* ((tab ?(ht-acons (car lst) t tab))
>>> ? ? ? ? ? ? ?(tab2 (ht-acons (car lst) t tab2))
>>> ? ? ? ? ? ? ?(acc ?(cons (car lst) acc)))
>>> ? ? ? ? (test2 (cdr lst) tab acc tab2))))))
>>>
>>> ;; Stupid data to use for our test.
>>> (defparameter *dat*
>>> ?(loop for i from 1 to 400 collect i))
>>>
>>> ;; Timing loop for test1.
>>> (time (loop for i from 1 to 1000 do
>>> ? ? ? ? ? (test1 *dat* nil nil)))
>>>
>>> ;; Timing loop for test2.
>>> (time (loop for i from 1 to 1000 do
>>> ? ? ? ? ? (test2 *dat* nil nil nil)))
>>>
>>> (quit)
>>>
>>>
>>> --
>>> Jared C. Davis <jared at cs.utexas.edu>
>>> 11410 Windermere Meadows
>>> Austin, TX 78759
>>> http://www.cs.utexas.edu/users/jared/
>>> _______________________________________________
>>> Openmcl-devel mailing list
>>> Openmcl-devel at clozure.com
>>> http://clozure.com/mailman/listinfo/openmcl-devel
>>>
>>>
>>
>
>
>
> -- 
> Jared C. Davis <jared at cs.utexas.edu>
> 11410 Windermere Meadows
> Austin, TX 78759
> http://www.cs.utexas.edu/users/jared/
> _______________________________________________
> Openmcl-devel mailing list
> Openmcl-devel at clozure.com
> http://clozure.com/mailman/listinfo/openmcl-devel
>
>



More information about the Openmcl-devel mailing list