[Openmcl-devel] hash table performance mystery

Jared C. Davis jared at cs.utexas.edu
Tue Dec 28 09:47:39 PST 2010

; Hi,
; I am trying to understand a strange performance problem that appears to be
; related to non lock-free hash tables.
; Below is a very distilled example.
; In this example, test1 and test2 are quite similar except that test2 does
; some additional (but irrelevant) work.
; Claim: "obviously", test2 should be somewhat slower than test1.
; Indeed, this claim holds up when the *fal-ht* is created with :lock-free T.
; But mysteriously, when *fal-ht* is created with :lock-free NIL, test2
; outperforms test1.  In fact, when *fal-ht* is large, the performance
; difference can be huge.
; Here are some sample figures from my machine:
;                    FAL-HT Size         Test1 Loop        Test2 Loop
; --------------------------------------------------------------------
;  Not Lock-Free          100              0.4 sec          0.4 sec
;                       1,000              2.4 sec          0.4 sec
;                      10,000             23.2 sec          0.4 sec
;                     100,000            231.3 sec          0.9 sec
; --------------------------------------------------------------------
;                         100              0.3 sec          0.5 sec
;  Lock-Free            1,000              0.3 sec          0.5 sec
;                      10,000              0.3 sec          0.5 sec
;                     100,000              0.3 sec          0.5 sec
; --------------------------------------------------------------------
; I am completely baffled.  How can test2 outperform test1?  And why does the
; size of fal-ht have any significance on this computation at all?  Any ideas
; are greatly appreciated.
; The figures above are from revision 14519, checked out this morning, on
; 64-bit linux, with a kernel built via "make" in lisp-kernel/linuxx8664, and a
; (rebuild-ccl :full t) issued subsequently.  They are similar to results from
; an earlier revision (13955) so I have no reason to believe this is anything
; recent.
; Thanks!
; Jared

(defparameter *fal-ht*
  ;; Associates alists with "equivalent" hash tables
  (make-hash-table :test #'eq
                   :size 10000
                   :shared nil
                   :lock-free t

(defun ht-acons (key value alist)
  "Like acons, but also updates the hash table for alist in *fal-ht*"
  (let* ((entry  (cons key value))
         (ans    (cons entry alist))
         (fal-ht *fal-ht*))
    (if (atom alist)
        (let ((tab (make-hash-table :size 500)))
          (setf (gethash key tab) entry)
          (setf (gethash ans fal-ht) tab))
      (let ((tab (gethash alist fal-ht)))
        (remhash alist fal-ht)
        (setf (gethash key tab) entry)
        (setf (gethash ans fal-ht) tab)))

(defun ht-assoc (key alist)
  "Like assoc, but uses the hash table for alist instead."
  (if (atom alist)
    (gethash key (gethash alist *fal-ht*))))

(defun ht-free (alist)
  "Removes the hash table associated with alist."
  (unless (atom alist)
    (remhash alist *fal-ht*))

(defun test1 (lst tab acc)
  "Removes duplicates from the list L, accumulating non-duplicated elements
   onto acc.  TAB acts as a seen table.  It is an alist, that has a hash
   table associated with it in *fal-ht*, that binds seen elements to T."
  (if (atom lst)
      (progn (ht-free tab)
    (let ((look (ht-assoc (car lst) tab)))
      (if look
          (test1 (cdr lst) tab acc)
        (let* ((tab (ht-acons (car lst) t tab))
               (acc (cons (car lst) acc)))
          (test1 (cdr lst) tab acc))))))

(defun test2 (lst tab acc tab2)
  "Similar to test1, but we also do some irrelevant work: we update tab2,
   a second alist/hash table, just as we update tab."
  (if (atom lst)
      (progn (ht-free tab2)
             (ht-free tab)
    (let ((look (ht-assoc (car lst) tab)))
      (if look
          (test2 (cdr lst) tab acc tab2)
        (let* ((tab  (ht-acons (car lst) t tab))
               (tab2 (ht-acons (car lst) t tab2))
               (acc  (cons (car lst) acc)))
          (test2 (cdr lst) tab acc tab2))))))

;; Stupid data to use for our test.
(defparameter *dat*
  (loop for i from 1 to 400 collect i))

;; Timing loop for test1.
(time (loop for i from 1 to 1000 do
            (test1 *dat* nil nil)))

;; Timing loop for test2.
(time (loop for i from 1 to 1000 do
            (test2 *dat* nil nil nil)))


Jared C. Davis <jared at cs.utexas.edu>
11410 Windermere Meadows
Austin, TX 78759

More information about the Openmcl-devel mailing list