[Openmcl-devel] Bignum speed

Thu Oct 16 22:22:36 PDT 2008

On Oct 16, 2008, at 11:24 PM, Waldek Hebisch wrote:

> A little benchmark below.  To make time easier to measure
> dot-product adds all products of elements of two lists.  To
> avoid possiblity of trivial optimizations all numbers are
> different, products are added, result is assigned to a variable.
> I make sure that result is not printed, because printing is
> usually quite time consuming and could change timing in undesired way.
>
>
> (setf a (expt 9 1000))
> (defun dot-product (l1 l2)
>    (let ((s 0)
>          (ll1 l1))
>        (loop
>            (cond
>                ((null l2)
>                    (return-from dot-product s))
>                ((null ll1)
>                    (setf ll1 l1)
>                    (setf l2 (cdr l2)))
>                (t
>                    (setf s (+ s (* (car ll1) (car l2))))
>                    (setf ll1 (cdr ll1)))))))
>
> (compile 'dot-product)
>
> (setf l1 nil)
> (dotimes (i 100) (setf l1 (cons (- a i) l1)))
> (setf l2 nil)
> (dotimes (i 100) (setf l2 (cons (+ i a)  l2)))
>
> (time (progn (setf ress (dot-product l1 l2)) nil))
>
> Using Clozure CL 1.2-r10552 on 2.4GHz Core Duo time is  2.200137s
> On the same machine sbcl needs 0.104s, clisp 0.297694 s, gcl 0.09s

On my machine, I get 2.450994 seconds on the x86-64 lisp, and on the  
experimental x86-32 lisp, I get 0.344 seconds.  (Just for grins, I can  
report that an old PowerBook G4 gets 0.909 seconds.)  So, yes,  the  
x86-64 lisp looks pretty bad.

On neither x86 platform are we using any sort of clever multiplication  
algorithm (Karatsuba, Toom-Cook), although the 32-bit ppc port does  
try to use Karatsuba when bignum sizes are suitably large.

A lot more of the 32-bit bignum code is in assembly, so that's  
presumably why it does better.  If we implemented a sub-quadratic time  
multiplication algorithm, that would be an additional plus.

Anyway, that's just additional evidence to support gb's prior message  
that there's plenty of room for some straightforward tuning.  If  
tuning the lisp code doesn't provide good enough results, it might not  
be too hard to adapt the x8632 assembly code for x8664.  Maybe if I  
get some free time...