[Openmcl-devel] Fun with Measurements

Brent Fulgham bfulg at pacbell.net
Sun Oct 29 11:45:52 PST 2006

On Oct 29, 2006, at 12:05 AM, Gary Byers wrote:

> On Sat, 28 Oct 2006, Brent Fulgham wrote:
>> While the 64-bit build of OpenMCL looks great on nearly all tests
>> (and substantially improves over the 32-bit build in some cases),
>> something seems very wrong with the 64-bit support for BIGNUMS (see
>> Remember that the non-reference columns are relative scales, so on
>> PI-DECIMAL/SMALL, 64-bit OpenMCL is 10x as slow as OpenMCL 1.1/1.0 in
>> 32-bit.  One small bright spot is the CRC40 benchmark, which due to its
>> heavy use of 40-bit integers is substantially improved using native
>> 64-bit words.
> I haven't looked at this (Apple's CHUD performance tools don't yet work
> in 64-bit mode); can you tell whether it's more accurate to say "bignum
> arithmetic is slower in ppc64 OpenMCL" or "bignum -multiplication- is
> slower ..." ?

Yes, it probably is multiplication.  The benchmark consists of:

;; code from Bruno Haible <haible at ilog.fr>
;; A. Elementary integer computations:
;;    The tests are run with N = 100, 1000, 10000, 100000 decimal digits.
;;    Precompute *x1* = floor((sqrt(5)+1)/2 * 10^(2N))
;;               *x2* = floor(sqrt(3) * 10^N)
;;               *x3* = 10^N+1
;;    Then time the following operations:
;;    1. Multiplication *x1* * *x2*,
;;    2. Division (with remainder) *x1* / *x2*,
;;    3. integer_sqrt (*x3*),
;;    4. gcd (*x1*, *x2*),
;; B. (from Pari)
;;       u=1;v=1;p=1;q=1;for(k=1..1000){w=u+v;u=v;v=w;p=p*w;q=lcm(q,w);}

So it leans heavily on bignum multiplication.
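
To tell multiplication apart from the other operations, the four part-A operations could be timed one at a time. What follows is only a minimal sketch: MAKE-OPERANDS, SECONDS-FOR and TIME-BIGNUM-OPS are hypothetical names that do not appear in the benchmark sources, and building the operands with ISQRT is an assumption about how the "Precompute" step might be done.

```lisp
;; Hypothetical helpers, not part of the benchmark sources.

(defun make-operands (n)
  "Return x1, x2, x3 as described in the benchmark comments, for N digits."
  (let* ((scale (expt 10 n))
         (scale2 (* scale scale))
         ;; floor((sqrt(5)+1)/2 * 10^(2N)), using exact integer arithmetic
         (x1 (floor (+ (isqrt (* 5 scale2 scale2)) scale2) 2))
         ;; floor(sqrt(3) * 10^N)
         (x2 (isqrt (* 3 scale2)))
         ;; 10^N + 1
         (x3 (+ scale 1)))
    (values x1 x2 x3)))

(defun seconds-for (thunk &optional (repeat 100))
  "Call THUNK REPEAT times and return the elapsed real time in seconds."
  (let ((start (get-internal-real-time)))
    (dotimes (i repeat)
      (declare (ignorable i))
      (funcall thunk))
    (/ (- (get-internal-real-time) start)
       (float internal-time-units-per-second))))

(defun time-bignum-ops (n &optional (repeat 100))
  "Return an alist of (operation . seconds) for N-digit operands."
  (multiple-value-bind (x1 x2 x3) (make-operands n)
    (list (cons :multiply (seconds-for (lambda () (* x1 x2)) repeat))
          (cons :divide   (seconds-for (lambda () (floor x1 x2)) repeat))
          (cons :isqrt    (seconds-for (lambda () (isqrt x3)) repeat))
          (cons :gcd      (seconds-for (lambda () (gcd x1 x2)) repeat)))))
```

Running (time-bignum-ops 10000) in both the 32-bit and 64-bit builds and comparing the alists entry by entry should show whether only :multiply regresses or whether the other operations do as well.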

Here's an example test case that shows bad performance in 64-bit mode:

;; calculating pi using ratios
(defun compute-pi-decimal (n)
   (let ((p 0)
         (r nil)
         (dpi 0))
     (dotimes (i n)
       (incf p (/ (- (/ 4 (+ 1 (* 8 i)))
                     (/ 2 (+ 4 (* 8 i)))
                     (/ 1 (+ 5 (* 8 i)))
                     (/ 1 (+ 6 (* 8 i))))
                  (expt 16 i))))
     (dotimes (i n)
       (multiple-value-setq (r p) (truncate p 10))
       (setf dpi (+ (* 10 dpi) r))
       (setf p (* p 10)))
     dpi))

;; this can be 1e-6 on most compilers, but for COMPUTE-PI-DECIMAL on
;; OpenMCL we lose lotsa precision
(defun fuzzy-eql (a b)
   (< (abs (/ (- a b) b)) 1e-4))

(defun run-pi-decimal/small ()
   (assert (fuzzy-eql pi (/ (compute-pi-decimal 200) (expt 10 198)))))

(defun run-pi-decimal/big ()
   (assert (fuzzy-eql pi (/ (compute-pi-decimal 1000) (expt 10 998)))))

The comment regarding OpenMCL is not mine; it was in the benchmark
sources.  Does anyone know why it was made?

At any rate, the above example is 8 to 10x as slow for 64-bit OpenMCL  
as for 32-bit.  (pi-decimal/small is 10x as slow, pi-decimal/big is 8x).

> The ppc32 version does Karatsuba multiplication
> <http://en.wikipedia.org/wiki/Karatsuba_multiplication> if the
> operands are big enough; neither the ppc64 nor x86-64 versions do (for no
> particularly good reason.)  I'd be a little surprised if that was the
> difference, but I'm also surprised by the results: a year or so ago, the
> ppc64 bignum code seemed as fast or faster in all of the cases that I tried.

I'm not qualified to assess whether Karatsuba multiplication would  
provide significant gain.  It might be useful to see if there is a  
more full-featured BIGNUM test suite we could use to assess the full  
set of operations.  I'll see what I can find.
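
For reference, Karatsuba multiplication replaces the four half-size products of the schoolbook method with three. The sketch below is written at the Lisp level purely for illustration. It is not OpenMCL's implementation, the name KARATSUBA and the 512-bit cutoff are arbitrary assumptions, and it handles non-negative integers only.

```lisp
;; Illustrative only: not how OpenMCL's ppc32 bignum code is written.

(defun karatsuba (x y &optional (cutoff-bits 512))
  "Multiply non-negative integers X and Y using Karatsuba's method."
  (let ((n (max (integer-length x) (integer-length y))))
    ;; Below the cutoff, fall back to the built-in multiply.  The floor
    ;; of 8 bits guarantees that the recursion always shrinks.
    (if (<= n (max cutoff-bits 8))
        (* x y)
        (let* ((half (ceiling n 2))
               ;; Split each operand as hi * 2^half + lo.
               (x-lo (ldb (byte half 0) x))
               (x-hi (ash x (- half)))
               (y-lo (ldb (byte half 0) y))
               (y-hi (ash y (- half)))
               ;; Three recursive products instead of four.
               (z0 (karatsuba x-lo y-lo cutoff-bits))
               (z2 (karatsuba x-hi y-hi cutoff-bits))
               (z1 (- (karatsuba (+ x-lo x-hi) (+ y-lo y-hi) cutoff-bits)
                      z0 z2)))
          ;; Recombine: z2 * 2^(2*half) + z1 * 2^half + z0.
          (+ (ash z2 (* 2 half)) (ash z1 half) z0)))))
```

Timing (karatsuba a b) against (* a b) for operands of a few thousand digits would give a rough idea of whether the algorithm matters at the sizes these benchmarks use, though a Lisp-level version carries overhead that a native implementation would not.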


