[Openmcl-devel] Fun with Measurements

Brent Fulgham bfulg at pacbell.net
Sun Oct 29 11:45:52 PST 2006


On Oct 29, 2006, at 12:05 AM, Gary Byers wrote:

> On Sat, 28 Oct 2006, Brent Fulgham wrote:
>> While the 64-bit build of OpenMCL looks great on nearly all tests
>> (and substantially improves over the 32-bit build in some cases),
>> something seems very wrong with the 64-bit support for BIGNUMS (see
>> BIGNUM/PARI-200-5, PI-DECIMAL/BIG and PI-DECIMAL/SMALL for example).
>> Remember that the non-reference columns are relative scales, so on
>> PI-DECIMAL/SMALL, 64-bit OpenMCL is 10x as slow as OpenMCL 1.1/1.0 in
>> 32-bit.  One small bright spot is the CRC40 benchmark, which due to its
>> heavy use of 40-bit integers is substantially improved using native
>> 64-bit words.
>
> I haven't looked at this (Apple's CHUD performance tools don't yet work
> in 64-bit mode); can you tell whether it's more accurate to say "bignum
> arithmetic is slower in ppc64 OpenMCL" or "bignum -multiplication- is
> slower ..." ?

Yes, it probably is multiplication.  The benchmark consists of:

;; code from Bruno Haible <haible at ilog.fr>
;;
;; A. Elementary integer computations:
;;    The tests are run with N = 100, 1000, 10000, 100000 decimal digits.
;;    Precompute *x1* = floor((sqrt(5)+1)/2 * 10^(2N))
;;               *x2* = floor(sqrt(3) * 10^N)
;;               *x3* = 10^N+1
;;    Then time the following operations:
;;    1. Multiplication *x1* * *x2*,
;;    2. Division (with remainder) *x1* / *x2*,
;;    3. integer_sqrt (*x3*),
;;    4. gcd (*x1*, *x2*),
;;
;; B. (from Pari)
;;       u=1;v=1;p=1;q=1;for(k=1..1000){w=u+v;u=v;v=w;p=p*w;q=lcm(q,w);}

So, it is heavily focused on bignum multiplication.
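
To make the description above concrete, here is a rough sketch of how the
operands in part A and the loop in part B could be written in portable
Common Lisp.  SETUP-OPERANDS and PARI-LOOP are names I made up for
illustration; they are not taken from the benchmark sources, and the real
benchmark may compute these values differently.

```lisp
;; Sketch only: SETUP-OPERANDS and PARI-LOOP are hypothetical names,
;; not functions from the benchmark sources.

(defun setup-operands (n)
  "Return x1, x2 and x3 for N decimal digits, using exact integer arithmetic."
  (let* ((ten-n  (expt 10 n))
         (ten-2n (* ten-n ten-n)))
    (values
     ;; x1 = floor((sqrt(5)+1)/2 * 10^(2N))
     (floor (+ (isqrt (* 5 ten-2n ten-2n)) ten-2n) 2)
     ;; x2 = floor(sqrt(3) * 10^N)
     (isqrt (* 3 ten-2n))
     ;; x3 = 10^N + 1
     (+ ten-n 1))))

(defun pari-loop (&optional (iterations 1000))
  "Part B: build a product P and a least common multiple Q of the sums W."
  (let ((u 1) (v 1) (p 1) (q 1))
    (loop repeat iterations
          do (let ((w (+ u v)))
               (setf u v
                     v w
                     p (* p w)      ; bignum multiplication
                     q (lcm q w)))) ; gcd, multiplication and division
    (values p q)))
```

The timed operations in part A would then be (* x1 x2), (truncate x1 x2),
(isqrt x3) and (gcd x1 x2) applied to the values returned by SETUP-OPERANDS.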

Here's an example test case that shows bad performance in 64-bit mode:

=========================================================
;; calculating pi using ratios
(defun compute-pi-decimal (n)
   (let ((p 0)
         (r nil)
         (dpi 0))
     (dotimes (i n)
       (incf p (/ (- (/ 4 (+ 1 (* 8 i)))
                     (/ 2 (+ 4 (* 8 i)))
                     (/ 1 (+ 5 (* 8 i)))
                     (/ 1 (+ 6 (* 8 i))))
                  (expt 16 i))))
     (dotimes (i n)
       (multiple-value-setq (r p) (truncate p 10))
       (setf dpi (+ (* 10 dpi) r))
       (setf p (* p 10)))
     dpi))

;; this can be 1e-6 on most compilers, but for COMPUTE-PI-DECIMAL on
;; OpenMCL we lose lotsa precision
(defun fuzzy-eql (a b)
   (< (abs (/ (- a b) b)) 1e-4))

(defun run-pi-decimal/small ()
   (assert (fuzzy-eql pi (/ (compute-pi-decimal 200) (expt 10 198)))))

(defun run-pi-decimal/big ()
   (assert (fuzzy-eql pi (/ (compute-pi-decimal 1000) (expt 10 998)))))
=========================================================

The comment regarding OpenMCL is not mine; it was in the benchmark
sources.  Does anyone know why that comment was made?

At any rate, the above example is 8 to 10x as slow for 64-bit OpenMCL  
as for 32-bit.  (pi-decimal/small is 10x as slow, pi-decimal/big is 8x).
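
One way to separate "bignum arithmetic" from "bignum multiplication" would
be to time the individual operations on the same pair of operands in both
builds.  The following is only a sketch: TIME-BIGNUM-OP and
COMPARE-BIGNUM-OPS are hypothetical helpers, not part of the benchmark
suite, and the operand sizes and repetition counts are arbitrary choices.

```lisp
;; Sketch only: these helpers are hypothetical and not part of the
;; benchmark suite.  Operand sizes and repetition counts are arbitrary.

(defun time-bignum-op (fn a b &optional (reps 1000))
  "Return the elapsed real time, in seconds, of REPS calls of FN on A and B."
  (let ((start (get-internal-real-time)))
    (dotimes (i reps)
      (declare (ignorable i))
      (funcall fn a b))
    (float (/ (- (get-internal-real-time) start)
              internal-time-units-per-second))))

(defun compare-bignum-ops (&optional (digits 1000) (reps 1000))
  "Return an alist of (operation . seconds) for operands of about DIGITS digits."
  (let ((a (isqrt (* 5 (expt 10 (* 2 digits)))))
        (b (isqrt (* 3 (expt 10 (* 2 digits))))))
    (loop for (name fn) in (list (list '+ #'+)
                                 (list '* #'*)
                                 (list 'truncate #'truncate)
                                 (list 'gcd #'gcd))
          collect (cons name (time-bignum-op fn a b reps)))))
```

Running (compare-bignum-ops) in the 32-bit and 64-bit images and comparing
the two alists entry by entry should show whether the slowdown is confined
to * or spread across the other operations as well.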

> The ppc32 version does Karatsuba multiplication
> <http://en.wikipedia.org/wiki/Karatsuba_multiplication> if the
> operands are big enough; neither the ppc64 nor x86-64 versions do (for no
> particularly good reason.)  I'd be a little surprised if that was the
> difference, but I'm also surprised by the results: a year or so ago, the
> ppc64 bignum code seemed as fast or faster in all of the cases that I tried.

I'm not qualified to assess whether Karatsuba multiplication would
provide a significant gain.  It might be useful to see if there is a
more full-featured BIGNUM test suite we could use to assess the full
set of operations.  I'll see what I can find.
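
For anyone following along, this is a minimal sketch of the Karatsuba idea
in portable Common Lisp.  It is not the ppc32 OpenMCL code, which I have
not read; the function name, the cutoff variable and its value are all
assumptions made for illustration.  The point is only that one large
multiplication is replaced by three multiplications of roughly half the
size.

```lisp
;; Sketch only: not OpenMCL's implementation.  The cutoff below is an
;; illustrative guess, not a tuned value.

(defparameter *karatsuba-cutoff-bits* 2048
  "Operands shorter than this many bits are multiplied directly.")

(defun karatsuba (x y)
  "Multiply the non-negative integers X and Y using Karatsuba's method."
  (if (or (< (integer-length x) *karatsuba-cutoff-bits*)
          (< (integer-length y) *karatsuba-cutoff-bits*))
      (* x y)
      (let* ((half (ash (max (integer-length x) (integer-length y)) -1))
             ;; Split each operand as hi * 2^half + lo.
             (x-hi (ash x (- half)))
             (x-lo (ldb (byte half 0) x))
             (y-hi (ash y (- half)))
             (y-lo (ldb (byte half 0) y))
             ;; Three recursive products instead of four.
             (hi  (karatsuba x-hi y-hi))
             (lo  (karatsuba x-lo y-lo))
             (mid (- (karatsuba (+ x-hi x-lo) (+ y-hi y-lo)) hi lo)))
        (+ (ash hi (* 2 half))
           (ash mid half)
           lo))))
```

Because the base case still calls the built-in *, this version is unlikely
to beat a native implementation; it is meant to show the structure of the
algorithm, not to be used for measurements.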

Thanks,

-Brent




