[Openmcl-devel] Performance Question

Gary Byers gb at clozure.com
Tue Jan 2 23:46:14 PST 2007



On Tue, 2 Jan 2007, Brent Fulgham wrote:

> The main performance problem in my code appears to be the implementation of 
> the floating point array stuff.
>
> Attached is a test case outlining the problem.
>
> If I run this under OpenMCL (the 12-31-2006 build):
>
> ? (time (dotimes (i 5000)
>       (defparameter *A* (rlet ((BB (array :single-float 4)))
>            (init-array BB :single-float 0.1 0.2 0.3 0.4)))))


A little stub version of INIT-ARRAY doesn't cons at all, and of course
it shouldn't.  I don't remember whether there are many OpenGL *fv functions
that take arrays of other than 4 elements, but IIRC 4-element arrays
were pretty common.

The size of a SINGLE-FLOAT is 4 bytes.  If that changes, this code
would have to change.

(defun init-array4 (p type a b c d)
  ;;(declare (optimize (speed 3) (safety 0)))
  (case type
    (:single-float
     (setf (%get-single-float p 0) a
           (%get-single-float p 4) b
           (%get-single-float p 8) c
           (%get-single-float p 12) d))
    ;; Other types ?
    )
  nil)

? (time (dotimes (i 5000)
         (defparameter *A* (rlet ((BB (array :single-float 4)))
                             (init-array4 BB :single-float 0.1 0.2 0.3 0.4)))))
(DOTIMES (I 5000) (DEFPARAMETER *A* (RLET ((BB (ARRAY :SINGLE-FLOAT 4))) (INIT-ARRAY4 BB :SINGLE-FLOAT 0.1 0.2 0.3 0.4)))) took 9 milliseconds (0.009 seconds) to run.
Of that, 9 milliseconds (0.009 seconds) were spent in user mode
          0 milliseconds (0.000 seconds) were spent in system mode
NIL


The 9 millisecond time above is on a 2.5GHz G5.  I'm sure that the
generated code could be improved a bit, but shoving a few floats into
memory shouldn't be a bottleneck and I don't think that the (partial)
version of INIT-ARRAY above does anything particularly clever.  I don't
think that the code is particularly hard to read or understand or maintain
or that it would be difficult to generalize to other small fixed numbers
of elements.
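
As a minimal sketch of what generalizing to other small fixed element
counts could look like: the macro below expands into one store per
element, with byte offsets computed at macroexpansion time.  The name
INIT-SINGLE-FLOATS is hypothetical (it is not part of OpenMCL), the code
assumes a package that uses CCL (e.g. CCL-USER), and it hard-codes the
same 4-byte SINGLE-FLOAT size as INIT-ARRAY4.

```lisp
;; Hypothetical helper, not part of OpenMCL.
;; Expands into (setf (%get-single-float ptr 0) v0
;;                    (%get-single-float ptr 4) v1 ...)
(defmacro init-single-floats (p &rest values)
  (let ((ptr (gensym "PTR")))
    `(let ((,ptr ,p))                          ; evaluate the pointer form once
       (setf ,@(loop for v in values
                     for offset from 0 by 4    ; assumes 4-byte SINGLE-FLOAT
                     append (list `(%get-single-float ,ptr ,offset) v)))
       nil)))

;; Usage: fill a 3-element foreign array, then read back the last element.
(rlet ((bb (array :single-float 3)))
  (init-single-floats bb 1.0 2.0 3.0)
  (%get-single-float bb 8))
```

Because the element count is known when the macro is expanded, there is
no loop and no size lookup left to do at runtime.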

It's possible that one could also write a version that was a little
slower and a little more general (e.g., one that took a dynamic-extent
&rest arg and processed its elements in a loop).  That shouldn't be
orders of magnitude slower than the specialized version above and it
shouldn't cons; I don't know why the more general version quoted below
is that slow and does cons.
(I would guess that it's doing foreign-type-system things at runtime,
but have not made any real effort to understand it.)

Rather than puzzle over that, it might be better to use something like
the simple, not-horribly-slow version above as a starting point and
generalize it as necessary.
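
A sketch of the slower, more general variant mentioned above (a
dynamic-extent &rest arg processed in a loop) might look like the
following.  The name INIT-SINGLE-FLOAT-ARRAY is hypothetical, the code
assumes a package that uses CCL, and the element size is again
hard-coded to 4 bytes rather than looked up through the foreign type
system at runtime.  Its consing and timing have not been measured here.

```lisp
;; Hypothetical helper, not part of OpenMCL.
(defun init-single-float-array (p &rest values)
  (declare (dynamic-extent values))     ; allow the &rest list to be stack-allocated
  (let ((offset 0))
    (declare (fixnum offset))
    (dolist (v values)
      (setf (%get-single-float p offset) v)
      (incf offset 4))                  ; assumes 4-byte SINGLE-FLOAT
    nil))

;; Usage: same shape as the INIT-ARRAY4 call, minus the type keyword.
(rlet ((bb (array :single-float 4)))
  (init-single-float-array bb 0.1 0.2 0.3 0.4)
  (%get-single-float bb 12))
```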

> (DOTIMES (I 5000) (DEFPARAMETER *A* (RLET ((BB (ARRAY :SINGLE-FLOAT 4))) 
> (INIT-ARRAY BB :SINGLE-FLOAT 0.1 0.2 0.3 0.4)))) took 4,694 milliseconds 
> (4.694 seconds) to run.
> Of that, 1,090 milliseconds (1.090 seconds) were spent in user mode
>        1,719 milliseconds (1.719 seconds) were spent in system mode
>        1,885 milliseconds (1.885 seconds) were spent executing other OS 
> processes.
> 14 milliseconds (0.014 seconds) was spent in GC.
> 10,613,672 bytes of memory allocated.
> NIL
> ?
>
> If I run this code in the demo version of MCL 5.1, the performance is about 
> 40 times better, and it conses about a 50th as much memory:
>
> ? (time (dotimes (i 5000)
>    (defparameter *A* (rlet ((BB (array :single-float 4)))
>      (ccl::init-array BB :single-float 0.1 0.2 0.3 0.4)))))
> (DOTIMES (I 5000) (DEFPARAMETER *A* (RLET ((BB (ARRAY :SINGLE-FLOAT 4))) 
> (CCL::INIT-ARRAY BB :SINGLE-FLOAT 0.1 0.2 0.3 0.4)))) took 185 milliseconds 
> (0.185 seconds) to run.
> 200,016 bytes of memory allocated.
>
> This pretty closely matches the 40-fold slow-down in frame rate I see in 
> simple tests of the OpenGL stuff under OpenMCL compared to MCL.
>
> If I hard-code the sizes in the %put-long/%put-single-float it speeds things 
> up considerably, though memory use is still very large:
>
> Of that, 277 milliseconds (0.277 seconds) were spent in user mode
>        413 milliseconds (0.413 seconds) were spent in system mode
>        59 milliseconds (0.059 seconds) were spent executing other OS 
> processes.
> 7 milliseconds (0.007 seconds) was spent in GC.
> 2,133,672 bytes of memory allocated.
>
>
> Thanks,
>
> -Brent
>
>
> ==========================================================
> Test Implementation
> ==========================================================
> (eval-when (:compile-toplevel :load-toplevel :execute)
> (ccl:use-interface-dir :carbon)
> (open-shared-library "/System/Library/Frameworks/Carbon.framework/Carbon"))
>
> (defun %sizeof (type)
>  (ccl::%foreign-type-or-record-size type :bytes))
>
> (defun record-field-length (Type)
>  (%sizeof Type))
>
> (defmacro %put-long (Name Value &optional (Index 0))
> `(eval-when (compile eval load)
>    (setf (%get-long ,Name (* ,Index (%sizeof :long))) ,Value)))
> ;
> ;  "speedy" version
> ;
> ;(defmacro %put-long (Name Value &optional (Index 0))
> ;  `(eval-when (compile eval load)
> ;     (setf (%get-long ,Name (* ,Index 4)) ,Value)))
>
> (defmacro %put-single-float (Name Value Index)
> `(eval-when (compile eval load)
>    (setf (%get-single-float ,Name (* ,Index (%sizeof :single-float))) 
> ,Value)))
> ;
> ;  "speedy" version
> ;
> ;(defmacro %put-single-float (Name Value Index)
> ;  `(eval-when (compile eval load)
> ;     (setf (%get-single-float ,Name (* ,Index 4)) ,Value)))
>
> (defun INIT-ARRAY (&array Type &rest Values)
> (declare (dynamic-extent Values))
> (let ((Index 0)
>       (Size (record-field-length Type)))
>   (dolist (Value Values &Array)
>     (ecase Type
>       (:long (%put-long &Array Value Index))
>       (:single-float (%put-single-float &Array Value Index)))
>     (incf Index 1))))
>
> (defmacro WITH-RGBA-VECTOR (Vector (Red Green Blue Alpha) &body Forms)
> `(rlet ((,Vector (array :single-float 4)))
>    (init-array ,Vector :single-float ,Red ,Green ,Blue ,Alpha)
>    ,@Forms))
>


