[Openmcl-devel] Using vecLib framework from OpenMCL?
Gary Byers
gb at clozure.com
Tue Aug 29 00:21:04 PDT 2006
The function (CCL:MAKE-HEAP-IVECTOR <element-count> <element-type>)
will allocate some foreign memory (at least enough to store <element-count>
contiguous objects of immediate type <element-type>); it then slaps
some lisp header information on the front of that block of memory
and returns three values: a tagged lisp vector, a pointer to the first
usable byte of data (past the lisp header), and the logical size of
that data in bytes.
? (make-heap-ivector 10 'character)
"" ; the 10 #\NUL characters may or may not print visibly
#<A Mac Pointer #x301E6C>
10
?
Note that the address in question (#x301e6c) is 4 bytes (the size of
the lisp header on ppc32) past an address that's aligned on a 64-bit
boundary. AltiVec objects generally have to be aligned on 128-bit
boundaries in memory (I believe that there's at least a performance
penalty in the SSE2 case if the vector isn't 128-bit aligned).
So, you could allocate a little more than you need, and only use
the aligned part of the result:
;;; Create a 128-bit aligned vector of 4 SINGLE-FLOATs inside a
;;; heap-allocated vector of 7 SINGLE-FLOATs. Return the lisp
;;; vector, the biased foreign pointer, and the index of the first
;;; aligned SINGLE-FLOAT in the vector.
(defun allocate-aligned-single-float-vector ()
  (multiple-value-bind (lisp-vector foreign-pointer)
      (ccl:make-heap-ivector 7 'single-float)
    (let* ((address (ccl:%ptr-to-int foreign-pointer))
           ;; Round up to the next multiple of 16 bytes (128 bits).
           (aligned-address (logandc2 (+ address 15) 15)))
      (values lisp-vector
              ;; Turn the aligned integer address back into a pointer.
              (ccl:%int-to-ptr aligned-address)
              ;; The "4" below is the size of a SINGLE-FLOAT in bytes
              (floor (- aligned-address address) 4)))))
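The rounding arithmetic in that function can be checked with plain integers, no foreign memory required. The sketch below is portable Common Lisp; ALIGN-UP and FIRST-ALIGNED-INDEX are hypothetical helper names used only for illustration, and the address is the one printed in the transcript above.

```lisp
;; Hypothetical helpers for illustration; not part of OpenMCL.
(defun align-up (address &optional (alignment 16))
  "Round ADDRESS up to the next multiple of ALIGNMENT, a power of two."
  (logandc2 (+ address (1- alignment)) (1- alignment)))

(defun first-aligned-index (address &optional (element-bytes 4))
  "Index of the first 128-bit aligned element in data starting at ADDRESS."
  (floor (- (align-up address) address) element-bytes))

(let ((address #x301E6C))
  (format t "~&offset past 64-bit boundary:  ~d~%" (mod address 8))
  (format t "offset past 128-bit boundary: ~d~%" (mod address 16))
  (format t "aligned address: #x~X, first index: ~d~%"
          (align-up address) (first-aligned-index address)))
```

A data pointer that is only guaranteed to be 4-byte aligned can sit as much as 12 bytes (3 SINGLE-FLOATs) short of the next 128-bit boundary, which is why 4 + 3 = 7 elements get allocated.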
So (after some huffing and puffing) we could initialize a foreign
vector with some values whose square roots we'd like to determine
in parallel:
(multiple-value-bind (vector arg-pointer first-index)
    (allocate-aligned-single-float-vector)
  (dotimes (i 4)
    (setf (aref vector (+ first-index i)) (float i 1.0f0)))
  ;; Allocate another aligned vector to hold the eagerly awaited result.
  (multiple-value-bind (result-vector result-pointer result-first-index)
      (allocate-aligned-single-float-vector)
    ;; Now, we're ready to call vsqrtf. Oops, no we're not.
    ;; (#_vsqrtf result-pointer arg-pointer)
    ))
#_vsqrtf wants its argument to be passed in (and will return its
result in) a vector register (vN for AltiVec, xmmN for SSE2).
OpenMCL's FFI has no real concept of what this means.
For most foreign types, there's a corresponding lisp type, and (in
general) a foreign function call involves coercing the lisp
representations of integer/float/pointer arguments into raw (unboxed)
representations, then coercing the raw unboxed result back into a lisp
value. This generally also involves following conventions like
"pass the first N FP args in the first N FP registers" according
to the target ABI.
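For contrast, here is roughly what the ordinary scalar case looks like. This is a minimal sketch, assuming OpenMCL on Darwin (where C symbols carry a leading underscore) and assuming EXTERNAL-CALL takes alternating type/value arguments followed by a result type; C-SQRT is a hypothetical wrapper name. The FFI unboxes the DOUBLE-FLOAT argument, passes it in an FP register per the ABI, and boxes the raw result.

```lisp
;; Sketch only: guarded so it is skipped outside OpenMCL/CCL.
;; C-SQRT is a hypothetical name; "_sqrt" assumes Darwin symbol naming.
#+ccl
(defun c-sqrt (x)
  "Call the C library's sqrt on the DOUBLE-FLOAT X."
  (ccl::external-call "_sqrt" :double-float x :double-float))
```

Nothing comparable exists for a value that has to travel in a vector register, which is the gap described here.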
One of those ABI conventions involves how SIMD vector arguments
and results should be handled. There isn't really a corresponding
Lisp "SIMD vector" type, and there hasn't been an obvious candidate
on PPC32 because (a) lisp vectors aren't aligned stringently enough
and (b) the alignment of a lisp vector (relative to 128-bit alignment)
can change at any instruction boundary because of GC activity.
For 64-bit platforms, neither (a) nor (b) is a concern (all lisp
objects are 128-bit aligned and this never changes.) So, we -could-
(hypothetically) use real lisp vectors to encapsulate SIMD vectors
(sort of like the way that a lisp DOUBLE-FLOAT object encapsulates
a double-float value). Our call to #_vsqrtf might wind up something
like:
(let* ((argument-vector (make-array (+ 4 2) :element-type 'single-float)))
  ;; the vector will start with a 64-bit header; skip the first 2
  ;; 32-bit elements to wind up back on a 128-bit boundary
  (dotimes (i 4)
    (setf (aref argument-vector (+ 2 i)) (float i 1.0f0)))
  (let* ((result-vector (make-array (+ 4 2) :element-type 'single-float)))
    (external-call "_vsqrtf" :vector argument-vector (:vector result-vector))
    ;; The made-up syntax above is supposed to suggest that the result
    ;; is a SIMD vector that should be stored in the lisp object
    ;; RESULT-VECTOR. There might be other/better syntax for this.
    (dotimes (i 4)
      (format t "~& SQRT of ~s = ~s"
              (aref argument-vector (+ 2 i))
              (aref result-vector (+ 2 i))))))
-That- looks vaguely lisp-like (except for the slightly odd requirement
that the first 64 bits of data in the lisp object be ignored.)
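The "skip 2 elements" figure follows from the header size. A small portable sketch of that arithmetic, assuming (as stated above for the 64-bit case) that the object itself starts on a 128-bit boundary and has a 64-bit header; ELEMENTS-TO-SKIP is a hypothetical helper, not anything in OpenMCL:

```lisp
;; Hypothetical helper for illustration; not part of OpenMCL.
;; Assumes the object starts on an ALIGNMENT-byte boundary and that
;; HEADER-BYTES of header precede the first element.
(defun elements-to-skip (header-bytes element-bytes &optional (alignment 16))
  "Number of leading elements to ignore so the next one is aligned."
  (let ((padding (mod (- header-bytes) alignment)))
    ;; The padding must be a whole number of elements.
    (assert (zerop (mod padding element-bytes)))
    (/ padding element-bytes)))

;; 8-byte header, 4-byte SINGLE-FLOAT elements:
(elements-to-skip 8 4)
```

With those numbers the result is 2, matching the (+ 4 2) in the MAKE-ARRAY calls.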
I was never able to come up with an even vaguely lisp-like way of
integrating SIMD stuff with a 32-bit lisp; whenever the issue arose,
I generally suggested that it'd be better to wait for 64-bit ports
because of the alignment issues. I'm not sure what the priority on
this should be, but I do recognize that I no longer have that excuse.
On Tue, 29 Aug 2006, Phil wrote:
> I'm finding myself longingly looking at some of the capabilities in
> vecLib (a collection of math libraries which are accelerated when
> Altivec/SSE is available) but am thinking that the required setup
> (i.e. allocating and populating the structures then reading the
> results back into Lisp structures) would at best be a wash vs.
> straight Lisp code. Then I got to thinking that the overhead could
> be greatly minimized by implementing a limited-functionality Lisp
> vector/array type which used FFI-based memory allocation. Any
> thoughts/experiences re: attempting this or other approaches to using
> vecLib?