[Openmcl-devel] Why does this "cheat"/"lie" not work?...

Sat Feb 6 11:04:39 PST 2010

Jon S. Anthony wrote:
> On Sat, 2010-02-06 at 16:20 +0100, Waldek Hebisch wrote:
> 
> > AFAICS on 64 bit machine (UNSIGNED-BYTE 32) times (UNSIGNED-BYTE 32)
> > multiplication is done via .SPBUILTIN-TIMES and code using
> > (UNSIGNED-BYTE 32) is doing a lot of work converting between fixnum
> > format and (UNSIGNED-BYTE 32).  For example body of inner loop of
> > 32 bit copy generates
> 
> A couple things.  Maybe I'm lost, but if you are just copying (like in
> my original example code), why is there any multiplication going on with
> the elements being copied?  Or any multiplication at all?
> 

I have several routines.  Some use multiplication, and for them
sbcl generates inline code while Clozure CL calls a builtin.
The disassembly I presented was of a different routine, namely a
simple copy.  My copy routine does no multiplication, but the generated
machine code contains a completely useless multiplication (as I
wrote, this multiplication is used to convert a native 32 bit number
to a fixnum, and the fixnum is immediately converted back to
a native number).
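
A toy model of that round trip, written out in Lisp.  This is only a
sketch: it assumes a 3 bit fixnum tag (so tagging is a multiplication
by 8), and the names TAG-AS-FIXNUM and UNTAG-FIXNUM are made up for
illustration.  The real conversion is emitted by the compiler as
machine instructions, not as function calls.

```lisp
;; Assumption: fixnums carry 3 low tag bits, all zero.
(defparameter *fixnum-shift* 3)

(defun tag-as-fixnum (native)
  "Model of native -> fixnum: multiply by 2^3."
  (* native (expt 2 *fixnum-shift*)))

(defun untag-fixnum (tagged)
  "Model of fixnum -> native: arithmetic shift right by 3."
  (ash tagged (- *fixnum-shift*)))

;; In a plain copy the value is tagged and then untagged at once,
;; so both steps cancel out and are pure overhead.
(defun round-trip (native)
  (untag-fixnum (tag-as-fixnum native)))
```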

> Also, again for the original example, on an X86-64, the type spec of the
> elements being copied would be (unsigned-byte 64), i.e., the natural
> word size, not 32.
>

Well, you would use (unsigned-byte 64), but I really have an array of
32 bit numbers (no cheating) and I want type-safe code.  And the copy
may start and/or end at an odd position.
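
For concreteness, a copy of this kind could look roughly as follows.
This is a minimal sketch, not the routine that was disassembled: the
name COPY-U32 and its argument list are hypothetical, and whether a
given compiler keeps the elements in native form is exactly the
question under discussion.

```lisp
;; Hypothetical sketch: copy COUNT elements between two
;; (unsigned-byte 32) vectors, starting at arbitrary offsets.
(defun copy-u32 (src src-start dst dst-start count)
  (declare (type (simple-array (unsigned-byte 32) (*)) src dst)
           (type fixnum src-start dst-start count)
           ;; safety 1 so that the code stays type safe
           (optimize (speed 3) (safety 1)))
  (dotimes (i count dst)
    (setf (aref dst (+ dst-start i))
          (aref src (+ src-start i)))))
```

The start offsets may be odd, which is why the array cannot simply be
treated as an array of (unsigned-byte 64).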

> So, if you are moving 32 bit chunks on a 64 bit machine, you aren't just
> "loading and storing registers" (more or less), so maybe that is where
> the extra stuff here is coming from?
>

No.  Actually 32 bit loads and stores are native.  The extra stuff
comes because Clozure CL insists on converting native numbers to
fixnums and back.  To tell the truth, this seems to be a common problem
of all implementations that I use -- only after several trials
did I find that using (unsigned-byte 32) times (unsigned-byte 32)
with (unsigned-byte 64) for the result gets me inline multiplication
in sbcl.  Fixnum multiplication was also inline, but it
wastes 3 bits, requires double storage and generates extra shifts.
ECL keeps numbers in native form, but when they are bigger than
a fixnum it conses bignums (even if the declared type fits into a machine
word).  And for "nontrivial" operations ECL converts between
native integers and Lisp form, so there are a lot of shifts
and function calls seriously degrading performance.
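
The combination of declarations mentioned above for sbcl can be
sketched like this.  The name MUL-U32 is made up for the example, and
whether the multiplication is actually inlined depends on the
implementation and version, so check the output of DISASSEMBLE rather
than taking it on trust.

```lisp
;; Hypothetical sketch: both arguments declared (unsigned-byte 32),
;; result declared (unsigned-byte 64), which always holds because
;; (2^32 - 1)^2 < 2^64.
(defun mul-u32 (a b)
  (declare (type (unsigned-byte 32) a b)
           (optimize (speed 3) (safety 1)))
  (the (unsigned-byte 64) (* a b)))

;; To inspect the generated code: (disassemble 'mul-u32)
```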

-- 
                              Waldek Hebisch
hebisch at math.uni.wroc.pl