[Openmcl-devel] Floating point performance

Sun Jan 31 02:26:56 PST 2010

LAP is an acronym for "Lisp Assembly Program"; it refers to a way of
writing functions in assembly language in a lisp system.  It's
mentioned in the "Lisp 1.5 Programmer's Manual", which was published
in 1962.  Sheesh.  These kids ...

A few LAP functions in "ccl:level-0;PPC;ppc-float.lisp" (defined via
DEFPPCLAPFUNCTION) use the FSQRT or FSQRTS instructions.
(On x86, any processor that can run CCL - that supports SSE2 extensions -
can do SQRT in hardware, so there are x86 LAP functions in "ccl:level-0;X86;"
that implement SQRT on non-negative floats.)

There are several examples of wrapper functions around math library calls
in "ccl:level-1;l1-numbers.lisp".  Most of these functions come in three
flavors:

  1) a DOUBLE-FLOAT version that stores the result in a lisp DOUBLE-FLOAT
     object allocated by the caller.  The names of these destructive functions
     end in !.
  2) a SINGLE-FLOAT version - conditionalized for 32-bit targets - that
     behaves the same way as (1).
  3) a SINGLE-FLOAT version for 64-bit targets that returns an immediate
     SINGLE-FLOAT object.

All of these functions jump through some hoops to try to detect arithmetic
exceptions that might occur during execution of math library functions.
(Those functions generally expect to run with FP exceptions masked/disabled,
so the hoop-jumping generally involves trying to detect whether an exception
that would have been unmasked/enabled in lisp code occurred in the library
function.)

If we just call the foreign function and don't check for this:

? (#_sqrt -1.0d0)
1D+-0 #| not-a-number |#

we'll get well-defined but unexpected results.

So, you generally want to remove the PPC LAP functions from "ppc-float.lisp"
and add some PPC-conditional wrapper functions to l1-numbers.lisp; the
wrapper functions around #_sqrt and #_sqrtf should have the same general
structure as the wrappers around other unary math library functions (like
#_sin/#_sinf).

I'll probably do this in a few days; whatever the slight benefit of
using the optional SQRT instructions might be, it's clear that it's
not worth the performance hit of emulation, resulting confusion, and
subsequent email.

On Sun, 31 Jan 2010, Paul Onions wrote:

>
> On 31 Jan 2010, at 00:28, Gary Byers wrote:
>>
>> On Sat, 30 Jan 2010, Ron Garret wrote:
>>>
>>> You could just compile a sqrt function with gcc and call it through
>>> the FFI.
>>>
>>
>> Or exploit the fact that the authors of the OS's math libraries have
>> already defined #_sqrt and #_sqrtf and call those functions.  That's
>> pretty straightforward; dealing with FP exceptions in math library
>> calls is a bit less so.
>>
>
> Yes, I had the same thought last night, so I went off and started
> reading the documentation on .cdb files and the FFI (something I've
> been meaning to do for ages). Looks like very impressive stuff.
> Anyway, this morning I tried
>
> (defun test-fun-3 ()
>   (loop repeat 1000000
>         for x = 1.0 then (+ x 0.01)
>         sum (#_sqrt (float x 1.0d0))))
>
> (defun test-fun-4 ()
>   (loop repeat 1000000
>         for x = 1.0 then (+ x 0.01)
>         sum (#_sqrtf x)))
>
> which took 1.5s and 1s, respectively. Much more reasonable!
>
> So, the final step is find out where in CCL's codebase the SQRT trap
> is used, and make sure I can work around those areas. In an earlier
> email you mentioned that it is only used in "a couple of LAP
> functions". May I ask, what's a "LAP function"? As you might have
> guessed I am not familiar with CCL internals.
>
> Thanks for all your help,
> Paul