[Openmcl-devel] Floating point performance
gb at clozure.com
Sat Jan 30 21:32:54 UTC 2010
Hardware support for SQRT is an optional PPC feature; most of the PPC
CPUs manufactured by Motorola and used in Macs didn't offer it, but the
G5 (manufactured by IBM) did, as did some other IBM workstations.
At some point (when the G5 and its potential successors looked like
the wave of the future ...), CCL started assuming that the hardware
support was present and emulating it if it wasn't. I'm not sure that
was as attractive an idea as it first seemed: it was based on the
assumption that it'd make it easier to someday open-code SQRT on
floats, but since the hardware support is only defined on
non-negative arguments, it'd be hard to do that open-coding anyway.
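To make the open-coding problem concrete, here is a rough sketch in portable Common Lisp. It assumes the difficulty is the usual one: SQRT of a negative real has to return a complex number, so even with a declared float argument the compiler can't emit a bare hardware instruction without a sign test in front of it. SAFE-DOUBLE-SQRT is a made-up name for illustration, not anything in CCL.

```lisp
;; Hypothetical sketch; SAFE-DOUBLE-SQRT is an illustrative name,
;; not a CCL function.
(defun safe-double-sqrt (x)
  (declare (type double-float x))
  (if (>= x 0d0)
      ;; Non-negative: the result is a DOUBLE-FLOAT, so this is the
      ;; branch a compiler could map onto a hardware square-root
      ;; instruction.
      (the double-float (sqrt x))
      ;; Negative (or NaN): fall back to the generic SQRT, which
      ;; returns a COMPLEX for negative reals.
      (sqrt x)))
```

With this, (safe-double-sqrt 4d0) takes the float-only branch, while (safe-double-sqrt -4d0) falls through to the generic path and yields a complex.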
The optional PPC SQRT instructions are only used in a couple of
LAP functions, and the benefit of using them is fairly limited
(saves a foreign function call.)
Paul's test case seems to be "bad, but not extremely bad" in 32-bit
CCL running on a G5, and Paul's seeing extremely bad results on a
machine where those SQRT instructions are being emulated
(1000000 times ...).
Like I said, it seemed like a good idea at the time.
On Sat, 30 Jan 2010, Ron Garret wrote:
> On Jan 30, 2010, at 11:18 AM, Paul Onions wrote:
>> On 30 Jan 2010, at 16:28, Paul Onions wrote:
>>> BTW I'm running 1.5-dev-r13391M-trunk (DarwinPPC32)
>> Seems that it must be related either to PPC-ness or 32bit-ness,
>> because running the test functions on a 1.5-dev-r13281M-trunk
>> (LinuxX8664) system gave good results (almost identical timing for the
>> two functions).
> This is probably related to the fact that some PPCs don't have hardware SQRT. My guess would be that your PPC actually does have hardware SQRT, and CMU is using it but CCL isn't.