[Openmcl-devel] getting CCL ready for CS graphics courses

Gary Byers gb at clozure.com
Thu Jan 8 18:12:07 PST 2009



On Thu, 8 Jan 2009, Alexander Repenning wrote:

> good call. On a 1.66 GHz PPC
>
> (#_glVertex3f a b c) takes about 3 us
>
> (defun glVertex3f (a b c)
> (#_glVertex3f a b c))
>
> (glVertex3f a b c) with glVertex3f not inline about 4 us
>
> (glVertex3f a b c) with inline is back to 3 us
>
> Oddly, my much faster Intel Mac runs this slower

A few weeks ago - when talking about nanosecond timings - I wrote
what I remember to be a long explanation of why there's a lot
more overhead in a Darwinx8664 foreign function call than there
is on other platforms.  The short version is any one of:

- "Mach sucks, but no one understands how"
- "Those iPhones sure are shiny !"
- something more pithy and clever that I can't think of at the moment.

A slightly longer version is that OSX on x8664 doesn't allow a user-mode
process to use an otherwise-unused segment register for its own
purposes (e.g., keeping track of thread-local data.)  Linux, FreeBSD,
and Solaris all provide this functionality; Win64 doesn't ("Windows
sucks, and everyone understands how.")  Because of this deficiency,
the choices are basically:

- keep lisp thread data in a general-purpose register, which might
negatively affect the performance of lots of things.
- "share" the segment register that the OS uses for C thread data,
switching it between C data and Lisp data on foreign function calls
and callbacks (and therefore slowing down foreign function calls
and callbacks by the cost of 2 system calls.)

I'd rather not do either of these things.  If one assumes that
many foreign function calls are themselves fairly expensive
operations, then adding additional overhead to foreign function
calls seemed more attractive (er, um, "less unattractive") than
adding some (variable) amount of overhead to lots of lisp functions.
It's true that some foreign function calls don't do much of anything,
and the syscall overhead on foreign function calls might dwarf
the actual cost of the operation.  (This is likely true of
#_glVertex3f: if eliminating function call overhead yields visible
results, then it seems likely that #_glVertex3f isn't "computationally
expensive" in any sense.)
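
For reference, the inlined wrapper being timed above would presumably
look something like the sketch below.  The DECLAIM has to come before
the DEFUN so that the compiler keeps the definition around for
inlining.  The sketch assumes that the OpenGL interface definitions are
already available (so that #_glVertex3f can be read) and that A, B and
C are single-floats.

```lisp
;; Sketch only: assumes the OpenGL interfaces are already loaded in CCL,
;; so that the #_ reader macro can resolve glVertex3f.
(declaim (inline glVertex3f))   ; place before the DEFUN so callers can inline it

(defun glVertex3f (a b c)
  ;; a, b and c are assumed to be single-floats
  (#_glVertex3f a b c))
```

Inlining only removes the lisp-level call to the wrapper; the cost of
the foreign function call itself is unchanged.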

One could reasonably argue that it would have been better to make
the other unattractive choice (this was done on win64, where it
wasn't clear that there was even something as ugly as the Darwin
hack.)  That might be correct (I'm a little skeptical, and the
win64 machine that I use is slower than other machines, making
comparisons difficult; I don't know.)  In any case, the fact
that foreign function calls have more overhead on x8664
Darwin than they do on other platforms isn't some Big Unsolved
Mystery; it has something to do with the fact that the OS is
effectively leaving us a register short (and with what I thought
was the best way to deal with that.)

>
>
> That leaves the (unfortunately much harder) problem of finding a way to
> generate the defun from the framework definition.
>
> Alex
>
>
>
>
> On Jan 8, 2009, at 10:48 AM, Stas Boukarev wrote:
>
>> On Thu, Jan 8, 2009 at 8:33 PM, Alexander Repenning
>> <ralex at cs.colorado.edu> wrote:
>>> 1) define a function for each foreign function, e.g.,
>>> 
>>> (defun  glColor3f (R G B)
>>> (#_glColor3f R G B))
>>> 
>>> - not good: function call overhead
>> 
>> Why not declare that function inline?
>> (http://www.lispworks.com/documentation/HyperSpec/Body/d_inline.htm)
>> 
>> -- 
>> With Best Regards, Stas.
>> 
>
> Prof. Alexander Repenning
>
> University of Colorado
> Computer Science Department
> Boulder, CO 80309-430
>
> vCard: http://www.cs.colorado.edu/~ralex/AlexanderRepenning.vcf
>
>


