[Openmcl-devel] Thoughts? OpenMCL on ARM processors?

Tue Oct 30 18:05:31 PDT 2007

On Tue, 30 Oct 2007, Phil wrote:

> In asking the question, why not go a step further:  has any thought
> been given to using something along the lines of LLVM as a potential
> future platform?  Would this even make sense for Clozure CL at some
> level?  I ask because I got very interested in this topic after
> reading about how much it looks like Apple is beginning to do with
> this in the excellent ARS Leopard review.
>
> Thanks,
> Phil
>

I should look at it more closely than I have, but there are issues that
make me skeptical about the use of portable compiler infrastructure
toolkits.  (I'll admit to not knowing whether that skepticism is
fully warranted in LLVM's case or not.)

Because of the way that OpenMCL's GC works and the way that native
threads interact with the GC, there are constraints on what compiled
code can and cannot do; some of these constraints and issues are
described in chapter 13 of the manual.  I think of these things as
being pervasive, and think that it's desirable for a compiler to be
aware of these issues "early".  For instance, suppose that we have:

(defun scan-for-17 (a)   ; function name is just for illustration
  (declare (type (simple-array (unsigned-byte 8) (256)) a))
  (dotimes (i 256)
    (when (eql (aref a i) 17) (print "finally found it!"))))

In a lot of environments, there might be a reasonable goal of
reducing that to a "portable" representation like:

   (load-byte r1 r2) ; r1 <- byte contents of *r2
   (add-constant 1 r2) ; increment r2
   (compare-byte r1 17)

and the ARM backend might want to turn that into

   (ldbia r1 r2) ; made-up mnemonic for "load byte, increment after",
                 ; i.e. postincrement addressing; in real ARM assembler
                 ; syntax this would be:  ldrb r1, [r2], #1
   (cmps ...)

and the x86-64 backend might do

   (cmpb ($ 17) (@ (% r2)))  ; compare the byte that r2 points at with 17
   (lea (@ 1 (% r2)) (% r2)) ; post-increment, don't modify condition code

Great, but the whole concept of iterating through an array that
way - advancing a pointer so that it points -into- the array, rather
than -at- the tagged array - violates GC conventions.  There -are-
ways to produce better code for this loop than OpenMCL does and still
adhere to those conventions - most of those ways have to do with
how "I" is handled in the loop - and much of this could be done
early (and a little more in a machine-dependent backend.)
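
To make the point about how "I" is handled a bit more concrete, here
is a minimal sketch in portable Common Lisp.  It is not OpenMCL's
actual output or anything from its compiler; the function name
FIND-BYTE-17 is hypothetical.  The idea is that the loop keeps only
the tagged array reference and a fixnum index live, so there is never
a pointer -into- the array; the declarations just give a compiler
what it would need to open-code the access.

```lisp
;; Hypothetical helper, for illustration only.
(defun find-byte-17 (a)
  "Return the index of the first element of A equal to 17, or NIL."
  (declare (type (simple-array (unsigned-byte 8) (256)) a)
           (optimize speed))
  ;; Only A (a tagged array reference) and I (a fixnum) are live in
  ;; the loop; each access is expressed in terms of those two values,
  ;; never as a raw pointer that has been advanced past the header.
  (dotimes (i 256 nil)
    (declare (fixnum i))
    (when (eql (aref a i) 17)
      (return i))))

;; Usage: build a 256-byte vector, plant a 17, and search for it.
(let ((a (make-array 256 :element-type '(unsigned-byte 8)
                         :initial-element 0)))
  (setf (aref a 200) 17)
  (find-byte-17 a))
```

Whether a given backend then turns the index into a scaled or
offset addressing mode is a machine-dependent matter; the point is
only that the array and the index stay separate values.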

I'm not familiar enough with LLVM to know how easy it'd be to teach
it that it isn't just compiling some flavor of C and that it has
to be aware of these issues/constraints - to some degree - in its
earlier, portable phases.  I haven't seen examples of its use
in targeting an environment like OpenMCL's, and I'm skeptical about
how possible/practical that would be.

(If anyone who knows much about LLVM's architecture can address
that, that might be a more interesting answer than my expression
of skepticism is.)