[Openmcl-devel] cross-compiling for ia32

Fri Nov 9 08:47:34 PST 2007

On Nov 9, 2007, at 1:43 AM, Gary Byers wrote:

> I don't know what Matt's init file looks like, but ...
>
> [~] gb at clozure> cat openmcl-init.lisp
> (ccl:set-development-environment)
> [~] gb at clozure>
>
> I'm not sure how much of the build process expects this, but some of
> the errors that you got suggest that some parts expect/assume that
> you're in the CCL package with redefinition warnings/cerrors disabled.

That's it exactly.

(ccl:set-development-environment) and (in-package "CCL") should get
you to the point where you can load the files to set up for cross-compiling.

Some other notes that might be useful:

After loading the x86-32 backend/compiler (such as they are at this  
point), you can use X8632-XCOMPILE-LAMBDA to cross-compile and  
disassemble lambdas, e.g.,

? (x8632-xcompile-lambda '(lambda (x) (+ x 2)))

  (movl ($ 0) (% fn))
  (cmpw ($ 4) (% imm0.w))                         ;[5]
  (je.pt L15)                                     ;[10]
  (uuo-error-wrong-number-of-args)                ;[13]
L15
  (pushl (% ebp))
  (movl (% esp) (% ebp))                          ;[16]
  (pushl (% arg_z))                               ;[18]
  (movl (@ -4 (% ebp)) (% arg_z))                 ;[19]
  (testb ($ 3) (% arg_z.b))                       ;[22]
  (jne L54)                                       ;[25]
  (addl ($ 8) (% arg_z))                          ;[27]
  (jno.pt L76)                                    ;[30]
  (nop)                                           ;[33]
  (nop)                                           ;[37]
  (calll (@ .SPFIX-OVERFLOW))                     ;[40]
  (movl ($ 0) (% fn))                             ;[47]
  (jmp L76)                                       ;[52]
L54
  (movl ($ 8) (% arg_y))
  (nop)                                           ;[59]
  (nop)                                           ;[62]
  (calll (@ .SPBUILTIN-PLUS))                     ;[64]
  (movl ($ 0) (% fn))                             ;[71]
L76
  (leavel)
  (retl)                                          ;[77]
#<XFUNCTION  #x300040EC350D>

You're probably wondering what the deal is with the (movl ($ 0) (%  
fn)) instructions, and with the nop instructions before calls.

On the x86, the CALL instruction pushes the return address (the
address of the instruction after the call) on the stack; in general
this address can be arbitrarily tagged.  In OpenMCL, the GC can
potentially run at any instruction boundary, so we have to be careful
never to leave anything on the stack that would confuse the GC.

We therefore "tail-align" all the call instructions so that the
address of the following instruction is tagged with x8632::fulltag-tra
(#b101).  (tra means tagged return address.)  If the GC sees a (movl
($ <something>) (% fn)) at a tagged return address, it will know that
it can use the <something> as a function address.
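
The padding arithmetic behind those nops can be sketched in portable
Common Lisp. This is only an illustration: *TRA-TAG*, TRA-PADDING and
RETURN-ADDRESS-TAG are made-up names, not anything in the OpenMCL
sources, and the sketch assumes that offsets are measured from an
8-byte-aligned function base, so that the low three bits of an offset
are also the low three bits of the address.

```lisp
;; Hypothetical helpers, not part of OpenMCL.
;; Assumption: offsets are relative to an 8-byte-aligned base.

(defparameter *tra-tag* #b101
  "Low three bits wanted in every return address (fulltag-tra).")

(defun tra-padding (offset call-length)
  "Bytes of nop padding needed before a call of CALL-LENGTH bytes
that would otherwise start at OFFSET, so that the address just after
the call has *TRA-TAG* in its low three bits."
  (mod (- *tra-tag* (+ offset call-length)) 8))

(defun return-address-tag (call-offset call-length)
  "Low three bits of the return address of a call at CALL-OFFSET."
  (logand (+ call-offset call-length) 7))
```

Checked against the byte dump later in this message: the two 7-byte
calls sit at #x2E and #x46, and (return-address-tag #x2E 7) and
(return-address-tag #x46 7) both give 5. Likewise (tra-padding #x27 7)
gives 7 and (tra-padding #x41 7) gives 5, which matches the 7 and 5
bytes of nops that precede those calls.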

The notation we use for this in LAP is actually (movl ($ :self) (%  
fn)), which perhaps makes the intention a little clearer.  Before a  
function can actually run, we have to fix up all these self-references  
to have the actual address of the function in them.

To do this, we have the compiler generate a table of self-reference
offsets which it places at the end of the function, before the gc-able
constants.  We then use the table to tell where in the code we have to  
fill in the actual function address, and the GC can use the table to  
do the same when it wants to move the function elsewhere in memory.
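
As a rough sketch of that fix-up step, here is what filling in the
self-references could look like on a plain byte vector. The function
names, the toy code bytes and the address are all invented for
illustration; the real work happens inside OpenMCL on the function
object itself.

```lisp
;; Illustrative only: STORE-U32LE and FIX-UP-SELF-REFERENCES are
;; hypothetical names, not OpenMCL functions.

(defun store-u32le (bytes index value)
  "Write VALUE as a 32-bit little-endian word at INDEX in BYTES."
  (dotimes (k 4 bytes)
    (setf (aref bytes (+ index k)) (ldb (byte 8 (* 8 k)) value))))

(defun fix-up-self-references (bytes offsets address)
  "Store ADDRESS at each offset listed in OFFSETS; return BYTES."
  (dolist (offset offsets bytes)
    (store-u32le bytes offset address)))

;; Toy code vector: two 5-byte instructions (opcode #xBF followed by a
;; 4-byte immediate) with two unrelated bytes between them.
(defparameter *toy-code*
  (make-array 12 :element-type '(unsigned-byte 8)
                 :initial-contents '(#xBF 0 0 0 0 #xC9 #xC3 #xBF 0 0 0 0)))

;; Each immediate starts one byte after its opcode, at offsets 1 and 8.
;; The address is an arbitrary made-up value.
(fix-up-self-references *toy-code* '(1 8) #x0804A006)
```

After the call, both immediates hold the bytes 06 A0 04 08. Moving the
function would amount to running the same loop again with the new
address.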

If you look at x862-compile, it might be possible to follow this.   
Uncomment the show-frag-bytes form to see a dump of the raw bytes that  
are going to get made into a function.  For the function above, you'll  
see:

frag at #x4
19 00 BF 00 00 00 00 66 81 F8 04 00 3E 74 02
^^^^^--- this is the 16-bit count of 32-bit immediate
words in the function.
A function pointer is tagged with fulltag-misc (= 6.), so there
are two bytes between the header and the entry point.

frag at #x13
CD C2 55 89 E5 53 8B 5D FC F6 C3 03 75 1B
frag at #x21
83 C3 08 3E 71 2B
frag at #x27
66 66 66 90 66 66 90
frag at #x2E
FF 14 25 28 52 00 00
frag at #x35
BF 00 00 00 00 EB 16
frag at #x3C
BE 08 00 00 00 66 66 90 66 90
frag at #x46
FF 14 25 8C 51 00 00
frag at #x4D
BF 00 00 00 00
frag at #x52
C9 C3
frag at #x54
00 00 00 00 07 00 00 00 36 00 00 00 4E 00 00 00
FF 00 00 00 00 00 00 00
^^---- this is at byte offset
(+ untagged function pointer (* imm-word-count 4))

The table of self-reference offsets precedes it,
marked at the "end" by #x00000000.  We need to write the
actual address of the function at offsets #x7, #x36 and #x4e.
This table could obviously be encoded a bit more compactly...

The #x000000ff is a "function-boundary-marker", used by the gc.
The last (:long 0) is the end of the function.

If there were any constant refs, they'd precede the last (:long 0).
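
To make the layout concrete, the following portable Common Lisp sketch
rebuilds the dump above as one byte vector and recovers the
self-reference offsets from it. It is one reading of the dump, not the
code OpenMCL uses: the function names are invented, the four bytes in
front of the first frag are filled with zeros because the dump does
not show them, and scanning backward from the boundary marker until a
zero word turns up is an assumption based on the description above.

```lisp
;; Hypothetical helpers for reading the dump; none of these names
;; come from the OpenMCL sources.

(defun hex-bytes (&rest strings)
  "Parse strings of two-digit hex bytes, separated by single spaces,
into one byte vector."
  (let ((bytes '()))
    (dolist (string strings)
      (loop for pos from 0 below (length string) by 3
            do (push (parse-integer string :start pos :end (+ pos 2)
                                           :radix 16)
                     bytes)))
    (coerce (nreverse bytes) '(vector (unsigned-byte 8)))))

(defparameter *function-bytes*
  (hex-bytes
   "00 00 00 00"   ; offsets 0-3: not in the dump, zero placeholder
   "19 00 BF 00 00 00 00 66 81 F8 04 00 3E 74 02"      ; frag at #x4
   "CD C2 55 89 E5 53 8B 5D FC F6 C3 03 75 1B"         ; frag at #x13
   "83 C3 08 3E 71 2B"                                 ; frag at #x21
   "66 66 66 90 66 66 90"                              ; frag at #x27
   "FF 14 25 28 52 00 00"                              ; frag at #x2E
   "BF 00 00 00 00 EB 16"                              ; frag at #x35
   "BE 08 00 00 00 66 66 90 66 90"                     ; frag at #x3C
   "FF 14 25 8C 51 00 00"                              ; frag at #x46
   "BF 00 00 00 00"                                    ; frag at #x4D
   "C9 C3"                                             ; frag at #x52
   "00 00 00 00 07 00 00 00 36 00 00 00 4E 00 00 00"   ; frag at #x54
   "FF 00 00 00 00 00 00 00"))

(defun u16le (bytes index)
  "Read a 16-bit little-endian value at INDEX."
  (logior (aref bytes index) (ash (aref bytes (1+ index)) 8)))

(defun u32le (bytes index)
  "Read a 32-bit little-endian value at INDEX."
  (logior (u16le bytes index) (ash (u16le bytes (+ index 2)) 16)))

(defun boundary-marker-offset (bytes)
  "Offset of the function-boundary-marker: 4 times the 16-bit
immediate word count stored at offset 4."
  (* 4 (u16le bytes 4)))

(defun self-reference-offsets (bytes)
  "Walk backward from the boundary marker, collecting 32-bit offsets
until the zero word that terminates the table."
  (loop with offsets = '()
        for pos from (- (boundary-marker-offset bytes) 4) downto 0 by 4
        for offset = (u32le bytes pos)
        until (zerop offset)
        do (push offset offsets)
        finally (return offsets)))
```

With these definitions, (boundary-marker-offset *function-bytes*) is
#x64, the word stored there is #xFF, and (self-reference-offsets
*function-bytes*) returns (7 54 78), that is #x7, #x36 and #x4E. The
byte just before each of those offsets is #xBF, the opcode of the
(movl ($ 0) (% fn)) instructions in the dump.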