[Openmcl-devel] Allocate heap and call C question

Mon Jul 5 19:26:50 PDT 2004

On Mon, 5 Jul 2004, Andrew P. Lentvorski, Jr. wrote:

>
> On Jul 5, 2004, at 4:43 AM, Gary Byers wrote:
>
> > Since there's no reliable way to pass a lisp object to foreign code,
> > you generally have to do something like what you're trying to do by
> > allocating a foreign array, copying the lisp array's contents to that
> > foreign array, passing the foreign array to foreign code, and possibly
> > copying the foreign array's elements back to the lisp array.  (The
> > first few of these steps are basically what things like WITH-CSTRS do.)
>
> So, if I understand what you are saying, there is no way to mark a
> specific Lisp object such that the garbage collector will not screw it
> up.  Personally, I think that is a significant hole, but now that I
> know that is the case, I can work around it.

Well, there's certainly no way to allocate a lisp object under the
assumption that it can be fully relocated by the GC and then casually
violate that assumption.  (Note that we're saying basically the same
thing, but I'm phrasing it differently.)  Some implementations offer
the ability to pass pointers to lisp objects to foreign code, with
the caveat that that code not call back to lisp (and possibly trigger
a GC).  Since OpenMCL threads are preemptively scheduled, it's much
harder to guarantee that a GC doesn't happen while foreign code is
pointing at a relocatable lisp object, and the only way that I can
think of to support this is to offer a stronger guarantee: that no
GC can happen in any thread while such a pointer is in use.

There -is- a (currently) undocumented way of allocating certain types
of lisp objects outside of GC-managed memory.  Lisp objects are
generally "first-class" in that they exist until the GC can prove that
it's impossible to reference them; something allocated outside the
control of the GC exists until it's explicitly deallocated (and the
consequences of referring to something after it's been deallocated
are ... not first-class either.)

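(defun make-heap-ivector (element-count element-type)
   ;; Map the element type to its vector subtag and make sure that it
   ;; denotes an ivector (a vector whose elements are immediate data,
   ;; not pointers to other lisp objects).
   (let* ((subtag (ccl::element-type-subtype element-type)))
     (unless (= (logand subtag target::fulltagmask)
                target::fulltag-immheader)
       (error "~s is not an ivector subtype." element-type))
     ;; Compute the size of the vector's data and allocate the vector
     ;; in foreign (malloc'ed) memory.
     (let* ((size-in-bytes (ccl::subtag-bytes subtag element-count)))
       (ccl::%make-heap-ivector subtag size-in-bytes element-count))))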

Calling

? (make-heap-ivector 4 '(unsigned-byte 32))

will return two values:
a) a (SIMPLE-ARRAY (UNSIGNED-BYTE 32) (4)), whose contents are random
b) a MACPTR which points to the first byte of data in that vector

The array's allocated in foreign memory and its contents are immediate
(not pointers to lisp objects); the GC will never move it and will have
no reason to pay any attention to it.  You can otherwise treat the
array as if it was a "real" array, and can pass the MACPTR to foreign
code; the array and the malloc'ed foreign memory that the MACPTR points
to will continue to be there until you explicitly call:

? (ccl::%dispose-heap-ivector array)

after which the results of referring to the array are undefined (and
almost certainly unpleasant.)

>
> So, I would like to allocate a foreign array on the heap such that the
> Lisp GC won't touch it and get back a macptr.  The only way that I know
> of to do this is to call malloc as an external-call like so:
>
> ? (setq am (external-call "_malloc" :unsigned-int 16 :address))
> #<A Mac Pointer #x101C60>
>
> Is there a better/different/more Lisp-y way of doing this?  If not,
> fine.  However, I would hate to reinvent the wheel if something already
> exists.

There's a CCL::MALLOC function that basically does the same thing; it
only exists because it's necessary to allocate memory very early in
the bootstrapping process (before shared libraries and things like
EXTERNAL work.)

There's no particular reason to prefer it to calling #_malloc directly.

If the size of the foreign array is a constant, MAKE-RECORD can be
used to allocate and zero the array:

? (make-record (:array :unsigned-int 4))

>
> > Note that the GC would be disabled for all threads if any thread was
> > inside a WITH-POINTERS-TO-LISP-OBJECTS, and this can be undesirable.
> > (Whether it is undesirable or not depends on the application, and
> > the developer of that application should have enough rope to hang
> > themselves ...)
>
> Is there a particular reason why individual objects cannot be
> allocated/marked such that the GC does not tinker with them?  This
> question probably demonstrates my massive ignorance of Lisp
> implementation.

OpenMCL's GC is fully compacting: after it runs, all live objects
are contiguous in the heap and all free space is also contiguous.
This is intended to minimize memory fragmentation, improve locality,
and minimize VM requirements; it can't both support the notion of
full compaction and allow some objects to be nailed down in the
middle of an address range that the GC wants to efficiently compact
in-place.
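
To make the conflict concrete, here is a toy sketch in C of sliding
compaction over an array of slots.  It is not OpenMCL's GC; the names
(struct slot, compact) and the one-object-per-slot model are assumptions
made purely for illustration.  The point is that every live object may
end up at a new address, which is exactly what an object "nailed down"
in the middle of the range would prevent.

```c
#include <stddef.h>

/* Toy model of a heap: each slot holds one object, live or dead. */
struct slot {
    int live;    /* nonzero if the object is still reachable */
    int value;   /* stand-in for the object's contents */
};

/* Slide every live object toward index 0, preserving order.
   Returns the number of live objects; every slot at or beyond
   that index is free afterwards. */
size_t compact(struct slot *heap, size_t n)
{
    size_t dst = 0;
    size_t src, i;

    for (src = 0; src < n; src++) {
        if (heap[src].live) {
            heap[dst] = heap[src];   /* the object's address may change here */
            dst++;
        }
    }
    for (i = dst; i < n; i++)
        heap[i].live = 0;            /* the tail is one contiguous free area */
    return dst;
}
```

If foreign code were holding the address of a slot while compact() ran,
that address could afterwards name a different object or free space.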

>
> To a certain extent, I'm not overly concerned with speed as long as
> there exists *some* mechanism for getting back and forth to C even if
> it is slow.  Simply using an external API *at all* is currently
> problematic in most Lisps without writing lots of wrappers, recompiling
> C code, recompiling the Lisp code, etc.  My goal is just to get to a
> level that allows the use of something along the lines of the Python
> ctypes module.  Sure, you can shoot yourself in both feet, chop off
> your hands, and hang yourself simultaneously; however, you can also get
> lots of work done before committing to the effort required to do an
> official wrapper of an API.
>
> > FWIW, in the code in your message you were calling %GET-UNSIGNED-LONG
> > (and SETF thereof) on successive byte indices; you'd typically want
> > to copy 32-bit words that are 4 bytes apart:
>
> Ah.  That probably needs to be made more explicit in the documentation.
>   It certainly makes the offset field of dubious utility.  C programmers
> expect that adding an offset to a pointer-to-X automatically handles
> the appropriate arithmetic (ie. if int *p; p = 0x54679800; then p+1 ==
> 0x54679804)

If P is guaranteed to be naturally aligned, then using a scaled offset
would perhaps be simpler for people with those expectations.

If P isn't guaranteed to be naturally aligned, then using a byte offset
is mandatory.

P is not guaranteed (or required) to be naturally aligned.

Having this flexibility at the lowest level is probably desirable;
having a slightly higher-level thing (some sort of FOREIGN-AREF, or an
extension to the current PREF to better handle arrays) that made
assumptions about alignment and scaled the index for you would
probably be desirable, too.
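
As a rough illustration of the two layers, here is a sketch in C.  The
function names are hypothetical (they are not part of OpenMCL or of any
library): get_signed_long takes a byte offset and makes no alignment
assumption, and foreign_aref_s32 is the sort of higher-level accessor
that scales an element index for you.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Low-level accessor: read a 32-bit value at a BYTE offset.
   memcpy is used so that the read is valid even when
   base + byte_offset is not 4-byte aligned. */
int32_t get_signed_long(const void *base, size_t byte_offset)
{
    int32_t value;
    memcpy(&value, (const unsigned char *)base + byte_offset, sizeof value);
    return value;
}

/* Higher-level accessor (hypothetical): takes an ELEMENT index and
   scales it by the element size, the way C pointer arithmetic on an
   int32_t * would. */
int32_t foreign_aref_s32(const void *base, size_t index)
{
    return get_signed_long(base, index * sizeof(int32_t));
}
```

With an array of four 32-bit values, get_signed_long(p, 4) and
foreign_aref_s32(p, 1) both read the second element.  Only the
byte-offset form can read a 32-bit field that starts at, say, byte 1
of a packed buffer.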

>
> So, what I have now is:
>
> Welcome to OpenMCL Version (Beta: Darwin) 0.14.2-040506!
> ? ;; Set the memory elements from an array
> (defun set-mem (m a)
>    (dotimes (i (length a))
>        (setf (%get-signed-long m (* i 4)) (aref a i))))
> SET-MEM
> ? ;; Dump the memory elements out afterward
> (defun print-mem (m l)
>    (dotimes (i l)
>        (format t "~A~%" (%get-signed-long m (* i 4)))))
> PRINT-MEM
> ? (setq am (external-call "_malloc" :unsigned-int 16 :address))
> #<A Mac Pointer #x101C60>
> ? (open-shared-library
> "/Users/andrewl/openmcl/openmcl/gtk/libptrtest.dylib")
> #<SHLIB /Users/andrewl/openmcl/openmcl/gtk/libptrtest.dylib #x6384A36>
> ? (setq a #(78 92 10 4))
> #(78 92 10 4)
> ? (set-mem am a)
> NIL
> ? (print-mem am 4)
> 78
> 92
> 10
> 4
> NIL
> ? (external-call "_ip_ip_test" :address am :address)
> Entered ip_ip_test:
> Data In: 0x101c60
> C:I:0 *(p+i):78
> C:I:1 *(p+i):92
> C:I:2 *(p+i):10
> C:I:3 *(p+i):4
> Reversing memory chunk
> C:I:0 *(p+i):4
> C:I:1 *(p+i):10
> C:I:2 *(p+i):92
> C:I:3 *(p+i):78
> Exited  ip_ip_test:
> #<A Mac Pointer #x101C60>
> ? (print-mem am 4)
> 4
> 10
> 92
> 78
> NIL
>
> This seems to work.  Do I have any hidden mistakes buried in this code
> that are going to catch up with me later?  Or is there a better way of
> doing this?
>

I don't see any hidden mistakes or problems.
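
For anyone who wants to reproduce the transcript, a C function
consistent with the output shown might look like the sketch below.
This is a hypothetical reconstruction: the actual source of
libptrtest.dylib was not posted, and the fixed element count of 4 is
an assumption taken from the transcript.

```c
#include <stdio.h>

/* Hypothetical reconstruction of ip_ip_test: print four ints,
   reverse them in place, print them again, and return the same
   pointer that was passed in. */
int *ip_ip_test(int *p)
{
    const int n = 4;   /* element count assumed from the transcript */
    int i;

    printf("Entered ip_ip_test:\n");
    printf("Data In: %p\n", (void *)p);
    for (i = 0; i < n; i++)
        printf("C:I:%d *(p+i):%d\n", i, *(p + i));

    printf("Reversing memory chunk\n");
    for (i = 0; i < n / 2; i++) {
        /* swap element i with its mirror image */
        int tmp = *(p + i);
        *(p + i) = *(p + n - 1 - i);
        *(p + n - 1 - i) = tmp;
    }
    for (i = 0; i < n; i++)
        printf("C:I:%d *(p+i):%d\n", i, *(p + i));

    printf("Exited  ip_ip_test:\n");
    return p;
}
```

Note that *(p+i) here is scaled by sizeof(int), which is why the lisp
side has to multiply the index by 4 when it uses byte offsets.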

> Thanks for all the help,
> -a