[Openmcl-devel] sharing data between fortran and OpenMCL

Fri Oct 29 01:37:40 PDT 2004

On Thu, 28 Oct 2004, Cyrus Harmon wrote:

>
> Ok, this was a while back, but I'm trying to pick up where I left off.
> I'll leave the whole thread attached below, but basically you said:
>
> > After many years of being the bad cop, I'm willing to make a deal:
> > I'll add support for "casually passing lisp data to foreign code",
> > as long as someone else - preferably people who really want this
> > functionality - volunteer to debug the subtle and infrequent GC
> > problems that'd arise.
>
> Ok, I'll do my best to debug this. What do I need to do? I know that I
> can use the make-heap-ivector, but for arbitrary lisp objects, what
> needs to happen? Clearly this will happen in some sort of without-gcing
> block.

It's admittedly a little hard to tell, but I think that I was being
facetious.  If you were to (hypothetically) try to combine:

 - a fully relocating GC
 - preemptively-scheduled threads, which make it difficult to predict
   when a GC might happen
 - the ability to pass the (fleeting, transitory) address of some lisp
   object(s) to foreign code

you'd likely find that things worked fine a high percentage of the time
and would fail (possibly with bizarre symptoms, possibly with spectacular
symptoms) some small percentage of the time.

Consider something like:

? (defun foo (a i val)
  (declare (type simple-vector a)
           (optimize (speed 3) (safety 0)))
   "We don't need any type- or bounds-checking.
    Nothing can go wrong."
   (setf (svref a i) val))

? (defvar *a* (make-array 3))

;;; Other activity here, then:

? (foo *a* (random 1000) (gensym))

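*A* has only three elements and the index is almost always out of
bounds, so FOO will typically store into whatever happens to follow the
vector in memory.  It's very hard to predict the effects of calling FOO
(but it's generally not hard to predict that those effects will be
bad.)  In general, those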
bad effects might be felt immediately, might not ever be felt, might
cause a segmentation fault or might cause very subtle misbehavior, and
whatever effects might eventually cause some form of program failure
might themselves be pretty far removed from the original problem (FOO).
Unless the situation's entirely reproducible, it's nearly impossible
to debug something like this when it happens in the real world.

[The point of the FOO example is that it and the practice of passing
the address of relocatable lisp objects to foreign code can both
lead to memory-corruption scenarios that're very hard to debug.]
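
For contrast, here is a minimal sketch of a checked variant in portable
Common Lisp.  CHECKED-FOO is a hypothetical name used only for
illustration; it is not part of the example above.  The point is just
that when the checks are left in, the mistake is reported at the call
that made it rather than surfacing later as unexplained corruption.

```lisp
;;; Hypothetical checked counterpart to FOO: validate the arguments
;;; before storing, so a bad index signals an error immediately.
(defun checked-foo (a i val)
  (check-type a simple-vector)
  (unless (and (integerp i) (< -1 i (length a)))
    (error "Index ~S is out of bounds for a vector of length ~D."
           i (length a)))
  (setf (svref a i) val))

;;; Usage, with a fresh three-element vector:
(let ((v (make-array 3)))
  (checked-foo v 1 'ok)                    ; in bounds: stores OK
  (handler-case (checked-foo v 500 'oops)  ; out of bounds: caught here
    (error (e) (format t "Caught: ~A~%" e)))
  v)
```

Nothing comparable exists for a stale address held by foreign code: no
check can be left in, because the foreign code has no way to know that
the object has moved.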

Saying "I'll implement this, but someone else gets to debug the problems
that it'll cause" is intended to mean "it wouldn't be a good idea to
implement this, and I'm glad that it isn't implemented because debugging
those problems is often virtually impossible."

>
> On a similar note, forgive me if we've discussed this before, but could
> the without-gcing block take a list of lisp objects that would
> basically be "locked" allowing other gc actions to take place or is
> this just a bad idea?
>

If you remember the bouncer-in-the-diner analogy, the compaction
algorithm (the bouncer shoving diner customers around to make all free
space contiguous) is simple and linear in the number of active
customers, and it makes subsequent seating (allocation) trivial.
Partial compaction/relocation algorithms are certainly possible (see
the traditional MacOS memory manager), but they're generally slower,
more complicated, less effective, and complicate allocation (see the
traditional MacOS memory manager.)

I think that the schemes that would work better are:

 - inside a certain syntactic construct, it'd be legal to use some
   primitive to obtain the address of (at least certain types of) lisp
   objects and to pass those addresses to foreign code, with the
   understanding that those addresses have "dynamic extent" and cease
   to be valid when the construct exits.

 - (harder, but more general): provide for the allocation of lisp
   objects in one or more "static" memory areas.  In general (there
   may be room for exceptions), the GC might reclaim the memory used
   by a statically-allocated object, but it would never move such
   objects around (and it'd be legal/safe for foreign code to point
   at static objects, and doing so wouldn't constrain the GC from
   doing whatever it wants to do with "dynamic" objects.)

The big difference between MAKE-HEAP-IVECTOR and the "static object"
scheme is that in the latter case, the GC might decide that a static
object was unreferenced and that it could be freed; MAKE-HEAP-IVECTOR
puts that burden on the programmer.
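
As a rough sketch of what that burden looks like in practice, assuming
OpenMCL with MAKE-HEAP-IVECTOR exported from the CCL package and
returning the vector and a foreign pointer to its data as its first two
values (MAKE-FOREIGN-VISIBLE-SQUARES is a hypothetical name, and the
reader conditional just keeps the form from being read elsewhere):

```lisp
;;; Assumes OpenMCL; the helper name is hypothetical.
#+(or openmcl ccl)
(defun make-foreign-visible-squares (n)
  ;; VEC is an ordinary lisp vector whose storage the GC will not move;
  ;; PTR is a foreign pointer to that same storage.
  (multiple-value-bind (vec ptr)
      (ccl:make-heap-ivector n '(unsigned-byte 32))
    (dotimes (i n)
      (setf (aref vec i) (* i i)))
    ;; PTR can be handed to foreign code for as long as the vector
    ;; exists.  The GC will never free it: the caller has to release it
    ;; explicitly with DISPOSE-HEAP-IVECTOR when done (check that
    ;; function's argument list in your version), and must not touch
    ;; VEC or PTR afterwards.
    (values vec ptr)))
```

Forgetting the release leaks the memory, and releasing too early leaves
both lisp and foreign code holding a dangling reference; under the
"static object" scheme the GC would make that decision instead.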