[Openmcl-devel] Re: persistence of xref info in fasl files.

Sun Jan 4 13:51:04 PST 2004

On Sun, 4 Jan 2004, Oliver Markovic wrote:

>
> On 03.01.2004, at 23:53, Gary Byers wrote:
>
> > On Sat, 3 Jan 2004, Alan Ruttenberg wrote:
> >
> >> Here's an implementation. Gary, I don't really know what the best
> >> place to put the hooks is, so I've used advise where it seemed
> >> appropriate. Please feel free to fix it or tell me how to.
> >>
> >> I came across one issue when doing this.  If you have (defun foo ()
> >> (flet ((bar ())) *baz*))
> >> then you get a recorded xref from bar to *baz*. I think this should be
> >> foo to *baz* since bar isn't global. If you disagree with that policy
> >> the code needs to be reworked a bit.
> >>
> >
> > I can't remember whether I sent mail about this before the holidays,
> > or who I would have sent it to if I did.
>
> I think we talked about it on IRC before the holidays.

I was going to send something similar to yesterday's message a few
weeks ago, but must not have done so.  (It's horrible when you get
old and can't remember simple things ...)

>
>
> > <lots of interesting information snipped>
>
> Is this written down somewhere? I remember something about an old
> MCL internals document, but it seems to have been lost in the disk
> crash.

I'd forgotten to restore it.  It was written in early 2000 (the html
was generated in 2001) and describes something close to OpenMCL 0.3.
It's there (where the website links point to it) now and is still
mostly accurate about some things.

I think that the section that says "every memory-allocated object
that's not a CONS is a UVECTOR" might make it easier to understand
some of the comments in the source (in places like
"ccl:compiler;**;*arch.lisp" and in the kernel "constants.h/constants.s"
files.)  None of this is quite as easy to read as it could be, I suppose.

>
>
> > (It might take a while to come up with a reasonable policy here:
> > is DOTIMES interesting ?  It might have macroexpanded into "calls"
> > to 1+, or =.  Are those interesting ? Etc.)  Once we've decided what's
> > interesting, it's not too hard to ensure that everything that was
> > interesting to the frontend shows up in the backend.
>
> That's an interesting question. Right now, both cases can happen
> depending on whether you're dealing with a compiler macro or an
> ordinary macro. That's quite sub-optimal IMHO. Perhaps we can do
> something along the lines of macroexpand and macroexpand-1, because
> I can see people being interested in both answers.
>
>
> > There are some possible advantages to using a separate data structure
> > as Alan's code does as well.  I'm not 100% convinced that the idea I'm
> > proposing is entirely better, but I think that it's worth thinking
> > about further.  It has always sort of struck me that (incomplete and
> > imprecise) XREF info's already there and there's a lot of it, and it
> > seems more attractive to make it more complete and precise than to
> > duplicate it.
>
> I agree with you in that it looks cleaner. The reason why I chose a
> separate data structure was that I'm not familiar enough with compiler
> internals to actually do something like you proposed. You can easily
> confuse the current scheme by doing something like you did in your
> example. Getting oddballs like these right would be really great,
> since all other XREFs I've tried (LW and Allegro) trip on them as well.
> From what I gather, this would only store direct references (e.g. who
> is called by or who is bound by). How would you get the inverse case?
> Right now the information is stored in separate tables. Would this
> also get stored somewhere or would one need to walk all functions like
> ccl::callers does? Seeing as this doesn't take very long and is
> probably only used interactively, I guess that's OK.
>
>

The big exception that I have to ccl::callers is that it uses low-level
heap walking primitives to find and examine functions.  Aside from all
of the locality/paging arguments against that, it basically has to
suspend all other threads while it's traversing the heap (the way
that per-thread consing works in 0.14 makes this complicated.)  There
are all kinds of ways to lose (deadlock, mostly) when doing this,
and I'd like to find a better alternative.

The confusing part is that (after looking at every function in the
heap), CCL::CALLERS discards those that aren't globally accessible
(things that aren't current function definitions, macro definitions,
active method functions, etc.)  There are a number of good reasons for
it to concentrate only on those things that are globally accessible,
but of the two strategies:

 a) find all interesting functions by walking the heap,
    discard those that aren't globally accessible.
 b) traverse (via DO-ALL-SYMBOLS or something) all globally-accessible
    functions, find those that're interesting.

(b) seems preferable for a lot of reasons.  If there's a good reason
for the current CCL::CALLERS to be implemented via (a) instead of (b),
I can't think of it.
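
To make (b) a little more concrete, here is a rough sketch in portable
Common Lisp. It is only an illustration, not what CCL::CALLERS does or
would do. MAP-GLOBAL-FUNCTIONS, FIND-GLOBAL-FUNCTIONS and the
INTERESTING-P predicate are hypothetical names made up for this example.
A real version would look inside each function's constants, which is
implementation-specific, and would also have to cover things this sketch
ignores, such as (SETF FOO) functions and method functions.

```lisp
;; Sketch of strategy (b): start from what is globally accessible
;; instead of walking the heap.  All names here are hypothetical.

(defun map-global-functions (fn)
  "Call FN with (NAME FUNCTION) for each symbol that names a global
function or macro.  Special operators are skipped."
  (let ((seen (make-hash-table :test #'eq)))
    (do-all-symbols (sym)
      ;; DO-ALL-SYMBOLS may visit the same symbol more than once
      ;; (once per package it is accessible in), so remember what
      ;; has already been handled.
      (unless (gethash sym seen)
        (setf (gethash sym seen) t)
        (when (and (fboundp sym)
                   (not (special-operator-p sym)))
          (funcall fn sym (or (macro-function sym)
                              (fdefinition sym))))))))

(defun find-global-functions (interesting-p)
  "Return the names of global functions for which INTERESTING-P,
a caller-supplied predicate of (NAME FUNCTION), returns true."
  (let ((result '()))
    (map-global-functions
     (lambda (name function)
       (when (funcall interesting-p name function)
         (push name result))))
    result))
```

Nothing here needs to suspend other threads or touch objects that aren't
reachable from a symbol, which is most of the appeal. The cost is that
anything not reachable by name is never seen, which matches what the
current code ends up discarding anyway.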

> >> BTW, what does "indirect calls" mean?
> >
> > Maybe that bit 3 is set in the corresponding
> > xref-constant-information-map ?
>
> Exactly. I just copied the Allegro interface which distinguishes
> between direct and indirect calls, e.g. (foo 1 2) is a direct call and
> (mapcar #'foo '(1 2)) an indirect call to FOO. At least that's what's
> written in the documentation, but after trying it out, it seems as if
> it doesn't detect these correctly 100% of the time (not even their
> given example in the doc...)
>
> Speaking of documentation, I'd like to write a page or two about the
> XREF API, since I don't think it should change that much (apart from
> the implementation). The HTML pages seem to be generated from Docbook
> sources, which I can't find anywhere. Are they downloadable somewhere?
>

They're written using LyX (http://www.lyx.org), which can export DocBook
SGML.  The LyX sources are in CVS:

:<access method>:clozure.com:/usr/local/tmpcvs/ccldoc

where <access method> is pserver if that works for you, ext if you're
using SSH.

The good news is that there's an Aqua OSX version of LyX available; the
bad news is that I've never had much luck getting the DocBook toolchain
set up on OSX in a way that LyX can find it, so I always generate this
stuff under Linux.

I can make the (automatically generated) SGML source for the HTML available,
if that'd help.

I like LyX (as far as it goes); if there's a better approach to doing
documentation using more widely-available tools, I'd switch in a heartbeat.

LyX was originally (and is still mostly) a LaTeX frontend; it might be
easier/better to cut out the DocBook stage and produce other formats
from LaTeX output.  (That was what was done with the old internals
document, and there may be more/better conversion filters now than
there were then.)

> --
>    Oliver Markovic
>
>