[Openmcl-devel] Why is remhash being called here?

Gary Byers gb at clozure.com
Sun Jun 30 20:46:09 PDT 2013

On Sun, 30 Jun 2013, Ron Garret wrote:

> On Jun 30, 2013, at 2:08 PM, Gary Byers wrote:
>> My guess is that REMHASH is being called to remove something from a hash table.
> Yes.  That much I was actually able to figure out on my own.
>> If your next question is going to be something like "what hash table, and why
>> is a hash table apparently being maintained when forms are being read by the
>> REPL (in this case) ?", a short answer is that this is involved in making
>> M-. (the editor command) work in more cases for more people and in allowing
>> DISASSEMBLE to annotate its output with the corresponding source code.
> That would make sense except for two things:
> 1.  This happens in the command-line version too where there is no meta-.

But you might want to call DISASSEMBLE someday (assuming that what was read
is compiled into something that you might want to call DISASSEMBLE on someday ...)
Slightly less snarkily, you can use (for instance) REBUILD-CCL to create a
CCL image from the command-line, then run that image in an environment where
M-. works; the source-location information associated with compiled functions
will be available to guide M-., because that information was generated at read-time.

> 2.  REMHASH is (apparently) only called when I evaluate a form that invokes a reader macro.

The hash table is maintained by the reader (or at least by that variant of the
reader that's used by the REPL and by COMPILE-FILE.)  It's ultimately doing
this so that if the form being read is passed to the compiler then information
about the source location of the form and its subforms can also be passed to the
compiler and thus made available to M-. and DISASSEMBLE.

Maintaining source information about a form or subform makes some sense if there's
a 1:1 relationship between that form and the information; a particular form
(DEFUN FACT ...) may have been read from a particular file at a particular range
of positions, and the same is also likely true of a subform of that form (like
(ZEROP N)).  The loop that you quote below calls REMHASH on entries where there's
a 1:N relationship, as in:
? (trace remhash)
? (+ 1 2 1 2 3)
0> Calling (REMHASH 1 #<HASH-TABLE :TEST EQ size 5/60 #x3020006A488D>) 
<0 REMHASH returned T
0> Calling (REMHASH 2 #<HASH-TABLE :TEST EQ size 4/60 #x3020006A488D>) 
<0 REMHASH returned T

The subforms 1 and 2 each had multiple "source notes" associated with them
(there was a 1:many relationship between those forms and their location in
the input stream) so the entries were removed in the loop that you quote.
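
The 1:N pruning described above can be sketched in portable Common Lisp.
This is only an illustration of the idea, not CCL's actual code: the names
RECORD-NOTE, PRUNE-AMBIGUOUS and DEMO are hypothetical, the "notes" are
plain (START . END) conses rather than real SOURCE-NOTE objects, and the
table uses EQL instead of EQ so that integer keys behave portably.

```lisp
;; Hypothetical sketch: RECORD-NOTE and PRUNE-AMBIGUOUS are illustrative
;; names, not functions in CCL. Notes here are plain (START . END) conses.

(defun record-note (table form note)
  "Add NOTE to the list of notes recorded for FORM in TABLE."
  (push note (gethash form table))
  form)

(defun prune-ambiguous (table)
  "Remove every entry of TABLE whose form has more than one note."
  (let ((ambiguous '()))
    ;; Collect the ambiguous keys first, then remove them.
    (maphash (lambda (form notes)
               (when (rest notes)
                 (push form ambiguous)))
             table)
    (dolist (form ambiguous table)
      (remhash form table))))

(defun demo ()
  "Record notes for the subforms of (+ 1 2 1 2 3), then prune."
  ;; EQL rather than EQ so that integer keys behave portably.
  (let ((table (make-hash-table :test 'eql)))
    (record-note table '+ '(1 . 2))
    (record-note table 1 '(3 . 4))
    (record-note table 2 '(5 . 6))
    (record-note table 1 '(7 . 8))     ; second note for 1
    (record-note table 2 '(9 . 10))    ; second note for 2
    (record-note table 3 '(11 . 12))
    (prune-ambiguous table)
    ;; Entry count, then whether 1 and 3 are still present.
    (list (hash-table-count table)
          (nth-value 1 (gethash 1 table))
          (nth-value 1 (gethash 3 table)))))
```

In this sketch (DEMO) leaves only the entries for + and 3, since 1 and 2
each picked up two notes; that mirrors the two REMHASH calls in the trace.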

Because of the way that it works (the recursive call to READ at essentially
the same stream position), your reader macro also causes multiple SOURCE-NOTE
objects to be associated with its result.  (These objects are equivalent but
not EQ, so I suppose that there's some argument in favor of slowing things down
further and checking for that ... no, forget that I said that.)
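
For illustration, a reader macro of roughly that shape might look like the
sketch below. The actual macro under discussion isn't shown in this message,
so this is an assumption: READ-PASSTHROUGH, READ-WITH-PASSTHROUGH and the
choice of #! as the dispatch sequence are all hypothetical. The point is
only that the macro function calls READ recursively and returns the inner
form as its own result, so one object is produced by two reads that start
at nearly the same stream position.

```lisp
;; Hypothetical sketch: these names and the #! syntax are made up for
;; illustration and are not taken from the original discussion.

(defun read-passthrough (stream subchar arg)
  "Dispatch-macro function: read the next form and return it unchanged."
  (declare (ignore subchar arg))
  ;; Recursive call to READ; its result is also the result of the
  ;; outer read that invoked this macro function.
  (read stream t nil t))

(defun read-with-passthrough (string)
  "Read one form from STRING with #! installed in a fresh readtable."
  (let ((*readtable* (copy-readtable nil)))
    (set-dispatch-macro-character #\# #\! #'read-passthrough)
    (values (read-from-string string))))
```

Calling (read-with-passthrough "#!(foo bar)") returns the list (FOO BAR).
Whether a macro like this leads to a REMHASH call in a given CCL version
is what the explanation above describes; the sketch doesn't verify it.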

> I know the toplevel does some things besides simply calling (print (eval (read))) but AFAIK none of those things should care (or even know) that a reader macro had been invoked.  Hence my puzzlement.

Hopefully, knowing that that isn't what's actually happening is less puzzling.

>> If your next question is going to be "aren't there significant costs involved
>> here, and do the benefits justify those costs ?", my answer would be a resounding ...
>> oh wait, you haven't really asked that question.  Never mind.
> Indeed, I have not asked that question.  I'm really just trying to understand why the toplevel is doing something different when a reader macro is invoked.

It isn't really.  Reader macros (those associated with #\(, #\', and many
others) get invoked all the time.  The form that your reader macro returned
happened to get removed from a hash table because it had multiple source notes
associated with it (honest), and AFAICT this has to do with the fact that it
effectively calls READ on the same stream at the same position twice.

Someone once said that human beings were invented by water as a means of
transporting itself from one place to another.  I'm not sure about that,
but I sometimes get the impression that the CCL reader and compiler were
invented by SOURCE-NOTEs for similar reasons.
