[Openmcl-devel] Address-based hashing considered harmful. Maybe.
Stonewall Ballard
sb.list at sb.org
Mon Mar 3 06:25:06 PST 2003
On 3/3/03 1:16 AM, "Gary Byers" <gb at clozure.com> wrote:
> On Sat, 1 Mar 2003, Stonewall Ballard wrote:
>
>> At least in the case of cons cells, how hard would it be to allocate them in
>> their own zone and never move them? Since they're fixed-size, the only
>> reason to move them is so that a scavenger can work uniformly on all
>> objects.
>>
>
> There are other reasons to want to move things (improving locality and
> paging behavior where relevant) Compacting or copying (making allocated
> objects contiguous) has the side effect of making free space contiguous
> and makes allocation much simpler.
For fixed-sized objects like cons cells in their own zone, allocation is
quite simple. Locality of reference and paging optimization are better with
copying, of course.
>...
> At the moment, that sounds like a reasonable short-term approach (don't
> eliminate address-based hashing, but don't use it where alternatives
> exist and let the hash table code concentrate on the case where it's
> not involved.)
>
> Here's a (half-thought-out) idea: suppose that "support" for
> address-based hashing was an attribute of the hash table - set via an
> argument to MAKE-HASH-TABLE and not subsequently changeable. If an
> address-based key (something for which no other good hash
> transformation exists) was used in a hash table that didn't provide
> such support, lookup would devolve into linear search : it would still
> "work", but would be grossly inefficient. On the other hand, when
> keys that had good hashing functions were used in such a table, the
> code could safely ignore all of the issues related to address-based
> hashing.
Linear scanning is "grossly inefficient" only for large hash tables. Did you
see what size the tables of conses you found were?
>
> There's room for argument as to what the default value of :ADDRESS-BASED
> should be, but if it was in effect things would work as they do now:
> there'd be some significant paranoia and locking involved when such a
> hash table was accessed, but the lookup should still be efficient and
> the overhead might be less significant.
>
> The difference between this scheme and the current one (where there is
> a bit that indicates whether an address-based key might be present) is
> that you need some degree of exclusive access to the hash table (locking)
> before you can even trust that bit not to change out from under you.
>
> If :ADDRESS-BASED defaulted to T, things would be no worse than they are
> now; if it was appropriate to say :ADDRESS-BASED NIL explicitly, things
> could be made a lot better in that case. (The :ADDRESS-BASED T case could
> be made better than it is, but it will always cost more to deal with that
> than to not do so.)
Not knowing how the internals work, this may be naive, but can you change
the type of a hash table if someone puts an address-based object into it?
Let hash tables start life as not-address-based, but morph them as needed.
They'd never revert.
That would keep things simpler for the programmer (me, not you). One could
provide hints, of course.
- Stoney
_______________________________________________
Openmcl-devel mailing list
Openmcl-devel at clozure.com
http://clozure.com/cgi-bin/mailman/listinfo/openmcl-devel
More information about the Openmcl-devel
mailing list