[Openmcl-devel] Hash Table anomaly -- hash-table-size decreases - wondering how this can happen

Glenn Iba giba at alum.mit.edu
Mon Jan 5 08:03:03 PST 2015


Hi Tim,

  Yes, for me it's a rather serious problem in terms of search performance.

If I know my hash-table will need to hold 50 million positions, then there
is a huge performance penalty if the hash-table has to grow from a small
size. As the number of positions becomes large, the cost to grow the
hash-table and rehash all the positions becomes significant. Presumably it
is growing many times to get back to a large size.

Example:

   With a large hash-table, it takes my search roughly 1/2 hour to compute
a generation (the hash-table is basically used as a set to eliminate
duplicate positions).
I do a CLRHASH in order to re-use the hash-table to accumulate the next
generation (this is a breadth-first search).
When the hash-table shrinks after a CLRHASH, it takes 12 hours to compute
the next generation (which is only slightly larger than the previous one).
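
One possible workaround for the CLRHASH reuse described above, sketched in
portable Common Lisp: allocate a fresh, pre-sized table for each generation
rather than clearing the old one. The names *EXPECTED-POSITIONS*,
MAKE-GENERATION-TABLE and NEXT-GENERATION-TABLE are hypothetical, and the
EQUAL test is an assumption about how positions are compared. :SIZE is only
a hint to the implementation, and whether this actually beats CLRHASH in
CCL (the old table becomes garbage) is something to measure, not a given.

```lisp
;; Hypothetical helpers; names and the size constant are illustrative only.
(defparameter *expected-positions* 50000000
  "Assumed upper bound on the number of positions in one generation.")

(defun make-generation-table (&optional (size *expected-positions*))
  "Return a fresh, empty table created with a :SIZE hint of SIZE.
The :TEST is an assumption; use whatever matches the position keys."
  (make-hash-table :test #'equal :size size))

(defun next-generation-table (old-table
                              &optional (minimum *expected-positions*))
  "Allocate the next generation's table instead of calling CLRHASH.
The size hint is the larger of MINIMUM and OLD-TABLE's current size,
so a table that had already grown does not restart from a small size."
  (make-generation-table (max minimum (hash-table-size old-table))))
```

In the search loop this would replace (clrhash table) with
(setf table (next-generation-table table)).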

I agree that profiling is important.
Do you (or anyone else listening) know how to get code to generate alerts:
   1.  Every time the hash-table grows
   2.  Every time GC is called (or is there a way to turn off automatic GC
       and call it either manually or explicitly in my code?)
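
For item 1, a minimal portable sketch (not a CCL-specific hook): wrap the
insertion and compare HASH-TABLE-SIZE before and after. The function name
PUTHASH-REPORTING-RESIZE is hypothetical. It only notices a size change
that happens during an insert made through this wrapper, and the two extra
HASH-TABLE-SIZE calls add some overhead per insert. Item 2 (GC) is not
covered here.

```lisp
;; Hypothetical wrapper; uses only standard Common Lisp functions.
(defun puthash-reporting-resize (key value table
                                 &optional (stream *trace-output*))
  "Store VALUE under KEY in TABLE and report any change in HASH-TABLE-SIZE.
Returns true if the reported size changed during this insert, else NIL."
  (let ((before (hash-table-size table)))
    (setf (gethash key table) value)
    (let ((after (hash-table-size table)))
      (unless (= before after)
        ;; Report old size, new size, and the number of stored entries.
        (format stream "~&hash table resized: ~D -> ~D (count ~D)~%"
                before after (hash-table-count table))
        t))))
```

Counting the true return values over a generation gives the number of
resizes seen during inserts, which can be compared between the fast and
slow runs.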

Thanks for looking at this with me,
--Glenn




On Mon, Jan 5, 2015 at 6:43 AM, Tim Bradshaw <tfeb at me.com> wrote:

> On 5 Jan 2015, at 06:05, Glenn Iba <giba at alum.mit.edu> wrote:
>
> > Meanwhile, for my search purposes:
> > Is there any way in CCL to specify a Minimum size for a hash-table??
> > It seems pretty annoying (to put it mildly) that rehashing might reduce
> > the size of the table.
>
>
> I hate to ask this but: is this a problem?  In particular is there an
> observed performance or functionality problem caused by this?  My
> experience with things like this is that the performance characteristics
> are usually hairy and hard to understand, and that, usually, the people
> doing the implementation have understood these a lot better than I do,
> particularly with regard to GC.  So unless I've profiled and found that
> there is actually a performance problem, or the application has exploding
> memory use or something, I never worry.
>