[Openmcl-devel] Hash Table anomaly -- hash-table-size decreases - wondering how this can happen

Glenn Iba giba at alum.mit.edu
Mon Jan 5 14:15:15 PST 2015


Gail,

Thanks for tracking down the bug!  Also, thanks to everyone else who
responded with helpful input!

Looks like I can get by for now with (make-hash-table :lock-free nil)
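
For the archive, a minimal sketch of that workaround. The function name
make-position-table and the sizes are made up for illustration; :lock-free
is a CCL extension to MAKE-HASH-TABLE, so it is guarded with #+ccl, and
:size is only the standard hint for the initial capacity.

```lisp
;; Hypothetical helper: builds the EQUALP table used as a set of positions.
;; :LOCK-FREE is a CCL-only keyword, so other implementations skip it.
(defun make-position-table (&optional (size 1000))
  "Return an EQUALP hash table pre-sized for roughly SIZE entries."
  #+ccl (make-hash-table :test 'equalp :size size :lock-free nil)
  #-ccl (make-hash-table :test 'equalp :size size))

;; Usage: byte vectors with equal contents count as the same key under EQUALP.
(defparameter *positions* (make-position-table 100))

(setf (gethash (make-array 3 :element-type '(unsigned-byte 8)
                             :initial-contents '(1 2 3))
               *positions*)
      t)
```

For the real search the size argument would be something like 50000000.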

BTW - is there a way to search the Clozure documentation?  I tried to find
:lock-free documentation but came up empty.

Thanks again to all!

--Glenn



On Mon, Jan 5, 2015 at 2:41 PM, Gail Zacharias <gz at clozure.com> wrote:

> I've tracked down the bug in CCL (http://trac.clozure.com/ccl/ticket/1258
> - it's pretty much the problem Madhu pointed out early in this thread)  and
> will try to fix it later today.
>
>
>
> On Mon, Jan 5, 2015 at 1:21 PM, Gail Zacharias <gz at clozure.com> wrote:
> >
> > In CCL hash tables do not get resized due to overall memory pressure,
> > something else is going on.
> >
> > As Gary Byers mentioned earlier, the resizing behavior of hash tables is
> > controlled by :REHASH-THRESHOLD and :REHASH-SIZE.  The default values are
> > 0.85 and 1.5.  Specifying a larger :rehash-size, e.g. :rehash-size 5.0,
> > would cut down the cost of regrowing the table, although it wouldn't
> > explain why it gets shrunk in the first place.
> >
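
A small sketch of the :rehash-size suggestion above, using only standard
MAKE-HASH-TABLE keywords. The variable name *regrow-table* is made up. Per
the standard, a float :rehash-size greater than 1 is the factor by which
the table grows, while an integer would be a number of entries to add; how
closely an implementation follows these hints is up to it.

```lisp
;; Hypothetical variable name; the keywords are standard Common Lisp.
;; A float rehash size of 5.0 asks for the table to grow fivefold each
;; time it fills past the threshold, so fewer regrow steps are needed.
(defparameter *regrow-table*
  (make-hash-table :test 'equalp
                   :rehash-size 5.0
                   :rehash-threshold 0.85))

;; The settings in effect can be read back with the standard accessors.
(list (hash-table-rehash-size *regrow-table*)
      (hash-table-rehash-threshold *regrow-table*))
```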
> > Does the problem still arise if you create your hash table with
> > :LOCK-FREE NIL?
> >
> > I believe there is a way to get GC to run a user hook, though I don't
> > remember how offhand.  However, if you are sure that you have a non-weak
> > equalp hash table whose only keys are byte vectors, it's very unlikely
> > that the problem has anything to do with the gc, so this wouldn't be the
> > first thing I'd look at.
> >
> > You can catch when hash tables get resized by advising
> > ccl::compute-hash-size.
> >
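
A rough sketch of what such advice might look like, assuming CCL's ADVISE
macro with its :when and :name keywords and the arglist variable it makes
available inside the advice form. The advice name :log-resize is made up,
and ccl::compute-hash-size is an internal function whose arguments are not
documented, so this only prints whatever it was called with.

```lisp
;; CCL only. Logs every call to the internal function that computes the
;; new size of a hash table, before the call runs.
#+ccl
(ccl:advise ccl::compute-hash-size
            (format *trace-output*
                    "~&compute-hash-size called with ~S~%"
                    ccl:arglist)
            :when :before
            :name :log-resize)

;; To remove the advice again (left commented out here):
;; (ccl:unadvise ccl::compute-hash-size :when :before :name :log-resize)
```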
> >
> >
> > On Jan 5, 2015, at 11:03 AM, Glenn Iba <giba at alum.mit.edu> wrote:
> >
> > > Hi Tim,
> > >
> > > Yes, for me it's a rather serious problem in terms of search
> > > performance.  If I know my hash-table will need to hold 50 million
> > > positions, then there is a huge performance penalty if the hash-table
> > > has to grow from a small size.  As the number of positions becomes
> > > large, the cost to grow the hash-table and rehash all the positions
> > > becomes significant.  Presumably it is growing many times to get back
> > > to a large size.
> > >
> > > Example:
> > >
> > >  With a large hash-table, it takes my search roughly 1/2 hour to
> > > compute a generation (the hash-table is basically used as a set to
> > > eliminate duplicate positions).  I do a CLRHASH in order to re-use the
> > > hash-table to accumulate the next generation (this is a breadth-first
> > > search).  When the hash-table shrinks after a CLRHASH, it takes 12
> > > hours to compute the next generation (which is only slightly larger
> > > than the previous one).
> > >
> > > I agree that profiling is important.
> > > Do you (or anyone else listening) know how to get code to generate
> > > alerts:
> > >  1.  Every time the hash-table grows
> > >  2.  Every time GC is called (or is there a way to turn off automatic
> > >      GC and call it either manually or explicitly in my code?)
> > >
> > > Thanks for looking at this with me,
> > > --Glenn
> > >
> > >
> > >
> > >
> > > On Mon, Jan 5, 2015 at 6:43 AM, Tim Bradshaw <tfeb at me.com> wrote:
> > > On 5 Jan 2015, at 06:05, Glenn Iba <giba at alum.mit.edu> wrote:
> > >
> > >> Meanwhile, for my search purposes:
> > >> Is there any way in CCL to specify a Minimum size for a hash-table??
> > >> It seems pretty annoying (to put it mildly) that rehashing might
> > >> reduce the size of the table.
> > >
> > > I hate to ask this but: is this a problem?  In particular is there an
> > > observed performance or functionality problem caused by this?  My
> > > experience with things like this is that the performance
> > > characteristics are usually hairy and hard to understand, and that,
> > > usually, the people doing the implementation have understood these a
> > > lot better than I do, particularly with regard to GC.  So unless I've
> > > profiled and found that there is actually a performance problem, or
> > > the application has exploding memory use or something, I never worry.
> > >
> > > _______________________________________________
> > > Openmcl-devel mailing list
> > > Openmcl-devel at clozure.com
> > > https://lists.clozure.com/mailman/listinfo/openmcl-devel
> >
>