[Openmcl-devel] Hash Table anomaly -- hash-table-size decreases - wondering how this can happen

Glenn Iba giba at alum.mit.edu
Mon Jan 5 20:52:35 PST 2015


Hi Gail,

   I have no clue what (make-hash-table :test #'equalp :lock-free nil)   is
doing,
but it doesn't work at all -- my searches fail miserably.  The generation
sizes are all wrong (much smaller than the correct sizes),
and no solution is ever found.

   Without the :lock-free nil   the hash-tables work fine (correctly)
except for the shrinkage problem after CLRHASH.

Is there some reason that :lock-free nil  won't work with :test #'equalp  ??

Is there a simple explanation of what  the :lock-free is for and what the
expected behavior is with :lock-free nil ??

Could the :lock-free nil be interacting badly with the garbage collector?
I'm at a total loss to imagine what's going so badly wrong  :(

Thanks,
--Glenn


On Mon, Jan 5, 2015 at 5:15 PM, Glenn Iba <giba at alum.mit.edu> wrote:

> Gail,
>
>   Thanks for tracking down the bug!   Also,  thanks to everyone else who
> responded with helpful input!
>
> Looks like I can get by for now with (make-hash-table :lock-free nil)
>
> BTW - is there a way to search the Clozure documentation?   I tried to
> find :lock-free documentation
> but came up empty.
>
> Thanks again to all!
>
> --Glenn
>
>
>
> On Mon, Jan 5, 2015 at 2:41 PM, Gail Zacharias <gz at clozure.com> wrote:
>
>> I've tracked down the bug in CCL (http://trac.clozure.com/ccl/ticket/1258
>> - it's pretty much the problem Madhu pointed out early in this thread)  and
>> will try to fix it later today.
>>
>>
>>
>> On Mon, Jan 5, 2015 at 1:21 PM, Gail Zacharias <gz at clozure.com> wrote:
>> >
>> > In CCL hash tables do not get resized due to overall memory pressure,
>> something else is going on.
>> >
>> > As Gary Byers mentioned earlier, the resizing behavior of hash tables
>> is controlled by :REHASH-THRESHOLD and :REHASH-SIZE.  The default values
>> are 0.85 and 1.5.   Specifying a larger :rehash-size, e.g. :rehash-size
>> 5.0, would cut down the cost of regrowing the table, although it wouldn't
>> explain why it gets shrunk in the first place.
>> >
>> > Does the problem still arise if you create your hash table with
>> :LOCK-FREE NIL?
>> >
>> > I believe there is a way to get GC to run a user hook, though I don't
>> remember how off hand.  However, if you are sure that you have a non-weak
>> equalp hash table whose only keys are byte vectors, it's very unlikely that
>> the problem has anything to do with the gc, so this wouldn't be the first
>> thing I'd look at.
>> >
>> > You can catch when hash tables get resized by advising
>> ccl::compute-hash-size.
>> >
>> >
>> >
>> > On Jan 5, 2015, at 11:03 AM, Glenn Iba <giba at alum.mit.edu> wrote:
>> >
>> > > Hi Tim,
>> > >
>> > > Yes, for me it's a rather serious problem in terms of search
>> performance.
>> > > If I know my hash-table will need to hold  50 million positions, then
>> there is a huge
>> > > performance penalty if the hash-table has to grow from a small size.
>> > > As the number of positions becomes large, the cost to grow the
>> hash-table and
>> > > rehash all the positions becomes significant.  Presumably it is
>> growing many times to
>> > > get back to a large size.
>> > >
>> > > Example:
>> > >
>> > >  With a large hash-table, it takes my search roughly 1/2 hour to
>> compute a generation
>> > > (the hash-table is basically used as a set to eliminate duplicate
>> positions).
>> > > I do a CLRHASH in order to re-use the hash-table to accumulate the
>> next generation
>> > > (this is a breadth-first search).
>> > > When the hash-table shrinks after a CLRHASH, it takes 12 hours to
>> compute the next
>> > > generation (which is only slightly larger than the previous one).
>> > >
>> > > I agree that profiling is important.
>> > > Do you (or anyone else listening) know how to get code to generate
>> alerts:
>> > >  1.  Every time the hash-table grows
>> > >  2.  Every time GC is called    (or is there a way to turn off
>> automatic GC and call it either manually or explicitly in my code?)
>> > >
>> > > Thanks for looking at this with me,
>> > > --Glenn
>> > >
>> > >
>> > >
>> > >
>> > > On Mon, Jan 5, 2015 at 6:43 AM, Tim Bradshaw <tfeb at me.com> wrote:
>> > > On 5 Jan 2015, at 06:05, Glenn Iba <giba at alum.mit.edu> wrote:
>> > >
>> > >> Meanwhile, for my search purposes:
>> > >> Is there any way in CCL to specify a Minimum size for a hash-table??
>> > >> It seems pretty annoying (to put it mildly) that rehashing might
>> reduce the size of the table.
>> > >
>> > > I hate to ask this but: is this a problem?  In particular is there an
>> observed performance or functionality problem caused by this?  My
>> experience with things like this is that the performance characteristics
>> are usually hairy and hard to understand, and that, usually, the people
>> doing the implementation have understood these a lot better than I do,
>> particularly with regard to GC.  So unless I've profiled and found that
>> there is actually a performance problem, or the application has exploding
>> memory use or something, I never worry.
>> > >
>> > > _______________________________________________
>> > > Openmcl-devel mailing list
>> > > Openmcl-devel at clozure.com
>> > > https://lists.clozure.com/mailman/listinfo/openmcl-devel
>> >
>>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.clozure.com/pipermail/openmcl-devel/attachments/20150105/311c2f9a/attachment.htm>


More information about the Openmcl-devel mailing list