<div dir="ltr">Lock-free hash tables use an alternative algorithm to minimize the performance impact of thread safety. They avoid the cost of locking during gethash, at the price of making rehashing more expensive (puthash performance is basically unaffected). Aside from performance, hash tables should behave the same whether or not you specify :lock-free -- it's a bug if they don't give the same results.<br><div><br></div><div>The lock-free algorithm has been the default in CCL since 1.3, so it's certainly possible that some of the non-lock-free code has bit-rotted. We can look into it if you can come up with a test case.</div><div><br></div><div>Alternatively, you can follow Tim's suggestion: don't reuse the same hash table for a new generation, just make a new one. That avoids both the lock-free clrhash bug and any issues with locking hash tables.</div><div><br></div><div class="gmail_extra"><br><div class="gmail_quote">On Mon, Jan 5, 2015 at 11:52 PM, Glenn Iba <span dir="ltr"><<a href="mailto:giba@alum.mit.edu" target="_blank">giba@alum.mit.edu</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr">Hi Gail,<div><br></div><div> I have no clue what (make-hash-table :test #'equalp :lock-free nil) is doing,</div><div>but it doesn't work at all -- my searches fail miserably. The generation sizes are all wrong (much smaller than the correct sizes),</div><div>and no solution is ever found.</div><div><br></div><div> Without the :lock-free nil the hash-tables work fine (correctly) except for the shrinkage problem after CLRHASH.</div><div><br></div><div>Is there some reason that :lock-free nil won't work with :test #'equalp ??</div><div><br></div><div>Is there a simple explanation of what the :lock-free is for and what the expected behavior is with :lock-free nil ??</div><div><br></div><div>Could the :lock-free nil be interacting badly with the garbage collector? 
</div><div>I'm at a total loss to imagine what's going so badly wrong :(</div><div><br></div><div>Thanks,</div><div>--Glenn</div><div><br></div></div><div class="HOEnZb"><div class="h5"><div class="gmail_extra"><br><div class="gmail_quote">On Mon, Jan 5, 2015 at 5:15 PM, Glenn Iba <span dir="ltr"><<a href="mailto:giba@alum.mit.edu" target="_blank">giba@alum.mit.edu</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr">Gail,<div><br></div><div> Thanks for tracking down the bug! Also, thanks to everyone else who responded with helpful input!</div><div><br></div><div>Looks like I can get by for now with (make-hash-table :lock-free nil)</div><div><br></div><div>BTW - is there a way to search the Clozure documentation? I tried to find :lock-free documentation</div><div>but came up empty.</div><div><br></div><div>Thanks again to all!</div><span><font color="#888888"><div><br></div><div>--Glenn</div><div><br></div><div><br></div></font></span></div><div><div><div class="gmail_extra"><br><div class="gmail_quote">On Mon, Jan 5, 2015 at 2:41 PM, Gail Zacharias <span dir="ltr"><<a href="mailto:gz@clozure.com" target="_blank">gz@clozure.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr">I've tracked down the bug in CCL (<a href="http://trac.clozure.com/ccl/ticket/1258" target="_blank">http://trac.clozure.com/ccl/ticket/1258</a> - it's pretty much the problem Madhu pointed out early in this thread) and will try to fix it later today.<div><div><div><br><div><br></div><div><br>On Mon, Jan 5, 2015 at 1:21 PM, Gail Zacharias <<a href="mailto:gz@clozure.com" target="_blank">gz@clozure.com</a>> wrote:<br>><br>> In CCL hash tables do not get resized due to overall memory pressure, something else is going on.<br>><br>> As Gary Byers mentioned earlier, the resizing behavior of hash tables is controlled by 
:REHASH-THRESHOLD and :REHASH-SIZE. The default values are 0.85 and 1.5. Specifying a larger :rehash-size, e.g. :rehash-size 5.0, would cut down the cost of regrowing the table, although it wouldn't explain why it gets shrunk in the first place.<br>><br>> Does the problem still arise if you create your hash table with :LOCK-FREE NIL?<br>><br>> I believe there is a way to get GC to run a user hook, though I don't remember how off hand. However, if you are sure that you have a non-weak equalp hash table whose only keys are byte vectors, it's very unlikely that the problem has anything to do with the gc, so this wouldn't be the first thing I'd look at.<br>><br>> You can catch when hash tables get resized by advising ccl::compute-hash-size.<br>><br>><br>><br>> On Jan 5, 2015, at 11:03 AM, Glenn Iba <<a href="mailto:giba@alum.mit.edu" target="_blank">giba@alum.mit.edu</a>> wrote:<br>><br>> > Hi Tim,<br>> ><br>> > Yes, for me it's a rather serious problem in terms of search performance.<br>> > If I know my hash-table will need to hold 50 million positions, then there is a huge<br>> > performance penalty if the hash-table has to grow from a small size.<br>> > As the number of positions becomes large, the cost to grow the hash-table and<br>> > rehash all the positions becomes significant. 
Presumably it is growing many times to<br>> > get back to a large size.<br>> ><br>> > Example:<br>> ><br>> > With a large hash-table, it takes my search roughly 1/2 hour to compute a generation<br>> > (the hash-table is basically used as a set to eliminate duplicate positions).<br>> > I do a CLRHASH in order to re-use the hash-table to accumulate the next generation<br>> > (this is a breadth-first search).<br>> > When the hash-table shrinks after a CLRHASH, it takes 12 hours to compute the next<br>> > generation (which is only slightly larger than the previous one).<br>> ><br>> > I agree that profiling is important.<br>> > Do you (or anyone else listening) know how to get code to generate alerts:<br>> > 1. Every time the hash-table grows<br>> > 2. Every time GC is called (or is there a way to turn off automatic GC and call it either manually or explicitly in my code?)<br>> ><br>> > Thanks for looking at this with me,<br>> > --Glenn<br>> ><br>> ><br>> ><br>> ><br>> > On Mon, Jan 5, 2015 at 6:43 AM, Tim Bradshaw <<a href="mailto:tfeb@me.com" target="_blank">tfeb@me.com</a>> wrote:<br>> > On 5 Jan 2015, at 06:05, Glenn Iba <<a href="mailto:giba@alum.mit.edu" target="_blank">giba@alum.mit.edu</a>> wrote:<br>> ><br>> >> Meanwhile, for my search purposes:<br>> >> Is there any way in CCL to specify a Minimum size for a hash-table??<br>> >> It seems pretty annoying (to put it mildly) that rehashing might reduce the size of the table.<br>> ><br>> > I hate to ask this but: is this a problem? In particular is there an observed performance or functionality problem caused by this? My experience with things like this is that the performance characteristics are usually hairy and hard to understand, and that, usually, the people doing the implementation have understood these a lot better than I do, particularly with regard to GC. 
So unless I've profiled and found that there is actually a performance problem, or the application has exploding memory use or something, I never worry.<br>> ><br>> > _______________________________________________<br>> > Openmcl-devel mailing list<br>> > <a href="mailto:Openmcl-devel@clozure.com" target="_blank">Openmcl-devel@clozure.com</a><br>> > <a href="https://lists.clozure.com/mailman/listinfo/openmcl-devel" target="_blank">https://lists.clozure.com/mailman/listinfo/openmcl-devel</a><br>><br></div></div></div></div></div>
</blockquote></div><br></div>
</div></div></blockquote></div><br></div>
</div></div></blockquote></div><br></div></div>
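<div dir="ltr"><div><br></div><div>A minimal sketch of the fresh-table-per-generation approach suggested at the top of this message, in portable Common Lisp. The names next-generation, toy-successors, successors-fn and size-hint are hypothetical and not part of CCL; :size is a standard make-hash-table argument, but it is only a hint, so how much regrowth it saves is an assumption to check by measurement.</div><div><br></div><pre>
;; Build the next generation in a brand-new EQUALP table instead of
;; calling CLRHASH on the old one.  SIZE-HINT pre-sizes the new table.
(defun next-generation (current successors-fn size-hint)
  (let ((next (make-hash-table :test #'equalp :size size-hint)))
    (maphash (lambda (position value)
               (declare (ignore value))
               ;; The table is used as a set: duplicate successors
               ;; collapse onto the same key.
               (dolist (successor (funcall successors-fn position))
                 (setf (gethash successor next) t)))
             current)
    next))

;; Toy successor function (illustration only): positions are
;; one-element vectors holding a number modulo 4.
(defun toy-successors (position)
  (let ((n (aref position 0)))
    (list (vector (mod (1+ n) 4))
          (vector (mod (* 2 n) 4)))))

;; Usage: start from a single position and advance one generation.
(defun toy-step ()
  (let ((generation (make-hash-table :test #'equalp)))
    (setf (gethash (vector 0) generation) t)
    (next-generation generation #'toy-successors 16)))
</pre><div>The old table is simply dropped once the new one is built, so it becomes garbage as soon as nothing else refers to it.</div></div>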