<div dir="ltr">Raymond,<div><br></div><div>  Thank you for taking the time to help!  I'm new to this Clozure mailing list.</div><div>Do the CCL maintainers monitor this list for issues?  Or do you recommend</div><div>I create a ticket to get it looked into?</div><div><br></div><div>Thanks again!</div><div>--Glenn</div><div><br></div></div><div class="gmail_extra"><br><div class="gmail_quote">On Sat, Jan 3, 2015 at 10:17 AM, Raymond Wiker <span dir="ltr"><<a href="mailto:rwiker@gmail.com" target="_blank">rwiker@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div style="word-wrap:break-word">It looks like repeatedly clrhashing the table may cause it to be rehashed down to a smaller size. The following code snippet replicates what you’re seeing (I think): <div><br></div><div><div>(let ((h (make-hash-table :size 100000)))</div><div><span style="white-space:pre-wrap">	</span>   (dotimes (i (truncate (hash-table-size h) 2))</div><div><span style="white-space:pre-wrap">	</span>     (setf (gethash i h) i))</div><div><span style="white-space:pre-wrap">	</span>   (clrhash h)</div><div><span style="white-space:pre-wrap">	</span>   (dotimes (i (truncate (hash-table-size h) 2))</div><div><span style="white-space:pre-wrap">	</span>     (setf (gethash i h) i))</div><div><span style="white-space:pre-wrap">	</span>   (hash-table-size h))</div></div><div><br></div><div>In ccl, this returns 75001. Lispworks 6.1 32-bit returns 100003, while sbcl returns 100000 (all on Mac OS X). 
This may or may not be a bug in ccl; no doubt people better qualified to judge will provide an answer to that.</div><div><br></div><div>It may be better to simply allocate a new hash table rather than using clrhash.</div><div><div class="h5"><div><br><div><div><blockquote type="cite"><div>On 03 Jan 2015, at 15:10 , Glenn Iba <<a href="mailto:giba@alum.mit.edu" target="_blank">giba@alum.mit.edu</a>> wrote:</div><br><div><div dir="ltr">Raymond,<div><br><div>  Thanks for the quick response.  Is it definitely the case, then, that a GC can trigger rehashing?<div>Rehashing a large hash table is potentially expensive -- and I wanted to avoid it.</div><div>The reason I allocate a large hash-table to begin with is to avoid the rehashing incurred</div><div>due to growing the hash-table.  It takes my search an order of magnitude more time to</div><div>compute a generation if the hash-table starts small and grows repeatedly.</div><div>After my hash-tables "shrink" in size, the compute time for a generation jumps from 1/2 hour</div><div>to 15 hours.</div><div><br></div><div>Is there any way to specify a minimum size for a hash-table?  or to protect it against </div><div>re-hashing during GC?</div><div><br></div><div>One idea I had was to allocate a new "really large" hash-table for each generation.</div><div>This would be instead of using CLRHASH and re-using the original large hash-table.</div><div>But this wouldn't help if GC causes it to shrink anyway. 
My goal is to have a</div><div>large hash-table allocated only once, and have it stay large so I can reuse it.</div><div><br></div><div>Thanks for any suggestions,</div></div></div><div>--Glenn</div><div><br></div></div><div class="gmail_extra"><br><div class="gmail_quote">On Sat, Jan 3, 2015 at 3:26 AM, Raymond Wiker <span dir="ltr"><<a href="mailto:rwiker@gmail.com" target="_blank">rwiker@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">My understanding (as far as it goes) is that hash-table-size is just a hint to the runtime of the expected number of keys in the hash table; it is used when creating a hash table and can later be used to create other hash tables of similar size. hash-table-count, on the other hand, is the actual, current number of keys in the hash table. I would expect hash-table-size to increase as the number of keys in the table grows past the initial size, and it would also make sense for hash-table-size to decrease if the hash table is rehashed as part of gc.<br>
<div><div><br>
> On 03 Jan 2015, at 08:47 , Glenn Iba <<a href="mailto:giba@alum.mit.edu" target="_blank">giba@alum.mit.edu</a>> wrote:<br>
><br>
> Call for help!<br>
><br>
> I'm doing some large searches in CCL, and have been using large hash-tables,<br>
> but I'm perplexed that the hash-table-size is getting mysteriously decreased.<br>
> Can anyone explain how this is possible?<br>
><br>
> My speculation is that I'm exhausting the heap (though I don't get any notification of this),<br>
> and that CCL is trying to create more heap space by shrinking my large hash-table.<br>
> Does this sound like it could be possible? I'd prefer to get a notification that I'm out of space.<br>
> Is there any way to control this?<br>
><br>
> Details:<br>
> I'm running CCL 1.10 on a Mac with OS X Yosemite (10.10.1), with 8GB RAM.<br>
> I'm creating a single large hash-table with<br>
> (make-hash-table :test #'equalp :size 100000000) ;; 100,000,000<br>
> I'm storing positions of my search space (each represented by a byte-vector of 16 unsigned-bytes)<br>
> in this hash-table<br>
> For each generation, I collect all the positions in the hash-table (to avoid duplicates).<br>
> I then write the generation out to a file, and do CLRHASH so I can reuse the hash-table.<br>
> My searches reach a point (as the generation size grows) when the hash-table-size<br>
> decreases dramatically (from 100,000,000 to 12,396,373) -- how is this possible?<br>
><br>
> I'd be happy to supply code, detailed traces, and whatever other info I can<br>
> to anyone who'd be willing to help me figure this out.<br>
><br>
> Thanks in advance!<br>
> --Glenn<br>
><br>
><br>
</div></div>> _______________________________________________<br>
> Openmcl-devel mailing list<br>
> <a href="mailto:Openmcl-devel@clozure.com" target="_blank">Openmcl-devel@clozure.com</a><br>
> <a href="https://lists.clozure.com/mailman/listinfo/openmcl-devel" target="_blank">https://lists.clozure.com/mailman/listinfo/openmcl-devel</a><br>
<br>
</blockquote></div><br></div>
</div></blockquote></div><br></div></div></div></div></div></blockquote></div><br></div>
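<div><br></div><div>A minimal sketch of the workaround suggested in the thread (allocating a fresh hash table for each generation instead of reusing one via clrhash), using only standard Common Lisp. The function name run-generations and its parameters are hypothetical, chosen for illustration; whether this actually avoids the shrinking seen in CCL is an assumption that would need to be checked against the sizes it reports.</div>
<pre>
;; Hypothetical helper, not part of CCL or of the original poster's code.
;; Each generation gets a newly allocated table of the requested size,
;; so no table is ever passed to CLRHASH.
(defun run-generations (n-generations entries-per-generation size)
  "Return a list with one HASH-TABLE-SIZE value per generation."
  (loop repeat n-generations
        collect (let ((table (make-hash-table :test #'equalp :size size)))
                  ;; Stand-in for the real search: store some integer keys.
                  (dotimes (i entries-per-generation)
                    (setf (gethash i table) i))
                  ;; Report the size after filling, for comparison with SIZE.
                  (hash-table-size table))))
</pre>
<div>Calling, for example, (run-generations 3 50000 100000) returns one size per generation, so the reported values can be compared with the requested size on a given implementation.</div>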