[Openmcl-devel] Hash Table anomaly -- hash-table-size decreases - wondering how this can happen
Glenn Iba
giba at alum.mit.edu
Sat Jan 3 06:10:35 PST 2015
Raymond,
Thanks for the quick response. Is it definitely the case, then, that a
GC can trigger rehashing?
Rehashing a large hash table is potentially expensive -- and I wanted to
avoid it.
The reason I allocate a large hash-table to begin with is to avoid the
rehashing incurred due to growing the hash-table. It takes my search an order
of magnitude more time to compute a generation if the hash-table starts small
and grows repeatedly.
After my hash-tables "shrink" in size, the compute time for a generation
jumps from 1/2 hour to 15 hours.
Is there any way to specify a minimum size for a hash-table, or to protect
it against rehashing during GC?
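A minimal sketch of how the question could be checked directly, assuming CCL's
ccl:gc is used to force a full collection. The helper name report-table and
the much smaller table size are illustrative only, and the value returned by
hash-table-size is implementation-dependent:

```lisp
;; Diagnostic sketch: print size and count around CLRHASH and a forced GC.
;; REPORT-TABLE is a hypothetical helper, not part of CCL.
(defun report-table (label table)
  (format t "~&~a: size=~d count=~d~%"
          label (hash-table-size table) (hash-table-count table)))

;; Illustrative size; the table in this thread uses :size 100000000.
(defparameter *positions*
  (make-hash-table :test #'equalp :size 100000))

(report-table "fresh" *positions*)
(dotimes (i 1000)
  (setf (gethash i *positions*) t))
(report-table "filled" *positions*)
(clrhash *positions*)
(report-table "after clrhash" *positions*)
#+ccl (ccl:gc)                      ; force a full GC (CCL only)
(report-table "after gc" *positions*)
```

If the reported size changes between two of these steps, that would narrow
down whether CLRHASH or the GC is responsible.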
One idea I had was to allocate a new "really large" hash-table for each
generation. This would be instead of using CLRHASH and re-using the original
large hash-table. But this wouldn't help if GC causes it to shrink anyway.
My goal is to have a large hash-table allocated only once, and have it stay
large so I can reuse it.
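For concreteness, the "new table per generation" idea might look roughly like
the sketch below. The names *table-size*, make-generation-table and
collect-generation are hypothetical, and the size is scaled down for
illustration. As noted above, this would not help if the GC shrinks the table
anyway:

```lisp
;; Hypothetical sketch: allocate a fresh table per generation instead of
;; reusing one table via CLRHASH.
(defparameter *table-size* 100000)  ; illustrative; the real table uses 100000000

(defun make-generation-table ()
  (make-hash-table :test #'equalp :size *table-size*))

(defun collect-generation (positions)
  "Return a fresh table whose keys are the distinct elements of POSITIONS."
  (let ((table (make-generation-table)))
    (dolist (p positions table)
      (setf (gethash p table) t))))
```

Because the test is EQUALP, two vectors with the same contents count as the
same key, so duplicates within a generation collapse to a single entry.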
Thanks for any suggestions,
--Glenn
On Sat, Jan 3, 2015 at 3:26 AM, Raymond Wiker <rwiker at gmail.com> wrote:
> My understanding (as far as it goes) is that hash-table-size is just a
> hint to the runtime of the expected number of keys in the hash table; it is
> used when creating a hash table and can later be used to create other hash
> tables of similar size. hash-table-count, on the other hand, is the actual,
> current number of keys in the hash table. I would expect hash-table-size to
> increase as the number of keys in the table grows past the initial size,
> and it would also make sense for hash-table-size to decrease if the hash
> table is rehashed as part of gc.
>
> > On 03 Jan 2015, at 08:47 , Glenn Iba <giba at alum.mit.edu> wrote:
> >
> > Call for help!
> >
> > I'm doing some large searches in CCL, and have been using large hash-tables,
> > but I'm perplexed that the hash-table-size is getting mysteriously decreased.
> > Can anyone explain how this is possible?
> >
> > My speculation is that I'm exhausting the heap (though I don't get any notification of this),
> > and that CCL is trying to create more heap space by shrinking my large hash-table.
> > Does this sound like it could be possible? I'd prefer to get a notification that I'm out of space.
> > Is there any way to control this?
> >
> > Details:
> > I'm running CCL 1.10 on a Mac with OS X Yosemite (10.10.1), with 8GB RAM.
> > I'm creating a single large hash-table with
> >     (make-hash-table :test #'equalp :size 100000000) ;; 100,000,000
> > I'm storing positions of my search space (each represented by a byte-vector of 16 unsigned-bytes)
> > in this hash-table
> > For each generation, I collect all the positions in the hash-table (to avoid duplicates).
> > I then write the generation out to a file, and do CLRHASH so I can reuse the hash-table.
> > My searches reach a point (as the generation size grows) when the hash-table-size
> > decreases dramatically (from 100,000,000 to 12,396,373) -- how is this possible?
> >
> > I'd be happy to supply code, detailed traces, and whatever other info I can
> > to anyone who'd be willing to help me figure this out.
> >
> > Thanks in advance!
> > --Glenn
> >
> >
> > _______________________________________________
> > Openmcl-devel mailing list
> > Openmcl-devel at clozure.com
> > https://lists.clozure.com/mailman/listinfo/openmcl-devel
>
>