[Openmcl-devel] eql and equal hash tables help please
david.cooper at genworks.com
Wed Jan 16 16:07:07 PST 2013
Did you try it using the :size argument to make-hash-table, with the estimated number of entries in the file system?
On Jan 16, 2013, at 6:31 PM, George Khouri <gk1 at four-four.com> wrote:
> In a program that traverses a file system of ~125000 entries and adds the pathnames as keys to an #'equal hash table, ccl hangs in CCL::LOCK-FREE-PUTHASH with no progress after ~14000 entries - seems like it's just thrashing. No error is thrown and I can interrupt it with control-C.
> If I add the pathnames to an #'eql hash table ccl has no problem with the same directory.
> Is this expected behavior?
> I tried the same thing in SBCL which works with either an eql or an equal hash table. BTW, the eql version runs in ~6 seconds in SBCL, compared to ~70 seconds in CCL, though I realize this speed difference could be a memory issue and not one of algorithms. I know nothing about SBCL.
> I'm running ccl 1.8 on linux 3.2.0-33-generic Ubuntu 12.04 on a core i5 thinkpad, 4GB.
> It doesn't "appear" that I'm running low on memory, but the program behaves that way with the equal hash table, and though I've tried the various ccl command-line options to increase stack and heap allocations, as well as ulimit stack size, I've had no change in results. I've tried various values for initial hash table size, rehash size and threshold but nothing seems to have any beneficial effect. The system monitor doesn't show any disk swapping going on.
> Is there some memory parameter I'm missing that I need to change? Is there a problem related to having 4 cpu cores?
> Here's the typical output when I interrupt the seemingly hung program - the hash table won't grow beyond this point:
> Welcome to Clozure Common Lisp Version 1.8-r15286M (LinuxX8664)!
> ? (traverse-filesystem "/home/gkhouri/")
> > Break: interrupt signal
> > While executing: CCL::LOCK-FREE-PUTHASH, in process listener(1).
> 1 > (:f 0)
> (7FB9FE90FF80) : 0 (LOCK-FREE-PUTHASH #P"/tmp/gkhouri-test/Documents/Music/M Stal Tunes.csv" #<HASH-TABLE :TEST EQUAL size 13672/13673 #x302000753F9D> T) 613
> (CCL::KEY CCL::HASH CCL::VALUE)
> CCL::KEY: #P"/tmp/gkhouri-test/Documents/Music/M Stal Tunes.csv"
> CCL::HASH: #<HASH-TABLE :TEST EQUAL size 13672/13673 #x302000753F9D>
> CCL::VALUE: T
> 1 > (room)
> Approximately 17,432,576 bytes of memory can be allocated
> before the next full GC is triggered.
> Total Size Free Used
> Lisp Heap: 41025536 (40064K) 17432576 (17024K) 23592960 (23040K)
> Stacks: 11072816 (10813K) 11055872 (10797K) 16944 (17K)
> Static: 20304144 (19828K) 0 (0K) 20304144 (19828K)
> 376792.870 MB reserved for heap expansion.
> If I create the equal hash table with :lock-free set to nil, (make-hash-table :test #'equal :lock-free nil), ccl presumably uses an older hashing algorithm and I don't think I have the problem storing the pathnames, BUT after about 15,000 entries, things slow down drastically, and unacceptably, presumably from rehashing (?), and I killed the program after about 60,000 entries because it was taking so long. With :lock-free enabled, I never get past ~14K entries.
> Thanks for any info.
> George Khouri
> Openmcl-devel mailing list
> Openmcl-devel at clozure.com
-------------- next part --------------
An HTML attachment was scrubbed...
More information about the Openmcl-devel