[Openmcl-devel] Decreasing the size of the reserved heap

Gary Byers gb at clozure.com
Mon Jan 13 20:07:55 PST 2014


OK.  We discussed this briefly in IRC last week and I said that I'd look
into an approach to getting the cluster and top to see less "virtual memory"
as being "in use" and get back to you this weekend.  (So, I took
a long weekend ...)

First of all, I mentioned that there was an apparent bug that caused
ROOM to get confused when the -R option was used to reserve less than
128GB; the code that handles that isn't really sure if that means "in
addition to the 128GB reserved for static things" or "a very small
amount", which is treated as "just a little more than 128GB, since
we sort of need to reserve at least 128GB".  In addition, the space
that's "reserved for heap expansion" includes both space for allocated
data and space for data structures that're used to keep track of that
data.  (A bitvector that the GC uses to keep track of live objects
needs 1 bit for every 128 bits of data; if we're dealing with hundreds
of GB of data, we need to reserve a few GB of the total reservation
for those auxiliary data structures.)
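
As a rough check of that "few GB" figure, here is a small sketch in plain
Common Lisp (no CCL internals).  MARK-BITMAP-BYTES is a hypothetical name
used only for illustration, and the 1-bit-per-128-bits ratio is taken from
the paragraph above; the real reservation also covers other bookkeeping
that this ignores.

```lisp
(defun mark-bitmap-bytes (data-bytes)
  "Bytes of mark bitmap needed for DATA-BYTES of heap, at 1 bit per 128 bits."
  ;; 1 bit per 128 bits of data is 1 byte of bitmap per 128 bytes of data.
  (values (ceiling data-bytes 128)))

;; 512GB of data needs a 4GB bitmap:
(mark-bitmap-bytes (* 512 (expt 2 30)))   ; => 4294967296
```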

Starting CCL with

$ ccl64 [other options] -R 133gb

seems to leave a bit over 800MB to dynamically allocate things in
and creates auxiliary data structures to support all of the static
and dynamic data that could exist.

The "static data" in question (what some of that 128GB is used for)
includes data and code that's been "purified" by SAVE-APPLICATION
(where "purified" in this context means "moved out of the GC's way")
and room for "static conses".  As you (David) and some other people
know, "static conses" (CONS cells whose address is guaranteed to
never change) are used by some versions of ACL2 to support a function
memoization scheme; they aren't generally interesting or useful otherwise.

In most applications, most of that 128GB of address space that's
reserved for static conses and purified code/data isn't used; if we
unmap the address range that's reserved for the growth of those
static regions, stupid things like whatever's managing your cluster
and things that try to monitor use of "virtual memory" will think
that we're using less virtual memory.   (We could also unmap some of
the pages that we reserved for those "auxiliary data structures"; that's
a few GB which may or may not matter to the stupid thing that's managing
your cluster, but this whole message thread has gone on too long already
and that's left as an exercise.)

It's worth remembering that most of the reason that we're reserving
ranges of address space in the first place is to keep other (foreign)
code from using addresses that we think CCL might want to use (in the
near or distant future.)  Those sorts of conflicts are much more likely
to happen in a 32-bit environment than in a 64-bit one (less address
space = greater chance of conflict), but the chance is still non-zero.
(A bug that kept a Webkit demo from working reliably for a few OSX
releases seems to have been caused by a Webkit Javascript JIT compiler
wanting to use addresses that were already allocated by CCL).  If we
unmap some of the pages we've reserved, we may not be able to safely
map them again.

So, to unmap some of those reserved pages and take our chances:  first,
we need a little function to find a CCL kernel data structure that
describes memory areas:


(in-package "CCL")

(defun find-area (code)
   (let* ((p (%null-ptr)))
     (%setf-macptr-to-object p (%get-kernel-global all-areas))
     (if (eql (%get-object p target::area.code) code)
       p
       (do* ((q (%get-ptr p target::area.succ) (%get-ptr q target::area.succ)))
            ((eql p q))
         (when (eql (%get-object q target::area.code) code)
           (return q))))))

We can use that function to find the end of the memory region that was
purified by SAVE-APPLICATION:

(defun purified-end-address ()
   (let* ((a (find-area area-managed-static)))
     (if a
       (%get-ptr a target::area.high)
       (error "can't find the end of purified space"))))



We can find the address of the start of the dynamic heap by several
means; this way is fairly simple:

(defun dynamic-heap-start-address ()
   (%int-to-ptr (ash (%get-kernel-global heap-start) target::fixnumshift)))

but (as they often do ...) static conses confuse things.  We allocate static
conses in chunks of 32K conses each, and each static cons cell is 16 bytes in
size, so the chunks are (* 32768 16) = 512KB each.  Whenever a chunk is
allocated, the value returned by DYNAMIC-HEAP-START-ADDRESS will decrease
by 512KB.

(defun room-for-n-static-conses (n)
   (* (logandc2 (+ 32767 n) 32767) target::cons.size))
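
To see what that rounding does, here is the same expression as a portable
sketch.  STATIC-CONS-BYTES is a hypothetical name, and the default cons
size of 16 bytes is an assumption standing in for TARGET::CONS.SIZE on a
64-bit lisp.

```lisp
(defun static-cons-bytes (n &optional (cons-size 16))
  "Bytes needed for N static conses, rounded up to whole 32768-cons chunks."
  ;; Adding 32767 and clearing the low 15 bits rounds N up to a
  ;; multiple of 32768.
  (* (logandc2 (+ 32767 n) 32767) cons-size))

(static-cons-bytes 0)      ; => 0
(static-cons-bytes 1)      ; => 524288   (one 512KB chunk)
(static-cons-bytes 32768)  ; => 524288
(static-cons-bytes 32769)  ; => 1048576  (two chunks)
```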

CCL::RESERVED-STATIC-CONSES returns the number of static conses that have
been reserved since the lisp was created; it'll always be a multiple of
32768, so if we want to be able to safely allocate N static conses
(including whatever's been allocated) and don't want to risk address
conflicts (or any other problems that I'm not thinking of) we need
to allow (DYNAMIC-HEAP-START-ADDRESS) to be lowered down to a target
address:

(defun target-dynamic-heap-start-address (nstatic-conses)
   (%inc-ptr (dynamic-heap-start-address)
      (- (room-for-n-static-conses (- nstatic-conses (reserved-static-conses))))))
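
The same calculation with plain integers instead of pointers may make the
arithmetic easier to follow.  This is only a sketch: TARGET-HEAP-START and
CHUNKED-STATIC-CONS-BYTES are hypothetical names, the cons size is assumed
to be 16, and the (MAX 0 ...) clamp is an addition that is not in the code
above.  As far as I can tell from the arithmetic, without the clamp a
request for fewer conses than are already reserved can round to a negative
size, which would put the target above the current heap start.

```lisp
(defun chunked-static-cons-bytes (n)
  "Same rounding as ROOM-FOR-N-STATIC-CONSES, with the cons size assumed to be 16."
  (* (logandc2 (+ 32767 n) 32767) 16))

(defun target-heap-start (heap-start wanted reserved)
  "Integer address the heap start may drop to if WANTED static conses are
needed in total and RESERVED of them have already been reserved."
  ;; Only the conses that aren't reserved yet need more room; never
  ;; move the target above the current heap start.
  (- heap-start (chunked-static-cons-bytes (max 0 (- wanted reserved)))))

;; 100000 wanted, 32768 already reserved: 67232 more conses round up to
;; 3 chunks (98304 conses), or 1572864 bytes below the current start.
(target-heap-start 1000000000 100000 32768)   ; => 998427136
```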


So, if we promise not to save an image or use more static conses than we say
we will, we don't really need to keep the reserved memory between
(PURIFIED-END-ADDRESS) and (TARGET-DYNAMIC-HEAP-START-ADDRESS N) and can
ask the OS to make the intervening pages "free" rather than "mapped but
not accessible".

(defun unmap-reserved-static-space (nstatic-conses)
   (let* ((start (purified-end-address))
          (end (target-dynamic-heap-start-address nstatic-conses))
          (nbytes (- (%ptr-to-int end) (%ptr-to-int start))))
     (#_munmap start nbytes)))
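
One thing that may be worth checking before calling this: munmap can
reject a start address that isn't a multiple of the page size.  I haven't
checked whether the two addresses above are always page-aligned, so the
following is only a defensive sketch with plain integers.
PAGE-ALIGNED-RANGE is a hypothetical helper, and the 4096-byte page size
is an assumption.

```lisp
(defun page-aligned-range (start end &optional (page-size 4096))
  "Return the start address and byte count of the largest page-aligned
range that lies entirely within [START, END)."
  (let* ((aligned-start (* page-size (ceiling start page-size)))  ; round up
         (aligned-end (* page-size (floor end page-size))))       ; round down
    ;; If no whole page fits, report an empty range rather than a
    ;; negative size.
    (values aligned-start (max 0 (- aligned-end aligned-start)))))

(page-aligned-range 4097 12288)   ; => 8192, 4096
```

It also seems prudent to check that the value returned by #_munmap is 0
before assuming that anything was actually unmapped.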

There's more stuff (a few GB of it) in bitmaps and "auxiliary data structures"
that we could unmap, but it's probably a bit harder to find.  (You know what
they say about the first 127GB being the easiest.)


On Wed, 8 Jan 2014, David L. Rager wrote:

> Hi Gary,
>
> Indeed, I agree that worrying about 'top' output is a silly thing to
> do.  That being said, sometimes clusters (indeed, one that I use but
> do not have the rights to administer) use the amount of memory
> allocated as an indicator of how large the heap might be expected to
> grow.  If the cluster uses such a heuristic, it could prohibit any
> additional jobs from running on the machine that might soon have a
> process that will consume 512GB of memory.  Thus, all the CPUs and
> most of the memory on that machine are being reserved for CCL, when,
> in fact, CCL is only using one core and a couple GB of memory.
>
> I admit that this is my difficulty, and not Clozure's.  Furthermore, I
> tend to agree that Virtual Memory should be cheap to allocate and not
> cause problems.  This being said, if there are some constants in the
> source code that I could set to decrease the 512/128GB threshold, to,
> say, 16GB, that would be helpful.
>
> Thanks,
> David
>
>
>
> On Wed, Jan 8, 2014 at 1:30 PM, Gary Byers <gb at clozure.com> wrote:
>> The -R option controls how much address space is reserved for CCL; tools
>> like
>> 'top' report this as if it was interesting, and it usually isn't.
>>
>> On most 64-bit platforms, "reserving" 512GB of contiguous address space
>> costs essentially the same as reserving 1GB.  That "reserved address space"
>> isn't (yet) readable or writable; there's no physical or virtual memory
>> associated with it.  (I think that at least some versions of Windows
>> will create page-table entries for reserved pages.)  The effect of
>> reserving the address space is to make it difficult for random C code
>> to use address space that CCL's heap may eventually want to use.
>>
>> 512GB seems like a lot today; most of us would have difficulty actually
>> using that.  10 years from now (when we all have 1TB of some sort of
>> memory in our wristwatches) it may not seem that way.
>>
>> For the last few years, the story's been a little more complicated
>> than that: of that 512GB of reserved address space, a fixed amount
>> (I think that it's indeed 128GB) is reserved for "static" things (things
>> whose address won't change during a session, unlike the "dynamic" things
>> that the GC may always be shuffling around.)  The -R option basically
>> controls how much address space is reserved on top of that 128GB.
>>
>> (There are good reasons for making this 128GB a fixed limit; I don't
>> remember all of them at the moment, and they aren't any more interesting
>> than the rest of this.)  It is also true that using the -R option to change
>> the size of the reserved region can confuse ROOM; we have an open ticket
>> that complains about that and someday that'll probably be fixed.
>>
>> Unless there's some other reason to cause tools like 'top' to print
>> smaller numbers (e.g., to keep people from sending emails that say
>> "OMG!  Top says that I'm using 512GB of virtual memory!  That must
>> be why OSX is filling my disk with swap files!" [It isn't ...]), there's
>> little reason that I can think of for changing the default on 64-bit
>> platforms.
>>
>> On 32-bit machines, there's a whole lot less address space available;
>> typically,
>> the OS makes somewhere between 1 and 3 GB available to a userspace process,
>> and CCL usually tries to reserve a fairly large chunk of what's available
>> (leaving a smaller chunk for foreign/C code.)  In fairly rare cases, foreign
>> code that tries to allocate memory (#_malloc, #_mmap, etc.) can have those
>> attempts fail (not because there isn't enough physical/virtual memory
>> available but because there isn't enough address space to put it in.)  The
>> -R option was mostly intended to help deal with this case.
>>
>> Some early Linux x8664 kernels had difficulty reserving 512GB; from what
>> I could tell (in emailing people using such machines) this was one of
>> several
>> problems those kernels had.  It's not 2004 anymore (or so I've heard.)
>>
>>
>> On Wed, 8 Jan 2014, David L. Rager wrote:
>>
>>> Greetings,
>>>
>>> For bizarre reasons, I'm trying to get CCL to start with a lower
>>> amount of allocated virtual memory.  But it seems that CCL is mostly
>>> ignoring my request (according to top, CCL is still allocating 128g of
>>> memory on Linux kernel 2.6.32-358.18.1.el6.x86_64).  I've spent a lot
>>> of time searching for the cause, but it's time to ask whether there's
>>> anything wrong with my command:
>>>
>>>   scripts/ccl64 -R 4000000000
>>>
>>> I know that the topic "Heap Allocation" (found at
>>> http://ccl.clozure.com/manual/chapter17.3.html) uses the word "try" to
>>> describe the attempt at only reserving small amounts of heap space.
>>> But this is enough of a difficulty for my situation that I wanted to
>>> ask anyway.  Is there something else I can try?  Is there a value in
>>> the CCL source code that I can edit manually and then recompile?
>>>
>>> Thanks,
>>> David
>>> _______________________________________________
>>> Openmcl-devel mailing list
>>> Openmcl-devel at clozure.com
>>> http://clozure.com/mailman/listinfo/openmcl-devel
>>>
>>>
>>
>
>
-------------- next part --------------
(in-package "CCL")

(defun find-area (code)
  (let* ((p (%null-ptr)))
    (%setf-macptr-to-object p (%get-kernel-global all-areas))
    (if (eql (%get-object p target::area.code) code)
      p
      (do* ((q (%get-ptr p target::area.succ) (%get-ptr q target::area.succ)))
           ((eql p q))
        (when (eql (%get-object q target::area.code) code)
          (return q))))))

(defun dynamic-heap-start-address ()
  (%int-to-ptr (ash (%get-kernel-global heap-start) target::fixnumshift)))

(defun purified-end-address ()
  (let* ((a (find-area area-managed-static)))
    (if a
      (%get-ptr a target::area.high)
      (error "can't find the end of purified space"))))

(defun room-for-n-static-conses (n)
  (* (logandc2 (+ 32767 n) 32767) target::cons.size))

(defun target-dynamic-heap-start-address (nstatic-conses)
  (%inc-ptr
   (dynamic-heap-start-address) 
   (- (room-for-n-static-conses (- nstatic-conses (reserved-static-conses))))))

(defun unmap-reserved-static-space (nstatic-conses)
  (let* ((start (purified-end-address))
         (end (target-dynamic-heap-start-address nstatic-conses))
         (nbytes (- (%ptr-to-int end) (%ptr-to-int start))))
    (#_munmap start nbytes)))

