
Re: [Xen-devel] xen: memory initialization/balloon fixes (#3)

On 23/09/11 00:46, Dan Magenheimer wrote:
>> From: Jeremy Fitzhardinge [mailto:jeremy@xxxxxxxx]
>> On 09/22/2011 03:34 PM, Dan Magenheimer wrote:
>>>> I'm aware of that... "some" has been a fixed size of a few megabytes
>>>> in Xen for a long time.  I am seeing 30-60MB or more.
>>> Never mind on this part.  After further debugging, I can see
>>> that this difference is due to normal uses of memory by the
>>> kernel for XEN PAGETABLES and RAMDISK etc.  It's unfortunate
>>> that the difference is so large, but I guess that's in part due
>>> to the desire to use the same kernel binary for native and
>>> virtualized.  I don't remember it being nearly so high for
>>> older PV kernels, but I guess it's progress! :-}
>> I don't think the Xen parts allocate/reserves lots of memory
>> unnecessarily, so it shouldn't be too different from the 2.6.18-xen
>> kernels.  They do reserve various chunks of memory, but for things like
>> RAMDISK I think they get released again (and anyway, I don't think
>> that's going to be anywhere near 30MB, let alone 60).  I'm not very
>> confident in those /proc/meminfo numbers - they may count memory as
>> "reserved" if its in a reserved region even if the pages themselves have
>> been released to the kernel pool.
> No, the first line of /proc/meminfo is precisely "totalram_pages".

I think most of the increase in reserved memory compared to classic Xen
kernels is the change to using the generic SWIOTLB.  This is up to 64 MiB.

>>>>>> Part B of the problem (and the one most important to me) is that
>>>>>> setting /sys/devices/system/xen_memory/xen_memory0/target_kb
>>>>>> to X results in a MemTotal inside the domU (as observed by
>>>>>> "head -1 /proc/meminfo") of X-D.  This can be particularly painful
>>>>>> when X is aggressively small as X-D may result in OOMs.
>>>>>> To use kernel function/variable names (and I observed this with
>>>>>> some debugging code), when balloon_set_new_target(X) is called
>>>>>> totalram_pages gets driven to X-D.
>>>>> Again, this looks like the correct behavior to me.
>>>> Hmmm... so if a user (or automated tool) uses the Xen-defined
>>>> API (i.e. /sys/devices/system/xen_memory/xen_memory0/target_kb)
>>>> to use the Xen balloon driver to attempt to reduce memory usage
>>>> to 100MB, and the Xen balloon driver instead reduces it to
>>>> some random number somewhere between 40MB and 90MB, which
>>>> may or may not cause OOMs, you consider this correct behavior?
>>> I still think this is a bug but apparently orthogonal to
>>> your patchset.  So sorry to bother you.
>> If you ask for 100MB, it should never try to make the domain smaller
>> than that; if it does, it suggests the number is being misparsed or
>> something.
> OK then balloon_stats.current_pages can never be larger than totalram_pages.
> Which means that balloon_stats.current_pages must always grow
> and shrink when totalram_pages does (which is true now only in
> the balloon driver code).  Which means, I think:
> balloon_stats.current_pages is just plain wrong!  It doesn't need to
> exist!  If we replace every instance in balloon.c with totalram_pages,
> I think everything just works.  Will run some tests tomorrow.

No.  balloon_stats.current_pages is the number of pages used by the
domain from Xen's point of view (and must be equal to the amount
reported by xl top).  It is not what the guest kernel thinks is the
number of usable pages.

Because totalram_pages doesn't include some reserved pages,
balloon_stats.current_pages will necessarily always be greater.

If you're attempting to make the domain self-balloon I don't see why
you're even interested in the total number of pages.  Surely it's the
number of free pages that's useful?

e.g., a basic self-ballooning algorithm would be something like:

   delta = free_pages - emergency_reserve - spare
   reservation_target -= delta


free_pages is the current number of free pages.

emergency_reserve is the number of pages the kernel reserves for
satisfying important allocations when memory is low.  This is
approximately (initial_maximum_reservation / 32).

spare is some extra number of pages to provide a buffer when memory
usage increases.
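To make the arithmetic concrete, here is a minimal sketch of one
self-ballooning step as described above.  The function name
selfballoon_step and the worked numbers are illustrative only; the
variable names and the ~1/32 reserve factor come from the description
above, not from any actual Xen or kernel code.

```python
def selfballoon_step(reservation_target, free_pages,
                     initial_max_reservation, spare):
    """Return a new balloon target (in pages) for one adjustment step.

    emergency_reserve approximates the pages the kernel keeps back for
    important allocations under memory pressure, roughly 1/32 of the
    initial maximum reservation (per the description above).
    """
    emergency_reserve = initial_max_reservation // 32
    # Pages we can give back: free pages minus the reserve and a spare
    # buffer against future allocation spikes.
    delta = free_pages - emergency_reserve - spare
    # Shrink the reservation by that amount (delta may be negative,
    # in which case the target grows again).
    return reservation_target - delta
```

For example, a domain with an initial maximum of 262144 pages (1 GiB
with 4 KiB pages), 100000 pages currently free, and a 4096-page spare
buffer would shrink its target by 100000 - 8192 - 4096 = 87712 pages.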


Xen-devel mailing list


