
Re: [Xen-devel] What's the effect of EXTRA_MEM_RATIO






On Tue, Jul 16, 2013 at 9:58 PM, Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx> wrote:
On Tue, Jul 16, 2013 at 09:47:57PM +0530, Rushikesh Jadhav wrote:
> On Tue, Jul 16, 2013 at 9:12 PM, Konrad Rzeszutek Wilk <
> konrad.wilk@xxxxxxxxxx> wrote:
>
> > On Wed, Jul 10, 2013 at 01:36:44AM +0530, Rushikesh Jadhav wrote:
> > > Sorry about the delayed response, but I've again been hit by this magic
> > > number 10.
> > >
> > > While reading and doing more work on this topic, I found a two-year-old
> > > commit which gives some clue:
> > >
> > https://github.com/torvalds/linux/commit/d312ae878b6aed3912e1acaaf5d0b2a9d08a4f11
> > >
> > > It says that the reserved low memory defaults to 1/32 of total RAM, so I
> > > think EXTRA_MEM_RATIO up to 32 should be OK, but that gives no clue about
> > > the number 10.
> > >
> > > Specifically, the exact commit
> > >
> > https://github.com/torvalds/linux/commit/698bb8d14a5b577b6841acaccdf5095d3b7c7389
> > > says that 10x seems like a reasonable balance, but could I make a pull
> > > request to change it to 16 or 20?
> >
> > Did you look at 'struct page' and how it is set up in the kernel?
> > Or rather, how much space it consumes?
> >
>
> Hi Konrad,
>
> I checked struct page but wasn't able to sum up its exact size for a PV
> kernel; it does live in lowmem, though. I did something else to tackle the
> EXTRA_MEM_RATIO problem myself.

What exactly is the problem statement?

Currently the default CentOS 6 kernel, 2.6.32, does not balloon beyond its start memory; this is a known bug in CentOS 6.
The alternative is an upstream 3.x kernel: with CentOS that seems to be 3.4.50, whereas Debian/Ubuntu ship 3.2.
The CentOS 3.4.50 kernel has all the support required to run as a dom0 kernel. I think you did most of this work :-)  Thanks.

A 3.x kernel serves as a PV guest without any modification, but for memory ballooning it gets stuck at 10x, and this 10 comes from EXTRA_MEM_RATIO. So I'm tracing where it comes from and looking for workarounds.
 
>
> There are a few situations:
> 1. A PV 3.4.50 kernel does not know about the static max mem for the domain
> & always starts with the base memory.

It does not? There aren't any hypercalls to figure this out?

I haven't seen any so far (it may be an oversight), but in linux-3.4.50/arch/x86/xen/setup.c I can only find this constant.

Below is the code

----
char * __init xen_memory_setup(void)
{
        static struct e820entry map[E820MAX] __initdata;
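        /* nr_pages is the domain's boot-time ("base") memory allocation */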
        unsigned long max_pfn = xen_start_info->nr_pages;
----

Here max_pfn is initialized with the base memory, and extra_pages are calculated and added to the map as a multiple of it.
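
For reference, further down in the same function (and quoted again later in this thread) the extra memory is clamped like this:

----
/* Allow at most EXTRA_MEM_RATIO times the (lowmem-capped) base size
 * as ballooning headroom beyond the initial allocation. */
extra_pages = min(EXTRA_MEM_RATIO * min(max_pfn, PFN_DOWN(MAXMEM)),
                  extra_pages);
----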

 

> 2. The scalability of the domain is decided by this EXTRA_MEM_RATIO, which
> is 10 by default.
> 3. The 10x scalability is always there irrespective of max mem (even if base
> mem = max mem), because it is the macro #define EXTRA_MEM_RATIO (10).
> 4. To achieve 10x scalability the guest kernel has to create page table
> entries and loses a considerable amount of RAM (rough math below). E.g. on a
> Debian guest with base & max mem = 512MB, with EXTRA_MEM_RATIO=10 the free
> command shows 327MB total memory, a loss of 512MB - 327MB = 185MB;
> on the same Debian guest with base & max mem = 512MB, with EXTRA_MEM_RATIO=1
> free shows 485MB total memory, a loss of only 512MB - 485MB = 27MB.
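>
> (A rough estimate of where that RAM goes, assuming around 64 bytes per
> struct page on x86-64: every 4 KiB page frame advertised in the E820 needs
> a struct page, i.e. about 64/4096 = 1.6% of the advertised span, plus PTEs
> for the 4 KiB direct mapping; for a 10x span over a 512MB base that comes
> to close to 100MB carved out of the base allocation before userspace runs.)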
>
> Now, to avoid this problem, I made extra_mem_ratio a kernel boot parameter,
> and I can now set "extra_mem_ratio" in grub.cfg depending on
> my expected scalability, e.g.:
>
> kernel /vmlinuz-3.4.50-8.el6.x86_64 ro root=/dev/mapper/vg_94762034-lv_root
> rd_LVM_LV=vg_94762034/lv_swap rd_NO_LUKS LANG=en_US.UTF-8
> rd_LVM_LV=vg_94762034/lv_root  KEYTABLE=us console=hvc0 rd_NO_MD quiet
> SYSFONT=latarcyrheb-sun16 rhgb crashkernel=auto *extra_mem_ratio=4* rd_NO_DM
>
> There is no need to recompile the guest kernel each time to change
> EXTRA_MEM_RATIO.
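>
> A minimal sketch of what that change might look like (illustrative only,
> not my exact patch; the helper name is made up). Note xen_memory_setup()
> runs before early_param handlers, so the value is read straight off
> boot_command_line:
>
> ----
> /* Tunable replacement for the old #define EXTRA_MEM_RATIO (10).
>  * Read "extra_mem_ratio=N" straight from the boot command line,
>  * since xen_memory_setup() runs before early_param handlers. */
> static unsigned long __init get_extra_mem_ratio(void)
> {
>         const char *opt = strstr(boot_command_line, "extra_mem_ratio=");
>
>         if (opt)
>                 return simple_strtoul(opt + strlen("extra_mem_ratio="),
>                                       NULL, 10);
>         return 10;      /* old default */
> }
>
> /* ...then in xen_memory_setup(), use get_extra_mem_ratio() in place
>  * of EXTRA_MEM_RATIO when clamping extra_pages. */
> ----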

Right.
>
> EXTRA_MEM_RATIO in kernel 3.x looks like a threat for PV Xen guests, as 10
> is a magic hard-coded figure for scalability.

Why not then just use the CONFIG_XEN_MEMORY_HOTPLUG mechanism, which
will allocate the 'struct page' structures within the newly added memory regions?


Do you mean CONFIG_XEN_BALLOON_MEMORY_HOTPLUG? I can't find CONFIG_XEN_MEMORY_HOTPLUG.
I'm rebuilding with CONFIG_XEN_BALLOON_MEMORY_HOTPLUG=y and will post the results soon.
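
For reference, on a 3.4-era tree that option sits under the Xen balloon driver and (if I read Kconfig right) needs memory hotplug enabled too, so the guest config ends up with something like:

----
CONFIG_MEMORY_HOTPLUG=y
CONFIG_XEN_BALLOON=y
CONFIG_XEN_BALLOON_MEMORY_HOTPLUG=y
----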
 
>
> Your views please ?
>
> With reference to highmem and lowmem, I found that lowmem is kernel
> space and highmem is user space. This means that the available RAM is
> divided, and the memory page structures are filled into lowmem, which could
> be 1/3 of base memory. So for bigger scalability, lowmem would be filled
> with page structures just to address the scalability.

Right, but this problem affects _only_ 32-bit guests. 64-bit guests don't
have highmem; everything is in 'lowmem'.

ok.
 
>
>
> > >
> > > Any ideas ?
> > >
> > >
> > > On Mon, Jun 3, 2013 at 11:20 PM, Konrad Rzeszutek Wilk <
> > > konrad.wilk@xxxxxxxxxx> wrote:
> > >
> > > > On Mon, Jun 03, 2013 at 09:58:36PM +0530, Rushikesh Jadhav wrote:
> > > > > On Mon, Jun 3, 2013 at 5:40 PM, Konrad Rzeszutek Wilk <
> > > > > konrad.wilk@xxxxxxxxxx> wrote:
> > > > >
> > > > > > On Sun, Jun 02, 2013 at 02:57:11AM +0530, Rushikesh Jadhav wrote:
> > > > > > > Hi guys,
> > > > > > >
> > > > > > > I'm fairly new to Xen development & trying to understand
> > > > > > > ballooning.
> > > > > >
> > > > > > OK.
> > > > > > >
> > > > > > > While compiling a DomU kernel I'm trying to understand the e820
> > > > > > > memory map w.r.t. Xen.
> > > > > > >
> > > > > > > I have modified arch/x86/xen/setup.c EXTRA_MEM_RATIO to 1 and can
> > > > > > > see that the guest cannot balloon up to more than 2GB. Below is
> > > > > > > the memory map of a DomU with max mem of 16GB.
> > > > > > >
> > > > > > > for EXTRA_MEM_RATIO  = 1
> > > > > > >
> > > > > > > BIOS-provided physical RAM map:
> > > > > > >  Xen: 0000000000000000 - 00000000000a0000 (usable)
> > > > > > >  Xen: 00000000000a0000 - 0000000000100000 (reserved)
> > > > > > >  Xen: 0000000000100000 - 0000000080000000 (usable)
> > > > > > >  Xen: 0000000080000000 - 0000000400000000 (unusable)
> > > > > > > NX (Execute Disable) protection: active
> > > > > > > DMI not present or invalid.
> > > > > > > e820 update range: 0000000000000000 - 0000000000010000 (usable) ==> (reserved)
> > > > > > > e820 remove range: 00000000000a0000 - 0000000000100000 (usable)
> > > > > > > No AGP bridge found
> > > > > > > last_pfn = 0x80000 max_arch_pfn = 0x400000000
> > > > > > > initial memory mapped : 0 - 0436c000
> > > > > > > Base memory trampoline at [ffff88000009b000] 9b000 size 20480
> > > > > > > init_memory_mapping: 0000000000000000-0000000080000000
> > > > > > >  0000000000 - 0080000000 page 4k
> > > > > > > kernel direct mapping tables up to 80000000 @ bfd000-1000000
> > > > > > > xen: setting RW the range fd6000 - 1000000
> > > > > > >
> > > > > > >
> > > > > > > For EXTRA_MEM_RATIO = 10 the map is like below, and the guest can
> > > > > > > balloon up to 16GB.
> > > > > > >
> > > > > >
> > > > > > Right, that is the default value.
> > > > > >
> > > > >
> > > > > What are the good or bad effects of making it 20?
> > > > > I found that increasing this number causes base memory to fill up (by
> > > > > many MBs) and increases the range of Base~Max.
> > > >
> > > > That sounds about right. I would suggest you look in the free Linux
> > > > kernel book and look at the section that deals with 'struct page',
> > > > lowmem and highmem. That should explain what is consuming the lowmem
> > > > memory.
> > > >
> > > > >
> > > > >
> > > > > >
> > > > > > > BIOS-provided physical RAM map:
> > > > > > >  Xen: 0000000000000000 - 00000000000a0000 (usable)
> > > > > > >  Xen: 00000000000a0000 - 0000000000100000 (reserved)
> > > > > > >  Xen: 0000000000100000 - 0000000400000000 (usable)
> > > > > > > NX (Execute Disable) protection: active
> > > > > > > DMI not present or invalid.
> > > > > > > e820 update range: 0000000000000000 - 0000000000010000 (usable) ==> (reserved)
> > > > > > > e820 remove range: 00000000000a0000 - 0000000000100000 (usable)
> > > > > > > No AGP bridge found
> > > > > > > last_pfn = 0x400000 max_arch_pfn = 0x400000000
> > > > > > > last_pfn = 0x100000 max_arch_pfn = 0x400000000
> > > > > > > initial memory mapped : 0 - 0436c000
> > > > > > > Base memory trampoline at [ffff88000009b000] 9b000 size 20480
> > > > > > > init_memory_mapping: 0000000000000000-0000000100000000
> > > > > > >  0000000000 - 0100000000 page 4k
> > > > > > > kernel direct mapping tables up to 100000000 @ 7fb000-1000000
> > > > > > > xen: setting RW the range fd6000 - 1000000
> > > > > > > init_memory_mapping: 0000000100000000-0000000400000000
> > > > > > >  0100000000 - 0400000000 page 4k
> > > > > > > kernel direct mapping tables up to 400000000 @ 601ef000-62200000
> > > > > > > xen: setting RW the range 619fb000 - 62200000
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > Can someone please help me understand its behavior and importance?
> > > > > >
> > > > > > Here is the explanation from the code:
> > > > > >
> > > > > > /*
> > > > > >  * Clamp the amount of extra memory to a EXTRA_MEM_RATIO
> > > > > >  * factor the base size.  On non-highmem systems, the base
> > > > > >  * size is the full initial memory allocation; on highmem it
> > > > > >  * is limited to the max size of lowmem, so that it doesn't
> > > > > >  * get completely filled.
> > > > > >  *
> > > > > >
> > > > >
> > > > > "highmem is limited to the max size of lowmem"
> > > > > Does it mean "1/3", the maximum possible memory, or the startup memory?
> > > >
> > > > For my answer to make sense I would steer you toward looking at what
> > > > highmem and lowmem are. That should give you an idea of the memory
> > > > limitations 32-bit kernels have.
> > > > > In what cases can it get completely filled?
> > > >
> > > > Yes.
> > > > >
> > > > >
> > > > > >  * In principle there could be a problem in lowmem systems if
> > > > > >  * the initial memory is also very large with respect to
> > > > > >  * lowmem, but we won't try to deal with that here.
> > > > > >  */
> > > > > > extra_pages = min(EXTRA_MEM_RATIO * min(max_pfn, PFN_DOWN(MAXMEM)),
> > > > > >                   extra_pages);
> > > > > >
> > > > > > I am unclear on what exactly you want to learn: the hypercalls, or
> > > > > > how the ballooning happens? If so, I would recommend you work
> > > > > > backwards - look at the balloon driver itself, how it
> > > > > > decreases/increases the memory, and what data structures it uses to
> > > > > > figure out how much memory it can use. Then you can go back to
> > > > > > setup.c to get an idea of how the E820 is being created.
> > > > > >
> > > > > >
> > > > > Thanks. I'll dig further into drivers/xen/balloon.c.
> > > > >
> > > > >
> > > > > >
> > > > > > >
> > > > > > > Thanks.
> > > > > >
> > > > > > > _______________________________________________
> > > > > > > Xen-devel mailing list
> > > > > > > Xen-devel@xxxxxxxxxxxxx
> > > > > > > http://lists.xen.org/xen-devel
> > > > > >
> > > > > >
> > > >
> >

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel

 

