
Re: Increasing domain memory beyond initial maxmem



On 31.03.22 14:01, Marek Marczykowski-Górecki wrote:
On Thu, Mar 31, 2022 at 08:41:19AM +0200, Juergen Gross wrote:
On 31.03.22 05:51, Marek Marczykowski-Górecki wrote:
Hi,

I'm trying to make use of CONFIG_XEN_BALLOON_MEMORY_HOTPLUG=y to increase
domain memory beyond the initial maxmem, but I hit a few issues.

A little context: domains in Qubes OS start with rather little memory
(400MB by default) but with maxmem set higher (4GB by default). Then there is
a qmemman daemon that adjusts balloon targets for domains based on (among
other things) demand reported by the domains themselves. There is also a
little swap, to mitigate qmemman latency (a few hundred ms at worst).
Having initial memory < maxmem in the PVH / HVM case makes use of PoD,
which I'm trying to get rid of. Also, IIUC, Linux will waste some
memory on bookkeeping based on maxmem rather than on actually usable memory.

First issue: after using `xl mem-max`, `xl mem-set` still refuses to
increase memory beyond the initial maxmem. That's because xl mem-max does
not update the 'memory/static-max' xenstore node. This one is easy to work
around.
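For reference, a sketch of that workaround for a hypothetical domain named
"guest", run in dom0 (the exact sizes are illustrative): `xl mem-max` raises
the hypervisor-side limit but not the xenstore node that `xl mem-set` checks,
so the node has to be written manually, in KiB.

```shell
# Hypothetical domain "guest"; sizes are examples only.
MAXMEM_MIB=2048
# memory/static-max is in KiB, hence MiB * 1024:
STATIC_MAX_KIB=$((MAXMEM_MIB * 1024))   # 2097152

# Guarded so the sketch is a no-op on a non-Xen machine.
if command -v xl >/dev/null 2>&1; then
    xl mem-max guest "$MAXMEM_MIB"
    xenstore-write "/local/domain/$(xl domid guest)/memory/static-max" "$STATIC_MAX_KIB"
    xl mem-set guest 2000
fi
```

The same three steps appear later in the thread with real domain IDs.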

Then, the actual hotplug fails on the domU side with:

[   50.004734] xen-balloon: vmemmap alloc failure: order:9, 
mode:0x4cc0(GFP_KERNEL|__GFP_RETRY_MAYFAIL), 
nodemask=(null),cpuset=/,mems_allowed=0
[   50.004774] CPU: 1 PID: 34 Comm: xen-balloon Not tainted 
5.16.15-1.37.fc32.qubes.x86_64 #1
[   50.004792] Call Trace:
[   50.004799]  <TASK>
[   50.004808]  dump_stack_lvl+0x48/0x5e
[   50.004821]  warn_alloc+0x162/0x190
[   50.004832]  ? __alloc_pages+0x1fa/0x230
[   50.004842]  vmemmap_alloc_block+0x11c/0x1c5
[   50.004856]  vmemmap_populate_hugepages+0x185/0x519
[   50.004868]  vmemmap_populate+0x9e/0x16c
[   50.004878]  __populate_section_memmap+0x6a/0xb1
[   50.004890]  section_activate+0x20a/0x278
[   50.004901]  sparse_add_section+0x70/0x160
[   50.004911]  __add_pages+0xc3/0x150
[   50.004921]  add_pages+0x12/0x60
[   50.004931]  add_memory_resource+0x12b/0x320
[   50.004943]  reserve_additional_memory+0x10c/0x150
[   50.004958]  balloon_thread+0x206/0x360
[   50.004968]  ? do_wait_intr_irq+0xa0/0xa0
[   50.004978]  ? decrease_reservation.constprop.0+0x2e0/0x2e0
[   50.004991]  kthread+0x16b/0x190
[   50.005001]  ? set_kthread_struct+0x40/0x40
[   50.005011]  ret_from_fork+0x22/0x30
[   50.005022]  </TASK>

Full dmesg: https://gist.github.com/marmarek/72dd1f9dbdd63cfe479c94a3f4392b45

After the above, `free` reports the correct size (1GB in this case), but
that memory seems to be effectively unusable: "used" stays low, and soon
the OOM killer kicks in.

I know the initial 400MB is not much for a full Linux, with X11 etc. But
I wouldn't expect it to fail this way when _adding_ memory.

I've also tried with an initial 800MB. In that case I no longer get the
"alloc failure", but, monitoring `free`, the extra memory still doesn't
seem to be used.

Any ideas?


I can't reproduce that.

I started a guest with 8GB of memory, in the guest I'm seeing:

# uname -a
Linux linux-d1cy 5.17.0-rc5-default+ #406 SMP PREEMPT Mon Feb 21 09:31:12
CET 2022 x86_64 x86_64 x86_64 GNU/Linux
# free
         total     used      free   shared  buff/cache   available
Mem:  8178260    71628   8023300     8560       83332     8010196
Swap: 2097132        0   2097132

Then I'm raising the memory for the guest in dom0:

# xl list
Name                ID   Mem VCPUs      State   Time(s)
Domain-0             0  2634     8     r-----    1016.5
Xenstore             1    31     1     -b----       0.9
sle15sp1             3  8190     6     -b----     184.6
# xl mem-max 3 10000
# xenstore-write /local/domain/3/memory/static-max 10240000
# xl mem-set 3 10000
# xl list
Name                ID   Mem VCPUs      State   Time(s)
Domain-0             0  2634     8     r-----    1018.5
Xenstore             1    31     1     -b----       1.0
sle15sp1             3 10000     6     -b----     186.7

In the guest I get now:

# free
         total     used     free   shared  buff/cache   available
Mem: 10031700   110904  9734172     8560      186624     9814344
Swap: 2097132        0  2097132

And after using lots of memory via a ramdisk:

# free
         total     used     free   shared  buff/cache   available
Mem: 10031700   116660  1663840  7181776     8251200     2635372
Swap: 2097132        0  2097132

You can see buff/cache is now larger than the initial total memory
and free is lower than the added memory amount.
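The ramdisk check above can be approximated without mounting a dedicated
tmpfs by writing into /dev/shm (an existing tmpfs on most Linux systems);
the SIZE_MIB value and the file name are illustrative. On a real check you
would write most of the newly added memory (e.g. ~7000 MiB) and watch the
pages land under "shared"/"buff/cache" in `free`:

```shell
# Fill tmpfs so the pages are accounted as shared/buff-cache, then inspect.
# SIZE_MIB is deliberately small here; raise it for a real test.
SIZE_MIB="${SIZE_MIB:-8}"
dd if=/dev/zero of=/dev/shm/fill bs=1M count="$SIZE_MIB" status=none
free -m
# Clean up so the pages are released again.
rm -f /dev/shm/fill
```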

Hmm, I see different behavior:

I'm starting with 800M

# uname -a
Linux personal 5.16.15-1.37.fc32.qubes.x86_64 #1 SMP PREEMPT Tue Mar 22 
12:59:53 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
# free -m
               total        used        free      shared  buff/cache   available
Mem:            740         209         278           2         252         415
Swap:          1023           0        1023

Then raising to ~2GB:

[root@dom0 ~]# xl list
Name                                        ID   Mem VCPUs      State   Time(s)
Domain-0                                     0  4082     6     r-----  184271.3
(...)
personal                                    21   800     2     -b----       4.8
[root@dom0 ~]# xl mem-max personal 2048
[root@dom0 ~]# xenstore-write /local/domain/$(xl domid 
personal)/memory/static-max $((2048*1024))
[root@dom0 ~]# xl mem-set personal 2000
[root@dom0 ~]# xenstore-ls -fp /local/domain/$(xl domid personal)/memory
/local/domain/21/memory/static-max = "2097152"   (n0,r21)
/local/domain/21/memory/target = "2048001"   (n0,r21)
/local/domain/21/memory/videoram = "-1"   (n0,r21)

And then observe inside domU:
[root@personal ~]# free -m
               total        used        free      shared  buff/cache   available
Mem:           1940         235        1452           2         252        1585
Swap:          1023           0        1023

So far so good. But when trying to actually use it, it doesn't work:

[root@personal ~]# free -m
               total        used        free      shared  buff/cache   available
Mem:           1940         196        1240         454         503        1206
Swap:          1023         472         551

As you can see, all the new memory is still in "free", and swap is used
instead.

Hmm, weird.

Maybe there are kernel config differences, or different udev rules (memory
onlining is done via udev in my guest)?

I'm seeing:

# zgrep MEMORY_HOTPLUG /proc/config.gz
CONFIG_ARCH_ENABLE_MEMORY_HOTPLUG=y
CONFIG_MEMORY_HOTPLUG=y
# CONFIG_MEMORY_HOTPLUG_DEFAULT_ONLINE is not set
CONFIG_XEN_BALLOON_MEMORY_HOTPLUG=y
CONFIG_XEN_MEMORY_HOTPLUG_LIMIT=512

The relevant udev rule seems to be:

SUBSYSTEM=="memory", ACTION=="add", PROGRAM=="/bin/sh -c '/usr/bin/systemd-detect-virt || :'", RESULT!="zvm", ATTR{state}=="offline", \
  ATTR{state}="online"
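If such a rule is absent or doesn't fire, the hotplugged blocks stay offline
and the added memory never becomes usable, which would match the symptoms
described. A minimal sketch of onlining the blocks by hand via the standard
memory-hotplug sysfs interface (the `online_all` helper name is made up):

```shell
# Bring every offline memory block under the given sysfs directory online.
online_all() {
    for f in "$1"/memory*/state; do
        [ -e "$f" ] || continue           # glob matched nothing
        if [ "$(cat "$f")" = "offline" ]; then
            echo online > "$f"            # needs root on a real system
        fi
    done
}

# On a real guest (as root):
# online_all /sys/devices/system/memory
```

Checking `cat /sys/devices/system/memory/memory*/state` after a mem-set would
show whether the new blocks were onlined at all.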

What type of guest are you using? Mine was a PVH guest.

There is also /proc/meminfo (state before filling ramdisk), if that
would give some hints:
[root@personal ~]# cat /proc/meminfo

...

No, I don't think this is helping. At least not me.


Juergen
