
Re: domU memory exceeded =?=> spontaneous reboots



----------------------- Original message -----------------------
From: Mike <debian@xxxxxxxxxxxxxxxxxxxxx>
To: xen-users@xxxxxxxxxxxxxxxxxxxx
Date: Wed, 04 Dec 2024 23:21:38 +0100
----------------------------------------------------------------

One of the first commands that I try to execute in one
domU, which causes activity in the other domU, triggers a reboot.

So I pinned each of the domUs' vCPUs: one with `cpus = "4-5"` and the other
with `cpus = "all,^0-5"`. Also reduced `vcpus` and `maxvcpus` in the latter,
to avoid oversubscription. Testing...
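
To make it easier to see which physical CPUs those two `cpus` strings select, here is a minimal Python sketch that expands a simplified subset of the syntax (`all`, single ids, ranges, and `^` exclusions). The function name `expand_cpus` and the host size of 8 pCPUs are assumptions for illustration only; the thread does not state how many CPUs the host has, and this is not how xl itself parses the option.

```python
def expand_cpus(spec, host_cpus):
    """Expand a simplified xl-style cpus string into a sorted list of pCPU ids.

    Handles only: "all", "N", "N-M", and the same forms prefixed with "^"
    (exclusion). Other forms accepted by xl (e.g. "nodes:") are not covered.
    """
    included = set()
    excluded = set()
    for token in spec.split(","):
        token = token.strip()
        if not token:
            continue
        target = included
        if token.startswith("^"):
            # A leading "^" means "remove these CPUs from the selection".
            target = excluded
            token = token[1:]
        if token == "all":
            target.update(range(host_cpus))
        elif "-" in token:
            low, high = token.split("-", 1)
            target.update(range(int(low), int(high) + 1))
        else:
            target.add(int(token))
    # Drop excluded ids and anything beyond the (assumed) host CPU count.
    return sorted(c for c in included - excluded if c < host_cpus)


HOST_CPUS = 8  # hypothetical host size, not taken from the thread

first = expand_cpus("4-5", HOST_CPUS)
second = expand_cpus("all,^0-5", HOST_CPUS)
print(first, second)
```

Under that assumed host size the two sets do not overlap, so the two domUs would not compete for the same physical CPUs.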

I overcommit my CPUs probably more than I should and have never had a reboot caused by this. One of my systems currently has 5 times as many vCPUs assigned as physical cores. (This one is scheduled to be replaced.)
Apart from being a bit slow, it is rock solid.

So what "solved" the issue, or at least seems to avoid it for now, is increasing
the `memory` and `maxmem` of the first domU to `4096` and the `vcpus` and
`maxvcpus` to `3`. I probably didn't even need to increase the vCPUs; I think
it's a memory constraint issue.

How do you overcommit memory? In the default config, it won't let me start a domU if the memory is already full. I do, however, fix the dom0 memory so that it can never be taken away, using "dom0_mem=4096M,max:4096M" in the Xen boot options.
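
As a rough illustration of that budget, the following Python sketch checks whether another domU would still fit once dom0's fixed reservation and the running domUs are subtracted from host memory. The host total, the second domU's size, and the function name `can_start` are hypothetical; only the 4096 MiB figures for dom0 and the first domU appear in this thread. The check also ignores Xen's own overhead, so treat it as a simplification rather than the check the toolstack actually performs.

```python
def can_start(new_domu_mb, host_mb, dom0_mb, running_domus_mb):
    """Return True if a new domU of new_domu_mb MiB fits in what is left.

    Simplified model: free = host - dom0 reservation - sum of running domUs.
    Hypervisor overhead and ballooning are deliberately ignored.
    """
    free_mb = host_mb - dom0_mb - sum(running_domus_mb.values())
    return new_domu_mb <= free_mb


HOST_MB = 16384   # hypothetical host memory, not stated in the thread
DOM0_MB = 4096    # matches dom0_mem=4096M,max:4096M
running = {
    "domU-1": 4096,  # the increased `memory`/`maxmem` value from the thread
    "domU-2": 4096,  # hypothetical size for the second domU
}

print(can_start(4096, HOST_MB, DOM0_MB, running))  # fits exactly
print(can_start(4097, HOST_MB, DOM0_MB, running))  # one MiB too many
```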

If you do allow the memory to be removed, you risk the OOM killer killing something important. I would recommend monitoring the dom0 to see what happens there.


This seems likely to be a Xen bug. An attempt to overuse memory from inside
a domU shouldn't reboot the hardware.

It doesn't. The domU will crash, but it won't take down the hardware, unless you configured the domU to be privileged enough.

--
Joost




 

