Re: [PATCH] x86/pod: Do not fragment PoD memory allocations

On 26.01.2021 18:51, Elliott Mitchell wrote:
> On Tue, Jan 26, 2021 at 12:08:15PM +0100, Jan Beulich wrote:
>> On 25.01.2021 18:46, Elliott Mitchell wrote:
>>> On Mon, Jan 25, 2021 at 10:56:25AM +0100, Jan Beulich wrote:
>>>> On 24.01.2021 05:47, Elliott Mitchell wrote:
>>>>> ---
>>>>> Changes in v2:
>>>>> - Include the obvious removal of the goto target.  Always realize you're
>>>>>   at the wrong place when you press "send".
>>>> Please could you also label the submission then accordingly? I
>>>> got puzzled by two identically titled messages side by side,
>>>> until I noticed the difference.
>>> Sorry about that.  Would you have preferred a third message mentioning
>>> this mistake?
>> No. But I'd have expected v2 to have its subject start with
>> "[PATCH v2] ...", making it relatively clear that one might
>> save looking at the one labeled just "[PATCH] ...".
> Yes, in fact I spotted this just after.  I was in a situation of, "does
> this deserve sending an additional message out?"  (ugh, yet more damage
> from that issue...)
>>>>> I'm not including a separate cover message since this is a single hunk.
>>>>> This really needs some checking in `xl`.  If one has a domain which
>>>>> sometimes gets started on different hosts and is sometimes modified with
>>>>> slightly differing settings, one can run into trouble.
>>>>> In this case the particular domain is most often used as PV/PVH, but
>>>>> every so often it is used as a template for HVM.  Starting it
>>>>> HVM will trigger PoD mode.  If it is started on a machine with less
>>>>> memory than others, PoD may well exhaust all memory and then trigger a
>>>>> panic.
>>>>> `xl` should likely fail HVM domain creation when the maximum memory
>>>>> exceeds available memory (never mind total memory).
>>>> I don't think so, no - it's the purpose of PoD to allow starting
>>>> a guest despite there not being enough memory available to
>>>> satisfy its "max", as such guests are expected to balloon down
>>>> immediately, rather than triggering an oom condition.
>>> Even Qemu/OVMF is expected to handle ballooning for a *HVM* domain?
>> No idea how qemu comes into play here. Any preboot environment
>> aware of possibly running under Xen of course is expected to
>> tolerate running with maxmem > memory (or clearly document its
>> inability, in which case it may not be suitable for certain
>> use cases). For example, I don't see why a preboot environment
>> would need to touch all of the memory in a VM, except maybe
>> for the purpose of zeroing it (which PoD can deal with fine).
> I'm reading that as your answer to the above question being "yes".

For the OVMF part of your question.

>>>>> For example try a domain with the following settings:
>>>>> memory = 8192
>>>>> maxmem = 2147483648
>>>>> If type is PV or PVH, it will likely boot successfully.  Change type to
>>>>> HVM and unless your hardware budget is impressive, Xen will soon panic.
>>>> Xen will panic? That would need fixing if so. Also I'd consider
>>>> an excessively high maxmem (compared to memory) a configuration
>>>> error. According to my experiments long, long ago I seem to
>>>> recall that a factor beyond 32 is almost never going to lead to
>>>> anything good, irrespective of guest type. (But as said, badness
>>>> here should be restricted to the guest; Xen itself should limp
>>>> on fine.)
>>> I'll confess I haven't confirmed the panic is in Xen itself.  Problem is
>>> when this gets triggered, by the time the situation is clear and I can
>>> get to the console the computer is already restarting, thus no error
>>> message has been observed.
>> If the panic isn't in Xen itself, why would the computer be
>> restarting?
> I was thinking there was a possibility it is actually Domain 0 which is
> panicking.

Which wouldn't be any different in how it would need dealing with.

>>> This is most certainly a configuration error.  Problem is this is a very
>>> small delta between a perfectly valid configuration and the one which
>>> reliably triggers a panic.
>>> The memory:maxmem ratio isn't the problem.  My example had a maxmem of
>>> 2147483648 since that is enough to exceed the memory of sub-$100K
>>> computers.  The crucial features are maxmem >= machine memory,
>>> memory < free memory (thus potentially bootable PV/PVH) and type = "hvm".
>>> When was the last time you tried running a Xen machine with near zero
>>> free memory?  Perhaps in the past Xen kept the promise of never panicking
>>> on memory exhaustion, but this feels like this hasn't held for some time.
>> Such bugs need fixing. Which first of all requires properly
>> pointing them out. A PoD guest misbehaving when there's not
>> enough memory to fill its pages (i.e. before it manages to
>> balloon down) is expected behavior. If you can't guarantee the
>> guest ballooning down quickly enough, don't configure it to
>> use PoD. A PoD guest causing a Xen crash, otoh, is very likely
>> even a security issue. Which needs to be treated as such: It
>> needs fixing, not avoiding by "curing" one of perhaps many
>> possible sources.
> Okay, this has been reliably reproducing for a while.  I had originally
> thought it was a problem of HVM plus memory != maxmem, but the
> non-immediate restart disagrees with that assessment.

I guess it's not really clear what you mean by this, but anyway:
The important aspect here that I'm concerned about is what the
manifestations of the issue are. I'm still hoping that you will
provide such information, so we can then start thinking about how
to solve these. That is, of course, if there is anything worse than
the expected effects which use of PoD can have on the guest itself.
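
For illustration only, the conditions named in this thread (type = "hvm",
memory < free memory, maxmem >= machine memory, plus the rule of thumb that
a maxmem:memory factor beyond 32 rarely ends well) can be written down as a
small check. This is a sketch under assumptions, not part of the patch and
not something `xl` does: `DomainConfig`, `pod_risk_flags` and the flag
names are hypothetical, all sizes are assumed to be in MiB, and the host
figures are assumed to be supplied by the caller.

```python
from dataclasses import dataclass


@dataclass
class DomainConfig:
    """Hypothetical stand-in for the relevant config fields (sizes in MiB)."""
    guest_type: str   # "pv", "pvh" or "hvm"
    memory_mib: int   # the "memory" setting
    maxmem_mib: int   # the "maxmem" setting


def pod_risk_flags(cfg, host_total_mib, host_free_mib, max_ratio=32):
    """Return a list of flags for the risky combinations described in the thread."""
    flags = []
    # Assumption taken from the thread: PoD only comes into play for an HVM
    # guest whose maxmem is larger than its memory.
    if cfg.guest_type != "hvm" or cfg.maxmem_mib <= cfg.memory_mib:
        return flags
    # Guest looks startable (memory fits in free memory), yet maxmem is at
    # least as large as everything the machine has.
    if cfg.memory_mib < host_free_mib and cfg.maxmem_mib >= host_total_mib:
        flags.append("maxmem-exceeds-host")
    # Rule of thumb from the thread: a factor beyond 32 is rarely a good idea.
    if cfg.maxmem_mib > max_ratio * cfg.memory_mib:
        flags.append("ratio-above-threshold")
    return flags


# The example settings from the thread, on a hypothetical 64 GiB host
# with 32 GiB currently free.
example = DomainConfig(guest_type="hvm", memory_mib=8192, maxmem_mib=2147483648)
print(pod_risk_flags(example, host_total_mib=65536, host_free_mib=32768))
```

Whether such a check should warn or refuse is exactly the point under
discussion above; the sketch only makes the conditions explicit.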



