
Re: Linux PV/PVH domU crash on (guest) resume from suspend



On 19.02.2021 13:48, Jürgen Groß wrote:
> On 17.02.21 14:48, Marek Marczykowski-Górecki wrote:
>> On Wed, Feb 17, 2021 at 07:51:42AM +0100, Jürgen Groß wrote:
>>> On 17.02.21 06:12, Marek Marczykowski-Górecki wrote:
>>>> Hi,
>>>>
>>>> I'm observing a Linux PV/PVH guest crash when I resume it from sleep. I do
>>>> this with:
>>>>
>>>>       virsh -c xen dompmsuspend <vmname> mem
>>>>       virsh -c xen dompmwakeup <vmname>
>>>>
>>>> But it's possible to trigger it with plain xl too:
>>>>
>>>>       xl save -c <vmname> <some-file>
>>>>
>>>> The same on HVM works fine.
>>>>
>>>> This is on Xen 4.14.1, and with guest kernel 5.4.90, the same happens
>>>> with 5.4.98. Dom0 kernel is the same, but I'm not sure if that's
>>>> relevant here. I can reliably reproduce it.
>>>
>>> This is already on my list of issues to look at.
>>>
>>> The problem seems to be related to the XSA-332 patches. You could try
>>> the patches I've sent out recently addressing other fallout from XSA-332
>>> which _might_ fix this issue, too:
>>>
>>> https://patchew.org/Xen/20210211101616.13788-1-jgross@xxxxxxxx/
>>
>> Thanks for the patches. Sadly it doesn't change anything - I get exactly
>> the same crash. I applied that on top of 5.11-rc7 (that's what I had
>> handy). If you think there may be a difference with the final 5.11 or
>> another branch, please let me know.
>>
> 
> Some more tests reveal that this seems to be a hypervisor regression.
> I can reproduce the very same problem with a 4.12 kernel from 2019.
> 
> It seems as if the EVTCHNOP_init_control hypercall is returning
> -EINVAL when the domain is continuing to run after the suspend
> hypercall (in contrast to the case where a new domain has been created
> when doing a "xl restore").

But when you resume the same domain, the kernel isn't supposed to
call EVTCHNOP_init_control, as that's a one-time operation (per
vCPU, and unless EVTCHNOP_reset was called, of course). In the
hypervisor, map_control_block() has (and always has had) as its first step

    if ( v->evtchn_fifo->control_block )
        return -EINVAL;

Re-setup is needed only when resuming in a new domain.
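
To make the expected behaviour concrete, here is a minimal standalone
model of that rule. It is only a sketch and not actual Xen or Linux code:
struct vcpu_model, model_init_control() and model_guest_resume() are
hypothetical names made up for illustration, and only the -EINVAL check
mirrors the hypervisor snippet above.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-vCPU FIFO event channel state. */
struct vcpu_model {
    void *control_block;   /* non-NULL once a control block is mapped */
};

/* Models the guard quoted above: a second init on the same vCPU fails. */
static int model_init_control(struct vcpu_model *v, void *block)
{
    if (v->control_block)
        return -EINVAL;
    v->control_block = block;
    return 0;
}

/*
 * Hypothetical guest-side resume helper: redo the setup only when the
 * guest came back in a new domain, i.e. the suspend was not cancelled.
 */
static int model_guest_resume(struct vcpu_model *v, void *block,
                              bool suspend_cancelled)
{
    if (suspend_cancelled)
        return 0;          /* same domain: control block is still mapped */
    return model_init_control(v, block);
}
```

In this model a resume in the same domain skips the init call and
succeeds, whereas calling model_init_control() again on an already
initialized vCPU returns -EINVAL, which matches the symptom described.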

Jan



 

