
Re: handle_pio looping during domain shutdown, with qemu 4.2.0 in stubdom



On 05.06.2020 13:18, Marek Marczykowski-Górecki wrote:
> On Fri, Jun 05, 2020 at 11:38:17AM +0200, Jan Beulich wrote:
>> On 04.06.2020 03:46, Marek Marczykowski-Górecki wrote:
>>> Hi,
>>>
>>> (continuation of a thread from #xendevel)
>>>
>>> During system shutdown I quite often hit an infinite stream of errors
>>> like this:
>>>
>>>     (XEN) d3v0 Weird PIO status 1, port 0xb004 read 0xffff
>>>     (XEN) domain_crash called from io.c:178
>>>
>>> This is all running on Xen 4.13.0 (I think I've got this with 4.13.1
>>> too), nested within KVM. The KVM part means everything is very slow, so
>>> various race conditions are much more likely to happen.
>>>
>>> It started happening not long ago, and I'm pretty sure it's related to
>>> the update to qemu 4.2.0 (in a Linux stubdom); the previous version was
>>> 3.0.0.
>>>
>>> Thanks to Andrew and Roger, I've managed to collect more info.
>>>
>>> Context:
>>>     dom0: pv
>>>     dom1: hvm
>>>     dom2: stubdom for dom1
>>>     dom3: hvm
>>>     dom4: stubdom for dom3
>>>     dom5: pvh
>>>     dom6: pvh
>>>
>>> It starts OK, I think:
>>>
>>>     (XEN) hvm.c:1620:d6v0 All CPUs offline -- powering off.
>>>     (XEN) d3v0 handle_pio port 0xb004 read 0x0000
>>>     (XEN) d3v0 handle_pio port 0xb004 read 0x0000
>>>     (XEN) d3v0 handle_pio port 0xb004 write 0x0001
>>>     (XEN) d3v0 handle_pio port 0xb004 write 0x2001
>>>     (XEN) d4v0 XEN_DMOP_remote_shutdown domain 3 reason 0
>>
>> I can't seem to spot the call site of this in any of qemu, libxl, or
>> libxc. I'm particularly curious about what further actions are taken
>> on the domain after this is invoked: do any ioreq servers get
>> unregistered immediately (which I think would be a problem)?
> 
> It is here:
> https://github.com/qemu/qemu/blob/master/hw/i386/xen/xen-hvm.c#L1539
> 
> I think it's called from cpu_handle_ioreq(), and I think the request
> state is set to STATE_IORESP_READY before exiting (unless there is some
> exit() hidden in another function used there).

Thanks. There's nothing in the surrounding code there that would
unregister an ioreq server. But as said elsewhere, I don't know qemu
very well, so I may easily be overlooking some other way for one to get
unregistered prematurely.
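
To make the concern concrete, here is a rough toy model of the ioreq
handshake being discussed. It is only a sketch, not actual Xen or qemu
code: the four state names and values follow Xen's public ioreq.h, but
Ioreq, ToyDeviceModel and toy_send are hypothetical names made up for
illustration, and treating bit 0x2000 of the write to port 0xb004 as the
shutdown trigger is an assumption based on the log above.

```python
from dataclasses import dataclass
from enum import IntEnum


class IoreqState(IntEnum):
    # Names and values as in Xen's public hvm/ioreq.h.
    STATE_IOREQ_NONE = 0
    STATE_IOREQ_READY = 1
    STATE_IOREQ_INPROCESS = 2
    STATE_IORESP_READY = 3


@dataclass
class Ioreq:
    # Hypothetical, heavily reduced stand-in for an ioreq slot.
    port: int
    is_read: bool
    data: int = 0
    state: IoreqState = IoreqState.STATE_IOREQ_NONE


class ToyDeviceModel:
    """Hypothetical stand-in for the device model in the stubdom."""

    def __init__(self):
        self.registered = True           # ioreq server still registered?
        self.shutdown_requested = False  # stands in for the remote shutdown call

    def handle(self, req):
        if req.state != IoreqState.STATE_IOREQ_READY:
            raise RuntimeError("request not ready")
        req.state = IoreqState.STATE_IOREQ_INPROCESS
        if req.is_read:
            req.data = 0x0000
        elif req.port == 0xB004 and req.data & 0x2000:
            # Assumption: this write is what triggers the shutdown request.
            self.shutdown_requested = True
        # The response is marked ready before the handler returns.
        req.state = IoreqState.STATE_IORESP_READY


def toy_send(dm, req):
    """Hypothetical hypervisor side: True if the request was completed."""
    if not dm.registered:
        # Nobody left to service the port: the request cannot complete.
        return False
    req.state = IoreqState.STATE_IOREQ_READY
    dm.handle(req)
    completed = req.state == IoreqState.STATE_IORESP_READY
    req.state = IoreqState.STATE_IOREQ_NONE
    return completed
```

In this toy, the 0x2001 write completes normally as long as the server
stays registered; a later access to the same port after a premature
unregistration has nobody to service it, which is the kind of situation
I'd be worried about.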

Jan
