
Re: NetBSD dom0 PVH: hardware interrupts stalls



On 27.11.2020 11:59, Roger Pau Monné wrote:
> On Thu, Nov 26, 2020 at 06:20:34PM +0100, Manuel Bouyer wrote:
>> On Thu, Nov 26, 2020 at 04:09:37PM +0100, Roger Pau Monné wrote:
>>>>
>>>> Oh, that's actually very useful. The interrupt is being constantly
>>>> injected from the hardware and received by Xen, it's just not then
>>>> injected into dom0 - that's the bit we are missing. Let me look into
>>>> adding some more debug to that path, hopefully it will tell us where
>>>> things are getting blocked.
>>>
>>> So I have yet one more patch for you to try, this one has more
>>> debugging and a slight change in the emulated IO-APIC behavior.
>>> Depending on the result I might have to find a way to mask the
>>> interrupt so it doesn't spam the whole buffer in order for us to see
>>> exactly what triggered this scenario you are in.
>>
>> OK, here it is:
>> http://www-soc.lip6.fr/~bouyer/xen-log9.txt
>>
>> I had to restart from a clean source tree to apply this patch, so to make
>> sure we're in sync I attached the diff from my sources
> 
> I'm quite confused about why your trace doesn't even get into
> hvm_do_IRQ_dpci, so I've added some more debug info.

Are you sure it doesn't? I'm somewhat worried we may ...

> --- a/xen/drivers/passthrough/io.c
> +++ b/xen/drivers/passthrough/io.c
> @@ -828,6 +828,9 @@ int hvm_do_IRQ_dpci(struct domain *d, struct pirq *pirq)
>           !pirq_dpci || !(pirq_dpci->flags & HVM_IRQ_DPCI_MAPPED) )
>          return 0;
>  
> +    if ( pirq->pirq == TRACK_IRQ )
> +        debugtrace_printk("hvm_do_IRQ_dpci irq %u\n", pirq->pirq);

... take the early exit path just above here. I still couldn't
say why that would be: when I looked yesterday, I believe I
found that every failure path which leaves HVM_IRQ_DPCI_MAPPED
clear has a log message associated with it, yet Manuel said
there were no other log messages.

In the context of this I also started wondering whether it's
the right thing to do to start the EOI timer if the subsequent
call to send_guest_pirq() also doesn't actually send any event.
In this case the guest is effectively guaranteed to not handle
the interrupt. When the interrupt isn't shared, I think we
ought to ->end() it right away, but without unmasking it, to
unblock same or lower priority interrupts. What to do in the
shared case is less obvious to me ...
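
To make the suggestion above concrete, here is a rough sketch of the
decision I have in mind. It is a toy model, not Xen code: the enum,
the function name and both boolean inputs are made up for
illustration, and I'm simply assuming the caller can tell whether
send_guest_pirq() actually sent an event.

```c
#include <stdbool.h>

/* Hypothetical outcomes; none of these names exist in Xen. */
enum irq_followup {
    IRQ_START_EOI_TIMER,     /* event delivered: wait for the guest's EOI */
    IRQ_END_KEEP_MASKED,     /* ->end() right away, but leave it masked */
    IRQ_SHARED_UNDECIDED     /* shared line: right action still unclear */
};

/*
 * event_sent: assumed to reflect whether send_guest_pirq() actually
 *             sent an event to the guest.
 * shared:     whether the interrupt line is shared.
 */
static enum irq_followup decide_followup(bool event_sent, bool shared)
{
    if ( event_sent )
        return IRQ_START_EOI_TIMER;

    /* No event sent, so the guest will not handle the interrupt. */
    if ( !shared )
        return IRQ_END_KEEP_MASKED;

    return IRQ_SHARED_UNDECIDED;
}
```

The only point of the sketch is that the EOI timer would be started
solely when an event was really sent; the shared case is left open
on purpose.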

Jan



 

