[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH] x86/dpci: remove the dpci EOI timer



On Wed, Jan 13, 2021 at 1:06 PM Roger Pau Monné <roger.pau@xxxxxxxxxx> wrote:
>
> On Wed, Jan 13, 2021 at 10:48:52AM -0500, Jason Andryuk wrote:
> > On Wed, Jan 13, 2021 at 8:11 AM Roger Pau Monné <roger.pau@xxxxxxxxxx> 
> > wrote:
> > >
> > > On Wed, Jan 13, 2021 at 06:21:03AM +0000, Tian, Kevin wrote:
> > > > > From: Roger Pau Monne <roger.pau@xxxxxxxxxx>
> > > > > Sent: Wednesday, January 13, 2021 1:33 AM
> > > > >
> > > > > Current interrupt pass though code will setup a timer for each
> > > > > interrupt injected to the guest that requires an EOI from the guest.
> > > > > Such timer would perform two actions if the guest doesn't EOI the
> > > > > interrupt before a given period of time. The first one is deasserting
> > > > > the virtual line, the second is perform an EOI of the physical
> > > > > interrupt source if it requires such.
> > > > >
> > > > > The deasserting of the guest virtual line is wrong, since it messes
> > > > > with the interrupt status of the guest. It's not clear why this was
> > > > > odne in the first place, it should be the guest the one to EOI the
> > > > > interrupt and thus deassert the line.
> > > > >
> > > > > Performing an EOI of the physical interrupt source is redundant, since
> > > > > there's already a timer that takes care of this for all interrupts,
> > > > > not just the HVM dpci ones, see irq_guest_action_t struct eoi_timer
> > > > > field.
> > > > >
> > > > > Since both of the actions performed by the dpci timer are not
> > > > > required, remove it altogether.
> > > > >
> > > > > Signed-off-by: Roger Pau Monné <roger.pau@xxxxxxxxxx>
> > > > > ---
> > > > > As with previous patches, I'm having a hard time figuring out why this
> > > > > was required in the first place. I see no reason for Xen to be
> > > > > deasserting the guest virtual line. There's a comment:
> > > > >
> > > > > /*
> > > > >  * Set a timer to see if the guest can finish the interrupt or not. 
> > > > > For
> > > > >  * example, the guest OS may unmask the PIC during boot, before the
> > > > >  * guest driver is loaded. hvm_pci_intx_assert() may succeed, but the
> > > > >  * guest will never deal with the irq, then the physical interrupt 
> > > > > line
> > > > >  * will never be deasserted.
> > > > >  */
> > > > >
> > > > > Did this happen because the device was passed through in a bogus state
> > > > > where it would generate interrupts without the guest requesting
> > > >
> > > > It could be a case where two devices share the same interrupt line and
> > > > are assigned to different domains. In this case, the interrupt activity 
> > > > of
> > > > two devices interfere with each other.
> > >
> > > This would also seem to be problematic if the device decides to use
> > > MSI instead of INTx, but due to the shared nature of the INTx line we
> > > would continue to inject INTx (generated from another device not
> > > passed through to the guest) to the guest even if the device has
> > > switched to MSI.
> > >
> > > > >
> > > > > Won't the guest face the same issues when booted on bare metal, and
> > > > > thus would already have the means to deal with such issues?
> > > >
> > > > The original commit was added by me in ~13yrs ago (0f843ba00c95)
> > > > when enabling Xen in a client virtualization environment where interrupt
> > > > sharing is popular.
> > >
> > > Thanks, the reference to the above commit is helpful, I wasn't able to
> > > find it and it contains a comment that I think has been lost, which
> > > provides the background on why this was added.
> > >
> > > > I believe above comment was recorded for a real
> > > > problem at the moment (deassert resets the intx line to unblock further
> > > > interrupts). But I'm not sure whether it is still the case after both 
> > > > Xen and
> > > > guest OS have changed a lot. At least some test from people who
> > > > still use Xen in shared interrupt scenario would be helpful. Or, if such
> > > > usage is already niche, maybe we can consider disallow passing through
> > > > devices which share the same interrupt line to different domains and
> > > > then safely remove this dpci EOI trick.
> > >
> > > So the deassert done by timeout only deasserts the virtual line, but
> > > doesn't for example clear the IRR bit from the vIO-APIC pin, which
> > > will cause further interrupts to not be delivered anyway until a
> > > proper EOI (or a switch to trigger mode) is done to the pin.
> > >
> > > I think it's going to be complicated for me to find a system that has
> > > two devices I can passthrough sharing the same GSI.
> >
> > I have some laptops running OpenXT where the USB controller and NIC
> > share an interrupt, and I assign them to different domains.  Qubes
> > would hit this as well.
>
> Is there any chance you could try the patch and see if you can hit the
> issue it was trying to fix?

Would testing a backport to 4.12 be useful?  There were some file
renames, but it looks to apply.  The only difference is the 4.12
hvm_pirq_eoi hunk still has `(ent && ent->fields.mask) || `.  Maybe
backport commit eb298f32fac5ac "x86/dpci: EOI interrupt regardless of
its masking status" as well?

Testing staging would take a little longer to make a machine available.

I guess I'd also need to disable MSI for the two devices to ensure
they are both using the GSI?

Regards,
Jason



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.