Xen project Mailing List

Re: [PATCH] x86/dpci: remove the dpci EOI timer

To: Roger Pau Monné <roger.pau@xxxxxxxxxx>

From: Jason Andryuk <jandryuk@xxxxxxxxx>

Date: Wed, 13 Jan 2021 13:34:14 -0500

Cc: "Tian, Kevin" <kevin.tian@xxxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Jan Beulich <jbeulich@xxxxxxxx>, Paul Durrant <paul@xxxxxxx>, "Cooper, Andrew" <andrew.cooper3@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>

Delivery-date: Wed, 13 Jan 2021 18:34:34 +0000

List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On Wed, Jan 13, 2021 at 1:06 PM Roger Pau Monné <roger.pau@xxxxxxxxxx> wrote: > > On Wed, Jan 13, 2021 at 10:48:52AM -0500, Jason Andryuk wrote: > > On Wed, Jan 13, 2021 at 8:11 AM Roger Pau Monné <roger.pau@xxxxxxxxxx> > > wrote: > > > > > > On Wed, Jan 13, 2021 at 06:21:03AM +0000, Tian, Kevin wrote: > > > > > From: Roger Pau Monne <roger.pau@xxxxxxxxxx> > > > > > Sent: Wednesday, January 13, 2021 1:33 AM > > > > > > > > > > Current interrupt pass though code will setup a timer for each > > > > > interrupt injected to the guest that requires an EOI from the guest. > > > > > Such timer would perform two actions if the guest doesn't EOI the > > > > > interrupt before a given period of time. The first one is deasserting > > > > > the virtual line, the second is perform an EOI of the physical > > > > > interrupt source if it requires such. > > > > > > > > > > The deasserting of the guest virtual line is wrong, since it messes > > > > > with the interrupt status of the guest. It's not clear why this was > > > > > odne in the first place, it should be the guest the one to EOI the > > > > > interrupt and thus deassert the line. > > > > > > > > > > Performing an EOI of the physical interrupt source is redundant, since > > > > > there's already a timer that takes care of this for all interrupts, > > > > > not just the HVM dpci ones, see irq_guest_action_t struct eoi_timer > > > > > field. > > > > > > > > > > Since both of the actions performed by the dpci timer are not > > > > > required, remove it altogether. > > > > > > > > > > Signed-off-by: Roger Pau Monné <roger.pau@xxxxxxxxxx> > > > > > --- > > > > > As with previous patches, I'm having a hard time figuring out why this > > > > > was required in the first place. I see no reason for Xen to be > > > > > deasserting the guest virtual line. There's a comment: > > > > > > > > > > /* > > > > > * Set a timer to see if the guest can finish the interrupt or not. > > > > > For > > > > > * example, the guest OS may unmask the PIC during boot, before the > > > > > * guest driver is loaded. hvm_pci_intx_assert() may succeed, but the > > > > > * guest will never deal with the irq, then the physical interrupt > > > > > line > > > > > * will never be deasserted. > > > > > */ > > > > > > > > > > Did this happen because the device was passed through in a bogus state > > > > > where it would generate interrupts without the guest requesting > > > > > > > > It could be a case where two devices share the same interrupt line and > > > > are assigned to different domains. In this case, the interrupt activity > > > > of > > > > two devices interfere with each other. > > > > > > This would also seem to be problematic if the device decides to use > > > MSI instead of INTx, but due to the shared nature of the INTx line we > > > would continue to inject INTx (generated from another device not > > > passed through to the guest) to the guest even if the device has > > > switched to MSI. > > > > > > > > > > > > > Won't the guest face the same issues when booted on bare metal, and > > > > > thus would already have the means to deal with such issues? > > > > > > > > The original commit was added by me in ~13yrs ago (0f843ba00c95) > > > > when enabling Xen in a client virtualization environment where interrupt > > > > sharing is popular. > > > > > > Thanks, the reference to the above commit is helpful, I wasn't able to > > > find it and it contains a comment that I think has been lost, which > > > provides the background on why this was added. > > > > > > > I believe above comment was recorded for a real > > > > problem at the moment (deassert resets the intx line to unblock further > > > > interrupts). But I'm not sure whether it is still the case after both > > > > Xen and > > > > guest OS have changed a lot. At least some test from people who > > > > still use Xen in shared interrupt scenario would be helpful. Or, if such > > > > usage is already niche, maybe we can consider disallow passing through > > > > devices which share the same interrupt line to different domains and > > > > then safely remove this dpci EOI trick. > > > > > > So the deassert done by timeout only deasserts the virtual line, but > > > doesn't for example clear the IRR bit from the vIO-APIC pin, which > > > will cause further interrupts to not be delivered anyway until a > > > proper EOI (or a switch to trigger mode) is done to the pin. > > > > > > I think it's going to be complicated for me to find a system that has > > > two devices I can passthrough sharing the same GSI. > > > > I have some laptops running OpenXT where the USB controller and NIC > > share an interrupt, and I assign them to different domains. Qubes > > would hit this as well. > > Is there any chance you could try the patch and see if you can hit the > issue it was trying to fix? Would testing a backport to 4.12 be useful? There were some file renames, but it looks to apply. The only difference is the 4.12 hvm_pirq_eoi hunk still has `(ent && ent->fields.mask) || `. Maybe backport commit eb298f32fac5ac "x86/dpci: EOI interrupt regardless of its masking status" as well? Testing staging would take a little longer to make a machine available. I guess I'd also need to disable MSI for the two devices to ensure they are both using the GSI? Regards, Jason

©2013 Xen Project, A Linux Foundation Collaborative Project. All Rights Reserved.
Linux Foundation is a registered trademark of The Linux Foundation.
Xen Project is a trademark of The Linux Foundation.