Xen project Mailing List

Re: [Xen-devel] [PATCH v3] interrupts: allow guest to set/clear MSI-X mask bit

To: Joby Poriyath <joby.poriyath@xxxxxxxxxx>

From: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>

Date: Wed, 14 Aug 2013 17:36:26 +0100

Cc: malcolm.crossley@xxxxxxxxxx, keir@xxxxxxx, JBeulich@xxxxxxxx, xen-devel@xxxxxxxxxxxxx

Delivery-date: Wed, 14 Aug 2013 16:36:47 +0000

List-id: Xen developer discussion <xen-devel.lists.xen.org>

On 14/08/13 17:30, Andrew Cooper wrote:
> On 14/08/13 17:18, Joby Poriyath wrote:
>> Guest needs the ability to enable and disable MSI-X interrupts
>> by setting the MSI-X control bit, for a passed-through device.
>> Guest is allowed to write MSI-X mask bit only if Xen *thinks*
>> that mask is clear (interrupts enabled). If the mask is set by
>> Xen (interrupts disabled), writes to mask bit by the guest is
>> ignored.
>>
>> Currently, a write to MSI-X mask bit by the guest is silently
>> ignored.
>>
>> A likely scenario is where we have a 82599 SR-IOV nic passed
>> through to a guest. From the guest if you do
>>
>>   ifconfig <ETH_DEV> down
>>   ifconfig <ETH_DEV> up
>>
>> the interrupts remain masked. On VF reset, the mask bit is set
>> by the controller. At this point, Xen is not aware that mask is set.
>> However, interrupts are enabled by VF driver by clearing the mask
>> bit by writing directly to BAR3 region containing the MSI-X table.
>>
>> From dom0, we can verify that
>> interrupts are being masked using 'xl debug-keys M'.
>>
>> Initially, guest was allowed to modify MSI-X bit.
>> Later this behaviour was changed.
>> See changeset 74c213c506afcd74a8556dd092995fd4dc38b225.
>>
>> Signed-off-by: Joby Poriyath <joby.poriyath@xxxxxxxxxx>
>> ---
>>  xen/arch/x86/hvm/vmsi.c |   47 +++++++++++++++++++++++++++++++++--------------
>>  1 file changed, 33 insertions(+), 14 deletions(-)
>>
>> diff --git a/xen/arch/x86/hvm/vmsi.c b/xen/arch/x86/hvm/vmsi.c
>> index 36de312..21421cc 100644
>> --- a/xen/arch/x86/hvm/vmsi.c
>> +++ b/xen/arch/x86/hvm/vmsi.c
>> @@ -169,6 +169,7 @@ struct msixtbl_entry
>>          uint32_t msi_ad[3]; /* Shadow of address low, high and data */
>>      } gentries[MAX_MSIX_ACC_ENTRIES];
>>      struct rcu_head rcu;
>> +    struct pirq *pirq;
>>  };
>>
>>  static DEFINE_RCU_READ_LOCK(msixtbl_rcu_lock);
>> @@ -254,6 +255,9 @@ static int msixtbl_write(struct vcpu *v, unsigned long address,
>>      void *virt;
>>      unsigned int nr_entry, index;
>>      int r = X86EMUL_UNHANDLEABLE;
>> +    unsigned long flags;
>> +    struct irq_desc *desc;
>> +    unsigned long orig;
>
> unsigned long flags, orig;
>
> To be more compact.
>
>>
>>      if ( len != 4 || (address & 3) )
>>          return r;
>> @@ -283,22 +287,35 @@ static int msixtbl_write(struct vcpu *v, unsigned long address,
>>      if ( !virt )
>>          goto out;
>>
>> -    /* Do not allow the mask bit to be changed. */
>> -#if 0 /* XXX
>> -       * As the mask bit is the only defined bit in the word, and as the
>> -       * host MSI-X code doesn't preserve the other bits anyway, doing
>> -       * this is pointless. So for now just discard the write (also
>> -       * saving us from having to determine the matching irq_desc).
>> -       */
>> -    spin_lock_irqsave(&desc->lock, flags);
>> +    desc = pirq_spin_lock_irq_desc(entry->pirq, &flags);
>> +    if ( !desc )
>> +        goto out;
>> +
>> +    if ( !desc->msi_desc )
>> +        goto unlock;
>> +
>> +    /* Do not allow guest to modify MSIX control bit if it is masked
>> +     * by Xen. We'll only handle the case where Xen thinks that
>> +     * bit is unmasked, but hardware has silently masked the bit
>> +     * (in case of SR-IOV VF reset, etc).
>> +     */
>> +    if ( desc->msi_desc->msi_attrib.masked )
>> +        goto unlock;
>
> If Xen wants the msi masked, or the guest wants the msi masked then you
> must set the masked bit, else must clear it.
>
> The root cause of this whole issue is that Xen doesn't actually know
> what state the mask bit is in; it only knows its intention.
>
> Therefore, goto unlock is incorrect here. By this point, we must write
> the bit one way or another.
>
>> +
>> +    /* The mask bit is the only defined bit in the word. But we
>> +     * ought to preserve the reserved bits. Clearing the reserved
>> +     * bits can result in undefined behaviour (see PCI Local Bus
>> +     * Specification revision 2.3).
>> +     */
>>      orig = readl(virt);
>> -    val &= ~PCI_MSIX_VECTOR_BITMASK;
>> -    val |= orig & PCI_MSIX_VECTOR_BITMASK;
>> +    val &= PCI_MSIX_VECTOR_BITMASK;
>> +    val |= ( orig & ~PCI_MSIX_VECTOR_BITMASK );
>>      writel(val, virt);
>> -    spin_unlock_irqrestore(&desc->lock, flags);
>> -#endif
>>
>> +unlock:
>> +    spin_unlock_irqrestore(&desc->lock, flags);
>>      r = X86EMUL_OKAY;
>> +
>>  out:
>>      rcu_read_unlock(&msixtbl_rcu_lock);
>>      return r;
>> @@ -328,7 +345,8 @@ const struct hvm_mmio_handler msixtbl_mmio_handler = {
>>  static void add_msixtbl_entry(struct domain *d,
>>                                struct pci_dev *pdev,
>>                                uint64_t gtable,
>> -                              struct msixtbl_entry *entry)
>> +                              struct msixtbl_entry *entry,
>> +                              struct pirq *pirq)
>
> I would advocate const-correctness here, so "const struct pirq *pirq".

Sorry - please ignore this. I was being an idiot. The other points
still stand.

~Andrew

>
> ~Andrew
>
>>  {
>>      u32 len;
>>
>> @@ -342,6 +360,7 @@ static void add_msixtbl_entry(struct domain *d,
>>      entry->table_len = len;
>>      entry->pdev = pdev;
>>      entry->gtable = (unsigned long) gtable;
>> +    entry->pirq = pirq;
>>
>>      list_add_rcu(&entry->list, &d->arch.hvm_domain.msixtbl_list);
>>  }
>> @@ -404,7 +423,7 @@ int msixtbl_pt_register(struct domain *d, struct pirq *pirq, uint64_t gtable)
>>
>>      entry = new_entry;
>>      new_entry = NULL;
>> -    add_msixtbl_entry(d, pdev, gtable, entry);
>> +    add_msixtbl_entry(d, pdev, gtable, entry, pirq);
>>
>>  found:
>>      atomic_inc(&entry->refcnt);

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
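
For illustration only, here is a minimal standalone sketch of the rule stated in the review above: set the mask bit if either Xen or the guest wants the vector masked, otherwise clear it, and in both cases carry the reserved bits over from the current register value. This is not the code from the patch or from Xen. The function names msix_vector_ctrl() and msix_vector_ctrl_selfcheck() are hypothetical, and PCI_MSIX_VECTOR_BITMASK is redefined locally on the assumption that the mask is bit 0 of the vector control word. No locking or MMIO access is modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: bit 0 of the MSI-X vector control word is the per-vector
 * mask bit and all other bits are reserved. Redefined here so that the
 * sketch compiles on its own. */
#define PCI_MSIX_VECTOR_BITMASK 0x1u

/* Hypothetical helper: compute the value to write back to the vector
 * control word.
 *   orig        - value currently read from the device
 *   guest_val   - value the guest tried to write
 *   host_masked - true if the hypervisor itself wants the vector masked
 */
static uint32_t msix_vector_ctrl(uint32_t orig, uint32_t guest_val,
                                 bool host_masked)
{
    /* Masked if either party wants it masked. */
    bool mask = host_masked || (guest_val & PCI_MSIX_VECTOR_BITMASK);

    /* Keep the reserved bits from the device, drop the old mask bit. */
    uint32_t val = orig & ~PCI_MSIX_VECTOR_BITMASK;

    if ( mask )
        val |= PCI_MSIX_VECTOR_BITMASK;

    return val;
}

/* Hypothetical self-check exercising the cases discussed in the thread. */
static void msix_vector_ctrl_selfcheck(void)
{
    /* Hardware masked the vector behind the host's back; guest unmasks. */
    assert(msix_vector_ctrl(0xABCD0001u, 0x0u, false) == 0xABCD0000u);
    /* Guest masks a vector that the host considers unmasked. */
    assert(msix_vector_ctrl(0xABCD0000u, 0x1u, false) == 0xABCD0001u);
    /* Host wants it masked; a guest unmask must not clear the bit. */
    assert(msix_vector_ctrl(0xABCD0000u, 0x0u, true) == 0xABCD0001u);
    /* Reserved bits written by the guest are not propagated. */
    assert(msix_vector_ctrl(0x00000001u, 0xFFFFFFFEu, false) == 0x0u);
}
```

Because the result is always computed and written, there is no early exit when the host believes the vector is masked, which is the point made against the "goto unlock" path.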
