
Re: [Xen-devel] Issue with MSI in a HVM domU with several passed through PCI devices



On Tue, 26 Jun 2012, Rolu wrote:
> On Mon, Jun 25, 2012 at 1:38 PM, Stefano Stabellini
> <stefano.stabellini@xxxxxxxxxxxxx> wrote:
> > On Mon, 25 Jun 2012, Jan Beulich wrote:
> >> >>> On 24.06.12 at 04:21, Rolu <rolu@xxxxxxxx> wrote:
> >> > On Wed, Jun 20, 2012 at 6:03 PM, Jan Beulich <JBeulich@xxxxxxxx> wrote:
> >> >> At the same time, adding logging to the guest kernel would
> >> >> be nice, to see what value it actually writes (in a current
> >> >> kernel this would be in __write_msi_msg()).
> >> >>
> >> >
> >> > Turns out that msg->data here is also 0x4300, so it seems the guest
> >> > kernel is producing these values. I caused it to make a stack trace
> >> > and this pointed back to xen_hvm_setup_msi_irqs. This function uses
> >> > the macro XEN_PIRQ_MSI_DATA, which evaluates to 0x4300. It checks the
> >> > current data field and if it isn't equal to the macro it uses
> >> > xen_msi_compose_msg to make a new message, but that function just sets
> >> > the data field of the message to XEN_PIRQ_MSI_DATA - so, 0x4300. This
> >> > then gets passed to __write_msi_msg and that's that. There are no
> >> > other writes through __write_msi_msg (except for the same thing for
> >> > other devices).
> >> >
> >> > The macro XEN_PIRQ_MSI_DATA contains a part (3 << 8) which ends up
> >> > decoded as the delivery mode, so it seems the kernel is intentionally
> >> > setting it to 3.
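
For reference, that 0x4300 value decodes as below. This is a minimal
standalone sketch using the standard MSI data register layout; the mask
and shift names are assumptions here, apart from
MSI_DATA_DELIVERY_MODE_SHIFT which the kernel already has:

#include <stdio.h>

#define MSI_DATA_VECTOR_MASK          0xff       /* bits 0-7: vector */
#define MSI_DATA_DELIVERY_MODE_SHIFT  8          /* bits 8-10: delivery mode */
#define MSI_DATA_LEVEL_ASSERT         (1 << 14)

int main(void)
{
    unsigned int data = 0x4300;   /* the value observed in the guest */

    printf("vector        = 0x%02x\n", data & MSI_DATA_VECTOR_MASK);
    printf("delivery mode = %u (3 is a reserved encoding)\n",
           (data >> MSI_DATA_DELIVERY_MODE_SHIFT) & 0x7);
    printf("level assert  = %u\n", !!(data & MSI_DATA_LEVEL_ASSERT));
    return 0;
}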
> >>
> >> So that can never have worked properly afaict. Stefano, the
> >> code as it is currently - using literal (3 << 8) - is clearly bogus.
> >> Your original commit at least had a comment saying that the
> >> reserved delivery mode encoding is intentional here, but that
> >> comment got lost with the later introduction of XEN_PIRQ_MSI_DATA.
> >> In any case - the cooperation with qemu apparently doesn't
> >> work, as the reserved encoding should never make it through
> >> to the hypervisor. Could you explain what the intention here
> >> was?
> >>
> >> And regardless of anything, can the literal numbers please be
> >> replaced by proper manifest constants - the "8" here already
> >> has MSI_DATA_DELIVERY_MODE_SHIFT, and giving the 3 a
> >> proper symbolic name would permit locating where this is being
> >> (or really, since it doesn't appear to work, is supposed to be)
> >> consumed in qemu, provided it uses the same definition (i.e. that
> >> one should go into one of the public headers).
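
Concretely, the ask is for something along these lines (the names below
are invented for illustration; the real macro also sets other bits,
which is why the full value comes out as 0x4300):

/* proposed: give the reserved encoding a name, in a public header */
#define MSI_DATA_DELIVERY_MODE_SHIFT  8  /* already exists */
#define MSI_DATA_DELIVERY_RESERVED    3  /* deliberately-reserved encoding */

/* ...so the literal (3 << 8) would become:
 *   MSI_DATA_DELIVERY_RESERVED << MSI_DATA_DELIVERY_MODE_SHIFT */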
> >
> > The (3 << 8) is unimportant. The delivery mode chosen is "reserved"
> > because notifications are not supposed to be delivered as MSI anymore.
> >
> > This is what should happen:
> >
> > 1) Linux configures the device with a 0 vector number and the pirq number
> > in the address field;
> >
> > 2) QEMU notices a vector number of 0 and reads the pirq number from the
> > address field, passing it to xc_domain_update_msi_irq (see the round-trip
> > sketch after this list);
> >
> > 3) Xen assigns the given pirq to the physical MSI;
> >
> > 4) The guest issues an EVTCHNOP_bind_pirq hypercall;
> >
> > 5) Xen sets the pirq as "IRQ_PT";
> >
> > 6) When Xen tries to inject the MSI into the guest, hvm_domain_use_pirq
> > returns true so Xen calls send_guest_pirq instead.
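
To make steps 1 and 2 concrete, here is a small round-trip sketch: the
pirq gets encoded into the MSI address exactly as the extraction
expression in the patch further down expects. The constants are
assumptions (MSI_TARGET_CPU_SHIFT is 12 in QEMU's pt-msi code), and this
is not the verbatim kernel/QEMU logic:

#include <stdio.h>
#include <stdint.h>

#define MSI_ADDR_BASE_LO      0xfee00000u
#define MSI_TARGET_CPU_SHIFT  12   /* dest-id field, bits 12-19 of addr_lo */

int main(void)
{
    int pirq = 55;   /* one of the pirqs seen in the logs below */

    /* step 1: the guest puts the pirq in the address, vector 0 in data */
    uint32_t addr_lo = MSI_ADDR_BASE_LO |
                       (((uint32_t)pirq & 0xff) << MSI_TARGET_CPU_SHIFT);
    uint32_t addr_hi = (uint32_t)pirq & 0xffffff00;

    /* step 2: QEMU recovers it (same expression as in the patch below) */
    int decoded = (int)((addr_hi & 0xffffff00) |
                        ((addr_lo >> MSI_TARGET_CPU_SHIFT) & 0xff));

    printf("pirq %d -> addr %08x:%08x -> decoded pirq %d\n",
           pirq, addr_hi, addr_lo, decoded);
    return 0;
}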
> >
> >
> > Obviously 6) is not happening. hvm_domain_use_pirq is:
> >
> > is_hvm_domain(d) && pirq && pirq->arch.hvm.emuirq != IRQ_UNBOUND
> >
> > My guess is that emuirq is IRQ_UNBOUND when it should be IRQ_PT (see
> > above).
> 
> This appears to be true. I added logging to hvm_pci_msi_assert in
> xen/drivers/passthrough/io.c and it indicates that
> pirq->arch.hvm.emuirq is -1 (while IRQ_PT is -2) every time right
> before an unsupported delivery mode message.
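
A standalone sketch of that condition, using those two sentinel values
(the struct here is simplified, not the verbatim Xen types):

#include <stdbool.h>
#include <stdio.h>

#define IRQ_UNBOUND  (-1)
#define IRQ_PT       (-2)

/* just enough structure to mirror pirq->arch.hvm.emuirq */
struct pirq { struct { struct { int emuirq; } hvm; } arch; };

static bool use_pirq(bool is_hvm_domain, const struct pirq *pirq)
{
    return is_hvm_domain && pirq && pirq->arch.hvm.emuirq != IRQ_UNBOUND;
}

int main(void)
{
    struct pirq p = { .arch.hvm.emuirq = IRQ_UNBOUND };

    /* -1: Xen falls through to MSI injection and trips over the
     * reserved delivery mode */
    printf("emuirq=-1 -> use_pirq=%d\n", use_pirq(true, &p));

    /* -2 (IRQ_PT): send_guest_pirq is used, i.e. event channel delivery */
    p.arch.hvm.emuirq = IRQ_PT;
    printf("emuirq=-2 -> use_pirq=%d\n", use_pirq(true, &p));
    return 0;
}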
> 
> I also log pirq->pirq but I found that most of the time I can't find
> this value anywhere else (I'm not sure how to interpret the value,
> though). For example, in my last try:
> 
> * I get an unsupported delivery mode error for pirq->pirq 55, 54 and
> 53. The vast majority are for 54.
> * I have logging in map_domain_emuirq_pirq in xen/arch/x86/irq.c. It
> gets called with pirq 19, 20, 21, 22, 23, 52, 51, 50, 16, 17, 55.
> Never for 54 or 53. It also gets called with pirq=49,emuirq=23 once
> but complains it's already mapped.
> * I have logging in evtchn_bind_pirq in xen/common/event_channel.c. It
> gets called with bind->pirq 16, 17, 51, 55, 49, 29 (twice), 21, 19,
> 22, 52, 48, 47. Also never 54 or 53.
> * map_domain_emuirq_pirq is called from evtchn_bind_pirq for pirq 16, 17, 55.
> * The qemu log mentions pirq 35, 36 and 37
> 
> It seems pirq values don't always mean the same thing? Is it a coincidence
> that 55 occurs almost everywhere, or is something going wrong with the
> other two values (53 and 54 versus 16 and 17)?
> 
> I have three MSI capable devices passed through to the domU, and I do
> see groups of three distinct pirqs in the data above - just not the
> same ones in every place I look.
> 
> > So maybe the guest is not issuing an EVTCHNOP_bind_pirq hypercall
> > (__startup_pirq doesn't get called), or Xen is erroring out in
> > map_domain_emuirq_pirq.
> 
> evtchn_bind_pirq gets called, though I'm not sure if it is with the right 
> data.
> 
> map_domain_emuirq_pirq always gets past the checks in the top half
> (i.e. up to the line /* do not store emuirq mappings for pt devices
> */), except for one time with pirq=49,emuirq=23 where it finds they
> are already mapped.
> It is called three times with an emuirq of -2, for pirq 16, 17 and 55.
> This implies their info->arch.hvm.emuirq is also set to -2 (haven't
> directly logged that but it's the only assignment there).
> 
> Interestingly, I get an unsupported delivery mode error for pirq 55
> where my logging says pirq->arch.hvm.emuirq is -1, *after*
> map_domain_emuirq_pirq was called for pirq 55 and emuirq -2.

Looking back at your QEMU logs, it seems that pt_msi_setup is not
called (or it is not called at the right time), otherwise you should
get:

pt_msi_setup requested pirq = %d

in your logs.
Could you try disabling msitranslate? You can do that by adding

pci_msitranslate=0

to your VM config file.
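For example, in an xm-style HVM guest config (the device addresses here
are placeholders for your three passed-through devices):

# domU config fragment; BDFs are placeholders
pci = [ '01:00.0', '02:00.0', '03:00.0' ]
pci_msitranslate = 0
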
If that works, this (untested) QEMU patch could probably fix your problem:



diff --git a/hw/pt-msi.c b/hw/pt-msi.c
index 70c4023..09b1391 100644
--- a/hw/pt-msi.c
+++ b/hw/pt-msi.c
@@ -59,6 +59,26 @@ static void msix_set_enable(struct pt_dev *dev, int en)
 
 /* MSI virtuailization functions */
 
+int pt_msi_pirq(struct pt_dev *dev, int *pirq)
+{
+    int gvec = dev->msi->data & 0xFF;
+    if (gvec != 0) {
+        return -1;
+    }
+
+    /* if gvec is 0, the guest is asking for a particular pirq that
+     * is passed as dest_id */
+    *pirq = (dev->msi->addr_hi & 0xffffff00) |
+        ((dev->msi->addr_lo >> MSI_TARGET_CPU_SHIFT) & 0xff);
+    if (!*pirq) {
+        /* this probably identifies a misconfiguration of the guest,
+         * try the emulated path */
+        *pirq = -1;
+        return -1;
+    }
+    PT_LOG("%s requested pirq = %d\n", __func__, *pirq);
+    return 0;
+}
 /*
  * setup physical msi, but didn't enable it
  */
@@ -74,18 +94,7 @@ int pt_msi_setup(struct pt_dev *dev)
     }
 
     gvec = dev->msi->data & 0xFF;
-    if (!gvec) {
-        /* if gvec is 0, the guest is asking for a particular pirq that
-         * is passed as dest_id */
-        pirq = (dev->msi->addr_hi & 0xffffff00) |
-               ((dev->msi->addr_lo >> MSI_TARGET_CPU_SHIFT) & 0xff);
-        if (!pirq)
-            /* this probably identifies an misconfiguration of the guest,
-             * try the emulated path */
-            pirq = -1;
-        else
-            PT_LOG("pt_msi_setup requested pirq = %d\n", pirq);
-    }
+    pt_msi_pirq(dev, &pirq);
 
     if ( xc_physdev_map_pirq_msi(xc_handle, domid, AUTO_ASSIGN, &pirq,
                                  PCI_DEVFN(dev->pci_dev->dev,
@@ -138,6 +147,8 @@ int pt_msi_update(struct pt_dev *d)
     addr = (uint64_t)d->msi->addr_hi << 32 | d->msi->addr_lo;
     gflags = __get_msi_gflags(d->msi->data, addr);
 
+    pt_msi_pirq(d, &d->msi->pirq);
+
     PT_LOG("Update msi with pirq %x gvec %x gflags %x\n",
            d->msi->pirq, gvec, gflags);
 
