
Re: [Xen-devel] [PATCH] x86/pt: skip setup of posted format IRTE when gvec is 0



On Thu, May 02, 2019 at 10:20:09AM +0200, Roger Pau Monné wrote:
>On Wed, May 01, 2019 at 12:41:13AM +0800, Chao Gao wrote:
>> On Tue, Apr 30, 2019 at 11:30:33AM +0200, Roger Pau Monné wrote:
>> >On Tue, Apr 30, 2019 at 05:01:21PM +0800, Chao Gao wrote:
>> >> On Tue, Apr 30, 2019 at 01:56:31AM -0600, Jan Beulich wrote:
>> >> >>>> On 30.04.19 at 07:19, <chao.gao@xxxxxxxxx> wrote:
>> >> >> When testing a UP guest with a pass-thru device and VT-d PI enabled
>> >> >> in the host, we observed that the guest couldn't receive interrupts
>> >> >> from that pass-thru device. Dumping the IRTE, we found the
>> >> >> corresponding IRTE is set to posted format with the "vector" field
>> >> >> being 0.
>> >> >> 
>> >> >> We fall into this issue when the guest uses the pirq format of MSI
>> >> >> (see the comment in xen_msi_compose_msg() in the Linux kernel). As
>> >> >> 'dest_id' is repurposed there, skip the migration, which is based on
>> >> >> 'dest_id'.
>> >> >
>> >> >I've gone through all uses of gvec, and I couldn't find any existing
>> >> >special casing of it being zero. I assume this is actually communication
>> >> >between the kernel and qemu,
>> >> 
>> >> Yes. 
>> >> 
>> >> >in which case I'd like to see an
>> >> >explanation of why the issue needs to be addressed in Xen rather
>> >> >than qemu.
>> >> 
>> >> To call pirq_guest_bind() so that the irq_desc is configured properly.
>> >> In particular, pirq_guest_bind() appends a pointer to the struct domain
>> >> to 'action->guest'. Then __do_IRQ_guest() knows which domains are
>> >> interested in this interrupt and injects the interrupt into those
>> >> domains.
>> >> 
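
(For reference, the delivery loop looks roughly like this; a much simplified
sketch, not the actual code in xen/arch/x86/irq.c, with locking, ack/EOI
handling and the HVM dpci special-casing omitted:)

static void __do_IRQ_guest(int irq)
{
    struct irq_desc *desc = irq_to_desc(irq);
    irq_guest_action_t *action = (irq_guest_action_t *)desc->action;
    unsigned int i;

    for ( i = 0; i < action->nr_guests; i++ )
    {
        /* Domains appended here by pirq_guest_bind(). */
        struct domain *d = action->guest[i];
        struct pirq *pirq = pirq_info(d, domain_irq_to_pirq(d, irq));

        /* For HVM pass-through this goes via hvm_do_IRQ_dpci() instead. */
        send_guest_pirq(d, pirq);
    }
}
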
>> >> >Otherwise, if I've overlooked something, would you
>> >> >mind pointing out where such special casing lives in Xen?
>> >> >
>> >> >In any event it doesn't look correct to skip migration altogether in
>> >> >that case. I'd rather expect it to require getting done differently.
>> >> >After all there still is a (CPU, vector) tuple associated with that
>> >> >{,p}IRQ if it's not posted, and hvm_migrate_pirq() is a no-op if it is
>> >> >posted.
>> >> 
>> >> Here, we try to set the irq's target cpu to the cpu the vmsi's target
>> >> vcpu is running on, in order to reduce IPIs. But since the 'dest_id'
>> >> field, which used to indicate the vmsi's target vcpu, is missing, we
>> >> don't know which cpu we should migrate the irq to. One possible choice
>> >> is the 'chn->notify_vcpu_id' used in send_guest_pirq(). Do you think
>> >> this choice is fine?
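
(For illustration only: the migration being discussed boils down to steering
the irq's affinity to the pcpu the target vcpu runs on, roughly as below.
migrate_pirq_to_vcpu is a made-up name; in Xen this is what hvm_migrate_pirq()
achieves via irq_set_affinity():)

/* Steer the machine irq to the pcpu the target vcpu currently runs on,
 * so delivering the interrupt to the guest avoids an extra IPI. */
static void migrate_pirq_to_vcpu(struct irq_desc *desc, const struct vcpu *v)
{
    irq_set_affinity(desc, cpumask_of(v->processor));
}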
>> >
>> >I think that by the time the device model calls into pirq_guest_bind
>> >the PIRQ won't be bound to any event channel, so pirq->evtchn would be
>> >0.
>> 
>> Then is skipping pirq migration the only choice here? We can migrate the
>> pirq once it is bound to an event channel.
>> 
>> >
>> >Note that the binding of the PIRQ with the event channel is done
>> >afterwards in xen_hvm_setup_msi_irqs by the Linux kernel.
>> >
>> >It seems like the device model should be using a different set of
>> >hypercalls to setup a PIRQ that is routed over an event channel, ie:
>> >PHYSDEVOP_map_pirq and friends.
>> 
>> Now qemu is using PHYSDEVOP_map_pirq. Right?
>
>Oh yes, QEMU already uses PHYSDEVOP_map_pirq to set up the interrupt.
>Then I'm not sure I see why QEMU calls XEN_DOMCTL_bind_pt_irq for
>interrupts that are routed over event channels. That hypercall is used

As I said above, it is to call pirq_guest_bind() to hook the domain up to the
irq handler.

XEN_DOMCTL_bind_pt_irq does two things:
#1. bind the pirq to a guest interrupt
#2. register (domain, pirq) with the interrupt handler

Currently, for a pirq routed to an evtchn, #1 is done by another hypercall,
EVTCHNOP_bind_pirq (evtchn_bind_pirq() in Xen), and #2 is done in
XEN_DOMCTL_bind_pt_irq.
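
In code terms, the split looks roughly like this (a much simplified skeleton
just to show where each step lives; the real functions of course do far more):

/* #1: EVTCHNOP_bind_pirq, issued by the guest kernel
 *     (xen_hvm_setup_msi_irqs() in Linux): binds the pirq to an event
 *     channel port. Today pirq_guest_bind() is only called here for PV. */
static long evtchn_bind_pirq(evtchn_bind_pirq_t *bind)
{
    /* ... allocate a port and link it to the pirq ... */
    return 0;
}

/* #2: XEN_DOMCTL_bind_pt_irq, issued by the device model: registers
 *     (domain, pirq) with the interrupt handler via pirq_guest_bind(),
 *     so __do_IRQ_guest() knows whom to deliver the interrupt to. */
int pt_irq_create_bind(struct domain *d,
                       const struct xen_domctl_bind_pt_irq *pt_irq_bind)
{
    /* ... for MSI: record gvec/gflags, then call pirq_guest_bind() ... */
    return 0;
}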

>to bind a pirq to a native guest interrupt injection mechanism, which
>shouldn't be used if the interrupt is going to be delivered over an
>event channel.
>
>Can you see about avoiding the XEN_DOMCTL_bind_pt_irq call in QEMU if
>the interrupt is going to be routed over an event channel?

Yes, it is doable. But it needs changes in both QEMU and Xen, plus some tricks
to remain compatible with old QEMU.

I prefer not to touch QEMU and to keep it unaware of the MSI's "routing over
evtchn", as in the patch below:

diff --git a/xen/common/event_channel.c b/xen/common/event_channel.c
index e86e2bf..0bcddb9 100644
--- a/xen/common/event_channel.c
+++ b/xen/common/event_channel.c
@@ -504,10 +504,7 @@ static long evtchn_bind_pirq(evtchn_bind_pirq_t *bind)
     if ( !info )
         ERROR_EXIT(-ENOMEM);
     info->evtchn = port;
-    rc = (!is_hvm_domain(d)
-          ? pirq_guest_bind(v, info,
-                            !!(bind->flags & BIND_PIRQ__WILL_SHARE))
-          : 0);
+    rc = pirq_guest_bind(v, info, !!(bind->flags & BIND_PIRQ__WILL_SHARE));
     if ( rc != 0 )
     {
         info->evtchn = 0;
diff --git a/xen/drivers/passthrough/io.c b/xen/drivers/passthrough/io.c
index 4290c7c..5a0b83e 100644
--- a/xen/drivers/passthrough/io.c
+++ b/xen/drivers/passthrough/io.c
@@ -346,6 +346,12 @@ int pt_irq_create_bind(
         uint32_t gflags = pt_irq_bind->u.msi.gflags &
                           ~XEN_DOMCTL_VMSI_X86_UNMASKED;
 
+        if ( !pt_irq_bind->u.msi.gvec )
+        {
+            spin_unlock(&d->event_lock);
+            return 0;
+        }
+
         if ( !(pirq_dpci->flags & HVM_IRQ_DPCI_MAPPED) )
         {
             pirq_dpci->flags = HVM_IRQ_DPCI_MAPPED | HVM_IRQ_DPCI_MACH_MSI |
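
(The idea being: with this change, the (domain, pirq) registration for an HVM
guest happens in evtchn_bind_pirq() when the guest binds the pirq to an event
channel, while pt_irq_create_bind() simply bails out early when QEMU passes
gvec == 0, so QEMU itself needs no modification.)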


Thanks
Chao

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 

