Re: [Xen-devel] [PATCH v5 1/3] xen/pt: fix some pass-thru devices don't work across reboot



>>> On 22.01.19 at 17:08, <chao.gao@xxxxxxxxx> wrote:
> On Tue, Jan 22, 2019 at 01:24:48AM -0700, Jan Beulich wrote:
>>>>> On 22.01.19 at 06:50, <chao.gao@xxxxxxxxx> wrote:
>>> On Wed, Jan 16, 2019 at 11:38:23AM +0100, Roger Pau Monné wrote:
>>>>On Wed, Jan 16, 2019 at 04:17:30PM +0800, Chao Gao wrote:
>>>>> @@ -1529,6 +1591,8 @@ int deassign_device(struct domain *d, u16 seg, u8 bus, u8 devfn)
>>>>>      if ( !pdev )
>>>>>          return -ENODEV;
>>>>>  
>>>>> +    pci_unmap_msi(pdev);
>>>>
>>>>Just want to make sure, since deassign_device will be called for both
>>>>PV and HVM domains. AFAICT pci_unmap_msi is safe to call when the
>>>>device is assigned to a PV guest, but would like your confirmation.
>>> 
>>> Tested with a PV guest loaded by Pygrub. PV guest doesn't suffer the
>>> msi-x issue I want to fix.
>>> 
>>> With these three patches applied, I got some error messages from Xen
>>> and Dom0 as follows:
>>> 
>>> (XEN) irq.c:2176: dom3: forcing unbind of pirq 332
>>> (XEN) irq.c:2176: dom3: forcing unbind of pirq 331
>>> (XEN) irq.c:2176: dom3: forcing unbind of pirq 328
>>> (XEN) irq.c:2148: dom3: pirq 359 not mapped
>>> [ 2887.067685] xen:events: unmap irq failed -22
>>> (XEN) irq.c:2148: dom3: pirq 358 not mapped
>>> [ 2887.075917] xen:events: unmap irq failed -22
>>> (XEN) irq.c:2148: dom3: pirq 357 not mapped
>>> 
>>> It seems the cause of these errors is that pirqs are unmapped and forcibly
>>> unbound on deassignment, so subsequent pirq unmap requests issued by dom0 fail.
>>> In some respects this error is expected, because with this patch
>>> pirqs are expected to be mapped by qemu or the dom0 kernel (for the PV case) before
>>> deassignment, and mapping/binding a pirq after deassignment should fail.
>>> 
>>> So what's your opinion on handling such errors? Should we find another
>>> method to fix the MSI-X issue that avoids such errors, or suppress these
>>> errors in qemu and the Linux kernel?
>>
>>The "forcing unbind" ones are probably fine to leave alone, but
>>the errors would better be avoided in Xen (i.e. without a need
>>to also change qemu and/or Linux). Since you don't really say
>>when / why these errors now surface, it's hard to suggest what
>>might be best to do.
> 
> With these patches applied, these errors surface in three cases:
> 1. destroy the PV guest with assigned devices by "xl destroy"
> 2. hot-unplug an assigned device from the PV guest
> 3. shut down the PV guest by executing "init 0" in the guest (only for some
> devices whose driver doesn't clean up MSI-X on shutdown)
> 
> The reason is:
> when detaching a device from a domain, Toolstack always calls
> xc_deassign_device() prior to libxl__device_pci_remove_xenstore().
> The latter notifies xen_pciback to clean up the pci devices. I guess
> unbinding and unmapping pirq are steps of the cleanup (just like
> qemu's role in device deassignment for HVM guest). But in this patch,
> pirqs are forcibly unmapped when calling xc_deassign_device(). Thus when
> xen_pciback tries to unmap pirqs as usual, xen reports this pirq isn't
> mapped and propagates this error to xen_pciback.
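
The ordering problem described above can be sketched as a small stand-alone C
model. This is not Xen code: struct toy_domain, toy_unmap_pirq(),
toy_deassign_device(), toy_detach() and NR_PIRQS are hypothetical names made up
for illustration. The only assumption is that unmapping a pirq that is no
longer mapped fails with -EINVAL, which is what the "-22" in the Dom0 log
corresponds to on Linux. The model does not say which component issues the
later unmap.

```c
#include <errno.h>
#include <stdbool.h>

#define NR_PIRQS 8   /* hypothetical table size, for the model only */

/* Toy stand-in for a domain's pirq state: just a "mapped" flag per pirq. */
struct toy_domain {
    bool pirq_mapped[NR_PIRQS];
};

/* Models an unmap request: fails with -EINVAL if the pirq is not mapped. */
static int toy_unmap_pirq(struct toy_domain *d, int pirq)
{
    if (pirq < 0 || pirq >= NR_PIRQS || !d->pirq_mapped[pirq])
        return -EINVAL;
    d->pirq_mapped[pirq] = false;
    return 0;
}

/* Models deassignment with the patch applied: all pirqs are torn down. */
static void toy_deassign_device(struct toy_domain *d)
{
    for (int i = 0; i < NR_PIRQS; i++)
        d->pirq_mapped[i] = false;
}

/*
 * Runs deassignment and one explicit unmap in the requested order and
 * returns the result of the explicit unmap.
 */
static int toy_detach(bool deassign_first)
{
    struct toy_domain d = { .pirq_mapped = { [3] = true } };
    int rc;

    if (deassign_first) {
        toy_deassign_device(&d);     /* pirq 3 is already gone here ... */
        rc = toy_unmap_pirq(&d, 3);  /* ... so this unmap fails */
    } else {
        rc = toy_unmap_pirq(&d, 3);  /* unmap while still mapped */
        toy_deassign_device(&d);
    }
    return rc;
}
```

In this model toy_detach(true) returns -EINVAL while toy_detach(false)
returns 0. The error comes purely from the order of the two steps, not from
any state being left behind.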

Why are you talking about pciback here? I don't think it plays any
role in IRQ unmapping. If what the tool stack does is going to be
fully taken care of by the hypervisor, I think that tool stack code
could then be deleted (unless compatibility requirements don't
allow doing so, in which case it could be conditionally bypassed).

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 

