
Re: [PATCH] pci: cleanup MSI interrupts before removing device from IOMMU


  • To: Jan Beulich <jbeulich@xxxxxxxx>
  • From: Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • Date: Wed, 21 Oct 2020 13:34:17 +0200
  • Cc: <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Paul Durrant <paul@xxxxxxx>
  • Delivery-date: Wed, 21 Oct 2020 11:34:40 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On Wed, Oct 21, 2020 at 01:20:27PM +0200, Jan Beulich wrote:
> On 21.10.2020 10:19, Roger Pau Monne wrote:
> > Doing the MSI cleanup after removing the device from the IOMMU leads
> > to the following panic on AMD hardware:
> > 
> > Assertion 'table.ptr && (index < intremap_table_entries(table.ptr, iommu))' failed at iommu_intr.c:172
> > ----[ Xen-4.13.1-10.0.3-d  x86_64  debug=y   Not tainted ]----
> > CPU:    3
> > RIP:    e008:[<ffff82d08026ae3c>] drivers/passthrough/amd/iommu_intr.c#get_intremap_entry+0x52/0x7b
> > [...]
> > Xen call trace:
> >    [<ffff82d08026ae3c>] R drivers/passthrough/amd/iommu_intr.c#get_intremap_entry+0x52/0x7b
> >    [<ffff82d08026af25>] F drivers/passthrough/amd/iommu_intr.c#update_intremap_entry_from_msi_msg+0xc0/0x342
> >    [<ffff82d08026ba65>] F amd_iommu_msi_msg_update_ire+0x98/0x129
> >    [<ffff82d08025dd36>] F iommu_update_ire_from_msi+0x1e/0x21
> >    [<ffff82d080286862>] F msi_free_irq+0x55/0x1a0
> >    [<ffff82d080286f25>] F pci_cleanup_msi+0x8c/0xb0
> >    [<ffff82d08025cf52>] F pci_remove_device+0x1af/0x2da
> >    [<ffff82d0802a42d1>] F do_physdev_op+0xd18/0x1187
> >    [<ffff82d080383925>] F pv_hypercall+0x1f5/0x567
> >    [<ffff82d08038a432>] F lstar_enter+0x112/0x120
> > 
> > That's because the call to iommu_remove_device on AMD hardware will
> > remove the per-device interrupt remapping table, and hence the call to
> > pci_cleanup_msi done afterwards will find a null intremap table and
> > crash.
> > 
> > Reorder the calls so that MSI interrupts are torn down before removing
> > the device from the IOMMU.
> 
> I guess this wants
> 
> Fixes: d7cfeb7c13ed ("AMD/IOMMU: don't blindly allocate interrupt remapping tables")
> 
> ?

Oh yes, I didn't git blame the file to figure out when such allocation
and freeing was added.
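
FWIW, the interaction can be summed up in a small standalone sketch
(illustrative only -- this is not the actual Xen code or data layout,
and all of the fake_* names are made up):

#include <assert.h>
#include <stdlib.h>

struct fake_pdev {
    void *intremap_table;  /* per-device remapping table */
    int msi_bound;         /* non-zero while an MSI vector is bound */
};

/* AMD path: removing the device from the IOMMU also frees its
 * per-device interrupt remapping table. */
static void fake_iommu_remove_device(struct fake_pdev *pdev)
{
    free(pdev->intremap_table);
    pdev->intremap_table = NULL;
}

/* Tearing down MSI updates the remapping entry, which requires the
 * table to still be present -- this models the failing assertion. */
static void fake_pci_cleanup_msi(struct fake_pdev *pdev)
{
    if ( pdev->msi_bound )
    {
        assert(pdev->intremap_table);
        pdev->msi_bound = 0;
    }
}

int main(void)
{
    struct fake_pdev pdev = { malloc(64), 1 };

    /* Old (broken) ordering, for reference:
     *     fake_iommu_remove_device(&pdev);
     *     fake_pci_cleanup_msi(&pdev);     <- table already gone
     */

    /* Fixed ordering: interrupts first, IOMMU removal second. */
    fake_pci_cleanup_msi(&pdev);
    fake_iommu_remove_device(&pdev);

    return 0;
}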

> > Signed-off-by: Roger Pau Monné <roger.pau@xxxxxxxxxx>
> 
> Reviewed-by: Jan Beulich <jbeulich@xxxxxxxx>

Thanks.

> > --- a/xen/drivers/passthrough/pci.c
> > +++ b/xen/drivers/passthrough/pci.c
> > @@ -834,10 +834,15 @@ int pci_remove_device(u16 seg, u8 bus, u8 devfn)
> >      list_for_each_entry ( pdev, &pseg->alldevs_list, alldevs_list )
> >          if ( pdev->bus == bus && pdev->devfn == devfn )
> >          {
> > +            /*
> > +             * Cleanup MSI interrupts before removing the device from the
> > +             * IOMMU, or else the internal IOMMU data used to track the device
> > +             * interrupts might be already gone.
> > +             */
> > +            pci_cleanup_msi(pdev);
> >              ret = iommu_remove_device(pdev);
> >              if ( pdev->domain )
> >                  list_del(&pdev->domain_list);
> > -            pci_cleanup_msi(pdev);
> 
> To be honest I'm not sure about the comment. It should really have
> been this way from the very beginning, and VT-d not being affected
> makes me wonder what other possible improvements are waiting to be
> noticed and then carried out.

I'm fine with dropping the comment; I would also expect the normal
flow to be to clean up any interrupts and then remove the device,
rather than the other way around.
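
If it helps to spell that expectation out, teardown mirroring setup
would look roughly like this (again purely illustrative; it reuses the
fake_* stubs from the sketch above, with two made-up setup
counterparts):

static void fake_iommu_add_device(struct fake_pdev *pdev)
{
    pdev->intremap_table = malloc(64); /* allocate the per-device table */
}

static void fake_setup_msi(struct fake_pdev *pdev)
{
    pdev->msi_bound = 1;               /* bind an MSI vector */
}

static void fake_device_add(struct fake_pdev *pdev)
{
    fake_iommu_add_device(pdev);       /* 1. attach to the IOMMU */
    fake_setup_msi(pdev);              /* 2. set up interrupts   */
}

static void fake_device_remove(struct fake_pdev *pdev)
{
    fake_pci_cleanup_msi(pdev);        /* 2'. interrupts torn down first */
    fake_iommu_remove_device(pdev);    /* 1'. IOMMU removal last         */
}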

Roger.
