Re: [PATCH v2 3/3] x86/msi: clear initial MSI-X state on boot
On Tue, Mar 28, 2023 at 9:35 AM Jan Beulich <jbeulich@xxxxxxxx> wrote:
>
> On 28.03.2023 15:32, Jason Andryuk wrote:
> > On Tue, Mar 28, 2023 at 9:28 AM Roger Pau Monné <roger.pau@xxxxxxxxxx> wrote:
> >> On Tue, Mar 28, 2023 at 03:23:56PM +0200, Jan Beulich wrote:
> >>> On 28.03.2023 15:04, Marek Marczykowski-Górecki wrote:
> >>>> On Tue, Mar 28, 2023 at 02:54:38PM +0200, Jan Beulich wrote:
> >>>>> On 25.03.2023 03:49, Marek Marczykowski-Górecki wrote:
> >>>>>> Some firmware/devices are found to not reset MSI-X properly, leaving
> >>>>>> MASKALL set. Xen relies on initial state being both disabled.
> >>>>>> Especially, pci_reset_msix_state() assumes if MASKALL is set, it was
> >>>>>> Xen setting it due to msix->host_maskall or msix->guest_maskall. Clearing
> >>>>>> just MASKALL might be unsafe if ENABLE is set, so clear them both.
> >>>>>
> >>>>> But pci_reset_msix_state() comes into play only when assigning a device
> >>>>> to a DomU. If the tool stack doing a reset doesn't properly clear the
> >>>>> bit, how would it be cleared the next time round (i.e. after the guest
> >>>>> stopped and then possibly was started again)? It feels like the issue
> >>>>> wants dealing with elsewhere, possibly in the tool stack.
> >>>>
> >>>> I may be misremembering some details, but AFAIR Xen intercepts
> >>>> toolstack's (or more generally: accesses from dom0) attempt to clean
> >>>> this up and once it enters an inconsistent state (or rather: starts with
> >>>> such at the start of the day), there was no way to clean it up.
> >>>
> >>> Iirc Roger and you already discussed that there needs to be an
> >>> indication of device reset having happened, so that Xen can resync
> >>> from this "behind its back" operation. That would look to be the
> >>> point/place where such inconsistencies should be eliminated.
> >>
> >> I think that was a different conversation with Huang Rui related to
> >> the AMD GPU work, see:
> >>
> >> https://lore.kernel.org/xen-devel/ZBwtaceTNvCYksmR@Air-de-Roger/
> >>
> >> I understood the problem Marek was trying to solve was that some
> >> devices where initialized with the MASKALL bit set (likely by the
> >> firmware?) and that prevented Xen from using them. But now seeing the
> >> further replies on this patch I'm unsure whether that's the case.
> >
> > In my case, Xen's setting of MASKALL persists through a warm reboot,
> And does this get in the way of Dom0 using the device? (Before a DomU
> gets to use it, things should be properly reset anyway.)

Dom0 doesn't have drivers for the device, so I am not sure. I don't
seem to have the logs around, but I believe when MASKALL is set, the
initial quarantine of the device fails. Yes, some notes I have
mention:

It's getting -EBUSY from pdev_msix_assign() which means
pci_reset_msix_state() is failing:

    if ( pci_conf_read16(pdev->sbdf, msix_control_reg(pos)) &
         PCI_MSIX_FLAGS_MASKALL )
        return -EBUSY;

Regards,
Jason
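
For readers following the thread, the failure mode described above can be modelled in a few lines of standalone C. This is only a sketch under stated assumptions, not Xen code: `struct fake_dev`, `model_reset_msix_state()` and `model_clear_initial_state()` are hypothetical names, and the register is a plain variable instead of PCI config space. The two flag values are the standard MSI-X Message Control bit positions. The first function mirrors the quoted MASKALL check; the second mirrors the idea in the patch description of clearing ENABLE and MASKALL together.

```c
#include <errno.h>
#include <stdint.h>

/* MSI-X Message Control register bits (bit 15 and bit 14). */
#define PCI_MSIX_FLAGS_ENABLE  0x8000u
#define PCI_MSIX_FLAGS_MASKALL 0x4000u

/* Hypothetical stand-in for a device: only its MSI-X control register. */
struct fake_dev {
    uint16_t msix_ctrl;
};

/*
 * Models the check quoted in the thread: if MASKALL is still set
 * (for example left over from firmware or a warm reboot), fail with -EBUSY.
 */
static int model_reset_msix_state(const struct fake_dev *dev)
{
    if ( dev->msix_ctrl & PCI_MSIX_FLAGS_MASKALL )
        return -EBUSY;

    return 0;
}

/*
 * Models the approach from the patch description: clear both ENABLE and
 * MASKALL in one write, leaving the remaining bits (the table size field)
 * untouched.
 */
static void model_clear_initial_state(struct fake_dev *dev)
{
    dev->msix_ctrl &=
        (uint16_t)~(PCI_MSIX_FLAGS_ENABLE | PCI_MSIX_FLAGS_MASKALL);
}
```

With a device whose register starts out with MASKALL set, `model_reset_msix_state()` returns -EBUSY; after `model_clear_initial_state()` it returns 0. Whether boot is the right place for that clearing is exactly what the thread is debating, so the sketch takes no position on it.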