Re: [PATCH v2 3/3] x86/msi: clear initial MSI-X state on boot
On Tue, Mar 28, 2023 at 9:17 AM Jason Andryuk <jandryuk@xxxxxxxxx> wrote:
>
> On Tue, Mar 28, 2023 at 9:04 AM Marek Marczykowski-Górecki
> <marmarek@xxxxxxxxxxxxxxxxxxxxxx> wrote:
> >
> > On Tue, Mar 28, 2023 at 02:54:38PM +0200, Jan Beulich wrote:
> > > On 25.03.2023 03:49, Marek Marczykowski-Górecki wrote:
> > > > Some firmware/devices are found to not reset MSI-X properly, leaving
> > > > MASKALL set. Xen relies on initial state being both disabled.
> > > > Especially, pci_reset_msix_state() assumes if MASKALL is set, it was Xen
> > > > setting it due to msix->host_maskall or msix->guest_maskall. Clearing
> > > > just MASKALL might be unsafe if ENABLE is set, so clear them both.
> > >
> > > But pci_reset_msix_state() comes into play only when assigning a device
> > > to a DomU. If the tool stack doing a reset doesn't properly clear the
> > > bit, how would it be cleared the next time round (i.e. after the guest
> > > stopped and then possibly was started again)? It feels like the issue
> > > wants dealing with elsewhere, possibly in the tool stack.
> >
> > I may be misremembering some details, but AFAIR Xen intercepts
> > toolstack's (or more generally: accesses from dom0) attempt to clean
> > this up and once it enters an inconsistent state (or rather: starts with
> > such at the start of the day), there was no way to clean it up.
> >
> > I have considered changing pci_reset_msix_state() to not choke on
> > MASKALL being set, but I'm a bit afraid of doing it, as there it seems
> > there is a lot of assumptions all over the place and I may miss some.
>
> Hi, Marek and Jan,
>
> Marek, thank you for working on MSI-X support.
>
> As Jan says, the clearing here works during system boot. However, I
> have found that Xen itself is setting MASKALL in __pci_disable_msix()
> when shutting down a domU. When that is called, memory_decoded(dev)
> returns false, and Xen prints "cannot disable IRQ 137: masking MSI-X
> on 0000:00:14.3". That makes the device unavailable for subsequent
> domU assignment. I haven't investigated where and why memory decoding
> gets disabled for the device.
>
> Testing was with this v2 patchset integrated into OpenXT w/ Xen 4.16.
> We have some device reset changes, so I'll have to look at them again.
> Hmmm, they move the libxl device resetting from pci_remove_detached()
> to libxl__destroy_domid() to ensure all devices are de-assigned after
> the domain is destroyed. A kernel patch implements a "more thorough
> reset" which could do a slot or bus level reset, and the desire is to
> have all devices deassigned before that. Maybe the shift later is
> throwing off Xen's expectations?

I dropped the OpenXT libxl patch, and Xen is not setting MASKALL.

Regards,
Jason
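
For reference, a minimal sketch of the "clear them both" step from the patch description quoted above. This is not the code from the patch: the two helper names are hypothetical, and the functions operate on a plain 16-bit copy of the MSI-X Message Control word rather than on real config space. The bit values are the standard ENABLE (bit 15) and MASKALL (bit 14) positions, named as in pci_regs.h.

```c
#include <stdbool.h>
#include <stdint.h>

/* MSI-X Message Control bits (standard PCI layout). */
#define PCI_MSIX_FLAGS_ENABLE  0x8000
#define PCI_MSIX_FLAGS_MASKALL 0x4000

/*
 * Hypothetical helper: true if firmware/device left ENABLE or MASKALL
 * set, i.e. the state is not the "both disabled" one Xen expects.
 */
static bool msix_state_is_stale(uint16_t control)
{
    return (control & (PCI_MSIX_FLAGS_ENABLE | PCI_MSIX_FLAGS_MASKALL)) != 0;
}

/*
 * Hypothetical helper: clear ENABLE and MASKALL together, leaving the
 * remaining bits (such as the table size field) untouched. Clearing
 * only MASKALL while ENABLE is set is the case the patch description
 * calls potentially unsafe.
 */
static uint16_t msix_clear_initial_state(uint16_t control)
{
    return control & (uint16_t)~(PCI_MSIX_FLAGS_ENABLE | PCI_MSIX_FLAGS_MASKALL);
}
```

In a real implementation the value would be read from and written back to the device's MSI-X capability through the hypervisor's config space accessors; those calls are left out here on purpose.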