Re: Config space access to Mediatek MT7922 doesn't work after device reset in Xen PV dom0 (regression, Linux 6.12)
On Tue, Jan 28, 2025 at 07:15:26PM -0600, Bjorn Helgaas wrote:
> On Fri, Jan 17, 2025 at 01:05:30PM +0100, Marek Marczykowski-Górecki wrote:
> > After updating PV dom0 to Linux 6.12, The Mediatek MT7922 device reports
> > all 0xff when accessing its config space. This happens only after device
> > reset (which is also triggered when binding the device to the
> > xen-pciback driver).
>
> Thanks for the report and for all the debugging you've already done!
>
> > Reproducer:
> >
> >     # lspci -xs 01:00.0
> >     01:00.0 Network controller: MEDIATEK Corp. MT7922 802.11ax PCI Express Wireless Network Adapter
> >     00: c3 14 16 06 00 00 10 00 00 00 80 02 10 00 00 00
> >     ...
> >     # echo 1 > /sys/bus/pci/devices/0000:01:00.0/reset
> >     # lspci -xs 01:00.0
> >     01:00.0 Network controller: MEDIATEK Corp. MT7922 802.11ax PCI Express Wireless Network Adapter
> >     00: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
> >
> > The same operation done on Linux 6.12 running without Xen works fine.
> >
> > git bisect points at:
> >
> >     commit d591f6804e7e1310881c9224d72247a2b65039af
> >     Author: Bjorn Helgaas <bhelgaas@xxxxxxxxxx>
> >     Date:   Tue Aug 27 18:48:46 2024 -0500
> >
> >         PCI: Wait for device readiness with Configuration RRS
> >
> > part of that commit:
> > @@ -1311,9 +1320,15 @@ static int pci_dev_wait(struct pci_dev *dev, char *reset_type, int timeout)
> >                         return -ENOTTY;
> >                 }
> >
> > -               pci_read_config_dword(dev, PCI_COMMAND, &id);
> > -               if (!PCI_POSSIBLE_ERROR(id))
> > -                       break;
> > +               if (root && root->config_crs_sv) {
> > +                       pci_read_config_dword(dev, PCI_VENDOR_ID, &id);
> > +                       if (!pci_bus_crs_vendor_id(id))
> > +                               break;
> > +               } else {
> > +                       pci_read_config_dword(dev, PCI_COMMAND, &id);
> > +                       if (!PCI_POSSIBLE_ERROR(id))
> > +                               break;
> > +               }
> >
> >
> > Adding some debugging, the PCI_VENDOR_ID read in pci_dev_wait() returns
> > initially 0xffffffff. If I extend the condition with
> > "&& !PCI_POSSIBLE_ERROR(id)", then the issue disappear. But reading the
> > patch description, it would break VF.
> > I'm not sure where the issue is, but given it breaks only when running
> > with Xen, I guess something is wrong with "Configuration RRS Software
> > Visibility" in that case.
>
> I'm missing something. If you get 0xffffffff, that is not the 0x0001
> Vendor ID, so pci_dev_wait() should exit immediately.

I'm not sure what is going on there either, but my _guess_ is that the
loop exits too early due to the above, and that makes some further
actions fail.

> But the log at
> https://github.com/QubesOS/qubes-issues/issues/9689#issuecomment-2582927149
> says it *doesn't* exit and eventually times out.

Note this log is from a "working" kernel, so that timeout must be
something else.

> And the lspci above shows ~0 data for much of the header, even though
> the device must be ready by then.
>
> I don't have any good ideas, but since the problem only happens with
> Xen, and it seems to affect more than just the Vendor ID, maybe you
> could instrument xen_pcibk_config_read() and see if there's something
> wonky going on there?

This one is used when pcifront (from a different PV VM) is asking
pciback to read something. I see the issue even before starting any
other VM, and without even attaching the device to the xen-pciback
driver...

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab

Attachment:
signature.asc
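
For readers following the exchange about why 0xffffffff ends the wait, here is a minimal userspace sketch of the two exit conditions discussed in the thread. It is not kernel code: the helper names (`is_rrs_vendor_id`, `is_possible_error`, `wait_done_rrs`, `wait_done_rrs_strict`) are hypothetical, and the two predicates are assumed to behave like the kernel's `pci_bus_crs_vendor_id()` (low 16 bits equal 0x0001) and `PCI_POSSIBLE_ERROR()` (value is all ones).

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed constants: Vendor ID reported while a device signals RRS,
 * and the all-ones data seen when a config read fails. */
#define VENDOR_ID_PCI_SIG 0x0001u
#define ERROR_RESPONSE    0xffffffffu

/* Assumed equivalent of pci_bus_crs_vendor_id(): only the low 16 bits
 * (the Vendor ID field) are compared. */
static bool is_rrs_vendor_id(uint32_t id)
{
	return (id & 0xffffu) == VENDOR_ID_PCI_SIG;
}

/* Assumed equivalent of PCI_POSSIBLE_ERROR() for a 32-bit read. */
static bool is_possible_error(uint32_t id)
{
	return id == ERROR_RESPONSE;
}

/* Exit condition of the RRS SV branch in the quoted hunk:
 * stop waiting as soon as the Vendor ID is anything but 0x0001. */
static bool wait_done_rrs(uint32_t id)
{
	return !is_rrs_vendor_id(id);
}

/* Exit condition with the extra "&& !PCI_POSSIBLE_ERROR(id)" from the
 * experiment described in the report: all-ones also keeps the loop waiting. */
static bool wait_done_rrs_strict(uint32_t id)
{
	return !is_rrs_vendor_id(id) && !is_possible_error(id);
}
```

Under these assumptions, `wait_done_rrs(0xffffffff)` is true, so the loop exits on the first read, which matches Bjorn's reading of the code. `wait_done_rrs_strict(0xffffffff)` is false, so polling continues until a value such as 0x061614c3 (the first dword of the lspci dump above) is read.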