Re: Config space access to Mediatek MT7922 doesn't work after device reset in Xen PV dom0 (regression, Linux 6.12)
On Fri, Jan 17, 2025 at 01:05:30PM +0100, Marek Marczykowski-Górecki wrote:
> After updating PV dom0 to Linux 6.12, The Mediatek MT7922 device reports
> all 0xff when accessing its config space. This happens only after device
> reset (which is also triggered when binding the device to the
> xen-pciback driver).

Thanks for the report and for all the debugging you've already done!

> Reproducer:
>
>     # lspci -xs 01:00.0
>     01:00.0 Network controller: MEDIATEK Corp. MT7922 802.11ax PCI Express
>     Wireless Network Adapter
>     00: c3 14 16 06 00 00 10 00 00 00 80 02 10 00 00 00
>     ...
>     # echo 1 > /sys/bus/pci/devices/0000:01:00.0/reset
>     # lspci -xs 01:00.0
>     01:00.0 Network controller: MEDIATEK Corp. MT7922 802.11ax PCI Express
>     Wireless Network Adapter
>     00: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
>
> The same operation done on Linux 6.12 running without Xen works fine.
>
> git bisect points at:
>
>     commit d591f6804e7e1310881c9224d72247a2b65039af
>     Author: Bjorn Helgaas <bhelgaas@xxxxxxxxxx>
>     Date:   Tue Aug 27 18:48:46 2024 -0500
>
>         PCI: Wait for device readiness with Configuration RRS
>
> part of that commit:
> @@ -1311,9 +1320,15 @@ static int pci_dev_wait(struct pci_dev *dev, char *reset_type, int timeout)
>                          return -ENOTTY;
>                  }
>
> -                pci_read_config_dword(dev, PCI_COMMAND, &id);
> -                if (!PCI_POSSIBLE_ERROR(id))
> -                        break;
> +                if (root && root->config_crs_sv) {
> +                        pci_read_config_dword(dev, PCI_VENDOR_ID, &id);
> +                        if (!pci_bus_crs_vendor_id(id))
> +                                break;
> +                } else {
> +                        pci_read_config_dword(dev, PCI_COMMAND, &id);
> +                        if (!PCI_POSSIBLE_ERROR(id))
> +                                break;
> +                }
>
>
> Adding some debugging, the PCI_VENDOR_ID read in pci_dev_wait() returns
> initially 0xffffffff. If I extend the condition with
> "&& !PCI_POSSIBLE_ERROR(id)", then the issue disappear. But reading the
> patch description, it would break VF.
>
> I'm not sure where the issue is, but given it breaks only when running
> with Xen, I guess something is wrong with "Configuration RRS Software
> Visibility" in that case.

I'm missing something.  If you get 0xffffffff, that is not the 0x0001
Vendor ID, so pci_dev_wait() should exit immediately.  But the log at
https://github.com/QubesOS/qubes-issues/issues/9689#issuecomment-2582927149
says it *doesn't* exit and eventually times out.

And the lspci above shows ~0 data for much of the header, even though
the device must be ready by then.

I don't have any good ideas, but since the problem only happens with
Xen, and it seems to affect more than just the Vendor ID, maybe you
could instrument xen_pcibk_config_read() and see if there's something
wonky going on there?

> BTW, shouldn't PCI_VENDOR_ID be accessed via pci_read_config_word()
> instead of pci_read_config_dword()?

Per PCIe r6.0, sec 2.3.2:

  If Configuration RRS Software Visibility is enabled (see below):

    For a Configuration Read Request that includes both bytes of the
    Vendor ID field of a device Function's Configuration Space Header,
    the Root Complex must complete the Request to the host by
    returning a read-data value of 0001h for the Vendor ID field and
    all '1's for any additional bytes included in the request.

Since either a word (16 bit) or dword (32 bit) read includes both
bytes of Vendor ID, I think either should work.  We use a 32-bit read
in the enumeration path, where we need both Vendor ID and Device ID,
but we don't care about the Device ID in pci_dev_wait(), so it
probably doesn't really matter.
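
To make the reasoning above concrete, here is a small standalone C
sketch of the two checks being discussed.  It is not the kernel code:
the names is_rrs_vendor_id(), is_all_ones(), wait_done_rrs() and
wait_done_rrs_strict() are hypothetical stand-ins, and the sample
values are only what sec 2.3.2 and the lspci output above imply (a
not-ready device under RRS Software Visibility reads as 0xffff0001 for
a dword or 0x0001 for a word; the MT7922 header starts with
0x061614c3).

```c
#include <stdbool.h>
#include <stdint.h>

/* Vendor ID the Root Complex returns for a not-ready device when
 * Configuration RRS Software Visibility is enabled (PCIe r6.0, 2.3.2). */
#define RRS_VENDOR_ID 0x0001u

/* Only the low 16 bits (the Vendor ID field) matter, so the result is
 * the same for a word read (0x0001) and a dword read (0xffff0001). */
static inline bool is_rrs_vendor_id(uint32_t val)
{
    return (val & 0xffffu) == RRS_VENDOR_ID;
}

/* All ones is what a failed config read typically looks like. */
static inline bool is_all_ones(uint32_t val)
{
    return val == UINT32_MAX;
}

/* Hypothetical model of the exit condition in the RRS branch of the
 * quoted diff: stop waiting as soon as the Vendor ID is not 0x0001. */
static inline bool wait_done_rrs(uint32_t val)
{
    return !is_rrs_vendor_id(val);
}

/* Hypothetical model of the reporter's experiment: additionally
 * treat an all-ones read as "not ready yet". */
static inline bool wait_done_rrs_strict(uint32_t val)
{
    return !is_rrs_vendor_id(val) && !is_all_ones(val);
}
```

With these definitions, wait_done_rrs(0xffffffff) is true, which is why
I would expect pci_dev_wait() to exit immediately on that value, while
wait_done_rrs_strict(0xffffffff) is false and keeps polling.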