Re: RMRRs and Phantom Functions
On 27/04/2022 09:03, Roger Pau Monne wrote:
> On Tue, Apr 26, 2022 at 05:51:32PM +0000, Andrew Cooper wrote:
>> Hello,
>>
>> Edvin has found a machine with some very weird properties.  It is an HP
>> ProLiant BL460c Gen8 with:
>>
>>  \-[0000:00]-+-00.0  Intel Corporation Xeon E5/Core i7 DMI2
>>              +-01.0-[11]--
>>              +-01.1-[02]--
>>              +-02.0-[04]--+-00.0  Emulex Corporation OneConnect 10Gb NIC (be3)
>>              |            +-00.1  Emulex Corporation OneConnect 10Gb NIC (be3)
>>              |            +-00.2  Emulex Corporation OneConnect 10Gb iSCSI Initiator (be3)
>>              |            \-00.3  Emulex Corporation OneConnect 10Gb iSCSI Initiator (be3)
>>
>> yet all 4 other functions on the device periodically hit IOMMU faults
>> (~once every 5 mins, so definitely stats).
>>
>> (XEN) [VT-D]DMAR:[DMA Write] Request device [0000:04:00.4] fault addr bdf80000
>> (XEN) [VT-D]DMAR:[DMA Write] Request device [0000:04:00.5] fault addr bdf80000
>> (XEN) [VT-D]DMAR:[DMA Write] Request device [0000:04:00.6] fault addr bdf80000
>> (XEN) [VT-D]DMAR:[DMA Write] Request device [0000:04:00.7] fault addr bdf80000
>>
>> There are several RMRRs covering these devices, with:
>>
>> (XEN) [VT-D]found ACPI_DMAR_RMRR:
>> (XEN) [VT-D] endpoint: 0000:03:00.0
>> (XEN) [VT-D] endpoint: 0000:01:00.0
>> (XEN) [VT-D] endpoint: 0000:01:00.2
>> (XEN) [VT-D] endpoint: 0000:04:00.0
>> (XEN) [VT-D] endpoint: 0000:04:00.1
>> (XEN) [VT-D] endpoint: 0000:04:00.2
>> (XEN) [VT-D] endpoint: 0000:04:00.3
>> (XEN) [VT-D]dmar.c:608: RMRR region: base_addr bdf8f000 end_addr bdf92fff
>>
>> being the one relevant to these faults.  I've not manually decoded the
>> DMAR table because device paths are horrible to follow but there are at
>> least the correct number of endpoints.  The functions all have SR-IOV
>> (disabled) and ARI (enabled).  None have any Phantom functions described.
> According to the PCIe spec ARI capable devices must not have phantom
> functions:
>
> "With every Function in an ARI Device, the Phantom Functions Supported
> field must be set to 00b. The remainder of this field description
> applies only to non-ARI multi-Function devices."

Lovely...

>
>> Specifying pci-phantom=04:00,1 does appear to work around the faults,
>> but it's not right, because functions 1 thru 3 aren't actually phantom.
>>
>> Also, I don't see any logic which actually wires up phantom functions
>> like this to share RMRRs/IVMDs in IO contexts.  The faults only
>> disappear as a side effect of 04:00.0 and 04:00.4 being in dom0, as far
>> as I can tell.
> I think I'm slightly confused, so those faults only happen when the
> devices are assigned to domains different than dom0?
>
> It would seem to me that functions 4 to 7 not being recognized by Xen
> should also lead to their context entries not being setup in the dom0
> case, and thus the faults should also happen.

Functions 4 thru 7 do not exist in the system.  Their config space is
all ~0's.  As they appear to be non-existent, no IOMMU context is set up
for them, hence the DMA faults when their source id is actually used.

When specifying phantom, what we're saying is that "function $X uses $Y
as a source id too".  Or in other words, treat $Y as if it were $X.

In a theoretical future with working IOMMU groups, this would force $X
and $Y into the same IOMMU group as they can't be separated.

~Andrew
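
As a rough illustration of the "function $X uses $Y as a source id too" idea, here is a minimal Python sketch. It is not Xen's code. The helper names (`source_id`, `phantom_aliases`, `fmt`) are hypothetical, and the sketch assumes that a phantom stride means "every function number reachable by repeatedly adding the stride, within the same slot". The source id layout (bus in bits 15:8, device in bits 7:3, function in bits 2:0) is the standard non-ARI PCI requester id encoding.

```python
def source_id(bus, dev, fn):
    """Pack bus/device/function into a 16-bit requester (source) id."""
    return (bus << 8) | (dev << 3) | fn


def phantom_aliases(bus, dev, fn, stride):
    """Return the extra source ids that function `fn` would also use.

    Assumption (not taken from the thread): the stride is applied
    repeatedly until the function number leaves the 3-bit range.
    A stride of 0 means "no phantom functions".
    """
    aliases = []
    f = fn + stride
    while stride and f < 8:
        aliases.append(source_id(bus, dev, f))
        f += stride
    return aliases


def fmt(sid):
    """Format a source id as bus:dev.fn, matching the log style above."""
    return f"{sid >> 8:02x}:{(sid >> 3) & 0x1f:02x}.{sid & 7}"


if __name__ == "__main__":
    # pci-phantom=04:00,1 from the thread: stride 1 on device 04:00.
    for sid in phantom_aliases(0x04, 0x00, 0, 1):
        print("04:00.0 also answers for source id", fmt(sid))
```

Under these assumptions, stride 1 makes 04:00.0 stand in for every other function number in the slot, including the faulting 04:00.4 through 04:00.7. That also shows why it is not a faithful description of the hardware: it claims the real functions 1 through 3 as aliases too.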