Re: RMRRs and Phantom Functions
On Tue, Apr 26, 2022 at 05:51:32PM +0000, Andrew Cooper wrote:
> Hello,
>
> Edvin has found a machine with some very weird properties.  It is an HP
> ProLiant BL460c Gen8 with:
>
>  \-[0000:00]-+-00.0  Intel Corporation Xeon E5/Core i7 DMI2
>              +-01.0-[11]--
>              +-01.1-[02]--
>              +-02.0-[04]--+-00.0  Emulex Corporation OneConnect 10Gb NIC (be3)
>              |            +-00.1  Emulex Corporation OneConnect 10Gb NIC (be3)
>              |            +-00.2  Emulex Corporation OneConnect 10Gb iSCSI Initiator (be3)
>              |            \-00.3  Emulex Corporation OneConnect 10Gb iSCSI Initiator (be3)
>
> yet all 4 other functions on the device periodically hit IOMMU faults
> (~once every 5 mins, so definitely stats).
>
> (XEN) [VT-D]DMAR:[DMA Write] Request device [0000:04:00.4] fault addr bdf80000
> (XEN) [VT-D]DMAR:[DMA Write] Request device [0000:04:00.5] fault addr bdf80000
> (XEN) [VT-D]DMAR:[DMA Write] Request device [0000:04:00.6] fault addr bdf80000
> (XEN) [VT-D]DMAR:[DMA Write] Request device [0000:04:00.7] fault addr bdf80000
>
> There are several RMRRs covering these devices, with:
>
> (XEN) [VT-D]found ACPI_DMAR_RMRR:
> (XEN) [VT-D] endpoint: 0000:03:00.0
> (XEN) [VT-D] endpoint: 0000:01:00.0
> (XEN) [VT-D] endpoint: 0000:01:00.2
> (XEN) [VT-D] endpoint: 0000:04:00.0
> (XEN) [VT-D] endpoint: 0000:04:00.1
> (XEN) [VT-D] endpoint: 0000:04:00.2
> (XEN) [VT-D] endpoint: 0000:04:00.3
> (XEN) [VT-D]dmar.c:608: RMRR region: base_addr bdf8f000 end_addr bdf92fff
>
> being the one relevant to these faults.  I've not manually decoded the
> DMAR table because device paths are horrible to follow but there are at
> least the correct number of endpoints.  The functions all have SR-IOV
> (disabled) and ARI (enabled).  None have any Phantom functions described.

According to the PCIe spec, ARI-capable devices must not have phantom
functions:

"With every Function in an ARI Device, the Phantom Functions Supported
field must be set to 00b. The remainder of this field description
applies only to non-ARI multi-Function devices."
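
As a rough illustration of the rule quoted above, here is a minimal
Python sketch that pulls the Phantom Functions Supported field out of a
raw Device Capabilities value and flags the combination the spec
forbids. The field position (bits 4:3) is my reading of the PCIe Device
Capabilities register layout; the helper names and the sample values
are made up for the example and are not taken from Xen or from the
machine in question.

```python
# Phantom Functions Supported lives in bits 4:3 of the PCIe Device
# Capabilities register (assumption: standard PCIe register layout).
DEVCAP_PHANTOM_MASK = 0x18
DEVCAP_PHANTOM_SHIFT = 3


def phantom_functions_supported(devcap: int) -> int:
    """Return the 2-bit Phantom Functions Supported field (0..3)."""
    return (devcap & DEVCAP_PHANTOM_MASK) >> DEVCAP_PHANTOM_SHIFT


def violates_ari_phantom_rule(devcap: int, is_ari_device: bool) -> bool:
    """True if an ARI device advertises a non-zero phantom field.

    For non-ARI devices any value of the field is permitted, so this
    hypothetical check never fires for them.
    """
    return is_ari_device and phantom_functions_supported(devcap) != 0


if __name__ == "__main__":
    # Hypothetical register values, for illustration only.
    for devcap, ari in [(0x0000, True), (0x0008, True), (0x0018, False)]:
        print(f"devcap={devcap:#06x} ari={ari} "
              f"phantom_field={phantom_functions_supported(devcap):02b} "
              f"violation={violates_ari_phantom_rule(devcap, ari)}")
```

A check along these lines would only say whether the device advertises
phantom functions; it says nothing about why requests show up from
functions 4 to 7.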

> Specifying pci-phantom=04:00,1 does appear to work around the faults,
> but it's not right, because functions 1 thru 3 aren't actually phantom.
>
> Also, I don't see any logic which actually wires up phantom functions
> like this to share RMRRs/IVMDs in IO contexts.  The faults only
> disappear as a side effect of 04:00.0 and 04:00.4 being in dom0, as far
> as I can tell.

I think I'm slightly confused: do those faults only happen when the
devices are assigned to domains other than dom0?

It would seem to me that functions 4 to 7 not being recognized by Xen
should also lead to their context entries not being set up in the dom0
case, and thus the faults should also happen there.

> Simply giving the RMRR via rmrr= doesn't work (presumably because of no
> patching actual devices, but there's no warning), but it feels as if it
> ought to.

Xen should likely complain that there's no matching PCI device for the
provided RMRR regions, and that they are therefore effectively ignored.

Thanks, Roger.