Re: RMRRs and Phantom Functions
- To: Jan Beulich <jbeulich@xxxxxxxx>
- From: Andrew Cooper <Andrew.Cooper3@xxxxxxxxxx>
- Date: Wed, 27 Apr 2022 10:05:54 +0000
- Cc: Roger Pau Monne <roger.pau@xxxxxxxxxx>, Kevin Tian <kevin.tian@xxxxxxxxx>, Edwin Torok <edvin.torok@xxxxxxxxxx>, xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>
- Delivery-date: Wed, 27 Apr 2022 10:06:16 +0000
- List-id: Xen developer discussion <xen-devel.lists.xenproject.org>
On 27/04/2022 07:59, Jan Beulich wrote:
> On 26.04.2022 19:51, Andrew Cooper wrote:
>> Hello,
>>
>> Edvin has found a machine with some very weird properties. It is an HP
>> ProLiant BL460c Gen8 with:
>>
>> \-[0000:00]-+-00.0  Intel Corporation Xeon E5/Core i7 DMI2
>>             +-01.0-[11]--
>>             +-01.1-[02]--
>>             +-02.0-[04]--+-00.0  Emulex Corporation OneConnect 10Gb NIC (be3)
>>             |            +-00.1  Emulex Corporation OneConnect 10Gb NIC (be3)
>>             |            +-00.2  Emulex Corporation OneConnect 10Gb iSCSI Initiator (be3)
>>             |            \-00.3  Emulex Corporation OneConnect 10Gb iSCSI Initiator (be3)
>>
>> yet all 4 other functions on the device periodically hit IOMMU faults
>> (~once every 5 mins, so definitely stats).
>>
>> (XEN) [VT-D]DMAR:[DMA Write] Request device [0000:04:00.4] fault addr bdf80000
>> (XEN) [VT-D]DMAR:[DMA Write] Request device [0000:04:00.5] fault addr bdf80000
>> (XEN) [VT-D]DMAR:[DMA Write] Request device [0000:04:00.6] fault addr bdf80000
>> (XEN) [VT-D]DMAR:[DMA Write] Request device [0000:04:00.7] fault addr bdf80000
>>
>> There are several RMRRs covering these devices, with:
>>
>> (XEN) [VT-D]found ACPI_DMAR_RMRR:
>> (XEN) [VT-D] endpoint: 0000:03:00.0
>> (XEN) [VT-D] endpoint: 0000:01:00.0
>> (XEN) [VT-D] endpoint: 0000:01:00.2
>> (XEN) [VT-D] endpoint: 0000:04:00.0
>> (XEN) [VT-D] endpoint: 0000:04:00.1
>> (XEN) [VT-D] endpoint: 0000:04:00.2
>> (XEN) [VT-D] endpoint: 0000:04:00.3
>> (XEN) [VT-D]dmar.c:608: RMRR region: base_addr bdf8f000 end_addr bdf92fff
>>
>> being the one relevant to these faults. I've not manually decoded the
>> DMAR table because device paths are horrible to follow but there are at
>> least the correct number of endpoints. The functions all have SR-IOV
>> (disabled) and ARI (enabled). None have any Phantom functions described.
>>
>> Specifying pci-phantom=04:00,1 does appear to work around the faults,
>> but it's not right, because functions 1 thru 3 aren't actually phantom.
> Indeed, and I think you really mean "pci-phantom=04:00,4".

As a quick tangent, the cmdline docs for pci-phantom= are in desperate
need of an example and a description of how stride works. I've got some
ideas and notes jotted down.

Do we really mean ,4 here? What happens for function 1?
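
To make the stride question concrete, here is a rough sketch of how a
stride could map real functions to phantom ones. It assumes each real
function is aliased by every function number reachable in steps of
<stride> within the same slot (functions 0-7), and that 1, 2 and 4 are
the only legal strides. Both points are assumptions for illustration,
as is the helper name phantom_functions(); ARI is ignored entirely.

```python
def phantom_functions(func, stride):
    """List the function numbers treated as phantom aliases of `func`.

    Assumed semantics (not taken from this thread): step upwards by
    `stride` from the real function while staying inside the slot's
    eight function numbers (0-7).
    """
    if stride not in (1, 2, 4):  # assumed set of legal strides
        raise ValueError("stride must be 1, 2 or 4")
    return list(range(func + stride, 8, stride))


if __name__ == "__main__":
    for stride in (1, 4):
        print(f"pci-phantom=04:00,{stride}")
        for func in range(4):  # the four real functions, 04:00.0-04:00.3
            print(f"  04:00.{func} -> {phantom_functions(func, stride)}")
```

Under those assumptions, ,1 makes function 0 claim functions 1 thru 7
(including the real functions 1 thru 3), whereas ,4 pairs each real
function with exactly one of the faulting functions 4 thru 7, so
function 1 would pair with function 5.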
> I guess we
> should actually refuse "pci-phantom=04:00,1" in a case like this one.
> The problem is that at the point we set pdev->phantom_stride we may
> not know of the other devices, yet. But I guess we could attempt a
> config space read of the supposed phantom function's device/vendor
> and do <whatever> if these aren't both 0xffff.

At a minimum, we ought to warn when it looks like something is wonky,
but I wouldn't go as far as rejecting.
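
As a sketch of what such a check might look like (Python for
illustration only, not Xen code): read the vendor and device IDs of
each supposed phantom function, and collect a warning for any that
aren't both 0xffff. The names read_config16, fake_read and
responding_phantoms are hypothetical, the IDs 0x1234/0x5678 are
placeholders, and the stride stepping is the same assumption as above.

```python
VENDOR_ID = 0x00  # standard PCI config space offset of the Vendor ID
DEVICE_ID = 0x02  # standard PCI config space offset of the Device ID


def responding_phantoms(read_config16, seg, bus, dev, func, stride):
    """Return supposed phantom function numbers that answer config reads.

    `read_config16(seg, bus, dev, func, offset)` is a hypothetical
    accessor returning a 16-bit value. A function is considered absent
    only if both its vendor and device IDs read as 0xffff.
    """
    wonky = []
    for f in range(func + stride, 8, stride):  # assumed stride stepping
        vendor = read_config16(seg, bus, dev, f, VENDOR_ID)
        device = read_config16(seg, bus, dev, f, DEVICE_ID)
        if not (vendor == 0xFFFF and device == 0xFFFF):
            wonky.append(f)
    return wonky


def fake_read(seg, bus, dev, func, offset):
    """Stand-in config space: only functions 0-3 exist (placeholder IDs)."""
    if func not in (0, 1, 2, 3):
        return 0xFFFF
    return {VENDOR_ID: 0x1234, DEVICE_ID: 0x5678}[offset]


if __name__ == "__main__":
    for stride in (1, 4):
        hits = responding_phantoms(fake_read, 0, 0x04, 0, 0, stride)
        for f in hits:
            print(f"warning: stride {stride}: 0000:04:00.{f} is a real function")
```

This only warns and carries on, rather than refusing the option.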

All of these options to work around firmware/system screwups are applied
to an already-non-working system, and there is absolutely no guarantee
that necessary fixes make any kind of logical sense.

~Andrew