
Re: RMRRs and Phantom Functions


  • To: Andrew Cooper <Andrew.Cooper3@xxxxxxxxxx>
  • From: Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • Date: Wed, 27 Apr 2022 10:03:43 +0200
  • Cc: xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Jan Beulich <jbeulich@xxxxxxxx>, Kevin Tian <kevin.tian@xxxxxxxxx>, Edwin Torok <edvin.torok@xxxxxxxxxx>
  • Delivery-date: Wed, 27 Apr 2022 08:04:23 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On Tue, Apr 26, 2022 at 05:51:32PM +0000, Andrew Cooper wrote:
> Hello,
> 
> Edvin has found a machine with some very weird properties.  It is an HP
> ProLiant BL460c Gen8 with:
> 
>  \-[0000:00]-+-00.0  Intel Corporation Xeon E5/Core i7 DMI2
>              +-01.0-[11]--
>              +-01.1-[02]--
>              +-02.0-[04]--+-00.0  Emulex Corporation OneConnect 10Gb NIC (be3)
>              |            +-00.1  Emulex Corporation OneConnect 10Gb NIC (be3)
>              |            +-00.2  Emulex Corporation OneConnect 10Gb iSCSI Initiator (be3)
>              |            \-00.3  Emulex Corporation OneConnect 10Gb iSCSI Initiator (be3)
> 
> yet all 4 other functions on the device periodically hit IOMMU faults
> (~once every 5 mins, so definitely stats).
> 
> (XEN) [VT-D]DMAR:[DMA Write] Request device [0000:04:00.4] fault addr bdf80000
> (XEN) [VT-D]DMAR:[DMA Write] Request device [0000:04:00.5] fault addr bdf80000
> (XEN) [VT-D]DMAR:[DMA Write] Request device [0000:04:00.6] fault addr bdf80000
> (XEN) [VT-D]DMAR:[DMA Write] Request device [0000:04:00.7] fault addr bdf80000
> 
> There are several RMRRs covering these devices, with:
> 
> (XEN) [VT-D]found ACPI_DMAR_RMRR:
> (XEN) [VT-D] endpoint: 0000:03:00.0
> (XEN) [VT-D] endpoint: 0000:01:00.0
> (XEN) [VT-D] endpoint: 0000:01:00.2
> (XEN) [VT-D] endpoint: 0000:04:00.0
> (XEN) [VT-D] endpoint: 0000:04:00.1
> (XEN) [VT-D] endpoint: 0000:04:00.2
> (XEN) [VT-D] endpoint: 0000:04:00.3
> (XEN) [VT-D]dmar.c:608:   RMRR region: base_addr bdf8f000 end_addr bdf92fff
> 
> being the one relevant to these faults.  I've not manually decoded the
> DMAR table because device paths are horrible to follow but there are at
> least the correct number of endpoints.  The functions all have SR-IOV
> (disabled) and ARI (enabled).  None have any Phantom functions described.

According to the PCIe spec, ARI-capable devices must not have phantom
functions:

"With every Function in an ARI Device, the Phantom Functions Supported
field must be set to 00b. The remainder of this field description
applies only to non-ARI multi-Function devices."
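For reference, a minimal sketch of how that field could be decoded. The
bit position (bits 4:3 of the PCIe Device Capabilities register) is from
the spec; the helper names are made up for illustration and this is not
Xen's code. The stride is the distance in function numbers between a
function and its phantoms.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* PCIe Device Capabilities register: Phantom Functions Supported, bits 4:3. */
#define DEVCAP_PHANTOM_MASK  0x18u
#define DEVCAP_PHANTOM_SHIFT 3

/* Extract the 2-bit Phantom Functions Supported field. */
static unsigned int devcap_phantom_field(uint32_t devcap)
{
    return (devcap & DEVCAP_PHANTOM_MASK) >> DEVCAP_PHANTOM_SHIFT;
}

/*
 * Function-number stride between a function and its phantoms:
 * 00b -> 8 (no phantoms fit in a 3-bit function number),
 * 01b -> 4, 10b -> 2, 11b -> 1.
 */
static unsigned int phantom_stride(unsigned int field)
{
    assert(field <= 3);
    return 8u >> field;
}

/*
 * Hypothetical sanity check: a function of an ARI device that advertises
 * a non-zero Phantom Functions Supported field contradicts the spec text.
 */
static bool ari_phantom_violation(bool is_ari, uint32_t devcap)
{
    return is_ari && devcap_phantom_field(devcap) != 0;
}
```

Under these assumptions a field value of 11b gives a stride of 1, and 00b
gives 8, i.e. no phantom functions.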

> Specifying pci-phantom=04:00,1 does appear to work around the faults,
> but it's not right, because functions 1 thru 3 aren't actually phantom.
> 
> Also, I don't see any logic which actually wires up phantom functions
> like this to share RMRRs/IVMDs in IO contexts.  The faults only
> disappear as a side effect of 04:00.0 and 04:00.4 being in dom0, as far
> as I can tell.

I'm slightly confused: do those faults only happen when the devices are
assigned to domains other than dom0?

It would seem to me that functions 4 to 7 not being recognized by Xen
should also lead to their context entries not being set up in the dom0
case, and thus the faults should happen there too.

> Simply giving the RMRR via rmrr= doesn't work (presumably because of not
> matching actual devices, but there's no warning), but it feels as if it
> ought to.

Xen should likely complain that there's no matching PCI device for the
provided RMRR regions, and so they are effectively ignored.
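Something along these lines is what I have in mind, as a rough standalone
sketch only. The struct and function names are hypothetical and do not
correspond to Xen's actual data structures or its rmrr= parsing code; the
point is just to warn for each listed device that matches nothing known.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical device address, for illustration only. */
struct sbdf {
    uint16_t seg;
    uint8_t bus, dev, fn;
};

static bool sbdf_equal(struct sbdf a, struct sbdf b)
{
    return a.seg == b.seg && a.bus == b.bus && a.dev == b.dev && a.fn == b.fn;
}

/* Linear search of the list of devices found during enumeration. */
static bool device_known(const struct sbdf *known, size_t nr_known,
                         struct sbdf d)
{
    for (size_t i = 0; i < nr_known; i++)
        if (sbdf_equal(known[i], d))
            return true;
    return false;
}

/*
 * Warn about every device named for a user-provided RMRR that has no
 * matching known device, and return how many such devices there were.
 */
static size_t warn_unmatched_rmrr_devices(const struct sbdf *rmrr_devs,
                                          size_t nr_rmrr,
                                          const struct sbdf *known,
                                          size_t nr_known)
{
    size_t unmatched = 0;

    assert(known != NULL || nr_known == 0);

    for (size_t i = 0; i < nr_rmrr; i++) {
        if (device_known(known, nr_known, rmrr_devs[i]))
            continue;
        fprintf(stderr,
                "warning: rmrr= device %04x:%02x:%02x.%u not found, ignoring\n",
                (unsigned int)rmrr_devs[i].seg, (unsigned int)rmrr_devs[i].bus,
                (unsigned int)rmrr_devs[i].dev, (unsigned int)rmrr_devs[i].fn);
        unmatched++;
    }
    return unmatched;
}
```

With only 04:00.0 to 04:00.3 known, naming 04:00.4 for a region would
produce one warning instead of being dropped silently.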

Thanks, Roger.



 

