Re: [PATCH v3] x86/hvmloader: select xenpci MMIO BAR UC or WB MTRR cache attribute
On 6/6/25 3:41 PM, Roger Pau Monné wrote:
> On Thu, Jun 05, 2025 at 06:16:59PM +0200, Roger Pau Monne wrote:
>> The Xen PCI device (vendor ID 0x5853) exposed to x86 HVM guests doesn't
>> have the functionality of a traditional PCI device.  The exposed MMIO BAR
>> is used by some guests (including Linux) as a safe place to map foreign
>> memory, including the grant table itself.
>>
>> Traditionally BARs from devices have the uncacheable (UC) cache attribute
>> from the MTRR, to ensure correct functionality of such devices.  hvmloader
>> mimics this behavior and sets the MTRR attributes of both the low and high
>> PCI MMIO windows (where BARs of PCI devices reside) as UC in MTRR.
>>
>> This however causes performance issues for users of the Xen PCI device
>> BAR, as for the purposes of mapping remote memory there's no need to use
>> the UC attribute.  On Intel systems this is worked around by using iPAT,
>> which allows the hypervisor to force the effective cache attribute of a
>> p2m entry regardless of the guest PAT value.  AMD however doesn't have an
>> equivalent of iPAT, and guest PAT values are always considered.
>>
>> Linux commit 41925b105e34 ("xen: replace xen_remap() with memremap()")
>> attempted to mitigate this by forcing mappings of the grant table to use
>> the write-back (WB) cache attribute.  However, Linux memremap() takes
>> MTRRs into account when calculating which PAT type to use, and seeing that
>> the MTRR cache attribute for the region is UC, the PAT also ends up as UC,
>> regardless of the caller having requested WB.
>>
>> As a workaround to allow current Linux to map the grant table as WB using
>> memremap(), introduce an xl.cfg option (xenpci_bar_uc=0) that can be used
>> to select whether the Xen PCI device BAR will have the UC attribute in
>> MTRR.  Such a workaround in hvmloader should also be paired with a fix for
>> Linux, so that it attempts to change the MTRR of the Xen PCI device BAR to
>> WB by itself.
>>
>> Overall, the long term solution would be to provide the guest with a safe
>> range in the guest physical address space where mappings to foreign pages
>> can be created.
>>
>> Some vif throughput performance figures provided by Anthoine from an
>> 8 vCPUs, 4GB of RAM HVM guest(s) running on AMD hardware:
>>
>> Without this patch:
>>  vm -> dom0: 1.1Gb/s
>>  vm -> vm:   5.0Gb/s
>>
>> With the patch:
>>  vm -> dom0: 4.5Gb/s
>>  vm -> vm:   7.0Gb/s
>>
>> Reported-by: Anthoine Bourgeois <anthoine.bourgeois@xxxxxxxxxx>
>> Signed-off-by: Roger Pau Monné <roger.pau@xxxxxxxxxx>
>> ---
>> Changes since v2:
>>  - Add default value in xl.cfg.
>>  - List xenstore path in the pandoc file.
>>  - Adjust comment in hvmloader.
>>  - Fix commit message MIO -> MMIO.
>>
>> Changes since v1:
>>  - Leave the xenpci BAR as UC by default.
>>  - Introduce an option to not set it as UC.
>> ---
>>  docs/man/xl.cfg.5.pod.in                |  8 ++++
>>  docs/misc/xenstore-paths.pandoc         |  5 +++
>>  tools/firmware/hvmloader/config.h       |  2 +-
>>  tools/firmware/hvmloader/pci.c          | 49 ++++++++++++++++++++++++-
>>  tools/firmware/hvmloader/util.c         |  2 +-
>>  tools/include/libxl.h                   |  9 +++++
>>  tools/libs/light/libxl_create.c         |  1 +
>>  tools/libs/light/libxl_dom.c            |  9 +++++
>>  tools/libs/light/libxl_types.idl        |  1 +
>>  tools/xl/xl_parse.c                     |  2 +
>>  xen/include/public/hvm/hvm_xs_strings.h |  2 +
>>  11 files changed, 86 insertions(+), 4 deletions(-)
>
> I've noticed this is missing a changelog entry, I propose the following:
>
> diff --git a/CHANGELOG.md b/CHANGELOG.md
> index 1ee2f42e7405..23215a8cc1e6 100644
> --- a/CHANGELOG.md
> +++ b/CHANGELOG.md
> @@ -15,6 +15,9 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/)
>   - On x86:
>     - Restrict the cache flushing done as a result of guest physical memory
>       map manipulations and memory type changes.
> +   - Allow controlling the MTRR cache attribute of the Xen PCI device BAR
> +     for HVM guests, to improve performance of guests using it to map the
> +     grant table or foreign memory.
>
>  ### Added
>  - On x86:
>
> I can fold into the patch if Oleksii and others agree.

It would be nice:

Reviewed-by: Oleksii Kurochko <oleksii.kurochko@xxxxxxxxx>

Thanks.

~ Oleksii