
Re: hvmloader - allow_memory_relocate overlaps



On 2023-12-19 17:12, Jan Beulich wrote:
> On 16.12.2023 08:01, Neowutran wrote:
> > I am wondering if the variable "allow_memory_relocate" is still
> > relevant today and if its default value is still relevant.
> > Should it be defined to 0 by default instead of 1 (it seems to be a
> > workaround for qemu-traditional, so maybe an outdated default value?)?
> 
> So are you saying you use qemu-trad?
No, I am using "qemu_xen" (from xenstore-read: 'device-model =
"qemu_xen"').

> Otherwise isn't libxl suppressing this behavior anyway?
If by "isn't libxl suppressing this behavior" you means if libxl is setting
the value of "allow_memory_relocate", then the answer is no. 

Following this lead, I checked in which code paths
"allow_memory_relocate" could be set.

It is set in only one code path:

In the file "tools/libs/light/libxl_dm.c",
in the function "void libxl__spawn_local_dm(libxl__egc *egc, 
libxl__dm_spawn_state *dmss)": 

'''
 // ...
    if (b_info->type == LIBXL_DOMAIN_TYPE_HVM) {
 // ...

        libxl__xs_printf(gc, XBT_NULL,
                         GCSPRINTF("%s/hvmloader/allow-memory-relocate", path),
                         "%d",
                         b_info->device_model_version ==
                             LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN_TRADITIONAL &&
                         !libxl__vnuma_configured(b_info));
 // ...
'''
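
(Given this expression, the key is written with value 1 only when the
device model is qemu-traditional and vNUMA is not configured; for
"qemu_xen" it is written with value 0, i.e. relocation is suppressed
in the local_dm path.)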

However, on QubesOS (my system), "local_dm" is never used; "stub_dm"
is always used.

In the function "void lib
xl__spawn_stub_dm(libxl__egc *egc, libxl__stub_dm_spawn_state *sdss)", 
the value of "allow_memory_relocate" is never defined. 

I tried to patch the code to set "allow_memory_relocate" in
"libxl__spawn_stub_dm":

'''
--- a/tools/libs/light/libxl_dm.c
+++ b/tools/libs/light/libxl_dm.c
@@ -2431,6 +2431,10 @@ void libxl__spawn_stub_dm(libxl__egc *egc, libxl__stub_dm_spawn_state *sdss)
                                        libxl__xs_get_dompath(gc, guest_domid)),
                         "%s",
                         libxl_bios_type_to_string(guest_config->b_info.u.hvm.bios));
+        libxl__xs_printf(gc, XBT_NULL,
+                         libxl__sprintf(gc, "%s/hvmloader/allow-memory-relocate",
+                                        libxl__xs_get_dompath(gc, guest_domid)),
+                         "%d",
+                         0);
     }
     ret = xc_domain_set_target(ctx->xch, dm_domid, guest_domid);
     if (ret<0) {
'''

And it is indeed another way to solve my issue.
However, I do not understand the Xen hypervisor well enough to know
whether "allow-memory-relocate" should be set in
"libxl__spawn_stub_dm" or whether the real issue is somewhere else.


> Or did that code go stale?
> 
> > Code extract: 
> > 
> > tools/firmware/hvmloader/pci.c 
> > "
> >     /*
> >      * Do we allow hvmloader to relocate guest memory in order to
> >      * increase the size of the lowmem MMIO hole?  Defaulting to 1
> >      * here will mean that non-libxl toolstacks (including xend and
> >      * home-grown ones) means that those using qemu-xen will still
> >      * experience the memory relocation bug described below; but it
> >      * also means that those using qemu-traditional will *not*
> >      * experience any change; and it also means that there is a
> >      * work-around for those using qemu-xen, namely switching to
> >      * qemu-traditional.
> >      *
> >      * If we defaulted to 0, and failing to resize the hole caused any
> >      * problems with qemu-traditional, then there is no work-around.
> >      *
> >      * Since xend can only use qemu-traditional, I think this is the
> >      * option that will have the least impact.
> >      */
> >     bool allow_memory_relocate = 1;
> > "
> > 
> > "
> >         /*
> >          * At the moment qemu-xen can't deal with relocated memory regions.
> >          * It's too close to the release to make a proper fix; for now,
> >          * only allow the MMIO hole to grow large enough to move guest memory
> >          * if we're running qemu-traditional.  Items that don't fit will be
> >          * relocated into the 64-bit address space.
> >          *
> >          * This loop now does the following:
> >          * - If allow_memory_relocate, increase the MMIO hole until it's
> >          *   big enough, or until it's 2GiB
> >          * - If !allow_memory_relocate, increase the MMIO hole until it's
> >          *   big enough, or until it's 2GiB, or until it overlaps guest
> >          *   memory
> >          */
> >         while ( (mmio_total > (pci_mem_end - pci_mem_start))
> >                 && ((pci_mem_start << 1) != 0)
> >                 && (allow_memory_relocate
> >                     || (((pci_mem_start << 1) >> PAGE_SHIFT)
> >                         >= hvm_info->low_mem_pgend)) )
> >             pci_mem_start <<= 1;
> > "
> > 
> > The issue it causes is documented in the source code: guest memory
> > can be trashed by hvmloader.
> > 
> > Due to this issue, it is impossible to pass through a large PCI
> > device to an HVM with a lot of RAM
> > (https://github.com/QubesOS/qubes-issues/issues/4321).
> > 
> > (Forcing "allow_memory_relocate" to 0 seems to solve the linked
> > issue.)
> 
> What I don't understand here (and what I also can't find helpful logs for)
> is: The code in hvmloader is in principle intended to work in both cases.
> If there's suspected guest memory corruption, can we get to see the actual
> results of the MMIO hole creation and its use for device ranges? If there
> indeed is an overlap with guest RAM, that's a bug which wants fixing.

In tools/firmware/hvmloader/pci.c, function "void pci_setup(void)",
I added logging to print whether hvmloader would have tried to overlap
guest memory:

'''

+        printf("NEOWUTRAN pci.c: pci_mem_end: %d\n",pci_mem_end);
+        printf("NEOWUTRAN pci.c: pci_mem_start: %d\n",pci_mem_start);
+        printf("NEOWUTRAN pci.c: allow_memory_relocate: 
%d\n",allow_memory_relocate);
+        printf("NEOWUTRAN pci.c: hvm_info->low_mem_pgend: 
%d\n",hvm_info->low_mem_pgend);
+
+
        while ( (mmio_total > (pci_mem_end - pci_mem_start))
                && ((pci_mem_start << 1) != 0)
                && (allow_memory_relocate
                    || (((pci_mem_start << 1) >> PAGE_SHIFT)
                        >= hvm_info->low_mem_pgend)) )
            pci_mem_start <<= 1;

+         if ( (mmio_total > (pci_mem_end - pci_mem_start))
+                && ((pci_mem_start << 1) != 0)
+                && (1
+                    || (((pci_mem_start << 1) >> PAGE_SHIFT)
+                        >= hvm_info->low_mem_pgend)) ){
+               
printf("NEOWUTRAN pci.c: HVM_LOADER would have tried to relocate guest 
memory\n");
+               printf("NEOWUTRAN pci.c: pci_mem_start: %d\n",pci_mem_start);
+
+        }
'''
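
(The second block re-evaluates the loop condition with
"allow_memory_relocate" forced to 1; it only logs what the loop would
have done, it does not relocate anything itself.)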

And the associated result is: 

(d22) NEOWUTRAN pci.c: pci_mem_end: -67108864
(d22) NEOWUTRAN pci.c: pci_mem_start: -268435456
(d22) NEOWUTRAN pci.c: allow_memory_relocate: 0
(d22) NEOWUTRAN pci.c: hvm_info->low_mem_pgend: 983040
(d22) NEOWUTRAN pci.c: HVM_LOADER would have tried to relocate guest memory
(d22) NEOWUTRAN pci.c: pci_mem_start: -268435456

So if "allow_memory_relocate" had not been set to 0, hvmloader would
have tried to overlap guest memory, and that seems to be something
qemu_xen cannot handle.
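
The negative numbers are just an artifact of printing addresses with
"%d". Decoded as the unsigned 32-bit values hvmloader actually works
with, the overlap becomes visible; a small standalone sketch (not
hvmloader code, just re-deriving the numbers from the log above):

'''
#include <stdio.h>
#include <stdint.h>

#define PAGE_SHIFT 12

int main(void)
{
    /* The log values above, reinterpreted as unsigned 32-bit. */
    uint32_t pci_mem_end   = (uint32_t)-67108864;  /* 0xFC000000 */
    uint32_t pci_mem_start = (uint32_t)-268435456; /* 0xF0000000 */
    uint32_t low_mem_pgend = 983040;               /* in pages   */

    /* Current MMIO hole: [0xF0000000, 0xFC000000) = 192 MiB. */
    printf("hole size: %u MiB\n",
           (unsigned)((pci_mem_end - pci_mem_start) >> 20));

    /* Guest low RAM ends exactly where the hole starts. */
    printf("low RAM end: 0x%08x\n",
           (unsigned)(low_mem_pgend << PAGE_SHIFT));

    /* One more doubling step (pci_mem_start <<= 1 wraps in 32 bits
     * and moves the hole start *down*) would give 0xE0000000,
     * i.e. 256 MiB inside guest RAM. */
    printf("next start: 0x%08x\n", (unsigned)(pci_mem_start << 1));
    return 0;
}
'''

So the hole start and the end of guest low memory coincide exactly at
0xF0000000, and any further growth of the hole necessarily overlaps
guest RAM.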

> I
> don't see why we would want to drop allow_memory_relocate in the way you
> suggest; quite the other way around, if upstream qemu still can't tolerate
> hvmloader doing this, then it wants fixing there such that RAM relocation
> can be enabled unconditionally. Then the variable will indeed not be needed
> anymore.

I can send different logs or try different things if needed.


> 
> Jan

Thanks, 
Neowutran
