
Re: hvmloader - allow_memory_relocate overlaps



On 2023-12-19 17:12, Jan Beulich wrote:
> On 16.12.2023 08:01, Neowutran wrote:
> > I am wondering if the variable "allow_memory_relocate" is still
> > relevant today and if its default value is still relevant.
> > Should it be defined to 0 by default instead of 1 (it seems to be a
> > workaround for qemu-traditional, so maybe an outdated default value?)?
> 
> So are you saying you use qemu-trad?
No, I am using "qemu_xen" (from xenstore-read: 'device-model =
"qemu_xen"').

> Otherwise isn't libxl suppressing this behavior anyway?
If by "isn't libxl suppressing this behavior" you means if libxl is setting
the value of "allow_memory_relocate", then the answer is no. 

Following this lead, I checked in which code paths
"allow_memory_relocate" could be set.

It is set in only one code path:

In the file "tools/libs/light/libxl_dm.c",
in the function "void libxl__spawn_local_dm(libxl__egc *egc, 
libxl__dm_spawn_state *dmss)": 

'''
 // ...
    if (b_info->type == LIBXL_DOMAIN_TYPE_HVM) {
 // ...

        libxl__xs_printf(gc, XBT_NULL,
                         GCSPRINTF("%s/hvmloader/allow-memory-relocate", path),
                         "%d",
                         b_info->device_model_version ==
                             LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN_TRADITIONAL &&
                         !libxl__vnuma_configured(b_info));
 // ...
'''
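
(Given this expression, the key is written with value 1 only when the
device model is qemu-traditional and vNUMA is not configured; for
"qemu_xen" it is written with value 0, i.e. relocation is suppressed
in the local_dm path.)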

However, on QubesOS (my system), "local_dm" is never used; "stub_dm"
is always used.

In the function "void lib
xl__spawn_stub_dm(libxl__egc *egc, libxl__stub_dm_spawn_state *sdss)", 
the value of "allow_memory_relocate" is never defined. 

I tried to patch the code to set "allow_memory_relocate" in
"libxl__spawn_stub_dm":

'''
--- a/tools/libs/light/libxl_dm.c
+++ b/tools/libs/light/libxl_dm.c
@@ -2431,6 +2431,10 @@ void libxl__spawn_stub_dm(libxl__egc *egc, libxl__stub_dm_spawn_state *sdss)
                                        libxl__xs_get_dompath(gc, guest_domid)),
                         "%s",
                         libxl_bios_type_to_string(guest_config->b_info.u.hvm.bios));
+        libxl__xs_printf(gc, XBT_NULL,
+                         libxl__sprintf(gc, "%s/hvmloader/allow-memory-relocate",
+                                        libxl__xs_get_dompath(gc, guest_domid)),
+                         "%d",
+                         0);
     }
     ret = xc_domain_set_target(ctx->xch, dm_domid, guest_domid);
     if (ret<0) {
'''

And it is indeed another way to solve my issue.
However, I do not understand the Xen hypervisor well enough to know
whether "allow-memory-relocate" should be set in
"libxl__spawn_stub_dm" or whether the real issue is somewhere else.


> Or did that code go stale?
> 
> > Code extract: 
> > 
> > tools/firmware/hvmloader/pci.c 
> > "
> >     /*
> >      * Do we allow hvmloader to relocate guest memory in order to
> >      * increase the size of the lowmem MMIO hole?  Defaulting to 1
> >      * here will mean that non-libxl toolstacks (including xend and
> >      * home-grown ones) means that those using qemu-xen will still
> >      * experience the memory relocation bug described below; but it
> >      * also means that those using qemu-traditional will *not*
> >      * experience any change; and it also means that there is a
> >      * work-around for those using qemu-xen, namely switching to
> >      * qemu-traditional.
> >      *
> >      * If we defaulted to 0, and failing to resize the hole caused any
> >      * problems with qemu-traditional, then there is no work-around.
> >      *
> >      * Since xend can only use qemu-traditional, I think this is the
> >      * option that will have the least impact.
> >      */
> >     bool allow_memory_relocate = 1;
> > "
> > 
> > "
> >         /*
> >          * At the moment qemu-xen can't deal with relocated memory regions.
> >          * It's too close to the release to make a proper fix; for now,
> >          * only allow the MMIO hole to grow large enough to move guest memory
> >          * if we're running qemu-traditional.  Items that don't fit will be
> >          * relocated into the 64-bit address space.
> >          *
> >          * This loop now does the following:
> >          * - If allow_memory_relocate, increase the MMIO hole until it's
> >          *   big enough, or until it's 2GiB
> >          * - If !allow_memory_relocate, increase the MMIO hole until it's
> >          *   big enough, or until it's 2GiB, or until it overlaps guest
> >          *   memory
> >          */
> >         while ( (mmio_total > (pci_mem_end - pci_mem_start))
> >                 && ((pci_mem_start << 1) != 0)
> >                 && (allow_memory_relocate
> >                     || (((pci_mem_start << 1) >> PAGE_SHIFT)
> >                         >= hvm_info->low_mem_pgend)) )
> >             pci_mem_start <<= 1;
> > "
> > 
> > The issue it causes is documented in the source code: guest memory
> > can be trashed by hvmloader.
> > 
> > Due to this issue, it is impossible to pass through a large PCI
> > device to an HVM with a lot of RAM
> > (https://github.com/QubesOS/qubes-issues/issues/4321).
> > 
> > (Forcing "allow_memory_relocate" to 0 seems to solve the linked
> > issue.)
> 
> What I don't understand here (and what I also can't find helpful logs for)
> is: The code in hvmloader is in principle intended to work in both cases.
> If there's suspected guest memory corruption, can we get to see the actual
> results of the MMIO hole creation and its use for device ranges? If there
> indeed is an overlap with guest RAM, that's a bug which wants fixing.

In tools/firmware/hvmloader/pci.c, function "void pci_setup(void)",
I added logging to print whether hvmloader would have tried to overlap
guest memory:

'''

+        printf("NEOWUTRAN pci.c: pci_mem_end: %d\n",pci_mem_end);
+        printf("NEOWUTRAN pci.c: pci_mem_start: %d\n",pci_mem_start);
+        printf("NEOWUTRAN pci.c: allow_memory_relocate: 
%d\n",allow_memory_relocate);
+        printf("NEOWUTRAN pci.c: hvm_info->low_mem_pgend: 
%d\n",hvm_info->low_mem_pgend);
+
+
        while ( (mmio_total > (pci_mem_end - pci_mem_start))
                && ((pci_mem_start << 1) != 0)
                && (allow_memory_relocate
                    || (((pci_mem_start << 1) >> PAGE_SHIFT)
                        >= hvm_info->low_mem_pgend)) )
            pci_mem_start <<= 1;

+         if ( (mmio_total > (pci_mem_end - pci_mem_start))
+                && ((pci_mem_start << 1) != 0)
+                && (1
+                    || (((pci_mem_start << 1) >> PAGE_SHIFT)
+                        >= hvm_info->low_mem_pgend)) ){
+               
printf("NEOWUTRAN pci.c: HVM_LOADER would have tried to relocate guest 
memory\n");
+               printf("NEOWUTRAN pci.c: pci_mem_start: %d\n",pci_mem_start);
+
+        }
'''
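
(The second block re-evaluates the loop condition with
"allow_memory_relocate" forced to 1; it only logs what the loop would
have done, it does not relocate anything itself.)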

And the associated result is: 

(d22) NEOWUTRAN pci.c: pci_mem_end: -67108864
(d22) NEOWUTRAN pci.c: pci_mem_start: -268435456
(d22) NEOWUTRAN pci.c: allow_memory_relocate: 0
(d22) NEOWUTRAN pci.c: hvm_info->low_mem_pgend: 983040
(d22) NEOWUTRAN pci.c: HVM_LOADER would have tried to relocate guest memory
(d22) NEOWUTRAN pci.c: pci_mem_start: -268435456

So if "allow_memory_relocate" had not been set to 0, hvmloader would
have tried to overlap guest memory, and that seems to be something
qemu_xen cannot handle.
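
The negative numbers are just an artifact of printing addresses with
"%d". Decoded as the unsigned 32-bit values hvmloader actually works
with, the overlap becomes visible; a small standalone sketch (not
hvmloader code, just re-deriving the numbers from the log above):

'''
#include <stdio.h>
#include <stdint.h>

#define PAGE_SHIFT 12

int main(void)
{
    /* The log values above, reinterpreted as unsigned 32-bit. */
    uint32_t pci_mem_end   = (uint32_t)-67108864;  /* 0xFC000000 */
    uint32_t pci_mem_start = (uint32_t)-268435456; /* 0xF0000000 */
    uint32_t low_mem_pgend = 983040;               /* in pages   */

    /* Current MMIO hole: [0xF0000000, 0xFC000000) = 192 MiB. */
    printf("hole size: %u MiB\n",
           (unsigned)((pci_mem_end - pci_mem_start) >> 20));

    /* Guest low RAM ends exactly where the hole starts. */
    printf("low RAM end: 0x%08x\n",
           (unsigned)(low_mem_pgend << PAGE_SHIFT));

    /* One more doubling step (pci_mem_start <<= 1 wraps in 32 bits
     * and moves the hole start *down*) would give 0xE0000000,
     * i.e. 256 MiB inside guest RAM. */
    printf("next start: 0x%08x\n", (unsigned)(pci_mem_start << 1));
    return 0;
}
'''

So the hole start and the end of guest low memory coincide exactly at
0xF0000000, and any further growth of the hole necessarily overlaps
guest RAM.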

> I
> don't see why we would want to drop allow_memory_relocate in the way you
> suggest; quite the other way around, if upstream qemu still can't tolerate
> hvmloader doing this, then it wants fixing there such that RAM relocation
> can be enabled unconditionally. Then the variable will indeed not be needed
> anymore.

I can send different logs or try different things if needed.


> 
> Jan

Thanks, 
Neowutran
