
Re: [Xen-devel] HVM support for e820_host (Was: Bug: Limitation of <=2GB RAM in domU persists with 4.3.0)

On Tue, 10 Sep 2013 09:35:59 -0400, Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx> wrote:
On Fri, Sep 06, 2013 at 08:54:24PM +0100, Gordan Bobic wrote:
Here is a test patch I applied to:

--- e820.c.orig 2013-09-06 11:15:20.023337321 +0100
+++ e820.c      2013-09-06 19:53:00.141876019 +0100
@@ -79,6 +79,7 @@
     unsigned int nr = 0;
     struct xen_memory_map op;
     struct e820entry map[E820MAX];
+    int e820_host = 0;
     int rc;

     if ( !lowmem_reserved_base )
@@ -88,6 +89,7 @@

     rc = hypercall_memory_op ( XENMEM_memory_map, &op);
     if ( rc != -ENOSYS) { /* It works!? */
+        e820_host = 1;
+        printf("%s:%d got %d op.nr_entries\n", __func__, __LINE__, op.nr_entries);
         dump_e820_table(&map[0], op.nr_entries);
@@ -133,7 +135,12 @@
     /* Low RAM goes here. Reserve space for special pages. */
     BUG_ON((hvm_info->low_mem_pgend << PAGE_SHIFT) < (2u << 20));
     e820[nr].addr = 0x100000;
-    e820[nr].size = (hvm_info->low_mem_pgend << PAGE_SHIFT) - e820[nr].addr;
+    if (e820_host)
+        e820[nr].size = 0x3f7e0000 - e820[nr].addr;
+    else
+        e820[nr].size = (hvm_info->low_mem_pgend << PAGE_SHIFT) - e820[nr].addr;
     e820[nr].type = E820_RAM;


I'm sure this doesn't need explicitly pointing out, but for the
record, it is a gross hack just to prove the concept.

The map dump with this patch applied and memory set to 8192 is:

(XEN) HVM5: BIOS map:
(XEN) HVM5:  f0000-fffff: Main BIOS
(XEN) HVM5: build_e820_table:93 got 8 op.nr_entries
(XEN) HVM5: E820 table:
(XEN) HVM5:  [00]: 00000000:00000000 - 00000000:3f790000: RAM
(XEN) HVM5:  [01]: 00000000:3f790000 - 00000000:3f79e000: ACPI
(XEN) HVM5:  [02]: 00000000:3f79e000 - 00000000:3f7d0000: NVS
(XEN) HVM5:  [03]: 00000000:3f7d0000 - 00000000:3f7e0000: RESERVED
(XEN) HVM5:  HOLE: 00000000:3f7e0000 - 00000000:3f7e7000
(XEN) HVM5:  [04]: 00000000:3f7e7000 - 00000000:40000000: RESERVED
(XEN) HVM5:  HOLE: 00000000:40000000 - 00000000:fee00000
(XEN) HVM5:  [05]: 00000000:fee00000 - 00000000:fee01000: RESERVED
(XEN) HVM5:  HOLE: 00000000:fee01000 - 00000000:ffc00000
(XEN) HVM5:  [06]: 00000000:ffc00000 - 00000001:00000000: RESERVED
(XEN) HVM5:  [07]: 00000001:00000000 - 00000002:c0870000: RAM
(XEN) HVM5: E820 table:
(XEN) HVM5:  [00]: 00000000:00000000 - 00000000:0009e000: RAM
(XEN) HVM5:  [01]: 00000000:0009e000 - 00000000:000a0000: RESERVED
(XEN) HVM5:  HOLE: 00000000:000a0000 - 00000000:000e0000
(XEN) HVM5:  [02]: 00000000:000e0000 - 00000000:00100000: RESERVED
(XEN) HVM5:  [03]: 00000000:00100000 - 00000000:3f7e0000: RAM
(XEN) HVM5:  HOLE: 00000000:3f7e0000 - 00000000:fc000000
(XEN) HVM5:  [04]: 00000000:fc000000 - 00000001:00000000: RESERVED
(XEN) HVM5:  [05]: 00000001:00000000 - 00000002:1f800000: RAM
(XEN) HVM5: Invoking ROMBIOS ...

Good observations:
It works! No crashes, no screen corruption. As an added bonus, it
fixes the problem of rebooting domUs causing them to lose GPU access
and eventually crash the host, even with the memory allocation below
the first PCI MMIO block. I suspect that something in the
0x3f7e0000-0x3f7e7000 hole that isn't showing up in lspci might be
responsible.
I think that proves beyond any doubt what the problem was before.

Interesting observations:
1) GPU PCI MMIO is still mapped at E0000000, rather than at the
bottom of the memory hole. That implies that SeaBIOS (or whatever
does the mapping) makes assumptions about where the memory hole
begins. This will need to somehow be fixed / made dynamic. What
decides where to map PCI memory for each device?

2) The memory hole size difference counts toward the total guest
memory. I set memory to 8192, but Windows in the domU only sees
5.48GB. What is particularly odd is that the missing memory isn't
3GB but 2.5GB, which implies that, again, there are other things
making assumptions about the size and shape of the memory hole and
moving the memory from the hole elsewhere to make it usable. What
does this?

My todo list, in order of priority (unless somebody here has a
better idea) is:
1) Tidy up the hole enlargement so it is dynamically sized based on
the host hole locations. In cases where a host hole overlaps
something other than guest RAM/HOLE (i.e. RESERVED), the guest spec
wins.

guest spec is .. the default hvmloader behavior?

Yes, that's exactly what I meant. At least until I can figure out
what necessitates the default HVM behaviour.

2) Fix whatever is causing the hole memory increase to reduce the
guest memory. The memory hole is a hole, not a shadow. I need some
pointers on where to look for whatever is responsible for this.

That is where git log tools/firmware/hvmloader might shed some light.

I grepped for low_mem_pgend and high_mem_pgend, and the only place I
have found anything is in libxc. Is this what sets it? Is it common
to xm and xl?

3) Fix what makes decisions on where to map devices' memory
apertures. Ideally, the fix should be to detect host's pBAR make
vBAR=pBAR. Again, I need some pointers on where to look for whatever
is responsible for doing this mapping.

That should all be in tools/firmware/hvmloader I believe.
The 'pci_setup' function, where it says:
 /* Assign iomem and ioport resources in descending order of size. */

Thanks, will take a closer look there.


Xen-devel mailing list


