
Re: [Xen-devel] [v3][PATCH 07/16] hvmloader/pci: skip reserved ranges



On 2015/6/17 17:24, Jan Beulich wrote:
On 17.06.15 at 11:18, <tiejun.chen@xxxxxxxxx> wrote:
On 2015/6/17 17:02, Jan Beulich wrote:
On 17.06.15 at 10:26, <tiejun.chen@xxxxxxxxx> wrote:
Another idea just occurred to me:

#1. Still allocate all devices as before.
#2. Look up all actual BARs to check whether they conflict with RMRRs.

We can skip these BARs, keeping them at zero, which makes the later lookup easy.

#3. We need to reallocate these conflicting BARs.
#3.1 Try to reallocate them from the remaining resources.
#3.2 If the remaining resources aren't enough, we need to allocate them
from high_mem_resource.

That's possible only for 64-bit BARs.

You're right, so this means it's not proper to adjust mmio_total to
include the conflicting reserved ranges and then move all conflicting
BARs to high_mem_resource, as Kevin suggested previously. So at a high
level, we still need to decrease pci_mem_start and populate more RAM to
compensate for them, as I did before, right?

You probably should do both: Prefer moving things beyond 4Gb,
but if not possible increase the MMIO hole.


I'm trying to figure out a better solution. Perhaps we can allocate the 32-bit BARs and the 64-bit BARs in separate, ordered passes. This may help us bypass those complicated corner cases.

#1. We no longer calculate up front how much memory needs to be compensated and populated in order to expand the hole, as we intended previously.

#2. Instead, before allocating BARs, we just check whether reserved device memory really conflicts with the default region [pci_mem_start, pci_mem_end].

#2.1 If not, obviously nothing is changed.
#2.2 If yes, we introduce a new local bool, bar32_allocating, which indicates whether we want to allocate the 32-bit BARs and the 64-bit BARs separately.

So here we set it to true, and we also need to set 'bar64_relocate' so that some BARs can be relocated to 64-bit.

    /*
     * Check if reserved device memory conflicts with the current PCI memory.
     * If it does, allocate the 32-bit BARs first, since reserved devices
     * always occupy low memory, and also enable relocating some BARs
     * to 64-bit where possible.
     */
    for ( i = 0; i < memory_map.nr_map ; i++ )
    {
        reserved_start = memory_map.map[i].addr;
        reserved_size = memory_map.map[i].size;
        reserved_end = reserved_start + reserved_size;
        if ( check_overlap(pci_mem_start, pci_mem_end - pci_mem_start,
                           reserved_start, reserved_size) )
        {
            printf("Reserved device memory conflicts current PCI memory,"
                   " so first to allocate 32-bit BAR and trying to"
                   " relocating some BARs to 64-bit\n");
            bar32_allocating = 1;
            if ( !bar64_relocate )
                bar64_relocate = 1;
        }
    }
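
For reference, check_overlap() above is just an interval-intersection test over [start, start + size); a minimal standalone sketch of such a helper (which may differ from the exact hvmloader implementation) would be:

#include <stdint.h>

/* Return non-zero if [start, start + size) intersects [rsv_start, rsv_start + rsv_size). */
static int check_overlap(uint64_t start, uint64_t size,
                         uint64_t rsv_start, uint64_t rsv_size)
{
    return (start + size > rsv_start) && (rsv_start + rsv_size > start);
}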

#2.2.1 This also means we may run the allocation more than once, so add a label at the allocation loop,

+ further_allocate:
     /* Assign iomem and ioport resources in descending order of size. */
     for ( i = 0; i < nr_bars; i++ )
     {

#2.3 Make sure we can allocate all BARs separately, as we expect,

         bar_sz  = bars[i].bar_sz;

         /*
+         * When bar32_allocating is set, allocate the 32-bit BARs first,
+         * so that as many 32-bit BARs as possible can be allocated.
+         */
+        if ( bars[i].is_64bar && bar32_allocating )
+            continue;
+
+        /*
          * Relocate to high memory if the total amount of MMIO needed
          * is more than the low MMIO available.  Because devices are
          * processed in order of bar_sz, this will preferentially

#2.4 We still need to skip reserved device memory

         base = (resource->base  + bar_sz - 1) & ~(uint64_t)(bar_sz - 1);
         bar_data |= (uint32_t)base;
         bar_data_upper = (uint32_t)(base >> 32);
+        /*
+         * Skip all reserved device memory.
+         *
+         * Note: memory_map.map is assumed not to cover high memory,
+         * which keeps things simple here.
+         */
+        if ( !using_64bar )
+        {
+            for ( j = 0; j < memory_map.nr_map ; j++ )
+            {
+                if ( memory_map.map[j].type != E820_RAM )
+                {
+                    reserved_end = memory_map.map[j].addr + memory_map.map[j].size;
+                    if ( check_overlap(base, bar_sz,
+                                       memory_map.map[j].addr,
+                                       memory_map.map[j].size) )
+                    {
+                        base = reserved_end;
+                        continue;
+                    }
+                }
+            }
+        }
         base += bar_sz;

         if ( (base < resource->base) || (base > resource->max) )

#2.5 Then go to the second allocation if necessary

#2.5.1 After trying to allocate all 32-bit BARs in the first pass, we check whether any 32-bit BARs are still unallocated.

#2.5.1.1 If all 32-bit BARs are allocated, we just set the following,

+                bar32_allocating = 0;
+                goto further_allocate;

then we allocate the 64-bit BARs in the second pass. But note this doesn't mean we allocate the 64-bit BARs only from highmem, since the original mechanism still works at this point; we're still trying to allocate 64-bit BARs from the remaining low PCI memory and highmem as before.

#2.5.2 If not all 32-bit BARs are allocated, we need to populate RAM to extend the low PCI memory.

#2.5.2.1 Calculate how much memory is still needed,

+            /* Calculate the total size of the unallocated 32-bit BARs. */
+            for ( n = 0; n < nr_bars ; n++ )
+            {
+                if ( !bars[n].is_64bar )
+                {
+                    uint32_t devfn32, bar_reg32, bar_data32;
+                    uint64_t bar_sz32;
+                    devfn32   = bars[n].devfn;
+                    bar_reg32 = bars[n].bar_reg;
+                    bar_sz32  = bars[n].bar_sz;
+                    bar_data32 = pci_readl(devfn32, bar_reg32);
+                    if ( !bar_data32 )
+                        mmio32_unallocated_total  += bar_sz32;
+                }
+            }

#2.5.2.2 Populate that memory,

+                cur_pci_mem_start = pci_mem_start - mmio32_unallocated_total;
+                relocate_ram_for_pci_memory(cur_pci_mem_start);
+                exp_mem_resource.base = cur_pci_mem_start;
+                exp_mem_resource.max = pci_mem_start;

#2.5.2.3 goto further_allocate to reallocate the remaining 32-bit BARs. Note we should make sure they are allocated only from exp_mem_resource above.

#2.5.3 In theory we can then allocate all 32-bit BARs successfully, but we also check whether mmio32_unallocated_total is already set, to make sure we don't fall into an infinite loop.

#2.5.4 Then, as described in #2.5.1.1, we go on to allocate the 64-bit BARs.
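
To make the overall flow concrete, here is a small standalone toy program, separate from hvmloader (the MMIO hole, the reserved range and all sizes below are made up purely for illustration): it skips 64-bit BARs in a first pass, steps the 32-bit allocations over a reserved range in the low hole, and then places the 64-bit BARs in a second pass.

/*
 * Toy, standalone illustration of the two-pass BAR placement idea.
 * NOT hvmloader code: the MMIO hole, reserved range, sizes and names
 * below are invented just to show the control flow.
 */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

struct toy_bar { int is_64bar; uint64_t sz, base; int done; };

static int overlaps(uint64_t s1, uint64_t sz1, uint64_t s2, uint64_t sz2)
{
    return s1 + sz1 > s2 && s2 + sz2 > s1;
}

int main(void)
{
    /* Hypothetical low MMIO hole with an RMRR-like reserved range inside it. */
    uint64_t low_base = 0xf0000000u, low_end = 0xfc000000u;
    uint64_t rsv_base = 0xf4000000u, rsv_size = 0x00100000u;
    uint64_t cur_low = low_base, cur_high = 0x100000000ULL;

    struct toy_bar bars[] = {
        { 1, 0x08000000, 0, 0 },   /* 128MB 64-bit BAR */
        { 0, 0x04000000, 0, 0 },   /*  64MB 32-bit BAR */
        { 0, 0x00010000, 0, 0 },   /*  64KB 32-bit BAR */
    };
    unsigned int i, nr_bars = sizeof(bars) / sizeof(bars[0]);

    /* If the reserved range hits the low hole, do a 32-bit-only pass first. */
    bool bar32_only = overlaps(low_base, low_end - low_base, rsv_base, rsv_size);

 again:
    for ( i = 0; i < nr_bars; i++ )
    {
        uint64_t *cur, base;

        if ( bars[i].done || (bars[i].is_64bar && bar32_only) )
            continue;
        cur = bars[i].is_64bar ? &cur_high : &cur_low;
        base = (*cur + bars[i].sz - 1) & ~(bars[i].sz - 1);
        /* Step over the reserved range when placing BARs in the low hole. */
        if ( !bars[i].is_64bar && overlaps(base, bars[i].sz, rsv_base, rsv_size) )
            base = (rsv_base + rsv_size + bars[i].sz - 1) & ~(bars[i].sz - 1);
        bars[i].base = base;
        bars[i].done = 1;
        *cur = base + bars[i].sz;
        printf("BAR %u (%s, size 0x%llx) placed at 0x%llx\n", i,
               bars[i].is_64bar ? "64-bit" : "32-bit",
               (unsigned long long)bars[i].sz, (unsigned long long)base);
    }
    if ( bar32_only )
    {
        bar32_only = false;   /* second pass: now place the 64-bit BARs */
        goto again;
    }
    return 0;
}

This is only a sketch of the control flow, of course; the real patch below additionally has to program the BAR registers, handle the exp_mem_resource fallback and respect the existing bar64_relocate logic.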

The following patch tries to implement this idea:

diff --git a/tools/firmware/hvmloader/pci.c b/tools/firmware/hvmloader/pci.c
index 5ff87a7..f2953c0 100644
--- a/tools/firmware/hvmloader/pci.c
+++ b/tools/firmware/hvmloader/pci.c
@@ -38,6 +38,31 @@ uint64_t pci_hi_mem_start = 0, pci_hi_mem_end = 0;
 enum virtual_vga virtual_vga = VGA_none;
 unsigned long igd_opregion_pgbase = 0;

+static void relocate_ram_for_pci_memory(unsigned long cur_pci_mem_start)
+{
+    struct xen_add_to_physmap xatp;
+    unsigned int nr_pages = min_t(
+        unsigned int,
+        hvm_info->low_mem_pgend - (cur_pci_mem_start >> PAGE_SHIFT),
+        (1u << 16) - 1);
+    if ( hvm_info->high_mem_pgend == 0 )
+        hvm_info->high_mem_pgend = 1ull << (32 - PAGE_SHIFT);
+    hvm_info->low_mem_pgend -= nr_pages;
+    printf("Relocating 0x%x pages from "PRIllx" to "PRIllx\
+           " for lowmem MMIO hole\n",
+           nr_pages,
+           PRIllx_arg(((uint64_t)hvm_info->low_mem_pgend)<<PAGE_SHIFT),
+           PRIllx_arg(((uint64_t)hvm_info->high_mem_pgend)<<PAGE_SHIFT));
+    xatp.domid = DOMID_SELF;
+    xatp.space = XENMAPSPACE_gmfn_range;
+    xatp.idx   = hvm_info->low_mem_pgend;
+    xatp.gpfn  = hvm_info->high_mem_pgend;
+    xatp.size  = nr_pages;
+    if ( hypercall_memory_op(XENMEM_add_to_physmap, &xatp) != 0 )
+        BUG();
+    hvm_info->high_mem_pgend += nr_pages;
+}
+
 void pci_setup(void)
 {
     uint8_t is_64bar, using_64bar, bar64_relocate = 0;
@@ -50,7 +75,7 @@ void pci_setup(void)
     /* Resources assignable to PCI devices via BARs. */
     struct resource {
         uint64_t base, max;
-    } *resource, mem_resource, high_mem_resource, io_resource;
+    } *resource, mem_resource, high_mem_resource, io_resource, exp_mem_resource;

     /* Create a list of device BARs in descending order of size. */
     struct bars {
@@ -59,8 +84,11 @@ void pci_setup(void)
         uint32_t bar_reg;
         uint64_t bar_sz;
     } *bars = (struct bars *)scratch_start;
-    unsigned int i, nr_bars = 0;
-    uint64_t mmio_hole_size = 0;
+    unsigned int i, j, n, nr_bars = 0;
+    uint64_t mmio_hole_size = 0, reserved_start, reserved_end, reserved_size;
+    bool bar32_allocating = 0;
+    uint64_t mmio32_unallocated_total = 0;
+    unsigned long cur_pci_mem_start = 0;

     const char *s;
     /*
@@ -309,29 +337,31 @@ void pci_setup(void)
     }

     /* Relocate RAM that overlaps PCI space (in 64k-page chunks). */
+    cur_pci_mem_start = pci_mem_start;
     while ( (pci_mem_start >> PAGE_SHIFT) < hvm_info->low_mem_pgend )
+        relocate_ram_for_pci_memory(cur_pci_mem_start);
+
+    /*
+     * Check if reserved device memory conflicts with the current PCI memory.
+     * If it does, allocate the 32-bit BARs first, since reserved devices
+     * always occupy low memory, and also enable relocating some BARs
+     * to 64-bit where possible.
+     */
+    for ( i = 0; i < memory_map.nr_map ; i++ )
     {
-        struct xen_add_to_physmap xatp;
-        unsigned int nr_pages = min_t(
-            unsigned int,
-            hvm_info->low_mem_pgend - (pci_mem_start >> PAGE_SHIFT),
-            (1u << 16) - 1);
-        if ( hvm_info->high_mem_pgend == 0 )
-            hvm_info->high_mem_pgend = 1ull << (32 - PAGE_SHIFT);
-        hvm_info->low_mem_pgend -= nr_pages;
-        printf("Relocating 0x%x pages from "PRIllx" to "PRIllx\
-               " for lowmem MMIO hole\n",
-               nr_pages,
-               PRIllx_arg(((uint64_t)hvm_info->low_mem_pgend)<<PAGE_SHIFT),
-               PRIllx_arg(((uint64_t)hvm_info->high_mem_pgend)<<PAGE_SHIFT));
-        xatp.domid = DOMID_SELF;
-        xatp.space = XENMAPSPACE_gmfn_range;
-        xatp.idx   = hvm_info->low_mem_pgend;
-        xatp.gpfn  = hvm_info->high_mem_pgend;
-        xatp.size  = nr_pages;
-        if ( hypercall_memory_op(XENMEM_add_to_physmap, &xatp) != 0 )
-            BUG();
-        hvm_info->high_mem_pgend += nr_pages;
+        reserved_start = memory_map.map[i].addr;
+        reserved_size = memory_map.map[i].size;
+        reserved_end = reserved_start + reserved_size;
+        if ( check_overlap(pci_mem_start, pci_mem_end - pci_mem_start,
+                           reserved_start, reserved_size) )
+        {
+            printf("Reserved device memory conflicts current PCI memory,"
+                   " so first to allocate 32-bit BAR and trying to"
+                   " relocating some BARs to 64-bit\n");
+            bar32_allocating = 1;
+            if ( !bar64_relocate )
+                bar64_relocate = 1;
+        }
     }

     high_mem_resource.base = ((uint64_t)hvm_info->high_mem_pgend) << PAGE_SHIFT;
@@ -352,6 +382,7 @@ void pci_setup(void)
     io_resource.base = 0xc000;
     io_resource.max = 0x10000;

+ further_allocate:
     /* Assign iomem and ioport resources in descending order of size. */
     for ( i = 0; i < nr_bars; i++ )
     {
@@ -360,6 +391,13 @@ void pci_setup(void)
         bar_sz  = bars[i].bar_sz;

         /*
+         * When bar32_allocating is set, allocate the 32-bit BARs first,
+         * so that as many 32-bit BARs as possible can be allocated.
+         */
+        if ( bars[i].is_64bar && bar32_allocating )
+            continue;
+
+        /*
          * Relocate to high memory if the total amount of MMIO needed
          * is more than the low MMIO available.  Because devices are
          * processed in order of bar_sz, this will preferentially
@@ -395,7 +433,14 @@ void pci_setup(void)
                 bar_data &= ~PCI_BASE_ADDRESS_MEM_MASK;
             }
             else {
-                resource = &mem_resource;
+                /*
+                 * This means we're trying to use the expanded
+                 * memory to reallocate the 32-bit BARs.
+                 */
+                if ( mmio32_unallocated_total )
+                    resource = &exp_mem_resource;
+                else
+                    resource = &mem_resource;
                 bar_data &= ~PCI_BASE_ADDRESS_MEM_MASK;
             }
             mmio_total -= bar_sz;
@@ -409,6 +454,29 @@ void pci_setup(void)
         base = (resource->base  + bar_sz - 1) & ~(uint64_t)(bar_sz - 1);
         bar_data |= (uint32_t)base;
         bar_data_upper = (uint32_t)(base >> 32);
+        /*
+         * Skip all reserved device memory.
+         *
+         * Note: memory_map.map is assumed not to cover high memory,
+         * which keeps things simple here.
+         */
+        if ( !using_64bar )
+        {
+            for ( j = 0; j < memory_map.nr_map ; j++ )
+            {
+                if ( memory_map.map[j].type != E820_RAM )
+                {
+                    reserved_end = memory_map.map[j].addr + memory_map.map[j].size;
+                    if ( check_overlap(base, bar_sz,
+                                       memory_map.map[j].addr,
+                                       memory_map.map[j].size) )
+                    {
+                        base = reserved_end;
+                        continue;
+                    }
+                }
+            }
+        }
         base += bar_sz;

         if ( (base < resource->base) || (base > resource->max) )
@@ -439,6 +507,54 @@ void pci_setup(void)
         else
             cmd |= PCI_COMMAND_IO;
         pci_writew(devfn, PCI_COMMAND, cmd);
+
+        /* Check whether we have just finished the first, 32-bit-only pass. */
+        if ( i == nr_bars - 1 && bar32_allocating )
+        {
+            /*
+             * We don't populate more RAM a second time to finish allocating
+             * all 32-bit BARs, so just go on to allocate the 64-bit BARs.
+             */
+            if ( mmio32_unallocated_total )
+            {
+                bar32_allocating = 0;
+                mmio32_unallocated_total = 0;
+                high_mem_resource.base =
+                        ((uint64_t)hvm_info->high_mem_pgend) << PAGE_SHIFT;
+                goto further_allocate;
+            }
+
+            /* Calculate the total size of the unallocated 32-bit BARs. */
+            for ( n = 0; n < nr_bars ; n++ )
+            {
+                if ( !bars[n].is_64bar )
+                {
+                    uint32_t devfn32, bar_reg32, bar_data32;
+                    uint64_t bar_sz32;
+                    devfn32   = bars[n].devfn;
+                    bar_reg32 = bars[n].bar_reg;
+                    bar_sz32  = bars[n].bar_sz;
+                    bar_data32 = pci_readl(devfn32, bar_reg32);
+                    if ( !bar_data32 )
+                        mmio32_unallocated_total  += bar_sz32;
+                }
+            }
+
+            /*
+             * We have to populate more RAM in order to allocate
+             * the remaining 32-bit BARs.
+             */
+            if ( mmio32_unallocated_total )
+            {
+                cur_pci_mem_start = pci_mem_start - mmio32_unallocated_total;
+                relocate_ram_for_pci_memory(cur_pci_mem_start);
+                exp_mem_resource.base = cur_pci_mem_start;
+                exp_mem_resource.max = pci_mem_start;
+            }
+            else
+                bar32_allocating = 0;
+            goto further_allocate;
+        }
     }

     if ( pci_hi_mem_start )


Thanks
Tiejun
