
[Xen-devel] RE: Tmem vs order>0 allocation, workaround RFC



Besides generally not liking hackery like this (though we all seem to agree
on that part), and besides an unexplained feeling that there may be other
bad effects from this, I also think that on large systems this may not work
well: with 1TB of memory you'd reserve 8GB, making single-page
below-4GB allocations impossible for Dom0 (unless dom0_mem= was
used), if I read the logic correctly.

Jan

>>> Dan Magenheimer <dan.magenheimer@xxxxxxxxxx> 15.02.10 17:36 >>>
This version should have zero impact if tmem is not enabled.

=======

When tmem is enabled, reserve a fraction of memory
for allocations of 0<order<9 to avoid fragmentation
issues.

Signed-off-by: Dan Magenheimer <dan.magenheimer@xxxxxxxxxx>

diff -r 3bb163b74673 xen/common/page_alloc.c
--- a/xen/common/page_alloc.c   Fri Feb 12 09:24:18 2010 +0000
+++ b/xen/common/page_alloc.c   Mon Feb 15 09:28:01 2010 -0700
@@ -223,6 +223,12 @@ static heap_by_zone_and_order_t *_heap[M
 
 static unsigned long *avail[MAX_NUMNODES];
 static long total_avail_pages;
+static long max_total_avail_pages; /* highwater mark */
+
+/* reserved for midsize (0<order<9) allocations, tmem only for now */
+static long midsize_alloc_zone_pages;
+#define MIDSIZE_ALLOC_FRAC 128
+
 
 static DEFINE_SPINLOCK(heap_lock);
 
@@ -304,6 +310,15 @@ static struct page_info *alloc_heap_page
     spin_lock(&heap_lock);
 
+    /*
+       When available memory is scarce, allow only mid-size allocations
+       to avoid worst of fragmentation issues.  For now, only special-case
+       this when transcendent memory is enabled.
+    */
+    if ( opt_tmem && ((order == 0) || (order >= 9)) &&
+         (total_avail_pages <= midsize_alloc_zone_pages) )
+        goto fail;
+
+    /*
      * Start with requested node, but exhaust all node memory in requested 
      * zone before failing, only calc new node value if we fail to find memory 
      * in target node, this avoids needless computation on fast-path.
@@ -337,6 +352,7 @@ static struct page_info *alloc_heap_page
     }
 
     /* No suitable memory blocks. Fail the request. */
+fail:
     spin_unlock(&heap_lock);
     return NULL;
 
@@ -503,6 +519,11 @@ static void free_heap_pages(
 
     avail[node][zone] += 1 << order;
     total_avail_pages += 1 << order;
+    if ( total_avail_pages > max_total_avail_pages )
+    {
+        max_total_avail_pages = total_avail_pages;
+        midsize_alloc_zone_pages = max_total_avail_pages / MIDSIZE_ALLOC_FRAC;
+    }
 
     /* Merge chunks as far as possible. */
     while ( order < MAX_ORDER )
@@ -842,6 +863,8 @@ static unsigned long avail_heap_pages(
 
 unsigned long total_free_pages(void)
 {
+    if ( opt_tmem )
+        return total_avail_pages - midsize_alloc_zone_pages;
     return total_avail_pages;
 }


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel


 

