
[Xen-changelog] [xen master] tmem: Don't crash/hang/leak hypervisor when using shared pools within a guest.



commit 7ec4b2f89e54d8bdfbeb5f4a4c2ec85d15819e8b
Author:     Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>
AuthorDate: Wed Aug 26 13:15:03 2015 -0400
Commit:     Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>
CommitDate: Wed Sep 2 08:47:59 2015 -0400

    tmem: Don't crash/hang/leak hypervisor when using shared pools within a guest.
    
    This is a regression introduced by a36590e1b5040838af19d2ca712a516f07a6062b
    "tmem: reorg the shared pool allocate path".
    
    When we are using shared pools we have a global array
    (on which we put the pool), and an array of pools per domain.
    We also have a shared list of clients (guests), _except_
    for the very first domain that created the shared pool.
    
    To deal with multiple guests using a shared pool we have a
    ref count and a linked list. Whenever a new user of the
    shared pool joins, we increase the ref count and add it to
    the linked list. Whenever a user quits the shared pool, we
    decrement the ref count and remove it from the linked list
    (as sketched below).
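    
    A minimal sketch of this join/quit protocol (not the actual
    tmem.c code: the list helpers and xmalloc/xfree are Xen's own,
    while sketch_join/sketch_quit and free_pool() are hypothetical
    stand-ins for shared_pool_join/shared_pool_quit and tmem's pool
    teardown):
    
        struct share_list {
            struct list_head share_list; /* link on pool->share_list */
            struct client *client;       /* guest sharing the pool */
        };
    
        static int sketch_join(struct tmem_pool *pool, struct client *client)
        {
            struct share_list *sl = xmalloc(struct share_list);
    
            if ( sl == NULL )
                return -1;
            sl->client = client;
            list_add_tail(&sl->share_list, &pool->share_list);
            ++pool->shared_count;
            return 0;
        }
    
        static void sketch_quit(struct tmem_pool *pool, struct client *client)
        {
            struct share_list *sl, *tmp;
    
            /* Drop this client's entry from the pool's share list. */
            list_for_each_entry_safe(sl, tmp, &pool->share_list, share_list)
                if ( sl->client == client )
                {
                    list_del(&sl->share_list);
                    xfree(sl);
                    break;
                }
    
            /*
             * Last sharer out: free the pool and (in the real code)
             * clear its slot in global_shared_pools.
             */
            if ( --pool->shared_count == 0 )
                free_pool(pool);
        }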
    
    Unfortunately this ref counting and linked list handling never
    worked properly. There are multiple issues:
    
     1) If we have one shared pool (and only one guest creating it),
        we do not add the guest to the pool's shared list of clients.
        That means the logic in 'shared_pool_quit' never removed the
        pool from global_shared_pools, so when the pool was
        de-allocated we still had a pointer to it, which would be
        accessed by tmemc_list_client (xl tmem-list -a) and hit a
        NULL page!
    
     2) If we have two shared pools in a domain, shared_pool_quit
        would remove the domain from the share_list linked list,
        decrement the refcount to zero, and remove the pool from the
        global shared pools. When done it would also clear its own
        client->pools[] entry to NULL, which is good. However, since
        there are two shared pools in the domain, the next entry in
        client->pools[] would still hold a stale pointer to the just
        de-allocated pool (see the sketch following this list).
        Accessing it and trying to de-allocate it would lead to
        crashes or a hypervisor hang, depending on the build.
    
    Fun times!
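    
    A simplified illustration of issue 2 (an assumed flow: the key
    point is that two client->pools[] slots end up aliasing one pool
    object; do_tmem_destroy_pool() is tmem.c's pool teardown entry
    point):
    
        client->pools[0] = pool;  /* pool created as shared */
        client->pools[1] = pool;  /* same shared pool, second id */
    
        do_tmem_destroy_pool(0);  /* refcount drops to zero: pool
                                   * freed, only pools[0] cleared */
        do_tmem_destroy_pool(1);  /* pools[1] still points at the
                                   * freed pool: crash or hang */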
    
    To trigger this, use
    
    http://xenbits.xen.org/gitweb/?p=xentesttools/bootstrap.git;a=blob;f=root_image/drivers/tmem_test/tmem_test.c
    
    This patch fixes it by making the very first domain that created
    a shared pool follow the same logic as every domain that joins a
    shared pool: that is, increment the refcount and add itself to
    the shared list of domains using the pool.
    
    We also remove an ASSERT that incorrectly assumed only one
    shared pool could exist for a domain.
    
    And to mirror the reporting logic in shared_pool_join
    we also add a printk to advertise inter-domain shared pool
    joining.
    
    Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>
    Reviewed-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
    Release-acked-by: Wei Liu <wei.liu2@xxxxxxxxxx>
---
 xen/common/tmem.c |   10 +++++++++-
 1 files changed, 9 insertions(+), 1 deletions(-)

diff --git a/xen/common/tmem.c b/xen/common/tmem.c
index f2dc26e..ed9b975 100644
--- a/xen/common/tmem.c
+++ b/xen/common/tmem.c
@@ -1037,6 +1037,9 @@ static int shared_pool_join(struct tmem_pool *pool, struct client *new_client)
         tmem_client_info("adding new %s %d to shared pool owned by %s %d\n",
                     tmem_client_str, new_client->cli_id, tmem_client_str,
                     pool->client->cli_id);
+    else if ( pool->shared_count )
+        tmem_client_info("inter-guest sharing of shared pool %s by client 
%d\n",
+                         tmem_client_str, pool->client->cli_id);
     ++pool->shared_count;
     return 0;
 }
@@ -1056,7 +1059,10 @@ static void shared_pool_reassign(struct tmem_pool *pool)
     }
     old_client->pools[pool->pool_id] = NULL;
     sl = list_entry(pool->share_list.next, struct share_list, share_list);
-    ASSERT(sl->client != old_client);
+    /*
+     * The sl->client can be old_client if there are multiple shared pools
+     * within a guest.
+     */
     pool->client = new_client = sl->client;
     for (poolid = 0; poolid < MAX_POOLS_PER_DOMAIN; poolid++)
         if (new_client->pools[poolid] == pool)
@@ -1982,6 +1988,8 @@ static int do_tmem_new_pool(domid_t this_cli_id,
         {
             INIT_LIST_HEAD(&pool->share_list);
             pool->shared_count = 0;
+            if ( shared_pool_join(pool, client) )
+                goto fail;
             global_shared_pools[first_unused_s_poolid] = pool;
         }
     }
--
generated by git-patchbot for /home/xen/git/xen.git#master
