[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Xen-devel] [PATCH v2 1/8] tmem: Don't crash/hang/leak hypervisor when using shared pools within an guest.

When we are using shared pools we have an global array
(on which we put the pool), and an array of pools per domain.
We also have an shared list of clients (guests) _except_
for the very first domain that created the shared pool.

To deal with multiple guests using an shared pool we have an
ref count and a linked list. Whenever an new user of
a the shared pool joins we increase the ref count and add to
the linked list. Whenever an user quits the shared pool
we decrement and remove from the linked list.

Unfortunately this ref counting and linked list never
worked properly. There are multiple issues:

 1) If we have one shared pool (and only one guest creating it)
    - we do not add it to the shared list of clients. Which
    means the logic in 'shared_pool_quit' never removed
    the pool from global_shared_pools. That meant when the pool
    was de-allocated - we still had an pointer to the pool
    which would be accessed by tmemc_list_client (xl tmem-list -a)
    and hit a NULL page!

 2). If we have two shared pools in a domain - it (shared_pool_quit)
     would remove the domain from the share_list linked list, decrements
     the refcount to zero - and remove the pool from the global shared pool.
     When done it would also clear the client->pools[] to NULL for itself.
     Which is good. However since there are two shared pools in the domain
     the next entry in the client->pools[] would have a stale pointer to
     the just de-allocated pool. Accessing it and trying to de-allocate it
     would lead to crashes or hypervisor hang - depending on the build.

Fun times!

To trigger this use

This patch fixes it by making the very first domain that created
an shared pool to follow the same logic as every domain that is
joining a shared pool. That is increment the refcount and also
add itself to the shared list of domains using it.

We also remove an ASSERT that incorrectly assumed
that only one shared pool would exist for a domain.

And to mirror the reporting logic in shared_pool_join
we also add a printk to advertise inter-domain shared pool

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>
 xen/common/tmem.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/xen/common/tmem.c b/xen/common/tmem.c
index f2dc26e..572944e 100644
--- a/xen/common/tmem.c
+++ b/xen/common/tmem.c
@@ -1037,6 +1037,9 @@ static int shared_pool_join(struct tmem_pool *pool, 
struct client *new_client)
         tmem_client_info("adding new %s %d to shared pool owned by %s %d\n",
                     tmem_client_str, new_client->cli_id, tmem_client_str,
+    else if (pool->shared_count)
+        tmem_client_info("inter-guest sharing of shared pool %s by client 
+                         tmem_client_str, pool->client->cli_id);
     return 0;
@@ -1056,7 +1059,10 @@ static void shared_pool_reassign(struct tmem_pool *pool)
     old_client->pools[pool->pool_id] = NULL;
     sl = list_entry(pool->share_list.next, struct share_list, share_list);
-    ASSERT(sl->client != old_client);
+    /*
+     * The sl->client can be old_client if there are multiple shared pools
+     * within an guest.
+     */
     pool->client = new_client = sl->client;
     for (poolid = 0; poolid < MAX_POOLS_PER_DOMAIN; poolid++)
         if (new_client->pools[poolid] == pool)
@@ -1982,6 +1988,8 @@ static int do_tmem_new_pool(domid_t this_cli_id,
             pool->shared_count = 0;
+            if (shared_pool_join(pool, client))
+                goto fail;
             global_shared_pools[first_unused_s_poolid] = pool;

Xen-devel mailing list



Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.