
Re: [Xen-devel] Xen 4.7 crash



Hi Andrew,

On 01/06/2016 22:24, Andrew Cooper wrote:
On 01/06/2016 21:45, Aaron Cornelius wrote:

However, since I only have 1 domain active at a time, I'm not sure why I
should run out of VM IDs.

Sounds like a VMID resource leak.  Check to see whether it is freed properly
in domain_destroy().

~Andrew
That would be my assumption.  But as far as I can tell, arch_domain_destroy()
calls p2m_teardown() which calls p2m_free_vmid(), and none of the functionality
related to freeing a VM ID appears to have changed in years.

The VMID handling looks suspect.  It can be called repeatedly during
domain destruction, and it will repeatedly clear the same bit out of the
vmid_mask.

Can you explain how p2m_free_vmid can be called multiple times?

We have the following path:
   arch_domain_destroy -> p2m_teardown -> p2m_free_vmid.

And I can find only 3 calls to arch_domain_destroy, each of which should only be made once per domain.

If arch_domain_destroy is called multiple times, p2m_free_vmid will not be the only place where Xen will be in trouble.

diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
index 838d004..7adb39a 100644
--- a/xen/arch/arm/p2m.c
+++ b/xen/arch/arm/p2m.c
@@ -1393,7 +1393,10 @@ static void p2m_free_vmid(struct domain *d)
     struct p2m_domain *p2m = &d->arch.p2m;
     spin_lock(&vmid_alloc_lock);
     if ( p2m->vmid != INVALID_VMID )
-        clear_bit(p2m->vmid, vmid_mask);
+    {
+        ASSERT(test_and_clear_bit(p2m->vmid, vmid_mask));
+        p2m->vmid = INVALID_VMID;
+    }

     spin_unlock(&vmid_alloc_lock);
 }
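
For readers following the patch, here is a minimal user-space sketch of the idea behind it: a free routine that resets the stored VMID so that a second call becomes a no-op. This is not code from the Xen tree. The names demo_p2m, demo_alloc_vmid, demo_free_vmid and the plain bool array standing in for vmid_mask are hypothetical, treating 0 as the invalid VMID is an assumption, and locking is left out. The sketch clears the flag before asserting; with the standard C assert, an expression placed inside the assertion is not evaluated when NDEBUG is defined, so keeping the side effect outside it is the safer pattern. Whether Xen's ASSERT behaves the same way in non-debug builds is worth checking before relying on the form used in the diff.

```c
#include <assert.h>
#include <stdbool.h>

#define DEMO_MAX_VMID     256 /* VMID count mentioned in this thread */
#define DEMO_INVALID_VMID 0   /* assumption: 0 means "no VMID assigned" */

/* One flag per VMID; a stand-in for the vmid_mask bitmap. */
static bool demo_vmid_used[DEMO_MAX_VMID];

/* Hypothetical stand-in for the per-domain p2m state. */
struct demo_p2m {
    unsigned int vmid;
};

/* Take the lowest free VMID; returns false when none are left. */
static bool demo_alloc_vmid(struct demo_p2m *p2m)
{
    unsigned int vmid;

    for ( vmid = 1; vmid < DEMO_MAX_VMID; vmid++ )
    {
        if ( !demo_vmid_used[vmid] )
        {
            demo_vmid_used[vmid] = true;
            p2m->vmid = vmid;
            return true;
        }
    }

    return false;
}

/* Release the VMID once; later calls on the same state do nothing. */
static void demo_free_vmid(struct demo_p2m *p2m)
{
    bool was_used;

    if ( p2m->vmid == DEMO_INVALID_VMID )
        return;

    /* Do the side effect first, then check it, so NDEBUG cannot drop it. */
    was_used = demo_vmid_used[p2m->vmid];
    demo_vmid_used[p2m->vmid] = false;
    assert(was_used);
    (void)was_used;

    p2m->vmid = DEMO_INVALID_VMID;
}
```

In this model, calling demo_free_vmid() twice on the same demo_p2m is harmless, and a double release of a VMID that was never marked as used trips the assertion in debug builds.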

Having said that, I can't explain why that bug would result in the
symptoms you are seeing.  It is also possible that your issue is memory
corruption from a separate source.

Can you see about instrumenting p2m_alloc_vmid()/p2m_free_vmid() (with
vmid_alloc_lock held) to see which vmid is being allocated/freed ?
After the initial boot of the system, you should see the same vmid being
allocated and freed for each of your domains.

Looking quickly at the log, the domain is dom1101. However, the maximum number of VMIDs supported is 256, so the exhaustion might be caused by a race somewhere.
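
The arithmetic behind that remark can be checked with a small user-space model. This is a sketch under stated assumptions, not Xen code: demo_count_cycles is a hypothetical helper, VMID 0 is assumed to be reserved, and a "leak" here means the VMID is never released on destroy. If every destroy leaked, allocation would fail after 255 cycles, well before domain 1101. With a correct free, a single domain cycled repeatedly keeps reusing the same VMID and never runs out.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define DEMO_MAX_VMID 256 /* VMID count mentioned in this thread */

/*
 * Model one domain being created and destroyed in a loop.
 * Returns how many cycles complete before VMID allocation fails.
 */
static unsigned int demo_count_cycles(unsigned int cycles, bool leak_on_destroy)
{
    bool used[DEMO_MAX_VMID];
    unsigned int done;

    memset(used, 0, sizeof(used));
    used[0] = true; /* assumption: VMID 0 is reserved */

    for ( done = 0; done < cycles; done++ )
    {
        unsigned int vmid;

        /* "Create": pick the lowest free VMID. */
        for ( vmid = 0; vmid < DEMO_MAX_VMID; vmid++ )
            if ( !used[vmid] )
                break;

        if ( vmid == DEMO_MAX_VMID )
            break; /* out of VMIDs */

        used[vmid] = true;

        /* "Destroy": release the VMID unless we are modelling a leak. */
        if ( !leak_on_destroy )
            used[vmid] = false;
    }

    return done;
}
```

Since the reported failure happens around domain 1101 rather than around the 255th create, a leak on every single destroy does not match the log, which is why an intermittent problem such as a race looks more plausible.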

I would be interested in getting a reproducer. I wrote a script to cycle a domain (create/destroy) in a loop, and I have not seen any issue after 1200 cycles (and counting).

Cheers,

--
Julien Grall
