
Re: [Xen-devel] Xen 4.7 crash



On 6/7/2016 5:53 AM, Ian Jackson wrote:
Aaron Cornelius writes ("Re: [Xen-devel] Xen 4.7 crash"):
We realized that we had forgotten to remove the domain from the
permissions list when the domain is deleted (which would cause the error
we saw).  The application was updated to remove the domain from the
permissions list:
1. retrieve the permissions with xs_get_permissions()
2. find the domain ID that is being deleted
3. memmove() the remaining domains down by 1 to "delete" the old domain
from the permissions list
4. update the permissions with xs_set_permissions()
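
Step 3 above can be sketched roughly as follows. This is only an illustration, not the application's actual code: `struct perm_entry` is a hypothetical stand-in for the `struct xs_permissions` array returned by xs_get_permissions(), and `remove_domid()` is a made-up helper name. The returned count is what would then be passed to xs_set_permissions() in step 4.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for struct xs_permissions (assumed here to be
 * a domain ID plus a permission value); not the real libxenstore type. */
struct perm_entry {
    unsigned int id;
    int perms;
};

/* Remove every entry for `domid` from the array and return the new
 * entry count. The array is compacted in place with memmove(). */
unsigned int remove_domid(struct perm_entry *perms, unsigned int num,
                          unsigned int domid)
{
    unsigned int i = 0;

    assert(perms != NULL || num == 0);
    while (i < num) {
        if (perms[i].id == domid) {
            /* Shift the tail down by one slot to overwrite entry i. */
            memmove(&perms[i], &perms[i + 1],
                    (num - i - 1) * sizeof(perms[0]));
            num--;
        } else {
            i++;
        }
    }
    return num;
}
```

If no entry matches, the array and the count come back unchanged, so the helper can be called unconditionally when a domain is torn down.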

After we made that change, a load test over the weekend confirmed that
the Xen crash no longer happens.  We checked this morning first thing
and confirmed that without this change the crash reliably occurs.

This is rather odd behaviour.  I don't think xenstored should hang
onto the domain's xs ring page just because the domain is still
mentioned in a permission list.

But it may do.  I haven't checked the code.  Are you using the
ocaml xenstored (oxenstored) or the C one ?

I don't remember specifying anything special when building the Xen tools, but I did run into trouble where the OCaml tools appeared to conflict with the opam-installed Mirage packages and libraries. Running "sudo make dist-install" installs the OCaml libraries as root, which made using opam difficult, so I disabled the OCaml tools during my build.

I double-checked and confirmed that the C version of xenstored was built. We will try to reproduce the failure scenario with oxenstored to see whether it behaves any differently.

- Aaron
