[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH v2 5/6] tools/xenstored: partially handle domains without a shared ring
Hi all, On 23/09/2021 08:56, Julien Grall wrote: On 23/09/2021 12:23, Roger Pau Monné wrote:On Wed, Sep 22, 2021 at 06:46:25PM +0500, Julien Grall wrote:I thought a bit more and looked at the code (I don't have access to a testmachine at the moment). I think there is indeed a problem. Some watchers of @releaseDomain (such as xenconsoled) will only remove a domain from their internal state when the domain is actually dead.This is based on dominfo.dying which is only set when all the resources are relinquished and waiting for the other domains to release any resources forthat domain.The problem is Xenstore may fail to map the interface or the event channel long before the domain is actually dead. With the current check, we would free the allocated structure and therefore send @releaseDomain too early. Sodaemon like xenconsoled, would never cleanup for the domain and leave azombie domain. Well... until the next @releaseDomain (or @introduceDomainfor Xenconsoled) AFAICT.The revised patch is meant to solve it by just ignoring the connection. With that approach, we would correctly notify watches when the domain is dead.I think the patch provided by Julien is indeed better than the current code, or else you will miss @releaseDomain events in corner cases for dominfo.dying. I think however the patch is missing a change in the order of the checks in conn_can_{read,write}, so that the is_ignored check is performed before calling can_{read,write} which will try to poke at the interface and trigger a fault if not mapped.Ah good point. I originally moved the is_ignored check after the callback because the socket connection can in theory be closed from can_{read, write}.However, in pratice, is_ignored is only set for socket from ignore_connection() that will also close the socket.The new use will only set is_ignored for the domain connection. So I am guessing moving the check early on ought to be fine.The alternative would be to call ignore_connection() but this feels a bit weird because most of it would be a NOP as we are introducing the domain. At the end I went with re-using ignore_connection() and posted a patch for discussion: https://lore.kernel.org/xen-devel/20211020144519.10362-1-julien@xxxxxxx/T/#u Cheers, -- Julien Grall
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |