
Re: [Xen-devel] [PATCH 12 of 14 v4] libxl: set frontend status to 6 on domain destroy



On Wed, 2011-12-14 at 09:13 +0000, Roger Pau Monné wrote:
> 2011/12/13 Ian Jackson <Ian.Jackson@xxxxxxxxxxxxx>:
> > Roger Pau Monne writes ("[Xen-devel] [PATCH 12 of 14 v4] libxl: set 
> > frontend status to 6 on domain destroy"):
> >> libxl: set frontend status to 6 on domain destroy
> >>
> >> Set frontend status to 6 on domain destruction and wait for devices to
> >> be disconnected before executing hotplug scripts.
> >
> > There seems to be a race here.
> >
> >> +    libxl__xs_write(gc, XBT_NULL, libxl__sprintf(gc, "%s/%s", fe_path, "state"), "6");
> >
> > So here, the kernel or backend start racing, and you hope that they
> > win the race and close the device before ...
> 
> From my experience in NetBSD, the kernel only closes the device when
> its frontend state is set to 6. Since we destroy the domain, the guest
> is unable to set the state to 6 itself, and thus the kernel doesn't
> detach the devices.

So if you rm the backend directory, NetBSD does not take that as a
sign to tear down the device? That sounds like a bug in the NetBSD
backend -- it should treat deletion of the backend state dir as if it
were reading state = "6" and shut everything down.

Or is the issue only in the userspace portions?
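
To illustrate the backend behaviour suggested above, here is a minimal
sketch in C. It is not NetBSD or libxl code: effective_state() and
should_tear_down() are hypothetical helpers, and the "raw" argument
stands in for whatever the backend reads from the state node (NULL
meaning the node or its directory no longer exists). The numeric values
follow the state numbering used in this thread (5 = Closing, 6 = Closed).

```c
#include <assert.h>
#include <stdlib.h>

/* State numbers as used in this thread (5 = Closing, 6 = Closed). */
enum xb_state {
    XB_UNKNOWN   = 0,
    XB_CONNECTED = 4,
    XB_CLOSING   = 5,
    XB_CLOSED    = 6
};

/*
 * Hypothetical helper: map the raw contents of a "state" node to a state.
 * raw == NULL models a deleted node/directory and is treated exactly like
 * reading "6", which is the behaviour suggested in the mail.
 */
static enum xb_state effective_state(const char *raw)
{
    char *end;
    long v;

    if (raw == NULL)
        return XB_CLOSED;

    v = strtol(raw, &end, 10);
    if (end == raw || *end != '\0' || v < 0 || v > 8)
        return XB_UNKNOWN;   /* not a valid state number */

    return (enum xb_state)v;
}

/* Hypothetical helper: should the backend shut the device down now? */
static int should_tear_down(const char *raw)
{
    return effective_state(raw) == XB_CLOSED;
}
```

With this, a backend that finds the directory gone takes the same path
as one that reads "6", so it does not depend on anyone writing the
state explicitly before the directory is removed.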

>  I've added some libxl__wait_for_device_state logic here, to
> ensure the backend state is set to 6 before trying to execute hotplug
> scripts.

But that will always be true with this patch since you set it that way
just before, right?

If you go down this path then I think you need to set the state to
"5" (Closing) in order to prompt the backend to transition to
"6" (Closed) itself. However you need to be careful about adding a
synchronous wait to the device destroy function. This should eventually
work even if the frontend and backend are not co-operating. That starts
to look a bit like calling libxl__device_remove instead.
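
A rough sketch of the sequence described above, in C, under stated
assumptions: this is not libxl code, and struct dev_ops, request_close()
and the fake_* names are all hypothetical. The callbacks stand in for
xenstore reads and writes, and the loop polls a fixed number of times
where real code would more likely sleep or use a watch with a timeout.
The point is only the shape: write 5, wait a bounded time for the other
side to reach 6, and carry on either way.

```c
#include <assert.h>

enum { XB_CLOSING = 5, XB_CLOSED = 6 };

/* Hypothetical callbacks standing in for xenstore access. */
struct dev_ops {
    void (*write_frontend_state)(void *ctx, int state);
    int  (*read_backend_state)(void *ctx);
};

/*
 * Ask for a co-operative close: set the frontend state to Closing, then
 * poll the backend state at most max_polls times.
 * Returns 1 if the backend reached Closed, 0 if we gave up. In the 0
 * case the caller still proceeds with teardown, so destroy works even
 * when the frontend and backend are not co-operating.
 */
static int request_close(const struct dev_ops *ops, void *ctx, int max_polls)
{
    int i;

    assert(max_polls >= 0);
    ops->write_frontend_state(ctx, XB_CLOSING);

    for (i = 0; i < max_polls; i++) {
        if (ops->read_backend_state(ctx) == XB_CLOSED)
            return 1;
    }
    return 0;
}

/* Fake device used only to exercise the sketch. */
struct fake_dev {
    int frontend_state;
    int backend_state;
    int polls_until_closed;   /* negative: backend never reacts */
};

static void fake_write(void *ctx, int state)
{
    ((struct fake_dev *)ctx)->frontend_state = state;
}

static int fake_read(void *ctx)
{
    struct fake_dev *d = ctx;

    /* The fake backend only reacts once it has seen Closing. */
    if (d->frontend_state == XB_CLOSING && d->polls_until_closed >= 0) {
        if (d->polls_until_closed == 0)
            d->backend_state = XB_CLOSED;
        else
            d->polls_until_closed--;
    }
    return d->backend_state;
}

static const struct dev_ops fake_ops = { fake_write, fake_read };
```

The bound on the wait is the important design choice here: an
unbounded synchronous wait in the destroy path would hang on a backend
that never responds.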

>  The truth is that I had it in previous versions of my patch,
> but it seems the kernel always switches contexts and detaches the
> devices before executing hotplug scripts (it might just be luck).

Probably just luck, and partly due to e.g. your system presumably being
only very lightly loaded.

Ian.

> 
> Also this patch speeds domain destruction a lot (which is also quite
> slow under Linux from what I saw).
> 
> >>      libxl__device_execute_hotplug(gc, dev, DISCONNECT);
> >
> > ... the hotplug script tries to remove it.
> >
> > Is there something we can do to make sure that we always get this
> > right ?
> >
> > Ian.
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@xxxxxxxxxxxxxxxxxxx
> http://lists.xensource.com/xen-devel


