
[Xen-devel] Re: Bug #515



On Thu, Mar 23, 2006 at 02:32:34PM +0000, Harry Butterworth wrote:

> On Thu, 2006-03-23 at 14:06 +0000, Ewan Mellor wrote:
> > On Thu, Mar 23, 2006 at 11:45:25AM +0000, Harry Butterworth wrote:
> > 
> > > I updated bug #515.  If you want to suggest an approach for a solution,
> > > I'll have a go at putting a patch together.
> > 
> > Good catch, Harry!
> > 
> > If you look at the block and xen-hotplug-cleanup scripts, you'll see
> > that they claim a lock to make sure that the teardown doesn't interfere
> > with the check when a device comes up.  I think that a similar thing for
> > vif teardown would suffice.
> 
> I saw the block locking.
> 
> The problem isn't mutual exclusion of the vif-route and cleanup scripts;
> it's that the cleanup script must be serialised _after_ the vif-route
> script.
> 
> I think the kernel can reorder the event injection into user space, which
> is why udevsend uses sequence numbers to put events back in order.
> 
> So, in the case where the cleanup event is injected first and gets the
> lock first, the problem will still occur.
> 
> With no ordering guarantee on the event transport between kernel and
> user space, the options I could think of were:
> 
> 1) Use sequence numbers and reserialise events as is done by
> udevsend---the easiest way to do this is to drop support for hotplug and
> require udev.
> 
> 2) Change the protocol such that the kernel code waits for the offline
> script to complete before issuing the cleanup event.  This would require
> a state change in the backend when the offline script completes which
> would trigger the backend to unregister the xenbus device.  I'm not sure
> that I understand the full implications of this.
> 
> 3) Somehow combine the offline and cleanup into one event.  I don't know
> exactly how.
> 
> 4) Use the lock even though we think the design is flawed and hope it
> will work most of the time.

Yes, well, I was kind of hoping we'd get away with 4)!  Especially since
the move to 2.6.16 and onwards might push people towards udev in any
case.
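
For reference, the reserialisation in option 1 amounts to holding back any
event that arrives ahead of its turn. A minimal sketch of that idea follows.
It is not udevsend's actual code: the class name, the event strings and the
starting sequence number are all made up for illustration.

```python
import heapq


class Reserialiser:
    """Deliver events in sequence-number order, buffering early arrivals.

    Hypothetical helper: illustrates the technique only, not udevsend itself.
    """

    def __init__(self, first_seq=1):
        self.next_seq = first_seq  # next sequence number allowed out
        self.pending = []          # min-heap of (seq, event) that came early

    def push(self, seq, event):
        """Accept one event; return the events now deliverable, in order."""
        if seq < self.next_seq:
            return []              # stale or duplicate event, ignore it
        heapq.heappush(self.pending, (seq, event))
        ready = []
        while self.pending and self.pending[0][0] == self.next_seq:
            ready.append(heapq.heappop(self.pending)[1])
            self.next_seq += 1
        return ready


r = Reserialiser()
early = r.push(2, "cleanup")   # cleanup injected first: held back
late = r.push(1, "offline")    # offline arrives: both released, in order
```

Anything real would also need a timeout so that a lost event cannot stall the
queue forever; the sketch leaves that out.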

I'm not sure number 3 would work, so I'm inclined to avoid that.

Number 2 sounds like the best option if you really want to do this
properly -- register a watch on "vif/<domid>/<devid>/hotplug-teardown-status",
issue the offline event, and then only call device_unregister once
hotplug-teardown-status goes to "done".
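
The ordering in that sequence can be modelled in user space, purely as a
sketch. FakeStore is a stand-in dictionary with callbacks, not the real
xenstore or xenbus API, and VifBackend, begin_teardown and the log strings are
hypothetical names. The real change would live in the backend driver.

```python
class FakeStore:
    """Stand-in for the store: a dict plus per-path watch callbacks."""

    def __init__(self):
        self.data = {}
        self.watches = {}

    def watch(self, path, callback):
        self.watches.setdefault(path, []).append(callback)

    def write(self, path, value):
        self.data[path] = value
        for callback in self.watches.get(path, []):
            callback(path, value)


class VifBackend:
    """Hypothetical backend: unregisters only after teardown reports done."""

    def __init__(self, store, domid, devid, log):
        self.store = store
        self.path = "vif/%d/%d/hotplug-teardown-status" % (domid, devid)
        self.log = log
        self.unregistered = False

    def begin_teardown(self):
        # Register the watch before issuing the offline event, so the
        # script's completion cannot be missed.
        self.store.watch(self.path, self._on_status)
        self.log.append("offline event issued")

    def _on_status(self, path, value):
        # Checking the value makes any other write to the node harmless.
        if value == "done" and not self.unregistered:
            self.unregistered = True
            self.log.append("device_unregister")  # cleanup event follows this


log = []
store = FakeStore()
backend = VifBackend(store, 3, 0, log)
backend.begin_teardown()
store.write(backend.path, "running")  # offline script still busy
store.write(backend.path, "done")     # offline script finished
```

The point of the design is that the cleanup event cannot exist until the
offline script has said it is finished, so there is nothing left for the
event transport to reorder.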

Cheers,

Ewan.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
