
[Xen-devel] Re: Bug #515

On Thu, 2006-03-23 at 14:06 +0000, Ewan Mellor wrote:
> On Thu, Mar 23, 2006 at 11:45:25AM +0000, Harry Butterworth wrote:
> > I updated bug #515.  If you want to suggest an approach for a solution,
> > I'll have a go at putting a patch together.
> Good catch, Harry!
> If you look at the block and xen-hotplug-cleanup scripts, you'll see
> that they claim a lock to make sure that the teardown doesn't interfere
> with the check when a device comes up.  I think that a similar thing for
> vif teardown would suffice.

I saw the block locking.

The problem isn't mutual exclusion of the vif-route and cleanup scripts;
it's that the cleanup script must be serialised _after_ the vif-route
script.

I think the kernel can reorder the event injection into user space, which
is why udevsend uses sequence numbers to put events back in order.

So, in the case where the cleanup event is injected first and gets the
lock first, the problem will still occur.
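
To make that concrete, here is a rough Python sketch (not the actual hotplug scripts). The handle() function, the "offline"/"cleanup" event names, and the vif_info state are all hypothetical stand-ins; the assumption is only that the offline handler needs some state that the cleanup handler removes. Each handler holds the lock while it runs, yet the outcome still depends on arrival order:

```python
import threading

def handle(events):
    """Process events in arrival order; each handler runs under the same lock.

    'vif_info' is a hypothetical stand-in for whatever state the offline
    handler still needs and the cleanup handler removes.
    """
    lock = threading.Lock()
    state = {"vif_info": "present"}
    results = []
    for event in events:
        with lock:  # mutual exclusion only: one handler at a time
            if event == "offline":
                ok = state["vif_info"] is not None
                results.append("offline ok" if ok else "offline FAILED")
            elif event == "cleanup":
                state["vif_info"] = None  # cleanup discards the state
    return results

print(handle(["offline", "cleanup"]))  # intended order
print(handle(["cleanup", "offline"]))  # reordered delivery
```

The lock keeps the two handlers from overlapping, but it does nothing to decide which one goes first.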

With no ordering guarantee on the event transport between kernel and
user space, the options I could think of were:

1) Use sequence numbers and reserialise events as is done by
udevsend.  The easiest way to do this is to drop support for hotplug and
require udev.

2) Change the protocol such that the kernel code waits for the offline
script to complete before issuing the cleanup event.  This would require
a state change in the backend when the offline script completes which
would trigger the backend to unregister the xenbus device.  I'm not sure
that I understand the full implications of this.

3) Somehow combine the offline and cleanup into one event.  I don't know
exactly how.

4) Use the lock even though we think the design is flawed and hope it
will work most of the time.
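
For option 1, the reserialisation could look roughly like the Python sketch below. This is not udevsend's code; the Resequencer class and its push() method are hypothetical names, and the sketch assumes every event carries a sequence number that increases by one with no gaps:

```python
import heapq

class Resequencer:
    """Hold back out-of-order events until all earlier ones have arrived."""

    def __init__(self, first_seq):
        self.next_seq = first_seq  # sequence number we expect to deliver next
        self.pending = []          # min-heap of (seq, event) not yet deliverable

    def push(self, seq, event):
        """Accept one event; return the events that are now deliverable, in order."""
        heapq.heappush(self.pending, (seq, event))
        ready = []
        while self.pending and self.pending[0][0] == self.next_seq:
            ready.append(heapq.heappop(self.pending)[1])
            self.next_seq += 1
        return ready

r = Resequencer(first_seq=1)
print(r.push(2, "cleanup"))  # held back: event 1 has not arrived yet
print(r.push(1, "offline"))  # releases both, offline first
```

A real implementation would presumably also need a timeout for sequence numbers that never arrive, otherwise one lost event stalls everything behind it.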

Any better ideas?




