[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v1 12/12] libxl: add device backend listener in order to launch backends



Roger Pau Monné writes ("Re: [PATCH v1 12/12] libxl: add device backend 
listener in order to launch backends"):
> So if I got it right, this new libxl__nested_ao_create will return a new
> ao (with a new gc), that I could use in conjunction with the
> long-running ao that I use in the main xs_watch loop, right?

Yes.  It would give you a new psuedo-ao, which you can use for
per-event memory allocation.  It's a psuedo-ao in the sense that you
mustn't call libxl__ao_abort or libxl__ao_complete on it, but it would
have the right type and in particular you could stuff it in
sub-operations' ao fields, call STATE_AO_GC on it and so on.  I could
make it possible to call libxl__ao_inprogress and have that reflected
to the underlyhing real ao.

> That sounds like a good solution to my problem, I wouldn't mind if you
> write that :)

OK, watch this space.

> I'm wondering if there are also other memory problems, even when using
> this approach, for example I register a xswatch callback, and the
> callback gets called with a watch_path and an event_path arguments, does
> the internal libxl event handler machinery reuse those (or allocate and
> free them after each loop)?

The event machinery gets those from a different gc which is
per-system-event, so that's not a problem.  (Otherwise waiting for a a
particular thing in xenstore would involve memory growing endlessly
with calls to read from xenstore, ec.)

> >> +    case LIBXL__DEVICE_KIND_VBD:
> >> +    case LIBXL__DEVICE_KIND_VIF:
> >> +        if (dev->backend_kind == LIBXL__DEVICE_KIND_VBD) 
> >> dguest->num_vbds--;
> >> +        if (dev->backend_kind == LIBXL__DEVICE_KIND_VIF) 
> >> dguest->num_vifs--;
> > 
> > Is it really safe to decrement these already ?  What if something else
> > comes along in the meantime and makes num_devs 0 (below) and removes
> > everything while this operation is still running and liable to be
> > reentered on completion ?
> 
> That's the point of decrementing it here, so that we get to 0 (if this
> is the last device), and remove the libxl__ddomain_guest and
> libxl__ddomain_device. Then, when the remove AO finishes, the AO
> callback will take care of removing the associated libxl__device.
> 
> I thought backend_watch_callback could not be called concurrently, but
> maybe that's not true? (and if that's the case ignore everything above
> because it's completely wrong)

While you are _actually in this function_, you hold the Big Lock.  So
nothing else can come along find the wrong value of num_*.

But what you actually do is call initiate_device_remove and then
return - ie, you return to the event loop.  That gives up the lock,
obviously.  So while the device removal is proceeding, other events
can occur.

If backend_watch_callback happens then, I think you may find that it
seems num_*==0 and decides to tear down the state for that domain.
That would at the very least involve messing about in xenstore with
the device which is still being removed.

Then, later, the device removal will complete and device_complete will
be called.

I think you need to do the decrement in device_complete, and that
means you need a kind of "perhaps tidy up domain" function which you
can call from both there and backend_watch_callback.  And you probably
need to provide some more useful pointers to device_complete.

Thanks,
Ian.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.