[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH RFC] libxl: use libxl_event_wait to process libxl events



Jim Fehlig writes ("[PATCH RFC] libxl: use libxl_event_wait to process libxl 
events"):
> Prior to this patch, libxl events were delivered to libvirt via
> the libxlDomainEventHandler callback registered with libxl.
> Documenation in $xensrc/tools/libxl/libxl_event.h states that the
> callback "may occur on any thread in which the application calls
> libxl". This can result in deadlock since many of the libvirt
> callees of libxl hold a lock on the virDomainObj they are working
> on. When the callback is invoked, it attempts to find a virDomainObj
> corresponding to the domain ID provided by libxl. Searching the
> domain obj list results in locking each obj before checking if it is
> active, and its ID equals the requested ID. Deadlock is possible
> when attempting to lock an obj that is already locked further up
> the call stack. Indeed, Max Ustermann recently reported an instance
> of this deadlock

This sounds like a very plausible failure mode.

> https://www.redhat.com/archives/libvir-list/2015-November/msg00130.html
> 
> This patch moves processing of libxl events to a thread, where
> libxl_event_wait() is used to collect events. This allows processing
> libxl events asynchronously in libvirt, avoiding the deadlock.

The reasoning is sound and the remedy is IMO correct.  (However, you
mean "this allows processing libxl events _synchronously_", since what
you are doing is serialising them all into your main loop.)

So:

Acked-by: Ian Jackson <ian.jackson@xxxxxxxxxxxxx>

NB that I have not reviewed the patch in detail.  I can do that if you
like, although of course my knowledge of the innards of libvirt is not
wonderful.

> The only reservations I have about this patch come from comments
> about libxl_event_wait in libxl_event.h
> 
>   Like libxl_event_check but blocks if no suitable events are
>   available, until some are.  Uses libxl_osevent_beforepoll/
>   _afterpoll so may be inefficient if very many domains are being
>   handled by a single program.

If this turns out to be a problem in practice, we will improve libxl's
data structures to not involve so many linear searches.  (In fact I
think you are probably already calling synchronous libxl ao functions,
which have the same performance properties, although this is not
documented.)

Given what you say above I don't think there is a reasonable
alternative remedy.  So you should go ahead with this patch.

Regards,
Ian.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.