Re: [Xen-devel] Cancelling asynchronous operations in libxl



Dave Scott writes ("Cancelling asynchronous operations in libxl"):
> I've re-read the thread from Nov 2013: (2013!)
> http://lists.xen.org/archives/html/xen-devel/2013-11/msg01176.html
> and found it quite thought-provoking.

Thanks.

However, I think the message you really want is

  From: Ian Jackson <ian.jackson@xxxxxxxxxxxxx>
  To: <xen-devel@xxxxxxxxxxxxxxxxxxx>
  CC: Ian Campbell <ian.campbell@xxxxxxxxxx>
  Subject: [RFC PATCH 00/14] libxl: Asynchronous event cancellation
  Date: Fri, 20 Dec 2013 18:45:38 +0000

and the subsequent thread, which I can't find in the
lists.xenproject.org archives but I did find for example here:

  http://osdir.com/ml/xen-development/2013-12/msg00472.html

Getting the patch series out of the archive there will be a PITA so
I have pushed it to my repo on xenbits:

  git://xenbits.xen.org/people/iwj/xen.git
  base.ao-cancel.v1-2013-12..wip.ao-cancel.v1-2013-12

What I need to know, really, is:

 * Is an API along these lines going to meet your needs?

 * Can you help me test it?  Trying to test this in xl is going to be
   awkward and involve a lot of extraneous and very complicated signal
   handling; and AFAIAA libvirt doesn't have any cancellation
   facility.

   So if your libxl callers can exercise this cancellation
   functionality then that would be much easier.

 * Do you have any further comments (e.g. regarding timescales)?


> From the Xapi/Xenopsd point of view, the main feature that we'd like
> is to be able to "unstick" the system when it appears stuck. When
> the user gets bored and hits the big red "cancel" button we'd like
> the particular operation/thread/call to unblock (in a timely
> fashion, it's probably ok if this takes 30s?) and for the system to
> be left in some kind of manageable state. I think it's ok for
> Xapi/Xenopsd to destroy any half-built VMs via fresh libxl calls
> afterwards, so libxl doesn't need to tidy everything itself
> automatically.

This is roughly what the cancellation system is supposed to do.

> I think cancellation could be quite hard to test. One thing we could
> do is add a counter and increment it every time we pass a point
> where cancellation is possible. In some libxl debug mode we could
> configure it to simulate a cancellation event when the counter
> reaches a specific value. A test harness could then try to walk
> through all the different cancellation possibilities and check the
> system is in some sensible state afterwards.

I think it might be possible to add something like that to my
cancellation proposal.
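
As a rough illustration of the counter idea quoted above, here is a
minimal sketch.  It is not code from the patch series: every name in it
(cancel_injector, cancel_point, toy_operation, walk_cancellation_points)
is hypothetical, none of it is libxl API, and the "operation" is a toy
loop standing in for a real long-running call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical debug state; NOT part of libxl. */
typedef struct {
    unsigned long passed;      /* cancellation points passed so far */
    unsigned long trigger_at;  /* simulate a cancel at this point; 0 = never */
} cancel_injector;

typedef enum { OP_OK = 0, OP_CANCELLED = -1 } op_result;

/* Call at every point where cancellation is possible.  Returns true
 * when the caller should behave as if it had just been cancelled. */
static bool cancel_point(cancel_injector *inj)
{
    assert(inj != NULL);
    inj->passed++;
    return inj->trigger_at != 0 && inj->passed == inj->trigger_at;
}

/* Toy multi-step operation standing in for a long-running call. */
static op_result toy_operation(cancel_injector *inj, int nsteps,
                               int *steps_done)
{
    *steps_done = 0;
    for (int i = 0; i < nsteps; i++) {
        if (cancel_point(inj))
            return OP_CANCELLED;
        (*steps_done)++;
    }
    return OP_OK;
}

/* Harness: run once without injection to count the cancellation
 * points, then rerun once per point with a simulated cancel there.
 * Returns the number of runs that ended in the expected state. */
static unsigned long walk_cancellation_points(int nsteps)
{
    cancel_injector count = { 0, 0 };
    int done;

    if (toy_operation(&count, nsteps, &done) != OP_OK)
        return 0;

    unsigned long ok = 0;
    for (unsigned long k = 1; k <= count.passed; k++) {
        cancel_injector probe = { 0, k };
        op_result r = toy_operation(&probe, nsteps, &done);
        /* Cancelling at point k must leave exactly k-1 steps done. */
        if (r == OP_CANCELLED && (unsigned long)done == k - 1)
            ok++;
    }
    return ok;
}
```

In a real harness the "expected state" check would be whatever the
caller considers manageable (for example, that the half-built domain
can still be destroyed), rather than a simple step count.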

> We were thinking about running some number of libxl-based stateless
> worker processes which would also allow us to kill them with various
> signals if we really needed to. I guess in the event that libxl
> cancel didn't work for whatever reason, we could fall back to this
> rather cruder approach (although this should be only in extreme
> circumstances).

Just killing a process executing a libxl operation is likely to leave
the system in an `ugly' state.  libxl ought still to be able to deal
with it, in principle, but I wouldn't be surprised to find bugs
lurking in this kind of area.  This ought to be a last resort.
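
For concreteness, the last-resort fallback on the caller's side might
look something like the sketch below: ask the worker to stop with
SIGTERM, wait a grace period, and only then use SIGKILL.  This uses
only standard POSIX calls (fork, kill, waitpid, nanosleep); the helper
names, the 10ms polling interval and the stand-in "stuck worker" are
assumptions made for illustration, and nothing here cleans up whatever
state the killed libxl operation left behind.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Poll for up to timeout_ms for the child to exit.
 * Returns 1 if it was reaped, 0 on timeout, -1 on error. */
static int wait_with_timeout(pid_t pid, int timeout_ms, int *status)
{
    const struct timespec tick = { 0, 10 * 1000 * 1000 }; /* 10 ms */

    for (int waited = 0; ; waited += 10) {
        pid_t r = waitpid(pid, status, WNOHANG);
        if (r == pid) return 1;
        if (r < 0) return -1;
        if (waited >= timeout_ms) return 0;
        nanosleep(&tick, NULL);
    }
}

/* Escalating stop.  Returns 0 if the worker had already exited,
 * SIGTERM or SIGKILL for the signal that was needed, -1 on error. */
static int stop_worker(pid_t pid, int grace_ms)
{
    int status, r;

    if (pid <= 0) return -1;  /* never signal a process group or "all" */

    r = wait_with_timeout(pid, 0, &status);
    if (r != 0) return r < 0 ? -1 : 0;

    if (kill(pid, SIGTERM) < 0) return -1;
    r = wait_with_timeout(pid, grace_ms, &status);
    if (r != 0) return r < 0 ? -1 : SIGTERM;

    if (kill(pid, SIGKILL) < 0) return -1;
    if (waitpid(pid, &status, 0) != pid) return -1;
    return SIGKILL;
}

/* Hypothetical stand-in for a stuck worker process.  If ignore_term
 * is set, SIGTERM is ignored before fork() so that the child inherits
 * the disposition without a race. */
static pid_t spawn_stuck_worker(int ignore_term)
{
    void (*old)(int) = SIG_DFL;

    if (ignore_term) old = signal(SIGTERM, SIG_IGN);
    pid_t pid = fork();
    if (pid == 0) {
        for (;;) pause();     /* child: block until killed */
    }
    if (ignore_term) signal(SIGTERM, old);
    return pid;
}
```

A caller would then do something like stop_worker(pid, 30000) and,
whatever the result, follow up with fresh libxl calls to destroy any
half-built domain.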

Thanks,
Ian.
