
Re: [Xen-devel] [RFC XEN PATCH 15/16] tools/libxl: handle return code of libxl__qmp_initializations()



On 02/09/17 10:13 +0000, Wei Liu wrote:
On Thu, Feb 09, 2017 at 10:47:01AM +0800, Haozhong Zhang wrote:
On 02/08/17 10:31 +0000, Wei Liu wrote:
> On Wed, Feb 08, 2017 at 02:07:26PM +0800, Haozhong Zhang wrote:
> > On 01/27/17 17:11 -0500, Konrad Rzeszutek Wilk wrote:
> > > On Mon, Oct 10, 2016 at 08:32:34AM +0800, Haozhong Zhang wrote:
> > > > If any error code is returned when creating a domain, stop the domain
> > > > creation.
> > >
> > > This looks like it is a bug-fix that can be spun off from this
> > > patchset?
> > >
> >
> > Yes, if everyone agrees that it's really a bug and that the fix does
> > not cause compatibility problems (e.g. xl without this patch does not
> > abort domain creation if it fails to connect to the QEMU VNC port).
> >
>
> I'm in two minds here. If the failure to connect is caused by some
> temporary glitch in QEMU and we're sure it will eventually succeed,
> there is no need to abort domain creation. If the failure to connect is
> due to a permanent problem, we should abort.
>

Sorry, I should say "*query* QEMU VNC port" instead of *connect*.

libxl__qmp_initializations() currently performs the following tasks:
1/ Create a QMP socket.

  I think all failures in 1/ should be considered permanent. Such a
  failure breaks not only the following tasks but also device
  hotplug, which needs to cooperate with QEMU.

2/ If 1/ succeeds, query QMP for the serial port parameters and write
  them to xenstore.
3/ If 1/ and 2/ succeed, set and query via QMP the VNC parameters
  (password, address, port) and write them to xenstore.

  If we assume Xen always sends correct QMP commands and parameters,
  QMP failures in 2/ and 3/ can only be caused by QMP socket errors
  (see qmp_next()), and it is hard to tell whether those are
  permanent or temporary. However, if a missing serial port or VNC
  is considered not to affect the execution of the guest domain, we
  may ignore failures here.

> OOI, how did you discover this issue? That could be the key to
> understanding the issue here.

The next patch adds code to libxl__qmp_initializations() that queries
QMP for the vNVDIMM parameters (e.g. the base gpfn, which is calculated
by QEMU) and returns an error code if the query fails. While developing
that patch, I found that xl did not stop even when bugs in my QEMU
patches caused the code in my Xen patch to fail.


Right, this should definitely be fatal.

Maybe we could let libxl__qmp_initializations() report whether a
failure can be tolerated. For failures that cannot be tolerated (e.g.
those in 1/), xl should stop. For tolerable failures (e.g. those in 2/
and 3/), xl can continue, but it needs to warn about them.
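
To make the proposal concrete, here is a rough sketch of how the
result could be classified and handled. All names below (the enum, the
struct and both functions) are hypothetical and exist only for
illustration; they are not existing libxl code, and the real QMP calls
are replaced by boolean flags.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical result codes; not part of libxl. */
enum qmp_init_result {
    QMP_INIT_OK = 0,
    QMP_INIT_TOLERABLE, /* warn, but continue domain creation */
    QMP_INIT_FATAL      /* abort domain creation */
};

/* Hypothetical stand-in for the outcome of the real QMP calls. */
struct qmp_step_status {
    bool socket_ok; /* 1/ QMP socket created */
    bool serial_ok; /* 2/ serial port parameters queried */
    bool vnc_ok;    /* 3/ VNC parameters set and queried */
};

/* Classify the outcome: a failure in 1/ is fatal, one in 2/ or 3/ is not. */
enum qmp_init_result qmp_initializations_sketch(const struct qmp_step_status *st)
{
    assert(st != NULL);

    if (!st->socket_ok)
        return QMP_INIT_FATAL;     /* nothing else can work without it */
    if (!st->serial_ok)
        return QMP_INIT_TOLERABLE; /* 3/ is skipped, as it is today */
    if (!st->vnc_ok)
        return QMP_INIT_TOLERABLE;
    return QMP_INIT_OK;
}

/* Caller side: returns true if domain creation should continue. */
bool handle_qmp_init_result(enum qmp_init_result r)
{
    switch (r) {
    case QMP_INIT_OK:
        return true;
    case QMP_INIT_TOLERABLE:
        fprintf(stderr, "warning: QMP initialization partially failed\n");
        return true;
    case QMP_INIT_FATAL:
    default:
        fprintf(stderr, "error: QMP initialization failed\n");
        return false;
    }
}
```

A vNVDIMM query added later would presumably map to the fatal case,
since the guest cannot use the device without the queried parameters.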


Yes, we can do that. It's an internal function, we can change things as
we see fit.

I would suggest you only make vNVDIMM failure fatal as a start.


I'll send a patch, separate from this series, that implements the above
without the NVDIMM parts.

Thanks,
Haozhong
