[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v3.1] libxl: Design of an async API to issue QMP commands to QEMU



On Thu, Jul 12, 2018 at 05:36:00PM +0100, Ian Jackson wrote:
> Anthony PERARD writes ("Re: [PATCH v3.1] libxl: Design of an async API to 
> issue QMP commands to QEMU"):
> > I don't think the state of a qmp connection can fit into
> > libxl__qmp_cmd_state. The "Active" state doesn't mean that a qmp
> > connection is open or that the command have been sent. It merely mean
> > that the syscall socket(3) and connect(3) have return successfully, and
> > there will be an attempt later to sent the command to qemu.
> 
> I think you haven't quite understood my point.
> 
> My understanding of your API is that it gives the user of
> libl__qmp_cmd a certain amount of visibility of the existence or
> otherwise of the qmp connection.
> 
> You say:
> 
>   + * When called from within a callback, the same QMP connection will be
>   + * reused to execute the new command. This is important in the case
>   + * where the first command is "add-fd" and the second command use the
>   + * fdset created by QEMU.
> 
> This implies that at the top of the callback, the qmp connection is
> actually in some kind of extra state which is not covered by any of
> your Undefined, Idle or Active.
> 
> It is not Undefined, obviously.
> 
> It is not Active because no response is being awaited any longer.
> (If a response were being awaited, then it would not be sensible for
> the callback to issue another command, because your rule is one
> command at once.)
> 
> And it is not Idle because it contains references to allocated
> resources - namely, the qmp connection fd.
> 
> So your documented state model is too poor to express what is going
> on.  You need to write down at least one additional state, which you
> might call `Connected'.
> 
> 
> Also, I have just noticed that you say "from within a callback".  That
> suggests that the code which makes the callback does something to the
> libxl__qmp_state after after the callback returns.
> 
> That is contrary to the way that everything else works in libxl.  In
> libxl, such a callback is made as the last thing before returning back
> to the event loop.
> 
> The reason is that the state struct (in this case the libxl__qmp_cmd)
> may be part of a larger struct, which is completed and deallocated
> somewhere in the series of (nested) callback functions.  So the memory
> may not be valid any more.
> 
> 
> Another way to look at this is that you actually have a fourth state
> which I will call WithinCallback.  On entry to the callback, the cmd
> is in WithinCallback state.
> 
> In WithinCallback state, it is allowed to request that another command
> is sent (putting the cmd back into Active).
> 
> But in the WithinCallback state, the libxl__qmp_cmd contains
> references to resources and may not be freed.  Furthermore, as I read
> your commentary, the WithinCallback state cannot be exited other than
> to Active, or by returning from the callback.
> 
> That would make it very awkward for the user of this interface to ever
> free the cmd.  After all, they can only free the memory for it when
> the state is Idle or Undefined.  So they need to get it into Idle
> which they can only do by returning, but then they have lost
> control...
> 
> 
> So this is why I say you should have a proper fourth state, which we
> can call Connected or something.  On entry to the callback, the cmd is
> in state Connected.  The cmd should stay Connected until it is
> explicitly disposed of.
> 
> The code which makes the callback then does not need to do anything
> after making the callback: the user has sent another command; or is
> going to send another command; or has called _dispose.  In any case,
> the caller has ownership.

I'll reply to that with a new patch which I hope will anwser your
comments.


> > > Also, I don't think this description of the semantics is right.  The
> > > caller must always somehow arrange to initialise this value.
> > > Presumably _init clears it ?  Certainly this as a parameter to the
> > > operation, this should be up with domid and callback.
> > > 
> > > Maybe you want comments like the ones in libxl__datacopier_state etc.,
> > > which say /* caller must fill these in */.
> > > 
> > > And, you probably want to make it clear that the fd remains open in
> > > the libxl process.  (I assume it does.)
> > 
> > Well I was closing the fd once it was sent to QEMU. But we can have the
> > callbacks takes care of closing it instead.
> 
> I don't mind what the semantics are, but they should be clear, and all
> the error cases need to work correctly.  Eg if asking to send a fd
> causes the fd to become owned by the qmp machiner, then if the attempt
> to send the qmp command fails (maybe becaue the qmp connection fails)
> then it must be closed.
> 
> The semantics should probably be whichever are more convenient.
> Personally i would lean towards qmp_cmd not taking ownership, because
> if you do then someone who wants to keep a copy of the fd has to dup
> it and duping a carefd is quite a faff.

Sounds good, I'll do it this way (the caller will keep ownership).

> > > > +libxl__qmp_error_class = Enumeration("qmp_error_class", [
> > > > +    # No error
> > > > +    (0, "NONE"),
> > > > +    # Error generated by libxl (e.g. socket closed unexpectedly, no 
> > > > mem, ...)
> > > > +    (1, "libxl_error"),
> > > > +    # QMP error classes described in QEMU sources code (QapiErrorClass)
> > > > +    (2, "GenericError"),
> > > > +    (3, "CommandNotFound"),
> > > > +    (4, "DeviceNotActive"),
> > > > +    (5, "DeviceNotFound"),
> > > > +    # Unrecognized QMP error class
> > > > +    (6, "Unknown"),
> > > 
> > > Are these numbers from qmp ?  Why not assign a bunch of libxl error
> > > values instead ?
> > 
> > No, these are strings from QEMU, numbers doesn't matter. Also I don't
> > know how well those are going to fit into libxl errors.
> 
> I meant, invent a libxl error number (and corresponding ERROR_QMP_BLAH
> or whatever) for each one of these qmp errors.

That sounds good, I'll invent new libxl error numbers.

> > (qemu.git:qapi/common.json)
> > # @QapiErrorClass:
> > #
> > # QEMU error classes
> > #
> > # @GenericError: this is used for errors that don't require a specific error
> > #                class. This should be the default case for most errors
> > #
> > # @CommandNotFound: the requested command has not been found
> > #
> > # @DeviceNotActive: a device has failed to be become active
> > #
> > # @DeviceNotFound: the requested device has not been found
> > 
> > QEMU always associate an error messages when it send an error, so the
> > only thing t odo with GenericError is to log that message.
> 
> I guess you should alwyas log the error anyway.  But discriminating
> the different qmp errors will probably be useful ?

I do think both DeviceNotActive and DeviceNotFound can be usefull to
know to a caller, and in anycase, I will keep logging error messages
from qemu.

Thanks,

-- 
Anthony PERARD

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.