[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v5 03/15] libxl_qmp: Implement fd callback and read data [and 1 more messages]



Anthony PERARD writes ("Re: [PATCH v5 03/15] libxl_qmp: Implement fd callback 
and read data [and 1 more messages]"):
> On Mon, Oct 15, 2018 at 05:35:36PM +0100, Ian Jackson wrote:
> > What are LIBXL__QMP_ERROR_CLASSes and why are they even different from
> > ERROR_* values ?  Maybe one of them is the QMP integer values and one
> > of them is the corresponding libxl integer values ?
> 
> The issue here is that, in QMP, the error is a string and in camel case,
> e.g. "GenericError".  I've use the idl to generate an enum from the
> different error strings.

You're using the idl generator to generate a table mapping
"GenericError" to LIBXL__QMP_ERROR_CLASS_GENERICERROR (by a series of
string lookups done in the libxl string to enum value function) and
then a handwritten function to convert that to
ERROR_QMP_GENERIC_ERROR, which is in turn defined in the ERROR
enum in the idl ?

I don't think the first use of the idl generator (for
LIBXL__QMP_ERROR_CLASS) is really helping very much here.

How about this: introduce in the IDL only ERROR_QMP_* values like
ERROR_QMP_GENERIC_ERROR.  The idl compiler will make a table mapping
"QMP_GENERIC_ERROR" to ERROR_QMP_GENERIC_ERROR.

You can then write a custom lookup function
  int libxl_qmp_error_class_to_libxl_error_code
      (libxl_gc *gc, const char *eclass)
which:
 * Iterates over that table, looking for strings in the table
   which start QMP_.  So it finds at "QMP_GENERIC_ERROR".
 * If an entry does start with QMP_, it uses a custom string
   comparison routine which ignores case, and skips underscores
   in the libxl error code string.  So it matches like this:
       QMP_GENERIC_ERROR
           Generic Error
 * If no matching entry is found it logs the real QMP
   error string and returns ERROR_UNKNOWN_QMP_ERROR
   (which must not be called LIBXL_QMP_UNKNOWN_ERROR in
   case QMP has or grows an "UnknownError" error code).

This is not the fastest approach imaginable but it's no worse than the
existing O(n) error string lookup and I hope this isn't a hot path.


> > > +/* return 1 when a user callback as been called */
> > > +static int qmp_ev_handle_response(libxl__egc *egc,
> > > +                                  libxl__ev_qmp *ev,
> > > +                                  const libxl__json_object *resp,
> > > +                                  libxl__qmp_message_type type)
> > > +{
> > > +    EGC_GC;
> > > +    const libxl__json_object *response;
> > > +    const libxl__json_object *o;
> > > +    int rc;
> > > +    int id;
> > > +
> > > +    o = libxl__json_map_get("id", resp, JSON_INTEGER);
> > > +    if (!o) {
> > > +        const char *error_desc;
> > > +
> > > +        /* unexpected message, attempt to find an error description */
> > > +        o = libxl__json_map_get("error", resp, JSON_MAP);
> > > +        o = libxl__json_map_get("desc", o, JSON_STRING);
> > 
> > What is the dead store from "error" doing ?
> 
> It's not dead, I reuse "o", as I get "desc" from "o".
> I could just create new variables here instead of reusing the same one
> (o) over and over again, e.g. obj_id, obj_error, obj_desc.

Oh.  Sorry.  Yes.  The idiom with the reuse of `o' is fine, I think,
but:

> Should I add here that we are parsing: {"error":{"desc": X }} ?

That would be lovely.  In general that would make these functions much
easier to read.

> > > +    case LIBXL__QMP_MESSAGE_TYPE_ERROR: {
> > > +        const char *s;
> > > +        const libxl__json_object *err;
> > > +        libxl__qmp_error_class error_class;
> > > +
> > > +        rc = ERROR_FAIL;
> > > +        response = NULL;
> > > +
> > > +        err = libxl__json_map_get("error", resp, JSON_MAP);
> > > +        o = libxl__json_map_get("class", err, JSON_STRING);
> > > +        s = libxl__json_object_get_string(o);
> > > +        if (s && !libxl__qmp_error_class_from_string(s, &error_class))
> > > +            rc = qmp_error_class_to_libxl_error_code(error_class);
> > > +
> > > +        o = libxl__json_map_get("desc", err, JSON_STRING);
> > > +        s = libxl__json_object_get_string(o);
> > > +        if (s)
> > > +            LOGD(ERROR, ev->domid, "%s", s);
> > > +
> > > +        break;
> > 
> > Surely this should print more debugging (or maybe even error log
> > messages) if the error json document was not in the expected form ?
> 
> Do you mean that if the QMP server doesn't respect the QMP protocol, I
> should print more debug here?
> 
> If we are at this point in the programme, we know that the server sent:
> {"error": X }
> 
> I'm tempted to say that if "class" or "desc" isn't present, we could say
> so and declare the server broken (and disconnect).

Yes.  Precisely.  But I think what happens with your current code is
that it falls through to the bottom with little in the way of
diagnostics.

One approach would be to add an error logging string parameter to
libxl__json_*_get, which (if non-null, and the input object is
non-null) logs an appropriate complaint including the logging string
parameter and the looked-up key.

> > > +        /* Findout how much can be parsed */
> > > +        size_t len = end - s;
> > > +
> > > +        LOG_QMP("parsing %luB: '%.*s'", len, (int)len, s);
> > > +
> > > +        /* Replace \n by \0 so that libxl__json_parse can use strlen */
> > > +        s[len - 1] = '\0';
> > 
> > Doesn't this feed the \r to libxl__json_parse ?
> > 
> > Also, why does this have to be a loop ?  Does qemu really send
> > multiple json documents end to end, and only sometimes with
> > intervening \r\n ?  Does it ever send a json document without \r\n at
> > the end and then stop speaking for a while ?
> 
> QEMU will send two messages at once, e.g. when asking to eject or change
> the cdrom, QEMU will often send the response to our command and an event
> at the same time. So we need a loop to parse all messages received.
> 
> But there is always a \r\n at the end of every commands.

So I think this is more complicated than it needs to be - and more
relaxed than it ought to be.

In my earlier comment I suggested that qmp_ev_callback_readable (or
its parent) needs to keep calling read() until it gets EWOULDBLOCK.  I
still think that's right.  If you do that, then you don't need this
loop either, because if you structure things right, the outer loop can
do it.

I tried to find an example of how to do this but I couldn't find a
simple one.  I guess it is OK to do it with nested loops.

But I think you should treat only `\n' as the delimiter.  This will
considerably simplify the buffer handling.  (You should check and trim
the preceding `\r' before passing things to libxl__json_parse of
course.)

> > > +        ev->buf_consumed += len;
> > > +
> > > +        if (ev->buf_consumed >= ev->buf_used) {
> > > +            free(ev->rx_buf);
> > 
> > Surely buf_consumed <= buf_used ?  Maybe you should assert that.
> 
> I'm attempting to detect here if there's something left in the buffer
> that hasn't been parsed yet. If there is nothing left, we can free the
> buffer.
> 
> Or do you mean that the condition should be buf_consumed == buf_used ?

It felt odd to me to read `ev->buf_consumed >= ev->buf_used' when I
knew that buf_consumed <= buf_used.  I am suggesting adding
  assert(buf_consumed <= buf_used);
and then writing
  if (ev->buf_consumed == ev->buf_used)

To be honest this condition is a bit fiddly.  Why not do this after
subtracting buf_consumed from buf_used and copying the old data down ?
Then it would read
  if (!ev->buf_used)

But even better, if you change your code to permit ev->rx_buf != NULL
when ev->rx_buf_used==0, this code goes away completely.

See my comments about permissible states of the buffer setup.

> > > +static int qmp_ev_callback_writable(libxl__gc *gc, libxl__ev_qmp *ev, 
> > > int fd)
> > > +{
> > > +    int rc;
> > > +    char *buf;
> > > +    size_t len;
> > > +    int send_fd = -1;
> > > +
> > > +    /* No need to call qmp_ev_callback_writable again, everything that 
> > > needs to
> > > +     * be send for now will be in this call. */
> > 
> > So you guarantee never to send any message which is too large to fit
> > into a socket buffer ?  Do you know what that length is ?
> 
> I'm using libxl_write_exactly().

Yes.  The fd is in nonblocking mode.  If you call libxl_write_exactly
with more data than will fit into the socket buffer,
libxl_write_exactly will bomb out with EWOULDBLOCK (which it treats as
a fatal error).  It can't do anything else.

IDK if this is going to be a problem, but it is a significant
limitation of this function.  At the very least you should add an
assert that the message is small, along with some commentary about the
size of socket buffers.

If that is not suitable, the usual approach is to prepare and store a
buffer containing the data to be sent, and then walk across it with a
series of write calls.  Like the aoutils datacopier.  (You could
perhaps use the aoutils datacopier with libxl__datacopier_prefixdata
if you wanted, but you'd have to find a /dev/null to pass for
readfd...)

> On the other end, I've just realize that I'm also using
> libxl__sendmsg_fds(), which I guess may not send everything. But the
> function also doesn't seems to the caller know how much have been sent.

libxl_write_exactly does not return until it has completed and if it
gets an errno other than EINTR it calls it failure.  Likewise
libxl_sendmsg_fds.

If you want to send fds in the middle of a long stream, where the
operation might block, you need to decide which byte the fds should be
associated with, and then call libxl__sendmsg_fds separately for just
that byte (and, presumably, write(2)) for the others.  You'll need to
enhance libxl__sendmsg_fds to understand that EWOULDBLOCK should be
handled by its caller (and maybe that in that case there should be a
check that datalen==1).

> Should I keep using libxl_write_exactly() ?
> I'm going to need to be able to track how much data have been sent for
> at least sendmsg() anyway, and try again later.

You can't use libxl_write_exactly unless you can prove it won't block.

> > > +static void qmp_ev_callback_error(libxl__egc *egc, libxl__ev_qmp *ev)
> > > +{
> > > +    EGC_GC;
> > > +
> > > +    LOGD(ERROR, ev->domid, "Error happend with the QMP connection to 
> > > QEMU");
> > > +
> > > +    /* On error, deallocate all private ressources */
> > > +    libxl__ev_qmp_dispose(gc, ev);
> > 
> > This surely needs to set the state.  Presumably that should be done in
> > libxl__ev_qmp_dispose but AFAICT it isn't.
> 
> It is done by libxl__ev_qmp_init, which _dispose calls, itn't that
> enough?

Maybe.  That depends on your table of permissible internal and
external states.

> > Going back a bit, earlier we had:
> > 
> > > +#define QMP_CAPABILITY_NEGOTIATION_MSGID 1
> > 
> > I recommend using a different value.  Can we safely let this wrap ?
> > If so maybe we could use 0x786c7100 which spells "xlq\0" or something.
> > This can make it easier to see where rogue values are coming from, to
> > distinguish arguments in debug output, etc.
> 
> ID can be string, or any json value.

Oh!

> Maybe we could have
> {"libxl-id": number} as "id"?

Making it a whole json object seems a bit overkill.  How about
  GCSPRINTF("libxl-id-%lu", id++)
?

Or did you intend to fish the id out and treat it, in the libxl code,
as just an integer ?

> But 0x786c7100 as starting point would work fine as well.
> I don't think it matter if the value wrap. Beside adding 1 to generate a
> new id, and compare, I don't think the id should be use for anything
> else.

Right.

I think TBH (uint32_t)0x786c7100 is probably good and simple.

Ian.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.