Xen project Mailing List

Re: [Xen-devel] [PATCH v5 03/15] libxl_qmp: Implement fd callback and read data [and 1 more messages]

To: Anthony PERARD <anthony.perard@xxxxxxxxxx>

From: Ian Jackson <ian.jackson@xxxxxxxxxx>

Date: Mon, 29 Oct 2018 17:31:59 +0000

Cc: xen-devel@xxxxxxxxxxxxxxxxxxxx, Wei Liu <wei.liu2@xxxxxxxxxx>

Delivery-date: Mon, 29 Oct 2018 17:32:22 +0000

List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

Anthony PERARD writes ("Re: [PATCH v5 03/15] libxl_qmp: Implement fd callback and read data [and 1 more messages]"): > On Mon, Oct 15, 2018 at 05:35:36PM +0100, Ian Jackson wrote: > > What are LIBXL__QMP_ERROR_CLASSes and why are they even different from > > ERROR_* values ? Maybe one of them is the QMP integer values and one > > of them is the corresponding libxl integer values ? > > The issue here is that, in QMP, the error is a string and in camel case, > e.g. "GenericError". I've use the idl to generate an enum from the > different error strings. You're using the idl generator to generate a table mapping "GenericError" to LIBXL__QMP_ERROR_CLASS_GENERICERROR (by a series of string lookups done in the libxl string to enum value function) and then a handwritten function to convert that to ERROR_QMP_GENERIC_ERROR, which is in turn defined in the ERROR enum in the idl ? I don't think the first use of the idl generator (for LIBXL__QMP_ERROR_CLASS) is really helping very much here. How about this: introduce in the IDL only ERROR_QMP_* values like ERROR_QMP_GENERIC_ERROR. The idl compiler will make a table mapping "QMP_GENERIC_ERROR" to ERROR_QMP_GENERIC_ERROR. You can then write a custom lookup function int libxl_qmp_error_class_to_libxl_error_code (libxl_gc *gc, const char *eclass) which: * Iterates over that table, looking for strings in the table which start QMP_. So it finds at "QMP_GENERIC_ERROR". * If an entry does start with QMP_, it uses a custom string comparison routine which ignores case, and skips underscores in the libxl error code string. So it matches like this: QMP_GENERIC_ERROR Generic Error * If no matching entry is found it logs the real QMP error string and returns ERROR_UNKNOWN_QMP_ERROR (which must not be called LIBXL_QMP_UNKNOWN_ERROR in case QMP has or grows an "UnknownError" error code). This is not the fastest approach imaginable but it's no worse than the existing O(n) error string lookup and I hope this isn't a hot path. > > > +/* return 1 when a user callback as been called */ > > > +static int qmp_ev_handle_response(libxl__egc *egc, > > > + libxl__ev_qmp *ev, > > > + const libxl__json_object *resp, > > > + libxl__qmp_message_type type) > > > +{ > > > + EGC_GC; > > > + const libxl__json_object *response; > > > + const libxl__json_object *o; > > > + int rc; > > > + int id; > > > + > > > + o = libxl__json_map_get("id", resp, JSON_INTEGER); > > > + if (!o) { > > > + const char *error_desc; > > > + > > > + /* unexpected message, attempt to find an error description */ > > > + o = libxl__json_map_get("error", resp, JSON_MAP); > > > + o = libxl__json_map_get("desc", o, JSON_STRING); > > > > What is the dead store from "error" doing ? > > It's not dead, I reuse "o", as I get "desc" from "o". > I could just create new variables here instead of reusing the same one > (o) over and over again, e.g. obj_id, obj_error, obj_desc. Oh. Sorry. Yes. The idiom with the reuse of `o' is fine, I think, but: > Should I add here that we are parsing: {"error":{"desc": X }} ? That would be lovely. In general that would make these functions much easier to read. > > > + case LIBXL__QMP_MESSAGE_TYPE_ERROR: { > > > + const char *s; > > > + const libxl__json_object *err; > > > + libxl__qmp_error_class error_class; > > > + > > > + rc = ERROR_FAIL; > > > + response = NULL; > > > + > > > + err = libxl__json_map_get("error", resp, JSON_MAP); > > > + o = libxl__json_map_get("class", err, JSON_STRING); > > > + s = libxl__json_object_get_string(o); > > > + if (s && !libxl__qmp_error_class_from_string(s, &error_class)) > > > + rc = qmp_error_class_to_libxl_error_code(error_class); > > > + > > > + o = libxl__json_map_get("desc", err, JSON_STRING); > > > + s = libxl__json_object_get_string(o); > > > + if (s) > > > + LOGD(ERROR, ev->domid, "%s", s); > > > + > > > + break; > > > > Surely this should print more debugging (or maybe even error log > > messages) if the error json document was not in the expected form ? > > Do you mean that if the QMP server doesn't respect the QMP protocol, I > should print more debug here? > > If we are at this point in the programme, we know that the server sent: > {"error": X } > > I'm tempted to say that if "class" or "desc" isn't present, we could say > so and declare the server broken (and disconnect). Yes. Precisely. But I think what happens with your current code is that it falls through to the bottom with little in the way of diagnostics. One approach would be to add an error logging string parameter to libxl__json_*_get, which (if non-null, and the input object is non-null) logs an appropriate complaint including the logging string parameter and the looked-up key. > > > + /* Findout how much can be parsed */ > > > + size_t len = end - s; > > > + > > > + LOG_QMP("parsing %luB: '%.*s'", len, (int)len, s); > > > + > > > + /* Replace \n by \0 so that libxl__json_parse can use strlen */ > > > + s[len - 1] = '\0'; > > > > Doesn't this feed the \r to libxl__json_parse ? > > > > Also, why does this have to be a loop ? Does qemu really send > > multiple json documents end to end, and only sometimes with > > intervening \r\n ? Does it ever send a json document without \r\n at > > the end and then stop speaking for a while ? > > QEMU will send two messages at once, e.g. when asking to eject or change > the cdrom, QEMU will often send the response to our command and an event > at the same time. So we need a loop to parse all messages received. > > But there is always a \r\n at the end of every commands. So I think this is more complicated than it needs to be - and more relaxed than it ought to be. In my earlier comment I suggested that qmp_ev_callback_readable (or its parent) needs to keep calling read() until it gets EWOULDBLOCK. I still think that's right. If you do that, then you don't need this loop either, because if you structure things right, the outer loop can do it. I tried to find an example of how to do this but I couldn't find a simple one. I guess it is OK to do it with nested loops. But I think you should treat only `\n' as the delimiter. This will considerably simplify the buffer handling. (You should check and trim the preceding `\r' before passing things to libxl__json_parse of course.) > > > + ev->buf_consumed += len; > > > + > > > + if (ev->buf_consumed >= ev->buf_used) { > > > + free(ev->rx_buf); > > > > Surely buf_consumed <= buf_used ? Maybe you should assert that. > > I'm attempting to detect here if there's something left in the buffer > that hasn't been parsed yet. If there is nothing left, we can free the > buffer. > > Or do you mean that the condition should be buf_consumed == buf_used ? It felt odd to me to read `ev->buf_consumed >= ev->buf_used' when I knew that buf_consumed <= buf_used. I am suggesting adding assert(buf_consumed <= buf_used); and then writing if (ev->buf_consumed == ev->buf_used) To be honest this condition is a bit fiddly. Why not do this after subtracting buf_consumed from buf_used and copying the old data down ? Then it would read if (!ev->buf_used) But even better, if you change your code to permit ev->rx_buf != NULL when ev->rx_buf_used==0, this code goes away completely. See my comments about permissible states of the buffer setup. > > > +static int qmp_ev_callback_writable(libxl__gc *gc, libxl__ev_qmp *ev, > > > int fd) > > > +{ > > > + int rc; > > > + char *buf; > > > + size_t len; > > > + int send_fd = -1; > > > + > > > + /* No need to call qmp_ev_callback_writable again, everything that > > > needs to > > > + * be send for now will be in this call. */ > > > > So you guarantee never to send any message which is too large to fit > > into a socket buffer ? Do you know what that length is ? > > I'm using libxl_write_exactly(). Yes. The fd is in nonblocking mode. If you call libxl_write_exactly with more data than will fit into the socket buffer, libxl_write_exactly will bomb out with EWOULDBLOCK (which it treats as a fatal error). It can't do anything else. IDK if this is going to be a problem, but it is a significant limitation of this function. At the very least you should add an assert that the message is small, along with some commentary about the size of socket buffers. If that is not suitable, the usual approach is to prepare and store a buffer containing the data to be sent, and then walk across it with a series of write calls. Like the aoutils datacopier. (You could perhaps use the aoutils datacopier with libxl__datacopier_prefixdata if you wanted, but you'd have to find a /dev/null to pass for readfd...) > On the other end, I've just realize that I'm also using > libxl__sendmsg_fds(), which I guess may not send everything. But the > function also doesn't seems to the caller know how much have been sent. libxl_write_exactly does not return until it has completed and if it gets an errno other than EINTR it calls it failure. Likewise libxl_sendmsg_fds. If you want to send fds in the middle of a long stream, where the operation might block, you need to decide which byte the fds should be associated with, and then call libxl__sendmsg_fds separately for just that byte (and, presumably, write(2)) for the others. You'll need to enhance libxl__sendmsg_fds to understand that EWOULDBLOCK should be handled by its caller (and maybe that in that case there should be a check that datalen==1). > Should I keep using libxl_write_exactly() ? > I'm going to need to be able to track how much data have been sent for > at least sendmsg() anyway, and try again later. You can't use libxl_write_exactly unless you can prove it won't block. > > > +static void qmp_ev_callback_error(libxl__egc *egc, libxl__ev_qmp *ev) > > > +{ > > > + EGC_GC; > > > + > > > + LOGD(ERROR, ev->domid, "Error happend with the QMP connection to > > > QEMU"); > > > + > > > + /* On error, deallocate all private ressources */ > > > + libxl__ev_qmp_dispose(gc, ev); > > > > This surely needs to set the state. Presumably that should be done in > > libxl__ev_qmp_dispose but AFAICT it isn't. > > It is done by libxl__ev_qmp_init, which _dispose calls, itn't that > enough? Maybe. That depends on your table of permissible internal and external states. > > Going back a bit, earlier we had: > > > > > +#define QMP_CAPABILITY_NEGOTIATION_MSGID 1 > > > > I recommend using a different value. Can we safely let this wrap ? > > If so maybe we could use 0x786c7100 which spells "xlq\0" or something. > > This can make it easier to see where rogue values are coming from, to > > distinguish arguments in debug output, etc. > > ID can be string, or any json value. Oh! > Maybe we could have > {"libxl-id": number} as "id"? Making it a whole json object seems a bit overkill. How about GCSPRINTF("libxl-id-%lu", id++) ? Or did you intend to fish the id out and treat it, in the libxl code, as just an integer ? > But 0x786c7100 as starting point would work fine as well. > I don't think it matter if the value wrap. Beside adding 1 to generate a > new id, and compare, I don't think the id should be use for anything > else. Right. I think TBH (uint32_t)0x786c7100 is probably good and simple. Ian. _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxxxxxxxxx https://lists.xenproject.org/mailman/listinfo/xen-devel

©2013 Xen Project, A Linux Foundation Collaborative Project. All Rights Reserved.
Linux Foundation is a registered trademark of The Linux Foundation.
Xen Project is a trademark of The Linux Foundation.