Xen project Mailing List

Re: [Xen-devel] [PATCH v6.2 05/11] libxl_qmp: Implementation of libxl__ev_qmp_*

To: Anthony PERARD <anthony.perard@xxxxxxxxxx>

From: Ian Jackson <ian.jackson@xxxxxxxxxx>

Date: Thu, 15 Nov 2018 18:40:13 +0000

Cc: xen-devel@xxxxxxxxxxxxxxxxxxxx, Wei Liu <wei.liu2@xxxxxxxxxx>

Delivery-date: Thu, 15 Nov 2018 18:40:55 +0000

List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

Anthony PERARD writes ("[PATCH v6.2 05/11] libxl_qmp: Implementation of libxl__ev_qmp_*"): > Signed-off-by: Anthony PERARD <anthony.perard@xxxxxxxxxx> Thanks for the additional comments. I got thoroughly stuck in. Overall the structure looks broadly right and my comments are generally about details. As might be expected, some of them are stylistic or matters of taste. Please feel free to push back if you disagree with me on anything. > struct libxl__ev_qmp { > /* caller should include this in their own struct */ > /* caller must fill these in, and they must all remain valid */ > libxl_domid domid; > libxl__ev_qmp_callback *callback; > int fd; /* set to send a fd with the command, -1 otherwise */ > + This name `fd' becomes very confusing, because most of the time with qmp the fd in question is our socket to qemu. Can you rename it ? payload_fd maybe ? > + /* > + * remaining fields are private to libxl_ev_qmp_* > + */ > + > + int id; > + libxl__carefd *qmp_cfd; > + libxl__ev_fd qmp_efd; > + libxl__qmp_state qmp_state; What purpose do the qmp_ prefixes on these serve ? I think if it were me I would drop them and simply call these `cfd' and `efd' and `state'. > +/* ------------ Implementation of libxl__ev_qmp ---------------- */ > + > +/* > + * Possible internal state compared to qmp_state: > + * > + * qmp_state External qmp_cfd qmp_efd id rx_buf* tx_buf* msg* > + * disconnected Idle close Idle reset free free free ^^^^^ YM `0' or `NULL'. Maybe this table would be easier to read if the headings were offset from the values. Eg: + * qmp_state External qmp_cfd qmp_efd id rx_buf* tx_buf* msg* + * disconnected Idle close Idle reset free free free ? Up to you. > + * connecting Active open Active set used free set You say Active for the qmp_efd but I think it would be better to say what you are waiting for ? Looking at qmp_ev_connect I think connecting has only POLLIN. You could write IN, IN[|OUT], etc., maybe. While reading the rest of the code I found that this was often a critical piece of state you are managing, so I think some improvement here is warranted. > + * capability_negotiation > + * Active open Active set used any set I would abbreviate this state name here, because it spoils the table. `cap.nego.' ? Or maybe you could just rename it to capab_nego or something. > + * connected Connected open Active prev[1] used free any What msg might there be in state `connected' ? According to the public interface there is no message yet. In general the movement of the caller's intended qmp command from msg to rx_buf through to qemu could perhaps do with some exposition (commentary). > + * waiting_reply Active open Active set used any free I am a bit confused about the semantics of tx_buf being free in state waiting_reply. Does that mean the caller's wanted command has been sent ? This seems like part of the same question as I just asked... Maybe you want to split this into two rows: + * waiting_reply Active open IN|OUT set used user's free + * waiting_reply Active open IN set used free free ? > + * broken[2] none[3] any Active any any any any ... > + * [1] id used on the previously sent command It would be better if the id column stated something more useful than `set'. `next' maybe, where applicable ? AIUI it is intended (important?) that ids should not be reused. > + * Possible buffers states: > + * - receiving buffer: > + * free used > + * rx_buf NULL allocated > + * rx_buf_size 0 allocation size of `rx_buf` > + * rx_buf_off 0 <= rx_buf_size > + * - transmiting buffer: Typo `transmitting'. A minor point, but indenting the rx_* things here would avoid them lining up with `transmitting buffer' and misleading the eye. Better still, have you considered moving this table into the struct itself ? You could put the table to the RHS of the actual member definitions. That gives slightly fewer places to look, although it would involve a cross-reference from this wider state description to the field's state tables. > + * free used > + * tx_buf NULL contain data > + * tx_buf_len 0 size of data > + * tx_buf_off 0 <= tx_buf_len You should explain the semantics of _off in both cases. It points into the buffer, but what is the meaning of the data (or the space) before and after it ? ... oh, I see for rx_buf_used this is documented in the struct itself. But not for tx_buf_off. Which leads me to say: the struct contains rx_buf_used but the comment here talks about _off. > + * - queued user command: > + * free set > + * msg NULL contain data YM `contains' data, in both cases. > + * msg_len 0 size of data > + * > + * - Allowed internal state transition: > + * disconnected -> connecting > + * connection -> capability_negotiation > + * capability_negotiation/waiting_reply -> connected > + * connected -> waiting_reply > + * any -> disconnected This does not mention the state `broken' and it should. > +static void qmp_ev_ensure_reading_writing(libxl__gc *gc, libxl__ev_qmp *ev) > + /* !disconnected -> same state */ This is not really true. This function is one to transform a transient, partial, internal state, into the corresponding proper internal state, by setting up the fd event. This will become more obvious if you add the IN etc. to the efd column in the internal state table. > + if (ev->tx_buf) { > + enable = true; > + } else { > + enable = (ev->qmp_state == qmp_state_connected) && ev->msg; > + } > + > + if (enable) > + events = ev->qmp_efd.events | POLLOUT; > + else > + events = ev->qmp_efd.events & ~POLLOUT; > + > + libxl__ev_fd_modify(gc, &ev->qmp_efd, events); This function seems more complicated than it needs to be ? AFAICT your code never clears POLLIN. And this function always unconditionally clears or sets POLLOUT. So the desired events do not in fact depend on qmp_efd.events ? Which makes sense, because from this name I would expect this function to figure out for itself exactly what events ought to be listened for. I think you want something like events = POLLIN; if (some stuff) { events |= POLLOUT; etc. > +static void qmp_ev_set_state(libxl__gc *gc, libxl__ev_qmp *ev, > + libxl__qmp_state new_state) > +{ Now that you have `broken' you should probably mention that this function is not used to enter the state `broken' (that is done willy-nilly), nor the state `disconnected' (that is done in _init). > +static int qmp_error_class_to_libxl_error_code(libxl__gc *gc, > + const char *eclass) > +{ > + const libxl_enum_string_table *t = libxl_error_string_table; > + > + /* compare "QMP_GENERIC_ERROR" from libxl_error to "GenericError" > + * generated by the QMP server */ > + > + for ( ; t->s; t++) { > + const char *s = eclass; > + const char *se = t->s; > + if (strncasecmp(t->s, "QMP_", 4)) > + continue; > + > + /* skip "QMP_" */ > + se += 4; This code contains or implies the value `4' a total of 4 times: three times in the code ("QMP_", 4, 4) and once in a comment. How about const char skip[] = "QMP_"; const size_t skipl = sizeof(skip)-1; ? > + while (*s && *se) { > + /* skip underscores */ > + if (*se == '_') { > + se++; > + continue; > + } > + if (tolower(*s) != tolower(*se)) > + break; > + s++, se++; > + } > + if (!*s && !*se) > + return t->v; I confess I find this way of constructing this loop a bit confusing. I would write this as a for(;;) loop and deal with each of the exit conditions in the loop body. But that is a matter of taste I think so if you prefer it like this, fine. > + } > + > + return ERROR_UNKNOWN_QMP_ERROR; I think you probably wanted to log the original error string here so that it does not become lost. > +static int qmp_ev_prepare_cmd(libxl__gc *gc, > + libxl__ev_qmp *ev, > + const char *cmd, > + const libxl__json_object *args) > + /* on entry: connecting/connected but with `msg` free > + * on return: same state but with `msg` set */ > +{ > + char *buf = NULL; > + size_t len; > + > + assert(!ev->tx_buf && !ev->tx_buf_len); > + assert(!ev->msg && !ev->msg_len); > + > + ev->id++; > + buf = qmp_prepare_cmd(gc, cmd, args, ev->id, &len); > + if (!buf) { > + return ERROR_FAIL; > + } There is no doc comment for qmp_prepare_cmd saying whether it logs on error. So it doesn't ? In which case maybe you should do so here. Alternatively if it can only really fail due to memory allocation failure, maybe it should return void (and call libxl__alloc_failed on allocation failure) instead ? If you don't need to log I think you should drop the { } here. See also my comments on the other patch, about len. > +/* Setup connection */ > + > +static int qmp_ev_connect(libxl__gc *gc, libxl__ev_qmp *ev) > + /* disconnected -> connecting but with `msg` free > + * on error: disconnected > + * If the initial state isn't disconnected, then nothing is done */ Maybe it be better for this function to return `broken' on error ? Also, I think it is anomalous that this function handles `broken' as a no-op and claims success. I think that is not really what a caller might expect. Maybe you want to forbid `broken' (with a corresponding assert) or to permit it (calling dispose before doing the rest of the work). Contrary to the state description, this function does not transition rx_buf from free to used. However, I think this would probably be more easily remedied by changing the definition of `used' to permit NULL/0/0. You might want to use a different word to `used', `inuse' perhaps ? > +out: > + libxl__carefd_close(ev->qmp_cfd); > + ev->qmp_cfd = NULL; > + return rc; If this function were permitted to leave the state `broken' on error, this separate error code would not be needed. Is there some reason not to do that ? > +static void qmp_ev_fd_callback(libxl__egc *egc, libxl__ev_fd *ev_fd, > + int fd, short events, short revents) > + /* On entry, ev_fd is (of course) Active. The ev_qmp may be in any > + * state where this is permitted. qmp_ev_fd_callback will do the work > + * necessary to make progress, depending on the current state, and make > + * the appropriate state transitions and callbacks. */ > +{ > + EGC_GC; > + int rc; > + libxl__json_object *o = NULL; > + libxl__ev_qmp *ev = CONTAINER_OF(ev_fd, *ev, qmp_efd); > + > + if (revents & (POLLHUP)) { > + LOGD(ERROR, ev->domid, "received POLLHUP from QMP socket"); > + rc = ERROR_PROTOCOL_ERROR_QMP; > + goto out; This approach, without more, loses the error code from connect(). I think you need to call getsockopt(fd, SOL_SOCKET, SO_ERROR, ...) and log the errno value if there is one. http://pubs.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.html#tag_15_10_16 > + if (revents & POLLOUT) { > + rc = qmp_ev_callback_writable(gc, ev, fd); ... > + if (revents & POLLIN) { > + rc = qmp_ev_callback_readable(egc, ev, fd); > + if (rc) > + goto out; > + > + /* parse input */ I find it odd that this input parsing is not part of qmp_ev_callback_readable. What do you think about moving it there ? > + qmp_ev_ensure_reading_writing(gc, ev); > + > +out: > + if (rc) { > + LOGD(ERROR, ev->domid, > + "Error happend with the QMP connection to QEMU"); Typo `happened'. > +static int qmp_ev_callback_writable(libxl__gc *gc, > + libxl__ev_qmp *ev, int fd) > + /* on entry: !disconnected > + * on return, one of these state transition: > + * connected -> waiting_reply > + * * -> state unchange > + * but with the state of tx_buf changed > + * on error: broken */ > +{ > + int rc; > + ssize_t r; > + > + if (ev->qmp_state == qmp_state_connected) { > + assert(!ev->tx_buf); > + if (ev->msg) { See my comments about the state `connected' and msg. > + /* > + * We will send a file descriptor associated with a command on the > + * first byte of this command. > + */ > + if (ev->qmp_state == qmp_state_waiting_reply && > + ev->fd >= 0 && > + ev->tx_buf_off == 0) { According to the doc comments, state waiting_reply might have tx_buf free, in which case it would try to execute this code. I don't think that can be right. > + if (ev->tx_buf_off == ev->tx_buf_len) { > + free(ev->tx_buf); > + ev->tx_buf = NULL; > + ev->tx_buf_len = ev->tx_buf_off = 0; This appears in _init too. Maybe you want a qmp_ev_tx_buf_clear function ? > +static int qmp_ev_callback_readable(libxl__egc *egc, > + libxl__ev_qmp *ev, int fd) > + /* !disconnected -> same state (with rx buffer updated) > + * on error -> broken */ > +{ > + EGC_GC; > + > + while (1) { > + ssize_t r; > + > + /* Check if the buffer still have space, or increase size */ > + if (ev->rx_buf_size - ev->rx_buf_used < QMP_RECEIVE_BUFFER_SIZE) { > + ev->rx_buf_size = max(ev->rx_buf_size * 2, > + (size_t)QMP_RECEIVE_BUFFER_SIZE * 2); How about + ev->rx_buf_size *= 2; + ev->rx_buf_size += QMP_RECEIVE_BUFFER_SIZE; or some such ? That's easier to read than a conditional. > + assert(ev->rx_buf_size <= QMP_MAX_SIZE_RX_BUF); > + if (ev->rx_buf_size > QMP_MAX_SIZE_RX_BUF) { I think the assert is redundant. > + LOGD(ERROR, ev->domid, > + "QMP receive buffer is too big (%ld > %lld)", > + ev->rx_buf_size, QMP_MAX_SIZE_RX_BUF); > + return ERROR_BUFFERFULL; > + } Putting this error check between the change to rx_buf_size and the change to rx_buf obscures the essential connection between both changes, which must be indivisible. It also results in a bug: if this error trips, rx_buf_size has been updated but rx_buf has not. This is a dangerously invalid state. This function tries to read everything that qemu produces and calls the whole thing a loss of qemu produces more than QMP_MAX_SIZE_RX_BUF. But perhaps qemu producing more than QMP_MAX_SIZE_RX_BUF output might not be the result of qemu going mad. It might be the result of us having ignored it for a while, and qemu filling a buffer with event notifications which we might want to discard ? Would it not be better to process messages as they arrive ? Ie to put the attempt to find valid messages here inside this loop ? That might also allow the reduction of the maximum message size from 8Mby. 8Mby seems like quite a lot. Of such a change would result in a possible return to the caller with the qmp fd having pending data to read. We would have to make sure to actually call read again after sending a new command, in case the efd is edge triggered. > + r = read(fd, ev->rx_buf + ev->rx_buf_used, > + ev->rx_buf_size - ev->rx_buf_used); > + if (r < 0) { > + if (errno == EINTR) { > + continue; > + } > + if (errno == EWOULDBLOCK) { > + break; > + } These { } around simple breaks, continues, etc. are strictly unnecessary FYI. Personally I find that less cluttered. But they're permitted by CODING_STYLE and you can keep them if you want them. > +/* Handle messages received from QMP server */ > + > +static int qmp_ev_get_next_msg(libxl__egc *egc, libxl__ev_qmp *ev, > + libxl__json_object **o_r) > + /* Find a JSON object and store it in o_r. > + * return ERROR_NOTFOUND if no object is found. > + * `o_r` is allocated within `egc`. > + * > + * !disconnected -> same state (with rx buffer updated) > + */ > +{ ... > + /* Search for the end of a QMP message: "\r\n" */ > + end = memmem(ev->rx_buf, ev->rx_buf_used, "\r\n", 2); > + if (!end) > + return ERROR_NOTFOUND; > + len = (end - ev->rx_buf) + 2; In: Subject: Re: [PATCH v5 03/15] libxl_qmp: Implement fd callback and read data [and 1 more messages] Date: Mon, 29 Oct 2018 17:31:59 +0000 I wrote: But I think you should treat only `\n' as the delimiter. This will considerably simplify the buffer handling. (You should check and trim the preceding `\r' before passing things to libxl__json_parse of course.) But you don't seem to have done that, or replied ? > +static int qmp_ev_handle_message(libxl__egc *egc, > + libxl__ev_qmp *ev, > + const libxl__json_object *resp) > + /* > + * This function will handle every messages sent by the QMP server. > + * Return values: > + * < 0 libxl error code > + * 0 success > + * 1 success, but a user callback has been called, > + * `ev` should not be used anymore. ... > + case LIBXL__QMP_MESSAGE_TYPE_QMP: ... > + assert(!ev->tx_buf); > + ev->tx_buf = qmp_prepare_cmd(gc, "qmp_capabilities", NULL, > + QMP_CAPABILITY_NEGOTIATION_MSGID, > + &ev->tx_buf_len); > + ev->tx_buf_off = 0; Why do you not use qmp_ev_prepare_cmd for this ? > + case LIBXL__QMP_MESSAGE_TYPE_RETURN: > + case LIBXL__QMP_MESSAGE_TYPE_ERROR: ... > + o = libxl__json_map_get("id", resp, JSON_INTEGER); > + if (!o) { > + /* > + * If "id" isn't present, an error occur on the server before > + * it has read the "id" provided by libxl. > + */ > + qmp_ev_parse_error_messages(egc, ev, resp); > + return ERROR_PROTOCOL_ERROR_QMP; I think you wanted to return the error code from qmp_ev_parse_error_messages ? > + id = libxl__json_object_get_integer(o); > + > + if (id == QMP_CAPABILITY_NEGOTIATION_MSGID) { > + /* We have a response to our qmp_capabilities cmd */ > + if (ev->qmp_state != qmp_state_capability_negotiation || > + type != LIBXL__QMP_MESSAGE_TYPE_RETURN) > + goto out_unknown_id; > + qmp_ev_set_state(gc, ev, qmp_state_connected); > + return 0; > + } I'm a bit puzzled about the capability negotation, its id, etc.: * Why does QMP_CAPABILITY_NEGOTIATION_MSGID exist at all ? You could just use a normal id, couldn't you ? And you'd be able to tell from your own state that it was the right value. * Why are we even doing capability negotation when we throw the answer away ? * I think it would be better to check id == ev->id first, rather than in each of these branches. > + if (ev->qmp_state == qmp_state_waiting_reply && > + id == ev->id) { > + if (type == LIBXL__QMP_MESSAGE_TYPE_RETURN) { > + response = libxl__json_map_get("return", resp, JSON_ANY); > + rc = 0; > + } else { > + /* error message */ > + response = NULL; > + rc = qmp_ev_parse_error_messages(egc, ev, resp); > + } > + qmp_ev_set_state(gc, ev, qmp_state_connected); > + ev->callback(egc, ev, response, rc); /* must be last */ > + return 1; > + } > + > +out_unknown_id: We're just falling through here. It seems weird to me. I would expect this sequence of checks: 1. Check id against our id; should be the same; if not, this is an out of sequence reply which is not possible, so call it a protocol error. 2. Check if the thing is an error response; if it is, parse and return the error message. 2. It must be a RETURN. (Maybe add an assert here.) Check our state, which should be capability_negotiation (in which case we go to the next thing) or waiting_reply (in which case we're done). Otherwise it is a protocol error. > +int libxl__ev_qmp_send(libxl__gc *gc, libxl__ev_qmp *ev, > + const char *cmd, libxl__json_object *args) > + /* disconnected -> connecting > + * connected -> waiting_reply > + * on error: disconnected */ > +{ > + int rc; > + > + LOGD(DEBUG, ev->domid, " ev %p, cmd '%s'", ev, cmd); > + > + assert(ev->qmp_state == qmp_state_disconnected || > + ev->qmp_state == qmp_state_connected); > + > + /* Connect to QEMU if not already connected */ > + rc = qmp_ev_connect(gc, ev); > + if (rc) > + goto out; > + > + rc = qmp_ev_prepare_cmd(gc, ev, cmd, args); > + if (rc) > + goto out; > + > + qmp_ev_ensure_reading_writing(gc, ev); > + > +out: Please add rc = 0; before out; But you might prefer just to return 0; since there is nothing allocated in this function that ought to be disposed (which is nice). Thanks, Ian. _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxxxxxxxxx https://lists.xenproject.org/mailman/listinfo/xen-devel

©2013 Xen Project, A Linux Foundation Collaborative Project. All Rights Reserved.
Linux Foundation is a registered trademark of The Linux Foundation.
Xen Project is a trademark of The Linux Foundation.