[Xen-devel] [PATCH v2 for-4.12] libxl: When restricted, start QEMU paused
libxl runs the command "cont" later during guest creation; i.e. it is
expecting that QEMU would not do any emulation. Use the "-S" command
option to achieve this.

Unfortunately, when QEMU is started with "-S", it won't write QEMU's
readiness into xenstore. So only activate this option when we have a
QEMU startup notification via QMP available, i.e. when dm_restrict is
activated.

The -S option has the side-effect of suppressing the startup
notification via xenstore: libxl will only get the notification via
QMP. It is important to rely only on QMP for notification when we have
QMP available, as (due to a qemu bug) not waiting for that QMP
notification may result in the QMP socket becoming blocked, so that
QEMU stops responding to new connections even if no existing ones are
active.

When the QEMU bug happens, the actions taken by both libxl and QEMU are
roughly as follows:
- libxl connects and handshakes with QEMU, then sends the cmd
  "query-status".
- QEMU prepares and maybe tries to send the response, while also
  writing "running" into xenstore.
- libxl sees via xenstore that QEMU is running and disconnects from the
  QMP socket before receiving the response from the cmd.
=> The QMP socket (monitor) is thereby blocked and will never reply to
   commands on new connections.

This is due to QEMU only responding to one command at a time, and
suspending its monitor (QMP) until the command has been processed and
sent. Disconnecting from the socket doesn't unsuspend the monitor.

The race described here is very likely to happen with QEMU 3.1.50
(during 3.2 development), but can be reproduced with QEMU 3.1.

Signed-off-by: Anthony PERARD <anthony.perard@xxxxxxxxxx>
Release-acked-by: Juergen Gross <jgross@xxxxxxxx>
---
v2: commit message reworked.
---
 tools/libxl/libxl_dm.c | 15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/tools/libxl/libxl_dm.c b/tools/libxl/libxl_dm.c
index b245956b77..2f19786bdd 100644
--- a/tools/libxl/libxl_dm.c
+++ b/tools/libxl/libxl_dm.c
@@ -1183,6 +1183,14 @@ static int libxl__build_device_model_args_new(libxl__gc *gc,
             flexarray_append(dm_args,
                 GCSPRINTF("socket,id=libxl-cmd,fd=%d,server,nowait",
                           state->dm_monitor_fd));
+
+            /*
+             * Start QEMU with its "CPU" paused, it will not start any emulation
+             * until the QMP command "cont" is used. This also prevent QEMU from
+             * writing "running" to the "state" xenstore node so we only use this
+             * flag when we have the QMP based startup notification.
+             * */
+            flexarray_append(dm_args, "-S");
         } else {
             flexarray_append(dm_args,
                              GCSPRINTF("socket,id=libxl-cmd,"
@@ -2702,6 +2710,7 @@ static void device_model_qmp_cb(libxl__egc *egc, libxl__ev_qmp *ev,
     libxl__dm_spawn_state *dmss = CONTAINER_OF(ev, *dmss, qmp);
     const libxl__json_object *o;
     const char *status;
+    const char *expected_state;
 
     libxl__ev_qmp_dispose(gc, ev);
 
@@ -2717,7 +2726,11 @@ static void device_model_qmp_cb(libxl__egc *egc, libxl__ev_qmp *ev,
         goto failed;
     }
     status = libxl__json_object_get_string(o);
-    if (strcmp(status, "running")) {
+    if (!dmss->build_state->saved_state)
+        expected_state = "prelaunch";
+    else
+        expected_state = "paused";
+    if (strcmp(status, expected_state)) {
         LOGD(ERROR, ev->domid, "Unexpected QEMU status: %s", status);
         rc = ERROR_NOT_READY;
         goto failed;
-- 
Anthony PERARD

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel
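
The status check changed by the last hunk can also be looked at in
isolation. The following is a minimal standalone sketch, not libxl
code: expected_qemu_status() and qemu_status_ok() are hypothetical
helpers, and the has_saved_state parameter stands in for whether
dmss->build_state->saved_state is set. It only mirrors the mapping used
in the patch: "prelaunch" without a saved state, "paused" with one.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: the "query-status" value expected from a QEMU
 * that was started with -S, following the mapping in the patch. */
static const char *expected_qemu_status(bool has_saved_state)
{
    return has_saved_state ? "paused" : "prelaunch";
}

/* Hypothetical helper: true when the status reported by QEMU matches
 * the expected one. With -S, "running" is never the expected value. */
static bool qemu_status_ok(const char *status, bool has_saved_state)
{
    assert(status != NULL);
    return strcmp(status, expected_qemu_status(has_saved_state)) == 0;
}
```

With these helpers, a reported status of "running" is rejected in both
cases, which corresponds to the ERROR_NOT_READY path in the patch.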