[Xen-changelog] [xen master] libxl: When restricted, start QEMU paused
commit ae29aa0f8fdfbd41d5ea71a1338fc6330562cff3
Author:     Anthony PERARD <anthony.perard@xxxxxxxxxx>
AuthorDate: Thu Jan 31 10:57:48 2019 +0000
Commit:     Wei Liu <wei.liu2@xxxxxxxxxx>
CommitDate: Fri Feb 1 12:02:41 2019 +0000

    libxl: When restricted, start QEMU paused

    libxl runs the command "cont" later during guest creation; i.e. it
    is expecting that QEMU would not do any emulation. Use the "-S"
    command option to achieve this.

    Unfortunately, when QEMU is started with "-S", it won't write QEMU's
    readiness into xenstore. So only activate this option when we have a
    QEMU startup notification via QMP available, i.e. when dm_restrict
    is activated.

    The -S option has the side-effect of suppressing the startup
    notification via xenstore: libxl will only get the notification via
    QMP. It is important to rely only on QMP for notification when we
    have QMP available, as (due to a qemu bug) not waiting for that QMP
    notification may result in the QMP socket becoming blocked, so that
    QEMU stops responding to new connections even if no existing ones
    are active.

    When the QEMU bug happens, the actions taken by both libxl and QEMU
    are roughly as follows:
    - libxl connects and handshakes with QEMU, then sends the cmd
      "query-status".
    - QEMU prepares and maybe tries to send the response, while also
      writing "running" into xenstore.
    - libxl sees via xenstore that QEMU is running and disconnects from
      the QMP socket before receiving the response from the cmd.
    => The QMP socket (monitor) is thereby blocked and will never reply
       to commands on new connections.

    This is due to QEMU only responding to one command at a time, and
    suspending its monitor (QMP) until the command has been processed
    and sent. Disconnecting from the socket doesn't unsuspend the
    monitor.

    The race described here is very likely to happen with QEMU 3.1.50
    (during 3.2 development), but can be reproduced with QEMU 3.1.

    Signed-off-by: Anthony PERARD <anthony.perard@xxxxxxxxxx>
    Release-acked-by: Juergen Gross <jgross@xxxxxxxx>
    Acked-by: Ian Jackson <ian.jackson@xxxxxxxxxxxxx>
---
 tools/libxl/libxl_dm.c | 15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/tools/libxl/libxl_dm.c b/tools/libxl/libxl_dm.c
index b245956b77..2f19786bdd 100644
--- a/tools/libxl/libxl_dm.c
+++ b/tools/libxl/libxl_dm.c
@@ -1183,6 +1183,14 @@ static int libxl__build_device_model_args_new(libxl__gc *gc,
             flexarray_append(dm_args,
                              GCSPRINTF("socket,id=libxl-cmd,fd=%d,server,nowait",
                                        state->dm_monitor_fd));
+
+            /*
+             * Start QEMU with its "CPU" paused, it will not start any emulation
+             * until the QMP command "cont" is used. This also prevent QEMU from
+             * writing "running" to the "state" xenstore node so we only use this
+             * flag when we have the QMP based startup notification.
+             * */
+            flexarray_append(dm_args, "-S");
         } else {
             flexarray_append(dm_args,
                              GCSPRINTF("socket,id=libxl-cmd,"
@@ -2702,6 +2710,7 @@ static void device_model_qmp_cb(libxl__egc *egc, libxl__ev_qmp *ev,
     libxl__dm_spawn_state *dmss = CONTAINER_OF(ev, *dmss, qmp);
     const libxl__json_object *o;
     const char *status;
+    const char *expected_state;
 
     libxl__ev_qmp_dispose(gc, ev);
 
@@ -2717,7 +2726,11 @@ static void device_model_qmp_cb(libxl__egc *egc, libxl__ev_qmp *ev,
         goto failed;
     }
     status = libxl__json_object_get_string(o);
-    if (strcmp(status, "running")) {
+    if (!dmss->build_state->saved_state)
+        expected_state = "prelaunch";
+    else
+        expected_state = "paused";
+    if (strcmp(status, expected_state)) {
         LOGD(ERROR, ev->domid, "Unexpected QEMU status: %s", status);
         rc = ERROR_NOT_READY;
         goto failed;
--
generated by git-patchbot for /home/xen/git/xen.git#master

_______________________________________________
Xen-changelog mailing list
Xen-changelog@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/xen-changelog
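
For readers who want the status check from the last hunk in isolation,
here is a minimal standalone sketch. It is not libxl code: the helper
names expected_qemu_status() and qemu_status_ok() are hypothetical, and
the boolean has_saved_state stands in for the patch's test of
dmss->build_state->saved_state. It only restates the rule the patch
introduces: a QEMU started with "-S" is expected to answer
"query-status" with "prelaunch" on a fresh start and with "paused" when
a saved state is being restored, and no longer with "running".

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper (not part of libxl): the "query-status" value
 * expected from a QEMU that was started with -S.
 */
static const char *expected_qemu_status(bool has_saved_state)
{
    return has_saved_state ? "paused" : "prelaunch";
}

/*
 * Hypothetical helper (not part of libxl): true when the status string
 * reported over QMP matches the expected one. The patch itself reports
 * a mismatch as ERROR_NOT_READY; that error handling is left out here.
 */
static bool qemu_status_ok(const char *status, bool has_saved_state)
{
    assert(status != NULL);
    return strcmp(status, expected_qemu_status(has_saved_state)) == 0;
}
```

With this rule, a QEMU that reports "running" at this point is treated
as being in an unexpected state, both on a fresh start and on restore.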