[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] [PATCH for-4.16] tools/xenstored: Ignore domain we were unable to restore
From: Julien Grall <jgrall@xxxxxxxxxx> Commit 939775cfd3 "handle dying domains in live update" was meant to handle gracefully dying domain. However, the @releaseDomain watch will end up to be sent as soon as we finished to restore Xenstored state. This may be before Xen reports the domain to be dying (such as if the guest decided to revoke access to the xenstore page). Consequently daemon like xenconsoled will not clean-up the domain and it will be left as a zombie. To avoid the problem, mark the connection as ignored. This also requires to tweak conn_can_write() and conn_can_read() to prevent dereferencing a NULL pointer (the interface will not mapped). The check conn->is_ignored was originally added after the callbacks because the helpers for a socket connection may close the fd. However, ignore_connection() will close a socket connection directly. So it is fine to do the re-order. Signed-off-by: Julien Grall <jgrall@xxxxxxxxxx> --- This issue was originally found when developping commit 939775cfd3. Unfortunately, due to some miscommunication the wrong patch was sent upstream. The approach is re-using the one we discussed back then. This was tested by modifying Linux to revoke the Xenstore grant during boot. Without this patch, the domain will be left after Live-Update. Note that on a basic setup (only xenconsoled and xl watch @releaseDomain), the domain may be cleaned on the next domain shutdown/start. Xenstore Live-Update is so far a tech preview feature. But I would still like to request this patch to be included in 4.16 as this was meant to be part of the original work. --- tools/xenstore/xenstored_core.c | 8 ++++---- tools/xenstore/xenstored_core.h | 1 + tools/xenstore/xenstored_domain.c | 21 ++++++++++++--------- 3 files changed, 17 insertions(+), 13 deletions(-) diff --git a/tools/xenstore/xenstored_core.c b/tools/xenstore/xenstored_core.c index 0d4c73d6e20c..91d093a12ea6 100644 --- a/tools/xenstore/xenstored_core.c +++ b/tools/xenstore/xenstored_core.c @@ -338,10 +338,10 @@ static int destroy_conn(void *_conn) static bool conn_can_read(struct connection *conn) { - if (!conn->funcs->can_read(conn)) + if (conn->is_ignored) return false; - if (conn->is_ignored) + if (!conn->funcs->can_read(conn)) return false; /* @@ -356,7 +356,7 @@ static bool conn_can_read(struct connection *conn) static bool conn_can_write(struct connection *conn) { - return conn->funcs->can_write(conn) && !conn->is_ignored; + return !conn->is_ignored && conn->funcs->can_write(conn); } /* This function returns index inside the array if succeed, -1 if fail */ @@ -1466,7 +1466,7 @@ static struct { * * All watches, transactions, buffers will be freed. */ -static void ignore_connection(struct connection *conn) +void ignore_connection(struct connection *conn) { struct buffered_data *out, *tmp; diff --git a/tools/xenstore/xenstored_core.h b/tools/xenstore/xenstored_core.h index 258f6ff38279..07d861d92499 100644 --- a/tools/xenstore/xenstored_core.h +++ b/tools/xenstore/xenstored_core.h @@ -206,6 +206,7 @@ struct node *read_node(struct connection *conn, const void *ctx, struct connection *new_connection(const struct interface_funcs *funcs); struct connection *get_connection_by_id(unsigned int conn_id); +void ignore_connection(struct connection *conn); void check_store(void); void corrupt(struct connection *conn, const char *fmt, ...); diff --git a/tools/xenstore/xenstored_domain.c b/tools/xenstore/xenstored_domain.c index 47e9107c144e..d03c7d93a9e7 100644 --- a/tools/xenstore/xenstored_domain.c +++ b/tools/xenstore/xenstored_domain.c @@ -268,14 +268,7 @@ void check_domains(void) domain->shutdown = true; notify = 1; } - /* - * On Restore, we may have been unable to remap the - * interface and the port. As we don't know whether - * this was because of a dying domain, we need to - * check if the interface and port are still valid. - */ - if (!dominfo.dying && domain->port && - domain->interface) + if (!dominfo.dying) continue; } if (domain->conn) { @@ -1303,6 +1296,17 @@ void read_state_connection(const void *ctx, const void *state) if (!domain) barf("domain allocation error"); + conn = domain->conn; + + /* + * We may not have been able to restore the domain (for + * instance because it revoked the Xenstore grant). We need + * to keep it around to send @releaseDomain when it is + * dead. So mark it as ignored. + */ + if (!domain->port || !domain->interface) + ignore_connection(conn); + if (sc->spec.ring.tdomid != DOMID_INVALID) { tdomain = find_or_alloc_domain(ctx, sc->spec.ring.tdomid); @@ -1311,7 +1315,6 @@ void read_state_connection(const void *ctx, const void *state) talloc_reference(domain->conn, tdomain->conn); domain->conn->target = tdomain->conn; } - conn = domain->conn; } conn->conn_id = sc->conn_id; -- 2.32.0
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |