[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[PATCH] systemd: Add hooks to stop/start xen-watchdog on suspend/resume



This patch adds a systemd sleep hook script to stop the xen-watchdog
service before system suspend and start it again after resume.

Stopping the watchdog before a system suspend operation may look unsafe.
Let’s imagine the following situation: 'systemctl suspend' does not
interact with the running service at all. In such a case, the Xen
watchdog daemon freezes just before suspend. If this happens, for
example, right before sending a ping, and Xen has not yet marked the
domain as suspended (is_shutting_down), the Xen watchdog timer may
trigger a false alert.

This is an almost impossible situation, because typically:
    ping time = watchdog timeout / 2

and the watchdog timeout is usually set to a relatively large value
(dozens of seconds).

Still, this is more likely with very short watchdog timeouts. It may
happen in the following scenarios:
    * Significant delays occur between freezing Linux tasks and
      triggering the ACPI or PSCI sleep request or handler.
    * Long delays happen inside Xen between the entrance to the sleep
      trigger and the actual forwarding of the sleep request further.

A similar situation may occur on resume with short timeouts. During the
resume operation, Xen restores timers and the domain context. The Xen
watchdog timer also resumes. If it schedules the domain right before the
watchdog timeout expires, and the daemon responsible for pinging is not
yet running, a timeout might occur.

Both scenarios are rare and typically require very small watchdog
timeouts combined with significant delays in Xen or the Linux kernel
during suspend/resume flows.

Conceptually, however, if activating and pinging the Xen watchdog is the
responsibility of the domain and its services, then the domain should
also manage the watchdog service/daemon lifecycle. This is similar to
what is already done by the Xen watchdog driver inside the Linux kernel.

Signed-off-by: Mykola Kvach <mykola_kvach@xxxxxxxx>
---
 tools/hotplug/Linux/systemd/Makefile          | 12 ++++-
 .../Linux/systemd/xen-watchdog-sleep.sh       | 45 +++++++++++++++++++
 2 files changed, 56 insertions(+), 1 deletion(-)
 create mode 100644 tools/hotplug/Linux/systemd/xen-watchdog-sleep.sh

diff --git a/tools/hotplug/Linux/systemd/Makefile 
b/tools/hotplug/Linux/systemd/Makefile
index e29889156d..98d325cc5d 100644
--- a/tools/hotplug/Linux/systemd/Makefile
+++ b/tools/hotplug/Linux/systemd/Makefile
@@ -5,6 +5,9 @@ XEN_SYSTEMD_MODULES := xen.conf
 
 XEN_SYSTEMD_MOUNT := proc-xen.mount
 
+XEN_SYSTEMD_SLEEP_SCRIPTS := xen-watchdog-sleep.sh
+XEN_SYSTEMD_SLEEP_DIR := $(XEN_SYSTEMD_DIR)/../system-sleep
+
 XEN_SYSTEMD_SERVICE := xenstored.service
 XEN_SYSTEMD_SERVICE += xenconsoled.service
 XEN_SYSTEMD_SERVICE += xen-qemu-dom0-disk-backend.service
@@ -15,7 +18,8 @@ XEN_SYSTEMD_SERVICE += xendriverdomain.service
 
 ALL_XEN_SYSTEMD :=     $(XEN_SYSTEMD_MODULES)  \
                        $(XEN_SYSTEMD_MOUNT)    \
-                       $(XEN_SYSTEMD_SERVICE)
+                       $(XEN_SYSTEMD_SERVICE)  \
+                       $(XEN_SYSTEMD_SLEEP_SCRIPTS)
 
 .PHONY: all
 all:   $(ALL_XEN_SYSTEMD)
@@ -31,15 +35,21 @@ distclean: clean
 install: $(ALL_XEN_SYSTEMD)
        $(INSTALL_DIR) $(DESTDIR)$(XEN_SYSTEMD_DIR)
        $(INSTALL_DIR) $(DESTDIR)$(XEN_SYSTEMD_MODULES_LOAD)
+       $(INSTALL_DIR) $(DESTDIR)$(XEN_SYSTEMD_SLEEP_DIR)
        $(INSTALL_DATA) *.service $(DESTDIR)$(XEN_SYSTEMD_DIR)
        $(INSTALL_DATA) *.mount $(DESTDIR)$(XEN_SYSTEMD_DIR)
        $(INSTALL_DATA) *.conf $(DESTDIR)$(XEN_SYSTEMD_MODULES_LOAD)
+       set -e; for i in $(XEN_SYSTEMD_SLEEP_SCRIPTS); \
+           do \
+           $(INSTALL_PROG) $$i $(DESTDIR)$(XEN_SYSTEMD_SLEEP_DIR); \
+       done
 
 .PHONY: uninstall
 uninstall:
        rm -f $(DESTDIR)$(XEN_SYSTEMD_MODULES_LOAD)/*.conf
        rm -f $(DESTDIR)$(XEN_SYSTEMD_DIR)/*.mount
        rm -f $(DESTDIR)$(XEN_SYSTEMD_DIR)/*.service
+       rm -f $(addprefix $(DESTDIR)$(XEN_SYSTEMD_SLEEP_DIR)/, 
$(XEN_SYSTEMD_SLEEP_SCRIPTS))
 
 $(XEN_SYSTEMD_MODULES):
        rm -f $@.tmp
diff --git a/tools/hotplug/Linux/systemd/xen-watchdog-sleep.sh 
b/tools/hotplug/Linux/systemd/xen-watchdog-sleep.sh
new file mode 100644
index 0000000000..2b2f0e16d8
--- /dev/null
+++ b/tools/hotplug/Linux/systemd/xen-watchdog-sleep.sh
@@ -0,0 +1,45 @@
+#!/bin/sh
+
+# The first argument ($1) is:
+#     "pre" or "post"
+# The second argument ($2) is:
+#     "suspend", "hibernate", "hybrid-sleep", or "suspend-then-hibernate"
+
+. /etc/xen/scripts/hotplugpath.sh
+
+SERVICE_NAME="xen-watchdog.service"
+STATE_FILE="/run/xen-watchdog-sleep-marker"
+XEN_WATCHDOG_SLEEP_LOG="${XEN_LOG_DIR}/xen-watchdog-sleep.log"
+
+log_watchdog() {
+    echo "$1"
+    echo "$(date): $1" >> "${XEN_WATCHDOG_SLEEP_LOG}"
+}
+
+# Exit silently if Xen watchdog service is not present
+if ! systemctl show "${SERVICE_NAME}" > /dev/null 2>&1; then
+    exit 0
+fi
+
+case "$1" in
+pre)
+    if systemctl is-active --quiet "${SERVICE_NAME}"; then
+        touch "${STATE_FILE}"
+        log_watchdog "Stopping ${SERVICE_NAME} before $2."
+        systemctl stop "${SERVICE_NAME}"
+    fi
+    ;;
+post)
+    if [ -f "${STATE_FILE}" ]; then
+        log_watchdog "Starting ${SERVICE_NAME} after $2."
+        systemctl start "${SERVICE_NAME}"
+        rm "${STATE_FILE}"
+    fi
+    ;;
+*)
+    log_watchdog "Script called with unknown action '$1'. Arguments: '$@'"
+    exit 1
+    ;;
+esac
+
+exit 0
-- 
2.48.1




 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.