
Re: xenstored: EACCESS error accessing control/feature-balloon 1



Hi Andrew,

On 12/04/2023 17:05, Andrew Cooper wrote:
On 12/04/2023 4:48 pm, zithro wrote:
Hi all,

this is what I have in "xenstored-access.log" in dom0 :

[20230411T23:48:27.917Z]  D5         write     control/feature-balloon 1
[20230411T23:48:27.917Z]  D5         error     EACCES
[20230411T23:48:27.923Z]  D5         write     data/updated Wed Apr 12 01:48:27 CEST 2023

It happens once each minute, on two different hosts, both amd64.
(both hosts using the same config, only the hardware differs).

I tried looking for a similar bug, but didn't find one.
I apologize in advance if this error is known, and if this is not the
place to report this !

-----------------------
Technical infos
-----------------------
dom0s:

Debian stable, kernel 5.10.0-21-amd64
Xen 4.14.5
xl.conf has : autoballoon="0"
GRUB_CMDLINE_XEN="dom0_mem=2048M,max:2048M dom0_max_vcpus=4
dom0_vcpus_pin loglvl=all guest_loglvl=all ucode=scan iommu=verbose"
Running "xenstore-ls -f -p | grep balloon" returns no result
-----------------------
domUs (D5 in above logs):

HVM TrueNAS Core, based on FreeBSD 13.1-RELEASE-p7
(it also happened on previous FreeBSD releases, but I don't remember when
it started; the logs have been filled and rotated).
In cfg files, using either the same value for "memory" and "maxmem" or
only setting "memory" give the same results.

What's strange is that I have xen* commands in FreeNAS :

xen-detect        xenstore-control  xenstore-ls       xenstore-watch
xenstore          xenstore-exists   xenstore-read     xenstore-write
xenstore-chmod    xenstore-list     xenstore-rm

root@truenas[~]# xenstore-ls
xenstore-ls: xs_directory (/): Permission denied

root@truenas[~]# ps aux
root   [...]     0:36.98 [xenwatch]
root   [...]     0:01.01 [xenstore_rcv]
root   [...]     0:00.00 [balloon]
root   [...]     0:01.74 /bin/sh /usr/local/sbin/xe-daemon -p /var/run/xe-daemon.pid
[...]

The xe-daemon looks strange also, I don't use XenServer/XCP-ng, only
"raw" Xen.
And this script which hand

I also use PFsense domUs (based on FreeBSD), but they don't exhibit
this behaviour (ie. no xenstore access error in dom0, no xen*
commands in domU).

So is this a problem with TrueNAS rather than with Xen ?
If so I apologize for wasting your time.

Thanks, have a nice day !
(and as it's my first post here: thx for Xen, it rocks)

Hello,

(Leaving the full report intact so CC'd people can see it whole)

Yes, it is TrueNAS trying to re-write that file every minute.  It
appears that TrueNAS has inherited (from debian?) a rather old version
of https://github.com/xenserver/xe-guest-utilities/

https://xenbits.xen.org/docs/unstable/misc/xenstore-paths.html doesn't
list feature-balloon as a permitted feature node.

But, I suspect that it used to be the case that guests could write
arbitrary feature nodes, and I suspect we tightened the permissions in a
security fix to reduce worst-case memory usage of xenstored.

From a brief look, this is very similar to the patch below that was sent 3 years ago. I bet no-one ever tested the driver against libxl.

commit 30a970906038
Author: Vitaly Kuznetsov <vkuznets@xxxxxxxxxx>
Date:   Tue Sep 4 13:39:29 2018 +0200

    libxl: create control/sysrq xenstore node

    'xl sysrq' command doesn't work with modern Linux guests with the following
    message in guest's log:

     xen:manage: sysrq_handler: Error -13 writing sysrq in control/sysrq

    xenstore trace confirms:

     IN 0x24bd9a0 20180904 04:36:32 WRITE (control/sysrq )
     OUT 0x24bd9a0 20180904 04:36:32 ERROR (EACCES )

    The problem seems to be in the fact that we don't pre-create control/sysrq
    xenstore node and libxl_send_sysrq() doing libxl__xs_printf() creates it as
    read-only. As we want to allow guests to clean 'control/sysrq' after the
    requested action is performed, we need to make this node writable.

    Signed-off-by: Vitaly Kuznetsov <vkuznets@xxxxxxxxxx>
    Acked-by: Wei Liu <wei.liu2@xxxxxxxxxx>

diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
index 60676304e9b5..dcfde7787e2c 100644
--- a/tools/libxl/libxl_create.c
+++ b/tools/libxl/libxl_create.c
@@ -695,6 +695,9 @@ retry_transaction:
                         GCSPRINTF("%s/control/feature-s4", dom_path),
                         rwperm, ARRAY_SIZE(rwperm));
     }
+    libxl__xs_mknod(gc, t,
+                    GCSPRINTF("%s/control/sysrq", dom_path),
+                    rwperm, ARRAY_SIZE(rwperm));
     libxl__xs_mknod(gc, t,
                     GCSPRINTF("%s/device/suspend/event-channel", dom_path),
                     rwperm, ARRAY_SIZE(rwperm));


I suspect the best (/least bad) thing to do here is formally introduce
feature-balloon as a permitted node, and have the toolstack initialise it
to "" like we do with all other nodes, after which TrueNAS ought to be
able to set it successfully and not touch it a second time.

+1. This would match how libxl already deals with "feature-s3" & co.

Cheers,

--
Julien Grall



 

