[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v3 4/9] xen: add basic hypervisor filesystem support



On 03.02.20 16:07, Jan Beulich wrote:
On 21.01.2020 09:43, Juergen Gross wrote:
---
  xen/arch/arm/traps.c         |   1 +
  xen/arch/x86/hvm/hypercall.c |   1 +
  xen/arch/x86/hypercall.c     |   1 +
  xen/arch/x86/pv/hypercall.c  |   1 +
  xen/common/Makefile          |   1 +
  xen/common/hypfs.c           | 365 +++++++++++++++++++++++++++++++++++++++++++
  xen/include/public/hypfs.h   | 124 +++++++++++++++
  xen/include/public/xen.h     |   1 +
  xen/include/xen/hypercall.h  |   8 +
  xen/include/xen/hypfs.h      |  89 +++++++++++
  10 files changed, 592 insertions(+)

Even if it's just two structures that you have in the public
header, your assertion of the interface being guest bitness
agnostic should be accompanied by actual proof, i.e. addition
to xen/include/xlat.lst.

Okay.


+static int add_entry(struct hypfs_entry_dir *parent, struct hypfs_entry *new)
+{
+    int ret = -ENOENT;
+    struct hypfs_entry *e;
+
+    write_lock(&hypfs_lock);
+
+    list_for_each_entry ( e, &parent->dirlist, list )
+    {
+        int cmp = strcmp(e->name, new->name);
+
+        if ( cmp > 0 )
+        {
+            ret = 0;
+            list_add_tail(&new->list, &e->list);
+            break;
+        }
+        if ( cmp == 0 )
+        {
+            ret = -EEXIST;
+            break;
+        }
+    }
+
+    if ( ret == -ENOENT )
+    {
+        ret = 0;
+        list_add_tail(&new->list, &parent->dirlist);
+    }
+
+    if ( !ret )
+    {
+        unsigned int sz = strlen(new->name) + 1;
+
+        parent->e.size += DIRENTRY_SIZE(sz);

Would DIRENTRY_SIZE() perhaps better include the "+ 1"?

Good idea.


+int hypfs_add_entry(struct hypfs_entry_dir *parent,
+                    struct hypfs_entry *entry, bool nofault)
+{
+    int ret;
+
+    ret = add_entry(parent, entry);
+    BUG_ON(nofault && ret);
+
+    return ret;
+}

While this and its two siblings have no caller, the one above
doesn't even have a declaration in the header file. What is
this intended to be used for (from external callers)?

Oh, this seems to be a leftover from development phase. I'll remove
it.

I also think the "nofault" aspect could do with discussing in
the commit message - it seems quite odd to me.

I realized I was repeating the same code pattern multiple times, so
I decidied to have a common function for it. Will add something to the
commit message.


+static int hypfs_get_path_user(char *buf, XEN_GUEST_HANDLE_PARAM(void) uaddr,
+                               unsigned long len)

For consistency with naming elsewhere as well as uaddr here -
ulen?

Okay.


+static struct hypfs_entry *hypfs_get_entry_rel(struct hypfs_entry_dir *dir,
+                                               const char *path)
+{
+    const char *end;
+    struct hypfs_entry *entry;
+    unsigned int name_len;
+
+    if ( !*path )
+        return &dir->e;
+
+    if ( dir->e.type != XEN_HYPFS_TYPE_DIR )
+        return NULL;

Don't these two need switching around, to make sure /a/b/c/
doesn't match a non-dir /a/b/c ?

Oh, indeed.


+    end = strchr(path, '/');
+    if ( !end )
+        end = strchr(path, '\0');
+    name_len = end - path;
+
+    list_for_each_entry ( entry, &dir->dirlist, list )
+    {
+        int cmp = strncmp(path, entry->name, name_len);
+       struct hypfs_entry_dir *d = container_of(entry,

A hard tab slipped in here.

Sorry for that.


+                                                 struct hypfs_entry_dir, e);
+
+        if ( cmp < 0 )
+            return NULL;
+        if ( !cmp && strlen(entry->name) == name_len )
+            return *end ? hypfs_get_entry_rel(d, end + 1) : entry;
+    }
+
+    return NULL;
+}
+
+struct hypfs_entry *hypfs_get_entry(const char *path)
+{
+    if ( path[0] != '/' )
+        return NULL;
+
+    return hypfs_get_entry_rel(&hypfs_root, path + 1);
+}
+
+int hypfs_read_dir(const struct hypfs_entry *entry,
+                   XEN_GUEST_HANDLE_PARAM(void) uaddr)
+{
+    const struct hypfs_entry_dir *d;
+    struct hypfs_entry *e;

const?

Yes.


+static int hypfs_read(const struct hypfs_entry *entry,
+                      XEN_GUEST_HANDLE_PARAM(void) uaddr, unsigned long ulen)
+{
+    struct xen_hypfs_direntry e;
+    long ret = -EINVAL;
+
+    if ( ulen < sizeof(e) )
+        goto out;
+
+    e.flags = entry->write ? XEN_HYPFS_WRITEABLE : 0;
+    e.type = entry->type;
+    e.encoding = entry->encoding;
+    e.content_len = entry->size;
+
+    ret = -EFAULT;
+    if ( copy_to_guest(uaddr, &e, 1) )
+        goto out;
+
+    ret = 0;
+    if ( ulen < entry->size + sizeof(e) )
+        goto out;

So you return "success" even if the operation didn't complete
successfully. This isn't very nice, plus ...

The direntry contains the needed size. The caller should know the
size he passed to Xen.


+    guest_handle_add_offset(uaddr, sizeof(e));
+
+    ret = entry->read(entry, uaddr);

... how is the caller to know whether direntry was at least
copied if this then fails?

Is this really important? Normally -EFAULT should just not happen. In
case it does I don't think the caller can make real use of the direntry.


+int hypfs_write_leaf(struct hypfs_entry_leaf *leaf,
+                     XEN_GUEST_HANDLE_PARAM(void) uaddr, unsigned long ulen)
+{
+    char *buf;
+    int ret;
+
+    if ( ulen > leaf->e.size )
+        ulen = leaf->e.size;

Silent truncation?

Hmm, true. I'll modify it to return an error in that case.

+    buf = xzalloc_array(char, ulen);

Why the z variant?

Okay, will use xmalloc().


+    if ( !buf )
+        return -ENOMEM;
+
+    ret = -EFAULT;
+    if ( copy_from_guest(buf, uaddr, ulen) )
+        goto out;
+
+    ret = 0;
+    if ( leaf->e.type == XEN_HYPFS_TYPE_STRING )
+        buf[leaf->e.size - 1] = 0;

And possible further silent truncation? And if the incoming
buffer has no nul byte at ulen-1, you'll then ...

+    memcpy(leaf->write_ptr, buf, ulen);

... produce a strange concatenation of new and tail of old
contents?

Will rework by returning an error if the size is too large or there is
no terminating nul byte.


Anyway, this and ...

+ out:
+    xfree(buf);
+    return ret;
+}
+
+int hypfs_write_bool(struct hypfs_entry_leaf *leaf,
+                     XEN_GUEST_HANDLE_PARAM(void) uaddr, unsigned long ulen)
+{

... this function aren't very helpful to review without there
being a caller. Could these be introduced at the time a first
caller appears?

Of course. Question is where to stop. I wanted to have the basic hypfs
support in one patch. Are you fine with just those two functions being
moved to the runtime parameter patch?


+    union {
+        char buf[8];
+        uint8_t u8;
+        uint16_t u16;
+        uint32_t u32;
+        uint64_t u64;
+    } u;
+
+    ASSERT(leaf->e.type == XEN_HYPFS_TYPE_UINT && leaf->e.size <= 8);
+
+    if ( ulen != leaf->e.size )
+        return -EDOM;

Is this restriction really necessary? Setting e.g. a 4-byte
field from 1-byte input is no problem at all. This being for
booleans I anyway wonder why input might be helpful to have
larger than a single byte. But maybe all of this is again a
result of not seeing what a user of the function would look
like.

I wanted to have as little functionality as possible in the hypervisor.
It is no problem for the library to pass a properly sized buffer.

Allowing larger variables for booleans is just a consequence of the
hypervisor parameters allowing that.


+long do_hypfs_op(unsigned int cmd,
+                 XEN_GUEST_HANDLE_PARAM(void) arg1, unsigned long arg2,
+                 XEN_GUEST_HANDLE_PARAM(void) arg3, unsigned long arg4)
+{
+    int ret;
+    struct hypfs_entry *entry;
+    static char path[XEN_HYPFS_MAX_PATHLEN];
+
+    if ( !is_control_domain(current->domain) &&
+         !is_hardware_domain(current->domain) )
+        return -EPERM;
+
+    if ( cmd == XEN_HYPFS_OP_get_version )
+        return XEN_HYPFS_VERSION;
+
+    if ( cmd == XEN_HYPFS_OP_write_contents )
+        write_lock(&hypfs_lock);
+    else
+        read_lock(&hypfs_lock);
+
+    ret = hypfs_get_path_user(path, arg1, arg2);
+    if ( ret )
+        goto out;
+
+    entry = hypfs_get_entry(path);
+    if ( !entry )
+    {
+        ret = -ENOENT;
+        goto out;
+    }
+
+    switch ( cmd )
+    {
+    case XEN_HYPFS_OP_read_contents:
+        ret = hypfs_read(entry, arg3, arg4);
+        break;
+
+    case XEN_HYPFS_OP_write_contents:
+        ret = hypfs_write(entry, arg3, arg4);
+        break;
+
+    default:
+        ret = -ENOSYS;

EINVAL or EOPNOTSUPP please. ENOSYS should be used only for not
implemented top level functions.

Okay, will use EOPNOTSUPP.


--- /dev/null
+++ b/xen/include/public/hypfs.h
@@ -0,0 +1,124 @@
+/******************************************************************************
+ * Xen Hypervisor Filesystem
+ *
+ * Copyright (c) 2019, SUSE Software Solutions Germany GmbH
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to
+ * deal in the Software without restriction, including without limitation the
+ * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
+ * sell copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ *
+ */
+
+#ifndef __XEN_PUBLIC_HYPFS_H__
+#define __XEN_PUBLIC_HYPFS_H__
+
+#include "xen.h"
+
+/*
+ * Definitions for the __HYPERVISOR_hypfs_op hypercall.
+ */
+
+/* Highest version number of the hypfs interface currently defined. */
+#define XEN_HYPFS_VERSION      1

For this and the accompanying XEN_HYPFS_OP_get_version, at least
the doc added by patch 3 could actually do with mentioning the
intentions you have with this.

Okay.


+/* Maximum length of a path in the filesystem. */
+#define XEN_HYPFS_MAX_PATHLEN 1024
+
+struct xen_hypfs_direntry {
+    uint16_t flags;
+#define XEN_HYPFS_WRITEABLE    0x0001
+    uint8_t type;
+#define XEN_HYPFS_TYPE_DIR     0x0000
+#define XEN_HYPFS_TYPE_BLOB    0x0001
+#define XEN_HYPFS_TYPE_STRING  0x0002
+#define XEN_HYPFS_TYPE_UINT    0x0003
+#define XEN_HYPFS_TYPE_INT     0x0004
+#define XEN_HYPFS_TYPE_BOOL    0x0005
+    uint8_t encoding;
+#define XEN_HYPFS_ENC_PLAIN    0x0000
+#define XEN_HYPFS_ENC_GZIP     0x0001

Meaning I can e.g. have a gzip-ed string or bool (or even dir)?
If this is just for "blob", why have separate fields instead of
e.g. BLOB_RAW and BLOB_GZIP or some such?

gzip-ed string or blob are the primary targets.

Maybe we want to have other encoding s later (Andrew asked for that
possibility when I posted the patch for retrieving the .config file
contents early last year).

Also - why 4 digits in the constants when the fields are uint8_t?
Since these are enum-like, I even wonder if hex is warranted
here.

I'm fine with switching to plain integers.


+    uint32_t content_len;
+};
+
+struct xen_hypfs_dirlistentry {
+    struct xen_hypfs_direntry e;
+    /* Offset in bytes to next entry (0 == this is the last entry). */
+    uint16_t off_next;
+    char name[XEN_FLEX_ARRAY_DIM];
+};

The interaction of the last two fields may want spelling out:
I _assume_ name[] is nul-terminated, and off_next exists to
potentially skip trailing padding?

Yes. Will add a comment.


+/*
+ * Hypercall operations.
+ */
+
+/*
+ * XEN_HYPFS_OP_get_version
+ *
+ * Read highest interface version supported by the hypervisor.
+ *
+ * Possible return values:
+ * >0: highest supported interface version
+ * <0: negative Xen errno value
+ */
+#define XEN_HYPFS_OP_get_version     0
+
+/*
+ * XEN_HYPFS_OP_read_contents
+ *
+ * Read contents of a filesystem entry.
+ *
+ * Returns the direntry and contents of an entry in the buffer supplied by the
+ * caller (struct xen_hypfs_direntry with the contents following directly
+ * after it).
+ * The data buffer must be at least the size of the direntry returned in order
+ * to have success. If the data buffer was not large enough for all the data
+ * no entry data is returned, but the direntry will contain the needed size
+ * for the returned data.
+ * The format of the contents is according to its entry type and encoding.

What specifically this means for a dir would be nice to be spelled
out.

Okay.


+ * arg1: XEN_GUEST_HANDLE(path name)
+ * arg2: length of path name (including trailing zero byte)
+ * arg3: XEN_GUEST_HANDLE(data buffer written by hypervisor)
+ * arg4: data buffer size
+ *
+ * Possible return values:
+ * 0: success (at least the direntry was returned)
+ * <0 : negative Xen errno value
+ */
+#define XEN_HYPFS_OP_read_contents     1
+
+/*
+ * XEN_HYPFS_OP_write_contents
+ *
+ * Write contents of a filesystem entry.
+ *
+ * Writes an entry with the contents of a buffer supplied by the caller.
+ * The data type and encoding can't be changed. The size can be changed only
+ * for blobs and strings.
+ *
+ * arg1: XEN_GUEST_HANDLE(path name)
+ * arg2: length of path name (including trailing zero byte)
+ * arg3: XEN_GUEST_HANDLE(content buffer read by hypervisor)
+ * arg4: content buffer size
+ *
+ * Possible return values:
+ * 0: success
+ * <0 : negative Xen errno value
+ */
+#define XEN_HYPFS_OP_write_contents    2

This one indeed accesses only the actual data (contents) of the
referenced entry. XEN_HYPFS_OP_read_contents hands back also
the dir entry. Should the latter then better be named just
XEN_HYPFS_OP_read?

Yes, this might be better.


--- a/xen/include/xen/hypercall.h
+++ b/xen/include/xen/hypercall.h
@@ -150,6 +150,14 @@ do_dm_op(
      unsigned int nr_bufs,
      XEN_GUEST_HANDLE_PARAM(xen_dm_op_buf_t) bufs);
+extern long
+do_hypfs_op(
+    unsigned int cmd,
+    XEN_GUEST_HANDLE_PARAM(void) arg1,

Do you anticipate this parameter to be used for other than path
names? If not, perhaps better XEN_GUEST_HANDLE_PARAM(const_char)?

Yes, I think this can be changed.


--- /dev/null
+++ b/xen/include/xen/hypfs.h
@@ -0,0 +1,89 @@
+#ifndef __XEN_HYPFS_H__
+#define __XEN_HYPFS_H__
+
+#include <xen/list.h>
+#include <xen/string.h>
+#include <public/hypfs.h>
+
+struct hypfs_entry_leaf;
+
+struct hypfs_entry {
+    unsigned short type;
+    unsigned short encoding;
+    unsigned int size;
+    const char *name;
+    struct list_head list;
+    int (*read)(const struct hypfs_entry *entry,
+                XEN_GUEST_HANDLE_PARAM(void) uaddr);
+    int (*write)(struct hypfs_entry_leaf *leaf,
+                 XEN_GUEST_HANDLE_PARAM(void) uaddr, unsigned long ulen);
+};
+
+struct hypfs_entry_leaf {
+    struct hypfs_entry e;
+    union {
+        const void *content;
+        void *write_ptr;
+    };
+};
+
+struct hypfs_entry_dir {
+    struct hypfs_entry e;
+    struct list_head dirlist;
+};
+
+#define HYPFS_DIR_INIT(var, nam)                \
+    struct hypfs_entry_dir var = {              \
+        .e.type = XEN_HYPFS_TYPE_DIR,           \
+        .e.encoding = XEN_HYPFS_ENC_PLAIN,      \
+        .e.name = nam,                          \
+        .e.size = 0,                            \
+        .e.list = LIST_HEAD_INIT(var.e.list),   \
+        .e.read = hypfs_read_dir,               \
+        .dirlist = LIST_HEAD_INIT(var.dirlist), \
+    }
+
+/* Content and size need to be set via hypfs_string_set(). */
+#define HYPFS_STRING_INIT(var, nam)             \
+    struct hypfs_entry_leaf var = {             \
+        .e.type = XEN_HYPFS_TYPE_STRING,        \
+        .e.encoding = XEN_HYPFS_ENC_PLAIN,      \
+        .e.name = nam,                          \
+        .e.read = hypfs_read_leaf,              \
+    }
+
+static inline void hypfs_string_set(struct hypfs_entry_leaf *leaf,
+                                    const char *str)
+{
+    leaf->content = str;
+    leaf->e.size = strlen(str) + 1;
+}
+
+#define HYPFS_UINT_INIT(var, nam, uint)         \
+    struct hypfs_entry_leaf var = {             \
+        .e.type = XEN_HYPFS_TYPE_UINT,          \
+        .e.encoding = XEN_HYPFS_ENC_PLAIN,      \
+        .e.name = nam,                          \
+        .e.size = sizeof(uint),                 \
+        .e.read = hypfs_read_leaf,              \
+        .content = &uint,                       \
+    }

So you've got such helper macros for dir, string, and uint. Why
not e.g. int and bool?

There are no users in my series yet.


+
+
+extern struct hypfs_entry_dir hypfs_root;

No double blank lines please.

Oh, sorry.


+struct hypfs_entry *hypfs_get_entry(const char *path);

Does the only caller really need a non-const return type? Even
hypfs_write() doesn't look to modify what its leaf parameter
points at.

This might change when support for dynamically allocated strings or
blobs is added (I have no plans to do this right now, but its easy
to think about the need).


And is there indeed an expectation for this to be used from
outside of the source file it's defined in?

Yes. As soon as support for e.g. per-domain or per-cpupool nodes is
added this will be needed.


Juergen


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.