
Re: [Xen-devel] Device model operation hypercall (DMOP, re qemu depriv)

On 01/08/16 12:32, Ian Jackson wrote:
I think we need to introduce a new hypercall (which I will call DMOP
for now) which may augment or replace some of HVMCTL.  Let me explain:

I believe the new 'DMOP' hypercall is a good idea, but following on
from discussions, I propose a revised design, which I present below.
Please let me know what you think.


DMOP (multi-buffer variant)


A previous proposal for a 'DMOP' was put forward by Ian Jackson on the 1st
of August. That proposal seemed very promising; however, a problem was
identified with it, which this proposal addresses.

The aim of DMOP, as before, is to prevent a compromised device model from
compromising domains other than the one it is associated with (and which
is therefore likely already compromised itself).

The previous proposal adds a DMOP hypercall, for use by device models,
which places the domain ID in a fixed place within the calling args,
such that the privcmd driver can always find it, and need not know any
further details about the particular DMOP in order to validate it against
the previously set (via ioctl) domain.

The problem occurs when you have a DMOP with references to other user memory
objects, such as with Track Dirty VRAM (as used with the VGA buffer).
In this case, the address of this other user object needs to be vetted,
to ensure it is not within a restricted address range, such as kernel
memory. The real problem comes down to how you would vet this address:
the ideal place to do this is within the privcmd driver, since it has
knowledge of the address space involved. However, since a principal goal
of DMOP is to keep privcmd free from any knowledge of DMOP's sub-ops,
it would have no way to identify any user buffer addresses that need
checking.  The alternative of having the hypervisor vet the address
is also problematic, since it has no knowledge of the guest memory layout.
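To make the difficulty concrete, here is a sketch (the struct and field names are hypothetical, for illustration only) of what a single-buffer-style Track Dirty VRAM sub-op would look like, with the auxiliary pointer embedded where privcmd cannot see it:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical sketch of the previous, single-buffer style: the
 * track-dirty-VRAM sub-op embeds a raw user pointer to the dirty
 * bitmap inside its own parameter block.  privcmd cannot vet this
 * pointer without understanding this specific sub-op's layout. */
struct old_style_track_dirty_vram {
    uint64_t first_pfn;
    uint64_t nr_pfns;
    uint64_t dirty_bitmap_addr;  /* raw user pointer - invisible to privcmd */
};
```

Auditing `dirty_bitmap_addr` would require privcmd to parse every sub-op's layout, which is exactly the coupling DMOP is meant to avoid.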

The Design

As with the previous design, we provide a new restriction ioctl, which
takes a domid parameter.  After that restriction ioctl is called, the
privcmd driver will permit only DMOP hypercalls, and only with the
specified target domid.
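As a sketch of what the restriction interface could look like (the ioctl name, number, and argument layout here are illustrative assumptions, not a fixed ABI):

```c
#include <assert.h>
#include <stdint.h>
#include <sys/ioctl.h>

typedef uint16_t domid_t;

/* Hypothetical restriction ioctl: after a successful call, this file
 * descriptor may only issue DMOP hypercalls targeting 'domid'. */
struct privcmd_dmop_restrict {
    domid_t domid;
};

#define IOCTL_PRIVCMD_DMOP_RESTRICT \
    _IOW('P', 100, struct privcmd_dmop_restrict)
```

A device model would open the privcmd device, issue this ioctl once with its target domain, and thereafter be unable to direct hypercalls at any other domain through that file descriptor.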

In the previous design, a DMOP consisted of one buffer, containing all of
the DMOP parameters, which might include further explicit references to
more buffers.  In this design, an array of buffers, with its length, is
presented, with the first buffer containing the DMOP parameters, which may
implicitly refer to further buffers within the array. Here, the only
user buffers passed are those found within the array, and so all of them
can be audited by privcmd.  Since the length of the buffers array is
passed, privcmd does not need to know which DMOP it is in order to audit
them.

If the hypervisor ends up with the wrong number of buffers, it can reject
the DMOP at that point.

The following code illustrates this idea:

typedef struct dm_op_buffer {
    XEN_GUEST_HANDLE(void) h;
    size_t len;
} dm_op_buffer_t;

int do_device_model_op(
    domid_t domid,
    unsigned int nr_buffers,
    XEN_GUEST_HANDLE_PARAM(dm_op_buffer_t) buffers)

@domid: the domain the hypercall operates on.
@nr_buffers: the number of buffers in the @buffers array.

@buffers: an array of buffers.  @buffers[0] contains device_model_op - the
structure describing the sub-op and its parameters. @buffers[1], @buffers[2]
etc. may be used by a sub-op for passing additional buffers.

struct device_model_op {
    uint32_t op;
    union {
         struct op_1 op1;
         struct op_2 op2;
         /* etc... */
    } u;
};

It is forbidden for the above struct (device_model_op) to contain any
guest handles - if any are needed, the data should instead be passed in
separate buffers within the @buffers array.
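Under this rule, a sub-op that needs an auxiliary buffer names it by index into @buffers rather than by handle. A hypothetical multi-buffer version of Track Dirty VRAM might then look like this (the struct and field names are illustrative, not part of the proposal):

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical multi-buffer track-dirty-VRAM sub-op: the dirty bitmap
 * is not pointed to directly; instead an index into the buffers[]
 * array names it.  privcmd audits that entry like any other, without
 * needing to know this sub-op exists. */
struct op_track_dirty_vram {
    uint64_t first_pfn;
    uint64_t nr_pfns;
    uint32_t bitmap_buf_idx;   /* index into the buffers[] array */
    uint32_t pad;
};
```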

Validation by privcmd driver

If the privcmd driver has been restricted to a specific domain (using
the new ioctl), then when it receives an op, it will:

1. Check that the hypercall is a DMOP.

2. Check domid == restricted domid.

3. For each of the @nr_buffers entries in @buffers: check that @h and
   @len describe a buffer wholly within the user-space part of the
   virtual address space (e.g., on Linux, using access_ok()).
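The checks above can be sketched as follows. Check 1 would happen in the hypercall dispatch path and is noted only as a comment here; access_ok() is replaced by a simple stand-in with an assumed user/kernel address split, so the logic is self-contained:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint16_t domid_t;

struct dm_op_buffer {
    void *h;        /* stand-in for XEN_GUEST_HANDLE(void) */
    size_t len;
};

/* Stand-in for the kernel's access_ok(): here, "user space" is simply
 * everything below an assumed boundary. */
#define TASK_SIZE_STUB ((uintptr_t)1 << 47)
static int access_ok_stub(const void *p, size_t len)
{
    uintptr_t a = (uintptr_t)p;
    return a < TASK_SIZE_STUB && len <= TASK_SIZE_STUB - a;
}

/* Check 1 (hypercall number == DMOP) is assumed to have been done by
 * the dispatch code before this function is reached.
 * Returns 0 if the DMOP may be forwarded to Xen. */
static int privcmd_validate_dmop(domid_t restricted_domid,
                                 domid_t domid,
                                 unsigned int nr_buffers,
                                 const struct dm_op_buffer *buffers)
{
    unsigned int i;

    if (domid != restricted_domid)       /* check 2 */
        return -1;

    for (i = 0; i < nr_buffers; i++)     /* check 3 */
        if (!access_ok_stub(buffers[i].h, buffers[i].len))
            return -1;

    return 0;
}
```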

Xen Implementation

Since a DMOP sub-op may need to copy a buffer from, or return a buffer
to, the guest - as well as the DMOP itself needing to fetch the initial
parameter buffer - functions for doing this would be written as below.
Note that care is taken to prevent damage from buffer under- or over-run
situations.  If a supplied buffer is shorter than expected, the remainder
will read as zeros, while any excess length is ignored.

int copy_dm_buffer_from_guest(
    void *dst,                        /* Kernel destination buffer      */
    size_t max_len,                   /* Size of destination buffer     */
    XEN_GUEST_HANDLE_PARAM(dm_op_buffer_t) buffers,
                                      /* dm_op_buffers passed into DMOP */
    unsigned int nr_buffers,          /* Total number of dm_op_buffers  */
    unsigned int idx)                 /* Index of buffer we require     */
{
    struct dm_op_buffer buffer;
    size_t len;

    memset(dst, 0, max_len);

    if ( idx >= nr_buffers )
        return -EFAULT;

    if ( copy_from_guest_offset(&buffer, buffers, idx, 1) )
        return -EFAULT;

    len = min(max_len, buffer.len);

    if ( raw_copy_from_guest(dst, buffer.h, len) )
        return -EFAULT;

    return 0;
}

int copy_dm_buffer_to_guest(...)
    /* Similar to the above, except copying in the other
       direction. */
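For concreteness, a sketch of that mirror-image helper, with Xen's guest-access primitives replaced by plain-pointer stand-ins so that the clamping logic is self-contained (the stub names are mine, not the proposal's):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for dm_op_buffer: 'h' is modelled as a plain pointer. */
struct dm_op_buffer {
    void *h;
    size_t len;
};

#define MIN(a, b) ((a) < (b) ? (a) : (b))

/* Copy at most max_len bytes from 'src' into the guest buffer at
 * 'idx'.  As with the from-guest direction, a missing buffer is an
 * error, and an over-long source is truncated to the guest buffer's
 * length. */
static int copy_dm_buffer_to_guest_stub(const struct dm_op_buffer *buffers,
                                        unsigned int nr_buffers,
                                        unsigned int idx,
                                        const void *src, size_t max_len)
{
    size_t len;

    if (idx >= nr_buffers)
        return -1;

    len = MIN(max_len, buffers[idx].len);
    memcpy(buffers[idx].h, src, len);

    return 0;
}
```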

This leaves do_device_model_op easy to implement as below:

int do_device_model_op(domid_t domid,
    unsigned int nr_buffers,
    XEN_GUEST_HANDLE_PARAM(dm_op_buffer_t) buffers)
{
    struct device_model_op op;
    int ret;

    ret = copy_dm_buffer_from_guest(&op, sizeof(op), buffers, nr_buffers, 0);
    if ( ret < 0 )
        return ret;

    switch ( op.op )
    {
    case DMOP_sub_op1:
        /* ... */
        break;
    /* etc. */
    }

    return 0;
}
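From the device model's side, issuing a sub-op then amounts to filling in the buffers array and making a single hypercall. A hypothetical caller-side helper (the op number, sub-op layout, and field names are illustrative assumptions):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct dm_op_buffer {
    void *h;
    size_t len;
};

/* Hypothetical sub-op number and parameter layout, for illustration. */
#define DMOP_sub_op1 1

struct device_model_op {
    uint32_t op;
    union {
        struct { uint32_t some_param; } op1;
    } u;
};

/* Build the argument array for a DMOP with its parameter block in
 * buffers[0] and one auxiliary data buffer in buffers[1].  Returns
 * the nr_buffers value to pass to the hypercall. */
static unsigned int build_dmop(struct dm_op_buffer *buffers,
                               struct device_model_op *op,
                               void *aux, size_t aux_len)
{
    memset(op, 0, sizeof(*op));
    op->op = DMOP_sub_op1;
    op->u.op1.some_param = 42;

    buffers[0].h = op;
    buffers[0].len = sizeof(*op);
    buffers[1].h = aux;
    buffers[1].len = aux_len;

    return 2;
}
```

The restricted privcmd driver can then audit both entries of the array before forwarding the call, without parsing the parameter block at all.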


Advantages of this system over previous DMOP proposals:

*  The validation of address ranges is easily done by the privcmd driver,
   using standard kernel mechanisms.  There is no need for Xen to reason
   about guest memory layout, of which it should remain independent, and
   which would potentially add confusion.

*  Arbitrary number of additional address ranges validated with same
   mechanism as the initial parameter block.

*  No need for any new copy_from_guest() variants in the hypervisor,
   which, among other things, prevents code from using the wrong one by
   mistake and potentially bypassing security.

And as with the original DMOP proposal:

*  The privcmd driver, and any other kernel parts, will not need to be
   updated when new DMOPs are added or changed.

Disadvantages of this system:

* Minor stylistic issue relating to buffers being implicitly referred to.

Xen-devel mailing list


