[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v8 1/9] xen: vnuma topology and subop hypercalls



>>> On 26.08.14 at 09:43, <ufimtseva@xxxxxxxxx> wrote:
> +static struct vnuma_info *vnuma_alloc(unsigned int nr_vnodes,
> +                                      unsigned int nr_regions,
> +                                      unsigned int nr_vcpus)
> +{
> +
> +    struct vnuma_info *vnuma;
> +
> +    /*
> +     * Check if any of the allocations are bigger than PAGE_SIZE.
> +     * See XSA-77.
> +     */
> +    if ( nr_vnodes * nr_vnodes > (PAGE_SIZE / sizeof(unsigned int)) ||
> +         nr_regions > (PAGE_SIZE / sizeof(vmemrange_t)) )

I don't recall whether I said this in replies to earlier rounds of this
series, or just others recently, but please avoid type disconnects
like the ones above. You really want

    if ( nr_vnodes * nr_vnodes > (PAGE_SIZE / sizeof(*vnuma->vdistance)) ||
         nr_regions > (PAGE_SIZE / sizeof(*vnuma->vmemrange)) )

so that if the fields' types change (more likely for the former than
the latter), these expressions can't become stale.

> +int vnuma_init(const struct xen_domctl_vnuma *u_vnuma,
> +                const struct domain *d,
> +                struct vnuma_info **dst)
> +{
> +    unsigned int nr_vnodes;
> +    int ret = -EINVAL;
> +    struct vnuma_info *v = NULL;

Pointless initializer. Also I'm relatively certain I said before that "v"
is conventionally used an name for struct vcpu * instances, so
naming the variable e.g. "info" would be preferable.

Similar btw for the first function parameter: We commonly use
the u_ prefix for handles passed into hypercalls. Naming the
parameter e.g. "uinfo" would seem better.

> +    case XEN_DOMCTL_setvnumainfo:
> +    {
> +        struct vnuma_info *vnuma = NULL;
> +
> +        ret = vnuma_init(&op->u.vnuma, d, &vnuma);

So you nicely switch to using xen/err.h definitions for vnuma_alloc().
Why not at once for vnuma_init()?

> +    case XENMEM_get_vnumainfo:
> +    {
> +        struct vnuma_topology_info topology;
> +        struct domain *d;
> +        unsigned int dom_vnodes, dom_regions, dom_vcpus;
> +        struct vnuma_info tmp;
> +
> +        /*
> +         * guest passes nr_vnodes and nr_vcpus thus
> +         * we know how much memory guest has allocated.
> +         */

I think the comment is relatively pointless. But if you want to keep it,
it needs to start with a capital G.

> +        if ( copy_from_guest(&topology, arg, 1 ))
> +            return -EFAULT;
> +
> +        if ( (d = rcu_lock_domain_by_any_id(topology.domid)) == NULL )
> +            return -ESRCH;
> +
> +        read_lock(&d->vnuma_rwlock);
> +
> +        if ( d->vnuma == NULL )
> +        {
> +            read_unlock(&d->vnuma_rwlock);
> +            rcu_unlock_domain(d);
> +            return -EOPNOTSUPP;
> +        }
> +
> +        dom_vnodes = d->vnuma->nr_vnodes;
> +        dom_regions = d->vnuma->nr_regions;
> +        dom_vcpus = d->max_vcpus;
> +
> +        /*
> +         * guest nr_cpus and nr_nodes may differ from domain vnuma config.
> +         * Check here guest nr_nodes and nr_cpus to make sure we dont 
> overflow.
> +         */
> +        rc = -ENOBUFS;
> +
> +        if ( topology.nr_vnodes < dom_vnodes  ||
> +             topology.nr_vcpus < dom_vcpus    ||
> +             topology.nr_regions < dom_regions )
> +        {
> +            read_unlock(&d->vnuma_rwlock);
> +
> +            topology.nr_vnodes = dom_vnodes;
> +            topology.nr_vcpus = dom_vcpus;
> +            topology.nr_regions = dom_regions;
> +
> +            /* Copy back needed values. */
> +             __copy_to_guest(arg, &topology, 1);
> +
> +            rcu_unlock_domain(d);
> +
> +            break;

"return -ENOBUFS;" to match the earlier error exit path. Plus pull
ahead the rcu_unlock_domain() - it can come right after the
read_unlock().

> +        }
> +
> +        rc = -ENOMEM;
> +
> +        /* TODO: remove locking around xmalloc. */

Indeed, please, if at all possible. I know we have bad examples in
the code elsewhere (i.e. we can live with it), but it's not nice to add
more of them when they can be avoided.

Also please use the setting of error status codes prior to an if()
only when the body of that if() then really becomes a single
statement (i.e. not needing curly braces), or when the same
status code is to be used by more than one subsequent exit
paths. In all other cases please set the status inside the inner
block.

> --- a/xen/include/public/domctl.h
> +++ b/xen/include/public/domctl.h
> @@ -35,6 +35,7 @@
>  #include "xen.h"
>  #include "grant_table.h"
>  #include "hvm/save.h"
> +#include "memory.h"
>  
>  #define XEN_DOMCTL_INTERFACE_VERSION 0x0000000a
>  
> @@ -934,6 +935,32 @@ struct xen_domctl_vcpu_msrs {
>  };
>  typedef struct xen_domctl_vcpu_msrs xen_domctl_vcpu_msrs_t;
>  DEFINE_XEN_GUEST_HANDLE(xen_domctl_vcpu_msrs_t);
> +
> +/*
> + * Use in XEN_DOMCTL_setvnumainfo to set
> + * vNUMA domain topology.
> + */
> +struct xen_domctl_vnuma {
> +    uint32_t nr_vnodes;
> +    uint32_t nr_regions;
> +    XEN_GUEST_HANDLE_64(uint) vdistance;
> +    XEN_GUEST_HANDLE_64(uint) vcpu_to_vnode;

So how is it again that caller and hypervisor agree on dimension of this
array?

> --- a/xen/include/public/memory.h
> +++ b/xen/include/public/memory.h
> @@ -521,9 +521,59 @@ DEFINE_XEN_GUEST_HANDLE(xen_mem_sharing_op_t);
>   * The zero value is appropiate.
>   */
>  
> +/* vNUMA node memory ranges */
> +struct vmemrange {
> +    uint64_t start, end;
> +    unsigned int flags;

Iirc I asked this to be added _and_ checked to be zero; I don't think
I saw any zero check.

> +/*
> + * vNUMA topology specifies vNUMA node number, distance table,
> + * memory ranges and vcpu mapping provided for guests.
> + * XENMEM_get_vnumainfo hypercall expects to see from guest
> + * nr_vnodes and nr_vcpus to indicate available memory. After
> + * filling guests structures, nr_vnodes and nr_vcpus copied
> + * back to guest.
> + */

Stale comment (lacking mention of nr_regions)?

> +struct vnuma_topology_info {
> +    /* IN */
> +    domid_t domid;
> +    uint16_t pad;

Please also make sure you check this field to be zero when passed in,
so it can be made use of in the future.

> +    /* IN/OUT */
> +    unsigned int nr_vnodes;
> +    unsigned int nr_vcpus;
> +    /* for multi-region vnodes */

I think this comment is pretty pointless too.

> +    unsigned int nr_regions;

To match it up with the field it relates to (below), perhaps it would
better be named nr_vmemranges or at least nr_ranges?

Jan

> +    /* OUT */
> +    union {
> +        XEN_GUEST_HANDLE(uint) h;
> +        uint64_t pad;
> +    } vdistance;
> +    union {
> +        XEN_GUEST_HANDLE(uint) h;
> +        uint64_t pad;
> +    } vcpu_to_vnode;
> +    union {
> +        XEN_GUEST_HANDLE(vmemrange_t) h;
> +        uint64_t pad;
> +    } vmemrange;
> +};


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.