[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v10 6/9] libxl: build numa nodes memory blocks



On mer, 2014-09-03 at 00:24 -0400, Elena Ufimtseva wrote:
> Create the vmemrange structure based on the
> PV guests E820 map. Values are in in Megabytes.
> Also export the E820 filter code e820_sanitize
> out to be available internally.
>
Wait: how come we export something, so it will be available
internally? :-O

> As Xen can support mutiranges vNUMA nodes, for
> PV guest it is one range per one node. The other
> domain types should have their building routines
> implemented.
> 
> Changes since v8:
>     - Added setting of the vnuma memory ranges node ids;
> 
> Signed-off-by: Elena Ufimtseva <ufimtseva@xxxxxxxxx>
> ---

> --- a/tools/libxl/libxl_numa.c
> +++ b/tools/libxl/libxl_numa.c
> @@ -19,6 +19,10 @@
>  
>  #include "libxl_internal.h"
>  
> +#include "libxl_vnuma.h"
> +
> +#include "xc_private.h"
> +
>  /*
>   * What follows are helpers for generating all the k-combinations
>   * without repetitions of a set S with n elements in it. Formally
> @@ -508,6 +512,203 @@ int libxl__get_numa_candidate(libxl__gc *gc,
>  }
>  
>  /*
> + * Check if we can fit vnuma nodes to numa pnodes
> + * from vnode_to_pnode array.
> + */
> +bool libxl__vnodemap_is_usable(libxl__gc *gc,
> +                            libxl_domain_build_info *info)
> +{
I see why bool is being used, but I honestly think int, and then
returning proper LIBXL error codes, would be better, here.

Also, the name. You're checking/verifying that there is enough space, so
I'd make the function name reflect that (although, that is not something
super important).

Oh, and why is this function being defined in this patch? It's not used
anywhere in here and, although related, it seems to deal with something
quite different from what the rest of the code is about.

Perhaps you can move it somewhere else, and make this an x86 specific
patch, dealing with e820 and all the other stuff.

Finally, indentation (of the second parameters, in the new line) looks
wrong.

> +    unsigned int i;
> +    libxl_numainfo *ninfo = NULL;
> +    unsigned long long *claim;
>
In libxl_numainfo, free is an array of uint64_t. I think it'd be best to
make claim and free types match.

> +    unsigned int node;
>
This 'node' variable can live inside the first for.

> +    uint64_t *sz_array;
> +    int nr_nodes = 0;
> +
> +    /* Cannot use specified mapping if not NUMA machine. */
> +    ninfo = libxl_get_numainfo(CTX, &nr_nodes);
> +    if (ninfo == NULL)
> +        return false;
> +
For example, here you should return ERROR_FAIL, rather than just false.

> +    sz_array = info->vnuma_mem;
>
Do you really need this indirection? Why can't you use info->vnuma_mem
directly?

> +    claim = libxl__calloc(gc, info->vnodes, sizeof(*claim));
> +    /* Get total memory required on each physical node. */
> +    for (i = 0; i < info->vnodes; i++)
> +    {
> +        node = info->vnuma_vnodemap[i];
> +
> +        if (node < nr_nodes)
> +            claim[node] += (sz_array[i] << 20);
> +        else
> +            goto vnodemapout;
> +   }
> +   for (i = 0; i < nr_nodes; i++) {
> +       if (claim[i] > ninfo[i].free)
> +          /* Cannot complete user request, falling to default. */
> +          goto vnodemapout;
> +   }
>
Both these goto could well have been just 'return true'. If you're
making this returning int, they both can be 'return 0'.

> +
> + vnodemapout:
> +   return true;
>
BTW, how can you distinguish the cases where there is enough space on
the pnodes, from the one where there is not? I guess that will be more
clear when looking at the callsites of this function, but I'd put things
in such a way that makes it evident from just reading the code of this
very function...

> +}
> +

> +/*
> + * Returns number of absent pages within e820 map
> + * between start and end addresses passed. Needed
> + * to correctly set numa memory ranges for domain.
> + */
> +static unsigned long e820_memory_hole_size(unsigned long start,
> +                                            unsigned long end,
> +                                            struct e820entry e820[],
> +                                            unsigned int nr)
>
Indentation looks wrong here too.

Also, how about a more "talking" name for nr ?

Apart from these things, my e820-fu is very limited, so I'm not sure I
can say much about the function. FWIW, I agree with Ian that this should
live somewhere more architecture specific.

> +/*
> + * For each node, build memory block start and end addresses.
> + * Substract any memory hole from the range found in e820 map.
> + * vnode memory size are passed here in megabytes, the result is
> + * in memory block addresses.
> + * Linux kernel will adjust numa memory block sizes on its own.
> + * But we want to provide to the kernel numa block addresses that
> + * will be the same in kernel and hypervisor.
> + */
> +int libxl__vnuma_align_mem(libxl__gc *gc,
> +                            uint32_t domid,
> +                            /* IN: mem sizes in megabytes */
> +                            libxl_domain_build_info *b_info,
> +                            /* OUT: linux NUMA blocks addresses */
> +                            vmemrange_t *memblks)
> +{
> +    unsigned int i;
> +    int j, rc;
> +    uint64_t next_start_blk, end_max = 0, size;
> +    uint32_t nr;
> +    struct e820entry map[E820MAX];
> +
> +    errno = ERROR_INVAL;
> +    if (b_info->vnodes == 0)
> +        return -EINVAL;
>
Aren't we sure that the number of virtual nodes is at least 1 (or
perhaps even more!), at this point?

In any case, I think this should be ERROR_INVAL.

> +
> +    if (!memblks || !b_info->vnuma_mem)
> +        return -EINVAL;
> +
Same here.

> +    libxl_ctx *ctx = libxl__gc_owner(gc);
> +
> +    /* Retrieve e820 map for this host. */
> +    rc = xc_get_machine_memory_map(ctx->xch, map, E820MAX);
> +
> +    if (rc < 0) {
> +        errno = rc;
> +        return -EINVAL;
> +    }
>
I'm sure you should be returning a libxl error code. About errno, I
don't think that, as opposed to libxc, we ever set errno in libxl.

Actually, it should be xc_get_memory_map that, on error, sets errno and
returns -1. Whether that is what's happening or not, it's a different
story! :-)

> +    nr = rc;
>
I'd use nr directly, instead of fiddling with rc:

    nr = xc_get_machine_memory_map(...);
    if (nr < 0)
        return ERROR_FAIL;

Consider using a better name for nr here too (not super important, but
still...)

> +    rc = e820_sanitize(ctx, map, &nr, b_info->target_memkb,
> +                       (b_info->max_memkb - b_info->target_memkb) +
> +                       b_info->u.pv.slack_memkb);
> +    if (rc) {
> +        errno = rc;
> +        return -EINVAL;
> +    }
> +
Same here. Return a meaningful libxl error code (probably ERROR_FAIL),
and forget about errno.

And this is true throughout the rest of the patch.

> --- a/tools/libxl/libxl_x86.c
> +++ b/tools/libxl/libxl_x86.c
> @@ -1,5 +1,6 @@
>  #include "libxl_internal.h"
>  #include "libxl_arch.h"
> +#include "libxl_vnuma.h"
>  
>  static const char *e820_names(int type)
>  {
> @@ -14,7 +15,7 @@ static const char *e820_names(int type)
>      return "Unknown";
>  }
>  
> -static int e820_sanitize(libxl_ctx *ctx, struct e820entry src[],
> +int e820_sanitize(libxl_ctx *ctx, struct e820entry src[],
>                           uint32_t *nr_entries,
>                           unsigned long map_limitkb,
>                           unsigned long balloon_kb)
>
Doing it the other way around, i.e., not exporting this one, moving your
stuff in this file and figure out a way for calling it in common code,
is probably the way to go.

Regards,
Dario

-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)


-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)

Attachment: signature.asc
Description: This is a digitally signed message part

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.