Xen project Mailing List

[Xen-devel] [PATCH v10 10/11] libxl/xl: make it possible to specify soft-affinity in domain config file

From: Dario Faggioli <dario.faggioli@xxxxxxxxxx>

Date: Fri, 20 Jun 2014 18:20:12 +0200

Cc: Andrew.Cooper3@xxxxxxxxxx, Wei Liu <wei.liu2@xxxxxxxxxx>, Ian.Campbell@xxxxxxxxxx, Ian.Jackson@xxxxxxxxxx

Delivery-date: Fri, 20 Jun 2014 16:20:18 +0000

List-id: Xen developer discussion <xen-devel.lists.xen.org>

To do so, we add the vcpu_soft_affinity array to build_info, and
treat it much like vcpu_hard_affinity. The new config option is
called "cpus_soft".

Note that the vcpu_hard_affinity array, introduced in a previous
patch, and the vcpu_soft_affinity array, introduced here, share
the same LIBXL_HAVE_xxx macro, in libxl.h. That is called
LIBXL_HAVE_BUILDINFO_VCPU_AFFINITY_ARRAYS, and was introduced
together with vcpu_hard_affinity, but only inside a comment.
In this change, we uncomment it, and hence properly define it.

Signed-off-by: Dario Faggioli <dario.faggioli@xxxxxxxxxx>
---
Changes from v9:
 * patch reworked again, due to changes in the preceding ones
   in the series. The structure is similar, it's still based on
   adding some indirection, so that the same code can be used to
   parse and enact both hard and soft affinity, but the code did
   change, I'm afraid.

Changes from v8:
 * fix a typo in the LIBXL_HAVE_xxx macro name.

Changes from v7:
 * WARNING: this patch underwent quite a fundamental rework,
   given it's now building on top of Wei's "push vcpu affinity
   to libxl" patch. That's why I think it should be re-reviewed
   almost from scratch (sorry! :-P), and that's why I did not
   add IanC's ack, although he provided it for the v7 version
   of it.

Changes from v6:
 * update and improve the changelog.

Changes from v4:
 * fix typos and rephrase docs, as suggested during review;
 * more refactoring, i.e., more factoring out of potential
   common code, as requested during review.

Changes from v3:
 * fix typos and language issues in docs and comments, as
   suggested during review;
 * common code to soft and hard affinity parsing factored
   together, as requested during review.

Changes from v2:
 * use the new libxl API. Although the implementation changed
   only a little bit, I removed IanJ's Acked-by, although I am
   noting here that he did provide it, as requested.
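The "indirection" mentioned in the v9 notes can be pictured with a small standalone sketch. This is not the code from the patch: sketch_bitmap, sketch_build_info and sketch_affinity_array() are hypothetical stand-ins for libxl_bitmap, libxl_domain_build_info and the allocation step of parse_vcpu_affinity(), so that the snippet builds without libxl. It only shows the idea of picking the target array once, based on is_hard, and then running the same code on it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for libxl_bitmap. */
typedef struct { unsigned long bits; } sketch_bitmap;

/* Hypothetical stand-in for the affinity fields of libxl_domain_build_info. */
typedef struct {
    sketch_bitmap *vcpu_hard_affinity;
    int num_vcpu_hard_affinity;
    sketch_bitmap *vcpu_soft_affinity;
    int num_vcpu_soft_affinity;
} sketch_build_info;

/*
 * Allocate one zeroed bitmap per vcpu, store the array and its length in
 * the hard or soft fields depending on is_hard, and return the array so
 * the caller can fill it with the same code in both cases.
 * Returns NULL if allocation fails (the fields are then left untouched).
 */
static sketch_bitmap *sketch_affinity_array(sketch_build_info *b_info,
                                            int num_cpus, bool is_hard)
{
    sketch_bitmap *array;

    assert(b_info != NULL && num_cpus > 0);
    array = calloc((size_t)num_cpus, sizeof(*array));
    if (array == NULL)
        return NULL;

    if (is_hard) {
        b_info->num_vcpu_hard_affinity = num_cpus;
        b_info->vcpu_hard_affinity = array;
    } else {
        b_info->num_vcpu_soft_affinity = num_cpus;
        b_info->vcpu_soft_affinity = array;
    }
    return array;
}
```

A caller would invoke this once with is_hard set to true for "cpus" and once with false for "cpus_soft", then parse each entry into the returned array.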
---
 docs/man/xl.cfg.pod.5       |   23 ++++++++++++++++++++---
 tools/libxl/libxl.h         |    3 +--
 tools/libxl/libxl_dom.c     |   31 +++++++++++++++++++++++++------
 tools/libxl/libxl_types.idl |    1 +
 tools/libxl/xl_cmdimpl.c    |   39 ++++++++++++++++++++++++++++++---------
 5 files changed, 77 insertions(+), 20 deletions(-)

diff --git a/docs/man/xl.cfg.pod.5 b/docs/man/xl.cfg.pod.5
index af48622..25a4ff7 100644
--- a/docs/man/xl.cfg.pod.5
+++ b/docs/man/xl.cfg.pod.5
@@ -152,19 +152,36 @@ run on cpus 3,4,6,7,8 of the host.
 =back
 
 If this option is not specified, no vcpu to cpu pinning is established,
-and the vcpus of the guest can run on all the cpus of the host.
+and the vcpus of the guest can run on all the cpus of the host. If this
+option is specified, the intersection of the vcpu pinning mask, provided
+here, and the soft affinity mask, provided via B<cpus\_soft=> (if any),
+is utilized to compute the domain node-affinity, for driving memory
+allocations.
 
 If we are on a NUMA machine (i.e., if the host has more than one NUMA
 node) and this option is not specified, libxl automatically tries to
 place the guest on the least possible number of nodes. That, however,
 will not affect vcpu pinning, so the guest will still be able to run on
-all the cpus, it will just prefer the ones from the node it has been
-placed on. A heuristic approach is used for choosing the best node (or
+all the cpus. A heuristic approach is used for choosing the best node (or
 set of nodes), with the goals of maximizing performance for the guest
 and, at the same time, achieving efficient utilization of host cpus
 and memory. See F<docs/misc/xl-numa-placement.markdown> for more
 details.
 
+=item B<cpus_soft="CPU-LIST">
+
+Exactly as B<cpus=>, but specifies soft affinity, rather than pinning
+(hard affinity). When using the credit scheduler, this means what cpus
+the vcpus of the domain prefer.
+
+A C<CPU-LIST> is specified exactly as above, for B<cpus=>.
+
+If this option is not specified, the vcpus of the guest will not have
+any preference regarding on what cpu to run. If this option is specified,
+the intersection of the soft affinity mask, provided here, and the vcpu
+pinning, provided via B<cpus=> (if any), is utilized to compute the
+domain node-affinity, for driving memory allocations.
+
 =back
 
 =head3 CPU Scheduling
diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
index d63cd11..e0720f1 100644
--- a/tools/libxl/libxl.h
+++ b/tools/libxl/libxl.h
@@ -361,8 +361,7 @@ typedef struct libxl__ctx libxl_ctx;
  * Each bitmap should be big enough to accommodate the maximum number of
  * PCPUs of the host.
  */
-/* to be uncommented when soft array added */
-/* #define LIBXL_HAVE_BUILDINFO_VCPU_AFFINITY_ARRAYS 1 */
+#define LIBXL_HAVE_BUILDINFO_VCPU_AFFINITY_ARRAYS 1
 
 /*
  * LIBXL_HAVE_BUILDINFO_USBDEVICE_LIST
diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
index 81e77c0..076573a 100644
--- a/tools/libxl/libxl_dom.c
+++ b/tools/libxl/libxl_dom.c
@@ -272,18 +272,37 @@ int libxl__build_pre(libxl__gc *gc, uint32_t domid,
     if (info->nodemap.size)
         libxl_domain_set_nodeaffinity(ctx, domid, &info->nodemap);
     /* As mentioned in libxl.h, vcpu_hard_array takes precedence */
-    if (info->num_vcpu_hard_affinity) {
-        int i;
+    if (info->num_vcpu_hard_affinity || info->num_vcpu_soft_affinity) {
+        libxl_bitmap *hard_affinity, *soft_affinity;
+        int i, n_vcpus;
+
+        n_vcpus = info->num_vcpu_hard_affinity > info->num_vcpu_soft_affinity ?
+            info->num_vcpu_hard_affinity : info->num_vcpu_soft_affinity;
+
+        for (i = 0; i < n_vcpus; i++) {
+            /*
+             * Prepare hard and soft affinity pointers in a way that allows
+             * us to issue only one call to libxl_set_vcpuaffinity(), setting,
+             * for each vcpu, both hard and soft affinity "atomically".
+             */
+            hard_affinity = NULL;
+            if (info->num_vcpu_hard_affinity &&
+                i < info->num_vcpu_hard_affinity)
+                hard_affinity = &info->vcpu_hard_affinity[i];
+
+            soft_affinity = NULL;
+            if (info->num_vcpu_soft_affinity &&
+                i < info->num_vcpu_soft_affinity)
+                soft_affinity = &info->vcpu_soft_affinity[i];
 
-        for (i = 0; i < info->num_vcpu_hard_affinity; i++) {
             if (libxl_set_vcpuaffinity(ctx, domid, i,
-                                       &info->vcpu_hard_affinity[i],
-                                       NULL)) {
+                                       hard_affinity, soft_affinity)) {
                 LOG(ERROR, "setting affinity failed on vcpu `%d'", i);
                 return ERROR_FAIL;
             }
         }
-    } else if (info->cpumap.size) {
+    }
+    if (info->cpumap.size && !info->num_vcpu_hard_affinity) {
         /*
          * Although info->cpumap is DEPRECATED, we still want old
          * applications that may be using it to continue working.
diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index 477b2a6..18355d9 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -310,6 +310,7 @@ libxl_domain_build_info = Struct("domain_build_info",[
     # will prevail.
     ("nodemap", libxl_bitmap),
     ("vcpu_hard_affinity", Array(libxl_bitmap, "num_vcpu_hard_affinity")),
+    ("vcpu_soft_affinity", Array(libxl_bitmap, "num_vcpu_soft_affinity")),
     ("numa_placement", libxl_defbool),
     ("tsc_mode", libxl_tsc_mode),
     ("max_memkb", MemKB),
diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
index 8c94745..462da05 100644
--- a/tools/libxl/xl_cmdimpl.c
+++ b/tools/libxl/xl_cmdimpl.c
@@ -694,9 +694,10 @@ static void parse_top_level_sdl_options(XLU_Config *config,
 
 static void parse_vcpu_affinity(XLU_Config *config, XLU_ConfigList *cpus,
                                 libxl_domain_build_info *b_info,
-                                const char *buf, int num_cpus)
+                                const char *buf, int num_cpus, bool is_hard)
 {
-    const char *buf2 = NULL; //XXX Trick the compiler!!!
+    const char *buf2 = NULL; /* initialize to trick gcc >= 4.9.0. See below */
+    libxl_bitmap *vcpu_affinity_array;
     int j = 0;
 
     /*
@@ -714,24 +715,37 @@ static void parse_vcpu_affinity(XLU_Config *config, XLU_ConfigList *cpus,
     if (num_cpus > b_info->max_vcpus || buf)
         num_cpus = b_info->max_vcpus;
 
-    b_info->vcpu_hard_affinity = xmalloc(num_cpus * sizeof(libxl_bitmap));
+    if (is_hard) {
+        b_info->num_vcpu_hard_affinity = num_cpus;
+        b_info->vcpu_hard_affinity = xmalloc(num_cpus * sizeof(libxl_bitmap));
+        vcpu_affinity_array = b_info->vcpu_hard_affinity;
+    } else {
+        b_info->num_vcpu_soft_affinity = num_cpus;
+        b_info->vcpu_soft_affinity = xmalloc(num_cpus * sizeof(libxl_bitmap));
+        vcpu_affinity_array = b_info->vcpu_soft_affinity;
+    }
 
     while ((buf || (buf2 = xlu_cfg_get_listitem(cpus, j)) != NULL) &&
            j < num_cpus) {
-        libxl_bitmap_init(&b_info->vcpu_hard_affinity[j]);
-        if (libxl_cpu_bitmap_alloc(ctx,
-                                   &b_info->vcpu_hard_affinity[j], 0)) {
+        libxl_bitmap_init(&vcpu_affinity_array[j]);
+        if (libxl_cpu_bitmap_alloc(ctx, &vcpu_affinity_array[j], 0)) {
             fprintf(stderr, "Unable to allocate cpumap for vcpu %d\n", j);
             exit(1);
         }
 
-        if (vcpupin_parse(buf ? buf : buf2, &b_info->vcpu_hard_affinity[j]))
+        /*
+         * If buf2 is not initialized above, gcc >= 4.9.0 complains with
+         * a '[-Werror=maybe-uninitialized]'. That can't happen, though, since
+         * we use buf2 only if buf is NULL, and if we are here and buf is NULL,
+         * it means buf2 is non-NULL, and contains the j-eth element of the
+         * list, as per the condition of the while().
+         * */
+        if (vcpupin_parse(buf ? buf : buf2, &vcpu_affinity_array[j]))
             exit(1);
 
         j++;
     }
-    b_info->num_vcpu_hard_affinity = num_cpus;
 
     /* We have a list of cpumaps, disable automatic placement */
     libxl_defbool_set(&b_info->numa_placement, false);
@@ -847,7 +861,14 @@ static void parse_config_data(const char *config_source,
     buf = NULL; num_cpus = 0;
     if (!xlu_cfg_get_list (config, "cpus", &cpus, &num_cpus, 1) ||
         !xlu_cfg_get_string (config, "cpus", &buf, 0))
-        parse_vcpu_affinity(config, cpus, b_info, buf, num_cpus);
+        parse_vcpu_affinity(config, cpus, b_info, buf,
+                            num_cpus, /* is_hard */ true);
+
+    buf = NULL; num_cpus = 0;
+    if (!xlu_cfg_get_list (config, "cpus_soft", &cpus, &num_cpus, 1) ||
+        !xlu_cfg_get_string (config, "cpus_soft", &buf, 0))
+        parse_vcpu_affinity(config, cpus, b_info, buf,
+                            num_cpus, /* is_hard */ false);
 
     if (!xlu_cfg_get_long (config, "memory", &l, 0)) {
         b_info->max_memkb = l * 1024;

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
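The per-vcpu selection done in the libxl__build_pre() hunk above can be tried out in isolation with the sketch below. It is only an illustration: bitmap_t, affinity_info_t, affinity_vcpu_count() and affinity_pick() are hypothetical names that do not exist in libxl, and the snippet assumes (as the patch's use of libxl_set_vcpuaffinity() suggests) that a NULL mask means "leave that kind of affinity alone".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for libxl_bitmap; only its address matters here. */
typedef struct { unsigned long bits; } bitmap_t;

/* Hypothetical stand-in for the two arrays in libxl_domain_build_info. */
typedef struct {
    bitmap_t *hard; int num_hard;
    bitmap_t *soft; int num_soft;
} affinity_info_t;

/* Number of vcpus to visit: the length of the longer of the two arrays. */
static int affinity_vcpu_count(const affinity_info_t *info)
{
    return info->num_hard > info->num_soft ? info->num_hard : info->num_soft;
}

/*
 * For vcpu i, pick the hard and soft masks to hand to a single
 * "set affinity" call. A mask is NULL when its array has no entry
 * for vcpu i (assumed to mean: do not change that affinity).
 */
static void affinity_pick(const affinity_info_t *info, int i,
                          const bitmap_t **hard, const bitmap_t **soft)
{
    assert(i >= 0 && i < affinity_vcpu_count(info));
    *hard = i < info->num_hard ? &info->hard[i] : NULL;
    *soft = i < info->num_soft ? &info->soft[i] : NULL;
}
```

With two hard entries and three soft entries, vcpu 2 gets a NULL hard mask and its own soft mask, which mirrors what the loop in the patch passes to libxl_set_vcpuaffinity().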
