Re: [Xen-devel] [PATCH v10 9/9] libxl: vnuma topology configuration parser and doc
Hello Dario,

Thank you for such a thorough review. And you are right, it will be changed slightly,
as we agreed on a different approach to declaring vNUMA in libxl :)

On Thu, Sep 11, 2014 at 1:13 PM, Dario Faggioli <dario.faggioli@xxxxxxxxxx> wrote:
> So, as far as I understood it, v11 did not include a toolstack part, so
> this version is what I should be looking at, is this correct?
>
> I'm also looking at the commits in the github repo (for v11), and the
> code seems to be the same as here.
>
> Please, do correct me if I'm wrong. Hopefully, the comments I'm leaving
> about this version, will be useful anyway.
>
> Also, as we agreed upon changing the libxl data structures for this
> feature (i.e., the vNUMA related field names and types), the review will
> focus on the behavior of the xl parser (and on the structure of the
> parsing code, of course), and neglect that part.
>
> So, first of all, about the subject: 'libxl: vnuma...'. This patch looks
> more about xl than libxl (so I'd expect something like 'xl: vnuma...').
>
> On mer, 2014-09-03 at 00:24 -0400, Elena Ufimtseva wrote:
>> --- a/docs/man/xl.cfg.pod.5
>> +++ b/docs/man/xl.cfg.pod.5
>> @@ -264,6 +264,83 @@ if the values of B<memory=> and B<maxmem=> differ.
>> A "pre-ballooned" HVM guest needs a balloon driver, without a balloon driver
>> it will crash.
>>
> In general, about the config file interface, what I think we want
> a user to be able to specify is:
>
>  1. number of virtual NUMA nodes
>     (default: 1)
>  2. amount of memory on each virtual NUMA node
>     (default: total tot_memory / nr_nodes)
>  3. the 'distances' from each node to each other one
>     (default: something sane :-) )
>  4. what vcpus on each virtual NUMA node
>     (default: distribute evenly)
>  5. virtual node to physical node mapping
>     (default: driven by the existing automatic placement logic, or
>      by pinning)
>
> Which is pretty much what this patch allows to do, so, good! :-D
>
> The syntax used for each of these parameters is also mostly fine, IMO,
> with the only exception of the mapping between vCPUs and vNODEs (more on
> that below).
>
> About the name of the parameters, in this patch, we have:
>
>  1. "vnuma_nodes"
>  2. "vnuma_mem"
>  3. "vdistance"
>  4. "vnuma_cpumap"
>  5. "vnuma_nodemap"
>
> I have to admit, I don't especially like these, so here's my proposal.
> I'm of course open to other views here, I'm actually just tossing names
> there (with a brief rationale and explanation)... :-)
>
>  1. "vnodes": for vCPUs, we have "vcpus", I think something along the
>     same line would make sense;
>  2. "vnodes_memory": same reasoning as above (BTW, we may want to go
>     for something like "vnodes_maxmem", as that is probably what we're
>     actually specifying, isn't it? I'm not sure how important it is to
>     stress that, though);
>  3. "vnodes_distance[s]": or just "vnodes_dist[s]"
>  4. "vcpus_to_vnodes"
>  5. "vnodes_to_pnodes": not particularly proud of this, but can't think
>     of anything better right now.
>
> Thoughts?

Yep, totally fine with these names.

>
>> +=item B<vnuma_nodes=N>
>> +
>> +Number of vNUMA nodes the guest will be initialized with on boot.
>> +PV guest by default will have one vnuma node.
>> +
> It's not only PV guests that will have, by default, just one node...
> This is true for every guest (the reason being vNUMA is not even
> implemented for such other guest types!! :-) ).
>
> As it stands, it looks to me like other kinds of guest than PV can have a
> different number of NUMA nodes by default.
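Just to make the proposed interface concrete, a config fragment using the
names suggested above could look like this (purely illustrative, since the
names are not settled yet):

    vnodes           = 2
    vnodes_memory    = [2048, 2048]
    vnodes_distances = [10, 20]
    vcpus_to_vnodes  = ["0-3", "4-7"]
    vnodes_to_pnodes = [0, 1]

i.e. an 8-vCPU guest (vcpus=8, memory=4096) split evenly across two virtual
nodes, with local/remote distances of 10/20, vnode0 placed on pnode0 and
vnode1 on pnode1.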
> >> +=item B<vnuma_mem=[vmem1, vmem2, ...]> >> + >> +List of memory sizes for each node, defined in MBytes. Number of items >> listed must >> +match nr_vnodes. >> > "must match the number of virtual NUMA nodes specified by > B<vnuma_nodes=>" > >> If the sum of all vnode memories does not match the domain memory >> > ", specified by B<memory=> and/or B<maxmem=>," > >> +or there are missing nodes, it will fail. >> +If not specified, memory will be equally split between vnodes. Current >> minimum >> +memory size for one node is limited by 32MB. >> + >> +Example: vnuma_mem=[1024, 1024, 2048, 2048] >> +Total amount of memory in guest: 6GB >> + > >> +=item B<vnuma_vcpumap=[node_nr, node_nr, ...]> >> + >> +Defines vcpu to vnode mapping as a list of integers. The position in the >> list >> +is a vcpu number, and the value is the vnode number to which the vcpu will >> be >> +assigned to. >> +Current limitations: >> +- vNUMA node must have at least one vcpu, otherwise default vcpu_to_vnode >> will be used. >> +- Total number of vnodes cannot be bigger then number of vcpus. >> + >> +Example: >> +Map of 4 vcpus to 2 vnodes: >> +0,1 vcpu -> vnode0 >> +2,3 vcpu -> vnode1: >> + >> +vnuma_vcpumap = [0, 0, 1, 1] >> + 4 vcpus here - 0 1 2 3 >> + >> > I wonder whether it would be better to turn this into something lieke > the below: > > ["0,1,2,3", "4-7", "8,10,12,14", "9,11,13,15"] Yep, I think its good to have more unified approach. > > With the following meaning: there are 4 nodes, and vCPUs 0,1,2,3 belongs > to node 0, vCPUs 4,5,6,7 belongs to node 1, etc. > > It looks a lot more easy to use to me. With the current syntax, the same > configuration would be expressed like this: > > [0,0,0,0,1,1,1,1,2,3,2,3,2,3,2,3] > > Quite hard to get right when both writing and reading it, isn't it? > > Elena's example looks like this: > > ["0,1", "2,3"] (with my notation) > [0, 0, 1, 1] (with this patch notation) > > From an implementation point of view, the code for parsing strings like > these (and also for parsing list of strings like this), is already there > in xl, so it shouldn't even be too big of a piece of work... > >> +=item B<vnuma_vnodemap=[p1, p2, ..., pn]> >> + >> +List of physical node numbers, position in the list represents vnode number. >> +Used for manual placement of vnuma nodes to physical NUMA nodes. >> +Will not be used if automatic numa placement is active. >> + >> +Example: >> +assume NUMA machine with 4 physical nodes. Placing vnuma node 0 to pnode 2, >> +vnuma node 1 to pnode 3: >> +vnode0 -> pnode2 >> +vnode1 -> pnode3 >> + >> +vnuma_vnodemap=[2, 3] >> +first vnode will be placed on node 2, second on node 3. >> + >> > I'm fine with this. > >> +=item B<vnuma_autoplacement=[0|1]> >> + > To be killed. 
:-) > >> =head3 Event Actions >> diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c >> index 409a795..1af2250 100644 >> --- a/tools/libxl/xl_cmdimpl.c >> +++ b/tools/libxl/xl_cmdimpl.c >> @@ -40,6 +40,7 @@ >> #include "libxl_json.h" >> #include "libxlutil.h" >> #include "xl.h" >> +#include "libxl_vnuma.h" >> >> /* For calls which return an errno on failure */ >> #define CHK_ERRNOVAL( call ) ({ \ >> @@ -797,6 +798,432 @@ static void >> parse_vcpu_affinity(libxl_domain_build_info *b_info, >> } >> } >> >> +static unsigned int get_list_item_uint(XLU_ConfigList *list, unsigned int i) >> +{ >> + const char *buf; >> + char *ep; >> + unsigned long ul; >> + int rc = -EINVAL; >> + >> + buf = xlu_cfg_get_listitem(list, i); >> + if (!buf) >> + return rc; >> + ul = strtoul(buf, &ep, 10); >> + if (ep == buf) >> + return rc; >> + if (ul >= UINT16_MAX) >> + return rc; >> + return (unsigned int)ul; >> +} >> + > I see what you want to achieve here, but I don't particularly like > having this as an helper here. If we want something this generic, > perhaps put this in xlu? Although, I'm not sure about this either... > > It's probably not such a great deal, and it's a tool's maintainer call, > I think. > >> +static void vdistance_set(unsigned int *vdistance, >> + unsigned int nr_vnodes, >> + unsigned int samenode, >> + unsigned int othernode) >> +{ >> + unsigned int idx, slot; >> + for (idx = 0; idx < nr_vnodes; idx++) >> + for (slot = 0; slot < nr_vnodes; slot++) >> + *(vdistance + slot * nr_vnodes + idx) = >> + idx == slot ? samenode : othernode; >> +} >> + >> +static void vcputovnode_default(unsigned int *cpu_to_node, >> + unsigned int nr_vnodes, >> + unsigned int max_vcpus) >> +{ >> + unsigned int cpu; >> + for (cpu = 0; cpu < max_vcpus; cpu++) >> + cpu_to_node[cpu] = cpu % nr_vnodes; >> +} >> + >> +/* Split domain memory between vNUMA nodes equally. */ >> +static int split_vnumamem(libxl_domain_build_info *b_info) >> +{ >> + unsigned long long vnodemem = 0; >> + unsigned long n; >> + unsigned int i; >> + >> + if (b_info->vnodes == 0) >> + return -1; >> + > This is never called (and that's ok!) with when b_info->vnodes==0, so I > find the check more confusing than helpful. > > An assert(), maybe? > >> + vnodemem = (b_info->max_memkb >> 10) / b_info->vnodes; >> + if (vnodemem < MIN_VNODE_SIZE) >> + return -1; >> + /* reminder in MBytes. */ >> + n = (b_info->max_memkb >> 10) % b_info->vnodes; >> + /* get final sizes in MBytes. */ >> + for (i = 0; i < (b_info->vnodes - 1); i++) >> + b_info->vnuma_mem[i] = vnodemem; >> + /* add the reminder to the last node. */ >> + b_info->vnuma_mem[i] = vnodemem + n; >> + return 0; >> +} >> + >> +static void vnuma_vnodemap_default(unsigned int *vnuma_vnodemap, >> + unsigned int nr_vnodes) >> +{ >> + unsigned int i; >> + for (i = 0; i < nr_vnodes; i++) >> + vnuma_vnodemap[i] = VNUMA_NO_NODE; >> +} >> + >> +/* >> + * init vNUMA to "zero config" with one node and all other >> + * topology parameters set to default. >> + */ >> +static int vnuma_default_config(libxl_domain_build_info *b_info) >> +{ >> + b_info->vnodes = 1; >> + /* all memory goes to this one vnode, as well as vcpus. 
*/ >> + if (!(b_info->vnuma_mem = (uint64_t *)calloc(b_info->vnodes, >> + sizeof(*b_info->vnuma_mem)))) >> + goto bad_vnumazerocfg; >> + >> + if (!(b_info->vnuma_vcpumap = (unsigned int *)calloc(b_info->max_vcpus, >> + sizeof(*b_info->vnuma_vcpumap)))) >> + goto bad_vnumazerocfg; >> + >> + if (!(b_info->vdistance = (unsigned int *)calloc(b_info->vnodes * >> + b_info->vnodes, >> sizeof(*b_info->vdistance)))) >> + goto bad_vnumazerocfg; >> + >> + if (!(b_info->vnuma_vnodemap = (unsigned int *)calloc(b_info->vnodes, >> + sizeof(*b_info->vnuma_vnodemap)))) >> + goto bad_vnumazerocfg; >> + >> + b_info->vnuma_mem[0] = b_info->max_memkb >> 10; >> + >> + /* all vcpus assigned to this vnode. */ >> + vcputovnode_default(b_info->vnuma_vcpumap, b_info->vnodes, >> + b_info->max_vcpus); >> + >> + /* default vdistance is 10. */ >> + vdistance_set(b_info->vdistance, b_info->vnodes, 10, 10); >> + >> + /* VNUMA_NO_NODE for vnode_to_pnode. */ >> + vnuma_vnodemap_default(b_info->vnuma_vnodemap, b_info->vnodes); >> + >> + /* >> + * will be placed to some physical nodes defined by automatic >> + * numa placement or VNUMA_NO_NODE will not request exact node. >> + */ >> + libxl_defbool_set(&b_info->vnuma_autoplacement, true); >> + return 0; >> + >> + bad_vnumazerocfg: >> + return -1; >> +} >> + > Can't this, and with it most of the above helpers, be implemented in > such a way that it fits inside libxl__domain_build_info_setdefault() > (and, if yes, moved in a libxl related patch)? > > I'm not 100% sure, but it really looks similar to the kind of things we > usually do there rather than in xl. > >> +static void free_vnuma_info(libxl_domain_build_info *b_info) >> +{ >> + free(b_info->vnuma_mem); >> + free(b_info->vdistance); >> + free(b_info->vnuma_vcpumap); >> + free(b_info->vnuma_vnodemap); >> + >> + b_info->vnuma_mem = NULL; >> + b_info->vdistance = NULL; >> + b_info->vnuma_vcpumap = NULL; >> + b_info->vnuma_vnodemap = NULL; >> + >> + b_info->vnodes = 0; >> + b_info->vmemranges = 0; >> +} >> + > This appears to be called only once, right before an "exit(1)", in xl. > Do we need it? For sure we don't need the =NULL and =0 thing, do we? > >> +static int parse_vnuma_mem(XLU_Config *config, >> + libxl_domain_build_info **b_info) >> +{ >> + libxl_domain_build_info *dst; >> > Why the **b_info and the *dst thing? > >> + XLU_ConfigList *vnumamemcfg; >> + int nr_vnuma_regions, i; >> + unsigned long long vnuma_memparsed = 0; >> + unsigned long ul; >> + const char *buf; >> + char *ep; >> + >> + dst = *b_info; >> + if (!xlu_cfg_get_list(config, "vnuma_mem", >> + &vnumamemcfg, &nr_vnuma_regions, 0)) { >> + >> + if (nr_vnuma_regions != dst->vnodes) { >> + fprintf(stderr, "Number of numa regions (vnumamem = %d) is \ >> + incorrect (should be %d).\n", nr_vnuma_regions, >> + dst->vnodes); >> > I find the term 'regions' rather confusing, both in the variable name > and in the error message. I'd go for something like nr_vnuma_mem, for > the variable. For the message, maybe something like this: > > "Specified the amount of memory for %d virtual nodes,\n" > "while number of virtual node is %d!\n" > > But maybe it's just me. :-) > > Oh, the "(vnumamem = %d)" part, at least, makes very few sense to me, as > I don't thing the word 'vnumamem', written like that and associated with > an integer, makes much sense to an user. At least that should change, I > think. 
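Going back to the libxl__domain_build_info_setdefault() idea above, the
"zero config" path could collapse into something like the sketch below.
The helper name is made up, the fields are the ones introduced by this
patch, and plain calloc() stands in for the gc-based allocators libxl
would really use:

    /* Hypothetical libxl-side default: one vnode owning all memory and vcpus. */
    static int vnuma_setdefault(libxl_domain_build_info *b_info)
    {
        if (b_info->vnodes)             /* vNUMA already configured by the user */
            return 0;

        b_info->vnodes = 1;
        b_info->vnuma_mem      = calloc(1, sizeof(*b_info->vnuma_mem));
        b_info->vdistance      = calloc(1, sizeof(*b_info->vdistance));
        b_info->vnuma_vcpumap  = calloc(b_info->max_vcpus,
                                        sizeof(*b_info->vnuma_vcpumap));
        b_info->vnuma_vnodemap = calloc(1, sizeof(*b_info->vnuma_vnodemap));
        if (!b_info->vnuma_mem || !b_info->vdistance ||
            !b_info->vnuma_vcpumap || !b_info->vnuma_vnodemap)
            return -1;

        b_info->vnuma_mem[0] = b_info->max_memkb >> 10; /* all memory, in MB */
        b_info->vdistance[0] = 10;                      /* 1x1 distance matrix */
        /* calloc() already put every vcpu on vnode 0. */
        b_info->vnuma_vnodemap[0] = VNUMA_NO_NODE;      /* let placement decide */
        return 0;
    }

Doing it there would also mean every libxl user, not just xl, gets the same
one-node default.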
> >> + goto bad_vnuma_mem; >> + } >> + >> + dst->vnuma_mem = calloc(dst->vnodes, >> + sizeof(*dst->vnuma_mem)); >> > If you go for the libxl__build_info_setdefault() route I suggested > above, this may need to change a bit, as you'll probably find > b_info->vnuma_mem allocated already. > > Anyway, I don't think it would be a big deal to, in that case, either > use realloc() here, or use NULL as default value (and hence set it to > that in libxl__build_info_sedefault() ). > >> + if (dst->vnuma_mem == NULL) { >> + fprintf(stderr, "Unable to allocate memory for vnuma >> ranges.\n"); >> > Ditto, about 'ranges'. > >> + goto bad_vnuma_mem; >> + } >> + >> + /* >> + * Will parse only nr_vnodes times, even if we have more/less >> regions. >> > How can it be that you have fewer or more entries than b_info->vnodes? > >> + * Take care of it later if less or discard if too many regions. >> > You took care of that above already... > >> + */ >> + for (i = 0; i < dst->vnodes; i++) { >> + buf = xlu_cfg_get_listitem(vnumamemcfg, i); >> + if (!buf) { >> + fprintf(stderr, >> + "xl: Unable to get element %d in vnuma memory >> list.\n", i); >> + goto bad_vnuma_mem; >> > The common pattern in xl seems to be to exit() right away, if this > happens (whether to exit with 1 or -1, that seems a bit inconsistent to > me, at a quick glance). > >> + } >> + >> + ul = strtoul(buf, &ep, 10); >> + if (ep == buf) { >> + fprintf(stderr, "xl: Invalid argument parsing vnumamem: >> %s.\n", buf); >> + goto bad_vnuma_mem; >> > And the same (i.e., just exit() ) applies here... And pretty much > everywhere, actually! :-) > >> + } >> + >> + /* 32Mb is a min size for a node, taken from Linux */ >> + if (ul >= UINT32_MAX || ul < MIN_VNODE_SIZE) { >> > What's the idea behing UINT32_MAX here? > > I guess what I mean is, if there is, for any reason, a maximum virtual > node size, then it may be useful to #define it, e.g., MAX_VNODE_SIZE > (just as you're doing for MIN_VNODE_SIZE). If there is not, then we can > just live with what the type can handle, and it will be lower layers > that will return appropriate errors if values are wrong/unrealistic. > > Like this, you're saying that such maximum virtual node size is 4096 TB > (if I'm not doing the math wrong), which I don't think it makes much > sense... :-P > >> + fprintf(stderr, "xl: vnuma memory %lu is not within %u - %u >> range.\n", >> + ul, MIN_VNODE_SIZE, UINT32_MAX); >> > "xl: memory for virtual node %lu is outside of [%u, %u] MB.\n" > >> + goto bad_vnuma_mem; >> + } >> + >> + /* memory in MBytes */ >> + dst->vnuma_mem[i] = ul; >> + } >> + >> + /* Total memory for vNUMA parsed to verify */ >> + for (i = 0; i < nr_vnuma_regions; i++) >> + vnuma_memparsed = vnuma_memparsed + (dst->vnuma_mem[i]); >> + > Not a big deal, but can't you accumulate while doing the parsing? > >> + /* Amount of memory for vnodes same as total? */ >> + if ((vnuma_memparsed << 10) != (dst->max_memkb)) { >> + fprintf(stderr, "xl: vnuma memory is not the same as domain \ >> + memory size.\n"); >> + goto bad_vnuma_mem; >> + } >> + } else { >> + dst->vnuma_mem = calloc(dst->vnodes, >> + sizeof(*dst->vnuma_mem)); >> + if (dst->vnuma_mem == NULL) { >> + fprintf(stderr, "Unable to allocate memory for vnuma >> ranges.\n"); >> + goto bad_vnuma_mem; >> + } >> + > The allocation is the same as above. It can live outside of the if, > can't it? Especially considering that, as said before, you can just > exit() on failure. 
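Putting those suggestions together (allocate once outside the if, accumulate
the total while parsing, and just exit() on failure), the memory parsing
could shrink to roughly the sketch below; the messages and structure are
only illustrative, and MIN_VNODE_SIZE / split_vnumamem() are the ones from
this patch:

    static void parse_vnuma_mem(XLU_Config *config, libxl_domain_build_info *b_info)
    {
        XLU_ConfigList *vnuma_memcfg;
        unsigned long long total = 0;
        int nr_vnuma_mem, i;

        b_info->vnuma_mem = calloc(b_info->vnodes, sizeof(*b_info->vnuma_mem));
        if (!b_info->vnuma_mem) {
            fprintf(stderr, "xl: unable to allocate memory for vNUMA nodes.\n");
            exit(1);
        }

        if (xlu_cfg_get_list(config, "vnuma_mem", &vnuma_memcfg, &nr_vnuma_mem, 0)) {
            /* Nothing specified: split the domain memory evenly. */
            if (split_vnumamem(b_info) < 0) {
                fprintf(stderr, "xl: cannot split %lu KB evenly across %u vnodes.\n",
                        (unsigned long)b_info->max_memkb, b_info->vnodes);
                exit(1);
            }
            return;
        }

        if (nr_vnuma_mem != b_info->vnodes) {
            fprintf(stderr, "xl: memory specified for %d virtual nodes, but the "
                    "domain has %u.\n", nr_vnuma_mem, b_info->vnodes);
            exit(1);
        }

        for (i = 0; i < b_info->vnodes; i++) {
            const char *buf = xlu_cfg_get_listitem(vnuma_memcfg, i);
            char *ep = NULL;
            unsigned long ul = buf ? strtoul(buf, &ep, 10) : 0;

            if (!buf || ep == buf || ul < MIN_VNODE_SIZE) {
                fprintf(stderr, "xl: bad memory size for virtual node %d.\n", i);
                exit(1);
            }
            b_info->vnuma_mem[i] = ul;      /* MB */
            total += ul;                    /* accumulate while parsing */
        }

        if ((total << 10) != b_info->max_memkb) {
            fprintf(stderr, "xl: vNUMA memory (%llu MB) does not match maxmem.\n",
                    total);
            exit(1);
        }
    }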
> >> + fprintf(stderr, "WARNING: vNUMA memory ranges were not >> specified.\n"); >> + fprintf(stderr, "Using default equal vnode memory size %lu Kbytes \ >> + to cover %lu Kbytes.\n", >> + dst->max_memkb / dst->vnodes, dst->max_memkb); >> + > Does this deserve a warning? In the man page, we say it's legal not to > specify this, so I may well have read that, and may be doing this on > purpose... > >> + if (split_vnumamem(dst) < 0) { >> + fprintf(stderr, "Could not split vnuma memory into equal >> chunks.\n"); >> + goto bad_vnuma_mem; >> + } >> + } >> + return 0; >> + >> + bad_vnuma_mem: >> + return -1; >> +} >> + >> +static int parse_vnuma_distance(XLU_Config *config, >> + libxl_domain_build_info **b_info) >> +{ >> + libxl_domain_build_info *dst; >> > Same as above. > >> + XLU_ConfigList *vdistancecfg; >> + int nr_vdist; >> + >> + dst = *b_info; >> + dst->vdistance = calloc(dst->vnodes * dst->vnodes, >> + sizeof(*dst->vdistance)); >> + if (dst->vdistance == NULL) >> + goto bad_distance; >> + >> + if (!xlu_cfg_get_list(config, "vdistance", &vdistancecfg, &nr_vdist, >> 0)) { >> + int d1, d2, i; >> + /* >> + * First value is the same node distance, the second as the >> + * rest of distances. The following is required right now to >> > I'm not a native English speaker, but 'The latter' sounds more correct > than 'The following' (if that is what you mean; if it's not, then I > didn't understand what you mean! :-) ). > >> + * avoid non-symmetrical distance table as it may break latest >> kernel. >> + * TODO: Better way to analyze extended distance table, possibly >> + * OS specific. >> + */ >> + > Ok. So, AFAIUI, for now, you want the list to be two elements long, is > that right? If yes, should we check that nr_vdist==2 ? > > What happens if I have b_info->vnodes > 1, and I pass a list only one > element long ? > >> + for (i = 0; i < nr_vdist; i++) { >> + d1 = get_list_item_uint(vdistancecfg, i); >> + } >> + > Mmm... What's this? > >> + d1 = get_list_item_uint(vdistancecfg, 0); >> + if (dst->vnodes > 1) >> + d2 = get_list_item_uint(vdistancecfg, 1); >> + else >> + d2 = d1; >> + >> + if (d1 >= 0 && d2 >= 0) { >> + if (d1 < d2) >> + fprintf(stderr, "WARNING: vnuma distance d1 < d2, %u < >> %u\n", d1, d2); >> > I agree this looks weird, and deserves a warning. > > I'd mention why that is the case, in the warning message, i.e., because > the distance from a node to _itself_ is being set to a bigger value than > the istance between two different nodes. > >> + vdistance_set(dst->vdistance, dst->vnodes, d1, d2); >> + } else { >> + fprintf(stderr, "WARNING: vnuma distance values are >> incorrect.\n"); >> + goto bad_distance; >> > After printing this "WARNING", you goto bad_distance, which means this > returns -1, which means we below exit(1). > > So, this really looks like an error, rather than a warning to me. > >> + } >> + } else { >> + fprintf(stderr, "Could not parse vnuma distances.\n"); >> + vdistance_set(dst->vdistance, dst->vnodes, 10, 20); >> + } >> + return 0; >> + >> + bad_distance: >> + return -1; >> > As said before, xl often just exit on failures like these. > >> +} >> + >> +static int parse_vnuma_vcpumap(XLU_Config *config, >> + libxl_domain_build_info **b_info) >> +{ >> + libxl_domain_build_info *dst; >> > ... > >> + XLU_ConfigList *vcpumap; >> + int nr_vcpumap, i; >> + >> + dst = *b_info; >> + dst->vnuma_vcpumap = (unsigned int *)calloc(dst->max_vcpus, >> + sizeof(*dst->vnuma_vcpumap)); >> > You don't need the cast. 
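Related to the vCPU-to-vNODE mapping parsed below: if the list-of-strings
syntax proposed earlier is adopted, the core of the parsing could be a small
helper along these lines (the helper is hypothetical, and xl's existing
vcpu-pinning range parsing could probably be reused instead):

    /* Fill vcpumap[] from a list like ["0-3", "4,6", ...], where list
     * position i is vnode i.  Returns 0 on success, -1 on a parse error. */
    static int parse_vcpus_to_vnodes(XLU_ConfigList *list, int nr_vnodes,
                                     unsigned int *vcpumap, int max_vcpus)
    {
        int vnode;

        for (vnode = 0; vnode < nr_vnodes; vnode++) {
            const char *s = xlu_cfg_get_listitem(list, vnode);

            while (s && *s) {
                char *ep;
                unsigned long a = strtoul(s, &ep, 10), b = a;

                if (ep == s)                    /* no digits where expected */
                    return -1;
                if (*ep == '-') {               /* a range, e.g. "4-7" */
                    s = ep + 1;
                    b = strtoul(s, &ep, 10);
                    if (ep == s || b < a)
                        return -1;
                }
                for ( ; a <= b; a++) {
                    if (a >= (unsigned long)max_vcpus)
                        return -1;
                    vcpumap[a] = vnode;         /* vCPU a lives on this vnode */
                }
                s = (*ep == ',') ? ep + 1 : ep;
            }
        }
        return 0;
    }

A vnode that ends up with no vCPUs, or vCPUs left unassigned, can then be
detected by the caller before falling back to the default mapping.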
> >> + if (dst->vnuma_vcpumap == NULL) >> + goto bad_vcpumap; >> + > exit(1); > >> + if (!xlu_cfg_get_list(config, "vnuma_vcpumap", >> + &vcpumap, &nr_vcpumap, 0)) { >> + if (nr_vcpumap == dst->max_vcpus) { >> > if (nr_vcpumap != b_info->max_vcpus) { > fprintf(stderr, "xl: mapping of vCPUs to vNODEs specified for " > "%d vCPUs, while the domain has %d", > nr_vcpumap, b_inf->max_vcpus); > exit(1) > } > > and then continue with one less level of indentation. :-) > >> + unsigned int vnode, vcpumask = 0, vmask; >> + >> + vmask = ~(~0 << nr_vcpumap); >> + for (i = 0; i < nr_vcpumap; i++) { >> + vnode = get_list_item_uint(vcpumap, i); >> + if (vnode >= 0 && vnode < dst->vnodes) { >> + vcpumask |= (1 << i); >> + dst->vnuma_vcpumap[i] = vnode; >> + } >> + } >> + >> + /* Did it covered all vnodes in the vcpu mask? */ >> + if ( !(((vmask & vcpumask) + 1) == (1 << nr_vcpumap)) ) { >> + fprintf(stderr, "WARNING: Not all vnodes were covered \ >> + in numa_cpumask.\n"); >> + goto bad_vcpumap; >> + } >> > As we settled on using libxl_bitmap for this, I expect this to change > quite a bit (e.g., we'd have to build a bitmap per each node, the > libxl_bitmap manipulating functions should be used, etc.). > >> + } else { >> + fprintf(stderr, "WARNING: Bad vnuma_vcpumap.\n"); >> + goto bad_vcpumap; >> + } >> + } >> + else >> + vcputovnode_default(dst->vnuma_vcpumap, >> + dst->vnodes, >> + dst->max_vcpus); >> + return 0; >> + >> + bad_vcpumap: >> + return -1; >> +} >> + >> +static int parse_vnuma_vnodemap(XLU_Config *config, >> + libxl_domain_build_info **b_info) >> +{ >> + libxl_domain_build_info *dst; > No need? > >> + XLU_ConfigList *vnodemap; >> + int nr_vnodemap, i; >> + >> + dst = *b_info; >> + >> + /* There is mapping to NUMA physical nodes? */ >> + dst->vnuma_vnodemap = (unsigned int *)calloc(dst->vnodes, >> + sizeof(*dst->vnuma_vnodemap)); >> > no cast. > >> + if (dst->vnuma_vnodemap == NULL) >> + goto bad_vnodemap; >> + >> + if (!xlu_cfg_get_list(config, "vnuma_vnodemap", >> + &vnodemap, &nr_vnodemap, 0)) { >> + /* >> + * If not specified or incorrect, will be defined >> + * later based on the machine architecture, configuration >> + * and memory availble when creating domain. >> + */ >> + libxl_defbool_set(&dst->vnuma_autoplacement, false); >> > As agreed in the other part of this thread, there is already an > automatic placement switch to act on, in this case. > > Note that the way that switch is being used right now in libxl code, may > need to change a bit, but that is, nevertheless, the right thing to do. > So have a look at that, please. :-) > >> + if (nr_vnodemap == dst->vnodes) { >> + unsigned int vnodemask = 0, pnode, smask; >> + smask = ~(~0 << dst->vnodes); >> + for (i = 0; i < dst->vnodes; i++) { >> + pnode = get_list_item_uint(vnodemap, i); >> + if (pnode >= 0) { >> + vnodemask |= (1 << i); >> + dst->vnuma_vnodemap[i] = pnode; >> + } >> + } >> + >> + /* Did it covered all vnodes in the mask? 
*/
>> +            if ( !(((vnodemask & smask) + 1) == (1 << nr_vnodemap)) ) {
>> +                fprintf(stderr, "WARNING: Not all vnodes were covered \
>> +                        vnuma_vnodemap.\n");
>> +                fprintf(stderr, "Automatic placement will be used for vnodes.\n");
>> +                libxl_defbool_set(&dst->vnuma_autoplacement, true);
>> +                vnuma_vnodemap_default(dst->vnuma_vnodemap, dst->vnodes);
>> +            }
>> +        }
>> +        else {
>> +            fprintf(stderr, "WARNING: Incorrect vnuma_vnodemap.\n");
>> +            fprintf(stderr, "Automatic placement will be used for vnodes.\n");
>> +            libxl_defbool_set(&dst->vnuma_autoplacement, true);
>> +            vnuma_vnodemap_default(dst->vnuma_vnodemap, dst->vnodes);
>> +        }
>> +    }
>>
> This is also about to change quite substantially, so I'm avoiding
> commenting on it...
>
>> +static void parse_vnuma_config(XLU_Config *config,
>> +                               libxl_domain_build_info *b_info)
>> +{
>> +    long l;
>> +
>> +    if (!xlu_cfg_get_long (config, "vnodes", &l, 0)) {
>> +        if (l > MAX_VNUMA_NODES) {
>> +            fprintf(stderr, "Too many vnuma nodes, max %d is allowed.\n",
>> +                    MAX_VNUMA_NODES);
>> +            goto bad_vnuma_config;
>> +        }
>> +        b_info->vnodes = l;
>> +
>> +        if (!xlu_cfg_get_defbool(config, "vnuma_autoplacement",
>> +                                 &b_info->vnuma_autoplacement, 0))
>> +            libxl_defbool_set(&b_info->vnuma_autoplacement, false);
>> +
> "vnuma_autoplacement" is, AFAIUI, going away, so no comments. ;-P
>
>
> Regards,
> Dario
>
> --
> <<This happens because I choose it to happen!>> (Raistlin Majere)
> -----------------------------------------------------------------
> Dario Faggioli, Ph.D, http://about.me/dario.faggioli
> Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)
>

Majority of the comments make sense to me. Now I wanted to ask about the
memory ranges :) I am not sure at this point whether I should have them
parsed in the config file (for the case where a vNUMA node has multiple
ranges); as Ian has mentioned, I will be moving these ranges away from the
libxl build info. I guess for now it will be better to leave these ranges
out of the scope of the parsing code. What do you think?

Thank you!

--
Elena