Re: [Xen-devel] [PATCH RESEND 05/12] xen: numa-sched: make space for per-vcpu node-affinity
On mar, 2013-11-05 at 17:16 +0000, George Dunlap wrote:
> Just to outline what the alternative would look like: The hypervisor
> would focus on the minimum mechanisms required to do something useful
> for NUMA systems. The domain NUMA affinity would be only used for
> memory allocation. vcpus would only have "hard" and "soft" affinities.
> The toolstack (libxl? xl?) would be responsible for stitching these
> together into a usable interface for NUMA: e.g., it would have the
> concept of "numa affinity" for vcpus (or indeed, virtual NUMA
> topologies), and would do things like update the domain NUMA affinity
> based on vcpu affinities.
>
> This would mean the toolstack either assuming, when someone calls
> vcpu_set_node_affinity, that soft_affinity == numa_affinity, or keeping
> its own copy of numa_affinity for each vcpu around somewhere.
>
And to elaborate a bit more on what I said last night, now that I have
the code in front of me: going for the above would actually mean the
following.

In domain.c we have domain_update_node_affinity(). What it does
*before* this series is calculate d->node_affinity based on all the
vcpus' cpu_affinity (i.e., pinning). What it does *after* this series
is calculate d->node_affinity based on the _vcpus'_ node_affinity. (*)

That function is currently called, basically, when a new vcpu is
allocated (alloc_vcpu()), when a domain changes cpupool
(sched_move_domain()), and when the cpupool the domain is in changes
(cpupool_assign_cpu_locked() or cpupool_unassign_cpu()). That means
all of the above operations _automatically_ affect d->node_affinity.

Now, we're talking about killing vc->cpu_affinity, not introducing
vc->node_affinity, introducing vc->cpu_hard_affinity and
vc->cpu_soft_affinity instead and, more importantly, not linking any
of the above to d->node_affinity. That means all of the above
operations _will_NOT_ automatically affect d->node_affinity any
longer, at least from the hypervisor (and, most likely, libxc)
perspective. OTOH, I'm almost sure I can make libxl (and xl) retain
exactly the same behaviour they currently expose to the user (just by
adding an extra call where needed).

So, although all this won't be an issue for xl and libxl consumers
(or, at least, that's my goal), it will change how the hypervisor
behaves in all those situations. This means that xl and libxl users
will see no change, while folks issuing hypercalls and/or libxc calls
will. Is that ok? I mean, I know there are no stability concerns for
those APIs but, still, is that an acceptable change?

Regards,
Dario

(*) yes, in both cases (before and after this series), it is already
possible that d->node_affinity is not automatically calculated, and
that it instead just sticks to something the toolstack provided. That
will stay, so it's pretty much irrelevant to this discussion...
Actually, it won't just "stay": it will become the one and only case!

--
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)
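
To make the mechanism being discussed concrete, here is a minimal,
standalone sketch (not the actual Xen source: the cpumask type, the
two-node topology and the vcpu/domain structures are simplified for
illustration, and cpu_hard_affinity/cpu_soft_affinity are the proposed
fields, not existing ones) of the logic domain_update_node_affinity()
implements: d->node_affinity ends up being the set of nodes whose cpus
intersect the union of the vcpus' (hard) affinities.

/*
 * Toy model of deriving d->node_affinity from per-vcpu hard
 * affinities.  cpumasks are modelled as uint64_t bitmaps and the
 * cpu-to-node map is made up for illustration.
 */
#include <stdint.h>
#include <stdio.h>

#define NR_NODES 2

/* Hypothetical topology: cpus 0-3 on node 0, cpus 4-7 on node 1. */
static const uint64_t node_to_cpumask[NR_NODES] = {
    0x0fULL,        /* node 0 */
    0xf0ULL,        /* node 1 */
};

struct vcpu {
    uint64_t cpu_hard_affinity;   /* pinning ("hard" affinity) */
    uint64_t cpu_soft_affinity;   /* preference ("soft" affinity) */
};

struct domain {
    struct vcpu vcpus[4];
    unsigned int nr_vcpus;
    uint64_t node_affinity;       /* bitmap of NUMA nodes */
};

/* Union all vcpu hard affinities, then mark every node that intersects. */
static void domain_update_node_affinity(struct domain *d)
{
    uint64_t cpus = 0;
    unsigned int i, node;

    for ( i = 0; i < d->nr_vcpus; i++ )
        cpus |= d->vcpus[i].cpu_hard_affinity;

    d->node_affinity = 0;
    for ( node = 0; node < NR_NODES; node++ )
        if ( cpus & node_to_cpumask[node] )
            d->node_affinity |= 1ULL << node;
}

int main(void)
{
    struct domain d = {
        .vcpus = {
            { .cpu_hard_affinity = 0x03, .cpu_soft_affinity = 0x0f },
            { .cpu_hard_affinity = 0x0c, .cpu_soft_affinity = 0x0f },
        },
        .nr_vcpus = 2,
    };

    domain_update_node_affinity(&d);
    /* Prints 0x1: both vcpus' hard affinities fall inside node 0. */
    printf("node_affinity = %#llx\n", (unsigned long long)d.node_affinity);
    return 0;
}

Under the alternative described above, the hypervisor would simply stop
triggering this recomputation automatically from alloc_vcpu(),
sched_move_domain() and the cpupool paths, and the toolstack would be
the one issuing the equivalent of a set-node-affinity call after it
changes a vcpu's affinity, if it wants to keep the old behaviour.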