
Re: [Xen-devel] [PATCH 00/15] Implement per-vcpu NUMA node-affinity for credit1

On 11/05/2013 11:29 AM, Dario Faggioli wrote:
On gio, 2013-10-03 at 19:45 +0200, Dario Faggioli wrote:
Hi everyone,


Despite being a rather big series, it shouldn't be too controversial.
For that reason, I was hoping that it could go in before code freeze, but
I need some review ASAP. :-P

Some minor and independent patches have already been applied by IanC...
I'm happy to resend the series, if that would be helpful, just let me

This was basically next on my list. :-)

Have you rebased locally -- does this still apply fairly cleanly to tip? If not, a re-send would be nice; otherwise I can just pull your git tree and rebase myself.


Thanks and Regards,

So, this series introduces the concept of per-vcpu NUMA node-affinity. In fact,
up to now, node-affinity has only been "per-domain". That means it was the
domain that had a node-affinity and:
  - that node-affinity was used to decide where to allocate the memory for the
  - that node-affinity was used to decide on what nodes _all_ the vcpus of the
    domain prefer to be scheduled.

After this series, this changes as follows:
  - each vcpu of a domain has (well, may have) its own node-affinity, and that
    is what is used to determine (if the credit1 scheduler is used) where each
    specific vcpu prefers to run;
  - the node-affinity of the whole domain is the _union_ of all the
    node-affinities of the domain's vcpus;
  - the memory is still allocated following what the node-affinity of the whole
    domain (so, the union of vcpu node-affinities, as said above) says.
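
As a rough illustration of the union rule above (a minimal sketch only, not the code in the patches; the function name and the use of Python sets to stand in for node masks are assumptions made for this example):

```python
def domain_node_affinity(vcpu_affinities):
    """Return the domain's node-affinity as the union of its vcpus' node-affinities.

    vcpu_affinities: iterable of sets of NUMA node IDs, one set per vcpu
    (hypothetical representation; the hypervisor uses node masks instead).
    """
    domain_affinity = set()
    for affinity in vcpu_affinities:
        # Each vcpu contributes every node it prefers to run on.
        domain_affinity |= set(affinity)
    return domain_affinity


# Example: vcpu0 and vcpu1 prefer node 0, vcpu2 prefers node 1.
# Memory for the domain would then be allocated following nodes {0, 1}.
example = domain_node_affinity([{0}, {0}, {1}])
```

With this model, changing the node-affinity of a single vcpu may change the domain-wide affinity, which is what memory allocation keeps following.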

In practice, it's not such a big change: I'm just extending to the per-vcpu
level what we already had at the domain level. This also makes
node-affinity a lot more similar to vcpu-pinning, both in terms of functioning
and user interface. As a side effect, that simplifies the scheduling code (at
least the NUMA-aware part) by quite a bit. Finally, and most importantly, this is
something that will become really important when we start to support
virtual NUMA topologies, as, at that point, having the same node-affinity for
all the vcpus in a domain won't be enough any longer (we'll want the vcpus from
a particular vnode to have node-affinity with a particular pnode).
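
To make the vNUMA case a bit more concrete, here is a hypothetical sketch of how per-vcpu node-affinities could be derived from a virtual topology. The function and both mappings are invented for this example and are not part of this series or of any vNUMA series:

```python
def vcpu_affinities_from_vnuma(vcpu_to_vnode, vnode_to_pnode):
    """Give each vcpu node-affinity with the pnode backing its vnode.

    vcpu_to_vnode:  dict mapping vcpu ID -> virtual node ID (assumed input)
    vnode_to_pnode: dict mapping virtual node ID -> physical node ID (assumed input)
    Returns a dict mapping vcpu ID -> set of physical node IDs.
    """
    return {
        vcpu: {vnode_to_pnode[vnode]}
        for vcpu, vnode in vcpu_to_vnode.items()
    }


# Example: 4 vcpus, 2 vnodes; vnode 0 is backed by pnode 1, vnode 1 by pnode 3.
affinities = vcpu_affinities_from_vnuma(
    {0: 0, 1: 0, 2: 1, 3: 1},
    {0: 1, 1: 3},
)
```

Here vcpus 0 and 1 end up with node-affinity to pnode 1, and vcpus 2 and 3 to pnode 3, which a single per-domain node-affinity could not express.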

More detailed descriptions of the mechanism and of the implementation choices
are provided in the changelogs and in the documentation (docs/misc and

One last thing is that this series relies on some other patches and series that
I sent on xen-devel already, but have not been applied yet.  I'm re-sending
them here, as a part of this series, so feel free to pick them up from here, if
you want to apply them, or comment on them in this thread, if you want me to
change them.  In particular, I already sent patches 01 and 03 as single
patches, and patches 04-07 as a series. Sorry if that is a bit
clumsy, but I couldn't find a better way to do it. :-)

In the detailed list of patches below, 'x' means previously submitted, '*'
means already acked/reviewed-by.

Finally, Elena, this is not super important, but perhaps, in the next release
of your vNUMA series, you could try to integrate it with this (and of course,
ask if you need anything while trying to do that).

Matt, if/when you eventually get to release your HVM vNUMA series, even as an
RFC or something like that, we can try to figure out how to integrate that
with this, so as to use node-affinity instead of pinning.

The series is also available at the following git coordinates:

  git://xenbits.xen.org/people/dariof/xen.git numa/per-vcpu-affinity-v1

Let me know what you think about all this.


PS. Some of you probably received part of this series as a direct message
(i.e., with your address in 'To', rather than in 'Cc'). I'm sincerely sorry for
that, I messed up with `stg mail'... Won't happen again, I promise! :-P


Dario Faggioli (15):
  x *  xl: update the manpage about "cpus=" and NUMA node-affinity
       xl: fix a typo in main_vcpulist()
  x *  xen: numa-sched: leave node-affinity alone if not in "auto" mode
  x *  libxl: introduce libxl_node_to_cpumap
  x    xl: allow for node-wise specification of vcpu pinning
  x *  xl: implement and enable dryrun mode for `xl vcpu-pin'
  x    xl: test script for the cpumap parser (for vCPU pinning)
       xen: numa-sched: make space for per-vcpu node-affinity
       xen: numa-sched: domain node-affinity always comes from vcpu 
       xen: numa-sched: use per-vcpu node-affinity for actual scheduling
       xen: numa-sched: enable getting/specifying per-vcpu node-affinity
       libxc: numa-sched: enable getting/specifying per-vcpu node-affinity
       libxl: numa-sched: enable getting/specifying per-vcpu node-affinity
       xl: numa-sched: enable getting/specifying per-vcpu node-affinity
       xl: numa-sched: enable specifying node-affinity in VM config file

  docs/man/xl.cfg.pod.5                           |   88 ++++
  docs/man/xl.pod.1                               |   25 +
  docs/misc/xl-numa-placement.markdown            |  124 ++++--
  tools/libxc/xc_domain.c                         |   90 ++++-
  tools/libxc/xenctrl.h                           |   19 +
  tools/libxl/check-xl-vcpupin-parse              |  294 +++++++++++++++
  tools/libxl/check-xl-vcpupin-parse.data-example |   53 +++
  tools/libxl/libxl.c                             |   28 +
  tools/libxl/libxl.h                             |   11 +
  tools/libxl/libxl_dom.c                         |   18 +
  tools/libxl/libxl_numa.c                        |   14 -
  tools/libxl/libxl_types.idl                     |    1
  tools/libxl/libxl_utils.c                       |   22 +
  tools/libxl/libxl_utils.h                       |   15 +
  tools/libxl/xl.h                                |    1
  tools/libxl/xl_cmdimpl.c                        |  458 +++++++++++++++++++----
  tools/libxl/xl_cmdtable.c                       |   11 -
  xen/common/domain.c                             |   97 ++---
  xen/common/domctl.c                             |   47 ++
  xen/common/keyhandler.c                         |    6
  xen/common/sched_credit.c                       |   63 ---
  xen/common/schedule.c                           |   55 +++
  xen/include/public/domctl.h                     |    8
  xen/include/xen/sched-if.h                      |    2
  xen/include/xen/sched.h                         |   13 +
  xen/xsm/flask/hooks.c                           |    2
  26 files changed, 1282 insertions(+), 283 deletions(-)
  create mode 100755 tools/libxl/check-xl-vcpupin-parse
  create mode 100644 tools/libxl/check-xl-vcpupin-parse.data-example
