
Re: [Xen-devel] NUMA TODO-list for xen-devel



On 01/08/12 17:58, Dario Faggioli wrote:
> On Wed, 2012-08-01 at 17:32 +0100, Anil Madhavapeddy wrote:
>> On 1 Aug 2012, at 17:16, Dario Faggioli <raistlin@xxxxxxxx> wrote:
>>
>>>    - Inter-VM dependencies and communication issues. If a workload is
>>>      made up of more than one VM and they all share the same (NUMA)
>>>      host, it might be best to have them share the nodes as much as
>>>      possible, or perhaps quite the opposite, depending on the
>>>      specific characteristics of the workload itself; this might be
>>>      considered during placement, memory migration and perhaps
>>>      scheduling.
>>>
>>>    - Benchmarking and performance evaluation in general. Meaning both
>>>      agreeing on a (set of) relevant workload(s) and on how to extract
>>>      meaningful performance data from them (and maybe how to do that
>>>      automatically?).
>>
>> I haven't tried out the latest Xen NUMA features yet, but we've been
>> keeping track of the IPC benchmarks as we get newer machines here:
>>
> 
>> http://www.cl.cam.ac.uk/research/srg/netos/ipc-bench/results.html
>>
> Wow... That's really cool. I'll definitely take a deep look at all these
> data! I'm also adding the link to the wiki, if you're fine with that...

No problem with adding a link, as this is public data :) If possible,
it'd be splendid to put a note next to this link encouraging people to
submit their own results -- doing so is very simple, and helps us extend
the database. Instructions are at
http://www.cl.cam.ac.uk/research/srg/netos/ipc-bench/ (or, for a short
link, http://fable.io).

>> Happy to share the raw data if you have cycles to figure out the best
>> way to auto-place multiple VMs so they are near each other from a memory
>> latency perspective.  
>>
> I don't have anything precise in mind yet, but we need to think about
> this.
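
For the record, the crude manual way to get two communicating guests
"near each other" today is to pin their vCPUs to the same node's pCPUs
in the domain configs. A rough sketch (the names, sizes and CPU range
below are made up; the real node-to-pCPU mapping comes from
"xl info -n" on the actual host):

    # vm-frontend.cfg -- pinned to node 0's pCPUs (assumed to be 0-3 here)
    name   = "vm-frontend"
    vcpus  = 2
    memory = 1024
    cpus   = "0-3"

    # vm-backend.cfg -- same pinning, so both guests stay on node 0
    name   = "vm-backend"
    vcpus  = 2
    memory = 1024
    cpus   = "0-3"

Automating that choice (or deliberately spreading the guests instead,
depending on the workload) is precisely the open question.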

While there has been plenty of work on optimizing the co-location of
different kinds of workloads, there is relatively little work (that I
am aware of) on VM scheduling in this setting. One (sadly somewhat
lacking) paper at HotCloud this year [1] looked at NUMA-aware VM
migration to balance memory accesses. Possibly of greater interest is
the Google ISCA paper on the detrimental effect of sharing
micro-architectural resources between different kinds of workloads,
although it is not explicitly focused on NUMA and its metrics are
defined with regard to specific classes of latency-sensitive jobs [2].

One interesting thing to look into (which we have not done yet) is
what memory allocators do about NUMA these days; there is an AMD
whitepaper from 2009 discussing the performance benefits of a
NUMA-aware version of tcmalloc [3], but I have found it hard to
reproduce their results on modern hardware. Of course, being
virtualized may complicate matters here, since the memory allocator
can no longer freely pick and choose where to allocate from.
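
For what it's worth, the basic building block such an allocator relies
on is explicit node-local allocation. A minimal sketch with libnuma
(node number and buffer size are arbitrary; build with -lnuma):

    /* Minimal sketch of node-local allocation with libnuma -- roughly
     * what a NUMA-aware allocator does underneath. */
    #include <numa.h>
    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
        if (numa_available() < 0) {
            fprintf(stderr, "no NUMA support on this machine\n");
            return 1;
        }

        size_t len = 64 << 20;                  /* 64 MiB */
        void *buf = numa_alloc_onnode(len, 0);  /* ask for node 0 memory */
        if (buf == NULL)
            return 1;

        memset(buf, 0, len);   /* touch the pages so they get allocated */
        numa_free(buf, len);
        return 0;
    }

Inside a guest this only helps to the extent that the virtual topology
reflects the real one, which is exactly the complication above.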

Scheduling, notably, is key here, since the CPU a process is scheduled
on typically determines where its memory is allocated: under a
first-touch policy, pages end up on the node of the CPU that first
writes them. Frequent migrations are therefore likely to be bad for
performance due to remote memory accesses, although we have been
unable to quantify a significant difference on non-synthetic
macrobenchmarks; that said, we have not tried very hard so far.
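
To illustrate the interaction, here is a sketch of the first-touch
effect (nothing Xen-specific; node 0 is an arbitrary choice, and it
builds with -lnuma):

    /* Pin the current thread to one node and prefer its memory, so
     * pages allocated by plain malloc() stay local to where we run. */
    #define _GNU_SOURCE
    #include <numa.h>
    #include <sched.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    int main(void)
    {
        if (numa_available() < 0)
            return 1;

        int node = 0;
        numa_run_on_node(node);    /* only run on node 0's CPUs */
        numa_set_preferred(node);  /* prefer node 0 for new pages */

        size_t len = 16 << 20;
        char *buf = malloc(len);
        if (buf == NULL)
            return 1;
        memset(buf, 0, len);       /* first touch happens on node 0, so
                                    * the pages end up there */

        int cpu = sched_getcpu();
        printf("running on cpu %d, node %d\n", cpu, numa_node_of_cpu(cpu));

        free(buf);
        return 0;
    }

Migrate that thread to another node after the memset and every access
becomes a remote one -- that is the effect we have so far failed to
quantify on real macrobenchmarks.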

Cheers,
Malte

[1] - Ahn et al., "Dynamic Virtual Machine Scheduling in Clouds for
Architectural Shared Resources", in Proceedings of HotCloud 2012,
https://www.usenix.org/conference/hotcloud12/dynamic-virtual-machine-scheduling-clouds-architectural-shared-resources

[2] - Tang et al., "The impact of memory subsystem resource sharing on
datacenter applications", in Proceedings of ISCA 2011,
http://dl.acm.org/citation.cfm?id=2000099

[3] - AMD, "NUMA aware heap memory manager", whitepaper, 2009,
http://developer.amd.com/Assets/NUMA_aware_heap_memory_manager_article_final.pdf

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel