[Xen-devel] Problems with merlot* AMD Opteron 6376 systems (Was Re: stable trees (was: [xen-4.2-testing test] 58584: regressions))
Adding Boris+Suravee+Aravind (AMD/SVM maintainers), Dario (NUMA) and Jim + Anthony (libvirt) to the CC.

TL;DR: osstest is exposing issues running on "AMD Opteron(tm) Processor 6376" in at least a couple of test cases. It would be good if someone from AMD could have a look.

The systems here == merlot[01], which seem to be having problems with the win7 live migration tests as well as with libvirt when starting PV guests. They each contain "AMD Opteron(tm) Processor 6376" processors with 32 threads in 4 nodes and seem to have a strange NUMA layout with no RAM on nodes 1 or 3.

The test history on these machines:
http://logs.test-lab.xenproject.org/osstest/results/host/merlot0.html
http://logs.test-lab.xenproject.org/osstest/results/host/merlot1.html

I just posted some analysis of the windows cases (including experiments on the old Cambridge test infra with "AMD Opteron(tm) Processor 6168" processors) in:
http://lists.xen.org/archives/html/xen-devel/2015-06/msg03713.html

I've also been investigating the libvirt guest-start failures. The symptom is a 10s timeout starting qemu. Anthony is seeing this with openstack too and did some analysis in
http://thread.gmane.org/gmane.comp.emulators.xen.devel/246473/focus=249172
onwards, but it may be that this is unrelated to the osstest failures and that for Anthony's scenario the 10s timeout could be explained by the openstack tempest tests starting lots of VMs in parallel.

However, for the osstests we are starting a single PV domain on an otherwise idle host. There should be no reason for qemu to take as long as 10s to come up in that case, even with a pessimal NUMA layout (IMHO at least). By comparison, on other hosts starting qemu seems to take 2-4s, so merlot is at least 2.5-5 times worse.

I tried running some adhoc tests on the old infra tied to the *-frog machines (which are the Opteron 6168 ones):
http://xenbits.xen.org/people/ianc/tmp/adhoc/37623/
http://xenbits.xen.org/people/ianc/tmp/adhoc/37625/

The -xsm failures are because I botched the flight configuration; the interesting information is that the other ones passed both times (migrate-support is expected to fail at the moment).

Supposing that the NUMA oddities might be what is exposing this issue, I tried an adhoc run on the merlot machines where I specified "dom0_max_vcpus=8 dom0_nodes=0" on the hypervisor command line:
http://logs.test-lab.xenproject.org/osstest/logs/58853/

Again, I messed up the config for the -xsm case, so ignore. The interesting thing is that the extra NUMA settings were apparently _not_ helpful.

From
http://logs.test-lab.xenproject.org/osstest/logs/58853/test-amd64-amd64-libvirt/serial-merlot0.log
I can see they were applied:

Jun 23 15:50:34.205057 (XEN) Command line: placeholder conswitch=x watchdog com1=115200,8n1 console=com1,vga gdb=com1 dom0_mem=512M,max:512M ucode=scan dom0_max_vcpus=8 dom0_nodes=0
[...]
Jun 23 15:50:38.309057 (XEN) Dom0 has maximum 8 VCPUs

The memory info

Jun 23 15:56:27.749008 (XEN) Memory location of each domain:
Jun 23 15:56:27.756965 (XEN) Domain 0 (total: 131072):
Jun 23 15:56:27.756983 (XEN)     Node 0: 126905
Jun 23 15:56:27.756998 (XEN)     Node 1: 0
Jun 23 15:56:27.764952 (XEN)     Node 2: 4167
Jun 23 15:56:27.764969 (XEN)     Node 3: 0

suggests at least a small amount of cross-node memory allocation (16M out of dom0's 512M total). That's probably small enough to be OK.
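As an aside, for anyone wanting to poke at the merlot machines directly, the dumps quoted above and below can be gathered from dom0 with the standard xl tooling. A minimal sketch (assuming a hypervisor with the usual debug keys available and xl run as root in dom0):

    # Host NUMA topology: per-node memory sizes and the CPU<->node
    # mapping, which is where the empty nodes 1 and 3 show up.
    xl info -n

    # 'u' is the "dump NUMA info" debug key; it writes the per-domain
    # memory placement ("Memory location of each domain") into the Xen
    # console ring, which xl dmesg then reads back.
    xl debug-keys u
    xl dmesg | tail -n 40

    # 'q' dumps domain and VCPU state, i.e. the "VCPU information and
    # callbacks" output quoted below; xl vcpu-list 0 gives a more
    # compact view of dom0's VCPU placement and affinities.
    xl debug-keys q
    xl dmesg | tail -n 80
    xl vcpu-list 0
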
And it seems as if the 8 dom0 vcpus are correctly pinned to the first 8 cpus (the ones in node 0):

Jun 23 15:56:43.797055 (XEN) VCPU information and callbacks for domain 0:
Jun 23 15:56:43.797110 (XEN)     VCPU0: CPU4 [has=F] poll=0 upcall_pend=00 upcall_mask=00 dirty_cpus={4}
Jun 23 15:56:43.805078 (XEN)     cpu_hard_affinity={0-7} cpu_soft_affinity={0-7}
Jun 23 15:56:43.813121 (XEN)     pause_count=0 pause_flags=1
Jun 23 15:56:43.813157 (XEN)     No periodic timer
Jun 23 15:56:43.821050 (XEN)     VCPU1: CPU3 [has=F] poll=0 upcall_pend=00 upcall_mask=00 dirty_cpus={3}
Jun 23 15:56:43.829044 (XEN)     cpu_hard_affinity={0-7} cpu_soft_affinity={0-7}
Jun 23 15:56:43.829082 (XEN)     pause_count=0 pause_flags=1
Jun 23 15:56:43.837051 (XEN)     No periodic timer
Jun 23 15:56:43.837084 (XEN)     VCPU2: CPU5 [has=F] poll=0 upcall_pend=00 upcall_mask=00 dirty_cpus={5}
Jun 23 15:56:43.845102 (XEN)     cpu_hard_affinity={0-7} cpu_soft_affinity={0-7}
Jun 23 15:56:43.853035 (XEN)     pause_count=0 pause_flags=1
Jun 23 15:56:43.853071 (XEN)     No periodic timer
Jun 23 15:56:43.853099 (XEN)     VCPU3: CPU7 [has=F] poll=0 upcall_pend=00 upcall_mask=00 dirty_cpus={7}
Jun 23 15:56:43.861102 (XEN)     cpu_hard_affinity={0-7} cpu_soft_affinity={0-7}
Jun 23 15:56:43.869110 (XEN)     pause_count=0 pause_flags=1
Jun 23 15:56:43.869145 (XEN)     No periodic timer
Jun 23 15:56:43.877014 (XEN)     VCPU4: CPU0 [has=F] poll=0 upcall_pend=00 upcall_mask=00 dirty_cpus={}
Jun 23 15:56:43.877038 (XEN)     cpu_hard_affinity={0-7} cpu_soft_affinity={0-7}
Jun 23 15:56:43.885053 (XEN)     pause_count=0 pause_flags=1
Jun 23 15:56:43.885088 (XEN)     No periodic timer
Jun 23 15:56:43.893085 (XEN)     VCPU5: CPU0 [has=F] poll=0 upcall_pend=00 upcall_mask=00 dirty_cpus={}
Jun 23 15:56:43.901075 (XEN)     cpu_hard_affinity={0-7} cpu_soft_affinity={0-7}
Jun 23 15:56:43.901134 (XEN)     pause_count=0 pause_flags=1
Jun 23 15:56:43.909010 (XEN)     No periodic timer
Jun 23 15:56:43.909048 (XEN)     VCPU6: CPU2 [has=F] poll=0 upcall_pend=00 upcall_mask=00 dirty_cpus={2}
Jun 23 15:56:43.917065 (XEN)     cpu_hard_affinity={0-7} cpu_soft_affinity={0-7}
Jun 23 15:56:43.925055 (XEN)     pause_count=0 pause_flags=1
Jun 23 15:56:43.925074 (XEN)     No periodic timer
Jun 23 15:56:43.925095 (XEN)     VCPU7: CPU6 [has=F] poll=0 upcall_pend=00 upcall_mask=00 dirty_cpus={6}
Jun 23 15:56:43.933119 (XEN)     cpu_hard_affinity={0-7} cpu_soft_affinity={0-7}
Jun 23 15:56:43.941080 (XEN)     pause_count=0 pause_flags=1
Jun 23 15:56:43.941129 (XEN)     No periodic timer

So whatever the issue is, it doesn't seem to be particularly related to the strange NUMA layout.

Ian.