
Re: [Xen-devel] Hypervisor crash(!) on xl cpupool-numa-split



Hi folks,

I asked Stephan Diestelhorst for help, and after I convinced him that removing credit and making SEDF the default again is not an option, he worked together with me on this ;-) Many thanks for that!
We haven't come to a final solution yet, but we could gather some debug data.
I will simply dump some data here; maybe somebody has a clue. We will continue working on this tomorrow.

First I replaced the BUG_ON with some printks to get some insight:
(XEN) sdom->active_vcpu_count: 18
(XEN) sdom->weight: 256
(XEN) weight_left: 4096, weight_total: 4096
(XEN) credit_balance: 0, credit_xtra: 0, credit_cap: 0
(XEN) Xen BUG at sched_credit.c:591
(XEN) ----[ Xen-4.1.0-rc2-pre  x86_64  debug=y  Not tainted ]----

So this shows that the number of VCPUs is out of sync with the computed weight sum; we have seen a difference of one or two VCPUs (in this case the weight sum corresponds to 16 VCPUs, while active_vcpu_count is 18). It also shows that the assertion triggers in the first iteration of the loop, where weight_left and weight_total are still equal.
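
The check at sched_credit.c:591 is, as far as I can tell, the per-domain weight BUG_ON in csched_acct(); the instrumentation amounts to something like this (only a sketch, variable names as in the output above, format specifiers may need adjusting to the actual field types):

    /* csched_acct(), inside the loop over the active domains; replaces
     * BUG_ON( (sdom->weight * sdom->active_vcpu_count) > weight_left ) */
    if ( (sdom->weight * sdom->active_vcpu_count) > weight_left )
    {
        printk("sdom->active_vcpu_count: %d\n", sdom->active_vcpu_count);
        printk("sdom->weight: %d\n", sdom->weight);
        printk("weight_left: %d, weight_total: %d\n", weight_left, weight_total);
        printk("credit_balance: %d, credit_xtra: %d, credit_cap: %d\n",
               credit_balance, credit_xtra, credit_cap);
        BUG();  /* drop this line to let accounting continue after the dump */
    }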

So I additionally instrumented alloc_pdata and free_pdata; the unprefixed lines come from a shell script mimicking the functionality of cpupool-numa-split.
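
The added printks boil down to something like this (again just a sketch, assuming the csched_alloc_pdata()/csched_free_pdata() hooks and the scheduler-private ncpus counter of the 4.1 tree); the log they produced follows:

    /* csched_alloc_pdata(): CPU joins this scheduler instance / pool */
    prv->ncpus++;
    printk("adding CPU %d, now %d CPUs\n", cpu, prv->ncpus);

    /* csched_free_pdata(): CPU leaves this scheduler instance / pool */
    prv->ncpus--;
    printk("removing CPU %d, remaining: %d\n", cpu, prv->ncpus);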
------------
Removing CPUs from Pool 0
Creating new pool
Using config file "cpupool.test"
cpupool name:   Pool-node6
scheduler:      credit
number of cpus: 1
(XEN) adding CPU 36, now 1 CPUs
(XEN) removing CPU 36, remaining: 17
Populating new pool
(XEN) sdom->active_vcpu_count: 9
(XEN) sdom->weight: 256
(XEN) weight_left: 2048, weight_total: 2048
(XEN) credit_balance: 0, credit_xtra: 0, credit_cap: 0
(XEN) adding CPU 37, now 2 CPUs
(XEN) removing CPU 37, remaining: 16
(XEN) adding CPU 38, now 3 CPUs
(XEN) removing CPU 38, remaining: 15
(XEN) adding CPU 39, now 4 CPUs
(XEN) removing CPU 39, remaining: 14
(XEN) adding CPU 40, now 5 CPUs
(XEN) removing CPU 40, remaining: 13
(XEN) sdom->active_vcpu_count: 17
(XEN) sdom->weight: 256
(XEN) weight_left: 4096, weight_total: 4096
(XEN) credit_balance: 0, credit_xtra: 0, credit_cap: 0
(XEN) adding CPU 41, now 6 CPUs
(XEN) removing CPU 41, remaining: 12
...
Two things startled me:
1) There is quite some delay between the "Removing CPUs" message from the script and the actual HV printk showing it's done. Why is that not synchronous? Looking at the code it turns out that __csched_vcpu_acct_start() is eventually triggered by a timer; shouldn't it be triggered synchronously by the add/removal events? (See the call-path sketch below.)
2) It clearly shows that each CPU gets added to the new pool _before_ it gets removed from the old one (Pool-0). Isn't that violating the "only one pool per CPU" rule? Even if that is fine for a short period of time, maybe the timer kicks in at this very moment, resulting in violated invariants?
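
For reference, the path I mean in 1) looks roughly like this (sketched from the 4.1 credit scheduler as far as I understand it; worth double-checking against the actual tree):

    /*
     * csched_tick()                        <- per-CPU periodic timer
     *   -> csched_vcpu_acct()              <- accounting for the vcpu
     *                                         currently running on that CPU
     *      -> __csched_vcpu_acct_start()   <- if the vcpu was not active yet:
     *                                         adds sdom->weight to the weight
     *                                         sum and bumps active_vcpu_count
     *
     * i.e. activation is driven by the ticker, not by the pool add/remove
     * operations themselves.
     */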

Yours confused,
Andre.

George Dunlap wrote:
On Mon, Jan 31, 2011 at 2:59 PM, Andre Przywara <andre.przywara@xxxxxxx> wrote:
Right, that was also my impression.

I seem to have got a bit further, though:
By accident I found that in c/s 22846 the issue is fixed; it works now
without crashing. I bisected it down to my own patch, which disables the
NODEID_MSR in Dom0. I could confirm this theory by a) applying this single
line (clear_bit(NODEID_MSR)) to 22799 and _not_ seeing it crash, and b) by
removing this line from 22846 and seeing it crash.

So my theory is that Dom0 sees different nodes on its virtual CPUs via the
physical NodeID MSR, but this association can (and will) be changed at any
moment by the Xen scheduler. So Dom0 will build a bogus topology based upon
these values. As soon as all vCPUs of Dom0 are confined to one node (node
0; this is caused by the cpupool-numa-split call), the Xen scheduler somehow
hiccups.
So it seems to be a bad combination of the NodeID MSR (on newer AMD
platforms: sockets C32 and G34) and a NodeID-MSR-aware Dom0 (2.6.32.27).
Since this is a hypervisor crash, I assume that the bug is still there; the
current tip only makes it much less likely to be triggered.

Hope that helps, I will dig deeper now.

Thanks.  The crashes you're getting are in fact very strange.  They
have to do with assumptions that the credit scheduler makes as part of
its accounting process.  It would only make sense for those to be
triggered if a vcpu was moved from one pool to another pool without
the proper accounting being done.  (Specifically, each vcpu is
classified as either "active" or "inactive"; and each scheduler
instance keeps track of the total weight of all "active" vcpus.  The
BUGs you're tripping over are saying that this invariant has been
violated.)  However, I've looked at the cpupools vcpu-migrate code,
and it looks like it does everything right.  So I'm a bit mystified.
My only thought is that possibly a cpumask somewhere wasn't getting
set properly, such that a vcpu was being run on a cpu from another
pool.

Unfortunately I can't take a good look at this right now; hopefully
I'll be able to take a look next week.

Andre, if you were keen, you might go through the credit code and put
in a bunch of ASSERTs that the current pcpu is in the mask of the
current vcpu; and that the current vcpu is assigned to the pool of the
current pcpu, and so on.
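
Such checks might look roughly like this, e.g. at the top of
csched_schedule() (only a sketch; field names such as v->cpu_affinity and
cpupool->cpu_valid are from the 4.1-era headers as far as I recall and may
need adjusting):

    struct vcpu *v = current;
    unsigned int cpu = smp_processor_id();

    /* the pcpu we are running on must be in the vcpu's affinity mask */
    ASSERT( cpu_isset(cpu, v->cpu_affinity) );

    /* ... and must belong to the vcpu's (domain's) cpupool; the idle
     * domain has no cpupool, so skip the check there */
    ASSERT( (v->domain->cpupool == NULL) ||
            cpu_isset(cpu, v->domain->cpupool->cpu_valid) );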

 -George



--
Andre Przywara
AMD-Operating System Research Center (OSRC), Dresden, Germany



_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel


 

