Re: [Xen-devel] Regression, host crash with 4.5rc1



On 11/25/2014 03:00 AM, Jan Beulich wrote:
> Okay, so it's not really the mwait-idle driver causing the regression,
> but it is C-state related. Hence we're now down to seeing whether all
> or just the deeper C states are affected, i.e. I now need to ask you
> to play with "max_cstate=". For that you'll have to remember that the
> option's effect differs between the ACPI and the MWAIT idle drivers.
> In the spirit of bisection I'd suggest using "max_cstate=2" first no
> matter which of the two scenarios you pick. If that still hangs,
> "max_cstate=1" obviously is the only other thing to try. Should that
> not hang (and you left out "mwait-idle=0"), trying "max_cstate=3"
> in that same scenario would be the other case to check.
>
> No need for 'd' and 'a' output for the time being, but 'c' output would
> be much appreciated for all cases where you observe hangs.


Okay, working through that now. I tried max_cstate=2 and got no hangs, whether with or without mwait-idle=0. However, I was puzzled by this:

(XEN) 'c' pressed -> printing ACPI Cx structures
(XEN) ==cpu0==
(XEN) active state:             C0
(XEN) max_cstate:               C2
(XEN) states:
(XEN)     C1: type[C1] latency[003] usage[12219860] method[ FFH] duration[1190961948551]
(XEN)     C2: type[C1] latency[010] usage[10205554] method[ FFH] duration[2015393965907]
(XEN)     C3: type[C2] latency[020] usage[50926286] method[ FFH] duration[30527997858148]
(XEN)    *C0:   usage[73351700] duration[9974627547595]
(XEN) max=0 pwr=0 urg=0 nxt=0
(XEN) PC2[0] PC3[8589642315848] PC6[0] PC7[0]
(XEN) CC3[28794734145697] CC6[0] CC7[0]
(XEN) ==cpu1==
(XEN) active state:             C3
(XEN) max_cstate:               C2
(XEN) states:
(XEN)     C1: type[C1] latency[003] usage[10699950] method[ FFH] duration[1141422044112]
(XEN)     C2: type[C1] latency[010] usage[06382904] method[ FFH] duration[1329739264322]
(XEN)    *C3: type[C2] latency[020] usage[44630764] method[ FFH] duration[31676618425954]
(XEN)     C0:   usage[61713618] duration[9561201640320]
(XEN) max=0 pwr=0 urg=0 nxt=0
(XEN) PC2[0] PC3[8589642315848] PC6[0] PC7[0]
(XEN) CC3[30066495105056] CC6[0] CC7[0]
(XEN) ==cpu2==
(XEN) active state:             C3
(XEN) max_cstate:               C2
(XEN) states:
(XEN)     C1: type[C1] latency[003] usage[10829791] method[ FFH] duration[1145244102917]
(XEN)     C2: type[C1] latency[010] usage[06392468] method[ FFH] duration[1330830147023]
(XEN)    *C3: type[C2] latency[020] usage[44705668] method[ FFH] duration[31741190985486]
(XEN)     C0:   usage[61927927] duration[9491716216846]
(XEN) max=0 pwr=0 urg=0 nxt=0
(XEN) PC2[0] PC3[8589642315848] PC6[0] PC7[0]
(XEN) CC3[30117696095715] CC6[0] CC7[0]
(XEN) ==cpu3==
(XEN) active state:             C3
(XEN) max_cstate:               C2
(XEN) states:
(XEN)     C1: type[C1] latency[003] usage[10692336] method[ FFH] duration[1144876437514]
(XEN)     C2: type[C1] latency[010] usage[06394051] method[ FFH] duration[1333961503379]
(XEN)    *C3: type[C2] latency[020] usage[44744178] method[ FFH] duration[31803488799434]
(XEN)     C0:   usage[61830565] duration[9426654792908]
(XEN) max=0 pwr=0 urg=0 nxt=0
(XEN) PC2[0] PC3[8589642315848] PC6[0] PC7[0]
(XEN) CC3[30191557548300] CC6[0] CC7[0]
(XEN) ==cpu4==
(XEN) active state:             C3
(XEN) max_cstate:               C2
(XEN) states:
(XEN)     C1: type[C1] latency[003] usage[10746634] method[ FFH] duration[1144044534459]
(XEN)     C2: type[C1] latency[010] usage[06444054] method[ FFH] duration[1340637424913]
(XEN)    *C3: type[C2] latency[020] usage[44690901] method[ FFH] duration[31663207165902]
(XEN)     C0:   usage[61881589] duration[9561092487876]
(XEN) max=0 pwr=0 urg=0 nxt=0
(XEN) PC2[0] PC3[8589642315848] PC6[0] PC7[0]
(XEN) CC3[30049235012919] CC6[0] CC7[0]
(XEN) ==cpu5==
(XEN) active state:             C3
(XEN) max_cstate:               C2
(XEN) states:
(XEN)     C1: type[C1] latency[003] usage[10694684] method[ FFH] duration[1140625901110]
(XEN)     C2: type[C1] latency[010] usage[06461563] method[ FFH] duration[1342115502967]
(XEN)    *C3: type[C2] latency[020] usage[44834522] method[ FFH] duration[31719560664023]
(XEN)     C0:   usage[61990769] duration[9506679619986]
(XEN) max=0 pwr=0 urg=0 nxt=0
(XEN) PC2[0] PC3[8589642315848] PC6[0] PC7[0]

Why would some of the cores be in C3 even though they list max_cstate as C2?
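
For anyone wanting to check this across a longer capture, here is a minimal Python sketch that compares each CPU's active state against max_cstate in two ways: by the state's index (the "C3" label) and by its type (the "type[C2]" field). It is only an illustration; the function names parse_cx_dump and over_limit are made up for this example and are not part of Xen. Whether the limit is meant to apply to the index or to the type is an assumption the script does not settle; it merely shows both comparisons side by side.

```python
import re

# Patterns for the fields of the 'c' debug-key output used in this sketch.
CPU_RE = re.compile(r"==cpu(\d+)==")
ACTIVE_RE = re.compile(r"active state:\s*C(\d+)")
MAX_RE = re.compile(r"max_cstate:\s*C(\d+)")
# Matches "C3: type[C2]" and "*C3: type[C2]"; the C0 line has no type field.
STATE_RE = re.compile(r"\*?C(\d+):\s*type\[C(\d+)\]")


def parse_cx_dump(text):
    """Return {cpu: {"active": idx, "max": limit, "types": {idx: type}}}."""
    cpus = {}
    # Splitting on a pattern with one group yields [preamble, cpu, body, ...].
    parts = CPU_RE.split(text)
    for cpu, body in zip(parts[1::2], parts[2::2]):
        active = ACTIVE_RE.search(body)
        limit = MAX_RE.search(body)
        if not (active and limit):
            continue  # incomplete section, skip it
        cpus[int(cpu)] = {
            "active": int(active.group(1)),
            "max": int(limit.group(1)),
            "types": {int(i): int(t) for i, t in STATE_RE.findall(body)},
        }
    return cpus


def over_limit(cpus):
    """Rows of (cpu, active index, active type, index > max, type > max)."""
    rows = []
    for cpu, info in sorted(cpus.items()):
        idx = info["active"]
        typ = info["types"].get(idx)  # None when the CPU is in C0
        rows.append((cpu, idx, typ,
                     idx > info["max"],
                     typ is not None and typ > info["max"]))
    return rows


# Excerpt of the output above, trimmed to the fields the sketch reads.
SAMPLE = """\
(XEN) ==cpu0==
(XEN) active state:             C0
(XEN) max_cstate:               C2
(XEN)     C1: type[C1] latency[003]
(XEN)     C2: type[C1] latency[010]
(XEN)     C3: type[C2] latency[020]
(XEN)    *C0:   usage[73351700] duration[9974627547595]
(XEN) ==cpu1==
(XEN) active state:             C3
(XEN) max_cstate:               C2
(XEN)     C1: type[C1] latency[003]
(XEN)     C2: type[C1] latency[010]
(XEN)    *C3: type[C2] latency[020]
(XEN)     C0:   usage[61713618] duration[9561201640320]
"""

for row in over_limit(parse_cx_dump(SAMPLE)):
    print(row)
```

The regular expressions search each CPU section as a whole, so the script does not care whether the three state entries arrive on one line or on separate lines.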

Steve
