Re: [Xen-devel] Regression, host crash with 4.5rc1
(Please forgive my lack of Xen-fu in advance.)

If this issue were to happen on Linux/bare-metal, this is how I'd debug it. Hopefully some of this will translate to Xen in one way or another.

dmesg | grep idle

will tell us which idle driver is running (in the Dom0 kernel). If it is intel_idle, it will also tell us the supported sub-states (the CPUID.MWAIT.EDX value).

grep . /sys/devices/system/cpu/cpu0/cpuidle/*/*

will tell us what states the OS is requesting. It will expand on the "FFH" bit here:

> > (XEN) C1: type[C1] latency[003] usage[12219860] method[ FFH]
> > duration[1190961948551]
> > (XEN) C2: type[C1] latency[010] usage[10205554] method[ FFH]
> > duration[2015393965907]
> > (XEN) C3: type[C2] latency[020] usage[50926286] method[ FFH]
> > duration[30527997858148]

I'm hopeful that this information comes from the hardware's BIOS and not from some hypervisor tricking out Dom0 with a fake BIOS, yes?

If Xen doesn't have cpuidle, or its sysfs, then acpidump for the platform should be able to tell us what the platform is exporting.

Next, hopefully the attached turbostat utility can be invoked on Dom0 and can read the MSRs on at least one processor via the /dev/cpu interface. This will tell you what the hardware supports, and which HW states are actually being invoked (which may be different from what the OS asks for...).

It may tell us just the same thing I think we learned here:

> > (XEN) PC2[0] PC3[8589642315848] PC6[0] PC7[0]
> > (XEN) CC3[28794734145697] CC6[0] CC7[0]

which I'm assuming is a dump of the MSR residency counters. If so, it appears that this platform is not invoking c6 and pc6 at all, and that the deepest states being used are actually cc3 and pc3. I don't know if that is because you've booted the kernel with max_cstate=N of some kind, or if this is the default.

Attached is turbostat, source and binary. Run it this way and send the ts.out file:

# ./turbostat --debug sleep 5 > ts.out 2>&1

Guessing...
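As a rough illustration of reading the residency dump quoted above, here is a minimal Python sketch that pulls the NAME[value] pairs out of the two (XEN) lines and reports the deepest state with a non-zero counter. It assumes, as the message does, that these values really are residency counters. The helper names are hypothetical and are not part of Xen or turbostat.

```python
import re

# The two residency lines quoted in the message, with the "> >" quoting removed.
XEN_DUMP = """\
(XEN) PC2[0] PC3[8589642315848] PC6[0] PC7[0]
(XEN) CC3[28794734145697] CC6[0] CC7[0]
"""

# Matches tokens such as "PC3[8589642315848]" or "CC6[0]".
_COUNTER = re.compile(r"\b(PC|CC)(\d+)\[(\d+)\]")


def parse_residency(text):
    """Return {"PC": {level: count}, "CC": {level: count}} (hypothetical helper)."""
    counters = {"PC": {}, "CC": {}}
    for kind, level, count in _COUNTER.findall(text):
        counters[kind][int(level)] = int(count)
    return counters


def deepest_used(levels):
    """Highest-numbered state with a non-zero counter, or None if all are zero."""
    used = [level for level, count in levels.items() if count > 0]
    return max(used) if used else None


if __name__ == "__main__":
    res = parse_residency(XEN_DUMP)
    for kind in ("CC", "PC"):
        print(kind, "deepest state with residency:", deepest_used(res[kind]))
```

On the quoted dump this reports state 3 as the deepest for both core and package, which matches the reading that c6 and pc6 are never entered.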
If there are no surprises in the debug output requested above, and if the Xen debug output above was taken with c6 explicitly disabled...

Note that there are two kinds of c6: CC6 (core) and PC6 (package). If this box supports both, the next thing to try will be to keep CC6 enabled, but to disable just PC6. This is done by writing an MSR that turbostat dumps out (MSR_NHM_SNB_PKG_CST_CFG_CTL) with the wrmsr(8) utility. Though if that MSR is locked by the BIOS, then a BIOS SETUP option may be the only way to limit the package C-state without also disabling the associated core C-state.

cheers,
-Len

ps. Attachment:
turbostat-test.tar.gz

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
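For the PC6 experiment described in the message, the raw value that turbostat prints for MSR_NHM_SNB_PKG_CST_CFG_CTL can be decoded by hand. The following is a minimal Python sketch, assuming the Nehalem/Sandy Bridge-era layout (package C-state limit in bits 2:0, BIOS lock in bit 15). The meaning of each limit value is model-specific, so check the Intel SDM for the actual CPU. The helper names and the sample value are made up for illustration.

```python
# Assumed layout of MSR_NHM_SNB_PKG_CST_CFG_CTL (MSR 0xE2) on
# Nehalem/Sandy Bridge-era parts; verify against the SDM for your model.
PKG_CST_LIMIT_MASK = 0x7   # bits 2:0: package C-state limit (encoding is model-specific)
CFG_LOCK_BIT = 1 << 15     # bit 15: set once the BIOS has locked the register


def decode_pkg_cst_cfg(value):
    """Split a raw MSR value into (limit_field, locked). Hypothetical helper."""
    return value & PKG_CST_LIMIT_MASK, bool(value & CFG_LOCK_BIT)


def with_limit(value, new_limit):
    """Return value with only the limit field replaced; refuse if locked."""
    if value & CFG_LOCK_BIT:
        raise ValueError("MSR is locked by the BIOS; use BIOS SETUP instead")
    if not 0 <= new_limit <= PKG_CST_LIMIT_MASK:
        raise ValueError("limit does not fit in bits 2:0")
    return (value & ~PKG_CST_LIMIT_MASK) | new_limit


if __name__ == "__main__":
    sample = 0x1E008403  # made-up value, not read from any real machine
    limit, locked = decode_pkg_cst_cfg(sample)
    print(f"pkg-cstate-limit field = {limit}, locked = {locked}")
```

If the lock bit is clear, the value returned by with_limit() is what one would hand to wrmsr(8); if it is set, the sketch refuses, mirroring the point that only BIOS SETUP can change the limit in that case.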