Re: [Xen-devel] HVM domains crash after upgrade from XEN 4.5.1 to 4.5.2

Hi Andrew,
thanks for your reply. Answers are inline further down.

On 12.11.15 at 14:01, Andrew Cooper wrote:
On 12/11/15 12:52, Jan Beulich wrote:
On 12.11.15 at 02:08, <ariel.atom2@xxxxxxxxxx> wrote:
After the upgrade HVM domUs appear to no longer work - regardless of the
dom0 kernel (tested with both 3.18.9 and 4.1.7 as the dom0 kernel); PV
domUs, however, work just fine as before on both dom0 kernels.

xl dmesg shows the following information after the first crashed HVM
domU which is started as part of the machine booting up:
(XEN) Failed vm entry (exit reason 0x80000021) caused by invalid guest state (0).
(XEN) ************* VMCS Area **************
(XEN) *** Guest State ***
(XEN) CR0: actual=0x0000000000000039, shadow=0x0000000000000011,
(XEN) CR4: actual=0x0000000000002050, shadow=0x0000000000000000,
(XEN) CR3: actual=0x0000000000800000, target_count=0
(XEN)      target0=0000000000000000, target1=0000000000000000
(XEN)      target2=0000000000000000, target3=0000000000000000
(XEN) RSP = 0x0000000000006fdc (0x0000000000006fdc)  RIP = 0x0000000100000000 (0x0000000100000000)
Other than RIP looking odd for a guest still in non-paged protected
mode I can't seem to spot anything wrong with guest state.
odd? That will be the source of the failure.

Outside long mode, the upper 32 bits of %rip should all be zero, and it
should not be possible to set any of them.
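
To make that check concrete, here is a minimal sketch that applies it to the values from the VMCS dump above. The helper names are hypothetical and not taken from the Xen source; the bit positions (CR0.PE = bit 0, CR0.PG = bit 31, VM-entry failure = bit 31 of the exit reason) are as I understand them from the Intel SDM. It assumes that CR0.PG being clear is enough to conclude the guest is not in long mode.

```python
# Hypothetical helpers for illustration only; not part of Xen.
CR0_PE = 1 << 0               # protected mode enable
CR0_PG = 1 << 31              # paging enable (required for long mode)
EXIT_ENTRY_FAILURE = 1 << 31  # exit reason bit 31: VM entry failed


def decode_exit_reason(reason):
    """Split a VMX exit reason into (entry_failed, basic_reason)."""
    return bool(reason & EXIT_ENTRY_FAILURE), reason & 0xFFFF


def rip_valid_outside_long_mode(rip):
    """Outside long mode, bits 63:32 of RIP must all be zero."""
    return (rip >> 32) == 0


# Values copied from the xl dmesg output quoted above.
cr0 = 0x0000000000000039
rip = 0x0000000100000000
exit_reason = 0x80000021

entry_failed, basic = decode_exit_reason(exit_reason)
print(entry_failed, hex(basic))                  # True 0x21
print(bool(cr0 & CR0_PE), bool(cr0 & CR0_PG))    # True False
print(rip_valid_outside_long_mode(rip))          # False
```

With these inputs the sketch reports a failed VM entry with basic reason 0x21 (invalid guest state), a guest in protected mode without paging, and a RIP that has a bit set above bit 31, which matches the diagnosis above.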

I suspect that the guest has exited for emulation, and there has been a
bad update to %rip.  The alternative (which I hope is not the case) is
that there is a hardware erratum which allows the guest to accidentally
get itself into this condition.

Are you able to rerun with a debug build of the hypervisor?
Given that I am compiling from source under Gentoo, and provided you lend me a helping hand in case I get stuck, I am confident that this is possible.

Gentoo has three Xen packages (they call those ebuilds) as follows
all of which are installed on my system. The former two offer a debug USE flag, and I assume that debug code for the last one is not required, as it is for the (still working) PV domUs only. Furthermore, as you are talking about the hypervisor, I guess it is safe to assume that you mean app-emulation/xen and not xen-tools. Right?

BTW: The description of the debug USE flag reads as follows:
Enable extra debug codepaths, like asserts and extra output. If you want to get meaningful backtraces see https://wiki.gentoo.org/wiki/Project:Quality_Assurance/Backtraces
I assume that backtraces are probably not required to get things moving.

Another question is whether, prior to enabling the debug USE flag, it might make sense to re-compile with gcc-4.8.5 (please see my previous list reply) to rule out any compiler-related issues. Jan, Andrew - what are your thoughts?

Many thanks, Atom2

