
Re: [Xen-devel] Xen HVM regression on certain Intel CPUs



On 28.03.2013 17:39, Stefan Bader wrote:
> On 28.03.2013 16:02, Stefan Bader wrote:
>> On 28.03.2013 14:34, Jan Beulich wrote:
>>>>>> On 27.03.13 at 18:23, "H. Peter Anvin" <hpa@xxxxxxxxx> wrote:
>>>> On 03/27/2013 10:17 AM, Stefan Bader wrote:
>>>>>> What does x86info and /proc/cpuinfo show in HVM?
>>>>>
>>>>> x86info cpuid[7].ebx = 0xbbb and /proc/cpuinfo also shows smep
>>>>> set.
>>>>
>>>> On all CPUs?
>>>>
>>>>>> The inbound %cr4 shouldn't matter at all, we try to not rely on
>>>>>> it.
>>>>>>
>>>>>> If the hypervisor presents SMEP to the guest then the guest is
>>>>>> pretty obviously going to try to use it.
>>>>>
>>>>> To me it looks like when bootstrapping the APs things are not yet
>>>>> ready to use it. If I did not miss something, the only place that
>>>>> the saved contents of cr4 are used is in startup_32 when the cpus
>>>>> are brought up. And then just stop dead. Would need to read more
>>>>> code but a bit weird why the BP is not affected.
>>>>
>>>> This feels like a bug in Xen, but I don't know for sure yet.  Either
>>>> which way, it is odd.  That write to cr4 should be entirely legitimate.
>>>
>>> And I would guess one that got fixed already.
>>>
>>> Stefan, please try 4.2.2-rc1, or (separately)
>>> http://xenbits.xen.org/gitweb/?p=xen.git;a=commitdiff;h=485f374230d39e153d7b9786e3d0336bd52ee661
>>> (which I think requires the immediately preceding
>>> http://xenbits.xen.org/gitweb/?p=xen.git;a=commitdiff;h=1e6275a95d3e35a72939b588f422bb761ba82f6b
>>> too).
>>
>> The backing explanation makes a lot of sense in reasoning about what is
>> going wrong. Unfortunately, the two patches above on their own do not fix
>> the problem (I will make another attempt with 4.2.2-rc1).
> 
> The whole of 4.2.2-rc1 has the same outcome (SMEP still present in
> trampoline_cr4_features).
>>
>> For a bit more info: I am running a kernel inside the HVM guest which
>> prints the contents of the cr4 shadow used in the trampoline. Out of
>> interest I compared those values to the ones used on a bare-metal boot,
>> and both are identical (0x1407F0).
>>
>> That goes some way toward explaining why the patch above fails. Looking at
>> the cr4-update code in vmx_update_guest_cr(), a few lines above the new
>> SMEP handling there was already code that clears the PAE flag when
>> paging_mode_hap(v->domain) is true. That condition would also need to be
>> true for the SMEP flag to get cleared. But the PAE flag was (and has to
>> be) set beforehand.
>>
> 
>> Will be looking into this further.
> Going back to gather more info and to find some fix.
> 

I added some more debugging output to the hypervisor to verify the state of HAP.
This showed that while HAP is available on the system, it is not used for the
HVM guests. It looks like this would require some flags to be set when creating
the guest domains, and I assume this is not happening because I have to stay
with the xm toolstack for the libvirt setup for now (switching requires some
repackaging which hasn't been done yet).

So the guest isn't using HAP, but Xen still seems to use some form of paging
even when the guest VCPU has paging disabled. So I changed
vmx_update_guest_cr() to clear SMEP whenever the guest VCPU is in non-paging
mode, and that seems to prevent the hangs. Does this look like a reasonable
upstream Xen change?

From eccbc4cf0916c6d4388f658965c79770bd0ba10f Mon Sep 17 00:00:00 2001
From: Stefan Bader <stefan.bader@xxxxxxxxxxxxx>
Date: Wed, 3 Apr 2013 12:06:24 +0200
Subject: [PATCH] VMX: Always disable SMEP when guest is in non-paging mode

commit e7dda8ec9fc9020e4f53345cdbb18a2e82e54a65
  VMX: disable SMEP feature when guest is in non-paging mode

disabled the SMEP bit if a guest VCPU was using HAP and was not
in paging mode. However, I could observe VCPUs getting stuck in
the trampoline after the following patch to the Linux kernel
changed the way CR4 gets set up:
  x86, realmode: read cr4 and EFER from kernel for 64-bit trampoline

The change sets CR4 from flags that are already in use, which
includes the SMEP bit. On bare metal this does not matter, as the
CPU is in non-paging mode at that time. But Xen seems to use the
emulated non-paging mode regardless of HAP (I verified that HAP
was not used on the guests where I was seeing the issue).

Therefore it seems right to unset the SMEP bit for a VCPU that is
not in paging mode, regardless of its HAP usage.

Signed-off-by: Stefan Bader <stefan.bader@xxxxxxxxxxxxx>
---
 xen/arch/x86/hvm/vmx/vmx.c |    7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index 04dbefb..a869ed4 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -1161,13 +1161,16 @@ static void vmx_update_guest_cr(struct vcpu *v, unsigned int cr)
         if ( paging_mode_hap(v->domain) && !hvm_paging_enabled(v) )
         {
             v->arch.hvm_vcpu.hw_cr[4] |= X86_CR4_PSE;
             v->arch.hvm_vcpu.hw_cr[4] &= ~X86_CR4_PAE;
+        }
+        if ( !hvm_paging_enabled(v) )
+        {
             /*
              * SMEP is disabled if CPU is in non-paging mode in hardware.
              * However Xen always uses paging mode to emulate guest non-paging
-             * mode with HAP. To emulate this behavior, SMEP needs to be
-             * manually disabled when guest switches to non-paging mode.
+             * mode. To emulate this behavior, SMEP needs to be manually
+             * disabled when guest VCPU is in non-paging mode.
              */
             v->arch.hvm_vcpu.hw_cr[4] &= ~X86_CR4_SMEP;
         }
         __vmwrite(GUEST_CR4, v->arch.hvm_vcpu.hw_cr[4]);


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel

 

