
Re: [Xen-devel] xsave=0 workaround needed on 3.2 kernels with Xen 4.1 or Xen-unstable.



On Thu, May 3, 2012 at 11:09 AM, AP <apxeng@xxxxxxxxx> wrote:
> On Thu, May 3, 2012 at 2:15 AM, Jan Beulich <JBeulich@xxxxxxxx> wrote:
>>>>> On 02.05.12 at 20:42, AP <apxeng@xxxxxxxxx> wrote:
>>> On Wed, May 2, 2012 at 2:00 AM, Jan Beulich <JBeulich@xxxxxxxx> wrote:
>>>>>>> On 30.04.12 at 21:37, Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx> 
>>>>>>> wrote:
>>>>> I somehow thought that this has been fixed but I've been
>>>>> getting reports that people are running into this.
>>>>
>>>> "this" being what? I too thought that all xsave related issues were
>>>> sorted out by now.
>>>
>>> I see the following crash if I run without xsave=0 with Ubuntu 11.10
>>> 3.0.0-17 kernel (Intel(R) Core(TM) i7-2620M). I don't see this with
>>> Xen 4.1.2. Looks like the OSXSAVE bit is not getting set in CR4.
>>
>> And in the thread starting at
>> http://lists.xen.org/archives/html/xen-devel/2012-04/msg00426.html
>> I gave debugging instructions that apparently no-one followed so
>> far. Without someone seeing the problem doing so I don't think we
>> will ever get anywhere with this (unless, as also indicated there,
>> someone can spot something wrong with the code that non-obvious
>> to everyone else).
>
> I missed that thread. I will add some debugging to
> pv_guest_cr4_fixup() and the XSETBV handling in
> emulate_privileged_op() and post the output.

I ran with the attached xsave_debug patch and saw the following output:
<snip>
(XEN) CPU: After generic identify, caps: bfebfbff 28100800 00000000 00000000 17bae3ff 00000000 00000001 00000000
(XEN) CPU: After vendor identify, caps: bfebfbff 28100800 00000000 00000000 17bae3ff 00000000 00000001 00000000
(XEN) xstate_init: using cntxt_size: 0x340 and states: 0x7
(XEN) CPU: After all inits, caps: bfebfbff 28100800 00000000 00003f40 17bae3ff 00000000 00000001 00000000
<snip>
(XEN) domain.c:704:d0 vcpu[0] hv_cr4: 0x2660 hv_cr4_mask: 0xfffffffffffbfff3 returning cr4: 0x2660
(XEN) traps.c:2409:d0 vcpu[0] pv cr4: 0x2660 write CR4: 0x426f0
(XEN) domain.c:704:d0 vcpu[0] hv_cr4: 0x2660 hv_cr4_mask: 0xfffffffffffbfff3 returning cr4: 0x2660
(XEN) traps.c:2409:d0 vcpu[0] pv cr4: 0x2660 write CR4: 0x426f0
(XEN) traps.c:874:d0 vcpu[0] cpuid XSAVE supported
(XEN) domain.c:704:d0 vcpu[0] hv_cr4: 0x2660 hv_cr4_mask: 0xfffffffffffbfff3 returning cr4: 0x2660
(XEN) traps.c:2409:d0 vcpu[0] pv cr4: 0x2660 write CR4: 0x426f0
(XEN) traps.c:2254:d0 XSETBV: lock: 0 rep_prefix: 0 opsize_prefix: 0 cr4: 0x2660
[    6.791945] invalid opcode: 0000 [#1] SMP
<snip>

From the above I realized that X86_CR4_OSXSAVE was never getting set
in v->arch.pv_vcpu.ctrlreg[4]. So I tried the following patch:

diff -r 5a0d60bb536b xen/arch/x86/domain.c
--- a/xen/arch/x86/domain.c     Fri Apr 27 21:10:59 2012 -0700
+++ b/xen/arch/x86/domain.c     Fri May 04 12:23:57 2012 -0700
@@ -691,8 +691,6 @@ unsigned long pv_guest_cr4_fixup(const s
         hv_cr4_mask &= ~X86_CR4_DE;
     if ( cpu_has_fsgsbase && !is_pv_32bit_domain(v->domain) )
         hv_cr4_mask &= ~X86_CR4_FSGSBASE;
-    if ( xsave_enabled(v) )
-        hv_cr4_mask &= ~X86_CR4_OSXSAVE;

     if ( (guest_cr4 & hv_cr4_mask) != (hv_cr4 & hv_cr4_mask) )
         gdprintk(XENLOG_WARNING,
diff -r 5a0d60bb536b xen/include/asm-x86/domain.h
--- a/xen/include/asm-x86/domain.h      Fri Apr 27 21:10:59 2012 -0700
+++ b/xen/include/asm-x86/domain.h      Fri May 04 12:23:57 2012 -0700
@@ -530,7 +530,7 @@ unsigned long pv_guest_cr4_fixup(const s
      & ~X86_CR4_DE)
 #define real_cr4_to_pv_guest_cr4(c)                         \
     ((c) & ~(X86_CR4_PGE | X86_CR4_PSE | X86_CR4_TSD        \
-             | X86_CR4_OSXSAVE | X86_CR4_SMEP))
+             | X86_CR4_SMEP))

 void domain_cpuid(struct domain *d,
                   unsigned int  input,

That allowed the system to boot successfully, though I did see the
following message:
(XEN) domain.c:698:d0 Attempt to change CR4 flags 00042660 -> 00002660

I'm not sure the above patch is the right fix, but I hope it is at
least helpful in pointing to where the problem might be.

BTW, I see the same invalid opcode crash with Xen 4.1.2 if I boot with xsave=1.

Thanks,
AP

Attachment: xsave_debug.patch
Description: Binary data

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
